Problem
Today I want to cover about how to export data to a spreadsheet (Excel) file. I believe this is a very common feature in most of the application, to extract data from a database and then save the data into an Excel file. However, developers who worked on this feature before knows that this is no easy task.
Options
There are a few ways to export data into an Excel file.
Option #1: Microsoft JET OLE DB Provider
This is one of the component from Microsoft Data Access Engine (MDAC). You need to install MDAC on the server in order to get it works. The JET OLE DB Provider only provide the basic data export with standard data source. If you need more features such as including chart, formatting, formula, calculation, etc in the exported Excel file, see the other options below.
Option #2: Visual Studio Tool for Office
Visual Studio Tool for Office (VSTO) can be used for data export, and it also provide most Office related features, automation and integration. But, it relies on Microsoft Office COM library, the Primary Interop Assemblies (PIA). You need to install PIA on the server in order to get it works.
Option #3: Microsoft Office Interop
Microsoft Office Interop provide even more integration, and it is also relying on PIA. Although this uses the same PIA as VSTO, but their assembly namespaces are different. Microsoft Office Interop can be found and directly referenced from the Visual Studio Extension Library as they are installed in Global Assembly Cache (GAC) after you have installed Microsoft Office or PIA in your machine.
Option #4: Open XML SDK
This is the component that I want to share today which does not rely on PIA, which mean lesser feature than VSTO and Office Interop, but better flexibility than JET OLE DB Provider. You do not need to install anything on the server, just need to include the assembly library in your application, then it will work. How it works is this SDK generate the XML that following ECMA-376, Second Edition and ISO/IEC 29500 standard which is used to form a spreadsheet which is recognizable by any spreadsheet application such as Microsoft Office or Open Office.
Solution
First, we have to obtain and install the Open XML SDK. It is free, can be downloaded from HERE.
Run the Visual Studio, then create new or open existing project, then add the Open XML SDK reference which is located at the Extensions page from the Reference Manager.
Then, add Windows Base assembly which is required later.
Now, what I am trying to do now is to query a large set of record that I want from the database and export them to Excel file. So, here is how my data access component look like:
public class ContactDAC
{
public List<Contact>
Select()
{
List<Contact>
result = new List<Contact>();
using
(var con = new SqlConnection(ConfigurationManager.ConnectionStrings["default"].ConnectionString))
using
(var cmd = new SqlCommand("SELECT * FROM Contacts", con))
{
con.Open();
using
(var reader = cmd.ExecuteReader())
{
while
(reader.Read())
{
Contact
contact = new Contact();
contact.ID =
reader.GetInt64(0);
contact.Name =
reader.GetString(1);
contact.Number =
reader.GetString(2);
contact.Address =
reader.GetString(3);
contact.CreatedDate = reader.GetDateTime(4);
result.Add(contact);
}
}
}
return
result;
}
}
Then, I am creating a data export component which is a generic one, can support any entity type and export the data into Excel file.
Before that, I need this namespace reference:
using
DocumentFormat.OpenXml;
using
DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
public class DataExportComponent<T>
{
private static string
_tableName;
private static PropertyInfo[]
_properties;
static
DataExportComponent()
{
//using Reflection to
identify all the columns (properties) of my table (entity)
Type
type = typeof(T);
_tableName = type.Name;
_properties = type.GetProperties();
}
Following is the common way to create a new spreadsheet file:
public void
Export(string path, List<T>
entities)
{
//Create a new spreadsheet
file
//path is the Excel file
location and name
//My path is
C:\temp\OpenXML\contacts.xlsx
using
(SpreadsheetDocument spreadsheet =
SpreadsheetDocument.Create(path,
SpreadsheetDocumentType.Workbook))
{
WorkbookPart
workbookPart = spreadsheet.AddWorkbookPart();
workbookPart.Workbook = new Workbook();
WorksheetPart
worksheetPart = workbookPart.AddNewPart<WorksheetPart>();
SheetData
sheetData = new SheetData();
Sheets
sheets = new Sheets();
Sheet
sheet = new Sheet();
sheet.Id =
spreadsheet.WorkbookPart.GetIdOfPart(worksheetPart);
sheet.SheetId = 1; //sheet
Id, anything but unique
sheet.Name = _tableName; //sheet
name
sheets.Append(sheet);
Then, query the data from the database with the data access component above, with a List<Contact> type, iterate the List and then inject the value into cell column by column then repeat for row by row.
//loop row by row
depending on the List count
for
(int i = 0; i < entities.Count; i++)
{
Row
row = new Row();
row.RowIndex = (uint)i+1;
//RowIndex must be start with 1, since i = 0
//loop column by column
depending on the entity properties
for
(int column = 0; column <
_properties.Count(); column++)
{
Cell
cell = new Cell();
//Cell Reference is the Excel
row and column index,
//For example:
//First row first column: A1
//Fifth row thrid column: C5
cell.CellReference =
CreateCellReference(column+1) + row.RowIndex.ToString();
cell.DataType = CellValues.InlineString;
InlineString
inlineString = new InlineString();
Text
text = new Text();
text.Text =
_properties[column].GetValue(entities[i]).ToString();
inlineString.Text = text; //The
value you want to put into cell
cell.AppendChild(inlineString);
//Keep appending cell
(column) into the same row
row.AppendChild(cell);
}
//Keep appending row into the
data sheet
sheetData.Append(row);
}
worksheetPart.Worksheet
= new Worksheet(sheetData);
spreadsheet.WorkbookPart.Workbook.AppendChild<Sheets>(sheets);
workbookPart.Workbook.Save();
spreadsheet.Close();
The Cell Reference is a bit troublesome as we have to handle the column index increment automatically when the last index 'Z' is used. This is needed when your data contain more than 26 columns.
For example, if you have 27 columns data, at the 26th column, Z is used, then 27th column should be AA. Therefore, I used the following code to handle it for me.
private string
CreateCellReference(int column)
{
string
result = string.Empty;
//First is A
//After Z, is AA
//After ZZ, is AAA
char
firstRef = 'A';
uint
firstIndex = (uint)firstRef;
int
mod = 0;
while
(column > 0)
{
mod = (column - 1) % 26;
result += (char)(firstIndex
+ mod);
column = (column - mod) / 26;
}
return
result;
}
That's all for my generic data export component, now here is how I use it:
ContactDAC
dac = new ContactDAC();
List<Contact>
entities = dac.Select();
DataExportComponent<Contact>
com = new DataExportComponent<Contact>();
com.Export(@"C:\temp\OpenXML\contacts.xlsx",
entities);
Testing
I compiled and publish my application to a newly installed server which does not have any MDAC and Microsoft Office product installed, and then run my application. I notice the application run without any error, and I get my Excel file created with data at the location which I had specified in my program.
Challenge
If you have an ASP.NET application for client to export file, the common challenge that you may face is request timeout when the client is trying to export a very large file. In order to prevent the timeout error, we should make the front end application to process the file export in asynchronous manner. How? I have this concept but yet to implement:
In my next post, I may share how to implement this large file export concept if I am not lazy.
Enjoy exporting!