Sunday, June 29, 2014

How To Export Data To Excel File Using Open XML SDK

Problem

Today I want to cover about how to export data to a spreadsheet (Excel) file. I believe this is a very common feature in most of the application, to extract data from a database and then save the data into an Excel file. However, developers who worked on this feature before knows that this is no easy task.

Options

There are a few ways to export data into an Excel file.

Option #1: Microsoft JET OLE DB Provider

This is one of the component from Microsoft Data Access Engine (MDAC). You need to install MDAC on the server in order to get it works. The JET OLE DB Provider only provide the basic data export with standard data source. If you need more features such as including chart, formatting, formula, calculation, etc in the exported Excel file, see the other options below.

Option #2: Visual Studio Tool for Office

Visual Studio Tool for Office (VSTO) can be used for data export, and it also provide most Office related features, automation and integration. But, it relies on Microsoft Office COM library, the Primary Interop Assemblies (PIA). You need to install PIA on the server in order to get it works.

Option #3: Microsoft Office Interop

Microsoft Office Interop provide even more integration, and it is also relying on PIA. Although this uses the same PIA as VSTO, but their assembly namespaces are different. Microsoft Office Interop can be found and directly referenced from the Visual Studio Extension Library as they are installed in Global Assembly Cache (GAC) after you have installed Microsoft Office or PIA in your machine. 

Option #4: Open XML SDK

This is the component that I want to share today which does not rely on PIA, which mean lesser feature than VSTO and Office Interop, but better flexibility than JET OLE DB Provider. You do not need to install anything on the server, just need to include the assembly library in your application, then it will work. How it works is this SDK generate the XML that following ECMA-376, Second Edition and ISO/IEC 29500 standard which is used to form a spreadsheet which is recognizable by any spreadsheet application such as Microsoft Office or Open Office.

Solution

First, we have to obtain and install the Open XML SDK. It is free, can be downloaded from HERE.

Run the Visual Studio, then create new or open existing project, then add the Open XML SDK reference which is located at the Extensions page from the Reference Manager.



Then, add Windows Base assembly which is required later.



Now, what I am trying to do now is to query a large set of record that I want from the database and export them to Excel file. So, here is how my data access component look like:

public class ContactDAC
{
    public List<Contact> Select()
    {
        List<Contact> result = new List<Contact>();

        using (var con = new SqlConnection(ConfigurationManager.ConnectionStrings["default"].ConnectionString))
        using (var cmd = new SqlCommand("SELECT * FROM Contacts", con))
        {
            con.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    Contact contact = new Contact();
                    contact.ID = reader.GetInt64(0);
                    contact.Name = reader.GetString(1);
                    contact.Number = reader.GetString(2);
                    contact.Address = reader.GetString(3);
                    contact.CreatedDate = reader.GetDateTime(4);

                    result.Add(contact);
                }
            }
        }

        return result;
    }
}

Then, I am creating a data export component which is a generic one, can support any entity type and export the data into Excel file.

Before that, I need this namespace reference:

using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public class DataExportComponent<T>
{
    private static string _tableName;
    private static PropertyInfo[] _properties;

    static DataExportComponent()
    {
        //using Reflection to identify all the columns (properties) of my table (entity)
        Type type = typeof(T);
        _tableName = type.Name;
        _properties = type.GetProperties();
    }

Following is the common way to create a new spreadsheet file:

public void Export(string path, List<T> entities)
{
    //Create a new spreadsheet file
    //path is the Excel file location and name
    //My path is C:\temp\OpenXML\contacts.xlsx
    using (SpreadsheetDocument spreadsheet =
        SpreadsheetDocument.Create(path,
        SpreadsheetDocumentType.Workbook))
    {
        WorkbookPart workbookPart = spreadsheet.AddWorkbookPart();
        workbookPart.Workbook = new Workbook();

        WorksheetPart worksheetPart = workbookPart.AddNewPart<WorksheetPart>();
        SheetData sheetData = new SheetData();
        Sheets sheets = new Sheets();
        Sheet sheet = new Sheet();
        sheet.Id = spreadsheet.WorkbookPart.GetIdOfPart(worksheetPart);
        sheet.SheetId = 1; //sheet Id, anything but unique
        sheet.Name = _tableName; //sheet name
        sheets.Append(sheet);

Then, query the data from the database with the data access component above, with a List<Contact> type, iterate the List and then inject the value into cell column by column then repeat for row by row.

//loop row by row depending on the List count
for (int i = 0; i < entities.Count; i++)
{
    Row row = new Row();
    row.RowIndex = (uint)i+1; //RowIndex must be start with 1, since i = 0

    //loop column by column depending on the entity properties
    for (int column = 0; column < _properties.Count(); column++)
    {
        Cell cell = new Cell();
        //Cell Reference is the Excel row and column index,
        //For example:
        //First row first column: A1
        //Fifth row thrid column: C5
        cell.CellReference = CreateCellReference(column+1) + row.RowIndex.ToString();
                      
        cell.DataType = CellValues.InlineString;
                           
        InlineString inlineString = new InlineString();
        Text text = new Text();
        text.Text = _properties[column].GetValue(entities[i]).ToString();
        inlineString.Text = text; //The value you want to put into cell
        cell.AppendChild(inlineString);

        //Keep appending cell (column) into the same row
        row.AppendChild(cell);
    }
                   
    //Keep appending row into the data sheet
    sheetData.Append(row);
}

worksheetPart.Worksheet = new Worksheet(sheetData);

spreadsheet.WorkbookPart.Workbook.AppendChild<Sheets>(sheets);
workbookPart.Workbook.Save();

spreadsheet.Close();


The Cell Reference is a bit troublesome as we have to handle the column index increment automatically when the last index 'Z' is used. This is needed when your data contain more than 26 columns.

For example, if you have 27 columns data, at the 26th column, Z is used, then 27th column should be AA. Therefore, I used the following code to handle it for me.

private string CreateCellReference(int column)
{
    string result = string.Empty;
    //First is A
    //After Z, is AA
    //After ZZ, is AAA
    char firstRef = 'A';
    uint firstIndex = (uint)firstRef;
    int mod = 0;

    while (column > 0)
    {
        mod = (column - 1) % 26;
        result += (char)(firstIndex + mod);
        column = (column - mod) / 26;
    }

    return result;
} 

That's all for my generic data export component, now here is how I use it:

ContactDAC dac = new ContactDAC();
List<Contact> entities = dac.Select();

DataExportComponent<Contact> com = new DataExportComponent<Contact>();
com.Export(@"C:\temp\OpenXML\contacts.xlsx", entities);

Testing

I compiled and publish my application to a newly installed server which does not have any MDAC and Microsoft Office product installed, and then run my application. I notice the application run without any error, and I get my Excel file created with data at the location which I had specified in my program.

Challenge

If you have an ASP.NET application for client to export file, the common challenge that you may face is request timeout when the client is trying to export a very large file. In order to prevent the timeout error, we should make the front end application to process the file export in asynchronous manner. How? I have this concept but yet to implement:


In my next post, I may share how to implement this large file export concept if I am not lazy.

Enjoy exporting!

2 comments:

  1. Nice article, thanx, a little snag, I get BA instead of AB

    ReplyDelete
  2. I would suggest you to try ZetExcel as it is really helpful

    ReplyDelete

Send Transactional SMS with API

This post cover how to send transactional SMS using the Alibaba Cloud Short Message Service API. Transactional SMS usually come with One Tim...