Wednesday, February 24, 2021

Generating Excel documents in C#

From a .NET project I needed to generate an Excel document containing a single sheet containing titles, numbers and dates with simple but pleasant formatting. I found there are two ways of doing this without using 3rd party products.

Open XML

This is the "low level" option. Add NuGet package DocumentFormat.OpenXml to your project and following the instructions in the Open XML SDK documentation pages.

The advantage of this option is that you can create and manipulate modern Office documents without any external dependencies.

The disadvantage of this option is the staggering complexity of manipulating document elements at the lowest level. It's like writing Office assembly language.

I tried to create an Excel document with simple formatting and I almost died of exhaustion from the effort. I was going reasonably well until I had to create styles, and all of the examples I could find on how to build sets of styles were so verbose, fragile and complicated that I threw in the towel.

Using Open XML is an option for specialists who need to manipulate Office documents without dependencies and have the staff or time to research how to do it properly. Web searches hint that there are so many quirks and "gotchas" that there is danger of madness.

COM Interop


April 2023 Note -- In the following section I describe how to create a new sheet and format and fill cells. It turned out in my particular case that it was much simpler to use Excel to manually format a skeleton document in exactly the way I desired, setting cell formats, columns widths, fonts, etc, then load the document and simply put values into the cells. This technique is highly recommended if you have sheets with a regular and structured layout, such as a report with columns. For other more complex cases though, you'll have to fall back to verbose code for cell formatting.

After abandoning the Open XML option I attempted to use classic COM Interop to manipulate an Excel document. I used COM to read Excel sheets about 10 years ago and I remembered that it was only moderately tricky.

Add a project reference to the Microsoft.Office.Interop.Excel PIA to get access to the COM interfaces and classes for working with Excel documents.

This option is certainly much easier than using Open XML, but there's an initially steep learning curve before you "get the feel of it" and become productive. It took me 6 hours of solid work and intense web searching to create a simple but nicely formatted document. Some of the COM interfaces are strongly-typed and you get helpful intellisense in Visual Studio, but many properties are of type object and there is no way of figuring out what to do without running many web searches in the hope that you will find a working sample. Seemingly simple tasks like setting cell borders, background colours or columns widths requires obscure bits of code that would be nearly impossible to discover without the help of samples from other poor people who have cracked the secrets.

📌 Important Note — COM references must be released or copies of Excel will remain running in your user Process. In the following code you will see careful use of ReleaseComObject to release every COM object. Take special note of the symmetry of the startup and shutdown code. For more information see Release Excel COM Objects.

Following are snippets of useful COM code that took me hours to discover.

Create a new customised Excel document

_Application app = new Application();
Workbooks books = app.Workbooks;
_Workbook book = books.Add();
Sheets sheets = book.Sheets;
_Worksheet sheet = book.ActiveSheet as _Worksheet;
sheet.Name = "Product Versions";
sheet.PageSetup.Orientation = XlPageOrientation.xlLandscape;
app.StandardFontSize = 10;

Create a style

Style headstyle = book.Styles.Add("HeadStyle");
headstyle.HorizontalAlignment = XlHAlign.xlHAlignCenter;
headstyle.Font.Bold = true;
headstyle.Interior.Color = System.Drawing.ColorTranslator.ToOle(System.Drawing.Color.PowderBlue);
// --repeat--

Create a row of 4 heading text cells

Range range = sheet.Range["A1", "D1"];
range.Style = headstyle;
range = sheet.Range["A1"];
range.Cells.NumberFormatLocal = "@";   // This means Text
range.ColumnWidth = 24;    // First column is wider
Marshal.ReleaseComObject(range);
sheet.Cells[1,1] = "Heading A1";
sheet.Cells[1,2] = "B1";
// --etc--

Set some cells as dates

range = sheet.Range["B3", "H3"];
range.Style = footstyle;
Marshal.ReleaseComObject(range);
range.Cells.NumberFormat = "dd-MMM-yyyy";
sheet.Cells[3, 2] = UpdatedTime.ToOADate();

Setting cell borders

Setting cell background colours causes the borders to be overwritten. If you want cell borders around background coloured cell then you have to set the borders manually.

range = sheet.Range["C1", "D12"];
range.Cells.Borders.Color = System.Drawing.ColorTranslator.ToOle(System.Drawing.Color.Gainsboro);
range.Cells.Borders.LineStyle = XlLineStyle.xlContinuous;
Marshal.ReleaseComObject(range);

Graceful shutdown

Finally, the workbook is saved to a file and everything is closed and released (even sheets which wasn't used). I prefer not to set app.Visible = true, but simply use Process.Start to "launch" the file afterwards. Start Task Manager or procexp.exe and ensure that an instance of Excel can be seen starting and then ending when your Interop code runs.

string savename = @"C:\temp\Sample.xlsx";
if (File.Exists(savename)) File.Delete(savename);
book.SaveAs(savename);
book.Close();
app.Quit();
Marshal.ReleaseComObject(sheet);
Marshal.ReleaseComObject(sheets);
Marshal.ReleaseComObject(book);
Marshal.ReleaseComObject(books);
Marshal.ReleaseComObject(app);
Process.Start(savename);

By piecing together bits of the above code it should be possible to create an Excel document that has custom formatting for font size, cell background colours, cell content category types, cell borders.

Handling of cells and ranges is really clumsy and I'm wondering if there's a better way than referencing single cells with the Cells[row,col] syntax and ranges with Range["A1","B2"] format. Mixing the formats is quite irritating.

There are many other appearances and techniques I haven't tried, like text colour, merging cells, font styles (italic, bold, etc). It's pretty likely that just about anything that's possible in Excel can be done via COM Interop as well, but finding out how will surely require meticulous reading of the documentation or hours of web searching for working samples.

Be careful of the order of code. I found that some code would be frustratingly ineffective until I carefully reordered a couple of lines. I was often just shuffling code around until it worked.

I haven't figured out exactly how to mimic the formatting of cells as you do in Excel with the Ctrl+1 dialog. The number formats make little sense at the moment.

Good luck. It's the least worst way of making Excel documents without 3rd party libraries.

No comments:

Post a Comment