Transforming XML into MS Excel XML

MS Excel understands XML?

If you need to export xml to a Microsoft Excel friendly format, you could stress over the HSSF (Horrible Spread Sheet Format, for the uninitiated) format with apache’s POI framework or you could transform your xml into an format Excel understands. This approach will allow you to decorate your cells with stylized fonts and borders; what it will not allow you to do is create or add complex objects like charts, graphs or pictures. This xml format is a watered down version of excel. If you require the ability to embed images, graphs and complex objects, have a look at Apache’s framework.

Alright, Show me some code

Let’s take a look at the xml we’re going to be using:

<Report caption="Reporting">
	<block 	caption="Staff Memeber Report" 
		userIdLabel="User Id" 
		accountNameLabel="Account Name"
		createDateLabel="Date Created"
		emailLabel="Email">

		<staffMember id="00000" 
			accountName="accountName1"
			createDate="2009-01-02" 
			accountEmail="someone1@domain.com"/>
		<staffMember id="00001"
			accountName="accountName2"
			createDate="2009-02-17" 
			accountEmail="someone2@domain.com"/>
		<staffMember id="00002"
			accountName="accountName3"
			createDate="2009-03-14" 
			accountEmail="someone3@domain.com"/>

	</block>
</Report>

Pretty Straight forward xml, optimized for shorter xpath expressions.

The Magic XSL

<?xml version="1.0" encoding="ISO-8859-1"?>
<?mso-application progid="Excel.Sheet"?>
<xsl:stylesheet version="1.0" 
	xmlns:html="http://www.w3.org/TR/REC-html40"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
	xmlns="urn:schemas-microsoft-com:office:spreadsheet"
	xmlns:o="urn:schemas-microsoft-com:office:office" 
	xmlns:x="urn:schemas-microsoft-com:office:excel"
	xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">

	<xsl:template match="/">

		<Workbook>
			<Styles>
				<Style ss:ID="Default" ss:Name="Normal">
					<Alignment ss:Vertical="Bottom" />
					<Borders />
					<Font />
					<Interior />
					<NumberFormat />
					<Protection />
				</Style>
				<Style ss:ID="s21">
					<Font ss:Size="22" ss:Bold="1" />
				</Style>
				<Style ss:ID="s22">
					<Font ss:Size="14" ss:Bold="1" />
				</Style>
				<Style ss:ID="s23">
					<Font ss:Size="12" ss:Bold="1" />
				</Style>
				<Style ss:ID="s24">
					<Font ss:Size="10" ss:Bold="1" />
				</Style>
			</Styles>

			<Worksheet ss:Name="{//Report/@caption}">
				<Table>
					<Column ss:AutoFitWidth="0" ss:Width="85" />
					<Column ss:AutoFitWidth="0" ss:Width="115" />
					<Column ss:AutoFitWidth="0" ss:Width="115" />
					<Column ss:AutoFitWidth="0" ss:Width="160" />
					<Column ss:AutoFitWidth="0" ss:Width="115" />
					<Column ss:AutoFitWidth="0" ss:Width="85" />
					<Column ss:AutoFitWidth="0" ss:Width="85" />
					<Column ss:AutoFitWidth="0" ss:Width="160" />

					<Row ss:AutoFitHeight="0" ss:Height="27.75">
						<Cell ss:StyleID="s21">
							<Data ss:Type="String">Example Spreadsheet</Data>
						</Cell>
					</Row>
					<Row ss:AutoFitHeight="0" ss:Height="18">
						<Cell ss:StyleID="s22">
							<Data ss:Type="String">
								<xsl:value-of select="//Report/@caption" />
							</Data>
						</Cell>
					</Row>
					<Row>
						<Cell>
							<Data ss:Type="String">
							</Data>
						</Cell>
					</Row>

					<xsl:call-template name="staffReport" />


				</Table>
			</Worksheet>

		</Workbook>

	</xsl:template>


	<xsl:template name="staffReport">

		<Row ss:AutoFitHeight="0" ss:Height="18">
			<Cell ss:StyleID="s23">
				<Data ss:Type="String">
					<xsl:value-of select="//Report/block/@caption" />
				</Data>
			</Cell>
		</Row>
		<Row>
			<Cell ss:StyleID="s24">
				<Data ss:Type="String">
					<xsl:value-of select="//Report/block/@userIdLabel" />
				</Data>
			</Cell>
			<Cell ss:StyleID="s24">
				<Data ss:Type="String">
					<xsl:value-of select="//Report/block/@accountNameLabel" />
				</Data>
			</Cell>
			<Cell ss:StyleID="s24">
				<Data ss:Type="String">
					<xsl:value-of select="//Report/block/@createDateLabel" />
				</Data>
			</Cell>
			<Cell ss:StyleID="s24">
				<Data ss:Type="String">
					<xsl:value-of select="//Report/block/@emailLabel" />
				</Data>
			</Cell>
		</Row>

		<xsl:for-each select="//Report/block/staffMember">

			<Row>
				<Cell>
					<Data ss:Type="String">
						<xsl:value-of select="@id" />
					</Data>
				</Cell>
				<Cell>
					<Data ss:Type="String">
						<xsl:value-of select="@accountName" />
					</Data>
				</Cell>
				<Cell>
					<Data ss:Type="String">
						<xsl:value-of select="@createDate" />
					</Data>
				</Cell>
				<Cell>
					<Data ss:Type="String">
						<xsl:value-of select="@accountEmail" />
					</Data>
				</Cell>
			</Row>

		</xsl:for-each>
	</xsl:template>

</xsl:stylesheet>

The overall XSL structure is pretty much the same as any other XSL. I broke up the report into two main components: the generic, enclosing, Workbook xsl, and the main staffMember xsl template. The enclosing Workbook xsl has the report metadata and sets up the overall layout while the staffMember template loops through the staffMember xml nodes, outputting one row of data per node.

Styled Text

Let’s take a look at the styles mechanism:

<Styles>
	<Style ss:ID="Default" ss:Name="Normal">
		<Alignment ss:Vertical="Bottom" />
		<Borders />
		<Font />
		<Interior />
		<NumberFormat />
		<Protection />
	</Style>
	<Style ss:ID="s21">
		<Font ss:Size="22" ss:Bold="1" />
	</Style>
	...
</Styles>

Notice there is a “Defualt” style, which offers a venue to lay out default styles for all your cells. Then you have unique style definitions like ss:ID=”s21″ which define a font size and weight:

<Font ss:Size="22" ss:Bold="1" />

Size is measured in Points, so take that into account as you determine the size you would like to use. The Bold=”1″ flags the style to render as Bold weight, as oppose to regular, non bold which would be Bold=”0″. If you wanted to change the font you could add ss:FontName=”Tahoma”. A particular style is linked to a cell by adding the style ID as a cell attribute like this:

<Cell ss:StyleID="s22">
	<Data ss:Type="String">some stylized text</Data>
</Cell>

where the ss:StyleID matches the style definition’s ss:ID.

Sizing Columns

Note that you can add multiple Worksheets – all you need to do is add more Worksheet XML nodes, and stick data in them. You can initialize the starting column widths by using the Column nodes under the Table node:

<Column ss:AutoFitWidth="0" ss:Width="85" />
<Column ss:AutoFitWidth="0" ss:Width="115" />

If AutoFitWidth is set to true, it will auto size the columns to whatever appropriate width the numeric or date values consume. Text is not automagically resized. When it’s flagged to 0, and a Width is specified, it will resize to whatever Width is set to. When set to true (1), and a Width is present it will set the width to the specified value, and auto size if the cell data is larger than the Width.

Simple Formulas

You can also embed Excel formulas as part of the XSL so your spreadsheet can come pre-wired with formulas. I didnt include any in this example but I’ll go over an example snippet of code:

<Cell ss:Index="2" ss:Formula="=SUM(R[-3]C,R[-2]C,R[-1]C)">
	<Data ss:Type="Number"></Data>
</Cell>

ss:Formula=”=SUM(R[-3]C,R[-2]C,R[-1]C)” might look a little strange, since you’re probably used to the =SUM(A12,A13,A14) type of notation used from the nomal gui. The XML notation is merely a mechanism for locating which cells to add up in this particular sum. R corresponds to the relative row, and C corresponds to the relative column. So, R[-3] means the row 3 spaces above the current cell, and C means the current cell (since there is no “[x]” notation). If we wanted to include the cell 2 rows down, and 4 columns to the left we could express that as R[2]C[-4]. Simple x/y coordinates. For more on formulas, have a closer look at Microsoft’s ss:Cell documentation.

The Rendered Spreadsheet

That’s pretty much all there is to it. The xml isn’t perfect, but its definitely more presentable than regular csv files without getting in the way for anyone that needs to work with the actual data. Here’s a screen shot for the aetheists:

xml rendered for excel

XML rendered as MS Excel output via xslt

Source Files
report.xml
report.xsl
rendered.xml (change extension to .xml, and open with MS Excel)

Resources
Microsoft overview on Excel XML structure
Microsoft XML Node reference
Wikipedia Article on Office XML formats. Yep Word also has an XML format.

Sidenote:

When looking at the MS Excel documentation be aware that they didn’t declare:

xmlns="urn:schemas-microsoft-com:office:spreadsheet"

but instead

xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"

So their Workbook xsl has ss: preceding every node, when compared to my workbook xsl.

Comments (5)

  1. 9:53 AM, March 7, 2012Dileep  / Reply

    I am using the same code for testing but i am getting below error. may i know what was the problem

    Post processing starting…
    [INFO] Using org.apache.xerces.parsers.SAXParser as SAX2 Parser
    [INFO] FOP 0.20.5rc3a
    [INFO] Using org.apache.xerces.parsers.SAXParser as SAX2 Parser
    [INFO] building formatting object tree
    [INFO] setting up fonts
    [ERROR] Unsupported element encountered: Workbook (Namespace: urn:schemas-microsoft-com:office:spreadsheet). Source context: file:/C:/DOCUME~1/dileep1/LOCALS~1/Temp/pro14F.xml (line: 2, col: 264)
    [ERROR] Expected XSL-FO (root, page-sequence, etc.), SVG (svg, rect, etc.) or elements from another supported language.
    [ERROR] file:/C:/DOCUME~1/dileep1/LOCALS~1/Temp/pro14F.xml:2:264 Root element must be root, not urn:schemas-microsoft-com:office:spreadsheet:Workbook
    …post processing finished

    • 1:02 PM, July 25, 2012Ant  / Reply

      “Root element must be root, not urn:schemas-microsoft-com:office:spreadsheet:Workbook”

      Looks like the xml isn’t properly formatted. The root elementas indicated must be Root. Perhaps post your xml?

  2. 7:32 AM, August 27, 2012Dip beit  / Reply

    Exactly what i was looking for Antonio. Thanks a million.
    Just one more help needed. I need to insert an image to my excel.
    Can you please guide me through?
    Being a newbie to this, i am not able to find the apache framework resources you were refering to.

  3. 9:29 AM, January 28, 2014Ebin  / Reply

    your explanation of simple formulas made my day… 🙂 Thanks for the article.. was searching for a solution of row, column summation..

Leave a Reply

Allowed Tags - You may use these HTML tags and attributes in your comment.

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>