
Transforming XML into MS Excel XML

MS Excel understands XML?

If you need to export xml to a Microsoft Excel friendly format, you could stress over the HSSF (Horrible Spread Sheet Format, for the uninitiated) format with Apache’s POI framework, or you could transform your xml into a format Excel understands. This approach will allow you to decorate your cells with stylized fonts and borders; what it will not allow you to do is create or add complex objects like charts, graphs or pictures. This xml format is a watered-down version of Excel’s native format. If you require the ability to embed images, graphs and complex objects, have a look at Apache’s framework.

Alright, Show me some code

Let’s take a look at the xml we’re going to be using:

<Report caption="Reporting">
	<block caption="Staff Member Report"
		userIdLabel="User Id" 
		accountNameLabel="Account Name"
		createDateLabel="Date Created"
		emailLabel="Email">

		<staffMember id="00000" 
			accountName="accountName1"
			createDate="2009-01-02" 
			accountEmail="someone1@domain.com"/>
		<staffMember id="00001"
			accountName="accountName2"
			createDate="2009-02-17" 
			accountEmail="someone2@domain.com"/>
		<staffMember id="00002"
			accountName="accountName3"
			createDate="2009-03-14" 
			accountEmail="someone3@domain.com"/>

	</block>
</Report>

Pretty straightforward xml, optimized for shorter xpath expressions.

The Magic XSL

<?xml version="1.0" encoding="ISO-8859-1"?>
<?mso-application progid="Excel.Sheet"?>
<xsl:stylesheet version="1.0" 
	xmlns:html="http://www.w3.org/TR/REC-html40"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
	xmlns="urn:schemas-microsoft-com:office:spreadsheet"
	xmlns:o="urn:schemas-microsoft-com:office:office" 
	xmlns:x="urn:schemas-microsoft-com:office:excel"
	xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">

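	<xsl:template match="/">

		<!-- Processing instructions written in a stylesheet are not copied to the output, so emit this one explicitly -->
		<xsl:processing-instruction name="mso-application">progid="Excel.Sheet"</xsl:processing-instruction>
		<Workbook>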
			<Styles>
				<Style ss:ID="Default" ss:Name="Normal">
					<Alignment ss:Vertical="Bottom" />
					<Borders />
					<Font />
					<Interior />
					<NumberFormat />
					<Protection />
				</Style>
				<Style ss:ID="s21">
					<Font ss:Size="22" ss:Bold="1" />
				</Style>
				<Style ss:ID="s22">
					<Font ss:Size="14" ss:Bold="1" />
				</Style>
				<Style ss:ID="s23">
					<Font ss:Size="12" ss:Bold="1" />
				</Style>
				<Style ss:ID="s24">
					<Font ss:Size="10" ss:Bold="1" />
				</Style>
			</Styles>

			<Worksheet ss:Name="{//Report/@caption}">
				<Table>
					<Column ss:AutoFitWidth="0" ss:Width="85" />
					<Column ss:AutoFitWidth="0" ss:Width="115" />
					<Column ss:AutoFitWidth="0" ss:Width="115" />
					<Column ss:AutoFitWidth="0" ss:Width="160" />
					<Column ss:AutoFitWidth="0" ss:Width="115" />
					<Column ss:AutoFitWidth="0" ss:Width="85" />
					<Column ss:AutoFitWidth="0" ss:Width="85" />
					<Column ss:AutoFitWidth="0" ss:Width="160" />

					<Row ss:AutoFitHeight="0" ss:Height="27.75">
						<Cell ss:StyleID="s21">
							<Data ss:Type="String">Example Spreadsheet</Data>
						</Cell>
					</Row>
					<Row ss:AutoFitHeight="0" ss:Height="18">
						<Cell ss:StyleID="s22">
							<Data ss:Type="String">
								<xsl:value-of select="//Report/@caption" />
							</Data>
						</Cell>
					</Row>
					<Row>
						<Cell>
							<Data ss:Type="String">
							</Data>
						</Cell>
					</Row>

					<xsl:call-template name="staffReport" />


				</Table>
			</Worksheet>

		</Workbook>

	</xsl:template>


	<xsl:template name="staffReport">

		<Row ss:AutoFitHeight="0" ss:Height="18">
			<Cell ss:StyleID="s23">
				<Data ss:Type="String">
					<xsl:value-of select="//Report/block/@caption" />
				</Data>
			</Cell>
		</Row>
		<Row>
			<Cell ss:StyleID="s24">
				<Data ss:Type="String">
					<xsl:value-of select="//Report/block/@userIdLabel" />
				</Data>
			</Cell>
			<Cell ss:StyleID="s24">
				<Data ss:Type="String">
					<xsl:value-of select="//Report/block/@accountNameLabel" />
				</Data>
			</Cell>
			<Cell ss:StyleID="s24">
				<Data ss:Type="String">
					<xsl:value-of select="//Report/block/@createDateLabel" />
				</Data>
			</Cell>
			<Cell ss:StyleID="s24">
				<Data ss:Type="String">
					<xsl:value-of select="//Report/block/@emailLabel" />
				</Data>
			</Cell>
		</Row>

		<xsl:for-each select="//Report/block/staffMember">

			<Row>
				<Cell>
					<Data ss:Type="String">
						<xsl:value-of select="@id" />
					</Data>
				</Cell>
				<Cell>
					<Data ss:Type="String">
						<xsl:value-of select="@accountName" />
					</Data>
				</Cell>
				<Cell>
					<Data ss:Type="String">
						<xsl:value-of select="@createDate" />
					</Data>
				</Cell>
				<Cell>
					<Data ss:Type="String">
						<xsl:value-of select="@accountEmail" />
					</Data>
				</Cell>
			</Row>

		</xsl:for-each>
	</xsl:template>

</xsl:stylesheet>

The overall XSL structure is pretty much the same as any other XSL. I broke up the report into two main components: the generic, enclosing Workbook template, and the main staffReport template. The enclosing Workbook template has the report metadata and sets up the overall layout, while the staffReport template loops through the staffMember xml nodes, outputting one row of data per node.
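If you want to run the transformation from Java, the JDK’s built-in javax.xml.transform API should be enough. Here’s a minimal sketch, not the setup behind the source files: the xml and xsl strings are cut-down stand-ins I made up for report.xml and report.xsl, and the ExcelXmlTransform class name is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ExcelXmlTransform {

    // Cut-down stand-in for report.xml (hypothetical sample data).
    static final String REPORT_XML =
        "<Report caption=\"Reporting\"><block>"
        + "<staffMember id=\"00000\" accountName=\"accountName1\"/>"
        + "<staffMember id=\"00001\" accountName=\"accountName2\"/>"
        + "</block></Report>";

    // Cut-down stand-in for report.xsl: one Row per staffMember node.
    static final String REPORT_XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\""
        + " xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\">"
        + "<xsl:template match=\"/\">"
        + "<Workbook><Worksheet ss:Name=\"{/Report/@caption}\"><Table>"
        + "<xsl:for-each select=\"/Report/block/staffMember\">"
        + "<Row>"
        + "<Cell><Data ss:Type=\"String\"><xsl:value-of select=\"@id\"/></Data></Cell>"
        + "<Cell><Data ss:Type=\"String\"><xsl:value-of select=\"@accountName\"/></Data></Cell>"
        + "</Row>"
        + "</xsl:for-each>"
        + "</Table></Worksheet></Workbook>"
        + "</xsl:template></xsl:stylesheet>";

    /** Applies the stylesheet to the xml and returns the result as a string. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers don't have to deal with a checked exception.
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(REPORT_XML, REPORT_XSL));
    }
}
```

In a real project you’d point the StreamSource objects at the actual xml and xsl files instead of inline strings, and write the StreamResult to a file Excel can open.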

Styled Text

Let’s take a look at the styles mechanism:

<Styles>
	<Style ss:ID="Default" ss:Name="Normal">
		<Alignment ss:Vertical="Bottom" />
		<Borders />
		<Font />
		<Interior />
		<NumberFormat />
		<Protection />
	</Style>
	<Style ss:ID="s21">
		<Font ss:Size="22" ss:Bold="1" />
	</Style>
	...
</Styles>

Notice there is a “Default” style, which offers a place to lay out default styles for all your cells. Then you have unique style definitions like ss:ID="s21" which define a font size and weight:

<Font ss:Size="22" ss:Bold="1" />

Size is measured in points, so take that into account as you determine the size you would like to use. ss:Bold="1" flags the style to render in bold weight, as opposed to regular, non-bold text, which would be ss:Bold="0". If you wanted to change the font you could add ss:FontName="Tahoma". A particular style is linked to a cell by adding the style ID as a cell attribute like this:

<Cell ss:StyleID="s22">
	<Data ss:Type="String">some stylized text</Data>
</Cell>

where the ss:StyleID matches the style definition’s ss:ID.

Sizing Columns

Note that you can add multiple Worksheets – all you need to do is add more Worksheet XML nodes, and stick data in them. You can initialize the starting column widths by using the Column nodes under the Table node:

<Column ss:AutoFitWidth="0" ss:Width="85" />
<Column ss:AutoFitWidth="0" ss:Width="115" />

If AutoFitWidth is set to true (1), Excel auto sizes the column to fit numeric and date values only; text is not automagically resized. When it’s set to 0 and a Width is specified, the column is simply set to that Width. When it’s set to 1 and a Width is present, the column starts at the specified Width and only auto sizes if the cell data is larger than that Width.

Simple Formulas

You can also embed Excel formulas as part of the XSL so your spreadsheet can come pre-wired with formulas. I didn’t include any in this example, but I’ll go over an example snippet of code:

<Cell ss:Index="2" ss:Formula="=SUM(R[-3]C,R[-2]C,R[-1]C)">
	<Data ss:Type="Number"></Data>
</Cell>

ss:Formula="=SUM(R[-3]C,R[-2]C,R[-1]C)" might look a little strange, since you’re probably used to the =SUM(A12,A13,A14) type of notation used in the normal GUI. The XML notation is merely a mechanism for locating which cells to add up in this particular sum, relative to the cell holding the formula. R corresponds to the relative row, and C corresponds to the relative column. So, R[-3] means the row 3 spaces above the current cell, and C means the current column (since there is no “[x]” notation). If we wanted to include the cell 2 rows down and 4 columns to the left, we could express that as R[2]C[-4]. Simple x/y coordinates. For more on formulas, have a closer look at Microsoft’s ss:Cell documentation.
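To make the relative notation concrete, here’s a small sketch that resolves a relative reference into the familiar A1 style. The R1C1 class and its methods are hypothetical helpers I wrote for illustration, not part of Excel or any library, and they only handle the relative forms discussed here (no absolute references like R1C1 or ranges).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class R1C1 {

    // Matches relative references such as R[-3]C, RC[2] or R[2]C[-4].
    private static final Pattern RELATIVE =
        Pattern.compile("R(?:\\[(-?\\d+)\\])?C(?:\\[(-?\\d+)\\])?");

    /** Resolves a relative reference against the 1-based row and column of the formula cell. */
    static String toA1(String ref, int currentRow, int currentCol) {
        Matcher m = RELATIVE.matcher(ref);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a relative R1C1 reference: " + ref);
        }
        // A missing [x] part means "no offset", i.e. the current row or column.
        int row = currentRow + (m.group(1) == null ? 0 : Integer.parseInt(m.group(1)));
        int col = currentCol + (m.group(2) == null ? 0 : Integer.parseInt(m.group(2)));
        if (row < 1 || col < 1) {
            throw new IllegalArgumentException("Reference falls off the sheet: " + ref);
        }
        return columnLetters(col) + row;
    }

    /** Converts a 1-based column number to spreadsheet letters: 1 -> A, 27 -> AA. */
    static String columnLetters(int col) {
        StringBuilder letters = new StringBuilder();
        while (col > 0) {
            col--; // shift to 0-based so that 26 maps to Z rather than rolling over
            letters.insert(0, (char) ('A' + col % 26));
            col /= 26;
        }
        return letters.toString();
    }

    public static void main(String[] args) {
        // Assuming the formula cell sits at A15 (row 15, column 1).
        System.out.println(toA1("R[-3]C", 15, 1));
        System.out.println(toA1("R[2]C[-4]", 10, 6));
    }
}
```

With the formula cell assumed to be at A15, R[-3]C, R[-2]C and R[-1]C resolve to A12, A13 and A14, which lines up with the =SUM(A12,A13,A14) notation from the GUI.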

The Rendered Spreadsheet

That’s pretty much all there is to it. The xml isn’t perfect, but it’s definitely more presentable than regular csv files without getting in the way for anyone that needs to work with the actual data. Here’s a screen shot for the atheists:

[Screenshot: XML rendered as MS Excel output via xslt]

Source Files
report.xml
report.xsl
rendered.xml (change extension to .xml, and open with MS Excel)

Resources
Microsoft overview on Excel XML structure
Microsoft XML Node reference
Wikipedia Article on Office XML formats. Yep Word also has an XML format.

Sidenote:

When looking at the MS Excel documentation be aware that they didn’t declare:

xmlns="urn:schemas-microsoft-com:office:spreadsheet"

but instead

xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"

So their Workbook xsl has ss: preceding every node, when compared to my workbook xsl.

Java, XML and XStream

What’s an object/xml serializing/deserializing library?

If you’ve never worked with an object/xml serializer and are considering writing your own from scratch, you may want to consider using a library like XStream. XStream is very good at moving java into xml and back. It allows a high level of control over how the xml can be organized and structured and even allows the user to create their own converters for even more flexibility.

But still, why use something like this when you can be perfectly happy writing your own data conversion scheme? The problem really boils down to flexibility, and to not reinventing the wheel. Ninety percent of the time you’re already interrogating a datasource (some RDBMS like Oracle, Postgres or MySQL) and will be using some kind of TransferObject or maybe an Entity persistence scheme built around pojos. If you write your own serializing engine from scratch by mapping pojos to dom4j nodes, constructing Document objects and then using them for stuff like xsl transformations, you end up missing out on a great tool.

It may not seem obvious right now, but a homegrown serializer is the kind of thing you write once and forget about; then months or years down the line, when it comes time to update your data model or expand its framework, you end up rebuilding all the dom4j stuff from scratch. Unless you take the lazy route and append any new xml to the root node to save yourself the entire node refactor. Simple objects with one or two simple nested objects may not seem like much, but if your object becomes anything close to a complex lattice, then going back and tweaking the entire structure when you want to expand or refactor your xml can become quite perilous. Especially if you want to make your xml as xpath friendly as possible.

Edit:
As Felipe Gaucho has been kind enough to point out, XStream only writes a text string as the serialized object. It will not perform any validation on your XML, so you’re left on your own to validate it post serialization. Something like JAXP comes to mind to tackle XSD based validation, or JiBX if you’re looking for Data Binding.

So what does XStream do for me?

Consider these objects:

 public class MyClass {

	protected MyObject object;
	
}
public class MyObject {

	protected ArrayList Field;
	
}

XStream lets you do something like this if you want to serialize an object like MyClass to xml:

 XStream xstream = new XStream();
String myClassXML= xstream.toXML(myClassObject);

and if you want to go from xml back to a java object you can do this:

XStream xstream = new XStream();
// fromXML returns Object, so cast back to the expected type
MyClass myClassObject = (MyClass) xstream.fromXML(myClassXML);

As you can see, all the plumbing goes away and you are now free to concentrate on writing the rest of your application. And if you want to change your object model, consolidate nodes or rearrange the structure of your xml, all you have to do is update your pojo and your xml will immediately reflect the changes in the data model on serialization.

It should be noted that to completely deserialize xml, your object needs to correctly map all the data in the xml. If you have trouble deserializing try building a mock object and populating it with sample values and then serialize it to xml; then you can compare the test xml to what your actual xml is and make your changes.

Aliasing

XStream does not require any configuration, although the xml produced out of the box will likely not be the easiest to read. It will serialize objects into xml nodes according to their package names, usually making them very long as we can see from the following example:

<com.package.something.MyClass>
	<com.package.something.MyObject>
		<List>
			<com.package.something.Field/>
			<com.package.something.Field/>
		</List>
	</com.package.something.MyObject>
</com.package.something.MyClass>

Luckily XStream has a mechanism we can use to alias these long package names. It goes something like this:

XStream xstream = new XStream();
xstream.alias("MyClass", MyClass.class);
xstream.alias("MyObject", MyObject.class);
xstream.alias("Field", Field.class);

Adding an alias like this will let your xml come across nice and neat like this:

 <MyClass>
	<MyObject>
		<List>
			<Field/>
			<Field/>
		</List>
	</MyObject>
</MyClass>

Attributes

If you want to make a regular text node an attribute, you can use this call to configure it:

 xstream.useAttributeFor(Field.class, "name");

This will change your xml from this:

 <MyClass>
	<MyObject>
		<List>
			<Field>
				<name>foo</name>
			</Field>
			<Field/>
		</List>
	</MyObject>
</MyClass>

into

 <MyClass>
	<MyObject>
		<List>
			<Field name="foo"/>
			<Field/>
		</List>
	</MyObject>
</MyClass>

ArrayList (implicit collections)

ArrayLists are a little trickier. This is what they look like out of the box:

 ...
	<MyObject>
		<List>
			<Field/>
			<Field/>
		</List>
	</MyObject>
...

Note there’s an extra “List” node enclosing the List elements named “Field”. If we want to get rid of that node so that Field is right under MyObject, we could tell XStream to map an implicit collection by doing the following:

 xstream.addImplicitCollection(MyObject.class, "Field", "Field", Field.class);

where the addImplicitCollection method signature is the following:

 /**
	 * Appends an implicit collection to an object for serialization
	 * 
	 * @param ownerType - class owning the implicit collection (class owner)
	 * @param fieldName - name of the field in the ownerType (Java field name)
	 * @param itemFieldName - name of the implicit collection (XML node name)
	 * @param itemType - item type to be aliased by the itemFieldName (class owned)
	 */
	public void addImplicitCollection(Class ownerType,
            String fieldName,
            String itemFieldName,
            Class itemType) 

Adding this implicit collection configuration will streamline the xml so that it looks like this now:

 
<MyClass>
	<MyObject>
		<Field/>
		<Field/>
	</MyObject>
</MyClass>

Notice the “List” node is gone, and “Field” is now directly under “MyObject”. You can find the complete documentation on the XStream website here.

There are plenty more tricks you can use to configure/format your xml, and there are plenty of examples listed on the XStream website, but these three points here should cover the basics to get you started.

5 ways to make XML more XPath friendly

As java developers we should always do what we can to optimize outbound xml from our side of the fence. I mean, it’s our job to build and design awesome, elegant and efficient software whenever possible, right? We have our data and we want to serialize it into xml; how can we make our xml as efficient and xpath friendly as possible?

1) Keep the structure simple

Consolidate data nodes whenever possible before marshaling your object into xml. You really don’t want to have to resort to using xpath for any unnecessary lookups across nodes. Keep those lookups confined to the persistence layer where they belong. I realize this may not always be possible, but keeping it to a minimum will really help the xsl developers by not forcing them to create ridiculous xpath expressions in order to link discretely separate data nodes. We want to keep the xpath logic as simple as possible.

Consider the following xml document:

<root>
	<house id="1" color="white">
		<room name="livingRoom" houseId="1" windows="3"/>
		<room name="kitchen" houseId="1" windows="2"/>
		<room name="bedroom" houseId="1" windows="4"/>
		<room name="bathroom" houseId="1" windows="1"/>
	</house>
</root>

If this is how your xml is structured, and you wanted to transform this data with xsl you are creating all kinds of extra work and could end up causing an unnecessary performance bottleneck for your application. If you wanted to do something like lay out the total number of rooms for the house with id of 1, your xpath would have to look something like this:

count(/root/house[@id=1]/room)

You are now making your xsl developer implement logic to count all the rooms for a particular house node, using an xpath function and selector conditional logic to filter a global count. Just because you can use the count function does not mean you should use it every chance you get. This xpath expression will traverse your xml, find the house node whose id is 1, and return the total number of its room nodes. It’s not much of a problem if your xml is one or two house nodes deep, but what if you had 10, 20 or even 30 house nodes or more? If you are processing numbers like these, and then you span these node traversals across say a hundred requests, you would be doing something like 3,000 traversals. What if instead you used an xml structure like this:

<root>
	<house id="1"  totalRooms="4" color="white">
		<room name="livingRoom" windows="3"/>
		<room name="kitchen" windows="2"/>
		<room name="bedroom" windows="4"/>
		<room name="bathroom" windows="1"/>
	</house>
</root>

In this example we attached the room count to the house node as an attribute. This way, our xpath expression ends up looking like this:

/root/house/@totalRooms

No count function, no selector conditional logic to filter; you end up with a simple, basic xpath expression. You’re doing a single lookup instead of an entire 3,000 node traversal while collecting a node count that has to be calculated as the transformation is processing. Let the data/persistence layer perform those types of counts and populate all the data, and let your xsl/xpath lay out your data. Keep the structure simple. If this is not possible, you might be doing something wrong.
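Here’s a rough sketch of the two lookups side by side, using the XPath support that ships with the JDK. The RoomCount class and the inline xml strings are made up for illustration; both lookups come back with the same number, but the second one only has to read an attribute that the data layer already filled in.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class RoomCount {

    private static final String ROOMS =
        "<room name=\"livingRoom\" windows=\"3\"/><room name=\"kitchen\" windows=\"2\"/>"
        + "<room name=\"bedroom\" windows=\"4\"/><room name=\"bathroom\" windows=\"1\"/>";

    // No precomputed total: the xpath has to count the room nodes itself.
    static final String WITHOUT_TOTAL =
        "<root><house id=\"1\" color=\"white\">" + ROOMS + "</house></root>";

    // Total attached to the house node before the xml was serialized.
    static final String WITH_TOTAL =
        "<root><house id=\"1\" totalRooms=\"4\" color=\"white\">" + ROOMS + "</house></root>";

    /** Evaluates the expression against the xml and converts the result to a number. */
    static double number(String xml, String expression) {
        try {
            return (Double) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(number(WITHOUT_TOTAL, "count(/root/house[@id=1]/room)"));
        System.out.println(number(WITH_TOTAL, "/root/house/@totalRooms"));
    }
}
```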

2) Avoid namespaces whenever possible

Namespaces are important, yes. But if you are trying to express a java object as xml, prefer using a different name altogether over attaching namespaces. This really comes down to creating a more specific, descriptive naming scheme for your application’s objects. If you are using namespaces just for the heck of it, I’d urge you not to. You end up adding lots of unnecessary noise to your xpath, and having those colons all over the place in your xml can make the document look like it’s been processed through a cheese grater. Anyone trying to read your xml will cringe at the thought and will want to pass it off to the new guy as part of the hazing ritual. Consider the following xml:

<root>
	<myHouseNamespace:house id="1"  totalRooms="4" color="white">
		<myHouseNamespace:room name="livingRoom" windows="3">
			<myHouseNamespace:furniture name="sofa" type="leather"/>
			<myHouseNamespace:furniture name="table"/>
			<myHouseNamespace:furniture name="lamp"/>
		</myHouseNamespace:room>
		<myHouseNamespace:room name="kitchen" windows="2"/>
		<myHouseNamespace:room name="bedroom" windows="4"/>
		<myHouseNamespace:room name="bathroom" windows="1"/>
	</myHouseNamespace:house>
</root>

This makes your xpath look like this:

/root/myHouseNamespace:house/myHouseNamespace:room[@name='livingRoom']/@windows

I don’t know about you, but this xpath expression is hard to read with all the “myHouseNamespace:” junk all over the place. And we only went 2 nodes deep into the tree. A third level down would have marched the xpath expression across the width of this blog! Who loves mile long xpath expressions that add side scrollbars to your text editor on your widescreen monitor? NO ONE.
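The noise doesn’t stop at the expression itself. Whoever evaluates a prefixed xpath also has to tell the processor what the prefix means. Here’s a minimal sketch with the JDK’s XPath API; the urn:example:house URI and the NamespacedLookup class are assumptions of mine, since the sample xml never declares what myHouseNamespace is bound to.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespacedLookup {

    // Hypothetical namespace URI; the sample xml does not say what the prefix maps to.
    static final String HOUSE_URI = "urn:example:house";

    static final String XML =
        "<root xmlns:myHouseNamespace=\"" + HOUSE_URI + "\">"
        + "<myHouseNamespace:house id=\"1\">"
        + "<myHouseNamespace:room name=\"livingRoom\" windows=\"3\"/>"
        + "</myHouseNamespace:house></root>";

    /** Evaluates the expression against XML with the myHouseNamespace prefix bound. */
    static String evaluate(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default, and prefixes won't resolve without it
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return "myHouseNamespace".equals(prefix) ? HOUSE_URI : XMLConstants.NULL_NS_URI;
                }
                @Override
                public String getPrefix(String uri) {
                    return null; // not needed for evaluating expressions
                }
                @Override
                public Iterator<String> getPrefixes(String uri) {
                    return null; // not needed for evaluating expressions
                }
            });
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate(
            "/root/myHouseNamespace:house/myHouseNamespace:room[@name='livingRoom']/@windows"));
        // The un-prefixed version matches nothing, because house and room live in a namespace.
        System.out.println("[" + evaluate("/root/house/room/@windows") + "]");
    }
}
```

That’s a namespace-aware parser flag plus a NamespaceContext implementation, just to read one attribute.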

3) Use attributes wherever appropriate

There is really no difference between using an attribute and a single text node to express a simple value in xml. In other words, there is nothing different between this:

<root>
	<house>
		<id>1</id>
		<totalRooms>4</totalRooms>
		<color>white</color>
	</house>
</root>

and this:

<root>
	<house id="1"  totalRooms="4" color="white"/>
</root>

So why prefer attributes? Because it makes your xml document legible. Having a ton of single child text nodes adds a lot of noise to your document, especially if they hold digits or single word values. It becomes easy to mix up your text values with real, slightly more complex nodes of data in your xml tree. Readability is important, and keeping your xml as efficiently expressed as possible without sacrificing readability is paramount to making it understandable and easier to work with.

4) Clearly name your xml nodes

Use names that actually describe the data being represented. Stay away from these kinds of names:

<root>
	<h id="1"  totalRooms="4" color="white">
		<r name="livingRoom" windows="3">
			<f name="sofa" type="leather"/>
			<f name="table"/>
			<f name="lamp"/>
		</r>
		<r name="kitchen" windows="2"/>
		<r name="bedroom" windows="4"/>
		<r name="bathroom" windows="1"/>
	</h>
</root>

Sure, your xpath will be much shorter :

/root/h/r[@name='livingRoom']/f[@name='sofa']/@type

But unless you’re very, very comfortable with the single letter naming convention, you might end up having a hard time keeping track of all the nodes since they’re small and easy to overlook. Descriptive, concise names help make your xml easier to learn or come back to if the data is named in a clear, self explanatory way. Ideally, your xml should have names that the untrained eye should be able to pick up, and make sense of the basic structure with little preparation.

5) Make your xml as human readable as possible

This point encapsulates all the others. XML is a language meant to be tied very closely to data, and our ability to understand that data will allow us as developers to mold it into whatever we want it to be. If you’ve ever had to sit through a very complex piece of xml, you’ll realize that forcing anyone to have to muck through an unconventional structure and non-intuitive names ends up breaking up the xslt development into three phases:

1) figuring out/understanding the xml
2) implementing the xsl
3) figuring out/understanding the xml, and what was just implemented

The longer the developer has to sit there and meddle with #1, and #3, the more time is lost in bringing your product to the next release. We want to spend as little time as possible figuring out or compensating for poorly structured xml so the real implementation work can be completed, and we can move on to the next thing.

In other words

The bottom line is if you structure your xml so that it’s easy to understand by humans, and it doesn’t cut corners by passing data lookups or counts to the xsl developer, your application will become much more efficient, well written, and easier to work with. This is a good place to exercise the separation of concerns principle: let the data/persistence layer do what it does best, and let the xml/xslt layer do what it does best. XSLT is a relatively expensive process, but even with infinite computing resources we should always strive to make the most out of whatever resources we can allocate.

Basic Ant scripts

What’s an Ant script? Do I need bug spray?

Ant is a scripting tool commonly used to build, compile and deploy projects. This is in no way an all encompassing inventory of what Ant can do. It is extensible and its instructions are expressed in an xml format whose nodes comprise a framework of abilities designed to make menial tasks automated.

From a developer’s perspective, the most basic Ant tasks are the compile, package and copy tasks. All java projects must do these three things many, many, many times during a development cycle. It’s very boring and tedious if you do it by hand through the command prompt. First you’d have to type in a bunch of commands like “javac [ options ] [ sourcefiles ] [ @argfiles ]” detailing all the class paths you want to use, all the source files, and then all the other supporting parameters you need to enter to get it to compile your program correctly. If you’re only writing one class, it’s probably not that bad. But when you have hundreds of classes, multiple projects and dependencies, and a slew of directories to configure and lay out for compiling, it quickly becomes ridiculous. In fact, I would claim that it becomes ridonkulous.

Ant lets you define tasks that break up these chores into a series of chained events. An Ant build script is broken up into what are called “targets”. Each target is meant to perform one unit of work. If we break up the compilation/deploy process it could look something like this:

  1. clean out the scrub/temporary directory
  2. compile all the java files into class files
  3. package up all the class files into a jar file, or some kind of deployable artifact
  4. copy the new jar file to the deploy directory

We can define each one of these steps as an Ant target. This is a good thing because it allows us to chain them like stairs, one target leading into the next. If any one of the targets fails, the script fails and Ant tells us where the problem happened (with line number, and the exact problem or exception).

Here are what these steps might look like:

1) Clean up the build directories

<!-- *************** -->
<!-- * Preparation * -->
<!-- *************** -->

<target name="prepare" depends="clean">
	<mkdir dir="${build.dir}"/>
	<mkdir dir="${build.dir}/jars"/>
	<mkdir dir="${build.dir}/openscope"/>
</target>

Here, the “clean” depends attribute references a previous ant target that deletes all these scrub directories. This “prepare” target creates the scrub directories we’re going to use in our build. mkdir creates a directory.

2) Compile all the java files into class files

<!-- *************** -->
<!-- * Compilation * -->
<!-- *************** -->	

<target name="compile" depends="prepare">
	<javac destdir="${build.dir}/openscope"
			debug="on"
			deprecation="on"
			optimize="off">
		<src path="${src.dir}"/>
		<classpath refid="build.classpath"/>
	</javac>

	<copy todir="${build.dir}/openscope">
		<fileset dir="${props.dir}">
			<include name="*.properties"/>
		</fileset>
	</copy>
</target>

Ant compiles things with the “javac” task. It takes a few parameters and optional flags we can use to customize the actual compile command. This target also copies any properties files into the scrub directory.

3) Package up all the class files into a jar file, or some kind of deployable artifact

<!-- *************** -->
<!-- *   Building  * -->
<!-- *************** -->

<!-- Package the logic module -->
<target name="package-logic" depends="compile">
	<jar jarfile="${build.dir}/jars/${logic.file}">
		<fileset dir="${build.dir}/openscope">
			<include name="com/openscope/**"/>
			<include name="*.properties"/>
		</fileset>

		<metainf dir="${resources.dir}">
			<include name="persistence.xml"/>
		</metainf>
	</jar>

	<copy todir="${basedir}/deploy/${ear.file}">
		<fileset dir="${build.dir}/jars">
			<include name="${logic.file}"/>
		</fileset>
	</copy>
</target>

<target name="build-war" depends="package-logic">

	<jar jarfile="${build.dir}/jars/${war.file}"
		basedir="${basedir}/deploy/${ear.file}/${war.file}"/>

</target>

The “jar” task jars up the contents of a directory. We can add files to the META-INF directory with a file include directive under a nested “metainf” element inside the “jar” task.

4) Copy the new jar file to the deploy directory

<!-- **************** -->
<!-- * Make the Ear * -->
<!-- **************** -->

<!-- Creates the application ear file. -->
<target name="assemble-app" depends="package-logic,build-war">

	<ear destfile="${build.dir}/${ear.file}"
		basedir="${basedir}/deploy/${ear.file}"
		appxml="application.xml"
	>

	<manifest>
		<attribute name="Built-By"
			value="Openscope Networks"/>
		<attribute name="Implementation-Vendor"
			value="Openscope Networks"/>
		<attribute name="Implementation-Title"
			value="Webminders"/>
		<attribute name="Implementation-Version"
			value="0.1"/>
	</manifest>

	</ear>

</target>

The “ear” task, as you can imagine, packages up an ear file for deployment. It works very similarly to the jar task and offers a few more optional attributes that relate directly to the ear file. Find more tasks on the Ant documentation page.
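The depends attributes in the targets above are what chain everything together. As a rough illustration of how that ordering falls out, here’s a small Java sketch. It is my own simplification and not Ant’s actual implementation: the TargetOrder class is hypothetical, the map just mirrors the depends attributes shown above, and there’s no cycle detection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TargetOrder {

    // target name -> the targets listed in its depends attribute
    static final Map<String, List<String>> DEPENDS = new LinkedHashMap<>();
    static {
        DEPENDS.put("prepare", Arrays.asList("clean"));
        DEPENDS.put("compile", Arrays.asList("prepare"));
        DEPENDS.put("package-logic", Arrays.asList("compile"));
        DEPENDS.put("build-war", Arrays.asList("package-logic"));
        DEPENDS.put("assemble-app", Arrays.asList("package-logic", "build-war"));
    }

    /** Returns the targets in the order they would need to run to build the requested one. */
    static List<String> executionOrder(String target) {
        List<String> order = new ArrayList<>();
        visit(target, order);
        return order;
    }

    private static void visit(String target, List<String> order) {
        if (order.contains(target)) {
            return; // already scheduled, so each target runs at most once
        }
        // Dependencies go first, left to right, before the target itself.
        for (String dependency : DEPENDS.getOrDefault(target, Collections.<String>emptyList())) {
            visit(dependency, order);
        }
        order.add(target);
    }

    public static void main(String[] args) {
        System.out.println(executionOrder("assemble-app"));
    }
}
```

Asking for assemble-app walks all the way back to clean, then runs each step once on the way forward, which is the “stairs” idea from earlier.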

If you put these basic steps together and add some properties, you will end up with a simple ant script that can build most of your java projects. Customization, of course, is where the power of any scripting tool will end up earning its keep. Java docs can be generated as part of the builds, FindBugs can do code analysis, deployable artifacts can be ftp/scp’d across the network; heck, you can even write your own Ant task to do whatever automated unit of work you want to define.

Resources:
Here’s the complete ant script that I use for one of my simple projects.
The Ant task documentation page

All things XPath

What’s XPath?

So XPath is really just a means of accessing data on xml nodes. XML has structure, with nodes, branches and attributes; XPath is the notation we use to express which nodes, branches or attributes we want to access. XSL makes use of this notation in order to change the structure and layout of an XML document. The process of changing one form of xml into another is called an XSL Transformation, or xslt for short.

The Basics!

If you’ve ever laid out a web page, YOU ALREADY KNOW XPATH! More specifically, if you have ever used an image in a webpage, you already have some understanding of how xpath works. Take the following example:

<img src="/images/my_image.gif"/>

The value of the “src” attribute works much like an xpath expression. It means the following: select the image that resides in the folder named “images” at the root of the web server’s domain, and choose the image named “my_image.gif”. In this example, the server’s directory structure is the XML we are navigating, and the image location is the data we want to transform to reuse in a new location.

If you’ve ever used a command prompt or terminal, you’ll notice that xpath is a little more intuitive because of expressions like this:

../siblingElement = go up one directory and pick the “siblingElement”
./currentElement = pick node “currentElement” in the current location
/rootElement = pick the node named “rootElement” at the base of the xml structure
node/childElement = pick the node named “childElement” inside the node named “node” which is in the current location

In addition, there are ways of accessing node attributes with the “@” sign:

../siblingElement/@id = go up one directory and pick the attribute “id” belonging to the node named “siblingElement”
./currentElement/@name = pick attribute named “name” belonging to the node “currentElement” in the current location
/rootElement/@rootId = pick the attribute named “rootId” belonging to the node named “rootElement” at the base of the xml structure
node/childElement/@count = pick the attribute named “count” belonging to the node named “childElement” inside the node named “node” which is in the current location

Conditional Selectivity

XPath allows you to use conditional logic to navigate your xml document. Consider the following xml:

<xml>
	<node name="first"/>
	<node name="second"/>
</xml>

What kind of expression would you use to say “pick the node whose attribute named ‘name’ has the value ‘first’”? This kind of logic is not only possible, but also can allow for extremely complicated xpath expressions. This is how to do it:

/xml/node[@name = 'first']

This literally means: get all the nodes at /xml/node whose attribute named “name” has the value “first”. It will return ALL xml nodes that satisfy that requirement. The reason xpath expressions can become so complicated by this ability is that poorly constructed xml will force people to cross link attribute references between this branch and that, so a really wordy xpath expression needs to be used as the selector.
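If you’d like to poke at the selector outside of a stylesheet, here’s a minimal Java sketch using the JDK’s XPath API. The PredicateSelect class is a made-up name; the xml is the two node document from above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PredicateSelect {

    static final String XML = "<xml><node name=\"first\"/><node name=\"second\"/></xml>";

    /** Returns the name attribute of every element the expression selects. */
    static List<String> names(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("name"));
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(names("/xml/node"));                   // every node element
        System.out.println(names("/xml/node[@name = 'first']"));  // only the matching one
    }
}
```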

Tangent!

This is my terrible segue into some comments on xml! When constructing xml, please structure it as simply as possible with the forethought that xml should be extensible, as concise as possible, and as xpath friendly as you can build it. Structure is the most important thing your xml schema should express and some genuine forethought and planning there will really pay off in the long run.

Advanced XPath

XPath has a lot of other commands that can be used for selecting nodes. I’ll list some interesting constructs below:

Select the node named “poll” with the namespace prefix “ns”
ns:poll

Select the attributes named “id” for all the elements named “poll”
poll/@id

Select the first poll node
poll[1]

Select all poll elements whose id is “100” and whose count is > 0:
poll[@id = '100' and @count &gt; 0]
Notice it uses the entity &gt; in place of “>”. Inside an xsl attribute a raw “<” would break the xml, since the parser would read it as the start of a tag; escaping “>” the same way is a safe habit, although a raw “>” is technically legal there

Select the nearest poll ancestor of the current element:
ancestor::poll[1]
if you drop the [1] from the expression, it selects every poll ancestor; xsl:value-of then uses the first one in document order, which is the farthest ancestor

An example of XPath unions:
pollVote | pollVote/vote
this means pick all nodes that satisfy either selector expression, “pollVote” or “pollVote/vote”
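A couple of these constructs are easier to trust once you’ve seen them run. Here’s a small Java sketch, again with the JDK’s XPath API. The nested poll document and the PollAxes class are made up for illustration; the context node is the vote element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PollAxes {

    // Hypothetical sample: a poll nested inside another poll.
    static final String XML =
        "<poll id=\"outer\"><poll id=\"inner\">"
        + "<pollVote><vote/></pollVote>"
        + "</poll></poll>";

    private static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
    }

    /** Evaluates the expression as a string, with the vote element as the current node. */
    static String fromVote(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node vote = (Node) xpath.evaluate("//vote", parse(), XPathConstants.NODE);
            return xpath.evaluate(expression, vote);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    /** Counts the nodes the expression selects from the document root. */
    static int count(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, parse(), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromVote("ancestor::poll[1]/@id")); // nearest poll ancestor
        System.out.println(fromVote("ancestor::poll/@id"));    // first in document order
        System.out.println(count("//pollVote | //pollVote/vote"));
    }
}
```

With [1] the lookup lands on the inner poll. Without it the expression selects both polls, and converting that set to a string takes the first node in document order, which is the outer one.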