Archives in January, 2010

Ejb3 basics: Entities

Entity Beans? Better than 2.1, I promise.

Ejb3 entity beans are the enterprise java bean construct used to model data in the ejb framework. The basic idea is to manipulate simple java objects, which concretely represent your database data, and then have the framework handle as much of the plumbing as possible when you persist that data. Persisting means storing for later use in some data repository – usually some kind of database. By persisting these entities within the ejb framework we can abstract out tasks like updating a table and its associated foreign key rows, performing queries, and caching that automatically handles things like pre-populating java objects, along with lots of the other boring stuff. In short, using entities in your application lets you work more on implementing business logic and less on wiring and mapping DAOs to TransferObjects. For the sake of completeness, the other two important types of ejb beans should be mentioned: Session and Message driven beans. In case it wasn't obvious, ejb3 is only possible with java 1.5+, since that's the release that introduced annotations into the java language.

One of the great things about ejb3 is that entities and persistence got a major overhaul from 2.1 – MAJOR SEXY TIME. Ejb3 does a really good job of simplifying the object model by using annotations in pojos to mark up entities. You can now model your entire data structure in terms of plain old java objects and their relationships, and the persistence engine will go and create all the necessary tables, sequences and supporting schema elements.

Example Entity

Here's an example of a bidirectional one-to-many relationship between User and Contact. Consider the following class:

package com.examples.entities;  

import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

import javax.persistence.*;

@Entity
@Table(name="tb_user")
@SequenceGenerator(name = "sq_user",sequenceName = "sq_user", initialValue=1)
public class User implements Serializable {

	private static final long serialVersionUID = 1L;
	
	@Id
	@GeneratedValue(strategy=GenerationType.SEQUENCE, generator="sq_user")
	protected Long id;

	@Column(name="user_name", nullable=false, length=32)
	protected String username;
	protected String password;
	protected String email;	
	
	@OneToMany(mappedBy="user")
	@JoinTable(name="tb_user_contact") 
	protected List<Contact> contacts = new ArrayList<Contact>();
 
	public Long getId() {
		return id;
	}
	public void setId(Long id) {
		this.id = id;
	}	
	
	public String getUsername() {
		return username;
	}
	public void setUsername(String username) {
		this.username = username;
	}
	public String getPassword() {
		return password;
	}
	public void setPassword(String password) {
		this.password = password;
	}	
	public String getEmail() {
		return email;
	}
	public void setEmail(String email) {
		this.email = email;
	}
	
	public List<Contact> getContacts() {
		return contacts;
	}
	public void setContacts(List<Contact> contacts) {
		this.contacts = contacts;
	}

}

This is a fairly common type of entity. Going from top to bottom, let's take a look at the annotations used and examine what they do.

@Entity

@Entity is the annotation that marks this particular java class as an ejb entity. This tells the persistence engine to load up this class and its associated annotations and use it as a model for data in the database. Technically this is the only annotation required in the class for a very simple entity, but there are other annotations we can use to customize and declare more complex relationships.

@Table(name="tb_user")

@Table lets you name the table modeled by your pojo. It's just a simple way to keep things organized in the database. If you don't specify the table name, it will default to the class name.

@SequenceGenerator(name = "sq_user", sequenceName = "sq_user", initialValue=1)

@SequenceGenerator lets you set up the sequence used for primary key generation. This is required when you choose GenerationType.SEQUENCE as your primary key strategy. The name must match the @GeneratedValue's generator value; this is how the persistence engine knows how to map the sequence to the column.

@Id

@Id indicates that the following class method or field maps to the table's primary key.

@GeneratedValue(strategy=GenerationType.SEQUENCE, generator = "sq_user")

@GeneratedValue sets the primary key incrementing strategy to use when adding new records to the database. Here are the possible strategies:

  • GenerationType.AUTO
    This indicates that the persistence engine will decide which incrementing strategy to use. The lazy man's multiple vendor option.
  • GenerationType.IDENTITY
    This indicates that the persistence engine should use an identity column for incrementing. Vendors that support this are the ones with an "AUTO-INCREMENT" type of column flag; MySQL is one example (see the sketch after this list).
  • GenerationType.SEQUENCE
    This tells the persistence engine to use a sequence to manage the increment values when inserting new rows into the table. Postgres is an example of a vendor that uses sequences.
  • GenerationType.TABLE
    This tells the persistence engine to use a separate table to track increments on the primary key. This is more of a general strategy than a vendor specific implementation.
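
As a minimal sketch (not part of the example class above), an identity-based key drops the sequence generator entirely and leans on the database's auto-increment column:

	// hedged sketch: an identity-based key, e.g. on a MySQL AUTO_INCREMENT column
	@Id
	@GeneratedValue(strategy=GenerationType.IDENTITY)
	protected Long id;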

@Column(name="user_name", nullable=false, length=32)

@Column allows you to define column attributes for each class field. You can define all of the relevant attributes or just the ones you want (a combined sketch follows this list). Other possible attributes are:

  • columnDefinition="varchar(512) not null"
    Lets you define native sql for your column definition
  • updatable=false
    Sets whether the column allows updates. If it is not explicitly set to false, it defaults to true, allowing updates.
  • precision=10
    Decimal precision
  • scale=5
    Decimal scale
  • unique=true
    Defines whether the column should contain only unique values.
  • table="tb_user"
    Maps the table name to which this column belongs.
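
Putting a few of these together, here's a hedged example of a more heavily customized column (the field is made up):

	// hypothetical column combining several of the attributes above
	@Column(name="nickname", length=64, unique=true, updatable=false)
	protected String nickname;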

@OneToMany(mappedBy="user")

@OneToMany lets the persistence engine know that this field or method has a one-to-many relationship with the mapped object, and the mappedBy attribute tells it which field on the other side owns the relationship. The engine will then set up whatever supporting schema is needed to express the relationship; with a join table configuration this includes a separate table to hold all the key mappings.

@JoinTable(name="tb_user_contact")

@JoinTable lets you define the join table’s properties. In this case we’re using it to name the join table mapping the one to many relationship. A more complete @JoinTable annotation looks like this:

	@OneToMany(mappedBy="user")
	@JoinTable(
	    name="tb_user_contact",
	    joinColumns=@JoinColumn(name="user_id",referencedColumnName="id"),
	    inverseJoinColumns=@JoinColumn(name="contact_id",referencedColumnName="id")
	)
	public List<Contact> getContacts() {
		return contacts;
	}

That covers the owning class; here's the class being pwnt:

package com.examples.entities;

import javax.persistence.*;

@Entity
@Table(name="tb_contact")
public class Contact {

	@Id
	@GeneratedValue(strategy=GenerationType.IDENTITY)
	protected Long id;
	protected String email;	

	@ManyToOne
	protected User user;
	
	
	public Long getId() {
		return id;
	}
	public void setId(Long id) {
		this.id = id;
	}
	public String getEmail() {
		return email;
	}
	public void setEmail(String email) {
		this.email = email;
	}
	public User getUser() {
		return user;
	}
	public void setUser(User user) {
		this.user = user;
	}
	
}

@ManyToOne

The @ManyToOne annotation marks the connecting foreign key used in the bidirectional mapping. When the persistence engine reads all the entities in and starts generating the sql to model the object model, it will generate three tables for these two java classes: "tb_user" represents the User class, "tb_contact" represents the Contact class, and "tb_user_contact" is the relationship mapping table. This annotation is what turns a unidirectional relationship into a bidirectional one. Here's an example:

	@ManyToOne
	public User getUser() {
		return user;
	}

@ManyToMany

@ManyToMany describes the many to many association between entities. It is used in conjunction with the @JoinTable annotation to define the mapping table used for storing all the relationships. Here’s an example:

	@ManyToMany
	@JoinTable(name="tb_user_contact")
	public List<Contact> getContacts() {
		return contacts;
	}

and then in the Contact class we would have:

	@ManyToMany(mappedBy="contacts")
	public List<User> getUsers() {
		return users;
	}

The owning entity will always have the @JoinTable, and the owned entity will always have the @ManyToMany(mappedBy=?) annotation.

These are just a few things that can be done with ejb3. I would suggest sifting through the Java EE 5 javadocs to get a better feel for the other possible annotations.

For more reading:
Javax Persistence API
Java 5 Persistence Tutorial
Official Java Persistence FAQ

XML, Xalan, Endorsed dirs and &..

So recently, we've been working on a project that makes use of OpenSAML. As it turns out, OpenSAML requires newer Xalan libraries (2.7.1 to be precise), the kind that don't ship with the older incarnation of jboss we are using for the project – version 4.02. Some of you might be familiar with the jboss system properties and will know there's a property used specifically to override the standard xml libraries that ship with the jdk/jre. Jboss will let you pass in, as a parameter, a value for the property "java.endorsed.dirs": the path to the directory holding the Xalan libraries you would like jboss to use as the Xalan implementation at runtime.

-Djava.endorsed.dirs=/path/to/your/xalan/libraries

So if you have other installed applications running in different instances, you won't have to upgrade every instance you're running concurrently; instead, you can override a specific instance's use of the Xalan libraries by passing this parameter in its run script. I'm not quite sure what version of Xalan ships with jboss 4.02, but when we upgraded, the first thing we noticed was that any "&amp;" in a stylesheet came through the transform as a literal "&amp;" instead of rendering as "&" (presumably a fix set forth as Xalan matured):

<xsl:param name="url">
	http://www.some-url.com/path.do?parameter=value&amp;otherParameter=otherValue
</xsl:param>

turned into

http://www.some-url.com/path.do?parameter=value&amp;otherParameter=otherValue
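
If you control the stylesheet, one possible workaround (hedged – disable-output-escaping is an optional feature that processors are free to ignore) is to emit the ampersand unescaped:

<xsl:text disable-output-escaping="yes">&amp;</xsl:text>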

If you intend to upgrade your Xalan libraries, plan on some regression testing to make sure upgrading these xml centric libraries doesn't inadvertently break the xml dependent sections of your application. It should also be noted that if you randomly toss upgraded xalan jars into your application, you're bound to run into all kinds of crazy exceptions. I've seen jboss complain about login-config.xml, missing class libraries, weird servlet allocation exceptions, class not founds – all kinds of misleading problems that seem unrelated to xalan jar collisions or weirded out dependencies.

Bottom line is if you need to upgrade Xalan, stick to using java.endorsed.dirs, and pass in the -Djava.endorsed.dirs param into the jboss run script if you want to override a specific instance.

War deployment file structure

What’s a war deployment, do I need my own army?

When it comes to deploying a web based application we have a few options on the table. Well, only one really if you stick to J2EE standards, not counting Ear deployments, which also deploy web apps via wars. Outside the world of J2EE though, it becomes a crap shoot based on the web framework you're using. Maybe you ftp your files manually, edit html directly on the server, or upload all your files and rename the folders so the new code is live and the old code is no longer accessible. In the J2EE world, we use deployable artifacts like war files. A war file is basically a collection of files structured in a certain way and zipped up. A war file can also be exploded, which simply means it's not zipped up. So what does a war look like?

webapp.war
	|-- images/
	|   `-- banner.jpg
	|-- index.html
	`-- WEB-INF/
		|-- jsps/
		|   |-- public/
		|   |   `-- login.jsp
		|   `-- private/
		|       |-- application.jsp
		|       `-- settings.jsp
		|-- lib/
		|   `-- some-library.jar
		|-- classes/
		|   `-- compiled.class
		`-- web.xml

There are 2 sections which pretty much divide up the entire archive: everything directly inside the root / of the war file, and everything inside the WEB-INF directory. The key difference between the two is that the former is publicly accessible while the latter has protected access; it's a violation of the spec for an application server to allow public exposure to anything in the WEB-INF folder of your application.
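
For reference, the web.xml at the bottom of the tree is the standard deployment descriptor that configures the application. A bare-bones sketch (assuming the servlet 2.4 flavor) looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/j2ee" version="2.4">
	<welcome-file-list>
		<welcome-file>index.html</welcome-file>
	</welcome-file-list>
</web-app>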

Context

Your war file has an application context. An application context is the reserved namespace your web application owns relative to the application server's fully qualified domain name. For example, if on startup you bound jboss to the localhost domain, your server's fully qualified url would be:

http://localhost:8080/

This represents the root of your application server. If you are deploying a single war/a single web application, by default your application will take on the context name of the war file. So in our example above, if we wanted to access webapp.war’s deployed application we would need to call it this way:

http://localhost:8080/webapp

Jboss only!

Out of the box, jboss comes with a default ROOT.war application in the deploy directory that links to other jboss web applications. One great thing about jboss is that you can set up your configuration instance to deploy whatever components you want, meaning you can remove this ROOT.war file and use your own as the context root. You would need to replace the default ROOT.war file with the contents of your war file to make your application use the same context. This is kind of messy though, so I would recommend just removing the ROOT.war file and instead sticking a jboss-web.xml file in your war's WEB-INF directory, configured like this:

 
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE jboss-web PUBLIC "-//JBoss//DTD Web Application 2.3//EN" 
    "http://www.jboss.org/j2ee/dtd/jboss-web_3_0.dtd">

<jboss-web>

   <context-root>/</context-root>

</jboss-web>

The context-root element here basically tells jboss to load up the war file into the root context of the application server, so calls to “http://localhost:8080/” will be processed by your war file. There’s also a way to map virtual hosts in jboss, discussed in another article, Virtual hosting with Jboss.

Compiled Resources

The other 2 things that go into the WEB-INF directory are the WEB-INF/lib and WEB-INF/classes directories. The /lib directory is where you put all of your web application's third party jar files, as well as any jar'd up version of your custom code and resources. If you choose not to jar up all your custom code and resources, you can instead stick all your .class files and application resources in the WEB-INF/classes directory. From my point of view, it's cleaner to just jar everything up and stickem in the /lib directory. Naked class files and resources are so.. messy.. But that's just my opinion. It's important to note that if you have empty /lib or /classes directories you don't need to include them in your deployment; they are only required if you are going to stick resources in there.

Static Resources

Now that you've figured out all your application resources, you can stick all your static resources in the root of your war file. I should point out that there are two sides of the fence on how to proceed here; purists think everything but the basics should be obscured from the user to prevent them from hacking urls/jsps (by sticking jsps in the WEB-INF, hiding them from exposure), while other folks don't really care. I suspect the folks who don't care are mostly using a web framework that hides the true jsp paths and file names. If you're not using a web framework, it had better be for a good reason, and you might then want to consider obscuring the jsps in the WEB-INF.
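
To actually serve a jsp hidden under WEB-INF you forward to it on the server side. Here's a hedged sketch (the servlet name is made up, and the path comes from the example layout above):

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.*;

public class ApplicationServlet extends HttpServlet {

	// forward to a jsp tucked under WEB-INF; the container can read it,
	// but a browser can never request the real path directly
	protected void doGet(HttpServletRequest request, HttpServletResponse response)
			throws ServletException, IOException {
		request.getRequestDispatcher("/WEB-INF/jsps/private/application.jsp")
				.forward(request, response);
	}
}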

That's pretty much all there is to a war's file structure. When it comes time to deploy, most of the time the war file is deployed as a zipped up archive. Jboss also supports the notion of exploded wars, which is basically just an unzipped war file. Exploded wars are a double edged sword though – if you deploy as an exploded war you get the benefit of not having to redeploy the entire application when you want to fix something like text on a page. Be wary though, circumventing a build process is never a good idea. The build process is there for a reason; its purpose is to track code and updates and make sure only tested code is released.

Java, XML and XStream

What's an object/xml serializing/deserializing library?

If you’ve never worked with an object/xml serializer and are considering writing your own from scratch, you may want to consider using a library like XStream. XStream is very good at moving java into xml and back. It allows a high level of control over how the xml can be organized and structured and even allows the user to create their own converters for even more flexibility.

But still, why use something like this when you can be perfectly happy writing your own data conversion scheme? The problem really boils down to flexibility, and to reinventing the wheel. Ninety percent of the time you're already interrogating a datasource (some rdbms like oracle, postgres or mysql) and will be using some kind of TransferObject or maybe an Entity persistence scheme built around pojos. If you write your own serializing engine from scratch by mapping pojos to dom4j nodes, constructing Document objects and then using them for stuff like xsl transformations, you end up missing out on a great tool.

It may not seem obvious right now, but a homegrown serializer is the kind of thing you write once and forget about, and then months or years down the line, when it comes time to update your data model or expand its framework, you end up rebuilding all the dom4j stuff from scratch. Unless you take the lazy route and append any new xml to the root node to save yourself the entire node refactor. Maybe simple objects with one or two simple nested objects won't seem like much, but if your object becomes anything close to a complex lattice, going back and tweaking the entire structure when you want to expand or refactor your xml can become quite perilous. Especially if you want to make your xml as xpath friendly as possible.

Edit:
As Felipe Gaucho has been kind enough to point out, XStream only writes a text string as the serialized object. It will not perform any validation on your XML, so you're left on your own to validate it post serialization. Something like JAXP comes to mind to tackle XSD based validation, or JiBX if you're looking for data binding.

So what does XStream do for me?

Consider these objects:

public class MyClass {

	protected MyObject object;

}

public class MyObject {

	protected ArrayList<Field> Field;

}

XStream lets you do something like this if you want to serialize an object like MyClass to xml:

XStream xstream = new XStream();
String myClassXML = xstream.toXML(myClassObject);

and if you want to go from xml back to a java object you can do this:

XStream xstream = new XStream();
MyClass myClassObject = (MyClass) xstream.fromXML(myClassXML);

As you can see, all the plumbing goes away and you are now free to concentrate on writing the rest of your application. And if you want to change your object model, consolidate nodes or rearrange the structure of your xml, all you have to do is update your pojo and your xml will immediately reflect the updated data model on serialization.

It should be noted that to completely deserialize xml, your object needs to correctly map all the data in the xml. If you have trouble deserializing, try building a mock object, populating it with sample values and serializing it to xml; then you can compare that test xml against your actual xml and make your changes.
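
A hedged sketch of that debugging trick (the setters are assumed, since the example classes above only show fields):

XStream xstream = new XStream();

// build a mock and populate it with sample values
MyClass mock = new MyClass();
MyObject inner = new MyObject();
inner.setField(new ArrayList<Field>());
mock.setObject(inner);

// compare this output against the xml you're actually trying to consume
System.out.println(xstream.toXML(mock));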

Aliasing

XStream does not require any configuration, although the xml produced out of the box will likely not be the easiest to read. It will serialize objects into xml nodes named after their fully qualified class names, usually making the nodes very long, as we can see from the following example:

<com.package.something.MyClass>
	<com.package.something.MyObject>
		<List>
			<com.package.something.Field/>
			<com.package.something.Field/>
		</List>
	</com.package.something.MyObject>
</com.package.something.MyClass>

Luckily XStream has a mechanism we can use to alias these long package names. It goes something like this:

XStream xstream = new XStream();
xstream.alias("MyClass", MyClass.class);
xstream.alias("MyObject", MyObject.class);
xstream.alias("Field", Field.class);

Adding aliases like this will let your xml come across nice and neat, like this:

<MyClass>
	<MyObject>
		<List>
			<Field/>
			<Field/>
		</List>
	</MyObject>
</MyClass>

Attributes

If you want to make a regular text node an attribute, you can use this call to configure it:

 xstream.useAttributeFor(Field.class, "name");

This will make your xml change from this:

<MyClass>
	<MyObject>
		<List>
			<Field>
				<name>foo</name>
			</Field>
			<Field/>
		</List>
	</MyObject>
</MyClass>

into

<MyClass>
	<MyObject>
		<List>
			<Field name="foo"/>
			<Field/>
		</List>
	</MyObject>
</MyClass>

ArrayList (implicit collections)

ArrayLists are a little trickier. This is what they look like out of the box:

...
	<MyObject>
		<List>
			<Field/>
			<Field/>
		</List>
	</MyObject>
...

Note there's an extra "List" node enclosing the list elements named "Field". If we want to get rid of that node so that "Field" sits right under "MyObject", we can tell XStream to map an implicit collection by doing the following:

 xstream.addImplicitCollection(MyObject.class, "Field", "Field", Field.class);

where the addImplicitCollection method signature is the following:

/**
 * Appends an implicit collection to an object for serialization
 *
 * @param ownerType - class owning the implicit collection (class owner)
 * @param fieldName - name of the field in the ownerType (Java field name)
 * @param itemFieldName - name of the implicit collection (XML node name)
 * @param itemType - item type to be aliased by the itemFieldName (class owned)
 */
public void addImplicitCollection(Class ownerType,
		String fieldName,
		String itemFieldName,
		Class itemType)

Adding this implicit collection configuration will streamline the xml so that it looks like this now:

 
<MyClass>
	<MyObject>
		<Field/>
		<Field/>
	</MyObject>
</MyClass>

Notice the “List” node is gone, and “Field” is now directly under “MyObject”. You can find the complete documentation on the XStream website here.

There are plenty more tricks you can use to configure/format your xml, and plenty of examples listed on the XStream website, but these three points should cover the basics and get you started.

Using Jboss Datasource files, JBoss 5.1

If you're using jboss and you're storing database connection info in a properties file, you might be doing something wrong. Specifically, you're probably not using the datasource files jboss ships with to configure all that plumbing.

What’s a Datasource file?

Simply put, it's an xml file that contains all the connection properties an application needs in order to connect to a database. Here's an example:

<datasources>
	<local-tx-datasource>
		<jndi-name>DefaultDS</jndi-name>
		<connection-url>jdbc:postgresql://dbUrl:5432/schema</connection-url>
		<driver-class>org.postgresql.Driver</driver-class>
		<user-name>username</user-name>
		<password>password</password>
		<metadata>
			<type-mapping>PostgreSQL</type-mapping>
		</metadata>
	</local-tx-datasource>
</datasources>

So, the "jndi-name" node gives this datasource configuration the jndi name it will be bound to while the server is running. We'll need this later, since we'll be using jndi to fetch the datasource configuration. The "connection-url" is the jdbc connection string used to connect to the database; the "driver-class" sets the java driver class used to load the database connectors; and the username/password nodes are self explanatory. The "metadata" node is optional, but it's good to specify which type mapping jboss should use for this datasource; its job is to set which sql dialect constructs are used to map sql and data during an actual jdbc call. The file itself can be named anything, with the only limitation being that it needs to end in "-ds.xml". That suffix flags jboss to process the contents of the file as a datasource configuration. The file needs to be in the deploy directory for it to be loaded.

Great, now we have a datasource file set up, how do we access it?

If you're using ejb3, your persistence.xml file references the ds configuration by jndi-name (there's a sketch of that at the end of this article), and I'm pretty sure the hibernate configuration files do the same. If you're using jdbc in your own homegrown persistence layer – gah, read on. Remember how we mapped the ds connection info with a jndi-name? We now get to use the InitialContext object to fetch the ds configuration by name via a jndi lookup:

import java.sql.Connection;

import javax.naming.InitialContext;
import javax.sql.DataSource;

/*
 * Returns a java.sql.Connection object, given the ds name to look up
 */
private Connection getDatasourceConnection(String jndiName) throws Exception { 

	// load the driver first
	Class.forName("org.postgresql.Driver"); 
	String dsFile = "java:/DefaultDS"; 

	// sloppy, but just to show it can be done
	if(jndiName != null) {
		dsFile = jndiName;
	}

	InitialContext ic = new InitialContext(); 
	DataSource ds = (DataSource) ic.lookup(dsFile); 

	// returns an object of java.sql.Connection
	return ds.getConnection(); 
}

What's powerful about this is that because you're using a jndi name to look up the datasource, you can set up multiple datasources and just call them by name when you need them. Send in the name of your ds config, and you get back a fully loaded java.sql.Connection object, ready to fire up jdbc calls. But.. jdbc was so.. Y2004…
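
As a quick usage sketch (the table name is made up, and the java.sql imports are assumed):

// look up the datasource by jndi name and fire a simple query
Connection conn = getDatasourceConnection("java:/DefaultDS");
try {
	PreparedStatement ps = conn.prepareStatement("select count(*) from tb_user");
	ResultSet rs = ps.executeQuery();
	if (rs.next()) {
		System.out.println("total users: " + rs.getLong(1));
	}
} finally {
	// always hand the connection back to the pool
	conn.close();
}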

TANGENT!

If you're using a legacy jdbc system (hard coded connection strings, ugly bundled properties files to hold connection settings, etc), porting your application to use datasource files could prove to be a worthwhile refactor; it's so much cleaner. If you're building an entirely new application though, take a look at ejb3 persistence or hibernate; they're mature persistence frameworks that do a lot of the heavy lifting for you. There are others, but these two stand out.
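
And for completeness, here's the hedged persistence.xml sketch promised above – pointing ejb3 persistence at the datasource is a one liner (the unit name is made up):

<persistence>
	<persistence-unit name="myAppPU">
		<jta-data-source>java:/DefaultDS</jta-data-source>
	</persistence-unit>
</persistence>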

5 ways to make XML more XPath friendly

As java developers we should always do what we can to optimize outbound xml from our side of the fence. I mean, it's our job to build and design awesome, elegant and efficient software whenever possible, right? We have our data and we want to serialize it into xml; how can we make our xml as efficient and xpath friendly as possible?

1) Keep the structure simple

Consolidate data nodes whenever possible before marshaling your object into xml. You really don't want to resort to xpath for unnecessary lookups across nodes; keep those lookups confined to the persistence layer where they belong. I realize this may not always be possible, but keeping it to a minimum will really help the xsl developers by not forcing them to create ridiculous xpath expressions just to link discretely separate data nodes. We want to keep the xpath logic as simple as possible.

Consider the following xml document:

<root>
	<house id="1" color="white"/>
	<room name="livingRoom" houseId="1" windows="3"/>
	<room name="kitchen" houseId="1" windows="2"/>
	<room name="bedroom" houseId="1" windows="4"/>
	<room name="bathroom" houseId="1" windows="1"/>
</root>

If this is how your xml is structured and you want to transform this data with xsl, you are creating all kinds of extra work and could end up causing an unnecessary performance bottleneck for your application. If you wanted to do something like lay out the total number of rooms for the house with id 1, your xpath would have to look something like this:

count(/root/room[@houseId=1])

You are now making your xsl developer implement logic to count all the rooms for a particular house, using an xpath function and conditional selector logic to filter a global count. Just because you can use the count function does not mean you should use it every chance you get. This xpath expression will traverse all of your xml, counting the nodes whose houseId attribute is 1, and return the total number of room nodes. It's not much of a problem if your xml is one or two house nodes deep, but what if you had like 10, 20 or even 30 house nodes or more? If you are processing numbers like that, and you span these node traversals across, say, a hundred requests, you would be doing something like 3,000 traversals. What if instead you used an xml structure like this:

<root>
	<house id="1"  totalRooms="4" color="white">
		<room name="livingRoom" windows="3"/>
		<room name="kitchen" windows="2"/>
		<room name="bedroom" windows="4"/>
		<room name="bathroom" windows="1"/>
	</house>
</root>

In this example we attached the room count to the house node as an attribute. This way, our xpath expression ends up looking like this:

/root/house/@totalRooms

No count function, no conditional selector logic to filter; you end up with a simple, basic xpath expression. You're doing a single lookup instead of a 3,000 node traversal that collects a node count calculated while the transformation is processing. Let the data/persistence layer perform those kinds of counts and populate the data, and let your xsl/xpath lay out your data. Keep the structure simple. If this is not possible, you might be doing something wrong.
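
In the stylesheet, that lookup then collapses to something as simple as this minimal sketch:

<xsl:value-of select="/root/house/@totalRooms"/>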

2) Avoid namespaces whenever possible

Namespaces are important, yes. But if you are trying to express a java object as xml, prefer using a different, more descriptive name altogether over attaching namespaces. This really comes down to creating a more specific naming scheme for your application's objects. If you are using namespaces just for the heck of it, I'd urge you not to. You end up adding lots of unnecessary noise to your xpath, and having those colons all over the place can make the document look like it's been run through a cheese grater. Anyone trying to read your xml will cringe at the thought and will want to pass it off to the new guy as part of the hazing ritual. Consider the following xml:

<root>
	<myHouseNamespace:house id="1"  totalRooms="4" color="white">
		<myHouseNamespace:room name="livingRoom" windows="3">
			<myHouseNamespace:furniture name="sofa" type="leather"/>
			<myHouseNamespace:furniture name="table"/>
			<myHouseNamespace:furniture name="lamp"/>
		</myHouseNamespace:room>
		<myHouseNamespace:room name="kitchen" windows="2"/>
		<myHouseNamespace:room name="bedroom" windows="4"/>
		<myHouseNamespace:room name="bathroom" windows="1"/>
	</myHouseNamespace:house>
</root>

This makes your xpath look like this:

/root/myHouseNamespace:house/myHouseNamespace:room[@name='livingRoom']/@windows

I don’t know about you, but this xpath expression is hard to read with all the “myHouseNamespace:” junk all over the place. And we only went 2 nodes deep into the tree. A third level down would have marched the xpath expression across the width of this blog! Who loves mile long xpath expressions that add side scrollbars to your text editor on your widescreen monitor? NO ONE.

3) Use attributes wherever appropriate

There is really no difference between using an attribute and a single child text node to express a simple value in xml. In other words, there is no difference between this:

<root>
	<house>
		<id>1</id>
		<totalRooms>4</totalRooms>
		<color>white</color>
	</house>
</root>

and this:

<root>
	<house id="1"  totalRooms="4" color="white"/>
</root>

So why prefer attributes? Because it makes your xml document legible. Having a ton of single child text nodes adds a lot of noise to your document, especially when they hold digits or single words. It becomes easy to mix up your text values with the real, slightly more complex nodes of data in your xml tree. Readability is important, and keeping your xml as efficiently expressed as possible without sacrificing readability is paramount to making it understandable and easier to work with.

4) Clearly name your xml nodes

Use names that actually describe the data being represented. Stay away from these kinds of names:

<root>
	<h id="1"  totalRooms="4" color="white">
		<r name="livingRoom" windows="3">
			<f name="sofa" type="leather"/>
			<f name="table"/>
			<f name="lamp"/>
		</r>
		<r name="kitchen" windows="2"/>
		<r name="bedroom" windows="4"/>
		<r name="bathroom" windows="1"/>
	</h>
</root>

Sure, your xpath will be much shorter:

/root/h/r[@name='livingRoom']/f[@name='sofa']/@type

But unless you're very, very comfortable with the single letter naming convention, you might end up having a hard time keeping track of all the nodes, since they're small and easy to overlook. Descriptive, concise names make your xml easier to learn or come back to, provided the data is named in a clear, self explanatory way. Ideally, your xml should have names that an untrained eye can pick up, making sense of the basic structure with little preparation.

5) Make your xml as human readable as possible

This point encapsulates all the others. XML is a language meant to be tied very closely to data, and our ability to understand that data is what allows us as developers to mold it into whatever we want it to be. If you've ever had to sit through a very complex piece of xml, you'll realize that forcing anyone to muck through an unconventional structure and non-intuitive names ends up breaking the xslt development into three phases:

1) figuring out/understanding the xml
2) implementing the xsl
3) figuring out/understanding the xml, and what was just implemented

The longer the developer has to sit there and meddle with #1 and #3, the more time is lost in bringing your product to the next release. We want to spend as little time as possible figuring out or compensating for poorly structured xml, so the real implementation work can be completed and we can move on to the next thing.

In other words

The bottom line is that if you structure your xml so that it's easy for humans to understand, and it doesn't cut corners by passing data lookups or counts off to the xsl developer, your application will become more efficient, better written, and easier to work with. This is a good place to exercise the separation of concerns principle: let the data/persistence layer do what it does best, and let the xml/xslt layer do what it does best. XSLT is a relatively expensive process, and even with infinite computing resources we should always strive to make the most out of whatever resources we can allocate.

Virtual hosting with Jboss 5.1

How do I map a web application to a url in jboss?

If you have multiple web apps deployed in a single jboss instance, you'll probably want an effective way to tell them apart when you access them from a browser. On startup, jboss can be configured to bind to a single url, which will act as the default host for all the deployed applications. You can then set up a separate context for each web app you are running. If they're totally separate applications though, it might not make sense to use a single url and break them out by context. In jboss you can set up virtual hosts to solve this dilemma. Here's how to set it up:

WEB-INF/jboss-web.xml

In your web application you'll want to add an xml file named "jboss-web.xml" to your WEB-INF folder. This is the file that maps both the web application's context and its host in jboss.

<jboss-web>
    <context-root>/</context-root>
    <virtual-host>www.first-application.com</virtual-host>
</jboss-web>

This configuration sets the application's context to "/" (essentially the root of the default domain), and it also maps the virtual host configuration to "www.first-application.com". Note that it won't matter whether you deploy this configuration from within an ear (embedded war file) or as a standalone war file, as only wars are meant to respond to web requests. Let's also add the second configuration to the other war file's WEB-INF/jboss-web.xml:

<jboss-web>
    <context-root>/</context-root>
    <virtual-host>www.second-application.com</virtual-host>
</jboss-web>

Next, we’ll need to add the virtual host configurations to jboss’ server.xml.

jbossweb.sar/server.xml

Now we need to edit jboss’ server.xml file, adding the virtual host mappings:

<Server>
   <Service name="jboss.web"
      className="org.jboss.web.tomcat.tc5.StandardService">
       
      <!-- A HTTP/1.1 Connector on port 8080 -->
      <Connector port="8080" address="${jboss.bind.address}"
                 maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
                 enableLookups="false" redirectPort="8443" acceptCount="100"
                 connectionTimeout="20000" disableUploadTimeout="true"/>

      <Engine name="jboss.web" defaultHost="www.first-application.com">
         <Realm className="org.jboss.web.tomcat.security.JBossSecurityMgrRealm"
          certificatePrincipal="org.jboss.security.auth.certs.SubjectDNMapping"
            />
         <Logger className="org.jboss.web.tomcat.Log4jLogger"
                 verbosityLevel="WARNING"
                 category="org.jboss.web.localhost.Engine"/>

            <Host name="www.first-application.com" autoDeploy="false"
                  deployOnStartup="false" deployXML="false">
                <Alias>dev.first-application.com</Alias>
                <Alias>qa.first-application.com</Alias>   
                <Alias>test.first-application.com</Alias>    
                <Valve className="org.apache.catalina.valves.AccessLogValve"
					   prefix="localhost_access_log." 
					   suffix=".log"
					   pattern="common" 
					   directory="${jboss.server.log.dir}" 
					   resolveHosts="false" />
            </Host>   

            <Host name="www.second-application.com" autoDeploy="false" 
                  deployOnStartup="false" deployXML="false">
                <Alias>dev.second-application.com</Alias>
                <Alias>qa.second-application.com</Alias>   
                <Alias>test.second-application.com</Alias>    

                <Valve className="org.apache.catalina.valves.AccessLogValve"
					   prefix="localhost_access_log." 
					   suffix=".log"
					   pattern="common" 
					   directory="${jboss.server.log.dir}" 
					   resolveHosts="false" />
            </Host>

      </Engine>
   </Service>
</Server>

Note that the Alias entries map other optional domains jboss will listen for as aliases of the keyed domains "www.first-application.com" and "www.second-application.com". All this means is that jboss will route requests to the respective web application whose virtual-host config maps to that host config.

In order for all of this to work however, we need to make sure dns is set up to handle these domain requests. On a local development machine, you’ll want to edit your hosts file (on windows: c:/WINDOWS/System32/drivers/etc/hosts, on linux: /etc/hosts) and add these entries:

127.0.0.1		www.first-application.com
127.0.0.1		www.second-application.com

This way, when you type one of the domains into your browser, it'll forward the request to jboss' bound IP. Likewise, if you have jboss bound to a specific domain name/IP on boot, you would have to map that domain/IP in your hosts file just like in the example above.

Now you should be able to fire up jboss, type in either domain into the browser, and have jboss redirect those requests to the corresponding deployed war file.

Install Fedora

Fedora?

“Fedora is a Linux-based operating system that showcases the latest in free and open source software. Fedora is always free for anyone to use, modify, and distribute. It is built by people across the globe who work together as a community: the Fedora Project. The Fedora Project is open and anyone is welcome to join.” – from the Fedora homepage

I'm using this to run jboss 5.1, postgres, mysql and all my other goodies. Why Fedora? No real reason in particular, other than I wanted experience working with more than one flavor of linux, since I use CentOS at work. CentOS is more of an enterprise OS, its objective being to provide a free version of the higher end Red Hat Enterprise Server, minus a big chunk of the costs. Back to fedora – basically it's the experimental-ish, advanced stomping grounds for Red Hat Enterprise Linux. Red Hat only supports its own branded version of linux, while Fedora is more of a community driven project, with releases churned out every 6 months.

I'm currently using Fedora 11, even though Fedora 12 is already out as I type this. I'm not sure if I'm going to upgrade anytime soon, as OS upgrades can always be scary and stuff can be expected to break or stop working.

Installing fedora

Setting up Fedora is fairly straightforward. Download it, burn it onto a DVD, stick it into the computer you want to install on, and reboot. If you are set up to boot from the disc media, it will run the Fedora installer and give you a few options. I wound up performing a full install and created new hard disk partitions, essentially wiping everything and starting from scratch. You will create a root account password during the setup – remember to write it down for future use, as you will be making lots of admin level changes in the near future.

Once everything is installed, you'll need to know how to log in as the root user:

[user@bedrock ~]$ su -

You will be prompted for the root password; enter it and you'll be in.

[root@bedrock ~]#

Now you’re ready to start linuxing. Enjoy!

Ejb3 Basics: Bean Managed Transactions

I’m Lazy, why would I want to do my own transaction management?

While it's true that the ejb3 container is usually pretty smart about persisting and handling transactions, it's also not as smart as a real human being and probably isn't able to handle complex database transactions and rollbacks. This is where bean managed transactions come in. By handling your own transactions you can avoid some major pitfalls.

The first problem with Container Managed Transactions (CMT) is that there's a time limit imposed by the container (although it's a configurable timeout). If you are performing a lengthy task and persisting the results (like FTPing files over to a third party vendor, persisting the results of a jms queue distribution, or committing across two different data sources – What the heck is a datasource?) then you will want control over when the bean commits the transactions. This frees you from the time constraint and also lets you decide when you want to commit and what rollback strategy you want to use for error management. If you stick to CMT for these kinds of database transactions, the container will invariably end up flipping out and throwing all kinds of errors, because out of the box the container is set to handle short duration transactions (like 5 minutes). Likewise, if a transaction is taking upwards of, say, 30 seconds, it might be a sign that the application's relationship with the database isn't as efficient as it could be.

In a perfect world, CMT streamlines the process for quick and easy transactions like selects, inserts and updates; the type that have all the data ready to insert and don't have any weird secondary transaction dependencies that need to complete before committing. For the weirder transaction management cases, use Bean Managed Transactions like this:

/**
 * set up the transaction management style for this class
 */
@TransactionManagement(TransactionManagementType.BEAN)
public class MyClass... {


	// set the session context reference, we're going to use it soon
	@Resource private SessionContext sessionContext;

	public void setReminder(ReminderForm reminder) {

		UserTransaction utx = sessionContext.getUserTransaction();  
		try {  

			// begin the transaction
			utx.begin();  

			/**
			 * persist your object
			 * 
			 */
			em.persist(something);

			// attempt to commit the transaction
			utx.commit();

		} catch(Exception e) {

			// something went wrong, try to roll the transaction back
			try {
				utx.rollback();
			} catch(Exception rollbackException) {
				log.error("rollback failed: " + rollbackException.getMessage());
			}
			log.error("problem with the database transaction: " +
				e.getMessage());

		}
		}
	}

}

Using the SessionContext object with this type of construct, you can create as many transactions as you want. The container's transaction management is hands off for the entire bean; note that the transaction management type applies to the whole bean, so any methods that really need container managed transactions belong in a separate CMT bean.

Basic Ant scripts

What’s an Ant script? Do I need bug spray?

Ant is a scripting tool commonly used to build, compile and deploy projects. This is in no way an all encompassing inventory of what Ant can do: it is extensible, and its instructions are expressed in an xml format whose elements make up a framework of abilities designed to automate menial tasks.

From a developer's perspective, the most basic Ant tasks are the compile, package and copy tasks. All java projects must do these three things many, many, many times during a development cycle. It's very boring and tedious if you do it by hand through the command prompt. First you'd have to type in a command like "javac [ options ] [ sourcefiles ] [ @argfiles ]", detailing all the class paths you want to use, all the source files, and then all the other supporting parameters needed to compile your program correctly. If you're only writing one class, it's probably not that bad. But when you have hundreds of classes, multiple projects and dependencies, and a slew of directories to configure and lay out for compiling, it quickly becomes ridiculous. In fact, I would claim that it becomes ridonkulous.

Ant lets you define tasks that break up these chores into a series of chained events. An Ant script is broken up into what are called "targets". Each target is meant to perform one task, or unit of work. If we break up the compilation/deploy process it could look something like this:

  1. clean out the scrub/temporary directory
  2. compile all the java files into class files
  3. package up all the class files into a jar file, or some kind of deployable artifact
  4. copy the new jar file to the deploy directory

We can define each one of these steps with an Ant target. This is a good thing because it allows us to chain them like stairs, one task leading into the next. If any one of the tasks fails, the script fails and Ant tells us where the problem happened (with the line number and the exact problem or exception).

Here's what these steps might look like:

1) Clean up the build directories

<!-- *************** -->
<!-- * Preparation * -->
<!-- *************** -->

<target name="prepare" depends="clean">
	<mkdir dir="${build.dir}"/>
	<mkdir dir="${build.dir}/jars"/>
	<mkdir dir="${build.dir}/openscope"/>
</target>

Here, the depends="clean" attribute references a previous ant target that deletes all these scrub directories (a minimal sketch of it follows below). This "prepare" target creates the scrub directories we're going to use in our build; mkdir creates a directory.
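
The clean target itself isn't shown in this article; a hedged sketch of what it might look like:

<!-- ************ -->
<!-- * Clean up * -->
<!-- ************ -->

<target name="clean">
	<delete dir="${build.dir}"/>
</target>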

2) Compile all the java files into class file

<!-- *************** -->
<!-- * Compilation * -->
<!-- *************** -->	

<target name="compile" depends="prepare">
	<javac destdir="${build.dir}/openscope"
			debug="on"
			deprecation="on"
			optimize="off">
		<src path="${src.dir}"/>
	<classpath refid="build.classpath"/>
	</javac>

	<copy todir="${build.dir}/openscope">
		<fileset dir="${props.dir}">
			<include name="*.properties"/>
		</fileset>
	</copy>
</target>

Ant compiles things with the "javac" task. It takes a few parameters and optional flags we can use to customize the actual compile command. This target also copies any properties files into the scrub directory.

3) Package up all the class files into a jar file, or some kind of deployable artifact

<!-- *************** -->
<!-- *   Building  * -->
<!-- *************** -->

<!-- Package the logic module -->
<target name="package-logic" depends="compile">
	<jar jarfile="${build.dir}/jars/${logic.file}">
		<fileset dir="${build.dir}/openscope">
			<include name="com/openscope/**"/>
			<include name="*.properties"/>
		</fileset>

		<metainf dir="${resources.dir}">
			<include name="persistence.xml"/>
		</metainf>
	</jar>

	<copy todir="${basedir}/deploy/${ear.file}">
		<fileset dir="${build.dir}/jars">
			<include name="${logic.file}"/>
		</fileset>
	</copy>
</target>

<target name="build-war" depends="package-logic">

	<jar jarfile="${build.dir}/jars/${war.file}"
		basedir="${basedir}/deploy/${ear.file}/${war.file}"/>

</target>

The "jar" task jars up the contents of a directory. We can add files to the META-INF directory with a file include directive under the "metainf" element inside the "jar" task.

4) Copy the new jar file to the deploy directory

<!-- **************** -->
<!-- * Make the Ear * -->
<!-- **************** -->

<!-- Creates the application ear file. -->
<target name="assemble-app" depends="package-logic,build-war">

	<ear destfile="${build.dir}/${ear.file}"
		basedir="${basedir}/deploy/${ear.file}"
		appxml="application.xml"
	>

	<manifest>
		<attribute name="Built-By"
			value="Openscope Networks"/>
		<attribute name="Implementation-Vendor"
			value="Openscope Networks"/>
		<attribute name="Implementation-Title"
			value="Webminders"/>
		<attribute name="Implementation-Version"
			value="0.1"/>
	</manifest>

	</ear>

</target>

The "ear" task, as you can imagine, packages up an ear file for deployment. It works very similarly to the jar task and offers a few more options that relate directly to the ear file. You can find more tasks on the Ant documentation page.

If you put these basic steps together and add some properties, you will end up with a simple ant script that can build most of your java projects (a sketch of the property wiring follows below). Customization, of course, is where the power of any scripting tool ends up earning its keep. Java docs can be generated as part of the builds, FindBugs can do code analysis, deployable artifacts can be ftp/scp'd across the network – heck, you can even write your own Ant task to do whatever automated unit of work you want to define.
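
As a hedged sketch, the property wiring these targets assume might look something like this (all paths and file names are made up):

<project name="openscope" default="assemble-app" basedir=".">

	<property name="src.dir" value="${basedir}/src"/>
	<property name="props.dir" value="${basedir}/props"/>
	<property name="resources.dir" value="${basedir}/resources"/>
	<property name="build.dir" value="${basedir}/build"/>
	<property name="logic.file" value="openscope-logic.jar"/>
	<property name="war.file" value="openscope.war"/>
	<property name="ear.file" value="openscope.ear"/>

	<path id="build.classpath">
		<fileset dir="${basedir}/lib" includes="*.jar"/>
	</path>

	<!-- the clean, prepare, compile, package-logic, build-war
	     and assemble-app targets from above go here -->

</project>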

Resources:
Here’s the complete ant script that I use for one of my simple projects.
The Ant task documentation page