Sunday, December 30, 2007

The Rules of XML at The Publisher's Arms

Good Evening Ladies & Gents,

It has been a little while since the last posting. What with the Christmas and New Year holidays upon us, we have been a little stretched behind the bar here.

After talking to a couple of our regulars here at The Publisher's, it seems that a few people are having some issues when it comes to writing well-formed and valid XML for Bursting Control Files and Data Templates. On further investigation, I discovered that a lot of people are unaware of the correct Rules of XML. The Publisher engines will not accept XML that does not adhere to the Rules of XML.

To help people out, I decided to jot the rules down on a beer mat for the regulars to share, and now I am typing them up here.

Definitions

  • Well-Formed XML - XML is well-formed when it conforms to the very strict, but simple rules laid out in the W3C XML Specification.
  • Valid XML - XML is valid when it conforms to the rules dictated by the W3C XML Specification and also the rules specified by a DTD or a Schema. (Note: All Bursting Control Files and Data Templates are validated against a schema when they are uploaded to a Data Definition)
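
For anyone who wants to test the first definition before uploading a file, here is a minimal sketch of a well-formedness check using Python's standard library. The function name is_well_formed and the sample <order> documents are made up for illustration and have nothing to do with the Publisher schemas. Note that this parser checks well-formedness only; checking validity against a DTD or Schema needs a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, otherwise (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The message includes the line and column where parsing stopped.
        return False, str(err)
    return True, None


# Hypothetical sample documents, for illustration only.
good = "<order><item>Pint of bitter</item></order>"
bad = "<order><item>Pint of bitter</order>"  # <item> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

The second call reports a mismatched tag along with its position, which is usually enough to find the offending line.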

The Rules of XML

  1. Close on the way out - All XML elements must have a closing tag. An empty element may instead be written as a single self-closing tag, for example <br/>.
  2. Keep an eye on your cases - XML element names are case sensitive. "Message" is not the same as "message". Opening and closing tags must match exactly.
  3. Get nested - XML elements must be properly nested. This means all Child Elements must be closed before the parent element can be closed.
  4. Root marks the spot - An XML document must have one and only one root element. All other XML elements must be within this root. The only things allowed outside the root are the XML declaration (which, if present, must be the very first thing in the document), a DOCTYPE declaration, comments and Processing Instructions.
  5. Quote my attributes - All XML element attributes must have quotes around the values. You may use either single (') or double (") quotes, but be consistent throughout your document to make it easier for someone else to modify in the future.
  6. Preserve my white-space - All white-space is preserved in an XML document. If you add a line break in your data, it will remain. (Note: When viewing XML through IE, it will suppress the white-space, but it is still there if you view the document in Notepad!)
  7. Going to a naming convention - Finally, there are a few restrictions on the names of XML elements:
  • Element names can contain letters, numbers and a small set of other characters: the hyphen (-), underscore (_), full stop (.) and colon (:), the last of which is best left to namespaces
  • Element names MUST start with a letter or an underscore; they MUST NOT start with a number or with any other punctuation character
  • Element names MUST NOT start with XML, or any upper or lower case combination of these three letters (xml or XmL, for example)
  • Element names MUST NOT contain any white-space
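
The first six rules can be tried out with a short sketch like the one below, again using Python's standard library parser. The helper name parses and every element name in the samples are hypothetical; each broken sample breaks exactly one of the rules, and the last two lines illustrate the self-closing tag and the white-space rule.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the standard library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One hypothetical sample per rule; each one should be rejected.
broken = {
    "rule 1, missing closing tag": "<note><to>Regulars</to>",
    "rule 2, case mismatch": "<Message>Last orders</message>",
    "rule 3, bad nesting": "<bar><pump></bar></pump>",
    "rule 4, two root elements": "<bar/><cellar/>",
    "rule 5, unquoted attribute": "<pint size=568/>",
}

for rule, sample in broken.items():
    print(rule, "->", "accepted" if parses(sample) else "rejected")

# A self-closing empty element is fine under rule 1.
print(parses("<glass/>"))

# Rule 6: the line break and indentation inside the element survive parsing.
text = ET.fromstring("<msg>line one\n  line two</msg>").text
print(repr(text))
```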

Do be careful when naming elements and including special characters. For example, if you were to call an element element.name1, software that parses it could interpret name1 as a property of an object called element, which would obviously be incorrect! Similarly, steer clear of a name like a+b: the plus sign is not a legal name character in the first place, and XML Publisher will interpret a+b as a summation of the elements a and b and will subsequently error or produce a blank.
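
As a rough illustration of the naming rules and the warning above, the sketch below checks a candidate element name. It is a simplification and not what XML Publisher itself does: the pattern covers ASCII names only (the XML Specification also allows many non-ASCII letters and the colon), and the function name check_element_name and the wording of the messages are invented for this example.

```python
import re

# Simplified, ASCII-only version of the XML name rule: a letter or
# underscore first, then letters, digits, underscore, full stop or hyphen.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def check_element_name(name):
    """Return a list of problems found with a candidate element name."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("illegal character or illegal first character")
    elif "." in name:
        # Legal XML, but the text above warns it can be misread.
        problems.append("legal, but '.' may be misread as object.property")
    if name.lower().startswith("xml"):
        problems.append("starts with the reserved prefix 'xml'")
    return problems


# Hypothetical names, for illustration only.
for candidate in ["INVOICE_NUMBER", "1st_item", "a+b", "element.name1", "XmlData", "my name"]:
    print(candidate, "->", check_element_name(candidate) or "looks fine")
```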

As you can see, we have very large beer mats here at The Publisher's Arms, but if you adhere to these very simple rules you should have fewer problems when creating your Bursting Control Files and Data Templates.

Ding, ding... drink up please... give us your glasses then show us your a*ses!!

Remember people, don't drink and drive!!

Cj
