XML Application Development Tutorial by Dale Cover, Orbidex XML Developer

 




XML Developer's Tutorial

This document should serve as a starting point for anyone who is interested in XML and its related technologies from a programmer's point of view. After going through this tutorial, you will be able to identify the many different components of the XML development environment and understand how to deploy them. The tutorial covers XML, XSL, and XSLT and gives examples written on both Microsoft and Java development platforms. This is part 1 of the tutorial. Today you will learn about XML and XSLT. You will go to work tomorrow having written an XML document and an XSLT document, and having transformed that XML document for display. All tools and source code will be provided. All you need to do is spend a little time.

• What should I know?
To work with XML, an understanding of web-based, client/server application development is a must. The language in which you choose to develop, however, isn't important. Our examples will be written in VBScript and Java. Visual Basic and JavaScript could also be used; they have similar syntax and are identical in concept. You will see that a little bit goes a long way. That's because many of the World Wide Web's technologies have been developed in the open source arena and are overseen by the World Wide Web Consortium (W3C). The W3C, along with many contributors, defines specifications and guidelines for application development while encouraging interoperability of disparate systems. XML is no exception. In fact, it is this paradigm that XML supports so well. Keep this in mind when going through this tutorial. The concepts are the most important things to remember here, NOT the syntax.

XML Extensible Markup Language
XML is a language derived from the Standard Generalized Markup Language (SGML), as is the familiar HyperText Markup Language (HTML). SGML is complex and very difficult to conform to. It is due to this complexity that many organizations rule it out as a possible solution. HTML, which I'm sure you're familiar with, is easy. It defines a particular set of tags and their purpose, which makes learning HTML very straightforward. HTML tags, however, only communicate the visual display of data. For example, the specification defines a <font/> tag which takes attributes for the font size and color. It specifies <table/>, <tr/>, and <td/> tags which express the arrangement of data in a table of rows and columns. HTML is great for expressing the layout and display of data.

Contrary to what many believe, HTML is not being replaced by XML. What XML is doing is helping to define other languages that are similar to HTML. For example, the Wireless Application Protocol (WAP) defines a Wireless Markup Language (WML), which is said to be an application of XML. WML is similar to HTML in that its many tags are used to communicate visual layout to the processor, the processor being the program that interprets the WML, be it a Palm Pilot or a cell phone's internal "browser".

Let's go back a few sentences and discuss what it means to be "an application of XML". Well, a WML document is actually a valid and well-formed XML document. It looks like HTML because the tags it defines serve a very similar purpose: to be displayed in some type of browser.

For example:

<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
"http://www.wapforum.org/DTD/wml_1.1.xml">

<wml>
	<card id="test">
		<p>
			<a href="http://www.xmldataworld.com"
			>Hello Wap user!!!</a>
		</p>
	</card>

</wml>


The above document has HTML-like tags, with one in particular that should be familiar: <a/>. It is a hyperlink with the display text "Hello Wap user!!!". Notice the <?xml version="1.0"?> line. It is called an XML declaration. It declares that the document is in fact an XML document and that its version is the current version, 1.0. The next line is the document type declaration. It references a Document Type Definition (DTD), a document that defines all valid tags, or elements, and their attributes. These DTDs are publicly available and can be referenced from your programs to validate a document such as the one above. If any of the tags are illegal, that is, if any combination of tags and attributes in your document is not defined in the DTD, the document is said to be invalid. When a validating parser is used in your program, invalid data will cause errors to be raised. Have you ever used an HTML validator and wondered how it knows what's legal or illegal? That's the power of the DTD. Check out the HTML 3.2 DTD to get a look. Basically, it declares entities, elements and their associated attributes. If you understand DTDs, you could learn HTML from this document alone, as it is the primary source. Note, however, that the use of a DTD for validation purposes is optional. In XML, the DTD ensures the validity of a document, but it is not necessary.
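To make the idea of validity concrete, here is a minimal Java sketch using the JAXP parser that ships with the JDK. It does not use the WML DTD. The tiny inline DTD, the <greeting> element, and the class and method names are all assumptions made up for illustration. The sketch collects validity errors in a list; how errors actually surface depends on the parser and the error handler you install.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {

    // Hypothetical inline DTD: a <greeting> element may contain only text.
    static final String DTD =
        "<!DOCTYPE greeting [ <!ELEMENT greeting (#PCDATA)> ]>";

    /** Parses the XML with DTD validation switched on and returns the validity errors. */
    static List<String> validityErrors(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // ask the parser to check the document against its DTD
        DocumentBuilder builder = factory.newDocumentBuilder();

        List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            // Validity problems arrive here as recoverable errors.
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            // Well-formedness problems are fatal, so let them propagate.
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String valid = "<?xml version=\"1.0\"?>" + DTD + "<greeting>Hello</greeting>";
        // <b> is not declared in the DTD, so this document is well formed but invalid.
        String invalid = "<?xml version=\"1.0\"?>" + DTD + "<greeting><b>Hello</b></greeting>";
        System.out.println("valid document errors:   " + validityErrors(valid));
        System.out.println("invalid document errors: " + validityErrors(invalid));
    }
}
```

Because validation is optional, the same parser with setValidating(false) would accept both documents, since both are well formed.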

Now that you understand DTDs and what it means to be a valid XML document, let me add another characteristic of XML to the list: "well formed". Think of well-formedness as really, really good HTML code where everything is properly nested and all attribute values are enclosed in single or double quotes. All XML documents MUST be well formed. Well-formedness is simple. All tags MUST be properly nested within each other. Every element MUST have a start and end tag like this: <mytag></mytag> OR the shortcut <mytag/>. The second option is useful if the element is empty.

For example:
Use this when the element is full:
	<name>John is my name</name>

Use this or the short cut below it if the element is empty:
	<name></name>                                 
	<name/>

This would be wrong:
	<tstDocument>
		<name>John is my Name
	</tstDocument>
		</name>

This would be correct:  ( 2 names and 2 empty name tags )
	<tstDocument>
		<name sex="male">John is my Name</name>
		<name sex="female">Mary is my Name</name>
		<name/>
		<name></name>				
	</tstDocument>
Notice the enclosing <tstDocument/> tags. They are similar to the <HTML/> and <wml/> tags. Each of these is a ROOT element, and all other tags MUST BE CHILDREN or DESCENDANTS of the root. That is, all other elements must be nested within the root or within any element that's already under the root.
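The well-formedness rules above can be checked mechanically by handing a document to any XML parser. The following is a minimal Java sketch using the JAXP parser bundled with the JDK. The class and method names are made up for illustration; the two test documents are the "wrong" and "correct" examples from above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedSketch {

    /** Returns true if the parser accepts the text as a well-formed XML document. */
    static boolean isWellFormed(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Badly nested or unclosed tags end up here.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String wrong =
            "<tstDocument><name>John is my Name</tstDocument></name>";
        String correct =
            "<tstDocument>"
          + "<name sex=\"male\">John is my Name</name>"
          + "<name sex=\"female\">Mary is my Name</name>"
          + "<name/><name></name>"
          + "</tstDocument>";
        System.out.println(isWellFormed(wrong));   // false
        System.out.println(isWellFormed(correct)); // true
    }
}
```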

To find out more about the XML specification go to http://www.w3.org/TR/REC-xml.

Consider the following XML Document:

<?xml version="1.0"?>
<mydocument>

<paragraph>
      XML is defined as the "Extensible 
      Markup Language". This markup language 
      is a universal language 
      for describing and structuring data in 
      its own independent form. No longer will 
      data need to rely on specific formatting 
      from the  various applications you may use.  
      XML can be used to define unlimited languages 
      for specific industries and applications 
      and even promises to simplify and lower 
      the cost of data interchange 
      and publishing in a Web environment. 
</paragraph>

<paragraph>
      XML is a text-based syntax that is readable by 
      both computer and humans. XML offers data 
      portability and reusability across different 
      platforms and devices. It is also flexible and 
      extensible, allowing new additions, definitions 
      and "tags" to  be added without breaking 
      an existing document structure. 
</paragraph>

</mydocument>

This XML document is well formed, but it is not valid because we didn't reference any DTD describing its valid contents. Recall that a DTD is optional. We do, however, have an XML declaration. The XML 1.0 specification does not strictly require it, but it is good practice to always include it. Next we have the root, or document, element <mydocument/>. Contained within this root element are two <paragraph/> elements. These two elements each contain character data, which are the contents of the respective elements. That's it! A perfect XML document. The actual construction is very simple, just as our sample document is simple. However, if you want to ensure that you know all there is to know, be sure to go to the source, the XML specification at http://www.w3.org/TR/REC-xml.

A savvy programmer with a keen eye would have noticed that our XML document contains elements that don't seem to describe markup, and that they seem to have been made up on the fly. That's because the tags I used were made up right before creating this article. That's possible because XML doesn't have any particular set of fixed tags. There's no DTD for our new XML document unless we create one. Otherwise, we can use any tag we want. Why? Because we will be writing the processing application, and we know what those tags mean to us. We know that whenever we see a <mydocument/> tag, it contains <paragraph/> elements. We know that those <paragraph/> elements contain the actual paragraph data. Our XML application is used to tell us about our data rather than how our data should look.

That's why, as you may recall from earlier, XML is not a replacement for HTML. An application of XML, however, could very well be written as a replacement. WML is, to me, HTML with less functionality, to accommodate smaller, less powerful wireless clients. We could write another specification just like HTML using XML if we felt it was necessary. Actually, it's already been done. XHTML is an XML-based application which is a rewrite of HTML. An XHTML page is a valid XML document with predefined tags mimicking HTML. One of the advantages is that it can be manipulated by XML parsers, which we will be discussing shortly. It also takes advantage of DTD validation. For most purposes, HTML will do just fine, especially since it's supported by most of the world's web browsers. XHTML, however, is also supported because, to the browser, it is HTML with very, very good coding. Keep that in mind because in this next section, we're going to transform our XML document into HTML. Since XHTML is actually XML, you might get some bright ideas.
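The idea that our own processing application gives the made-up tags their meaning can be sketched in a few lines of Java. This is only an illustration using the DOM parser bundled with the JDK. The class and method names are assumptions, and the sample text in main is shortened from the document above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ParagraphReaderSketch {

    /** Returns the text of every <paragraph> element found under the root element. */
    static List<String> paragraphs(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        // Our application "knows" that <mydocument> holds <paragraph> elements.
        NodeList nodes = doc.getDocumentElement().getElementsByTagName("paragraph");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // trim() drops the indentation whitespace around the character data.
            result.add(nodes.item(i).getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<?xml version=\"1.0\"?>"
          + "<mydocument>"
          + "<paragraph> XML is defined as the Extensible Markup Language. </paragraph>"
          + "<paragraph> XML is a text-based syntax. </paragraph>"
          + "</mydocument>";
        for (String p : paragraphs(xml)) {
            System.out.println(p);
        }
    }
}
```

Nothing in the parser knows what a <paragraph/> is. The meaning lives entirely in the code that reads the tree.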

• XSL Extensible Stylesheet Language
The Extensible Stylesheet Language is yet another application of XML, one which allows programmers to express style sheets. XSL expresses how data should be laid out in a browser, on a handheld, or on physical pages such as a catalogue or pamphlet. What we are concerned with here, since we are web programmers, is XSL Transformations (XSLT). XSLT is a way to transform our XML document into other XML documents or, for our purposes, into XHTML or HTML. Below is an example XSL style sheet:

<?xml version="1.0"?> 

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="yes"/>

<xsl:template match="mydocument">
	<xsl:value-of select="paragraph"/>
</xsl:template> 

</xsl:stylesheet>

Notice that it is a well-formed XML document containing the XML declaration on the first line. The next line carries what we call a namespace declaration. It states that any tag or element in this document that is prefixed with "xsl:" belongs to this namespace. The URI makes the namespace unique, since anyone can create a namespace. At Orbidex, a possible namespace could be: xmlns:orb="http://www.orbidex.com/XML/Public/DTDS". Any tag from our namespace would look like this: <orb:item/>. Therefore, if you created your own <item/> tag, it wouldn't be ambiguous to the processor. After the namespace, we have other elements such as <xsl:output/>. Use this to set the output method, and with it the content type, along with other options. All elements and valid tags in the XSL namespace can be found at http://www.w3.org/Style/XSL. Again, to ensure you know all there is to know, check this site out. This is the W3C, so if you go here, you don't necessarily have to go anywhere else. What is extremely important here is the <xsl:template/> element. It defines a template that matches any <mydocument> element and gets the value of the first <paragraph/> element it encounters. When our mydocument.xml file is transformed with this mydocument.xsl file, the browser should display the contents of the first <paragraph/> element encountered after matching the <mydocument> element.
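To see the template's behavior without a browser, the same style sheet can be applied with the XSLT processor bundled with the JDK. This is a minimal sketch and not the server-side code promised for later in this tutorial. The class and method names are assumptions, and the two-paragraph input is a shortened stand-in for mydocument.xml.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TransformSketch {

    // The style sheet from above, with the namespace URI kept on one line.
    static final String XSL =
        "<?xml version=\"1.0\"?>"
      + "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"yes\"/>"
      + "<xsl:template match=\"mydocument\">"
      + "<xsl:value-of select=\"paragraph\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the style sheet text to the XML text and returns the result as a string. */
    static String transform(String xml, String xsl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<?xml version=\"1.0\"?>"
          + "<mydocument>"
          + "<paragraph>First paragraph.</paragraph>"
          + "<paragraph>Second paragraph.</paragraph>"
          + "</mydocument>";
        // Only the first <paragraph> is selected by xsl:value-of.
        System.out.println(transform(xml, XSL));
    }
}
```

In XSLT 1.0, xsl:value-of takes the string value of the first node its select expression matches, which is why the second paragraph does not appear in the result.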

• LET'S DO IT!

What we need.
All we need is Internet Explorer 5.0 or above to run this example. We need IE5 because it is currently the only browser that supports XSLT. If our clients don't have IE5 or above, don't worry: we can transform the data on the server into HTML or XHTML and then send it. This eliminates any incompatibilities and makes XML cross-browser (actually cross-whatever) compatible.

You can go to Microsoft.com and get an IE5 upgrade, or you can just learn from this example. Later on in this tutorial, as aforementioned, we will be doing server-side transformations using VBScript and Java.

• BEGIN

Step 1
Download this XML Document and put it in any directory
--------
myDocument.xml

Step 2
Download this XSL Document and put it in the same directory
--------
mydocument.xsl

Step 3
Open mydocument.xml in Internet Explorer 5 or above and view the contents of the first <paragraph/> element contained within the <mydocument/> element.

Go to the XSL Transformations site to learn about more tags. Add as many elements as you wish to these documents and learn, learn, learn.


 

 
150 Chestnut St. Providence RI 02903 | 1-877-ORBIDEX | xml@orbidex.com

© 2001 - Orbidex Inc. - All Rights Reserved