|
This document should serve as a starting point
for anyone who is interested in XML and its related technologies
from a programmer's point of view. After going through this
tutorial, you will be able to identify the different
components of the XML development environment and understand
how to deploy them. The tutorial covers XML,
XSL, and XSLT and gives examples written on both the Microsoft
and Java development platforms. This is part 1 of the tutorial.
Today you will learn about XML and XSLT. You will go to work
tomorrow having written an XML document and an XSLT document,
and having transformed that XML document for display. All tools
and source code will be provided. All you need to do is spend
a little time.
What should I know?
To work with XML, an understanding of web-based, client/server
application development is a must. The language in which you
choose to develop, however, isn't important. Our examples
will be written in VBScript and Java. Visual Basic and JavaScript
could be used just as well: their syntax is similar, and the
concepts carry over unchanged. You will see that a little bit
goes a long way. That's because many of the World Wide Web's
technologies have been developed in the open source arena
and overseen by the World Wide Web Consortium (W3C). The
W3C, along with many contributors, defines specifications
and guidelines for application development while encouraging
interoperability of disparate systems. XML is no exception.
In fact, it is this paradigm that XML supports so well. Keep
this in mind when going through this tutorial. The concepts,
not the syntax, are the most important things to remember here.
XML
Extensible Markup Language
XML is a language derived from the Standard Generalized
Markup Language (SGML), as is the familiar Hypertext Markup
Language (HTML). SGML is complex and very difficult to conform
to. It is due to this complexity that many organizations rule
it out as a possible solution. HTML, which I'm sure
you're familiar with, is easy. It defines a particular set of
tags and their purpose, which makes learning HTML very straightforward.
HTML tags, however, only communicate the visual display of data.
For example, the specification defines a <font/> tag which
takes attributes for the font size and color. It specifies <table/>,
<tr/>, and <td/> tags which express the arrangement of
data in a table of rows and columns. HTML is great for expressing
the layout and display of data.
Contrary to what many believe, HTML is not being replaced by
XML. What XML is doing is helping to define other languages
that are similar to HTML. For example, the Wireless Application
Protocol (WAP) defines a Wireless Markup Language (WML), which
is said to be an application of XML. WML is similar to HTML in
that its many tags are used to communicate visual layout to
the processor, the processor being the program that interprets the
WML, be it on a Palm Pilot or a cell phone's internal "browser".
Let's go back a few sentences and discuss what it means to be
"an application of XML". WML is a language defined using XML, so
every WML document is a well-formed XML document, and a valid one
when it conforms to the WML DTD. It looks like HTML because
its tags are defined for a very similar purpose, that
is, to be displayed in some type of browser.
For example:
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
"http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
<card id="test">
<a href="http://www.xmldataworld.com"
>Hello Wap user!!!</a>
</card>
</wml>
The above document has HTML-like tags, with one in particular
that should be familiar: <a/>. It is a hyperlink with
the display text "Hello Wap user!!!". Notice the <?xml version="1.0"?>
line. It is called an XML declaration. It declares that the document
is in fact an XML document and that its version is the current
version, 1.0. Nothing, not even a space, may come between "<?"
and "xml". The next line is the document type declaration. It
points to the Document Type Definition (DTD), a document that
defines all valid tags, or elements, and their
attributes. These DTDs are publicly available and can be
referenced from your programs to validate your document, such
as the one above. If any of the tags are illegal, that is, if
any combination of tags and attributes in your document is
not defined in the DTD, the document is said to be invalid.
When used in your program, a validating parser will report
invalid data as errors. Have you ever used an HTML validator and wondered
how it knows what's legal or illegal? That's the power of the
DTD. Check out the HTML 3.2 DTD to get a
look. Basically, it declares entities, elements, and their associated
attributes. If you understand DTDs, you could learn HTML just
from this document alone, as it is the primary source. Note, however,
that the use of a DTD for validation purposes is optional. In
XML, the DTD ensures the validity of a document, but it is not
necessary.
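To make the idea of validity concrete, here is a minimal Java sketch, assuming the JAXP parser that ships with the JDK. The <note/> document type and the class name DtdValidationSketch are made up for illustration, and the DTD is embedded in the document itself (an internal subset) so that nothing has to be fetched over the network. This is one possible way to validate, not the only one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class DtdValidationSketch {

    // Hypothetical DTD: a <note> must contain exactly one <to> and one <body>.
    public static final String DTD =
        "<!DOCTYPE note [\n"
      + "  <!ELEMENT note (to, body)>\n"
      + "  <!ELEMENT to (#PCDATA)>\n"
      + "  <!ELEMENT body (#PCDATA)>\n"
      + "]>\n";

    /** Returns the validation error messages; an empty list means the document is valid. */
    public static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // ask the parser to check the document against its DTD
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            // Validity problems are reported here as recoverable errors.
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            // Well-formedness problems are fatal; let them propagate.
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String valid = "<?xml version=\"1.0\"?>\n" + DTD
            + "<note><to>Mary</to><body>Hello</body></note>";
        // <subject> is not declared in the DTD, and <body> is missing.
        String invalid = "<?xml version=\"1.0\"?>\n" + DTD
            + "<note><to>Mary</to><subject>Hi</subject></note>";

        System.out.println("valid document, errors: " + validate(valid));
        System.out.println("invalid document, errors: " + validate(invalid));
    }
}
```

Running main should report no errors for the first document and at least one for the second. The exact wording of the messages depends on the parser.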
Now that you understand DTDs and what it means to be a valid
XML document, let me add another characteristic of XML to
the list: "well formed". Think of well-formedness as really,
really good HTML code where everything is properly nested and
attribute values are all enclosed in single or double quotation
marks. All XML documents MUST be well formed. Well-formedness is
simple. All tags MUST be properly nested within each other. Every
element MUST have a start tag and an end tag, like this:
<mytag></mytag>, OR be written as an empty-element tag: <mytag/>.
The second option is useful if the element's content is empty.
For example:
Use this when the element is full:
<name>John is my name</name>
Use this or the shortcut below it if the element is empty:
<name></name>
<name/>
This would be wrong:
<tstDocument>
<name>John is my Name
</tstDocument>
</name>
This would be correct: ( 2 names and 2 empty name tags )
<tstDocument>
<name sex="male">John is my Name</name>
<name sex="female">Mary is my Name</name>
<name/>
<name></name>
</tstDocument>
Notice the enclosing <tstDocument/> tags. They are similar to
the <HTML/> tags and <wml/> tags. These are the ROOT elements,
and a document has exactly one. All other tags MUST BE CHILDREN
or DESCENDANTS of the root. That
is, all other elements must be nested within the root or within
any element that's already under the root.
To find out more about the XML specification, go to http://www.w3.org/TR/REC-xml.
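A parser can check well-formedness for us. The following Java sketch assumes the JAXP parser bundled with the JDK. The class name WellFormedSketch and the helper isWellFormed are made up for this illustration. It simply tries to parse a string and reports whether the parser accepted it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedSketch {

    /** Returns true if the parser accepts the text as well-formed XML. */
    public static boolean isWellFormed(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // A well-formedness violation, such as improperly nested tags.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // The "wrong" example from the text: </tstDocument> closes before </name>.
        String wrong = "<tstDocument><name>John is my Name</tstDocument></name>";
        // The "correct" example: everything is nested inside a single root.
        String correct = "<tstDocument>"
            + "<name sex=\"male\">John is my Name</name>"
            + "<name sex=\"female\">Mary is my Name</name>"
            + "<name/><name></name>"
            + "</tstDocument>";

        System.out.println("wrong example well formed?   " + isWellFormed(wrong));
        System.out.println("correct example well formed? " + isWellFormed(correct));
    }
}
```

Note that no DTD is involved here. Well-formedness is checked by every XML parser, whether or not it validates.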
Consider the following XML Document:
<?xml version="1.0"?>
<mydocument>
<paragraph>
XML is defined as the "Extensible
Markup Language". This markup language
is a universal language
for describing and structuring data in
its own independent form. No longer will
data need to rely on specific formatting
from the various applications you may use.
XML can be used to define unlimited languages
for specific industries and applications
and even promises to simplify and lower
the cost of data interchange
and publishing in a Web environment.
</paragraph>
<paragraph>
XML is a text-based syntax that is readable by
both computer and humans. XML offers data
portability and reusability across different
platforms and devices. It is also flexible and
extensible, allowing new additions, definitions
and "tags" to be added without breaking
an existing document structure.
</paragraph>
</mydocument>
This XML document is well formed, but it is not valid because
we didn't reference any DTD describing its valid contents.
Recall that a DTD is optional. We do, however, have an XML declaration.
It is also optional in XML 1.0, but including it is strongly
recommended. Next we have the
root, or document, element <mydocument/>. Contained within
this root element are two <paragraph/> elements. These
two elements each contain character data, which is the content
of the respective elements. That's it! A perfect XML
document. Be sure to check out the XML specification at
http://www.w3.org/TR/REC-xml to find out more. The actual construction
is very simple, just as our sample document is simple. However,
if you want to ensure that you know all there is to know, be
sure to go to the source.
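The structure just described (one root element containing two <paragraph/> elements) can be inspected from a program. Below is a minimal Java sketch, assuming the DOM parser that ships with the JDK. The class name DocumentStructureSketch and its helper methods are hypothetical, and the paragraph text is shortened to keep the example small.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DocumentStructureSketch {

    // A shortened stand-in for the sample document above.
    public static final String XML =
        "<?xml version=\"1.0\"?>"
      + "<mydocument>"
      + "<paragraph>XML is defined as the Extensible Markup Language.</paragraph>"
      + "<paragraph>XML is a text-based syntax.</paragraph>"
      + "</mydocument>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the name of the root (document) element. */
    public static String rootName(String xml) throws Exception {
        return parse(xml).getDocumentElement().getTagName();
    }

    /** Returns the character data of every <paragraph> element, in document order. */
    public static List<String> paragraphs(String xml) throws Exception {
        Element root = parse(xml).getDocumentElement();
        NodeList nodes = root.getElementsByTagName("paragraph");
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            texts.add(nodes.item(i).getTextContent());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("root element: " + rootName(XML));
        for (String text : paragraphs(XML)) {
            System.out.println("paragraph: " + text);
        }
    }
}
```

The program never needed a DTD. It works because we, the authors of the processing application, know what <mydocument/> and <paragraph/> mean.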
A savvy programmer with a keen eye would have noticed that our
XML document contains elements that don't seem to describe markup
and that seem to have been made up on the fly. That's because
the tags I used were made up right before creating this article.
That's possible because XML doesn't have any particular set
of fixed tags. There's no DTD for our new XML document unless
we create one. Without one, we can use any tag we want. Why?
Because we will be writing the processing application and we
know what those tags mean to us. We know that whenever we see
a <mydocument/> tag, it contains <paragraph/>
elements. We know that those <paragraph/> elements contain
the actual paragraph data. Our XML application is used to tell
us about our data rather than how our data should look. That's
why, as you may recall from earlier, XML is not a replacement for HTML.
An application of XML could very well be written as a
replacement, though. WML is, to me, HTML with less functionality to
accommodate smaller, less powerful wireless clients. We could
write another specification just like HTML using XML if we felt
it was necessary. Actually, it's already been done. XHTML is
an XML-based application which is a rewrite of HTML. An XHTML
document is a valid XML document with predefined tags mimicking HTML. One
of the advantages is that it can be manipulated by XML parsers,
which we will be discussing shortly. It also takes advantage
of DTD validation. For most purposes, HTML will do just fine,
especially since it's supported by most of the world's web browsers.
XHTML, however, is also supported because, to the browser, it is
HTML with very, very good coding. Keep that in mind because
in this next section, we're going to transform our XML document
into HTML. Since XHTML is actually XML, you might get some bright
ideas.
XSL
Extensible Stylesheet Language
The Extensible Stylesheet Language is yet another application
of XML, one which allows programmers to express style sheets. XSL
expresses how data should be laid out in a browser, on a handheld,
or on physical pages such as a catalogue or pamphlet. What we are
concerned with here, since we are web programmers, is XSL Transformations
(XSLT). XSLT is a way to transform our XML document into other
documents, typically other XML documents. For our purposes, that
means XHTML or HTML. Below is an example XSL style sheet:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html" indent="yes"/>
<xsl:template match="mydocument">
<xsl:value-of select="paragraph"/>
</xsl:template>
</xsl:stylesheet>
Notice that it is a well-formed XML document containing the XML declaration
on the first line. The next line contains what we call a namespace
declaration. It states that any tag or element in this document that is
prefixed with "xsl:" belongs to this namespace. The URI makes
this namespace unique, since anyone can create a namespace.
At Orbidex, a possible namespace declaration could be: xmlns:orb="http://www.orbidex.com/XML/Public/DTDS"
Any tag from our namespace would look like this: <orb:item/>
Therefore, if you created your own <item/> tag, it
wouldn't be ambiguous to the processor. After the namespace declaration,
we have other elements such as <xsl:output/>. Use this
to set the output method (HTML in our case) along with other options. All elements
and valid tags in the XSL namespace can be found at http://www.w3.org/Style/XSL.
Again, to ensure you know all there is to know, check this
site out. This is the W3C, so if you go here, you don't necessarily
have to go anywhere else. What is extremely important here
is the <xsl:template/> element. It defines a template that
matches any <mydocument> element and gets the value
of the first <paragraph/> child element it finds. When our mydocument.xml
file is transformed with this mydocument.xsl file, the browser
should display the contents of the first <paragraph/>
element encountered after matching the <mydocument>
element.
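The same transformation can also be run outside a browser. The Java sketch below assumes the XSLT 1.0 processor that ships with the JDK (the javax.xml.transform API). The class name TransformSketch is made up, and the XML input is a shortened stand-in for mydocument.xml. The style sheet is the one shown above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TransformSketch {

    // The style sheet from the text, with the namespace URI on a single line.
    public static final String XSL =
        "<?xml version=\"1.0\"?>"
      + "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:output method=\"html\" indent=\"yes\"/>"
      + "<xsl:template match=\"mydocument\">"
      + "<xsl:value-of select=\"paragraph\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // A shortened stand-in for mydocument.xml.
    public static final String XML =
        "<?xml version=\"1.0\"?>"
      + "<mydocument>"
      + "<paragraph>First paragraph.</paragraph>"
      + "<paragraph>Second paragraph.</paragraph>"
      + "</mydocument>";

    /** Applies the style sheet to the XML text and returns the result as a string. */
    public static String transform(String xml, String xsl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // In XSLT 1.0, value-of takes the first node selected by "paragraph",
        // so only the first paragraph's text appears in the output.
        System.out.println(transform(XML, XSL));
    }
}
```

Only the text of the first <paragraph/> element shows up in the result, which is what the template description above leads us to expect. Emitting every paragraph would take a different instruction, such as xsl:for-each or xsl:apply-templates.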
LET'S DO IT!
What we need.
All we need is Internet Explorer 5.0 or above to run this
example. We need IE5 because, at the time of writing, it is the
only browser that supports XSLT. If our clients don't have IE5 or above,
don't worry: we can transform the data on the server into
HTML or XHTML and then send it. This eliminates any incompatibilities
and makes XML cross-browser (actually cross-whatever) compatible.
You can go to Microsoft.com and get an IE5 upgrade, or you
can just learn from this example. Later on in this tutorial,
as aforementioned, we will be doing server-side transformations
using VBScript and Java.
BEGIN
Step 1
Download this XML Document and put it in any directory
--------
myDocument.xml
Step 2
Download this XSL Document and put it in the same directory
--------
mydocument.xsl
Step 3
Open mydocument.xml in Internet Explorer 5 or above and view
the contents of the first <paragraph/> element contained
within the <mydocument/> element.
Go to the XSL Transformations site to find out about more tags.
Add as many elements as you wish to these documents and learn,
learn, learn.
|