XslGen is a simple, yet effective tool for web site content management. Develop HTML layout in your favorite HTML editor, and keep your content in XML. XslGen lets you seamlessly fuse your HTML templates and complex XML content into a sophisticated web site!
Having your HTML decorations separated from content gives you lots of benefits:
No run-time server-side programming. For content management of your site you do not need to program, debug and maintain any server-side scripting. You just have your HTML and content.
Single manual change - automatic update. Whenever you decide to change some detail of your site's appearance (for example, a heading picture or the navigator), you change a single HTML template, and the change is propagated automatically to all relevant pages of your site.
Your content gets more consistent. If you keep your content in an XML file, the XML parser automatically checks the markup of your data for errors. Additionally, you can use any XML Schema processor for a more thorough validity check of your data.
Content is more manageable. Finally, content kept as pure XML is not as ambiguous as HTML markup, so content preparation and handling can be consistently automated.
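To illustrate the consistency benefit, here is a minimal Python sketch of the well-formedness check that any XML parser performs. It uses only the standard library; the helper name is_well_formed and the sample documents are illustrative and not part of XslGen.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Illustrative content documents (the tag names are arbitrary).
good = "<page><content>Hello, World</content></page>"
bad = "<page><content>Hello, World</page>"  # <content> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

A check like this only catches markup errors; validating against a schema, as mentioned above, requires a separate schema processor.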
Technically, XslGen provides a simple HTML-embeddable language that offers easier-to-use shortcuts for XSLT instructions.
With XslGen, site maintenance is simple:
Develop the HTML layout (the so-called "template") in your favorite HTML editor.
By placing simple instructions in the template, you tell XslGen where to take the content data from and how to generate complex tables.
XslGen generates XSLT specs out of the templates, and then any XSLT processor creates your site in a snap.
Updating the layout of your whole site could not be easier: change a single template for many pages, or change the XML content without worrying about the HTML layout.
For creating web sites with simple content you need not know anything about XSLT, and even for complex site generation only minimal XSLT knowledge is required.
Installing XslGen itself is very simple: just copy the XslGen executable (xslgen for Linux, or xslgen.exe for Windows) to any directory listed in your PATH variable.
To build a site you also need to install your favorite XSLT processor (in our examples we use James Clark's XT), and optionally the GNU Make utility (which may be useful for automated regeneration of your web pages; see the Section called Makefile: Take It All Together in Chapter 3).
You will also need to copy the tidy.conf file from the XslGen distribution to your working (project) directory. This file contains options that XslGen requires to process HTML templates correctly; if you are familiar with tidy, you may add your own options. (Tidy is an HTML normalization program written by Dave Raggett. See http://www.w3.org/People/Raggett/tidy).
If you already know how to install an XSLT processor (and make, if you intend to use it), you may skip the rest of this chapter.
In the tutorial for XslGen we use James Clark's XSLT processor XT. If you want to use it too, you first need to install Java. See Java™ 2 Installation Notes.
Download XT (for example from Bill Lindsey's XT Home). Run XT with xt.sh (possibly modified for your Java installation) from the demo directory in the distribution.
Note: In the XslGen Tutorial we use a script named xt, renamed from xt.sh.
In the tutorial for XslGen we use James Clark's XSLT processor XT. If you want to use it, you first need to install Java. See Java™ 2 Installation Notes.
Download XT (for example from Bill Lindsey's XT Home). Run XT with xt.bat from the demo directory in the distribution.
You can get a Windows port of GNU Make from the MinGW collection.
To install it, merely copy make.exe to any directory listed in your PATH environment variable.
Suppose that you have XslGen and XT installed as described in Chapter 2. In your favorite HTML editor you have developed a template for the web site that looks like this (see examples/example1/index_template.html):
Note: Let's adopt a convention that our HTML template files have the .html or .xhtml suffix, and generated HTML files have the .htm suffix. This will help us quickly remove generated files if needed.
XslGen will now help us keep our HTML decorations unaffected whenever we change the content of the web site. The content itself is kept in XML files, marked up in whatever way is convenient for you.
For this first simple example we want the very simple content "Hello, World" to show up in the second row of our template table. For this we prepare an XML file, content.xml, that keeps our content:
Note that the names of the tags are arbitrary. In this example we just picked the most intuitive tag names.
Now we specify the XslGen data instruction with the parameter page/content, which is actually an XPath expression pointing at the content location. See Figure 3-3.
Figure 3-3. HTML Template With XslGen Instruction
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE></TITLE>
<META HTTP-EQUIV="Content-Type"
CONTENT="text/html;charset=iso-8859-1">
<LINK REL="stylesheet" type="text/css" href="../jetfleour.css">
</HEAD>
<BODY BGCOLOR=#FFFFFF LEFTMARGIN=0 TOPMARGIN=0
MARGINWIDTH=0 MARGINHEIGHT=0>
<TABLE WIDTH=600 BORDER=0 align=center CELLPADDING=0 CELLSPACING=0>
<TR>
<TD>
<IMG SRC="images/header.gif" WIDTH=600 HEIGHT=72 ALT=""></TD>
</TR>
<TR>
<TD>
<H3></H3>
<!-- Content comes here -->
xa-data: path=`page/content`; (1)
</TD>
</TR>
</TABLE>
</BODY>
</HTML>
In the current directory we now run the following command:
xslgen index_template.html index_template.xsl
If everything went fine, XslGen has silently created the XSLT specification index_template.xsl, which tells an XSLT processor how to generate the HTML web page. Now we run XT with the following command:
xt content.xml index_template.xsl index.htm
If xt was properly installed, the example1 directory now contains index.htm, generated from the HTML template and the XML content, which looks like this:
If you look at the source of the generated page, you may notice that the HTML has become pretty neat: all the tags are in lowercase, and all the attribute values are in double quotes. The trick is that before processing an HTML template, XslGen runs Dave Raggett's HTML TIDY utility (see http://www.w3.org/People/Raggett/tidy). XslGen reads the tidy parameters from tidy.conf in the current directory. You may play with these parameters and, for example, make XslGen verbose about possible errors in the HTML template. Make sure that there is a tidy.conf in your working directory; if not, copy the tidy.conf provided in the XslGen distribution.
Now that we have a simple example up and running, we will provide more complex content for our web page and elaborate the structure of the HTML template.
So, in our web page we want to have:
Window caption title.
Page title (for simplicity, contains window caption text).
Navigator stripe (that keeps links to major pages).
Body content.
Let's consider the following XML content file (see examples/example2):
Figure 3-5. Content XML File
<!DOCTYPE page [ (1)
  <!ENTITY % lat1 SYSTEM "../xhtml/xhtml-lat1.ent">
  %lat1;
]>
<page>
  <caption>JetFleour: Contacts</caption> (2)
  <navigator> (3)
    <center>
      <a CLASS="nav" HREF="index.htm">Contacts</a>
      <a CLASS="nav" HREF="tours/index.htm">Tours</a>
    </center>
  </navigator>
  <content> (4)
    <div class="content">
      <pre>
<b>Address:</b> JetFleour Tours, Main St, 1021, Some-City, WA, 1299
      </pre>
      <dl>
        <dt><b>Phone</b>:</dt>
        <dd>+1(555)555-33-22</dd>
        <dt><b>FAX</b>:</dt>
        <dd>+1(555)555-33-44</dd>
        <dt><b>E-mail</b>:</dt>
        <dd>info@jetfleour.com</dd>
      </dl>
    </div>
  </content>
</page>
Our HTML template now looks like this (see index_template.html in example2):
Figure 3-6. HTML Template for Complex Content
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML class="base=`page`"> (1)
<HEAD>
<TITLE>xa-data: path=`caption`;</TITLE>
<META HTTP-EQUIV="Content-Type"
CONTENT="text/html;charset=iso-8859-1">
<LINK REL="stylesheet" type="text/css" href="../jetfleour.css">
</HEAD>
<BODY BGCOLOR=#FFFFFF LEFTMARGIN=0 TOPMARGIN=0
MARGINWIDTH=0 MARGINHEIGHT=0>
<TABLE WIDTH=600 BORDER=0 CELLPADDING=0 CELLSPACING=0
align="center">
<TR>
<TD><IMG SRC="images/header.gif" WIDTH=600 HEIGHT=71></TD>
</TR>
<TR>
<TD BACKGROUND="images/navigator.gif"
WIDTH=600 HEIGHT=21 VALIGN="top">
xa-content: path=`navigator`; (2)
</TD>
</TR>
<TR>
<TD>
<h3>xa-data: path=`caption`;</h3>
xa-content: path=`content`; (3)
</TD>
</TR>
</TABLE>
</BODY>
</HTML>
In this case base defines the base path, so the path values in the following XslGen instructions are considered relative to this base. For example, xa-data: path=`caption`; takes the data from path: page/caption.
These instructions are mostly similar, but xa-data is intended for raw string content (like captions or titles), while xa-content is better suited for content marked up in XML.
Note: If you are familiar with XSLT, you may be interested to know that the xa-data XslGen instruction generates an XSLT value-of instruction, while xa-content generates an XSLT copy-of instruction.
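The difference between the two instructions can be seen in a small Python sketch. It does not run XslGen or an XSLT processor; it only mimics the two XSLT behaviours with the standard library on a shortened version of the content file. The helper names string_value and tree_copy are illustrative.

```python
import xml.etree.ElementTree as ET

# Shortened version of the content file from Figure 3-5.
page = ET.fromstring(
    "<page>"
    "<caption>JetFleour: Contacts</caption>"
    "<content><dl><dt><b>Phone</b>:</dt><dd>+1(555)555-33-22</dd></dl></content>"
    "</page>"
)


def string_value(node):
    """Concatenated text of the node and its descendants (value-of behaviour)."""
    return "".join(node.itertext())


def tree_copy(node):
    """Serialized node with all of its markup kept (copy-of behaviour)."""
    return ET.tostring(node, encoding="unicode")


caption_text = string_value(page.find("caption"))
content_text = string_value(page.find("content"))
content_markup = tree_copy(page.find("content"))

print(caption_text)    # plain text, suitable for a title
print(content_text)    # markup is lost when only the string value is taken
print(content_markup)  # markup is preserved
```

This is why a caption is a natural fit for xa-data, while the navigator and body content, which carry their own markup, call for xa-content.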
Just like in example1, we issue two commands in the example2 directory:
xslgen index_template.html index_template.xsl
xt content.xml index_template.xsl index.htm
and generate an index.htm web page that looks like this:
In our site we actually want to generate many pages from the same HTML template, just providing different XML content files. Let us, for example, set the following requirements for our site:
The site has two pages: "Contacts" and "Tours".
Both pages are generated from single template index_template.html.
index_template.html is placed in the top dir of the site (in our case: example3).
The "Contacts" page and its content file content.xml are also placed in the example3 directory.
The "Tours" page and its content file content.xml are placed in example3/tours.
These requirements have the following impact on our template and content files:
HTML files generated from index_template.html reside in different directories, but all of them must refer correctly to the directory example3/images by a relative path.
Both content files use the same navigator stripe, so we would like to reuse it in both content files.
This is how we (finally!) make our HTML template invariant to the paths of the shared components.
Figure 3-8. Path-Invariant HTML Template
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML class="base=`page`">
<HEAD>
<TITLE>xa-data: path=`caption`;</TITLE>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=iso-8859-1">
<LINK REL="stylesheet" type="text/css"
href="../jetfleour.css"
class="attr=`href concat('../', path, 'jetfleour.css')`">(1)
</HEAD>
<BODY BGCOLOR=#FFFFFF LEFTMARGIN=0 TOPMARGIN=0
MARGINWIDTH=0 MARGINHEIGHT=0>
<TABLE WIDTH=600 BORDER=0 CELLPADDING=0 CELLSPACING=0
align="center">
<TR>
<TD><IMG src="images/header.gif"
class="attr=`src concat(path,'images/header.gif')`" (2)
WIDTH=600 HEIGHT=71 ALT=""></TD>
</TR>
<TR>
<TD background="images/navigator.gif"
class="attr=`background concat(path, (3)
'images/navigator.gif'); class 'myclass'`"
WIDTH=600 HEIGHT=21 VALIGN="top">
xa-content: path=`navigator`;
</TD>
</TR>
<TR>
<TD>
<h3>xa-data: path=`caption`;</h3>
xa-content: path=`content`;
</TD>
</TR>
</TABLE>
</BODY>
</HTML>
You can see that these attributes (href, src, background) are also given literal values in the template. Of course, they will be overridden during processing, but they are convenient for previewing the template in your HTML editor or browser.
Note: Sometimes you need to generate several attributes for one tag. In this case you should list them separated by semicolons, like this:
<TD class="attr=`background concat(path, 'images/navigator.gif');class 'myclass'`">
Note: If you know XPath, you may recognize that the XPath function concat is used here. You are free to use any XPath expressions for content and attribute generation.
We propose the following content file for the tours directory:
Figure 3-9. Content File for Path-Invariant Template
<!DOCTYPE page [
  <!ENTITY % lat1 SYSTEM "../../xhtml/xhtml-lat1.ent">
  %lat1;
  <!ENTITY path "../"> (1)
  <!ENTITY navigator SYSTEM "../navigator.xml"> (2)
]>
<page>
  <caption>JetFleour: Tours</caption>
  <path>&path;</path>
  <navigator> (3)
    &navigator;
  </navigator>
  <content>
    <div class="content">
      The list of available tours will be published in example4.
    </div>
  </content>
</page>
However, the values of the href attributes in the navigator differ, because the HTML pages are located in different directories. But this is why we define the path entity! Check out navigator.xml to see how path parameterizes it.
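The value of the path entity is simply the relative route from a page's directory back to the top directory of the site. The following Python sketch shows one way such a prefix could be computed and then combined with a shared resource name, as the template does with concat. The helper names path_prefix and concat are hypothetical, and the "./" value for the top directory is an assumption chosen for this illustration.

```python
import posixpath


def path_prefix(page_dir: str) -> str:
    """Relative prefix leading from page_dir back to the site's top directory."""
    rel = posixpath.relpath(".", start=page_dir or ".")
    # Assumption: pages in the top directory use "./" as their prefix.
    return "./" if rel == "." else rel + "/"


def concat(*parts: str) -> str:
    """Stand-in for the XPath concat() function used in the template."""
    return "".join(parts)


for page_dir in ("", "tours"):
    prefix = path_prefix(page_dir)
    print(repr(page_dir), "->", concat(prefix, "images/header.gif"))
```

With a prefix like this stored in each content file, the same template yields correct image and stylesheet references no matter which directory the generated page lands in.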
Now that it is clear why our HTML template and content files are the way they are, let's generate the two-page web site. To do this, in the example3 directory just run the batch script: ./generate if you are on Linux, or generate.bat if you are on Windows. The script runs three commands:
xslgen index_template.html index_template.xsl
xt content.xml index_template.xsl index.htm
xt tours/content.xml index_template.xsl tours/index.htm
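The same pattern extends to any number of pages: one xslgen run per template, then one xt run per page. The Python sketch below only builds the command lines and does not execute them. The helper name generation_commands is hypothetical, and it assumes the stylesheet name is the template name with its suffix replaced by .xsl, as in the commands above.

```python
def generation_commands(template, pages):
    """Return the xslgen/xt command lines for one template and many pages.

    pages is a list of (content_xml, output_htm) pairs.
    """
    stylesheet = template.rsplit(".", 1)[0] + ".xsl"
    commands = [["xslgen", template, stylesheet]]
    for content_xml, output_htm in pages:
        commands.append(["xt", content_xml, stylesheet, output_htm])
    return commands


commands = generation_commands(
    "index_template.html",
    [("content.xml", "index.htm"), ("tours/content.xml", "tours/index.htm")],
)
for cmd in commands:
    # To actually run a command, pass cmd to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```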
Note: If your site is more than just a couple of pages, running a script that always regenerates all the pages may be way too slow. This problem is solved by using the make utility; see the Section called Makefile: Take It All Together.
In this chapter we'll see how XslGen helps generate quite complex tables, learn about conditionals, and get familiar with the alternative XslGen instruction notation. We need the alternative notation because, if you prefer to create your web sites in XHTML (which is HTML written in XML syntax), the templates should be written using a different (more consistent and readable) syntax.
OK, let's try to create a nice table in the tours page (see example4). Suppose that we have the following XML document that provides the tour data (see tours.xml):
Figure 3-10. Tour XML Data
<tours>
  <tour available="yes"> (1)
    <name>Costa Brava Sun</name>
    <description>
      A great vacation tour, excellent for Catalonia sightseeing.
    </description>
    <offers>
      <pack>
        <price>$ 999</price>
        <duration>7 days</duration>
      </pack>
      <pack>
        <price hot='yes'>$ 1049</price> (2)
        <duration>14 days</duration>
      </pack>
    </offers>
  </tour>
  <tour available="no"> (3)
    <name>Thailand Diving</name>
    <description>
      The tour has yet to be organized. Joe has to contact JetFleourAsia for updates.
    </description>
  </tour>
  <tour available="yes"> (4)
    <name>Penghu Islands Windsurfing</name>
    <description>
      Reveal Taiwan, enjoy windsurfing!
    </description>
    <offers>
      <pack>
        <price>$ 2059</price>
        <duration>7 days</duration>
      </pack>
      <pack>
        <price>$ 2499</price>
        <duration>14 days</duration>
      </pack>
    </offers>
    <duration>14 days</duration>
  </tour>
</tours>
This time, to generate a table, we provide the following XHTML template, which illustrates the XML syntax of XslGen:
Figure 3-11. XHTML Template for Tours Table
<table xmlns:xa="http://www.syntext.com/schemas/xslgen" (1)
       xa:base="tours" cellpadding="0" width="100%">
  <tr xa:base="tour" (2)
      xa:cond="@available='yes'"> (3)
    <td>
      <table width="100%" class="content" cellpadding="3">
        <tr bgcolor="#f0f0f0">
          <td width="20%">
            <h4><xa:data path="name"/></h4>
          </td>
        </tr>
        <tr>
          <td width="20%">
            <p><xa:data path="description"/></p>
          </td>
        </tr>
        <tr>
          <td>
            <table width="100%" class="bodyText">
              <tr>
                <td><b>Price</b></td>
                <td><b>Duration</b></td>
              </tr>
              <tr xa:base="offers/pack" > (4)
                <td><font xa:attr="color?price[@hot='yes'] 'red'">(5)
                  <xa:data path="price"/></font></td>
                <td><xa:data path="duration"/></td>
              </tr>
            </table>
          </td>
        </tr>
      </table>
    </td>
  </tr>
</table>
Note: This is also pure XPath syntax.
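To see which rows the template selects, the Python sketch below walks a shortened version of the tour data with the same path expressions. It is only an approximation of what the generated stylesheet does: it uses the limited XPath subset of the standard library, and the tuple layout of rows is chosen here purely for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened version of the tour data from Figure 3-10.
tours = ET.fromstring("""
<tours>
  <tour available="yes"><name>Costa Brava Sun</name>
    <offers>
      <pack><price>$ 999</price><duration>7 days</duration></pack>
      <pack><price hot="yes">$ 1049</price><duration>14 days</duration></pack>
    </offers>
  </tour>
  <tour available="no"><name>Thailand Diving</name></tour>
</tours>
""")

rows = []
# xa:base="tour" with xa:cond="@available='yes'": one outer row per available tour
for tour in tours.findall("tour[@available='yes']"):
    # xa:base="offers/pack": one inner row per pack
    for pack in tour.findall("offers/pack"):
        price = pack.find("price")
        # color?price[@hot='yes'] 'red': the attribute appears only if the condition holds
        color = "red" if price.get("hot") == "yes" else None
        rows.append((tour.findtext("name"), price.text, pack.findtext("duration"), color))

for row in rows:
    print(row)
```

The unavailable tour produces no rows at all, and only the pack whose price is marked hot receives a color.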
Having this template and content file, we generate tours.htm, which is then included in the content of index.htm using the XML entity mechanism (see index.xml):
Figure 3-12. Content File for Tours Table Decoration
<!DOCTYPE page [
  <!ENTITY path "./">
  <!ENTITY navigator SYSTEM "navigator.xml">
  <!ENTITY tours SYSTEM "tours.htm"> (1)
]>
<page>
  <caption>JetFleour: Tours</caption>
  <path>&path;</path>
  <navigator>
    &navigator;
  </navigator>
  <content>
    &tours;
    <h6><center><a href="tours.htm">PRINTER FRIENDLY (2)
    </a></center></h6>
  </content>
</page>
The set of commands for this page generation is the following (see generate or generate.bat):
Figure 3-13. Commands to Generate Table
xslgen index_template.html index_template.xsl
xslgen -x tours_template.xhtml tours_template.xsl (1)
xt tours.xml tours_template.xsl tours.htm (2)
xt content.xml index_template.xsl index.htm
Finally, our web page looks like the picture below:
And the printer-friendly page looks pretty nice, too:
When your site gets larger, regenerating HTML pages with a straightforward script becomes pretty inefficient, and manual regeneration is painful and error-prone. Thanks to the make utility, this problem can be handled gracefully.
In example5 we put all our pieces together: the "Contacts" page, the "Tours" page, and the "Tours" table. The directory structure is the same as in example4. Page generation is driven by a Makefile that lets make work out, from file modification times, which templates and web pages should be regenerated. Using this Makefile as an example, you will be able to easily add new pages to your own Makefiles.
Note: Writing makefiles is sometimes an art, and we do not intend to teach you all the tricks. But we hope that this example will be sufficient for building your own sites.
Figure 3-16. Makefile for JetFleour Site
XT = xt (1)
MAIN_T = index_template.xsl (2)
TOURS_T = tours/tours_template.xsl (3)
.SUFFIXES: .html .htm .xsl .xml .xhtml
# This rule should list all the pages to generate
all: index.htm tours/index.htm tours/tours.htm (4)
clean:
	rm `find . -name "*.xsl" -or -name "*.htm" -or -name "*.xhtm"` (5)
# Rules generating templates
.xhtml.xsl:
	xslgen -x $< $@ (6)
.html.xsl:
	xslgen $< $@ (7)
# Rules generating pages
index.htm: content.xml $(MAIN_T) navigator.xml (8)
	$(XT) content.xml $(MAIN_T) $@ (9)
tours/tours.htm: tours/tours.xml $(TOURS_T)
	$(XT) tours/tours.xml $(TOURS_T) $@ (10)
tours/index.htm: tours/content.xml $(MAIN_T) navigator.xml tours/tours.htm
	$(XT) tours/content.xml $(MAIN_T) $@ (11)
Warning: Make sure that the first character on the second line of a rule (the command line) is a TAB! Otherwise the rule will not work.
Now, when you run this simple command in the example5 directory:
make
the make utility figures out which files need regeneration and runs the necessary commands for them.
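The decision make takes for each target can be sketched in a few lines of Python. This is a simplified illustration of the modification-time rule only, not a description of make's full algorithm (it ignores chained rules, for instance); the function name needs_rebuild is illustrative.

```python
import os


def needs_rebuild(target, prerequisites):
    """True if target is missing or older than any of its prerequisites."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)


# Example, mirroring the index.htm rule of the Makefile above:
#   needs_rebuild("index.htm",
#                 ["content.xml", "index_template.xsl", "navigator.xml"])
```

So editing navigator.xml alone is enough for make to regenerate every page that lists it as a prerequisite, while untouched pages are skipped.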
There are two notations for XslGen instructions. The first is the so-called HTML notation, used in HTML files composed with editors that cannot handle namespaces. The second (more consistent) is the so-called XHTML notation, which uses XML namespace notation.
If you use XHTML notation, then in your templates you should define the following namespace: xmlns:xa="http://www.syntext.com/schemas/xslgen"
Insert string-value of the data from location specified by path; path is relative to the current base. Attribute path is optional, its value defaults to ".".
Insert XML-tree data from location specified by path; path is relative to the current base. Attribute path is optional, its value defaults to ".".
This attribute defines the starting point (root element) for XslGen instruction processing. A defined base is valid within the scope of the enclosing element unless it is redefined by a base attribute in sub-elements; in that case the new starting point is resolved relative to the enclosing element's base. It is possible to override this rule and use absolute addressing for base redefinitions by using the XPath reference to root "//".
If path points to an element with cardinality greater than 1, then some template rules will be instantiated several times (for example, several table rows <tr> will be generated from an HTML table template). Instantiation means that the current element and all its children will be instantiated.
Attribute-name specifies element attribute name which is to be generated, xpath-expr is an XPath expression which specifies generated attribute value. Given expression must point to a data node of XML tree or be a character literal (enclosed in single-quotes).
Optionally, attribute-name may have attached XPath conditional expression cond-expr, separated from attribute-name by question mark; in such case attribute will be generated only if this conditional expression evaluates to true.
xslgen [-v] [-h] [-x] [-X] [-t] [-d mask] {template_file} {xsl_output_file}
Prints version number.
Preprocess HTML, dump XHTML tree (instead of XSLT) and exit.
Take XHTML as input.
Generate stylesheet for XHTML output.
Run tidy and exit.
HTML or XHTML template file name.
XSLT output file name.
To process XML sources that contain non-latin1 characters it is recommended to use the following approach:
Provide XML sources in UTF-8 encoding (or convert them to UTF-8).
Make sure that tidy.conf specifies utf-8 for the char-encoding: option.
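For the conversion step, a minimal Python sketch is shown below. It assumes the original sources are in ISO-8859-1 (adjust the source encoding to whatever your files actually use), and the function name latin1_to_utf8 is illustrative.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


# Illustrative sample: a caption containing one non-ASCII character.
sample = "<caption>Caf\u00e9 tours</caption>".encode("iso-8859-1")
converted = latin1_to_utf8(sample)

print(len(sample), len(converted))  # the accented letter takes two bytes in UTF-8
```

If a source file starts with an XML declaration that names its encoding, remember to update that declaration as well, otherwise the parser will misread the converted file.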