XSLT
Revision as of 19:56, 12 May 2006
Extensible Stylesheet Language Transformations, or XSLT, is an XML-based language used for the transformation of XML documents. The original document is not changed; rather, a new document is created based on the content of an existing one. The new document may be serialized (output) by the processor in standard XML syntax or in another format, such as HTML or plain text. XSLT is most often used to convert data between different XML schemas or to convert XML data into web pages or PDF documents.
XSLT was produced as a result of the Extensible Stylesheet Language (XSL) development effort within W3C during 1998–1999, which also produced XSL Formatting Objects (XSL-FO) and the XML Path Language, XPath. The editor of the first version (and in effect the chief designer of the language) was James Clark. The version most widely used today is XSLT 1.0, which was published as a Recommendation by the W3C on 16 November 1999. A greatly expanded version 2.0, under the editorship of Michael Kay, reached the status of a Candidate Recommendation from W3C on 3 November 2005.
Overview
The XSLT language is declarative: rather than listing an imperative sequence of actions to perform in a stateful environment, an XSLT stylesheet consists of a collection of template rules, each of which specifies what to add to the result tree when the XSLT processor, scanning the source tree according to a fixed algorithm, finds a node that meets the rule's conditions. Instructions within template rules are processed as if they were sequential instructions, but in fact they comprise functional expressions that represent their evaluated results: ultimately, nodes to be added to the result tree.
The XSLT specification defines a transformation in terms of source and result trees to avoid locking implementations into system-specific APIs and memory, network and file I/O issues. For example, the specification does not mandate that a source tree always be derived from an XML file, since it may be more efficient for the processor to read from an in-memory DOM object or some other implementation-specific representation. Output may be in a format not envisioned by the XSLT language's designers. However, XSLT processing often begins by reading a serialized XML input document into the source tree and ends by writing the result tree to an output document. The output document may be XML, but can be HTML, RTF, TeX, delimited files, plain text or any other format that the XSLT processor is capable of producing.
XSLT relies upon the W3C's XPath language for identifying subsets of the source document tree, as well as for performing calculations. XPath also provides a range of functions, which XSLT itself further augments. This reliance upon XPath adds a great deal of power and flexibility to XSLT.
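As a small illustration of this reliance on XPath, the following sketch can be applied to the persons document of Example 1 below. It is an illustrative stylesheet, not part of the original examples: count() is an XPath core function, format-number() is one of the functions added by XSLT, and the predicate [1] selects a subset of the source tree.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- XPath calculation (count) formatted by an XSLT-specific function -->
    <xsl:value-of select="format-number(count(persons/person), '000')"/>
    <xsl:text> persons; first family name: </xsl:text>
    <!-- XPath location path with a predicate picks one node out of the tree -->
    <xsl:value-of select="persons/person[1]/family_name"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For the Example 1 input, this should produce the single line "002 persons; first family name: Smith".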
Most current operating systems have an XSLT processor installed. For example, Windows XP comes with the MSXML3 library, which includes an XSLT processor. Earlier versions may be upgraded, and there are many alternatives; see the External links section.
The W3C finalized the XSLT 1.0 specification in 1999. The XSLT 2.0 specification is currently a Candidate Recommendation.
Example 1 (transforming XML to XML)
Consider the following XML document:

<?xml version="1.0"?>
<persons>
  <person username="MP123456">
    <name>John</name>
    <family_name>Smith</family_name>
  </person>
  <person username="PK123456">
    <name>Morka</name>
    <family_name>Ismincius</family_name>
  </person>
</persons>

Applying the following XSLT transform to it:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <transform>
      <xsl:apply-templates/>
    </transform>
  </xsl:template>

  <xsl:template match="person">
    <record>
      <username>
        <xsl:value-of select="@username" />
      </username>
      <name>
        <xsl:value-of select="name" />
      </name>
    </record>
  </xsl:template>
</xsl:stylesheet>

produces a new document with a different structure:

<?xml version="1.0" encoding="UTF-8"?>
<transform>
  <record>
    <username>MP123456</username>
    <name>John</name>
  </record>
  <record>
    <username>PK123456</username>
    <name>Morka</name>
  </record>
</transform>
Example 2 (transforming XML to XHTML)
Example of incoming XML document:
<?xml version="1.0" encoding="UTF-8"?>
<domains>
  <sun.com ownedBy="Sun Microsystems Inc.">
    <host>
      www
      <use>World Wide Web site</use>
    </host>
    <host>
      java
      <use>Java info</use>
    </host>
  </sun.com>
  <w3.org ownedBy="The World Wide Web Consortium">
    <host>
      www
      <use>World Wide Web site</use>
    </host>
    <host>
      validator
      <use>web developers who want to get it right</use>
    </host>
  </w3.org>
</domains>
Example XSLT Stylesheet:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" indent="yes"
      doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>

  <!--XHTML document outline-->
  <xsl:template match="/">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
      <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
        <title>test1</title>
        <style type="text/css">
          h1 { padding: 10px; padding-width: 100%; background-color: silver }
          td, th { width: 40%; border: 1px solid silver; padding: 10px }
          td:first-child, th:first-child { width: 20% }
          table { width: 650px }
        </style>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!--Table headers and outline-->
  <xsl:template match="domains/*">
    <h1><xsl:value-of select="@ownedBy"/></h1>
    <p>The following host names are currently in use at
      <strong><xsl:value-of select="local-name(.)"/></strong>
    </p>
    <table>
      <tr><th>Host name</th><th>URL</th><th>Used by</th></tr>
      <xsl:apply-templates/>
    </table>
  </xsl:template>

  <!--Table row and first two columns-->
  <xsl:template match="host">
    <!--Create variable for 'url', as it's used twice-->
    <xsl:variable name="url" select=
      "normalize-space(concat('http://', normalize-space(node()), '.', local-name(..)))"/>
    <tr>
      <td><xsl:value-of select="node()"/></td>
      <td><a href="{$url}"><xsl:value-of select="$url"/></a></td>
      <xsl:apply-templates select="use"/>
    </tr>
  </xsl:template>

  <!--'Used by' column-->
  <xsl:template match="use">
    <td><xsl:value-of select="."/></td>
  </xsl:template>

</xsl:stylesheet>
Output XHTML that this would produce (whitespace has been adjusted here for clarity):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <meta content="text/html;charset=UTF-8" http-equiv="Content-Type" />
    <title>test1</title>
    <style type="text/css">
      h1 { padding: 10px; padding-width: 100%; background-color: silver }
      td, th { width: 40%; border: 1px solid silver; padding: 10px }
      td:first-child, th:first-child { width: 20% }
      table { width: 650px }
    </style>
  </head>
  <body>
    <h1>Sun Microsystems Inc.</h1>
    <p>The following host names are currently in use at <strong>sun.com</strong></p>
    <table>
      <tr>
        <th>Host name</th>
        <th>URL</th>
        <th>Used by</th>
      </tr>
      <tr>
        <td>www</td>
        <td><a href="http://www.sun.com">http://www.sun.com</a></td>
        <td>World Wide Web site</td>
      </tr>
      <tr>
        <td>java</td>
        <td><a href="http://java.sun.com">http://java.sun.com</a></td>
        <td>Java info</td>
      </tr>
    </table>
    <h1>The World Wide Web Consortium</h1>
    <p>The following host names are currently in use at <strong>w3.org</strong></p>
    <table>
      <tr>
        <th>Host name</th>
        <th>URL</th>
        <th>Used by</th>
      </tr>
      <tr>
        <td>www</td>
        <td><a href="http://www.w3.org">http://www.w3.org</a></td>
        <td>World Wide Web site</td>
      </tr>
      <tr>
        <td>validator</td>
        <td><a href="http://validator.w3.org">http://validator.w3.org</a></td>
        <td>web developers who want to get it right</td>
      </tr>
    </table>
  </body>
</html>
Template rule processing
XSLT stylesheets are declarative, not procedural; rather than defining a sequence of operations to execute, they define rules and other hints applied during processing, according to a fixed algorithm. The algorithm, which is somewhat complicated, is described below, although many of its esoteric details have been omitted.
Every XSLT processor is required to behave as if it had performed the following steps to prepare for the transform:
- Read the XSLT stylesheet with an XML parser and abstract its content into a tree of nodes (the stylesheet tree), according to the XPath data model. "Compile-time" stylesheet syntax errors are detected at this stage. Stylesheets can be modular, so any transclusions (xsl:include and xsl:import instructions) are also handled at this stage in order to bring template rules and other top-level stylesheet elements from other XSLT documents into the stylesheet tree.
- Read the input XML with an XML parser and convert its content to a tree of nodes (the source tree), according to the XPath data model. The stylesheet may reference other XML sources via document() function calls. These are typically evaluated at run time, since their locations may have to be calculated and the function calls may not even be reachable. (The example above does not reference any other source documents.)
- Strip whitespace-only text nodes from the stylesheet tree, except those that are descendants of xsl:text elements. This allows nested elements in template rules to be on separate ('pretty') lines in the original XSLT without resulting in unintended whitespace being added to the result tree.
- Strip whitespace-only text nodes from the source tree, if xsl:strip-space instructions are present in the stylesheet. This allows 'pretty' input XML to be processed in a manner that ignores extraneous whitespace. (The example above does not use this feature.)
- Supplement the stylesheet tree with a trio of built-in template rules that provide default behavior for any node type that might be encountered during processing. One template rule is provided for processing the root node or any element node; it directs the processor to continue and process each child node. Another template rule is provided for any text node or attribute node; it directs the processor to copy that node's value to the result tree as text. A third template rule is provided for any comment node or processing instruction node; it is a no-op. Templates explicitly provided in the stylesheet will override some or all of these. If the stylesheet contains no explicit template rules, the built-in template rules result in a recursive descent of the source tree in which only text nodes are copied to the result tree (attribute nodes are not reached because they are not "children" of their parent elements). This result is rarely desirable, as it tends to be just a concatenation of the non-markup character data from the XML source.
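The trio of built-in template rules described in the last step can be written out as ordinary templates. The sketch below follows the form given in the XSLT 1.0 Recommendation (the additional built-in rule for modes is left out for brevity); the surrounding stylesheet and the text output method are only there to make the fragment self-contained.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Root node or any element node: keep going and process each child -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text node or attribute node: copy its string value to the result tree -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comment or processing instruction: do nothing -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

Applied to the persons document of Example 1, such a stylesheet should emit only the character data (John, Smith, Morka, Ismincius and the whitespace around them). The username attributes do not appear, because xsl:apply-templates without a select attribute visits child nodes only.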
Then, the processor performs the following steps to produce and serialize the result tree:
- Create the root node of the result tree.
- Process the root node of the source tree. The procedure for node processing is described below.
- Serialize the result tree, if desired, according to hints provided in the xsl:output instruction.
When processing a node, the following steps are undertaken:
- The best-matching template rule for that node is located. This is facilitated by each template rule's "match" pattern (an XPath-like expression), indicating the nodes to which it can be applied. Each template is assigned a relative priority and import precedence by the processor to help ease conflict resolution. The order of template rules in the stylesheet can also help resolve conflicts between templates which match the same nodes, but it does not affect the order in which nodes are processed.
- Template rule contents are instantiated. Elements in the XSLT namespace (typically prefixed with xsl:, although it is the namespace identifier bound to the prefix, not the prefix itself, that matters) are treated as instructions and have special semantics that guide how they are interpreted. Some result in nodes being added to the result tree; others are control oriented. Non-XSLT elements and text nodes encountered in the template rule are copied verbatim (namespaces and all) to the result tree. Comments and processing instructions are ignored.
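To make the conflict resolution step concrete, here is a minimal sketch written for the persons document of Example 1. It is an illustrative stylesheet, not part of the original examples. All three templates can match a person element, and the default priorities defined by XSLT 1.0 (-0.5 for "*", 0 for a plain element name, 0.5 for a pattern with a predicate) decide which one is used.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <!-- Drop the indentation whitespace of the source so it is not copied to the output -->
  <xsl:strip-space elements="*"/>

  <!-- Default priority -0.5: used for elements that nothing more specific matches -->
  <xsl:template match="*">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Default priority 0: beats "*" for every person element -->
  <xsl:template match="person">
    <xsl:value-of select="name"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Default priority 0.5: beats both rules above for this one person -->
  <xsl:template match="person[@username='PK123456']">
    <xsl:value-of select="concat(name, ' (special case)')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For the Example 1 input this should print "John" followed by "Morka (special case)", each on its own line. An explicit priority attribute on xsl:template can be used to override the defaults.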
The XSLT instruction xsl:apply-templates, when processed, results in a new set of nodes being selected for processing. The nodes are identified via an XPath expression. By default, the nodes are processed in document order (the relative order in which they appear in the original document); an xsl:sort instruction can specify a different order.
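A minimal sketch of this instruction, again assuming the persons document of Example 1 as input: the select attribute holds the XPath expression that picks the next nodes to process, and the matching template is then instantiated once per selected node.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Select only the family_name elements; everything else is skipped -->
    <xsl:apply-templates select="persons/person/family_name"/>
  </xsl:template>

  <xsl:template match="family_name">
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With the Example 1 input, the output should be "Smith" and then "Ismincius", one per line, which is the order in which the elements occur in the source document.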
XSLT extends XPath's function library and allows XPath variables to be defined. These variables have different scopes in the stylesheet, depending on where they are defined, and their values can originate outside the stylesheet. A variable's value cannot be changed during processing.
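The following sketch shows both kinds of scope, using the persons document of Example 1. The names greeting and full are made up for illustration. A top-level xsl:param is visible throughout the stylesheet and may be given a value by whatever invokes the processor (how that is done is implementation-specific), while an xsl:variable inside a template is visible only in the rest of that template.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <!-- Global parameter: 'Hello' is only the default if no value is passed in -->
  <xsl:param name="greeting" select="'Hello'"/>

  <xsl:template match="person">
    <!-- Local variable: bound once per person, never reassigned -->
    <xsl:variable name="full" select="concat(name, ' ', family_name)"/>
    <xsl:value-of select="concat($greeting, ', ', $full)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run without an externally supplied parameter, this should print "Hello, John Smith" and "Hello, Morka Ismincius" on separate lines.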
Although this procedure may sound complicated, it has the net effect of making XSLT much like other web templating languages. If the stylesheet consists only of a single template rule that matches the root node, everything in the template is essentially copied to the output, except for the XSLT instructions (the 'xsl:…' elements), which are replaced by computed content. XSLT even offers an abbreviated stylesheet format ("literal result element as stylesheet") for these simple, single-template transformations. However, the ability to define separate template rules greatly increases XSLT's versatility and efficiency, especially when producing output that is very similar to the input.
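The abbreviated format might look like the sketch below, which assumes the persons document of Example 1 as input. The whole stylesheet is a single literal result element carrying an xsl:version attribute; the processor treats it as if it were the body of one template rule matching the root node.

```xml
<html xsl:version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <head>
    <title>Persons</title>
  </head>
  <body>
    <ul>
      <!-- Only the xsl: elements are computed; the rest is copied as written -->
      <xsl:for-each select="persons/person">
        <li>
          <xsl:value-of select="name"/>
          <xsl:text> </xsl:text>
          <xsl:value-of select="family_name"/>
        </li>
      </xsl:for-each>
    </ul>
  </body>
</html>
```

This form cannot contain top-level elements such as xsl:output or additional xsl:template rules, which is why it only suits simple, single-template transformations.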
See also
- XML transformation language, any computer language designed specifically to transform an input XML document into an output XML document which satisfies some specific goal.
- XSLT is a member of the Extensible Stylesheet Language family of languages.
External links
- Implementations
- Implementations for Java
- Xalan-Java
- SAXON by Michael Kay
- XT originally by James Clark
- Oracle XSLT, in the Oracle XDK
- Implementations for C or C++
- Xalan-C++
- libxslt the XSLT C library for GNOME
- Sablotron, which is integrated into PHP4
- Implementations for Perl
- XML::LibXSLT is a Perl interface to the libxslt C library
- Implementations for Python
- 4XSLT, in the 4Suite toolkit by Fourthought, Inc.
- lxml by Martijn Faassen is a Pythonic wrapper of the libxslt C library
- Implementations for Ruby
- Ruby/XSLT is a simple XSLT class based on libxml and libxslt
- Sablotron module for Ruby is a ruby interface to Sablotron
- Implementations for JavaScript
- Google AJAXSLT is an implementation of XSLT in JavaScript, intended for use in Ajax applications. Because XSLT uses XPath, it is also an implementation of XPath that can be used independently of XSLT.
- Implementations for specific operating systems
- Microsoft's MSXML library may be used in various Microsoft Windows application development environments and languages, such as .NET, Visual Basic, C, and JScript.
- Saxon .NET Project Weblog, an IKVM.NET-based port of Dr. Michael Kay's and Saxonica's Saxon Processor provides XSLT 2.0, XPath 2.0, and XQuery 1.0 support on the .NET platform.
- Implementations integrated into web browsers
- (Comparison of layout engines (XML))
- Mozilla has native XSLT support based on TransforMiiX.
- Safari 1.3+ has native XSLT support. Unfortunately, a major drawback is that Safari is unable to perform XSL transformations via JavaScript, a limitation that does not occur in Mozilla or Internet Explorer. This limits the capabilities of Ajax applications that would run in Safari. Safari's XML-parser is also not standards-compliant; it will parse XML strings according to HTML rules. Therefore, under certain circumstances, it will omit data from the DOM tree if it encounters malformed "HTML" -- even though it actually encountered valid XML. These errors will propagate to XSL-processed DOM trees.
- X-Smiles has native XSLT support.
- Opera has had native XSLT support since version 9.
- Internet Explorer 6 supports XSLT 1.0 via the MSXML library (described above). IE5 and IE5.5 came with an earlier MSXML component that only supported an older, nonrecommended dialect of XSLT. A newer version of MSXML can be downloaded and installed separately to enable IE5 and IE5.5 to support XSLT 1.0 through scripting, and if certain Windows Registry keys are modified, the newer library will replace the older version as the default used by IE.
- Documentation
- XSLT 1.0 W3C Recommendation
- XSLT 2.0 W3C Candidate Recommendation
- Zvon XSLT 1.0 Reference
- XSL Concepts and Practical Use by Norman Walsh
- Tutorial from developerWorks (1 hour)
- Zvon XSLT Tutorial
- XSLT Tutorial
- Quick tutorial
- What kind of language is XSLT?
- XSLT and Scripting Languages
- XSLT Community Wiki (down?)
- Mailing lists
- The XSLT list hosted by Mulberrytech
- Blogs
- A commentary, news, and evangelism weblog devoted to XSLT
- Books
- XSLT by Doug Tidwell, published by O’Reilly (ISBN 0-59-600053-7)
- XSLT Cookbook by Sal Mangano, published by O’Reilly (ISBN 0-596-00974-7)
- XSLT Programmer's Reference by Michael Kay (ISBN 1-86-100312-9)
- XSLT 2.0 Web Development by Dmitry Kirsanov (ISBN 0-13-140635-3)
- XSL Companion, 2nd Edition by Neil Bradley, published by Addison-Wesley (ISBN 0-20-177083-0)
- XSLT and XPath on the Edge (Unlimited Edition) by Jeni Tennison, published by Hungry Minds Inc, U.S. (ISBN 0-76-454776-3)
- XSLT & XPath, A Guide to XML Transformations by John Robert Gardner and Zarella Rendon, published by Prentice-Hall (ISBN 0-13-040446-2)
- Tools
- Libraries
- EXSLT is a widespread community initiative to provide extensions to XSLT.
- FXSL is a library implementing support for Higher-order functions in XSLT. FXSL is written in XSLT itself.
- Other
- An XML and XSLT Editor with Debugger A tool for creating and testing XSLT documents
- Altova XMLSpy supports XSLT 1.0/2.0 editing and debugging
- <oXygen/> XML editor an XSLT editor and debugger
- Saxon XSLT and XQuery processor developed by Michael Kay
- TestXSLT by Marc Liyanage, a Mac OS X tool for experimenting and learning with various XSLT processors
- XSL Transformation to provide markup of XML contents from external lexicons
- Stylus Studio XSLT IDE an XSLT debugger, editor and mapper.
- Netbeans IDE provides an XSLT development environment
- Xselerator dedicated XSLT editor/debugger with support for arbitrary XSLT engines
- Treebeard Open source Java XSLT editor with support for various XSLT 1.0/2.0 processors
- xmlBlueprint XML editor