XML database
An XML database is a data persistence software system that allows data to be stored in XML format. This data can then be queried, exported and serialized into the desired format.
Two major classes of XML database exist:
- XML-enabled: these map all XML to a traditional database (such as a relational database), accepting XML as input and rendering XML as output. This term implies that the database does the conversion itself (as opposed to relying on middleware).
- Native XML (NXD): the internal model of such databases is based on XML, and XML documents are the fundamental unit of storage. These documents are not, however, necessarily stored as text files.
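The XML-enabled approach can be illustrated with a minimal sketch that accepts XML as input, maps it to relational rows, and renders XML as output. It uses Python's standard `sqlite3` and `xml.etree.ElementTree` modules; the sample document, the `book` table and the `shred` and `render` helpers are assumptions made for illustration and do not describe any particular product.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are invented.
DOC = """<library>
  <book id="b1"><title>XML Basics</title><year>2003</year></book>
  <book id="b2"><title>Query Languages</title><year>2007</year></book>
</library>"""


def shred(conn, xml_text):
    """Map each <book> element to one row of a relational table."""
    conn.execute(
        "CREATE TABLE book (id TEXT PRIMARY KEY, title TEXT, year INTEGER)"
    )
    for book in ET.fromstring(xml_text).findall("book"):
        conn.execute(
            "INSERT INTO book VALUES (?, ?, ?)",
            (book.get("id"), book.findtext("title"), int(book.findtext("year"))),
        )


def render(conn):
    """Rebuild an XML document from the relational rows."""
    root = ET.Element("library")
    rows = conn.execute("SELECT id, title, year FROM book ORDER BY id")
    for book_id, title, year in rows:
        book = ET.SubElement(root, "book", id=book_id)
        ET.SubElement(book, "title").text = title
        ET.SubElement(book, "year").text = str(year)
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
shred(conn, DOC)
xml_out = render(conn)
```

In this sketch the relational table, not the document, is the unit of storage: whitespace and any structure not covered by the mapping are lost on the round trip, which is one way the two classes differ.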
Rationale for XML in databases
O'Connell (2005, 9.2) gives one reason for the use of XML in databases: the increasingly common use of XML for data transport, which has meant that "data is extracted from databases and put into XML documents and vice-versa". It may prove more efficient (in terms of conversion costs) and easier to store the data in XML format.
Native XML databases
The term "native XML database" (NXD) can lead to confusion. Many NXDs do not function as standalone databases at all, and do not actually store documents in their native (text) form.
The formal definition from the XML:DB initiative states that a native XML database:[1]
- Defines a (logical) model for an XML document — as opposed to the data in that document — and stores and retrieves documents according to that model. At a minimum, the model must include elements, attributes, PCDATA, and document order. Examples of such models include the XPath data model, the XML Infoset, and the models implied by the DOM and the events in SAX 1.0.
- Has an XML document as its fundamental unit of (logical) storage, just as a relational database has a row in a table as its fundamental unit of (logical) storage.
- Need not have any particular underlying physical storage model. For example, NXDs can use relational, hierarchical, or object-oriented database structures, or use a proprietary storage format (such as indexed, compressed files).
Additionally, many XML databases provide a logical model for grouping documents, called "collections". Databases can set up and manage many collections at one time. In some implementations, a hierarchy of collections can exist, much as in an operating system's directory structure.
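A hierarchy of collections can be pictured with a small in-memory sketch. The `Collection` class, its methods and the sample paths below are hypothetical and exist only to show the directory-like structure; real products expose collections through their own APIs.

```python
import xml.etree.ElementTree as ET


class Collection:
    """Hypothetical collection holding named documents and child collections."""

    def __init__(self, name):
        self.name = name
        self.children = {}   # child collection name -> Collection
        self.documents = {}  # document name -> parsed XML root element

    def resolve(self, path):
        """Return the collection at a slash-separated path, creating it if needed."""
        node = self
        for part in path.strip("/").split("/"):
            if not part:
                continue
            if part not in node.children:
                node.children[part] = Collection(part)
            node = node.children[part]
        return node

    def store(self, path, doc_name, xml_text):
        """Parse a document and file it under the given collection path."""
        self.resolve(path).documents[doc_name] = ET.fromstring(xml_text)

    def walk(self, prefix=""):
        """Yield the full path of every stored document, depth first."""
        here = f"{prefix}/{self.name}" if self.name else prefix
        for doc_name in sorted(self.documents):
            yield f"{here}/{doc_name}"
        for name in sorted(self.children):
            yield from self.children[name].walk(here)


root = Collection("")
root.store("/orders/2009", "a.xml", "<order id='1'/>")
root.store("/orders/2009", "b.xml", "<order id='2'/>")
root.store("/customers", "c.xml", "<customer/>")
paths = list(root.walk())
```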
All XML databases now support at least one form of query syntax. At a minimum, nearly all of them support XPath for querying documents or collections of documents. XPath provides a simple path-based syntax that lets users identify nodes matching a particular set of criteria.
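As a small illustration of XPath-style node selection, the sketch below uses Python's standard `xml.etree.ElementTree` module, which implements only a limited subset of XPath. The sample document and its element names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """<library>
  <book id="b1" lang="en"><title>XML Basics</title></book>
  <book id="b2" lang="de"><title>Abfragesprachen</title></book>
  <book id="b3" lang="en"><title>Native Storage</title></book>
</library>"""

root = ET.fromstring(DOC)

# Select the <title> of every <book> whose lang attribute equals "en".
english_titles = [t.text for t in root.findall("./book[@lang='en']/title")]

# Select every <title> element anywhere below the root.
all_titles = root.findall(".//title")
```

Here `english_titles` holds the two English titles in document order, and `all_titles` holds all three `title` elements.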
In addition to XPath, many XML databases support XSLT as a method of transforming documents or query results retrieved from the database. XSLT is a declarative language written using an XML grammar. It aims to define a set of XPath filters that can transform documents (in part or in whole) into other formats, including plain text, XML, or HTML.
Many XML databases also support XQuery for querying. XQuery includes XPath as a node-selection method, but extends XPath to provide transformational capabilities. Users sometimes refer to its syntax as "FLWOR" (pronounced "flower") because a query may include the following clauses: "for", "let", "where", "order by" and "return". Traditional RDBMS vendors, whose engines previously supported only SQL, now ship hybrid SQL and XQuery engines. A hybrid SQL/XQuery engine can query XML data alongside relational data in the same query expression, which makes it easier to combine the two.
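The shape of a FLWOR expression can be mirrored in ordinary code. Python's standard library has no XQuery engine, so the sketch below is only an analogy: the comment shows an illustrative XQuery expression, and the function reproduces its for/let/where/order by/return steps by hand. The sample document and the `titles_after` helper are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """<library>
  <book><title>Query Languages</title><year>2007</year></book>
  <book><title>XML Basics</title><year>2003</year></book>
  <book><title>Native Storage</title><year>2005</year></book>
</library>"""

# Illustrative XQuery equivalent (not executed here):
#   for $b in /library/book
#   let $y := xs:integer($b/year)
#   where $y > 2004
#   order by $y
#   return $b/title/text()


def titles_after(xml_text, min_year):
    """Return titles of books published after min_year, oldest first."""
    root = ET.fromstring(xml_text)
    rows = []
    for book in root.findall("book"):          # for
        year = int(book.findtext("year"))      # let
        if year > min_year:                    # where
            rows.append((year, book.findtext("title")))
    rows.sort(key=lambda row: row[0])          # order by
    return [title for _, title in rows]        # return


result = titles_after(DOC, 2004)
```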
Some XML databases support an API called the XML:DB API (or XAPI) as a form of implementation-independent access to the XML datastore. In XML databases, XAPI plays a role similar to that of ODBC and JDBC in relational databases. On 24 June 2009, the Java Community Process released the final version of the XQuery API for Java specification (XQJ), "a common API that allows an application to submit queries conforming to the W3C XQuery 1.0 specification and to process the results of such queries".
Databases known to support the XQJ or the XML:DB API (XAPI)
XML Database | License | Language | XQJ API Support | XML:DB API Support | Transaction Support? |
---|---|---|---|---|---|
Apache Xindice | Open source, free | Java | No | Yes | No |
BaseX | Open source, free | Java | Yes | Yes | No |
GemFire Enterprise | Commercial | Unknown | No | Yes | Yes |
DOMSafeXML | Commercial | Unknown | No | Yes | Yes |
eXist | Open source, free | Java | No | Yes | No |
MonetDB/XQuery | Open source, free | C++ | No | Yes | No |
myXMLDB | Open source, free | Java | No | Yes | Unknown |
OZONE | Open source, free | Java | No | Yes | Yes |
Sedna | Open source, free | C++ | Yes | Yes | Yes |
Software AG's Tamino | Commercial | Unknown | No | Partial | Unknown |
Implementations
- Apache Xindice (previous name: dbxml)
- BaseX, a native, open-source XML database developed at the University of Konstanz
- BSn/NONMONOTONIC Lab: IB Search Engine, an embeddable XML++ search engine using a generic/abstract model and a mix of polymorphic object types. Spin-off from the Isearch project.
- DB2 9 Express-C, a no-charge hybrid relational/XML data server with PureXML
- EMC Documentum xDB, an embeddable commercial native XML database that includes an XQuery implementation
- eXist-db, an open-source native XML database written in Java
- GemStone Systems' GemFire Enterprise, a commercial XML database
- MarkLogic Server, a native XML database which uses XQuery.
- M/DB:X, a lightweight, REST-interfaced native XML database designed for use as a Cloud database.
- MonetDB/XQuery, an open-source XQuery processor built on top of the MonetDB relational database system. Also supports W3C XQUF updates.
- Oracle XML DB, XML-enabled (known as Oracle XDB as of Oracle 10g). Despite its name, it does not support the XML:DB API.
- Oracle Berkeley DB XML, an XML-enabled embedded database built on top of Berkeley DB (a key-value database).
- Sedna XML Database, an open-source XML database developed by the MODIS team at the Institute for System Programming. Supports XQuery, updates, the XQJ API, transactions and triggers.
- SQL Server 2005, whose free Express Edition includes full XML features
- Tamino XML Server, a native XML database. Supports XQuery, XQuery Update, transactions and server extensions.
- TEXTML Server, a native XML database combined with a full-text search engine.
- TigerLogic XDMS, a native XML database
- Timber, a native XML database system developed at the University of Michigan
- Qizx 3.0, a native XQuery database engine written in Java (free and open-source edition available)
- XStreamDB, a native XML database
References
External references
- XML Databases - The Business Case, Charles Foster, June 2008 - Talks about the current state of Databases and data persistence, how the current Relational Database model is starting to crack at the seams and gives an insight into a strong alternative for today's requirements.
- An XML-based Database of Molecular Pathways (2005-06-02) Speed / Performance comparisons of eXist, X-Hive, Sedna and Qizx/open
- XML Native Database Systems: Review of Sedna, Ozone, NeoCoreXMS 2006
- XML Data Stores: Emerging Practices
- Bhargava, P.; Rajamani, H.; Thaker, S.; Agarwal, A. (2005) XML Enabled Relational Databases, Texas, The University of Texas at Austin.
- O'Connell, S. Advanced Databases Course Notes, Southampton, University of Southampton, 2005
- Initiative for XML Databases
- XML and Databases, Ronald Bourret, September 2005
- XML Database Products, Ronald Bourret, 2000-2009
- The State of Native XML Databases, Elliotte Rusty Harold, August 13, 2007
- XML for DB2 Information Integration, an IBM Redbook that has a chapter on XML and databases (1st chapter).