Office Open XML

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 81.101.137.204 (talk) at 00:10, 3 September 2007 (Undid revision 155260405 by Gabrielzorz (talk)). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Office Open XML (commonly referred to as OOXML or Open XML) is an XML-based file format specification for electronic documents such as memos, reports, books, spreadsheets, charts, presentations and word processing documents. Microsoft Office 2007 saves files in Office Open XML by default. The specification was developed by Microsoft as a successor to its binary office file formats and was published by Ecma International as the Ecma 376 standard in December 2006.[1] The format specification is available free of charge from Ecma International.

Office Open XML uses a number of dedicated XML markup languages in file parts that are placed in an Open Packaging Convention file container. The format specification includes XML schemas that can be used to validate the XML syntax.

The format is currently undergoing a standardization process within the International Organization for Standardization (ISO).

Background

Since its inception, Microsoft Office and its component applications such as Microsoft Word and Excel have used binary file formats for electronic office documents. These formats comprise the majority of office documents in use today due to the dominant market position of Microsoft Office. Microsoft published incomplete specifications for these documents in the past, but later limited licensing of these file formats to governments and noncommercial use. Currently Microsoft offers the binary format specifications to everyone under a royalty-free covenant not to sue.[2] Because royalty-free access to the format specifications was hard to obtain, it was fairly difficult for third-party commercial or free software developers to implement them. Despite these difficulties a very good level of support was achieved, though full interoperability has remained elusive.

In 2002 Microsoft took a step to remedy this situation by releasing a new file format for MS Word based on XML rather than binary data. This format, known as WordProcessingML, [3] was later incorporated into the 2003 release of a full set of formats for Microsoft Office 2003 known as Office 2003 XML formats under royalty-free licensing.

The 2003 formats did not use a package file format; they used a single file in which embedded items such as pictures were stored as encoded binary data within the XML. A new version of WordprocessingML is used in Office Open XML.

In 2004 governments, chiefly the European Union, recommended that both OASIS and Microsoft standardize their XML office file formats through an official standards organization.[4] Based on these recommendations,[5] OASIS decided to submit its Open Office XML format for ISO standardization, renamed Open Document Format, the term used in the EU recommendations. Following this, in December 2005 Microsoft decided to standardize the new versions of its Microsoft Office XML formats, then under development for MS Office 12, through Ecma (renamed Ecma Office Open XML).

The development and standardization of Office Open XML takes place amid a groundswell of interest in open, standards-based technologies by commercial and government organizations. Microsoft continues to have problems with its quality assurance, preferring to let its customers do the debugging.

File format and structure

The Office Open XML file is an Open Packaging Convention package containing the individual files that form the basis of the document. In addition to XML files with Office markup data, the ZIP package can also include embedded (binary) files in formats such as PNG, BMP, AVI or PDF.

Document markup languages

An Office Open XML file may contain several documents encoded in specialized markup languages corresponding to applications within the Microsoft Office product line. Office Open XML defines multiple vocabularies, using 27 namespaces and 89 schema modules. The primary markup languages are:

  • WordprocessingML - Wordprocessing
  • SpreadsheetML - Spreadsheets
  • PresentationML - Presentation

For drawing

  • DrawingML
  • VML (deprecated)

Shared markup language materials include:

  • OMML (Office Math Markup Language)
  • Extended properties
  • Custom properties
  • Variant Types
  • Custom XML data properties
  • Bibliography

In addition to the above markup languages, custom XML schemas can be used to extend Office Open XML.

The XML Schema of OOXML can be characterized as highly generic and highly systematic, with an emphasis on reducing load time and improving parsing speed. In a test with current implementations, however, XML-based office documents were still considerably slower than binary formats.[6] For speed, OOXML uses very short element names for common elements, and spreadsheets save dates as index numbers (starting from 1899 or from 1904). In order to be systematic and generic, OOXML typically uses separate child elements for data and metadata (element names ending in Pr for properties) rather than multiple attributes, which allows structured properties. OOXML does not use mixed content but uses elements to put a series of text runs (element name r) into paragraphs (element name p). The result is terse and highly nested, in contrast to HTML, for example, which is fairly flat, designed for humans to write in text editors, and more or less congenial for humans to read.
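The nesting of runs inside paragraphs can be illustrated with a minimal Python sketch. It is not taken from the specification: the sample markup, the placeholder namespace URI, the text element name t and the jc and b property elements are assumptions made for illustration, and the code matches elements by local name only so that it does not depend on any particular namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup. The namespace URI is a placeholder, and the
# "t", "jc" and "b" element names are assumptions for illustration.
SAMPLE = """<w:document xmlns:w="urn:example:wordprocessingml">
  <w:body>
    <w:p>
      <w:pPr><w:jc w:val="center"/></w:pPr>
      <w:r><w:rPr><w:b/></w:rPr><w:t>Hello, </w:t></w:r>
      <w:r><w:t>world</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:t>Second paragraph</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def paragraphs(xml_text):
    """Return the text of each paragraph (p) by joining its runs (r)."""
    root = ET.fromstring(xml_text)
    result = []
    for elem in root.iter():
        if local_name(elem.tag) != "p":
            continue
        pieces = []
        for run in elem:
            # Property children such as pPr are metadata and are skipped.
            if local_name(run.tag) != "r":
                continue
            for child in run:
                if local_name(child.tag) == "t":
                    pieces.append(child.text or "")
        result.append("".join(pieces))
    return result


print(paragraphs(SAMPLE))
```

Because formatting lives in separate Pr child elements rather than in mixed content, the text can be recovered by walking paragraphs and runs and ignoring the property elements.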

OMML

Included with Office Open XML is Office Math ML (OMML). This is a mathematical markup language which can integrate with the WordprocessingML markup. This means that the math zones can also include word processing markup like revision markings, footnotes, comments, images and elaborate formatting and styles.[7] The format is different from the World Wide Web Consortium (W3C) MathML recommendation but compatible with it through relatively simple XSL Transformations. For example, Microsoft Office 2007 ships with XSL transformation files that allow MathML copied from the clipboard to be transformed into OMML. During XSL transformation from OMML to MathML, any WordprocessingML-related markup is lost because MathML does not allow other markup in math zones.

Container structure

Office Open XML files conform to the Open Packaging Convention and different applications have characteristic directory structures and file names within these packages. An OPC-aware application will use the relationships files rather than directory names and file names to locate individual files. In OPC terminology, a file is a part. A part also has accompanying metadata, in particular MIME metadata.

Office Open XML format uses a ZIP container for packaging XML and other data files.[8]

A basic Office Open XML file contains an XML file called [Content_Types].xml at the root level of the ZIP package, along with three folders: _rels, docProps, and a directory specific for the document type (for example, in a .docx word processing file that would be a word directory). The word directory contains the document.xml file which is the core content of the document.

[Content_Types].xml file
This file describes the content of the ZIP package. It also contains a mapping for file extensions and overrides for specific URIs.
_rels Folder
The _rels folders are where one goes to find the relationships for any given part within the package. To find the relationships for a specific part, one looks for the _rels folder that is a sibling of one's part. If the part has relationships, the _rels folder will contain a file that has one's original part name with .rels appended to it. For example, if the content types part had any relationships, there would be a file called [Content_Types].xml.rels inside the _rels folder.
_rels/.rels
The root level _rels folder always contains a part called .rels. This URI (/_rels/.rels) and /[Content_Types].xml are the only two reserved URIs for parts in files that adhere to Office Open XML conventions. This is where the "package relationships" are located. Whenever one opens a file using these conventions, one always starts by going to the _rels/.rels file. All relationship files are represented with XML. If one opens one in a text editor, one will see XML that outlines each relationship for that part. In a minimal Word document containing only the basic document.xml, the top level parts are two metadata parts and the document.xml part.
word/document.xml
This is the main part for any Word document. If one views it in an XML editor, one will see a pretty basic XML file. The body of the word processing document is contained in this part.
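The layout described above, and the rule that a part's relationships live in a sibling _rels folder, can be sketched in Python with the standard zipfile module. This is an illustrative sketch only: the function names are hypothetical, the docProps/core.xml file name is an assumption, and the placeholder part contents do not form a valid document.

```python
import io
import posixpath
import zipfile


def rels_part_for(part_name):
    """Name of the relationships part for a given part.

    Follows the rule in the text: a sibling _rels folder containing the
    original part name with ".rels" appended. An empty part name stands
    for the package itself and yields "_rels/.rels".
    """
    directory, filename = posixpath.split(part_name)
    return posixpath.join(directory, "_rels", filename + ".rels")


def build_skeleton():
    """Build an in-memory ZIP with the basic layout (placeholder contents)."""
    buffer = io.BytesIO()
    names = (
        "[Content_Types].xml",
        "_rels/.rels",
        "docProps/core.xml",  # assumed file name, for illustration only
        "word/document.xml",
    )
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for name in names:
            package.writestr(name, "<placeholder/>")
    buffer.seek(0)
    return buffer


with zipfile.ZipFile(build_skeleton()) as package:
    for name in package.namelist():
        print(name)

print(rels_part_for("word/document.xml"))
print(rels_part_for(""))
```

An OPC-aware reader would follow the relationship parts rather than rely on these folder names, so the helper is only a convenience for locating the relationship file that belongs to a known part.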

Relationships

Relationship files in Office Open XML

An example relationship file in Office Open XML (for example word/_rels/document.xml.rels)

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<Relationships
  xmlns="http://schemas.microsoft.com/package/2005/06/relationships">
  <Relationship Id="rId1"
     Type="http://schemas.microsoft.com/office/2006/relationships/image"
     Target="http://en.wikipedia.org/images/wiki-en.png"
     TargetMode="External" />
  <Relationship Id="rId2"
     Type="http://schemas.microsoft.com/office/2006/relationships/hyperlink"
     Target="http://www.wikipedia.org"
     TargetMode="External" />
</Relationships>

Relationship files allow navigation of the package without having to open up each part. For example, images that are referenced in a Word document can be found in the relationship file by looking for all relationships that are of type http://schemas.microsoft.com/office/2006/relationships/image. To point to a different image, you just edit the relationship.
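As a rough illustration of looking up relationships by type, the following Python sketch parses the example relationship file shown above (with the XML declaration omitted for brevity) and collects the targets of all image relationships. The namespace and type URIs are copied from that example; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# URIs copied from the example relationship file in this section.
RELS_NS = "http://schemas.microsoft.com/package/2005/06/relationships"
IMAGE_TYPE = "http://schemas.microsoft.com/office/2006/relationships/image"

RELS_XML = """<Relationships
  xmlns="http://schemas.microsoft.com/package/2005/06/relationships">
  <Relationship Id="rId1"
     Type="http://schemas.microsoft.com/office/2006/relationships/image"
     Target="http://en.wikipedia.org/images/wiki-en.png"
     TargetMode="External" />
  <Relationship Id="rId2"
     Type="http://schemas.microsoft.com/office/2006/relationships/hyperlink"
     Target="http://www.wikipedia.org"
     TargetMode="External" />
</Relationships>"""


def targets_of_type(rels_xml, rel_type):
    """Return the Target of every relationship whose Type equals rel_type."""
    root = ET.fromstring(rels_xml)
    return [
        rel.get("Target")
        for rel in root.findall(f"{{{RELS_NS}}}Relationship")
        if rel.get("Type") == rel_type
    ]


print(targets_of_type(RELS_XML, IMAGE_TYPE))
```

Only the relationship file is read here; the document part itself never has to be opened to find out which images it references.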

The following code shows an example of inline markup for a hyperlink:

<w:hyperlink w:rel="rId2" w:history="1"> 

In this example, the URL is represented by "rId2". The actual URL is located by the corresponding "rId2" item in the accompanying relationships file. Linked images, templates, and other items are referenced in the same way. The locations of referenced items can be updated by editing the relationships file.

Embedded or linked media file relations

Pictures can be embedded or linked in the XML files using a tag:

<v:imagedata w:rel="rId1" o:title="example" />

This is the reference to the image file. In Office Open XML, all references are done via relationships. For example, a document.xml part has a relationship to the image part. The actual URI is located by the corresponding "rId1" item in the accompanying relationships file. There is a _rels folder in the ZIP package, in the same directory as document.xml. Inside _rels is a file called document.xml.rels. In this file there will be a relationship definition that contains a type, an ID and a location. The ID is the referenced ID used in the XML document. The type will be a reference schema definition for the media type, and the location will be an internal location within the ZIP package or an external location defined with a URL.
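The lookup from a relationship ID to a location can be sketched in Python as follows. This is a sketch under stated assumptions, not a definitive implementation: the sample relationships, the media/image1.png target and the function name are hypothetical, and resolving an internal target relative to the directory of the referring part is an assumption about how relative locations are interpreted.

```python
import posixpath
import xml.etree.ElementTree as ET

# Namespace and type URIs follow the example relationship file in this article.
RELS_NS = "http://schemas.microsoft.com/package/2005/06/relationships"

# Hypothetical relationships for word/document.xml: one embedded image
# (internal target) and one hyperlink (external target).
SAMPLE_RELS = """<Relationships
  xmlns="http://schemas.microsoft.com/package/2005/06/relationships">
  <Relationship Id="rId1"
     Type="http://schemas.microsoft.com/office/2006/relationships/image"
     Target="media/image1.png" />
  <Relationship Id="rId2"
     Type="http://schemas.microsoft.com/office/2006/relationships/hyperlink"
     Target="http://www.wikipedia.org"
     TargetMode="External" />
</Relationships>"""


def resolve(rels_xml, source_part, rel_id):
    """Map a relationship ID to ("internal", part name) or ("external", URL)."""
    root = ET.fromstring(rels_xml)
    for rel in root.findall(f"{{{RELS_NS}}}Relationship"):
        if rel.get("Id") != rel_id:
            continue
        target = rel.get("Target")
        if rel.get("TargetMode") == "External":
            return ("external", target)
        # Assumption: internal targets are relative to the referring part.
        base = posixpath.dirname(source_part)
        return ("internal", posixpath.normpath(posixpath.join(base, target)))
    raise KeyError(rel_id)


print(resolve(SAMPLE_RELS, "word/document.xml", "rId1"))
print(resolve(SAMPLE_RELS, "word/document.xml", "rId2"))
```

Changing where an image or link points then amounts to editing the Target of the matching relationship, leaving the ID used in the document markup untouched.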

Licensing

Ecma International provides its standard specifications for free without copyright restrictions [9] and under the Ecma code of conduct in patent matters which requires participating and approving member organisations to make available their patent rights under a reasonable and non-discriminatory basis (see Reasonable and Non Discriminatory Licensing).

Microsoft, which is a main contributor to the Ecma standard, provided a covenant not to sue.[10] The covenant received a mixed reception, with some, like Groklaw, identifying problems[11] and others (such as Lawrence Rosen) endorsing it.[12]

Microsoft also added the Office Open XML format to their Microsoft Open Specification Promise in which Microsoft irrevocably promises not to assert any Microsoft Necessary Claims against you for making, using, selling, offering for sale, importing or distributing any implementation to the extent it conforms to a Covered Specification ("Covered Implementation"). The Office Open XML 1.0 - Ecma 376 and its predecessor Office 2003 XML format are among the covered specifications.[13]

The Office Open XML format therefore can be used under the Covenant not to Sue or the Open Specification Promise.

In support of the licensing arrangements Microsoft commissioned an analysis from the London legal firm Baker & Mckenzie.[14]

The Microsoft Open Specification Promise was included in documents submitted to ISO in support of the Ecma 376 fast track submission.[15] In response to criticism of the licensing, Ecma provided the following statements:[16]

  • "Contributions to Ecma were made under the Ecma Code of Conduct in Patent Matters..."
  • "The OSP enables both open source and commercial software to implement DIS 29500."

EU definition of an Open Standard

With Ecma International publishing the specification for free and patents made irrevocably available on a royalty-free basis, Office Open XML conforms to all characteristics of the European Union's definition of an open standard.

  • The standard is adopted and will be maintained by a not-for-profit organisation, and its ongoing development occurs on the basis of an open decision-making procedure available to all interested parties (consensus or majority decision etc.). [17]
  • The standard has been published and the standard specification document is available either freely or at a nominal charge. It must be permissible to all to copy, distribute and use it for no fee or at a nominal fee. [17]
  • The intellectual property — i.e. patents possibly present — of (parts of) the standard is made irrevocably available on a royalty-free basis. [17]
  • There are no constraints on the re-use of the standard. [17]

Standardization

Creation of Ecma 376

Microsoft stated that Office Open XML would be an open standard, and submitted it to the Ecma standardization process. On 2005-12-08 Ecma created technical committee 45 (TC45) in order to "produce a formal standard for office productivity applications that is fully compatible with the Office Open XML Formats, submitted by Microsoft".[18]

The TC45 committee is co-chaired by two Microsoft employees;[19] it also includes members from Apple, Canon, Intel, NextPage, Novell, Pioneer, Statoil ASA, Toshiba and The United States Library of Congress.[1]

At the General Assembly meeting on 2006-12-07, Ecma International approved Office Open XML as an Ecma standard (Ecma 376).[1] The General Assembly also approved submission of the text to the Fast Track mechanism of ISO/IEC JTC 1, which — if successful — would result in it becoming an ISO standard.

The full text of Ecma 376, or its individual parts, can be downloaded from Ecma International.

Submission to ISO

As an ISO external Category A liaison, Ecma have submitted Ecma 376 to the ISO Fast Track process, the same process available to National Standard Organizations. To meet the requirements of this process, [20] Ecma have submitted the documents "Explanatory report on Office Open XML Standard (Ecma-376) submitted to JTC 1 for fast-track"[21] and "Licensing conditions that Microsoft offers for Office Open XML".[15]

Contradiction phase

The fast track process allows a 30-day review period by national standardizing bodies (NBs). During this period NBs may identify to the JTC 1 Secretariat any perceived contradiction with other JTC 1, ISO or IEC standards. If such a contradiction is alleged, "the JTC 1 Secretariat and ITTF shall make a best effort to resolve the matter".[20] At the end of the 30-day review, 20 countries had submitted responses. This large number of submissions was unprecedented; six of them raised objections and only one was fully supportive, with most ballots being equivocal. The full text of the national bodies' submissions is available from the ISO/IEC JTC1 SC32 website.

Most countries (including the United States, represented by ANSI) did not respond in the contradictions phase — member countries are not required to respond if they perceive no contradictions.

Ecma responded to the issues raised during the contradiction period with a document reviewing the national bodies' comments.[22] This document cites the objections raised by the national bodies, and shows the comments overlap considerably with material on the Web created by opponents of Office Open XML, particularly from the Grokdoc[23] site. Microsoft employees and others have suggested that the national bodies' documents were not written by the bodies, but by Microsoft competitors;[24] these suggestions are supported by the author metadata in Kenya's PDF submission, which contains the names of an IBM Germany employee and a Malaysian ODF supporter.[25] However, national bodies are permitted to source technical skills as required.

Five-month ballot

The JTC 1 directives[20] state that regardless of whether or not resolution is reached on the question of contradiction, a five-month ballot commences immediately. So, on April 2, 2007 the ISO JTC 1 Secretariat duly informed Ecma International that the five-month DIS 29500 (Office Open XML) ballot period had started and would close on September 2, 2007.[26]

At the end of the five-month letter ballot, all the technical comments that have been made are consolidated and redistributed so that the voting nations may form a view on them in their totality. The SC34 secretariat then may decide to arrange for a special ballot resolution meeting (BRM) to take place no sooner than two and one-half months after the ballot has closed.

Response to the ballot

After the five-month letter ballot closes, the proposer (Ecma) has a chance to respond to the comments made by the national bodies that voted. They may combine, de-duplicate, label and group the comments and then attempt to liaise informally with national bodies, to try and arrive at a set of revisions that are acceptable.

The result of Ecma's activity will be a "Disposition of Comments" document — effectively a set of proposed revisions to the DIS 29500 text designed to be acceptable to all the national bodies who disapproved of the text in the letter ballot. It is this series of edits which effectively forms the agenda for the subsequent Ballot Resolution Meeting, and on which the attendees of that meeting will be asked to form opinions.

Ballot resolution process

During the six-month period (of a one-month contradiction phase, and five-month letter ballot) the national bodies are able to cast a vote of approval, disapproval or abstention. P-members are required to vote. The ballot resolution process is the process which follows this vote.

In the event there is not 100% support, or overwhelming disapproval, of DIS 29500 there can be a Ballot Resolution Meeting (BRM) if needed, in which comments submitted with ballot votes can be resolved. The BRM will have been called at the discretion of the SC34 secretariat at the end of the five-month ballot voting. The outcome of this meeting effectively decides whether DIS 29500 succeeds or fails in its bid to become a full International Standard. The DIS 29500 BRM, should it happen, is scheduled for the week of February 25-29, 2008 at the International Conference Centre Geneva.

Who attends the BRM?

The participants in the BRM are representatives of the national bodies, the proposer (Ecma), and support and administrative staff.

The national bodies who attend this meeting are:

  • SC34 members (who have to send representatives);
  • those that voted "disapprove" during the five-month ballot, who have a duty to send a delegation to this meeting. (JTC 1 Directives[20] clause 13.7);
  • optionally, those who voted to "approve" DIS 29500.

The meeting is expected to attract between 40 and 100 participants.

Meeting process

During this meeting, the participants consider each of Ecma's responses to the comments gathered in the preceding process. Each comment is thus effectively "resolved" by the meeting's participants either:

  • agreeing to a proposed alteration of the text by Ecma
  • agreeing to withdraw that comment (if, for example, it is incorrect)
  • otherwise agreeing to amend the text or ignore a comment following discussion

In this way, a set of editorial changes to the text is agreed that, collectively, implies a new revised document. The meeting then agrees whether this final "implied" document is acceptable for publication as a full International Standard.

Voting procedures

JTC 1 states that decisions at the BRM should be reached preferably by consensus, but that any unavoidable votes should be taken according to normal JTC 1 procedures (JTC 1 Directives[20] clause 13.8). Any country that voted (i.e. either "approve with comments", "disapprove with comments" or "abstain") in the five-month ballot mentioned above may vote at the BRM. Countries that did not vote in that letter ballot may not vote at the BRM. The vote for ISO standardization passes if:

  • At least two-thirds of the P-members voting (abstentions do not count) have approved. The P-members form a subset of 40 countries with more voting power.[27]
  • Not more than one-quarter of the total number of votes cast (P-members and others) are negative
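The two criteria can be expressed as a small Python check. This is a sketch of the rule as stated above, not an official procedure: the function name and the vote counts in the example are hypothetical, and it assumes that abstentions are excluded from the "votes cast" total in the second criterion as well as from the first.

```python
from fractions import Fraction


def ballot_passes(p_approve, p_disapprove, other_approve, other_disapprove):
    """Apply the two approval criteria to counts of non-abstaining votes.

    p_*     : votes from P-members
    other_* : votes from all other voting national bodies
    Assumption: abstentions are left out of every count.
    """
    p_voting = p_approve + p_disapprove
    total_cast = p_voting + other_approve + other_disapprove
    if p_voting == 0 or total_cast == 0:
        return False
    # Criterion 1: at least two-thirds of the P-members voting approve.
    two_thirds_met = Fraction(p_approve, p_voting) >= Fraction(2, 3)
    # Criterion 2: not more than one-quarter of all votes cast are negative.
    negatives = p_disapprove + other_disapprove
    quarter_met = Fraction(negatives, total_cast) <= Fraction(1, 4)
    return two_thirds_met and quarter_met


# Hypothetical counts, for illustration only.
print(ballot_passes(20, 10, 30, 5))
print(ballot_passes(19, 11, 30, 5))
```

Exact fractions are used instead of floating-point division so that a result lying exactly on the two-thirds or one-quarter boundary is classified correctly.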

Countries may change their position to any of "approve", "disapprove" or "abstain" during the course of the BRM.[citation needed]

Final outcome

If this meeting fails to agree on a final text, the proposal of OOXML for fast-tracking fails and the procedure is terminated: if the meeting does agree on a final text, any required changes are applied by the editor and OOXML is passed for publication as an ISO standard.

ISO maintenance regime

The maintenance regime for OOXML (should it become an ISO Standard) is yet to be determined. Ecma have however tabled a maintenance proposal for discussion by SC34 at a meeting scheduled to take place in December 2007 in Kyoto.

National body activity

Some countries opened their scrutiny procedure to public view during the five-month ballot:

  • The American National Standards Institute (ANSI) is publishing comments they receive on DIS 29500 here. An archive of email exchanged between members of INCITS V1 (who provide recommendations on the U.S. position) is available here. When voting on a position, the INCITS V1 committee was divided, with Microsoft and Ecma TC45 members Nextpage and BP voting for "Approve with comments" on the one hand, and a group including IBM, Sun and Red Hat voting for "Disapprove with comments" on the other. The committee thus failed to reach agreement on a recommendation to its parent committee.[28] Subsequently the INCITS Executive Board held several ballots for a position of "Approval with comments" and to include all comments, whether or not processed by the V1 committee.[29] The second of these ballots achieved the necessary two-thirds majority (12-3-1),[30] making the U.S. ballot vote "Approve with comments".
  • The British Standards Institute (BSI) used an open Wiki (read-only to the public; read/write for BSI technical committee members) to help coordinate the UK's input into the ballot. While active, the site gathered 630 comments on the text. The main page of the wiki is here, and specific comments on DIS 29500 are here. Beyond this, the deliberations and voting of BSI are confidential.
  • The Standards Council of Canada (SCC) is seeking comments on a proposal to adopt Office Open XML (Open XML) as an international open standard. The forum soliciting comments from Canadians only is here.
  • The KATS, a standard agency of South Korea, is seeking comments from the standard technology study group for open document format(개방형 문서형식 표준기술연구회) and the committee for electronic document processing language(전자문서 처리언어 전문위원회).[31]

Complaints About National Bodies Process

Some complaints have surfaced during the process.

  • At the technical committee meeting of Portugal's national body, it was suggested that Sun Microsystems be represented. An unofficial transcript suggested that this was refused for "lack of space".[32] This has been criticized by opponents of OOXML,[33] while Microsoft claims that the number of seats (not chairs) on the committee was limited to 20 by the national body before the meeting.[34]
  • The Swedish Standards Institute has become one of the battlegrounds for supporters and opponents of Office Open XML in the current ISO standardization process. Microsoft Sweden apparently asked its partners to get involved in the standardization process. In total 22 Microsoft partners (4 of which may have been IBM partners as well) and Google paid a 17.000 SEK (2444 USD) fee to join the committee and were allowed to vote at the last minute.[35] A Microsoft memo to its partners has surfaced, requesting them to join the SIS committee and vote in favor of OOXML in return for "marketing contributions".[36] Microsoft claims the memo was the action of an individual employee acting outside company policy, and that it was retracted as soon as it was discovered.[37] In the end, SIS decided to invalidate the vote because one company cast more than one vote, which is against SIS rules.[38]
  • In the Netherlands, the committee of the Dutch standardization institute NEN intended to vote "No with comments", where the comments would list the conditions under which OOXML would be acceptable to the committee. These conditions were a compromise between the parties represented in the committee. However, the procedure required unanimous support for the "no" vote, and the lack of support from the Dutch Microsoft representation in the committee resulted in an "abstain" vote.[39]
  • In Switzerland, which voted "yes", there was criticism about a conflict of interest regarding the chairman of the NK 149 committee, who overruled votes and did not allow discussion of legal and economic arguments.[40][41]

Adoption

Office Open XML is the default Microsoft Office 2007 format. For older versions such as Microsoft Office 2000, XP and 2003 a compatibility pack is provided.[42] The compatibility pack can also be used as a stand alone converter with Microsoft Office 97.

  • Microsoft Office Open XML File Format Converter for Mac 0.2 (Beta).[43] Microsoft had previously advised users of Office 2007 to save their files in the old Office binary format.[44]
  • Beta testing has started on Microsoft Office 2008 for Mac, which will support the format. The final version is scheduled for release in mid-January 2008.[45]
  • Corel has announced that by mid-2007 its WordPerfect Office suite will support Office Open XML as well as OpenDocument.[51]
  • Gnumeric has limited SpreadsheetML markup language support.[52]
  • docXConverter by Panergy Ltd. converts from WordprocessingML to Rich Text Format (RTF). DocXConverter can be used to transfer WordprocessingML data to other applications that read RTF data such as Word 97.[55]
  • Datawatch supports Office Open XML spreadsheets in its report mining tool Monarch v9.0[57]

Arguments in support and criticism of OOXML standard

Support

Organizations and individuals supporting Office Open XML have provided arguments for standardization, summarized by ECMA[58].

User base argument

The most widely used office productivity packages currently rely on various proprietary binary file formats such as doc, ppt and xls. For users of the binary formats there could be an advantage to migrating to an open XML standard that maps the features of previous binary file formats. Office Open XML for this purpose explicitly states as a goal of the format[59] to preserve investments in existing files and applications.

Key benefits arguments

Microsoft provided an overview of the benefits of using Office Open XML:[60]

  • Integration of business information with documents
  • Open and royalty-free specification
  • Compact, robust file format

Key benefits and functionality

  • Compact file format
  • Safer documents
  • Easier integration
  • Transparency and improved information security
  • Compatibility

Policy arguments

ECMA has provided the following policy arguments in favor of standardization[61] with respect to Overlap in Scope with ISO/IEC 26300:2006 (ODF): Overlap in Scope of ISO/IEC standards is common and can serve a practical purpose; OpenXML addresses distinct user requirements; ODF and OpenXML are Structured to Meet Different User Requirements; OpenXML and ODF can serve as duo-standards.

Microsoft attacked IBM's fundamental opposition to the Open XML standardization process[62]:

  • "Ecma almost unanimously agreed to submit Open XML as a standard for ratification by ISO/IEC JTC1 with only IBM dissenting."
  • "IBM led a global campaign urging national bodies to ... not even consider Open XML, because ODF had made it through ISO/IEC JTC1 first – in other words, that Open XML should not even be considered on its technical merits because a competing standard had already been adopted. This campaign to stop even the consideration of Open XML in ISO/IEC JTC1 is a blatant attempt to use the standards process to limit choice in the marketplace for ulterior commercial motives – and without regard for the negative impact on consumer choice and technological innovation."

Technical arguments

  • The use of the Open Packaging specification which allows for Indirection, Chunking and Relative indirection. [63]
  • Office Open XML (Part II of the format specification) specifies the ZIP format making ZIP a part of a standard.
  • Due to ZIP compression, files are smaller than the currently widely used binary formats. [64]
  • It supports custom data elements for integration of data specific to an application or an organisation that wants to use the format. [64]
  • It is currently the only open document standard to define spreadsheet formulae.[citation needed]
  • Office Open XML contains alternate representations for the XML schemas and extensibility mechanisms using RELAX NG (ISO/IEC 19757-2) and NVDL (ISO/IEC 19757-4) [64]
  • OpenXML contains no restriction on image, audio or video types. For example, images can be in Microsoft WMF, GIF, PNG, TIFF, PICT, JPEG or any other image type (§1:14.2.12).[citation needed]
  • Embedded controls can be of any type, such as Java or ActiveX (§1:15.2.8).[citation needed]
  • WordprocessingML font specifications can include font metrics and PANOSE information to assist in finding a substitution font if the original is not available (§3:2.10.5). [64]
  • Alternate Content Block (§3:2.18.4) A solution to define alternate content (like an image) which can be used in various situations where a consuming application might not be capable of interpreting what a producing application wrote. [64]
  • Internationalization features needed to support multiple nations. For example, in WordprocessingML (§4:2.18.7) and SpreadsheetML (§4:3.18.5), calendar dates can be written using Gregorian (three variants), Hebrew, Hijri, Japanese (Emperor Era), Korean (Tangun Era), Saka, Taiwanese, and Thai formats; there are also several internationalization-related spreadsheet conversion functions. [64]
  • Custom XML schema extensibility, allowing implementations to extend the format with additional features. This can, for instance, facilitate conversion from other formats and support future features that are not yet part of the official specification. [64]

Criticism

The Office Open XML standard has been the subject of wide and varied debate in the software industry. Many of the participants in the approval process are generally supportive of eventual ISO standardization, but are unwilling to support the ISO fast track process until their issues are resolved. At 6000 pages long, the specification is difficult to evaluate quickly.[65] One issue raised is the existence of the OpenDocument format (ISO/IEC 26300:2006), which overlaps with the new Office Open XML format. Critics suggest that Microsoft adopt the OpenDocument format as its default format for future versions of Microsoft Office [citation needed]. Objectors also complain that there could be user confusion regarding the two standards because of the similarity of the "Office Open XML" name to both "OpenDocument" and "OpenOffice".[23]

Criticism by competitors and the free software and open source communities

The critics include a wide variety of organizations and individuals, including the free software and open source communities, OpenDocument supporters[66] and major industry players, such as Sun Microsystems, IBM and Google.

OOXML has been widely criticized on technical and legal grounds, and the standardization process itself has also been questioned. In addition to the specific issues noted below, an overall premise of the critics' argument is that the format is inherently closed in many respects and thus a poor candidate for a global standard.[67] Similar concerns are raised in "Preliminary Google reply to DIS 29500: the consideration of ECMA-376 OOXML for ISO standardization" and in "IBM Comments on INCITS LB 2212 - DIS 29500".

There is also criticism that the proposed standard duplicates, overlaps with, and cannot be merged with the existing ISO OpenDocument Format.

  • The scope of the patent licensing covers only the required features of the standard, not the entire standard. Specifically, Microsoft's Covenant Not to Sue covers only patents "that are necessary to implement only the required portions of the Covered Specification that are described in detail and not merely referenced in such Specification." Microsoft's Open Specification Promise likewise protects only what is explicitly specified in the standard. [68]
  • The Open Specification Promise is not available in languages other than English and is tied to an Anglo-Saxon legal system. It is untested in court.

Technical criticisms

  • Reliance on application-defined behaviors to support important functionality that should be documented or supported via existing standards. For example, book 4 section 6.1.2.19 defines the "equationxml" attribute of "shape" elements, "used to rehydrate an equation using the Office Open XML Math syntax"; however, the "actual format of the contents of this attribute are application-defined".[23]
  • SpreadsheetML stores dates as a decimal number of days (including fractional days) counted from the start of 1900. It incorrectly treats 1900 as a leap year in order to remain backward compatible with previous versions of Microsoft Excel and with Lotus 1-2-3.[69] The criticism is twofold: only dates after the nonexistent Gregorian date 1900-02-29 can be used, and the format ignores the ISO 8601 standard for the representation of dates and times.
  • Use of DrawingML and VML instead of SVG, and of a new mathematical format instead of MathML. MathML and SVG are W3C recommendations. VML was submitted to the W3C as a proposed standard in 1998 but was rejected. Microsoft considers it deprecated, and it should only occur in files converted from the MS Office WordprocessingML 2003 format.
  • Internal inconsistencies and omissions. For example, book 4 section 2.18.4 lists styles such as "apples", "scaredCat", and "heebieJeebies", but does not fully define these styles. Missing properties include height, width, color depth, and orientation.[23]
  • Inconsistent notations for percentage units. In book 4, section 2.18.85 uses predefined symbols (like "pct15" for 15%) in 5 or 2.5 percent increments, section 2.15.1.95 uses a decimal number giving the percentage, section 2.18.97 uses a number in fiftieths of a percent, and section 5.1.12.41 uses a number in thousandths of a percent.[23]
  • Inflexible numbering format. For example, book 4 section 2.18.66 describes a numbering format that is fixed to a few countries and contradicts both the W3C XSLT recommendation and Unicode ISO 10646 standard.[23]
  • Non-standard, inflexible paper size naming. For example, book 4 section 3.3.1.61 defines a "paperSize" attribute for which values 1 through 68 are predefined standard paper sizes such as A4 paper.[23]
  • Non-standard language codes and color names.[23]
  • Non-extensible bitmasks: some element attributes are defined as bitmasks. For example, book 4 section 2.8.2.16 "sig (Supported Unicode Subranges and Code Pages)" describes the <w:sig> element, the attributes of which are all bitmasks.[23]
  • Legacy document rendering compatibility is identified using (deprecated) tags. Examples include book 4 section 2.15.3.6, "autoSpaceLikeWord95"; "useWord97LineBreakRules"; "useWord2002TableStyleRules"; book 4 section 2.15.3.31, "lineWrapLikeWord6"; and "suppressTopSpacingWP", for a 16-year-old version of WordPerfect.[23] These items should only occur in OOXML documents that were converted from predecessor Microsoft Office documents.
  • Errors in the spreadsheet formula specifications confirmed by Microsoft.[70]
  • Accessibility issues according to the University of Toronto,[71] such as form fields not being associated with their labels, absence of a tabbing order for forms, and limitations in the use of alternative text descriptions of objects.
  • SpreadsheetML has a large number of internal dependencies, so that a change to a single data cell requires many changes in different parts of the XML data. It also allows multiple different ways to represent semantically identical cell data.[72]
  • Locale conventions (such as decimal points, date formats, and character settings) are inconsistent. For example, SpreadsheetML documents are internally represented in the US English locale, but font types such as "bold" can be specified in any language (e.g. "gras" in French), even though the specification does not provide a list of equivalents in different languages.[72]
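The leap-year criticism in the list above can be made concrete with a small sketch. It assumes the convention used by Microsoft Excel's 1900 date system, in which serial number 1 is 1900-01-01 and serial number 60 is the fictitious 1900-02-29; the function name serial_to_date is hypothetical and is not taken from the specification. The sketch handles whole days only and ignores the fractional (time of day) part.

```python
from datetime import date, timedelta


def serial_to_date(serial: int) -> date:
    """Convert a whole-day spreadsheet serial number to a Gregorian date.

    Assumption (not from the specification text): serial 1 is 1900-01-01
    and serial 60 is the nonexistent 1900-02-29, as in Excel's 1900 system.
    """
    if serial < 1:
        raise ValueError("serial numbers start at 1")
    if serial == 60:
        # The format counts a leap day that the Gregorian calendar lacks.
        raise ValueError("serial 60 stands for 1900-02-29, which does not exist")
    # Every serial after the phantom leap day is one higher than the true
    # day count, so subtract one to compensate.
    offset = serial if serial < 60 else serial - 1
    return date(1899, 12, 31) + timedelta(days=offset)


if __name__ == "__main__":
    for s in (1, 59, 61, 36526):
        print(s, serial_to_date(s).isoformat())
```

A consuming application that simply adds the serial number to a fixed epoch would be off by one day on one side of the phantom leap day, which is why a special case of this kind is needed.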

Technical and legal issues such as those mentioned here led OpenOffice.org in Denmark to submit objections to Ecma 376 to the Danish National Body (Dansk Standard).[73] The objections stated that there were "many serious mistakes, self-contradictions, legal problems and Microsoft dependencies in the specification", that "cultural and linguistic adaptability suffers [which] are not extensible by vendors in an interoperable way", and that Ecma 376 does not meet the stability requirement for ISO standardization.

References

  1. ^ a b c "Ecma International approves Office Open XML standard" (Press release). Ecma International. 2006-12-07. Retrieved 2006-12-08.
  2. ^ "How to extract information from Office files by using Office file formats and schemas". Microsoft. 2007-03-27. Retrieved 2007-07-10.
  3. ^ Brian Jones (2007-01-25). "History of office XML formats (1998-2006)".
  4. ^ Telematics between Administrations Committee based on IDA expert group on open document formats (2004-05-25). "TAC approval on conclusions and recommendations on open document formats". IDABC - European eGovernment Services.
  5. ^ Michael Brauer (Sun) (2007-09-01). "News from IDA, ISO and TC roadmap". OASIS.
  6. ^ George Ou (2007-04-27). "MS Office 2007 versus Open Office 2.2 shootout". ZDnet.com. Retrieved 2007-04-27.
  7. ^ Murray Sargent (2007-06-05). "Science and Nature have difficulties with Word 2007 mathematics". MSDN blogs. Retrieved 2007-07-31.
  8. ^ Tom Ngo (2006-12-11). "Office Open XML Overview" (PDF). Ecma International. p. 6. Retrieved 2007-01-23.
  9. ^ "What is Ecma International".
  10. ^ "Microsoft Covenant Regarding Office 2003 XML Reference Schemas". Microsoft. Retrieved 2006-07-11.
  11. ^ "2 Escape Hatches in MS's Covenant Not to Sue". Groklaw. Retrieved 2007-01-29.
  12. ^ Berlind, David (2005-11-28). "Top open source lawyer blesses new terms on Microsoft's XML file format". ZDNet. Retrieved 2007-01-27.
  13. ^ "Microsoft Open Specification Promise". Microsoft. 2006-09-12. Retrieved 2007-04-22.
  14. ^ Baker & McKenzie (2006). "Standardization and Licensing of Microsoft's Office Open XML Reference Schema" (PDF). Baker & McKenzie. Retrieved 2007-02-01.
  15. ^ a b Licensing conditions that Microsoft offers for Office Open XML
  16. ^ -Response Document- National Body Comments from 30-Day Review of the Fast Track Ballot for ISO/IEC DIS 29500 (ECMA-376) Office Open XML File Formats
  17. ^ a b c d IDABC - European eGovernment Services (2004). "European Interoperability Framework for pan-European eGovernment Services". Retrieved 2007-07-30.
  18. ^ "The new open standard safeguards the continued use of billions of existing documents". Ecma International. Retrieved 2007-01-28.
  19. ^ "TC45 - Office Open XML Formats". Ecma International. Retrieved 2007-02-08.
  20. ^ a b c d e "ISO/IEC JTC 1 Directives, 5th Edition, Version 2.0". iso. Retrieved 2007-01-28.
  21. ^ Explanatory report on Office Open XML Standard (Ecma-376) submitted to JTC 1 for fast-track
  22. ^ "Response Document: National Body Comments from 30-Day Review of the Fast Track Ballot for ISO/IEC DIS 29500 (ECMA-376) "Office Open XML File Formats"" (PDF). Ecma International. 2007-02-28. Retrieved 2007-04-03.
  23. ^ a b c d e f g h i j "EOOXML objections". grokdoc. Retrieved 2007-01-02.
  24. ^ Brian Jones. "A few updates on the OpenXML formats". Retrieved 2007-05-04.
  25. ^ Stephen McGibbon. ""There is no reason to be browbeaten into thinking that there should only be one document format."". Retrieved 2007-06-22.
  26. ^ "Office Open XML reaches next step in ISO/IEC process". Ecma International. 2007-04-02. Retrieved 2007-04-03.
  27. ^ JTC 1 P-Members
  28. ^ "Email with appended notes from INCITS/V1 meeting on 2007-07-13".
  29. ^ Doug Mahugh (2007-07-19). "INCITS Executive Board to vote on "approve with comments"". MSDN blogs.
  30. ^ INCITS (2007-08-13). "Vote Tally for INCITSLB2341". INCITS.
  31. ^ Digital Times (2007-08-29). "Would be approved or not, Microsoft's Open XML as an international standard (MS `오픈 XML` 국제 표준 승인될까?)". Digital Times.
  32. ^ Rui Seabra (ANSOL - FSF Europe). "CT-173 meeting of 2007-07-16 By Rui Seabra".
  33. ^ Pamela Jones. "Notes from Portugal on the July 16 meeting on ECMA-376". http://www.groklaw.net/article.php?story=2007071812280798
  34. ^ Jason Matusow (Microsoft senior director of intellectual property) (2007-07-31). "Ecma Open XML and the Portuguese National Body". MSN blogs.
  35. ^ "Microsoft buys the Swedish vote on OOXML".
  36. ^ "Microsoft pressed partners in Sweden to vote for OOXML".
  37. ^ "Open XML - The Vote in Sweden".
  38. ^ Kim Haverblad (2007-08-30). "The Swedish OOXML vote has been declared invalid!".
  39. ^ ISOC.nl regrets absence of Netherlands decision on OOXML. Internet Society Netherlands press release, 17 August 2007.
  40. ^ FSFE formal objection to the UK14 meeting. Free Software Foundation Europe. 2007-08-13.
  41. ^ Appeal to the decision by Swiss Internet User Group. 14 August 2007.
  42. ^ "Microsoft Office Compatibility Pack for Word, Excel, and PowerPoint 2007 File Formats". Microsoft. 2006-11-06. Retrieved 2007-11-18.
  43. ^ "Microsoft Office Open XML File Format Converter for Mac 0.2 (Beta)". Microsoft. 2007-07-31.
  44. ^ sherjo (2006-12-06). "Converters Coming! Free and (Fairly) Fast". The Office for Mac Team Blog. Retrieved 2007-03-18.
  45. ^ Forbes (2007-08-02). "Microsoft Delays Office for Mac Release".
  46. ^ "Apple - iWork - Pages". Retrieved 2007-07-08.
  47. ^ "Apple - iWork - Numbers". Retrieved 2007-07-08.
  48. ^ "Apple - iWork - Keynote". Retrieved 2007-07-08.
  49. ^ "OS X leopard Text Edit to Support Office 2007?". uneasysilence.
  50. ^ ""iPhone User's Guide"" (PDF). Apple, Inc.
  51. ^ "Corel WordPerfect Office To Support Open Document Format and Microsoft Office Open XML". corel. Retrieved 2007-01-30.
  52. ^ "GNOME Office / Gnumeric". GNOME.org. Retrieved 2006-07-28.
  53. ^ "Download OpenOffice.org–OpenXML translator". Novell. Retrieved 2007-03-02.
  54. ^ "Issue 79123 - Integrate a first version of the import filter for ooxml wordprocessing documents". OpenOffice.org. Retrieved 2007-07-09.
  55. ^ "docXConverter - Features". panergy. Retrieved 2007-01-31.
  56. ^ ""DocumentsToGo for PalmOS Premium Edition"". Dataviz.
  57. ^ "Datawatch Announces Availability of Monarch V.9.0; Supports Microsoft® Windows Vista™ and Extends Excel Capabilities". 2007-02-27.
  58. ^ Open XML community. "Hear what Ecma has to say about Open XML (paragraph: Key benefits of Open XML)". OpenXMLcommunity.org.
  59. ^ [1]
  60. ^ "Ecma Office Open XML File Formats overview".
  61. ^ -Response Document- National Body Comments from 30-Day Review of the Fast Track Ballot for ISO/IEC DIS 29500 (ECMA-376) Office Open XML File Formats
  62. ^ Interoperability, Choice and Open XML
  63. ^ Rick Jelliffe (2007-07-29). "(comment on) Can a file be ODF and Open XML at the same time?". O'Reilly XML.com. Retrieved 2007-08-06.
  64. ^ a b c d e f g Cite error: The named reference ecma_tc45_white paper was invoked but never defined (see the help page).
  65. ^ "Six thousand pages, one month, no chance..." Retrieved 2007-02-03.
  66. ^ ODF Alliance. "Office Open XML factsheet" (PDF).
  67. ^ Sam Hiser (2007-06-14). "Achieving Openness: A Closer Look at ODF and OOXML" (HTML). ONLamp.com. p. 1. Retrieved 2007-07-12.
  68. ^ Achieving Openness: A Closer Look at ODF and OOXML
  69. ^ Spolsky, Joel (2006-06-16). "My First BillG Review". Joel on Software. Retrieved 2007-01-31.
  70. ^ Brian Jones. "Spreadsheet formula bugs". MSDN blogs.
  71. ^ Stephen A. Hockema, Jutta Treviranus (2007-08-07). "Accessibility Issues with Office Open XML". University of Toronto.
  72. ^ a b Stéphane Rodriguez (2007-08-28). "OOXML is defective by design".
  73. ^ "Objections to Ecma 376 from OpenOffice.org in Denmark" (PDF). OpenOffice.org in Denmark. 2007-06-25. Retrieved 2007-07-03.
