Talk:Rich Text Format

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Oub (talk | contribs) at 10:04, 11 June 2010 (Add: DOC vs RTF). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

DEC vs. Microsoft

The first paragraph states "developed by vyas in 1987", whereas the "information box" in the top-right states "Developed by: Microsoft". This seems inconsistent.


DEC had nothing to do with the development of the RTF format. It was entirely a Microsoft development effort within the Microsoft Word team. Richard Brodie, Charles Simonyi, and I (David Luebbert) were responsible for the design of the RTF 1.0 format.

The first RTF implementation was shipped with Microsoft Word 3.0 for Macintosh in early 1987. I wrote the first RTF reader and writer for that release of Word. RTF format was listed as a Save As format in that version of Word. RTF files that were opened by MacWord 3.0 were automatically translated into a Word document.

In Mac Word 3.0 and its descendants (all subsequent versions of MacWord and Windows Word), RTF was used as an interlingua for translation to and from other word processing file formats. Foreign file formats (PC Word, Mac Write, DisplayWrite, WordPerfect) were translated into an RTF stream which was fed into the built-in RTF reader to produce a Word format document. When translating Word documents into a foreign format, the Word document was translated into an RTF stream which was passed to a translator to produce the foreign format. With this design it became possible to produce plug-in modules for translating from Word to a foreign format and vice versa, making it unnecessary to link rarely used large translation routines into the main Word executable. This design also made it possible for third parties to ship translation packages on their own without Microsoft's involvement. DLuebbert 20:43, 2 October 2007 (UTC)

Hi David, the "DEC" involvement was added on 7 June 2007 by PGSONIC (talk · contribs), who revised it on 4 October 2007, presumably based on information you provided above. What we really need is reliable sources to verify the early history of the file format. We have a gap between 1987 and 1992, the date on the RTF 1.0 spec. Was the RTF file format static during that period? John Vandenberg 03:31, 5 October 2007 (UTC)
Hi John, I did provide PGSONIC (talk · contribs) with that information and I thank him for making that revision.
From the 1992 date I'm guessing that what you label as the RTF 1.0 spec corresponds to what was shipped with Windows Word 2.0.
It was always a requirement that the entirety of structures and properties that could be stored in a Microsoft Word doc format file would have an equivalent representation in RTF. Whenever a major architectural change was made to the doc format, RTF was changed in parallel so that the newly added structures could be accurately translated.
RTF was always backward compatible with previous versions of Word, so that any new structure descriptions or tags could be ignored by old versions and still produce the best possible translation of the new format.
Whenever a new structure declaration was introduced, we required that it begin with a '{' character followed by a \* tag followed by the structure tag (e.g. \table, \field). The \* instructed the reader to skip to the '}' character that matches the opening '{' character and ignore all intervening text if it could not recognize the newly introduced structure tag. This prevented mysterious uninterpretable text (like WinWord 1.0 field declarations) from appearing in documents produced by back-level RTF readers.
The initial spec (I don't remember the version numbering scheme we used back then) was first published a few months after Mac Word 3.0 was released in 1987. It was provided to development houses that wished to prepare translators from RTF to other word processing formats. The spec was revised whenever a new version of Word was released. The tables implementation used in Mac Word and Win Word was introduced in MacWord 4.0, so \table and all of the table description tags were added to the spec at that time. This would have been 1989. WinWord 1.0 was the first version that used Word's field constructs, so field descriptions were added to the spec at that time. This would have been 1990. MacWord 5.0 was the first version of Word that allowed embedded objects from the earliest implementation of OLE, so the spec was revved to add object descriptions after that version shipped. This would have been late 1991 or early 1992.
DLuebbert 03:21, 12 October 2007 (UTC)[reply]
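The skip rule described above can be illustrated with a minimal sketch. It assumes a toy reader that only recognizes one destination keyword; the function names and the KNOWN_DESTINATIONS set are hypothetical, and real readers also have to cope with details such as binary data that this sketch ignores.

```python
import re

# Hypothetical: the only "{\*\keyword" destination this toy reader understands.
KNOWN_DESTINATIONS = {"field"}

# Matches the start of an ignorable destination: '{', then '\*', then '\keyword'.
_DEST = re.compile(r"\{\\\*\\([a-z]+)")


def find_group_end(rtf, start):
    # Return the index just past the '}' that matches the '{' at rtf[start].
    depth = 0
    i = start
    while i < len(rtf):
        ch = rtf[i]
        if ch == "\\":
            i += 2  # skip the escaped character (covers \{, \} and \\)
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unbalanced braces")


def skip_unknown_destinations(rtf, known=KNOWN_DESTINATIONS):
    # Copy the input, dropping every {\*\keyword ...} group whose keyword is unknown.
    out = []
    i = 0
    while i < len(rtf):
        if rtf[i] == "\\":
            out.append(rtf[i:i + 2])  # keep escapes intact so \{ is not taken for a group
            i += 2
            continue
        m = _DEST.match(rtf, i)
        if m and m.group(1) not in known:
            i = find_group_end(rtf, i)  # jump past the matching '}'
            continue
        out.append(rtf[i])
        i += 1
    return "".join(out)
```

With this sketch, an unknown group such as {\*\newthing ...} disappears from the output, nested braces included, while {\*\field ...} is kept because the toy reader claims to understand it.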
The spec we currently link to for Version 1.0 is identified as "GC0165: RICH-TEXT FORMAT (RTF) SPECIFICATION". As you have indicated there were changes between "Mac Word 3.0 RTF" and that "Version 1.0", it would be good to put our hands on any published information regarding RTF between those two. Perhaps similar text documents accompanied the releases, or documentation shared with business partners interested in writing import/output filters. Thanks for your help, John Vandenberg 06:12, 12 October 2007 (UTC)[reply]
So the information about a DEC connection on [this site], for example, is incorrect. Rick Jelliffe (talk) 06:43, 20 October 2008 (UTC)

WordPad

It is my understanding that the opposite is true. WordPad used to save as .DOC by default and now saves .RTF

I believe you are right and have edited accordingly. I know for a fact that WordPad once supported the Word 6.0 format, but as for defaulting to it, I'm not sure. Can anyone on an older version of Windows verify? --Mark Yen 04:44, 13 August 2006 (UTC)[reply]

Wordpad.exe 5.0.2195.6991, found in an up-to-date Win2K, defaults to saving RTF, but will save as Word 6.0. Wordpad.exe 5.00.1691.1, found on my official 98SE upgrade CD, defaults to saving Word 6.0, but will save as RTF.

Test method:

 close all instances of Wordpad, 
 open Wordpad, 
 enter text, 
 save it as a text file (*.txt), 
 close all instances of Wordpad, 
 open Wordpad, 
 enter text, 
 format the text, 
 select File Save - what format does it offer? 

Neither version appears to remember what type of file you saved as previously.

24.17.178.36 (talk) 02:53, 31 December 2008 (UTC)[reply]

Linux

Is the RTF format widely supported on Linux platforms?

RTF is supported under Linux by Ted (word processor) and Abiword. I am puzzled why these are not listed here.

Where does Apple's .rtf format fit in? (RTF is the standard format for rich text in Mac OS X, in its default text editor TextEdit.)

What's the common rich text format used on Linux systems? (If there is one at all.)

thanks Xah P0lyglut 07:19, 2003 Nov 29 (UTC)

Probably the best example of Linux standard "rich text" is HTML. But there are editors for Microsoft RTF as well. PeteVerdon

Yes, and HTML is widely-used today, probably more than RTF. I've added a reference to HTML. Perhaps we should include a link to the PDF format, as well, since it is similar too? dionyziz 18:49, Feb 13, 2005 (UTC)

DocBook

Btw rtf is supported by most Linux word processors, including OOo. --grin 10:40, 10 May 2006 (UTC)[reply]

RTF code

Perhaps we should include the basic RTF code (for bold/italic/underline and font face/font size)? dionyziz 18:49, Feb 13, 2005 (UTC)

Well, perhaps not. The official specification is already linked for RTF authors; and 'basic RTF codes' would serve no purpose for non-authors. This kind of information is just not encyclopedic.
Herbee 19:23, 17 Feb 2005 (UTC)

Out-of-Date

RTF Spec for Word 97 -- link appears to be out-of-date

A link to the RTF v1.3 spec -- link yields an access denied


Microsoft refers to a March 1987 RTF specification—presumably a pre-1.0 version. Does anyone have access to the actual text?
Herbee 19:18, 25 Nov 2004 (UTC)

Does it support images? What kind?

Does it support images? What kind?

Yes, it does support images. However, I don't know which format they use to store them... dionyziz 18:20, Feb 13, 2005 (UTC)

Added information regarding image support (only the fact that it supports images, actually). Perhaps we should add more? dionyziz 18:49, Feb 13, 2005 (UTC)

The image format seems to be a “metafile” of some sort, though I don’t know if it’s the same as WMF (Windows Metafile). Upon doing a Paste Special in OpenOffice Writer, the dialog box shows “GDI Metafile” and “Picture (Metafile)” as options. As for the details of that format, they’re simple to figure out by opening the RTF with a text editor: the data is encoded entirely in hex digits in ASCII, with a palette coming first if the image has one, and then the image data (palette index numbers or RGB triplets). It seems only 8-bit palettes are supported, and whatever values the image palette hasn't filled are filled with values from the Websafe palette. I think this is enough information about RTF image support, with the exception of the actual name of that format. --Shlomital 20:49, 23 October 2005 (UTC)[reply]

According to the spec linked to in the article it supports a number of metafile formats and in addition PNG, JPEG and macpict (QuickDraw). Mlewan 19:09, 8 December 2006 (UTC)[reply]
This needs to be updated, as there are problems with RTF compressing images. If you add a 100 KB image, it often bloats the RTF file to over 3 MB! --78.33.40.66 (talk) 16:08, 29 September 2009 (UTC)
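A rough size estimate, as a sketch only. The thread above says the image data is written as ASCII hex digits, which by itself doubles the byte count. Growth from 100 KB to several megabytes cannot come from hex alone; one possible explanation (an assumption here, not something stated in this discussion) is that the writing application stores an uncompressed bitmap or metafile instead of the original compressed file. The image dimensions below are hypothetical.

```python
import binascii


def rtf_hex_size(raw: bytes) -> int:
    # Number of ASCII characters needed to store `raw` as hex digits.
    return len(binascii.hexlify(raw))


# A 100 KB compressed image kept as-is: hex encoding doubles it.
compressed_as_hex = rtf_hex_size(bytes(100_000))

# Hypothetical case: the same picture stored uncompressed,
# 1000 x 1000 pixels at 3 bytes (24 bits) per pixel, then hex encoded.
uncompressed_as_hex = rtf_hex_size(bytes(1000 * 1000 * 3))

print(compressed_as_hex, uncompressed_as_hex)
```

Under these assumptions the first figure is 200,000 characters and the second is 6,000,000, which is the order of magnitude reported above.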

Tools

Has someone made an editor/tool for RTF that shows the markup directly like Notepad, but helps out with things like IntelliSense/autocomplete/keyword insight popups? That is, not a WYSIWYG editor like a word processor, but more of a tool for RTF application developers, a bit like how Visual Studio works for HTML.

Well, jEdit has RTF syntax highlighting. —Caesura(t) 02:51, 19 Apr 2005 (UTC)

Unicode support

A Unicode character escape needs to be followed by the character in the current code page which most closely represents it, and the code point also needs to be represented as a 16 bit signed decimal integer. [1] I have corrected this information as the previous version was misleading. Jammycakes 13:32, 18 May 2006 (UTC)[reply]

If you have used \uc0, then you don't follow a Unicode character escape with a substitution character.--Jwwalker 21:18, 23 June 2006 (UTC)[reply]
Does the Unicode format really take a "signed" decimal integer? How can a negative value possibly be valid? It would be my guess that unsigned would be the correct adjective. —Ksn 17:39, 20 August 2006 (UTC)[reply]
Yes, it is signed, according to the RTF specification. Values greater than 32767 have to be specified as a negative number, e.g. 0xFEFF would be -257. The point about \uc0 is correct though. Jammycakes 18:26, 20 August 2006 (UTC)
Thanx. Leave it to Microsoft to specify code points with negative numbers. —Ksn 22:58, 21 August 2006 (UTC)[reply]
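The signed 16-bit convention discussed above can be sketched as follows. The function names are hypothetical and the sketch only handles code points up to U+FFFF. The fallback argument stands for the substitute character that follows the escape; under \uc0 no substitute is written, and a caller could pass a single space instead, which RTF treats as the control-word delimiter.

```python
def to_rtf_signed16(code_point: int) -> int:
    # Map a code point in 0..0xFFFF to the signed 16-bit value used after \u.
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("this sketch only handles code points up to U+FFFF")
    return code_point - 0x10000 if code_point > 0x7FFF else code_point


def from_rtf_signed16(value: int) -> int:
    # Inverse mapping: recover the code point from the signed value.
    return value + 0x10000 if value < 0 else value


def rtf_unicode_escape(ch: str, fallback: str = "?") -> str:
    # Build the escape for one character, followed by its substitute character.
    return "\\u%d%s" % (to_rtf_signed16(ord(ch)), fallback)
```

For example, to_rtf_signed16(0xFEFF) gives -257, matching the value quoted above, and from_rtf_signed16(-257) returns 0xFEFF again.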

Open standard?

Is Rich Text Format an open standard? --Aeon17x 04:22, 22 July 2006 (UTC)[reply]

I'd say it meets the definition of open standard given at the beginning of that page, but does not meet the EU definition given later on that page. By the way, RTF is mentioned on the open format page, but it is not clear to me whether it was saying that RTF is an open format or not. --Jwwalker 06:52, 22 July 2006 (UTC)[reply]
It is not a standard (as defined by a standards body), but is defined by a single company. The text claims patents may be involved, so it isn't free. I removed the reference to "free text format". --NealMcB (talk) 03:59, 22 September 2008 (UTC)[reply]
At the Open format page, under the Examples of Open Formats and then under Text, RTF is listed as a free format. Which one is incorrect?--Dbmikus (talk) 03:00, 7 November 2009 (UTC)[reply]

Human-readable

I think the information on human-readability is obsolete. Yes, RTF is still human-readable, but so are most of the other formats in use today (including .doc, .odt and .html). Usually the formats of today are XML-based. —The preceding unsigned comment was added by AVainio (talkcontribs) .

It is still important to mention this as the format was designed to be human-readable in an era when that was not the norm, especially for Microsoft. Would the following be better:
Unlike most of the word processing formats designed in that era, RTF is human-readable.
btw, .doc is still a closed binary format. .docx is the extension for Microsoft Office Open XML, the human-readable replacement. Jayvdb 13:03, 21 August 2006 (UTC)[reply]
What other word processing formats are not human readable? What other word processing formats designed in that era were not human readable? SPRINT, Wordstar, RTF.... markup languages were the norm. "Unlike most word processing formats" needs to be justified. 203.206.162.148 (talk) 10:59, 11 March 2009 (UTC)
There was an early "Wordstar" era and now a modern XML era -- but in between was a period of almost 20 years when the default was to be not human readable... AnonMoos (talk) 00:26, 31 December 2009 (UTC)[reply]

RichEdit?

I noticed there is nothing on Wikipedia about the Rich Edit controls in Windows. As I remember, they're supposed to handle RTF, aren't they?

Rich

The article should define where the word "Rich" comes from. —Preceding unsigned comment added by 85.166.80.132 (talk) 17:35, 23 January 2008 (UTC)[reply]

Well, while I'm here (to bring up a different point), I'll hazard a guess as to where "rich" came from. I pretty strongly suspect that "rich" is intended to indicate that RTF provides for text that is "richer" than some prior formats, such as plain (ASCII) text.

Exactly what existed before, or what features form the basis for calling it rich, I'm not sure, but I would guess that it may have included things like the ability to display bold, italic, ... fonts, and the ability to "declare" a portion of text to be a paragraph and be formatted accordingly, and similarly for maybe bulleted and numbered lists, and whatever. So "rich" might be considered a synonym for "fancy". -Rhkramer (talk) 18:27, 14 July 2008 (UTC)

Influence of the Navy?

I have this recollection that RTF either originated with the (US) Navy, or maybe was used heavily by the navy in the early days.

I can't remember exactly why I think that, but I seem to recall that some of the choices for saving (and opening) documents might have included options with both the word "navy" and the acronym "RTF" in very early versions of MS Word--I probably mean before Word version 3.0.

Can anybody shed any more light on that?-Rhkramer (talk) 18:27, 14 July 2008 (UTC)[reply]

Oops--I'm not 100% sure yet, but maybe I was thinking of "Navy DIF" (Document Interchange Format). BTW, there doesn't seem to be a Wikipedia article about Navy DIF.-Rhkramer (talk) 18:52, 14 July 2008 (UTC)[reply]

Falsehood in character encoding section

The character encoding section of the article begins, "RTF is an 8-bit format. That would limit it to ASCII, ...." This is a falsehood, since ASCII is a 7-bit specification, and numerous character sets, including all the ISO 8859-* sets, use 8-bit codes. Many of them (including the 8859-* sets) coincide with ASCII for the code points < 128, but EBCDIC, for example, doesn't. I don't know what it should say, but I know what it says now is incorrect. Can someone please correct it? --Largo Plazo (talk) 19:15, 1 April 2009 (UTC)

Actually, RTF generally avoids including hi-bit (128-255) characters in the file... AnonMoos (talk) 00:29, 31 December 2009 (UTC)[reply]
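One way a writer can keep bytes 128-255 out of the file is the \'hh escape, which gives the character's byte value in the current code page as two hex digits. The sketch below is illustrative only: the function name is hypothetical, and Windows-1252 ("cp1252") is an assumed code page, not something stated in this discussion.

```python
def rtf_escape_8bit(text: str, codepage: str = "cp1252") -> str:
    # Encode `text` in the given code page and write every byte >= 0x80 as \'hh,
    # so the resulting RTF fragment contains 7-bit ASCII characters only.
    out = []
    for byte in text.encode(codepage):
        ch = chr(byte)
        if byte >= 0x80:
            out.append("\\'%02x" % byte)
        elif ch in "\\{}":
            out.append("\\" + ch)  # RTF's own special characters need a backslash
        else:
            out.append(ch)
    return "".join(out)
```

For instance, "café" becomes caf\'e9 under the assumed code page, since é is byte 0xE9 in Windows-1252.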

Internet media type

Shouldn't text/rtf read application/rtf? http://www.fileformat.info/info/mimetype/text/rtf refers to something different than the RTF as mentioned in http://www.fileformat.info/info/mimetype/application/rtf (on iana.org, http://www.iana.org/assignments/media-types/text/rtf downloads some email message with the same content as given on FileFormat.info)

Intellectual property

The page currently says "The intellectual property of the format belongs to Microsoft." This seems too vague: are there patents involved? The Microsoft docs don't seem to mention any. Dmurdoch (talk) 10:53, 29 October 2009 (UTC)[reply]

Microsoft help (HLP) files

RTF was the source format for the MS Help Compiler - which rendered subscripts and superscripts as hyperlinks and index entries. So all Help files were written in RTF. The HLP format was replaced with Compiled HTML (CHM), which uses HTML as the markup language. —Preceding unsigned comment added by 203.206.162.148 (talk) 07:26, 27 May 2010 (UTC)[reply]

Interoperability: RTF vs DOC

The article states:

Nevertheless, the RTF format is consistent enough from computer to computer to be considered highly portable and moderately acceptable for cross-platform use.[who?] 

Can somebody provide a link to a systematic study? I have tried several times to deal with documents generated by MS Word 2003 containing complex tables. These files were saved either as DOC or as RTF. I opened them with OpenOffice 3.x using the native OO import filters. Result: the OO import filters work better with binary DOC than with RTF. That is why I am sceptical about the above statement. Oub (talk) 10:04, 11 June 2010 (UTC)