User:Proteins/Writing scripts for Wikipedia
Scripts are amazing. They give you nearly unlimited power to analyze Wikipedia articles, to modify their appearance and even to add new elements. For example, you can count the number of polysyllabic words (analysis), color the words according to their number of syllables (modification) and create interactive dialogs for the reader (addition). In general, scripts do not affect the underlying article, the one stored in the database, so multiple people can view the same article according to their own preferences by using different scripts.
Scripts are also not hard to write! You need to know some HTML tags, and you have to learn how browsers represent the HTML internally, in a so-called DOM tree. Once you've learned those things, and mastered a few commands in JavaScript, you can do anything. There's already a WikiProject devoted to scripts, but this essay was written in the hope that Wikipedians might appreciate a slightly simpler introduction to scripts. Please let me know if something is unclear or incorrect.
Wiki markup, HTML code and the DOM tree
Wikipedia articles can be represented in three forms: as wiki-markup, as HTML and as a DOM tree. This section explains the difference, and how you can see and modify the same article in each of its forms.
Wiki-markup
A typical article on Wikipedia is written in standard wiki-markup. For example, bold-faced words are written as '''bold-faced words''' contained between three single quotes. I believe this is the form of the article stored in the Wikipedia database. To see and modify the article in this form, you click on the "edit this page" tab at the top of the article.
HTML code
When you request an article from the Wikipedia database, it is returned to you as HTML code. This HTML code is generated by the MediaWiki software from the underlying wiki-markup. You can see this HTML code by clicking on "View page source" or "View source" in your browser. Your browser takes the returned HTML code and renders it for you into the beautiful webpage you see before you. You can modify the HTML code directly using JavaScript, but it's difficult; it's much better and more customary to modify the rendering of the page by manipulating the DOM tree, which is the browser's internal representation of the HTML code.
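As a rough illustration of how wiki-markup maps onto HTML, the toy function below turns the bold markup described earlier into B tags. The name boldToHtml is made up for this example, and the real MediaWiki parser is far more elaborate; this is only a sketch of the idea, not how MediaWiki does it.

```javascript
// Toy illustration only: converts '''bold''' wiki-markup into <b>bold</b>.
// boldToHtml is a hypothetical name; it handles nothing but simple bold text.
function boldToHtml(wikiText) {
    // Non-greedy match so that two bold phrases on one line stay separate.
    return wikiText.replace(/'''(.+?)'''/g, "<b>$1</b>");
}
```

Calling boldToHtml on a sentence with two bold phrases wraps each phrase in its own pair of B tags.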
Here are instructions to view the HTML code in different browsers:
- In Firefox 3, you can type "Ctrl-U" and a pop-up window should appear containing the HTML code. Alternatively, you can click on the "View" menu, which is located in the top menu bar between the "Edit" and "History" menus. Within the View menu, click on "Page source", which is the next-to-last choice, and the same pop-up window should appear.
- In Opera, typing Ctrl-U also shows the HTML code in a pop-up window. Alternatively, you can click on the "View" menu, which is located in the top menu bar between the "Edit" and "Bookmarks" menus. Within the View menu, click on "Source", which is the sixth choice, and the same pop-up window should appear.
- In Google Chrome, typing Ctrl-U also shows the HTML code in a pop-up window. Alternatively, you can click on the "control this page" icon to the very right of the search bar. On the resulting submenu, click on "Developer", and on its submenu, click on "View source".
- In Safari, typing Ctrl-ALT-U shows the HTML code in a pop-up window. Alternatively, you can click on the "View" menu, located as usual between the "Edit" and "History" menus in the top menu bar. Under the View menu, you can click on "View source" for the same pop-up window.
- In Internet Explorer 7, click on the "Page" menu, which is to the left of the "Tools" menu, on the line just above the frame showing the article. Third from the bottom of that menu is the choice "View source", which will open the HTML in a pop-up window using Notepad.
The DOM tree
Every browser converts the received HTML code into a DOM tree; the word "DOM" stands for "Document Object Model". It's a tree, meaning that it arranges the HTML elements into a hierarchy in which each element sits under exactly one parent element. You can view this tree as described in the next section, and you can modify this tree however you wish using JavaScript.
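To make the hierarchy concrete, here is a minimal sketch that lists the element nodes below a starting node, indenting each one by its depth. The function name describeTree is invented for this essay; it relies only on the standard node properties nodeName, nodeType and childNodes.

```javascript
// Hypothetical helper (not part of MediaWiki): lists the element nodes
// below `node`, indenting each name by its depth in the tree.
function describeTree(node, depth) {
    depth = depth || 0;
    var indent = "";
    for (var i = 0; i < depth; i++) {
        indent += "  ";
    }
    var lines = [indent + node.nodeName];
    var children = node.childNodes || [];
    for (var j = 0; j < children.length; j++) {
        // nodeType 1 marks an element node; text and comment nodes are skipped.
        if (children[j].nodeType === 1) {
            lines = lines.concat(describeTree(children[j], depth + 1));
        }
    }
    return lines;
}
```

In a browser, describeTree(document.body).join("\n") should give one line per element, with children indented under their parents.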
How to view the DOM tree in your browser
Most browsers allow you to see the DOM tree, which is the browser's internal representation of the webpage. The following instructions should allow you to see it in different browsers:
- In Firefox 3, the best approach is to download an add-on known as "DOM Inspector". Once added, it should appear under the "Tools" menu in the top bar of the browser, which is next to the "Bookmarks" menu. DOM Inspector can also be activated using the keycode Ctrl-Shift-I.
- In Google Chrome, right-clicking on any part of the page summons a menu. At the bottom of that menu is the choice "Inspect element", which shows the position of the element in the DOM tree.
- In Internet Explorer 7, the Internet Explorer Developer Toolbar, a free download from Microsoft, is used to show the DOM tree. This toolbar can be found at the far right, behind the double arrows that are to the right of the "Tools" menu, which is itself to the right of the "Page" menu.
- In Safari, click on the "Develop" menu and select the choice "Show Web Inspector". The Develop menu is located in the topmost menu bar, between the "Bookmarks" and "Window" menus. If the Develop menu is not there, click on the "Edit" menu and select its last element, "Preferences". A window will pop up, on which you choose the last tab, labeled "Advanced". At the bottom of the Advanced screen is a checkbox labeled "Show Develop menu in menu bar." Clicking this checkbox should introduce the Develop menu in the menu bar.
- In Opera, the equivalent DOM inspector can be turned on by clicking on the "Tools" menu in the top menu bar (sandwiched between the "Widgets" and "Help" menus). Under the Tools menu, click on the "Advanced" submenu, and from the resulting sub-sub-menu, choose "Developer Tools". This should turn on an analysis system at the bottom of the screen, which incidentally can also be detached into a window of its own. Within this analysis window, clicking on the "DOM" tab should reveal the DOM tree. One drawback of this inspector seems to be that it does not reveal the changes in the DOM tree after your script has run. Instead, it reloads the webpage afresh, always showing the original unmodified DOM tree.
The DOM tree of typical Wikipedia pages
Inspecting the DOM tree of Wikipedia articles will reveal a common architecture. The main content of the article is contained inside a DIV element with the id label "bodyContent"; to reach this crucial node, however, you need to drill down a few levels. The bodyContent node is found under the "content" node, which in turn is under the "column-content" node, which in turn is under the "globalWrapper" node, which in turn is under the standard BODY node, which is under the HTML node, which is under the "document" node, the top of the DOM tree. Thus, to reach bodyContent, you need to follow the sequence of child-nodes (sometimes called a "trail" through the document, or an XPath):
document → HTML → BODY → globalWrapper → column-content → content → bodyContent
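A trail like this one can be recovered for any node by following its parentNode links upward. The sketch below does that; trailTo is a made-up name, and it assumes the page layout described above. The top of the tree reports its nodeName as "#document" rather than "document".

```javascript
// Hypothetical helper: returns the names on the trail from the top of the
// tree down to `node`, found by following parentNode links upward.
function trailTo(node) {
    var names = [];
    for (var current = node; current; current = current.parentNode) {
        // Prefer the id label when there is one; otherwise use the tag name.
        // unshift puts each ancestor in front, so the top of the tree ends up first.
        names.unshift(current.id || current.nodeName);
    }
    return names;
}
```

In a browser, trailTo(document.getElementById("bodyContent")).join(" → ") should produce a trail of the same shape as the one shown above, provided the page has a node with that id.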
Why are so many levels necessary before getting to the main article? The MediaWiki software uses these other levels to add all the extra decorations found on the page. For example, the user commands along the upper edge at the right, such as your user name, your user talk page, your preferences, etc. are found under the "column-one" node, which is the sibling node of the "column-content" node. So are the tabs at the top of the page such as "article", "talk", "edit this page", etc. as well as the menus for navigation, search, interaction and toolbox in the left-hand column. Because these are placed in a separate node, they can be located and manipulated independently from the content.
Looking inside the bodyContent node using a DOM inspector reveals all the HTML code that makes up the article. For example, typical section headings are contained under H2 nodes, whereas successive subsections are contained under H3, H4 and H5 nodes. Normal text is contained in paragraph nodes labeled "P". Unordered (that is, bullet-pointed) lists and ordered (that is, numbered) lists are contained under UL and OL nodes, respectively; individual items in both cases are contained under LI (list item) nodes. Indentation corresponds to discursive lists; these are labeled with a DL, and the indented text is contained under a DD node. In some cases, a DL list is actually a definition list, one that has defined, boldfaced terms contained under a DT node; these terms are generated using an initial semicolon in wiki-markup. Larger-scale groupings of HTML nodes can be made using DIV and SPAN tags.
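The same survey can be done by a script instead of by eye. The following sketch tallies how many element nodes of each kind (H2, P, LI and so on) lie below a starting node; countTags is a hypothetical name chosen for this essay.

```javascript
// Hypothetical helper: counts element nodes below `root` by tag name,
// e.g. how many H2, P, or LI nodes the article contains.
function countTags(root, counts) {
    counts = counts || {};
    var children = root.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        var child = children[i];
        if (child.nodeType === 1) {            // element nodes only
            var name = child.nodeName.toUpperCase();
            counts[name] = (counts[name] || 0) + 1;
            countTags(child, counts);          // descend into nested elements
        }
    }
    return counts;
}
```

In a browser, countTags(document.getElementById("bodyContent")).H2 would then be the number of top-level section headings, or undefined if the article has none.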
JavaScript and the DOM tree
JavaScript is a small, simple language that bears a superficial resemblance to the programming language C. It has been called the "assembly language of the web", meaning that it is a fundamental language for interacting with web-pages; every detail of the web-page's content and layout is open to analysis and modification. Despite the similarity in its name, JavaScript is not based on the Java programming language.
Scripts written for Wikipedia are a form of client-side scripting, meaning that the computations are carried out on the computer viewing the web-page (your computer, the client) and not by the computer providing the HTML code (the Wikimedia Foundation computer, the server). More specifically, the computations are carried out by the browser program. For technical reasons, JavaScript is much slower than more basic languages such as C. Therefore, scripts should not involve intense computations, lest the browser slow down. However, given the speed of modern computers and of modern JavaScript interpreters (particularly on Google Chrome), it is unlikely that a typical script-author will write an overly demanding script, unless they accidentally provoke an endless loop.
This use of JavaScript is known as client-side JavaScript, the most common form. JavaScript could be used as an independent computer language such as C, but its slowness makes it poorly suited for such purposes.
Instead of buying an expensive reference book on JavaScript, I've found nearly everything I need to program efficiently in a cheap (<$10) little (127 pages, 4"x7") book: the 2nd edition of the JavaScript Pocket Reference by David Flanagan, published by O'Reilly. Although published in 2002, the book is adequately up-to-date for writing most scripts (as of October 2008). That may change in the next year or two as Firefox and Google Chrome develop new aspects of JavaScript, such as animation controls. For a few extra tricks, I've used Google to identify solutions on the web.
Getting a handle on elements in the DOM tree
Originally, there were competing models for the DOM tree between Internet Explorer and all other browsers. Thus, different scripts would have to be written for different browsers. Fortunately, the DOM model has been largely—but not completely!—standardized, so that one Wikipedia script is likely to work the same on most browsers.
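One well-known leftover of those competing models is event handling: standard browsers provide addEventListener, while older versions of Internet Explorer provide attachEvent instead. A common way to cope, sketched below, is to test for the method itself rather than for the browser type. The helper name addEvent and its return values are invented for this example.

```javascript
// Hypothetical helper: attaches an event handler using whichever method
// the node provides. This is feature detection, not browser-type detection.
function addEvent(node, type, handler) {
    if (typeof node.addEventListener === "function") {
        node.addEventListener(type, handler, false);   // standard DOM method
        return "standard";
    }
    if (node.attachEvent) {
        node.attachEvent("on" + type, handler);        // older Internet Explorer
        return "legacy";
    }
    return "none";                                     // neither method exists
}
```

The returned string is only there to make it easy to see which branch was taken.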
Note to self: describe here how to determine the browser type and how to work with it.
The highest level objects used in scripts are the document and window objects.
Note to self: describe here how to do basic manipulations of the document and windows.
A specific element in a DOM tree can be obtained through its id code. For example, the bodyContent node can be obtained by the command
var body_content = document.getElementById("bodyContent");
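Note that getElementById returns null when the page has no element with the requested id, so it is prudent to check the result before using it. The sketch below counts the paragraphs in the article body; the function name countArticleParagraphs is made up, and the document object is passed in as a parameter only so that the function can be tried outside a browser.

```javascript
// Sketch: count the paragraphs inside the article body.
// In a real script you would pass the global `document` as `doc`.
function countArticleParagraphs(doc) {
    var body_content = doc.getElementById("bodyContent");
    if (!body_content) {
        // getElementById returns null when no element has that id.
        return 0;
    }
    // getElementsByTagName finds all P nodes anywhere below body_content.
    return body_content.getElementsByTagName("p").length;
}
```

In a browser, countArticleParagraphs(document) gives the number of P nodes under bodyContent, or 0 on a page that lacks that node.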