
Adversarial information retrieval

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Brandelf (talk | contribs) at 21:57, 6 January 2011 (Simplified the opening sentences to get to the point). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Adversarial information retrieval (adversarial IR) is a topic in information retrieval that concerns strategies for working with a data source, some portion of which has been maliciously manipulated. Tasks can include gathering, indexing, filtering, retrieving and ranking information from such a data source. Adversarial IR includes the study of methods to detect, isolate, and defeat such manipulation.

On the Web, the predominant form of such manipulation is search engine spamming (also known as spamdexing), which covers techniques employed to disrupt the activity of web search engines, usually for financial gain. Examples of spamdexing are link-bombing, comment or referrer spam, spam blogs (splogs), and malicious tagging. Other adversarial activities include reverse engineering of ranking algorithms, advertisement blocking, and web content filtering.[1]

The name stems from the fact that there are two sides with opposing goals. For instance, the relationship between the owner of a Web site trying to rank high on a search engine and the search engine administrator is an adversarial relationship in a zero-sum game.[citation needed] Every undeserved gain in ranking by the web site is a loss of precision for the search engine.
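
The loss of precision described above can be illustrated with a small sketch. It assumes precision is measured over the top k results (precision at k); the function name `precision_at_k`, the document identifiers, and both rankings are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Set


def precision_at_k(ranking: Sequence[str], relevant: Set[str], k: int) -> float:
    """Return the fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranking[:k]
    return sum(1 for doc in top_k if doc in relevant) / k


# Hypothetical set of documents that are truly relevant to a query.
relevant = {"a", "b", "c", "d"}

# Ranking with no manipulation: the four relevant pages come first.
honest_ranking = ["a", "b", "c", "d", "e"]

# Ranking in which a spam page has gained an undeserved second place,
# pushing a relevant page out of the top four.
spammed_ranking = ["a", "spam", "b", "c", "d"]

honest_precision = precision_at_k(honest_ranking, relevant, 4)    # 4 of 4 relevant
spammed_precision = precision_at_k(spammed_ranking, relevant, 4)  # 3 of 4 relevant

print(honest_precision, spammed_precision)
```

In this toy case the spam page's gain of one position inside the top four costs the search engine one quarter of its precision at that cutoff, which is the sense in which the two sides have opposing goals.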

Topics

Topics related to Web spam (spamdexing):

Other topics:

History

The term "adversarial information retrieval" was coined in 2000 by Andrei Broder (then Chief Scientist at AltaVista) during the Web plenary session at the TREC-9 conference.[2]

See also

References

External links

  • AIRWeb: series of workshops on Adversarial Information Retrieval on the Web
  • Web Spam Challenge: competition for researchers on Web Spam Detection
  • Web Spam Datasets: datasets for research on Web Spam Detection