Grub (search engine)

Grub is an open source distributed search crawler platform. On July 27, 2007, Jimmy Wales announced that Wikia, Inc., the for-profit company developing the open source search engine Wikia Search, had acquired Grub from LookSmart.[1] The cost was $50,000.[2]

The project was started in 2000 by Kord Campbell, Igor Stojanovski, and Ledio Ago in Oklahoma City.[3] Undetermined copyright, patent, or trademark rights from Grub, Inc. were purchased in 2003 for $1.3 million by LookSmart, Ltd.[4] For a short time, the original team continued working on the project, releasing several new versions of the software, albeit under a closed license.

Several controversial issues surrounded the Grub project shortly after LookSmart acquired it. Grub sometimes ignored misconfigured robots.txt files on the sites it crawled.[citation needed] Even after the development team addressed these issues, some webmasters continued to blame it for crawling their sites too heavily and for not respecting their robots.txt files.[citation needed]

Another issue was the closing of the source code and the apparent failure to use the crawled data for anything useful, such as a searchable index of the crawled sites. Grub appears to have been used for a short time to seed the URL list for NetNanny, another LookSmart acquisition.

Operations of Grub were shut down in late 2005. The site was reactivated on July 27, 2007, and is currently being updated. The original developers are assisting with the new deployment and investigating the robots.txt issue to ensure the problem does not recur.

Users of Grub can download the peer-to-peer Grub client software and let it run during their computer's idle time. The client indexes URLs and sends them back to the main Grub server in a highly compressed form. The collective crawl could then, in theory, be used by an indexing system such as the one proposed for Wikia Search. Grub can quickly build a large snapshot of the web by asking thousands of clients each to crawl and analyze a small portion of it.
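A minimal sketch of how such a client-server crawl cycle could work is shown below, written in Python. The coordinator host, the /work and /report endpoints, and the JSON payload format are illustrative assumptions only and do not reflect Grub's actual protocol.

# Sketch of a distributed-crawl client in the style described above.
# The server hostname, endpoints, and payload layout are hypothetical.
import gzip
import json
import urllib.request
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href targets from anchor tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_work_unit(server="http://coordinator.example.org"):
    """Ask the central server for a small batch of URLs to crawl (hypothetical endpoint)."""
    with urllib.request.urlopen(f"{server}/work") as resp:
        return json.load(resp)  # e.g. {"urls": ["http://example.com/", ...]}


def crawl(urls):
    """Fetch each URL and record its outgoing links."""
    results = []
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip unreachable pages
        parser = LinkExtractor()
        parser.feed(html)
        results.append({"url": url, "links": parser.links})
    return results


def report(results, server="http://coordinator.example.org"):
    """Compress the crawl results and send them back to the server."""
    payload = gzip.compress(json.dumps(results).encode("utf-8"))
    req = urllib.request.Request(
        f"{server}/report",
        data=payload,
        headers={"Content-Encoding": "gzip", "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    work = fetch_work_unit()
    report(crawl(work["urls"]))

In this scheme the server only hands out small work units and receives compressed results, which is what allows many clients contributing idle time to cover a large portion of the web between them.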

Wikia has now released the entire Grub package under an open source software license. However, the old Grub clients no longer function. New clients can be found on the Wikia wiki.

References