Wikipedia-like search engine to use Apache Lucene – Nutch

Jimmy Wales, founder of Wikipedia, is planning to build an online
commercial search engine that would compete with Google and Yahoo.
Wales plans to combine the user-based technology behind nonprofit
Wikipedia with open source Web-search software called Nutch, which is
part of the Apache Lucene project.

The search engine, code-named Wikiasari, would combine open source
technology and human intervention to deliver more relevant results than
the algorithm-based systems used today, Wales said Tuesday. “Human
intelligence is still the best thing we have, so let’s let humans do
what they do best, and computers do what they do best.”

Wales says the reason for Wikiasari is that search as we know it is
broken: “It is broken for the same reason that proprietary software is
always broken: lack of freedom, lack of community, lack of
accountability, lack of ...”

Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. Nutch is open source web-search
software that builds on Lucene Java, adding web-specific components
such as a crawler, a link-graph database, and parsers for HTML and
other document formats.
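
To give a rough feel for what a text search library does at its core, here
is a minimal sketch of an inverted index, the data structure that libraries
such as Lucene are built around. It is an illustration only and does not use
the Lucene or Nutch APIs; the class name TinyIndex and its methods are
hypothetical, and the tokenizer is a deliberately naive assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: this is not Lucene or Nutch code.
public class TinyIndex {
    // Maps each lower-cased term to the sorted ids of documents containing it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Naive tokenizer: lower-case the text, split on anything that is not
    // an ASCII letter or digit, and record each term for this document.
    public void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Returns the ids of documents containing the term, or an empty set.
    public Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT),
                                     Collections.emptySet());
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add(1, "Nutch builds on Lucene");
        index.add(2, "Lucene is a search library");
        System.out.println(index.search("lucene"));  // prints [1, 2]
        System.out.println(index.search("crawler")); // prints []
    }
}
```

A real library adds much more on top of this idea, such as relevance
ranking and on-disk index storage, while a web-search layer like Nutch
supplies the crawling and parsing that feed documents into the index.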

Nutch has graduated from the Apache Incubator, and is currently a
subproject of Lucene. It is coded completely in Java, but data is
written in language-independent formats.

The use of Nutch and Lucene in the Wikiasari project is expected to
drive fresh interest in both projects and make even better Java-based
search available.

