Web Start-up Unveils Semantic Wikipedia Search Tool
Posted on Mon, 12 May 2008 07:03:46 CDT | by Luigi Lugmayr
By Eric Auchard
SAN FRANCISCO (Reuters) - Powerset on Sunday unveiled tools for searching
Wikipedia that use conversational phrasing instead of keywords, marking the
first step of its challenge to established Web search services such as Google.
Powerset's technology breaks down the meaning of words and sentences into
related concepts, freeing users from always needing to type the exact words they
want to find.
The closely watched Silicon Valley start-up is offering a way of searching
millions of entries in Wikipedia's online encyclopedia, helping users find
detailed answers to questions rather than isolated links that require further
research.
For example, a user who wants to know how many wives King Henry VIII had (six,
or two, depending on your definition of marriage) can find an answer via
Powerset's service at
http://tinyurl.com/5qpcr9/.
San Francisco-based Powerset is looking to leapfrog the current generation of
services that rely on keyword searches, such as Google Inc, Yahoo Inc,
Microsoft Corp and IAC InterActiveCorp's Ask.com.
"The Wikipedia is becoming a microcosm of the most useful parts of the Web,"
said Greg Sterling, an Internet analyst with Sterling Market Intelligence. "This
offers a powerful way to find what you are looking for against this subset of
the Web."
While still a far cry from letting users search the World Wide Web, Powerset is
using Wikipedia as a trial showcase for how its technology can be used to search
a vast number of other websites using natural language phrases or questions.
Over time, it aims to partner with other high-quality data sites where
information can be organized in a question and answer form that lends itself to
Powerset search techniques. Examples might include financial or patent filings,
the CIA Factbook or Wikipedia-inspired clones, company officials said.
Powerset, which can be found at http://www.powerset.com/, looks beyond words to
try to understand conceptual relationships that get closer to what a user may be
searching for. It analyzes each sentence and whole documents to do so.
Powerset plans eventually to make money selling advertising alongside its search
services. But for now, the 60-employee company consists almost entirely of
computer scientists and linguists. It has no advertising staff and only a
handful of marketing and support staff.
Sterling said it is likely to take years for Powerset to be able to search the
Web on the scale Google now does using statistical ranking techniques to find
relevant Web links.
"What I don't know is how Powerset will perform on the wide open Web. In a
sense, this is a massive prototype using the relatively structured information
of Wikipedia. It is difficult to compare to what Google has built," Sterling
said.
Sterling said a bigger danger to Google would be if rival Microsoft were to
acquire Powerset and incorporate it into other search technologies it has.
Microsoft recently backed off a $44 billion bid for Yahoo that was intended to
create a formidable rival to Google in Web search and online advertising.
"This could become the basis of a Google-killer," Sterling said. "Someone like
Microsoft might want to buy Powerset."
Spokesmen for Microsoft and Powerset declined to comment on rumors of a
potential tie-up between the two companies.
FUN WITH 'FACTZ'
Powerset offers richly annotated ways for searching inside Wikipedia entries to
find related concepts. Called "Factz," these related ideas generate outlines,
summaries and automated answers to users' questions.
"Our system is a little more forgiving," Scott Prevost, general manager of
Powerset, said in an interview on Sunday. "It is not looking for hard-word
matches. We are not searching for exact words, but concepts," he said.
The 2-1/2-year-old start-up licensed natural language processing technology and
related machine processing methods developed over three decades at the Xerox
PARC research centre in Silicon Valley to create new consumer Web search
services.
With the tacit approval of the non-profit Wikimedia Foundation, the organization
behind Wikipedia, Powerset is hosting a copy of Wikipedia's 2.5 million
English-language entries on its own computers, company officials said. This lets
Powerset make links across the breadth of Wikipedia data.
"What Powerset is doing is offering readers a natural-language search interface,
and we think that is an interesting experiment," Mike Godwin, Wikimedia
Foundation's general counsel, said in response to an emailed question about how
the two organizations would work together.
In addition to Wikipedia, Powerset's new service also searches a related
database called Freebase created by MetaWeb, another Web search start-up.
After decades of research and debate, natural language processing is finally
poised to go mainstream, predicted Barney Pell, Powerset's co-founder and chief
technology officer.
"2008 is the year that semantic and linguistic technologies cross over into
widespread consumer use," he said.
(Editing by Louise Ireland)
© Copyright 2007 Reuters.
Photo:
The Powerset homepage is seen in this handout photo. REUTERS/Handout/Powerset