
Content Intelligence: The Future of Search

by Walter Jessen


Forget keywords, forget ontologies … I’ve seen the future of online search and it’s called Content Intelligence.

Mountain View, California-based NetBase launched its Content Intelligence platform today, which is powering a new generation of consumer and B2B content-rich applications. I attended a briefing last week where the company demonstrated its technology. As someone who searches Google and PubMed daily, and who is well versed in constructing complex keyword queries to return the most relevant results, I have to tell you I was extremely impressed.

NetBase

How many times have you run a web search and paged through tens, if not hundreds, of pages of results? Research firm IDC (International Data Corporation) estimates that in 2006 the world produced 161 exabytes (161 trillion megabytes) of digital data, encompassing 70 million blogs and 150 million Web sites. That’s about 5 terabytes per second. By 2010, it’s estimated that number will grow tenfold. Yet traditional keyword search fails to address this explosion of information: it returns too many documents, too many of them irrelevant, and it lacks actionable insights and answers.
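
As a quick sanity check on that per-second figure, here is a back-of-the-envelope calculation in Python. It assumes decimal (SI) units and that the 161 exabytes were produced evenly over a 365-day year; both are my assumptions, not IDC's.

```python
# Back-of-the-envelope check of the data rate quoted above.
# Assumptions (not from IDC): decimal SI units, production spread
# evenly across a 365-day year.
EXABYTE = 10**18   # bytes
TERABYTE = 10**12  # bytes
MEGABYTE = 10**6   # bytes


def bytes_per_second(total_bytes: float, days: int = 365) -> float:
    """Average rate in bytes per second over the given number of days."""
    return total_bytes / (days * 24 * 60 * 60)


total = 161 * EXABYTE
megabytes_per_year = total / MEGABYTE
rate_tb = bytes_per_second(total) / TERABYTE

print(f"{megabytes_per_year:.3g} megabytes per year")
print(f"{rate_tb:.1f} terabytes per second")
```

Under those assumptions the average works out to roughly 5.1 terabytes per second.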

NetBase’s Content Intelligence platform uses state-of-the-art analytics technology to search, parse and summarize information from any number of industry sectors. Content Intelligence addresses search in a new way by reading every sentence inside documents on an ongoing basis, linguistically analyzing them and identifying relationships between entities and attributes. The information is then stored in structured semantic indexes. NetBase CEO and co-founder Jonathan Spier and Jens Tellefsen, VP of Marketing and Product Strategy, talk more about what makes NetBase and the Content Intelligence platform different from traditional search engines in the video below.

The evolution of search

Keyword search, which uses statistics such as the number of incoming links to calculate relevancy and popularity, is context independent, and that’s a problem as the amount of information on the web continues to grow.
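
To make "context independent" concrete, here is a minimal sketch of keyword matching ranked by incoming links. The documents, the link list and the `keyword_search` function are all hypothetical illustrations of the general idea, not how any real search engine is implemented.

```python
from collections import Counter

# Hypothetical toy corpus: document id -> text.
DOCS = {
    "a": "aspirin can cause stomach bleeding",
    "b": "aspirin does not treat stomach bleeding",
    "c": "ibuprofen reduces fever",
}

# Hypothetical link graph as (source, target) pairs.
LINKS = [("b", "a"), ("c", "a"), ("a", "b")]


def keyword_search(query, docs, links):
    """Return ids of documents containing every query term,
    ordered by number of incoming links (most linked first)."""
    terms = set(query.lower().split())
    inbound = Counter(target for _, target in links)
    hits = [
        doc_id
        for doc_id, text in docs.items()
        if terms <= set(text.lower().split())
    ]
    return sorted(hits, key=lambda doc_id: inbound[doc_id], reverse=True)


results = keyword_search("aspirin bleeding", DOCS, LINKS)
print(results)
```

Both "a" and "b" match the query even though they say opposite things about aspirin and bleeding. The ranking reflects link counts only, not what either sentence means.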

[Figure: keyword statistics search]

Linguistic search is different, relying on pre-defined lexicons to extract domain-specific entities from documents. While this is more useful than keyword statistics, lexicons require a lot of time and resources to construct and maintain. Further, linguistic search is often unable to deliver actionable insights and answers from large amounts of data across multiple domains.
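
A rough sketch of lexicon-driven extraction shows where the maintenance cost comes from. The lexicon entries, the labels and the `extract_entities` helper below are made up for illustration; real systems use far larger, curated vocabularies.

```python
import re

# Hypothetical hand-built lexicon: term -> entity type.
LEXICON = {
    "hypertension": "CONDITION",
    "aspirin": "DRUG",
    "headache": "SYMPTOM",
}


def extract_entities(text, lexicon):
    """Return (term, type) pairs for every word found in the lexicon,
    in the order the words appear in the text."""
    found = []
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        if token in lexicon:
            found.append((token, lexicon[token]))
    return found


sentence = (
    "Aspirin may ease a headache, "
    "but lisinopril is prescribed for hypertension."
)
entities = extract_entities(sentence, LEXICON)
print(entities)
```

The drug "lisinopril" is silently skipped because nobody has added it to the lexicon yet, and nothing in the output says how the extracted entities relate to one another.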

[Figure: linguistic search]

In contrast, Content Intelligence focuses on understanding the meaning of sentences — independent of lexicons — by identifying the connection between entities. The relationship between keywords is crucial to the understanding of sentences and that’s where Content Intelligence shines.
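
To illustrate what indexing relationships rather than keywords might look like, here is a deliberately naive sketch. It is not NetBase's method, which was not disclosed in the briefing. The regular expression, the verb list and the `extract_triple` and `build_index` helpers are my own assumptions, and a real system would rely on full linguistic parsing rather than a fixed pattern.

```python
import re
from collections import defaultdict

# Hypothetical, tiny set of relation verbs; a stand-in for real parsing.
RELATION_VERBS = ("causes", "treats", "prevents")
PATTERN = re.compile(
    r"^(?P<subj>.+?)\s+(?P<verb>"
    + "|".join(RELATION_VERBS)
    + r")\s+(?P<obj>.+?)\.?$"
)


def extract_triple(sentence):
    """Return (subject, relation, object) for a simple sentence, or None."""
    match = PATTERN.match(sentence.strip().lower())
    if match is None:
        return None
    return match["subj"], match["verb"], match["obj"]


def build_index(sentences):
    """Map (relation, object) -> set of subjects seen with that pair."""
    index = defaultdict(set)
    for sentence in sentences:
        triple = extract_triple(sentence)
        if triple is not None:
            subj, verb, obj = triple
            index[(verb, obj)].add(subj)
    return index


SENTENCES = [
    "Smoking causes hypertension.",
    "Exercise prevents hypertension.",
    "Lisinopril treats hypertension.",
    "Hypertension causes stroke.",
]
index = build_index(SENTENCES)

# "What treats hypertension?" becomes a direct lookup.
print(index[("treats", "hypertension")])
```

All four sentences contain the keyword "hypertension", but the index separates its causes, its treatments and its effects, which is the kind of distinction keyword statistics cannot make.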

[Figure: Content Intelligence search]

NetBase believes that Content Intelligence will solve the problem of information overload, enriching existing content and surfacing high quality, meaningful, contextually aware insights. NetBase’s semantic index can be deployed to access any content and is highly scalable. The Content Intelligence platform doesn’t rely on hard-wired or human-edited data to identify relevant relationships within content.

Elsevier’s illumin8

Last month, NetBase announced that it’s continuing its customer relationship with science and health information publisher Elsevier. Elsevier has been using NetBase’s Content Intelligence platform to power illumin8 since early last year. illumin8 is a web-based research tool that integrates natural language search technology with content from Elsevier’s full-text scientific articles, millions of scientific abstracts, patents and billions of web sources to give users actionable solutions for research initiatives. The agreement extends Elsevier’s commitment to NetBase for another three years. To see illumin8 in action, view an illumin8 search on “chip cooling”. The interface is designed to return results organized in categories including organizations, products, people, approaches, benefits and related results.

NetBase for Healthcare

NetBase has also announced that it is expanding its focus and marketing to other industries, in particular healthcare. At a time when healthcare professionals and biomedical researchers are looking to collaboratively leverage research, intelligence and new technologies, cutting the time it takes to find relevant information has become all the more important. NetBase for Healthcare enables researchers, doctors and patients to quickly and efficiently search a vast and rapidly expanding number of books, medical journals, databases, Web sites and patient records, creating a new way to discover and use medical content. For example, doctors and nurses can look up a condition like hypertension and instantly see an organized, up-to-the-minute summary of its causes, effects, symptoms, complications, side effects and treatments. Additionally, consumers and patients could do their own research for alternative treatments, suggested lifestyle changes, or less expensive medications or treatments to cut down on increasing healthcare costs.


[Figure: NetBase health solution]

As the amount of digital data on the web continues to increase, more sophisticated solutions are necessary to find the actionable insights and answers we need without sifting through thousands of documents. NetBase’s Content Intelligence platform is unique in its ability to understand every sentence of every document without lexicons or human editing and extends search by an order of magnitude over previous approaches. illumin8 was designed specifically for corporate R&D knowledge workers. With NetBase’s expansion into healthcare, we should soon begin to see their search platform become more widely available.

Now, if only I could use this search technology on PubMed directly (PubMed currently uses keyword search). One has to wonder why scientists are using antiquated technology based on keyword statistics to search through cutting-edge biomedical research literature. National Library of Medicine, are you listening?


Posted on Wednesday, April 22, 2009

Topic: Technology








5 responses to "Content Intelligence: The Future of Search"


  1. Mickey Schafer commented on April 22nd, 2009:

    Very interesting stuff, and I am looking forward to running NetBase through some searches to see what happens. Maybe b/c I’m so used to using key terms and sifting info myself, I’ve found some of the “additional” info provided by some search engines (Iseek, for instance) a bit constraining to get around.

    The one difficulty that no search engine can address, though, is access. In healthcare, many practitioners have reading rights only to the publications provided as part of their professional affiliations. Some will also buy into MDConsult, since it is “relatively” inexpensive. How will NetBase address accessibility? Or is it planning to work with publishers, such as the case with Elsevier, to integrate the search platform with existing content? Or have a separate limiter for OA only?

  2. Mickey Schafer commented on April 22nd, 2009:

    Oh, I see — should’ve looked before I commented! This is a product/service to be integrated into an existing “database” (is that the correct idea?) not a stand-alone search engine like novoseek. I wish I didn’t have to register to try it out!

  3. Jens Tellefsen commented on April 23rd, 2009:

    That is correct. Our current business model is to use our Content Intelligence platform to power other companies’ search applications, such as those of publishers, portals and media companies. We plan to have a free health search demo application up in a few weeks that will include PubMed and other healthcare-related data sources.

  4. Mickey Schafer commented on May 2nd, 2009:

    Thank you, Jens! I’m curious if the Content Intelligence approach incorporates semantic content at the paragraph level, too? While semantic relationships in sentences are important, they are only one part of discourse, and across a paragraph, the import of any particular relationship may be less (or more) weighty than what happens in a single sentence. For example, pronouns (and other shorter lexical forms) usually refer to well-known entities in a paragraph and indicate what is “understood” to be the central topic of the paragraph. Does your system incorporate a formula/algorithm for these kinds of relationships? They are quantitative in nature, so seem like viable candidates, though I am not a code developer (but I had one sit on my doctoral committee and he asked me to model stuff for him — some were easy; for some, I just told him “no”!).

  5. Jens Tellefsen commented on May 3rd, 2009:

    That is a great question, Mickey. Today, Content Intelligence does not, in the strictest sense, do what’s called anaphora resolution (e.g. figuring out what “it” refers to in a sentence). We use other tricks to capture context from prior sentences as a back-off. Strict anaphora resolution at massive internet scale has, as far as we know, not been done. This is something we are working on and hope to support as a robust commercial feature (anything is possible in the lab) soon.



