Search fusion

 

Support grows for new way to integrate information analysis and retrieval tools.

Despite the hype surrounding popular search engines such as Google and Yahoo, the kind of keyword search technology those services use is not enough to satisfy many of the government's industrial-strength information management requirements. With agencies riding herd on an expanding amount of unstructured information contained in Web sites, e-mail messages and other file formats, the government must address a critical need for tools to help people make sense of the online information, not simply search for the occurrence of a handful of words.

Those needs have been the catalyst for the development of a diverse range of natural language processing tools that allow users to launch sophisticated queries of vast information stores using simple language. However, product interoperability has not kept pace with the increasing specialization of analysis tools. Different tools excel at certain tasks, such as recognizing and understanding different languages or extracting the who, what, where, when and why contained in a file's content. But the barriers to getting tools to work together have been difficult to overcome.

Now government agencies, including the Defense Advanced Research Projects Agency (DARPA), are exploring a new method for tool integration called the Unstructured Information Management Architecture (UIMA). IBM developed UIMA to coordinate the efforts of its application development teams that work on ways to better manage unstructured information. UIMA defines a common interface that gives application developers a standard way to exchange data among different applications. In the absence of a standard approach, such integration typically requires time-consuming and costly custom work.

IBM recently made UIMA available to the open-source software development community, hoping to spur wider adoption. Signs indicate that is starting to happen.

The origins of UIMA date to five years ago at IBM's development labs.
At that time, 'we had some 200 researchers around the world active in unstructured information management, which we viewed as a vital field, but we were not satisfied with our rate of progress,' said Arthur Ciccolo, department group manager of information and knowledge management at IBM Research and one of the leaders of UIMA's development.

'One of the things that was lacking was a common infrastructure that everyone could use and not have to reinvent each time they needed it,' Ciccolo said. 'Plus, there was no way people could share their results with each other.'

UIMA is IBM's response to those challenges. It is a framework that supports an application from the acquisition of unstructured information in its raw form to its analysis and then use in tools such as databases, search engines and knowledge management systems.

IBM's early work on UIMA caught the eye of officials at the Mayo Clinic, which was already collaborating with IBM on unstructured text processing. The clinic used UIMA to implement a system for extracting knowledge from 20 million clinical notes. The Memorial Sloan-Kettering Cancer Center also worked with IBM to develop a Web-based data warehouse that clinicians and researchers could use to search for various concepts in text-based pathology reports.

The early stages of UIMA also piqued DARPA's interest because agency officials recognized its potential value to military systems. DARPA formed a working group with IBM that brought together university and industry experts in unstructured information management to help drive UIMA's evolution.

It is the first system that allows analytical applications to easily connect with one another as modules that plug into a common architecture through the use of a 'really nice wraparound language,' said Joseph Olive, a DARPA program manager.

'You could do this before, but [users] had to do all of the work themselves to make these modules connect up with each other,' he said.
'Now you can just wrap a [UIMA] envelope around them.'

DARPA has used UIMA in various small projects, Olive said. Now the agency is putting it to large-scale use in the Global Autonomous Language Exploitation program, which aims to develop software that can analyze and interpret large volumes of speech and text in multiple languages.

Because that program employs three lead vendors who each have big teams of subcontractors, he said, UIMA will let the contractors more easily share and distribute their work.

Although UIMA is still relatively early in development, some vendors have already committed to the IBM framework.

Attensity, for example, provides an applications suite that allows customers to extract information from unstructured text and combine it with structured data to quickly provide analysis-ready datasets. Some of Attensity's government customers use a logistics analysis solution that lets them convert unstructured data from equipment service notes and repair logs into relational tables so that automated tools can then detect patterns indicating manufacturing or maintenance problems.

Attensity's products fully comply with UIMA, said Michelle de Haaff, Attensity's vice president of marketing, and the company is working with government clients to develop their own UIMA adapters for plugging in other applications.

'Most agencies already use a wide range of tools in their search and analysis operations,' she said. 'They can simply put out UIMA calls to get data extracted using our tools and then put it back into other applications that also use UIMA.'

This dynamic is especially beneficial for agencies that use tools that they can't reveal much about for security reasons. UIMA provides a standard exchange method external to the classified application, said David Bean, Attensity's co-founder, chief technology officer and vice president of engineering.

Even traditional search companies see an advantage in UIMA compliance.
Exalead, a 6-year-old company that's already well-established in industry and government in Europe, is looking to expand in the United States and sees UIMA as a potential advantage.

Exalead sells an enterprise search platform built on Extensible Markup Language and Java that is complete for most search and retrieval purposes, said Francois Bourdoncle, the company's president and chief executive officer and one of the early developers of the AltaVista search engine. But the government frequently requires third-party tools for its most complicated applications.

'That's why UIMA is interesting to us,' he said. 'It's a document description language that allows those third-party tools to plug in to our product.'

Convera, which already has a large customer base in government agencies through its Excalibur and RetrievalWare search and knowledge discovery platforms, also recognizes the value of interacting with other companies' tools, although it does not support UIMA yet. Its search products already provide integration with other products and services through the use of open interface standards such as XML and Representational State Transfer.

A huge demand exists right now for any capability that enables people to extract information from unstructured data, said Sameer Kalbag, Convera's vice president of product management. But he said it's unclear if UIMA will be the only answer.

Largely to address that need, IBM decided to make UIMA available to the open-source community in January by publishing the source code at SourceForge.net, the world's largest open-source development site. IBM said it intends to move UIMA to a full open-source community development model later this year.

'That will allow other vendors to freely apply it to their product development,' Olive said. 'We do firmly believe UIMA is a growing thing, which is why [the move to] open source is so important.'
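
For readers who want a concrete picture of the plug-in idea described above, the following is a minimal Python sketch, not UIMA's actual API. It assumes a shared document container that every analysis module reads from and writes to, plus a single common `process()` method. All names here (`Document`, `Annotation`, `RegexAnnotator`, `run_pipeline`, `to_rows`) and the sample service note are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Annotation:
    """A labeled span of text produced by one analysis module."""
    label: str
    start: int
    end: int
    text: str


@dataclass
class Document:
    """Shared container: raw text plus whatever the modules have found so far."""
    text: str
    annotations: list[Annotation] = field(default_factory=list)


class Annotator(Protocol):
    """The common interface. A module only has to offer process()."""
    def process(self, doc: Document) -> None: ...


class RegexAnnotator:
    """Hypothetical module that labels every match of a pattern."""

    def __init__(self, label: str, pattern: str) -> None:
        self.label = label
        self.pattern = re.compile(pattern)

    def process(self, doc: Document) -> None:
        for match in self.pattern.finditer(doc.text):
            doc.annotations.append(
                Annotation(self.label, match.start(), match.end(), match.group())
            )


def run_pipeline(doc: Document, annotators: list[Annotator]) -> Document:
    """Run each module in order; none needs to know about the others."""
    for annotator in annotators:
        annotator.process(doc)
    return doc


def to_rows(doc: Document) -> list[tuple[str, str, int, int]]:
    """Flatten annotations into table-like rows for a downstream tool."""
    return [(a.label, a.text, a.start, a.end) for a in doc.annotations]


if __name__ == "__main__":
    note = Document("Replaced fuel pump on 2006-03-14 after pressure fault.")
    modules = [
        RegexAnnotator("PART", r"fuel pump"),
        RegexAnnotator("DATE", r"\d{4}-\d{2}-\d{2}"),
    ]
    for row in to_rows(run_pipeline(note, modules)):
        print(row)
```

Because each module touches only the shared container, a tool from another vendor could be added to the list without changing the existing ones, which is the kind of integration the officials quoted here say previously required custom work.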













