Smart search

 


A commercial search engine ties together multiple pharmaceutical databases for the Food and Drug Administration's Drug Evaluation and Research Center.

A different search engine is helping the Nuclear Regulatory Commission level the mounds of paperwork for upcoming hearings on radioactive waste disposal at Yucca Mountain, Nev.

At both agencies, the search engines must interface with widely varying formats and repositories.

FDA and NRC decided to give their employees and others access through a browser interface, rather than building data warehouses to aggregate the volumes of information.

"The nice thing is that we don't have to store it all," said Helen Mitchell, enterprise search product manager at the Center for Drug Evaluation and Research, which built a search gateway to 15 data repositories. "All we have to do is point to and index it."

With so many government repositories available, employees often squander time doing the same searches with different search engines, said Dave Connor, federal vice president for search engine provider Convera Corp. of Vienna, Va.

Connor said he has seen analysts at intelligence agencies spend most of their days jumping from one search engine to another, trying to find information on a single topic.

Convera, like other search vendors, sells an alternative approach. Agencies buy a basic search package and add modules to access specific formats, Connor said.

"Most of our government customers collect information from multiple repositories," said John Cronin, government sector vice president at search vendor Autonomy Corp. PLC of Cambridge, England. "They have multiple file servers, databases and intranet Web pages," Cronin said. "In the old days, they'd do a brute-force integration and build a data warehouse. But you don't have to. You can leave things in their various formats and locations, and they can appear to be integrated."

The Center for Drug Evaluation and Research set up a single search page with RetrievalWare software from Convera so its 2,500 scientific reviewers could collect data more efficiently for investigations. The workers "basically can make a single integrated query of all the different libraries," Mitchell said.

To search 15 electronic repositories on the center's intranet, they simply open a browser and type "Enterprise Search" to view the internal collections.

The center, a consumer watchdog for U.S. health care systems, tracks reports about medications and other products after they have been released into the marketplace.

In 1993, the center started a master file of individual drugs. It worked aggressively to make everything available in electronic form at a time when submissions from pharmaceutical companies more often came on paper.

So the center scanned and converted paper documents into TIFF files. More than one researcher could view a file at the same time, and multiple copies didn't consume storage. A pharmaceutical company could refer in subsequent filings to a case number it had submitted previously (on, say, the packaging of a product) rather than resending the information each time.

The staff liked the electronic library, so the center decided to add more data resources. One was a collection of adverse event reports.

FDA gets about 130,000 reports each year of bad reactions to medications and other products.

Those in paper format were scanned and kept on microfiche, with the metadata stored in an Oracle database.
Although the metadata didn't include the full narratives or supporting documents, it gave researchers enough information to search for selected attributes, such as the location of an incident or a victim's age or gender.

Once those files were linked to the search site, researchers could use the metadata to bring up scanned images of the paper files.

In October 2003, the center began indexing another data source: FDA's repository of drug applications in Adobe Portable Document Format, stored in a document management system from Documentum Inc. of Pleasanton, Calif.

This last addition has been the most popular of all, Mitchell said. "Researchers in a meeting can pull up RetrievalWare and search for all reviews of a new drug application. Or maybe they want to search for documents in a particular date range, or they want just the chemistry reviews. They can specify that type of information."

The center set up one library of more than 65,000 documents in a collaboration area, so groups of researchers can maintain directories of items with shared interest. The documents can be in Microsoft Word or Excel, Corel WordPerfect or other formats.

There are other new libraries for pulmonary drugs, biopharmaceutics, terrorist concerns and help-desk documents.

Mitchell is always on the lookout for more libraries to add. A department with information it wants to share only has to notify her about which directories to index.

"It's a work in progress," she said.

At NRC, the approach is slightly different. The commission set up a site called the Licensing Support Network, at www.lsnnet.gov, to corral all the electronic files for the Energy Department's application to house a radioactive waste repository at Yucca Mountain.

The documents reside across many servers of groups participating in the application hearing.

The distributed approach "is a little bit unusual" for building a legal document discovery database, said Dan Graser, administrator of NRC's Licensing Support Network.
But he felt it was the best approach.

"There are always questions about the pedigree of a document that somebody introduces into evidence," Graser said. "We figured it would be better for the parties to maintain custody of their own documents."

Energy's proposal to bury tons of nuclear waste at Yucca Mountain is in hot dispute. The state of Nevada is expected to contest it, as are many nearby counties, the National Congress of American Indians, environmental groups and other parties.

NRC, which will hold the hearing on the application, must provide the parties with electronic access to all legal documents.

Graser and his team decided the best way to make everything available would be through a central search engine, so interested parties would not have to go to multiple sites. Yet there was no justification to build a costly data warehouse for one hearing.

"The parties maintain their own document collection servers," Graser said. "They have custody of their own material. What we do is spider those collections of material nightly."

Currently 20,000 legal briefs, contentions and other papers are online. Eventually, there will be about 15 million pieces of information as legal teams generate more material.

The agency started the portal project in October 2001. It purchased the Intelligent Data Operating Layer Server, an Autonomy software suite with both portal and search features. The purchase was through reseller AT&T Government Solutions of Vienna, Va.

Graser said Autonomy proved to be a good choice because its search software can handle many formats. The documents are in Extensible Markup Language, HTML, PDF, text files, TIFF image files and other formats, said Matt Schmit, the project manager.

NRC also looked at a portal and search suite from Vignette Corp. of Austin, Texas, but Graser said Autonomy's search engine met their needs better.

"These documents are very large, very dense and very rich," Graser said, and also very similar to each other.
With so many common terms, a conventional search system would return far too many hits to be useful.

"One of the things we were looking for was a lot of latitude in relevancy ranking and a natural-language user interface," Graser said.

Relevancy ranking places documents that probably are the most useful at the top of the results. A natural-language interface accepts queries in simple English.

AT&T hosts the search facility in Ashburn, Va., with 18 Hewlett-Packard Compaq servers running Microsoft Windows Server 2003. The site serves public users, and there is a priority version for the participating parties.
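
To make the idea of relevancy ranking concrete, here is a minimal sketch in Python of one textbook technique, TF-IDF scoring. It is an illustration only and is not a description of how Autonomy's or Convera's products rank results. The document IDs, sample text and function names are hypothetical. The sketch shows why terms shared by every document contribute nothing to the score, while a rarer term pushes the matching document to the top.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def rank(query, docs):
    """Return (doc_id, score) pairs sorted by TF-IDF score, best first."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for words in tokenized.values():
        df.update(set(words))

    scores = {}
    for doc_id, words in tokenized.items():
        tf = Counter(words)
        score = 0.0
        for term in tokenize(query):
            if term in tf:
                # A term found in every document gets idf = log(1) = 0,
                # so it cannot separate one document from another.
                idf = math.log(n_docs / df[term])
                score += (tf[term] / len(words)) * idf
        scores[doc_id] = score

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical, deliberately similar documents (not real LSN content).
DOCS = {
    "brief-1": "repository license application waste transport route",
    "brief-2": "repository license application waste groundwater model",
    "brief-3": "repository license application waste seismic hazard",
}

ranked = rank("waste groundwater", DOCS)
```

In this toy collection, "waste" appears in all three documents and so adds nothing to any score. Only "groundwater" distinguishes them, which places `brief-2` first in `ranked`.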

"The nice thing is that we don't have to store" all the information, says Helen Mitchell of FDA's Center for Drug Evaluation and Research.

David S. Spence

NRC's Dan Graser and his team found a central search engine let them avoid building a costly data warehouse.

David S. Spence

Advanced search engines link many data sources




































