Army center enhances intranet with text-mining app
SemioMap can recognize HTML and Notes documents. Using SemioMap, the Center for Army Lessons Learned at Fort Leavenworth, Kan., has mapped 50,000 pages' worth of text onto a single intranet screen. The Web text-mining application from Semio Corp. of San Mateo, Calif., represents key concepts in large volumes of unstructured text data without any initial tagging.
Semio's just-released SemioMap 2.1 handles direct connections to Lotus Development
Corp. Domino or Notes servers and Web servers.
"Text mining is the logical extension of data mining," said Claude Vogel,
Semio chairman and chief executive officer. Analogous to data mining of structured
numerical data, it complements search tools such as Web crawlers, intranet search engines,
document management systems and push technology applications.
SemioMap combines lexical processing, information clustering and graphical display with
the company's patented semiotics, which Vogel said is the formal study of signs
carried by patterned communications.
SemioMap displays the relationships it uncovers as a graphical map directing users to
information hidden in massive volumes of text. The database that users build with SemioMap
contains all the concepts, relationships, patterns and links back to documents, Vogel
said.
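To make the idea of a database of concepts, relationships and links back to documents more concrete, here is a minimal Python sketch. It is not Semio's patented method, which the article does not detail. It assumes a far simpler approach: single words stand in for concepts, and two concepts count as related when they appear in the same document. The names `STOP_WORDS`, `extract_terms` and `build_concept_map` and the sample reports are hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical stop-word list; real lexical processing would be far richer.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "on", "for", "with"}

def extract_terms(text):
    """Lowercase the text and return its distinct non-stop-word terms."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}

def build_concept_map(documents):
    """Build a toy concept map from a dict of {doc_id: text}.

    postings maps each term to the set of documents that contain it
    (the links back to documents); links maps each sorted pair of terms
    to the number of documents in which both appear (the relationships).
    """
    postings = defaultdict(set)
    links = defaultdict(int)
    for doc_id, text in documents.items():
        terms = extract_terms(text)
        for term in terms:
            postings[term].add(doc_id)
        for pair in combinations(sorted(terms), 2):
            links[pair] += 1
    return postings, links

# Invented sample data, for illustration only.
docs = {
    "report-1": "Convoy operations in the desert",
    "report-2": "Desert convoy maintenance",
    "report-3": "Maintenance of radio equipment",
}
postings, links = build_concept_map(docs)
```

In this toy data, `postings["maintenance"]` leads back to the two reports that mention maintenance, and `links` shows that "convoy" and "desert" occur together in two reports. No tagging or predefined query is needed to surface that pairing.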
"As in data mining," he said, "you don't need to know at the outset what you are
looking for."
SemioMap also can work on unstructured data in transactional systems, said Gail
Claspell, Semio marketing director. It uses the structured fields in transactional systems
as a way of slicing and dicing the data in the unstructured fields, Claspell
said.
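The slicing Claspell describes might look something like the following Python sketch, which groups records by one structured field and counts the words in each group's free-text field. The record layout, the field names `region` and `note` and the function `terms_by_field` are assumptions made for illustration, not part of SemioMap.

```python
import re
from collections import Counter, defaultdict

def terms_by_field(records, field, text_field):
    """Group records by a structured field and count words in their free text."""
    counts = defaultdict(Counter)
    for record in records:
        words = re.findall(r"[a-z]+", record[text_field].lower())
        counts[record[field]].update(words)
    return counts

# Invented transactional records: one structured field, one free-text field.
records = [
    {"region": "east", "note": "late delivery, damaged box"},
    {"region": "east", "note": "late delivery again"},
    {"region": "west", "note": "wrong item shipped"},
]
by_region = terms_by_field(records, "region", "note")
```

Here `by_region["east"]` counts "late" and "delivery" twice each, while `by_region["west"]` does not contain them at all, so the structured field exposes a pattern that sits in the unstructured notes.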
The core functions of SemioMap are written in C. Administrative interfaces and client
software are in Java.
Semio used technology from Inso Corp. of Boston and Adobe Systems Inc. of San Jose,
Calif., to make SemioMap recognize most document formats, including Hypertext Markup
Language and Lotus Notes documents.
Vogel said Semio maps can combine any of the document file formats for exchange via
e-mail.
SemioMap 2.1 starts at $5,000 with a Java applet client and server software that runs
under SunSoft Solaris or Microsoft Windows 9x and NT.
An entry-level server license for processing a 100M data set is $15,000, Claspell said.
Semio provides support via e-mail only, he said.
Although there is no limit on the number of content maps users can create with
SemioMap, the largest data set it can process at any given time is 7G, Claspell said.
Future versions of the software will support 30G data sets.
A view of the Army's SemioMap application appears at http://call.army.mil.
Contact Semio at 650-638-3330.