The trouble with Web pioneering is that time marches on. Pioneers have to keep pulling more tricks out of their hats to keep up with newcomer sites that flaunt shiny new technologies.
That's how it stands with the Securities and Exchange Commission's once cutting-edge Web site.
Despite good graphics, the site, at http://www.sec.gov, looks a bit old-fashioned, and its search tools are weak. Nevertheless, in the current financial climate, it remains an extremely busy site and a symbol of the government's commitment to posting public information online.
What the SEC site needs is a face-lift, and a big one is in the works, site managers said. A cornerstone of the initiative is work on the massive Electronic Data Gathering, Analysis and Retrieval system, which stores about 70G of corporate financial data. EDGAR visitors account for 80 percent of the site's traffic, said Chuck Woods, chief of the Server Development Branch.
A contractor maintains EDGAR. The site is run internally by SEC staff.
What's in EDGAR? Most corporations are required by law to disclose financial and other information to SEC, and EDGAR collects, validates, indexes and stores that data.
Visitors to the SEC site can fill in forms to search EDGAR, or they can browse the rest of the site to find agency information, contact points, investor advice files, details on filing complaints, small-business information, press releases, proceedings from the Enforcement Division and details about SEC rulings.
Counting EDGAR activity, the site moves up to 30G of information each day, a massive amount by any standard.
Separate from the Web site, EDGAR has its own workflow management system. Each staff attorney responsible for a certain industry finds a group of filings in the in-box every day. On the back end, after a filing is accepted, it's forked off for SEC processing and also sent to a dissemination system for subscribers.
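To put that daily volume in perspective, here is a rough back-of-the-envelope sketch that converts 30G per day into an average line rate and compares it with the nominal 44.736-Mbps rate of a T3, the connection type listed in the site details. It assumes "30G" means 30 billion bytes and that traffic is spread evenly across 24 hours, which real traffic never is, so peak load would be considerably higher than this average.

```python
# Rough estimate only: assumes 30G = 30 * 10**9 bytes and perfectly even traffic.
BYTES_PER_DAY = 30 * 10**9
SECONDS_PER_DAY = 24 * 60 * 60
T3_BITS_PER_SECOND = 44_736_000  # nominal T3 (DS3) line rate


def average_bits_per_second(bytes_per_day: int) -> float:
    """Convert a daily byte count into an average bit rate."""
    return bytes_per_day * 8 / SECONDS_PER_DAY


average_rate = average_bits_per_second(BYTES_PER_DAY)
utilization = average_rate / T3_BITS_PER_SECOND

print(f"Average rate: {average_rate / 1e6:.2f} Mbps")
print(f"Share of nominal T3 capacity: {utilization:.1%}")
```

The average says nothing about bursts around market events or filing deadlines, which is when capacity actually matters.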
But the EDGAR interface has not changed much from the original set of tools SEC acquired from New York University for querying EDGAR via the Web. Another negative is that files available through EDGAR appear as straight ASCII text, not Hypertext Markup Language. They are difficult to read.
The search interface, a Wide Area Information Server, also is out of date compared with readily available and more powerful search tools.
In many ways, the SEC site has been a victim of its own success.
"We're bigger than we ought to be for the tools we have," Woods said. "The WAIS server is showing its age. It's dying under the load. Volume is such that nightly indexing takes forever. Plus, it isn't full-text indexing."
Only the headers of EDGAR filings are indexed. On the main site, press releases and SEC documents are indexed by keyword.
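The difference between header-only and full-text indexing can be illustrated with a toy inverted index. The sketch below is not how WAIS or any SEC system works; the filing records, field names and functions are hypothetical and exist only to show why a term that appears in the body of a filing cannot be found when only headers are indexed.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(filings, fields):
    """Map each term to the set of filing IDs whose chosen fields contain it."""
    index = defaultdict(set)
    for filing_id, filing in filings.items():
        for field in fields:
            for term in tokenize(filing.get(field, "")):
                index[term].add(filing_id)
    return index


def search(index, query):
    """Return the IDs of filings that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical filings, invented for illustration only.
FILINGS = {
    "0001": {
        "header": "COMPANY: Example Widgets Inc FORM: 10-K",
        "body": "Risk factors include year 2000 readiness of supplier systems.",
    },
    "0002": {
        "header": "COMPANY: Sample Holdings Corp FORM: 8-K",
        "body": "The company announced a merger agreement.",
    },
}

header_index = build_index(FILINGS, ["header"])
full_text_index = build_index(FILINGS, ["header", "body"])

print(search(header_index, "merger"))     # no match: the term is only in a body
print(search(full_text_index, "merger"))  # matches filing 0002
```

The trade-off is size: a full-text index covers far more terms, which makes nightly rebuilds over a large archive more expensive.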
The obvious starting point for a face-lift is to replace WAIS with a more powerful search engine and convert files to HTML. But even more renovation is necessary. It's not easy to find specific SEC divisions or contact points, for example.
BDM Federal Inc. of McLean, Va., now part of TRW Inc., holds the contract for maintaining EDGAR and recently won a follow-up contract, Woods said. As the contractor rolls out a new system, corporations will begin submitting filings as HTML documents. That should make them more readable without placing pressure on the system to handle translations.
Woods plans to install a search engine from Verity Inc. of Sunnyvale, Calif. Verity's tool set ought to make it possible to customize an interface that streamlines searches across multiple indexes.
Why did SEC wait so long to update that it must now tackle everything at once: new contract, new EDGAR system, new functions and new search interface?
Part of the delay arose from getting the machinery in order.
Moving to a distributed environment will make it easier to meet server demands while rolling out the new pieces. Also, work proceeds slowly on a site that handles legal documents, to avoid data loss and downtime.
Webmasters at other agencies understand the magnitude of the job SEC now faces. They may decide it is advisable to make smaller, iterative changes that keep their Web technologies current.
Best
+ Graphics are clean, and navigation from the button bar is intuitive.
+ The special-purpose search section shows the SEC staff's ability to build products to serve the public well.
+ The About the SEC section tells what the commission does and how it came about; it lists acts of Congress back to the 1930s that define SEC's mission.
+ The rule-making section helps visitors quickly locate proposed rules, final rules, interpretations and notices.

Worst
- Good information is located in strange places. For example, pointers to SEC decisions on year 2000 readiness are buried two levels down on a public statements page. This important issue should get top billing.
- The search interface isn't intuitive enough. Why should a visitor have to enter a full corporate name? Will a partial name yield the same search results?
- There are different search forms for different needs, and it's not always obvious which one to use.
- For some searches, dates are mixed up and records are returned in no particular order.
- The only date-range search options are last week, last two weeks or last month. Beyond that, visitors must search the entire database.
- The small business section isn't updated as frequently as it should be. The latest news section was updated last year.
Hits: 500,000 to 600,000 per day
Volume: Up to 30G moved per day; thousands of other sites link directly to SEC searches
High traffic areas: Main page, search page, enforcement complaint center where visitors find forms for questions and complaints, occasional hot areas such as year 2000 information and press releases
Server hardware: Array of small Unix servers with Pentium Pro processors as the front end to enterprise-class data servers
RAM: 3G
Storage: 250G
Search engine: WAIS
Connection: Switched T3
Internet provider: Digex Inc. of Beltsville, Md.
Overall management: Chuck Woods
Content webmaster: Ruth Pitt
Internal Web development: Fran Rowell
Server maintenance, security and integration of new search tools: User Technology Associates Inc. of Arlington, Va.
Original site development: Joe Segreti and Mark Brickman
Shawn P. McCarthy is a computer journalist, webmaster and Internet programmer for Cahners Business Information Inc. E-mail him at smccarthy@cahners.com.