Census sires system to foster broader access to its records
Census Bureau officials are thinking out of the box as they design data tabulation and
distribution systems for the future.
The bureau's Data Access and Dissemination System (DADS), now under construction,
will serve a potentially unlimited number of users, ranging from students and reporters to
congressional staff, demographers and statisticians.
For the first time in the bureau's history, the public will interact directly with
Census' data warehouse, said Enrique Gomez, DADS program manager and systems division
chief for the Decennial Systems and Contracts Management Office.
"We want to give back more data to the public, and new technology allows us to do
that," Gomez said.
Schoolchildren and Census employees alike will go to the DADS home page on the Web to
create online maps from Census data.
Programmers now developing the user interface in Hypertext Markup Language and Java
will try to achieve a common look and feel for all users, Gomez said. The bureau has hired
University of Maryland interface expert Ben Shneiderman to critique each build.
Designing an Internet and intranet system for an unknown number of users is a tough
assignment, said Jim Loving, client manager for IBM Global Services, which won a
five-year, $35 million contract to develop DADS.
"We'll have to size an appropriate pipe to support the anticipated
traffic," probably involving asynchronous transfer mode WAN service, he said.
Data tabulation requests from the public will come into the DADS server through a
firewall and Catalyst 5000 switch from Cisco Systems Inc. of San Jose, Calif., over one or
more fractional T3 circuits. The number will depend on how many of the 6-Mbps service
connections are needed.
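As a rough illustration of that sizing question, the sketch below computes how many
6-Mbps connections would cover a given peak load. The function name and the sample
traffic figures are hypothetical; the article gives no traffic estimates, and real
capacity planning would also allow for headroom and protocol overhead.

```python
import math

# Per-connection service rate cited in the article.
SERVICE_RATE_MBPS = 6

def connections_needed(anticipated_mbps: float) -> int:
    """Return how many 6-Mbps connections cover the anticipated peak load."""
    if anticipated_mbps <= 0:
        return 0
    return math.ceil(anticipated_mbps / SERVICE_RATE_MBPS)

# Hypothetical peak loads in Mbps, chosen only to exercise the function.
for load in (4, 6, 15):
    print(load, "Mbps ->", connections_needed(load), "connection(s)")
```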
The DADS project is unusual in other ways, Loving said. Census is planning to develop
it iteratively, he said, because "with each iteration, you reassess requirements,
take feedback and apply lessons learned to the next iteration."
The iterative development plan is in keeping with requirements spelled out in the
Information Technology Management Reform Act, Loving said. If the results are not what the
agency needs, Census can exit at that point.
"It also lets you make enhancements in manageable chunks," he said.
Census officials have scheduled the first dress rehearsal for March, when they will put
1997 Economic Census data and 1998 decennial test data online.
The current architecture for the data distribution system is three-tiered: a
lightweight Web browser or optional Java-enabled standard client, a middleware component
and a back-end database.
Most users likely will run the lightweight client, Loving said, because hundreds of
concurrent Java downloads might degrade the system's responsiveness.
The middleware layer will provide security and broker the transaction requests via an
Oracle Corp. Web application server and the Hypertext Transfer Protocol.
Data requests from the public and from internal users will be siphoned off to an
Oracle8 Parallel Server database and geospatial server software from Environmental Systems
Research Institute Inc. of Redlands, Calif.
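The brokering role described above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the Request type, the broker function
and both back-end stubs are hypothetical names, and the placeholder security check
stands in for whatever the Oracle Web application server will actually enforce.

```python
from dataclasses import dataclass

# All names below are hypothetical; they illustrate the three-tier flow only.

@dataclass
class Request:
    source: str  # "public" or "internal"
    kind: str    # "tabulation" or "map"
    query: str

def tabulation_backend(query: str) -> str:
    # Stand-in for the back-end database tier.
    return f"table: {query}"

def map_backend(query: str) -> str:
    # Stand-in for the geospatial server tier.
    return f"map: {query}"

BACKENDS = {"tabulation": tabulation_backend, "map": map_backend}
KNOWN_SOURCES = {"public", "internal"}

def broker(request: Request) -> str:
    """Apply a placeholder security check, then route to a back end."""
    if request.source not in KNOWN_SOURCES:
        raise PermissionError(f"unknown source: {request.source}")
    if request.kind not in BACKENDS:
        raise ValueError(f"unsupported request kind: {request.kind}")
    return BACKENDS[request.kind](request.query)

print(broker(Request("public", "map", "population by county")))
```

Keeping the routing table in one place means a new kind of server could be added
without touching the client tier.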
The database will be protected by a series of software filters that IBM is building in
collaboration with senior Census statisticians.
"When a request comes in, we'll examine it before it hits the database to see
whether it compromises confidentiality," Gomez said. DADS will reject requests for
data protected by the confidentiality rules.
"We still aren't sure how we are going to implement the filters," Gomez
said. "It could be a combination of Oracle and some other language."
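Because the design is still open, the following is only one way a filter that examines
a request before it reaches the database might look. It assumes an invented rule: the
finer the geography, the fewer variables a request may cross-tabulate. The limits, the
function names and the rule itself are hypothetical and are not Census confidentiality
rules.

```python
from typing import Sequence

# Hypothetical limits, NOT Census rules: maximum number of variables
# that may be cross-tabulated at each geographic level.
MAX_VARIABLES_BY_GEOGRAPHY = {"state": 5, "county": 4, "tract": 3, "block": 1}

def passes_filter(geography: str, variables: Sequence[str]) -> bool:
    """Return True if the request stays within the assumed limit."""
    limit = MAX_VARIABLES_BY_GEOGRAPHY.get(geography)
    if limit is None:
        # Unknown geographic levels are rejected rather than guessed at.
        return False
    return len(variables) <= limit

def handle_request(geography: str, variables: Sequence[str]) -> str:
    """Screen the request itself, before any database query is issued."""
    if not passes_filter(geography, variables):
        return "rejected"
    return "forwarded to database"

print(handle_request("county", ["age", "sex"]))
print(handle_request("block", ["age", "sex", "income"]))
```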
The IBM RS/6000 Scalable Parallel hardware for the system will have eight 200-MHz
PowerPC 604e nodes connected by a high-speed switch. The individual nodes can be
configured as map servers, database servers or transaction servers.
An IBM Adstar Distributed Storage Manager backup and archive system will manage space
allocation for the IBM 7132 RAID storage disks and IBM Magstar 3570 tape libraries.
For now, the principal management tool is IBM's AIX operating system, Loving said.
In future iterations, the data distribution system might become the electronic commerce
vehicle for disseminating specialized data the bureau provides to the public on CD-ROM,
tapes and paper, Census officials said.
"We want to make sure people have access to all the information we have already
prepared before they go to our databases," Gomez said.
When DADS finally goes live with decennial and economic data, the public will see
firsthand the results of the bureau's reinvention efforts.
"The bureau has always done an exceptional job of data collection and data
processing," Gomez said. With DADS, Census wants to do an equally exceptional job of
sharing its information with the public.