Database update | BUYERS GUIDE
In recent years, many database management system vendors struggled to move their
products from a relational to an object-oriented model. Now the dust has settled a bit.
Most databases described as O-O actually are what end users consider O-O, but this
hasn't always been true. For a while, any product with even a remote link to object
orientation, including relational DBMSes created in an O-O language, was categorized as O-O.
The basic difference among RDBMSes, non-relational DBMSes and O-O DBMSes, or ODBMSes,
is the way they store information and how you retrieve it.
An ODBMS stores data as persistent objects defined by classes. Objects are not just
data elements but include information describing how they relate to other elements. The
way an RDBMS stores and processes data depends on the field in which the data is placed.
Before you decide to take your department's database resources and leap aboard the
O-O bandwagon, remember that although O-O might make it easier for users to retrieve
information, the O-O database model imposes some restrictions.
Chief among such restrictions is that, even more than with an RDBMS, to implement an
O-O database system successfully you must anticipate every type of search that will ever
be done on the data. Searching outside predefined categories is sometimes possible, but
performance degrades markedly.
Unlike with most databases, to achieve maximum performance from an O-O database you must
predefine relationships among all fields during the design phase. Data retrieval will be
easier, but you will be able to retrieve data based only on the predefined relationships.
ODBMSes let you create new data types that consist of the conventional attributes found
in relational models as well as built-in functions and methods.
The technology also supplies significant performance gains by letting you store objects
on the server. Pure O-O databases are an appealing alternative for applications that
require video or other specialized content.
Standards such as the Object Data Management Group's ODMG-93 will help regulate
the technology, but don't look for industrywide adherence to one universal standard.
Computer Associates International Inc., for example, has developed its Jasmine O-O
database using a proprietary language.
Despite recent advances in O-O database technology and standards, it's unlikely
that the relational database will disappear any time soon.
Although O-O databases can offer a significant performance edge over RDBMSes when
queries are known beforehand, the methods used to improve performance in relational
databases are better known, and RDBMS vendors have a big head start on their ODBMS rivals.
On one side of the corporate database battle lines is the tried-and-true RDBMS, used
for decades by government and business; on the other are new O-O database products from
vendors such as Computer Associates, Objectivity Inc. and Poet Software Corp. Sybase Inc.,
long known as a leader in relational technology, has hitched its wagon to the O-O star,
too.
Vendors of conventional RDBMSes will try to stave off the rush to buy O-O products by
adding object capabilities to RDBMS products. Such hybrid approaches maintain backward
compatibility with the relational server model, and take advantage of its robustness and
tremendous installed-user base. At the same time, they let programmers create custom data
types.
Although the approach won't please O-O purists, look for the hybrids to gain
ground as a pragmatic alternative for the enormous installed RDBMS user base.
A good mix of the disparate technologies is shown in the model adopted by such vendors
as Microsoft Corp. and Oracle Corp. The model has the ability to translate an O-O world
into the relational model on the back end.
An example of the free-form or nonrelational DBMS is a concordance or full-text
database in which every word in the data is preindexed. This lets you search for
individual words or combinations of words that meet certain criteria, such as
"salary" within two words of "base."
A free-form DBMS is relatively fast and very flexible but requires periodic reindexing
when new information is added.
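The proximity search described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any product listed here works: the function names `build_index` and `within` and the sample text are assumptions made for the example.

```python
import re
from collections import defaultdict


def build_index(text):
    """Preindex the text: map each lowercased word to its word positions."""
    index = defaultdict(list)
    for position, word in enumerate(re.findall(r"[a-z0-9]+", text.lower())):
        index[word].append(position)
    return index


def within(index, first, second, max_gap):
    """Return (pos_first, pos_second) pairs at most max_gap words apart."""
    return [
        (p, q)
        for p in index.get(first, [])
        for q in index.get(second, [])
        if 0 < abs(p - q) <= max_gap
    ]


# Hypothetical sample text, used only for this sketch.
document = "The base salary is set yearly. Salary reviews follow the base pay schedule."
index = build_index(document)

# "salary" within two words of "base"
hits = within(index, "salary", "base", 2)
```

Because the positions are computed once, searches are fast, but the index has to be rebuilt whenever text is added, which is the reindexing cost noted above.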
An even more basic example consists of a text file you can search using any word
processor's search function. It limits the ways you can combine terms in searches
but, as long as the files aren't too large, it's easy to build as it requires no
predefinition of categories and no preprocessing.
A good example would be a list of all the people you contact, with their names, agency
or company affiliations, and telephone numbers.
As you can enter any sort of text and numeric data in any format, this is, by far, the
most versatile nonmultimedia database you can build. And, if all you want to do is
identify a person or phone number, or locate a possible contact, you can build and use
your database with no added investment.
But the problems are obvious. Such databases must be relatively small, and they provide
no way to produce complex reports or do sophisticated searches.
To get the added functionality, you must move up to an RDBMS. The more sophisticated
software lets you define specific data fields or classes of information.
Ideally suited to an RDBMS is data such as that in payroll records. Each record
contains identical categories, or fields. Fields in a payroll record might include, for
example, first name, last name, base salary and street address.
In creating an RDBMS, you must anticipate the sort of information you will need to
store, often predefining permissible field contents such as all numeric or alphanumeric.
Data is then arranged in rows using indexed fields, and the query engine joins tables
of data based on user-selected key fields.
The more sophisticated RDBMSes use Structured Query Language as the basis for creating
a query, that is, the passing of a request for specific information from a user to a
database engine.
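As a rough illustration of predefined fields, an indexed key and an SQL join, the sketch below uses Python's built-in sqlite3 module. The table names, column names and sample rows are assumptions made up for the example; they follow the payroll fields mentioned above but come from no real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Field contents are predefined: each column has a declared type.
conn.executescript("""
    CREATE TABLE employee (
        emp_id         INTEGER PRIMARY KEY,
        first_name     TEXT NOT NULL,
        last_name      TEXT NOT NULL,
        street_address TEXT
    );
    CREATE TABLE payroll (
        emp_id      INTEGER REFERENCES employee(emp_id),
        base_salary INTEGER NOT NULL
    );
    CREATE INDEX idx_payroll_emp ON payroll(emp_id);
""")

# Hypothetical sample records.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (1, "Ana", "Lopez", "12 Elm St."),
    (2, "Ben", "Katz", "9 Oak Ave."),
])
conn.executemany("INSERT INTO payroll VALUES (?, ?)", [(1, 52000), (2, 41000)])

# The query engine joins the two tables on the key field emp_id.
rows = conn.execute("""
    SELECT e.last_name, p.base_salary
    FROM employee AS e
    JOIN payroll AS p ON p.emp_id = e.emp_id
    WHERE p.base_salary > 45000
    ORDER BY e.last_name
""").fetchall()
```

The join was not designed in ahead of time; any pair of tables sharing a key field can be combined when the query is written.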
RDBMSes have been around for decades, are well-understood and perform well for most
conventional data. Key vendors of RDBMSes are Microsoft, Oracle, and Sybase.
Where RDBMSes begin to falter is in dealing with massive amounts of data. As long as
queries are designed into the system, the O-O paradigm can be much faster. Whereas in an
RDBMS you must define and restrict data characteristics, in an O-O database you must
predefine relationships between data. The kinds of data you can store are more flexibly
defined, but the permissible queries are more restricted.
In an O-O database, data is represented not in fields grouped into records but as
objects. An object is closely related to a record in an RDBMS but goes beyond storing data
to include its possible relationships to other objects. Pointers show relationships
between various objects.
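A minimal in-memory sketch of that difference follows, written in Python. It is not a persistent ODBMS, and the `Agency` and `Employee` classes and the sample names are hypothetical; the point is only that a predefined reference makes the anticipated query trivial, while an unanticipated one falls back to scanning.

```python
from dataclasses import dataclass, field


@dataclass
class Agency:
    name: str
    # Back-references to member objects; left out of repr/compare to avoid cycles.
    employees: list = field(default_factory=list, repr=False, compare=False)


@dataclass
class Employee:
    name: str
    agency: Agency  # direct reference (the "pointer") to a related object

    def __post_init__(self):
        # Register the relationship when the object is created.
        self.agency.employees.append(self)


parks = Agency("Parks Department")
ana = Employee("Ana Lopez", parks)
ben = Employee("Ben Katz", parks)

# Anticipated query: follow the predefined references, no search needed.
coworkers = [e.name for e in ana.agency.employees if e is not ana]

# Unanticipated query: no relationship exists for it, so every object is scanned.
all_employees = [ana, ben]
named_katz = [e.name for e in all_employees if e.name.endswith("Katz")]
```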
RDBMSes shine in an as yet unexplored area of database storage: multimedia. But O-O
databases shine more brightly.
The O-O paradigm lets you more easily deal with multimedia data and yields significant
performance improvements when handling large databases if the queries being made were
anticipated during database construction and relationships were built in.
Often, this kind of predefined query is easy to develop, but problems may come when new
needs are discovered and you want to process the data in new ways, as with data mining,
for example.
Key vendors of true O-O databases include Computer Associates, Objectivity and Poet
Software. Sybase's future development path seems to lead to O-O.
Hybrid ORDBMSes are gaining favor because they retain the vital backward-compatibility
with DBMSes that have been used for decades, while at the same time, programmers can build
flexible new custom data types that can handle data not usually found in RDBMSes, such as
multimedia information.
Microsoft's Object Linking and Embedding database technology, for example, uses
objects on the front end that map to data stored on an SQL server at the back end.
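The general pattern of objects in front and relational rows behind can be sketched as below. This is not Microsoft's API or any vendor's product; `Contact`, `ContactStore` and the table layout are hypothetical names chosen for the example, and SQLite stands in for the back-end SQL server.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Contact:
    name: str
    phone: str


class ContactStore:
    """Hypothetical mapper: callers see objects, storage is a SQL table."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS contact (name TEXT, phone TEXT)")

    def save(self, contact):
        # Object attributes become column values in one row.
        self.conn.execute(
            "INSERT INTO contact (name, phone) VALUES (?, ?)",
            (contact.name, contact.phone),
        )

    def find_by_name(self, name):
        # Rows coming back are rebuilt into objects.
        rows = self.conn.execute(
            "SELECT name, phone FROM contact WHERE name = ?", (name,)
        )
        return [Contact(*row) for row in rows]


store = ContactStore(sqlite3.connect(":memory:"))
store.save(Contact("Ana Lopez", "555-0100"))
found = store.find_by_name("Ana Lopez")
```

The data stays in ordinary tables, so existing SQL tools and reports keep working against it.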
ORDBMSes are important developments, but year 2000-enforced budget restraints combined
with the lack of multimedia requirements in many agencies give conventional RDBMSes an
edge.
In the future, enterprise databases will be able to mix and match data as never before,
thanks to the interoperability of Java running on Common Object Request Broker
Architecture and Distributed Component Object Model.
The technology is gaining a firm foothold today. By 2001, a database manager will have
no trouble thinking in terms of distributed and parallel computing. More Web servers will
get data from multiple sources on the back end, either through mainframes or RDBMS or O-O
databases.
It's likely to be 2001 and not 2000 because, with the magnitude of year 2000
problems facing most agencies over the next year and a half, the only big database
projects likely to be undertaken are those related to testing and fixing year 2000
glitches.
Fixing the code may mean replacing software, but that doesn't end the work. You
not only must port legacy data and retrain users, you also must test to ensure that the
year 2000-ready database you're buying really is year 2000-ready.
At this point in the year 2000 crisis, it's too late to buy software that comes
merely with vendor promises of future year 2000 readiness.
In a case of everything old is new again, Unix databases, specifically the open
source code Linux version, are of rising importance.
You can buy the inexpensive OpenLinux from Caldera Inc. of Orem, Utah, in a commercial
version or download Linux as freeware.
Linux freeware isn't some Mickey Mouse game software; it is a powerful competitor
to full-blown Unix. Oracle, Informix Software Inc. and Computer Associates are porting
versions of their software to this platform.
But no database decision is made in a vacuum. Most government agencies have massive
legacy RDBMSes, and, although offices that need multimedia may want to consider
O-O database technology, most federal agencies will continue to rely on RDBMSes for
decades to come.
Strip away new user-friendly interfaces and you'll find that
the basic mechanism for querying a database has an old face.
The face you'll see is of Structured Query Language or query
by example (QBE).
SQL was developed and implemented in the mid-1970s by IBM Corp.
Users adopted it because it provided the first consistent way to access various databases.
Its many limitations were unimportant because most database users
were mainframe programmers, and SQL, although complex, was simpler than the
programming languages the mainframe elite were used to.
Before PCs became ubiquitous, IBM next developed the less
programmer-oriented QBE.
QBE is so simple that many people probably use the basic query
format developed in QBE without even knowing it.
You're using QBE whenever you search a database by providing
an example of the data you're looking for, such as salaries over $45,000, which might
be specified >45000.
Other programs have moved to graphical interfaces and even use
drag-and-drop tools. But if your database query form uses fill-in-the-blank components, it
is related to QBE.
QBE and SQL are closely related. In fact, QBE was developed as a
front end for SQL, and those fill-in-the-blank elements are often translated into complex
SQL queries.
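How a fill-in-the-blank form might be turned into SQL can be sketched as follows. This is an illustration only, not IBM's QBE implementation: the function `qbe_to_sql`, the allowed column list and the sample table are assumptions for the example.

```python
import re
import sqlite3

ALLOWED_COLUMNS = {"last_name", "base_salary"}  # hypothetical form fields


def coerce(value):
    """Use a number when the example looks numeric, else keep the text."""
    try:
        return float(value)
    except ValueError:
        return value


def qbe_to_sql(table, example):
    """Translate {'column': 'example'} form entries into a parameterized SELECT."""
    clauses, params = [], []
    for column, pattern in example.items():
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column}")
        if not pattern.strip():
            continue  # a blank form field places no restriction
        match = re.fullmatch(r"\s*(>=|<=|>|<|=)?\s*(.+?)\s*", pattern)
        operator = match.group(1) or "="
        clauses.append(f"{column} {operator} ?")
        params.append(coerce(match.group(2)))
    where = " AND ".join(clauses) or "1 = 1"
    # The table name is a fixed, trusted identifier in this sketch.
    return f"SELECT * FROM {table} WHERE {where}", params


# The user leaves last_name blank and types >45000 in the salary field.
sql, params = qbe_to_sql("payroll", {"last_name": "", "base_salary": ">45000"})

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payroll (last_name TEXT, base_salary INTEGER)")
conn.executemany(
    "INSERT INTO payroll VALUES (?, ?)", [("Lopez", 52000), ("Katz", 41000)]
)
matches = conn.execute(sql, params).fetchall()
```

The user never sees the SQL; the form entry >45000 is all that has to be typed.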
As agencies move from mainframe to client-server databases, the
user base splits. Those who work directly with mainframes generally execute searches for
other workers. Users of usually smaller client-server databases stored on network servers
work directly with the database.
As the client-server setup became more common, a new problem
emerged: how to get around the complexity of accessing the server-based data, a task
that's as complex as accessing mainframe data, to let less sophisticated desktop
PC users do their own searches.
By the mid-1990s, even a straightforward-looking PC database such
as Corel Corp.'s Paradox or Microsoft Access could, with the right drivers, link to
data on a server.
Development of front-end query tools cut through much of the
complexity and eliminated the need for all users to understand the underlying structure of
the database they were accessing. Programmers built the understanding into predefined
database elements that users simply selected and linked.
Although the technique simplifies management and end-user
support, it also limits the complexity of queries to only those using predefined elements.
Buying client-server database management systems is trickier than
buying a desktop product for many reasons, not the least of which is pricing.
Prices for the products vary with the hardware, number of users,
kind and amount of data being stored, support costs and conversion services.
Comparison shopping, which includes price as a major factor, is
important but meaningful only when you go to several vendors with a specific
configuration, get quotes and then compare prices. Be ready to spend between several
hundred and several thousand dollars per user depending on your installation.
Most client-server database software vendors will customize their
products to meet user needs.
The descriptions in the chart will help you decide which products
will fit your data.
Angara Database Systems Inc.
3045 Park Ave.
Palo Alto, Calif. 94306
tel. 650-322-1810
http://www.angara.com
Angara Main Memory Database is a memory-resident relational database management system
that you can use by itself or with a conventional disk-based database to improve
performance.
Ardent Software Inc.
50 Washington St.
Westboro, Mass. 01581
tel. 303-294-0800
http://www.ardentsoftware.com
O2 Object Database System is a scalable ODBMS program with Web support. Prices start at
$3,000 per user.
Computer Associates International Inc.
1 Computer Associates Plaza
Islandia, N.Y. 11788
tel. 516-342-5224
http://www.cai.com
The OpenIngres product name covers a family of mission-critical client-server RDBMSes for
desktop through mainframe computers. Jasmine is CA's multimedia ODBMS package. Prices
start at $800 per user.
Computer Corp. of America
500 Old Connecticut Path
Framingham, Mass. 01701
tel. 508-270-6666
http://cca-int.com
Empress Software Inc.
6401 Golden Triangle Park
Greenbelt, Md. 20770
tel. 301-220-1919
http://www.empress.com
Environmental Systems Research Institute Inc.
380 New York St.
Redlands, Calif. 92373
tel. 800-447-9778
http://www.esri.com
Spatial Database Engine is an object-oriented client-server geographic information systems
database engine.
HOPS International Inc.
15105 N.W. 77th Ave.
Miami Lakes, Fla. 33014
tel. 305-827-8600
http://www.hops.com
The Heuristic Optimized Processing System is designed
for terabyte-sized data sets and uses heuristic algorithms to optimize searches. It's
suitable for GIS and data warehousing applications.
IBM Corp.
Old Orchard Road
Armonk, N.Y. 10504
tel. 520-574-4600
http://www.ibm.com
Informix Software Inc.
4100 Bohannon Drive
Menlo Park, Calif. 94025
tel. 415-926-6300
http://www.informix.com
Inprise Corp.
100 Enterprise Way
Scotts Valley, Calif. 95066
tel. 408-431-1000
http://www.inprise.com
InterBase RDBMS is a client-server architecture supporting mixed-architecture networks.
InterSystems Corp.
1 Memorial Drive
Cambridge, Mass. 02142
tel. 617-621-0600
http://www.intersys.com
Caché is a transaction-processing oriented ODBMS with SQL support for legacy data.
MapInfo Corp.
1 Global View
Troy, N.Y. 12180
tel. 518-285-6000
http://www.mapinfo.com
SpatialWare manages spatial data and retrieves data using SQL3 querying.
Micro Data Base Systems Inc.
1305 Cumberland Ave.
West Lafayette, Ind. 47906
tel. 765-463-7200
http://www.mdbs.com
Titanium supports O-O and RDBMS data models.
Micronetics Design Corp.
1375 Piccard Drive
Rockville, Md. 20850
tel. 301-258-2605
http://www.micronetics.com
MSM-SQL is an RDBMS; Unix, NT and other versions are available.
Microsoft Corp.
1 Microsoft Way
Redmond, Wash. 98052
tel. 425-882-8080
http://www.microsoft.com
SQL Server Enterprise Edition is intended for the largest possible NT DBMS apps.
NeoLogic Systems
1450 4th St.
Berkeley, Calif. 94710
tel. 510-524-5897
http://www.neologic.com
NeoAccess, an ODBMS for Windows 95, NT, Mac OS and Unix, costs $750 for a development
license.
NCR Corp.
1700 S. Patterson Blvd.
Dayton, Ohio 45479
tel. 513-445-5000
http://www.ncr.com
NCR Teradata is a parallel RDBMS for Unix and NT.
Objectivity Inc.
301B E. Evelyn Ave.
Mountain View, Calif. 94041
tel. 415-254-7100
http://www.objectivity.com
Oracle Corp.
500 Oracle Parkway
Redwood Shores, Calif. 94065
tel. 650-506-7000
http://www.oracle.com
Pick Systems
1691 Browning Ave.
Irvine, Calif. 92606
tel. 949-261-7425
http://www.picksys.com
D3 is used to build Internet ODBC and SQL client-server databases for Win95, NT and most
Unix platforms.
Poet Software Corp.
999 Baker Way
San Mateo, Calif. 94404
tel. 415-286-4640
http://www.poet.com
Major ODBMS is designed for networks running NT. Prices start at $499.
Red Brick Systems Inc.
485 Alberto Way
Los Gatos, Calif. 95032
tel. 408-399-3200
http://www.redbrick.com
Red Brick Warehouse is a purpose-built data warehouse RDBMS as opposed to software
modified from a conventional RDBMS.
Sequiter Software Inc.
P.O. Box 783
Greenland, N.H. 03840
tel. 403-437-2410
http://www.sequiter.com
CodeBase is an xBase engine for programmers who need to build multiuser and client-server
databases that are compatible with dBase, FoxPro and Clipper on systems running Windows,
Unix, OS/2 and Mac OS.
Software AG Americas Inc.
11190 Sunrise Valley Drive
Reston, Va. 22091
tel. 703-860-5050
http://www.sagafyi.com
Adabas handles large volumes of fast-changing data.
Sybase Inc.
6475 Christie Ave.
Emeryville, Calif. 94608
tel. 510-922-3555
http://www.sybase.com
Tache Group Inc.
1901 S. Harbor City Blvd.
Melbourne, Fla. 32901
tel. 407-768-6050
http://www.tachegroup.com
CQL++ ANSI is a SQL implementation with ODBC extensions and B-tree.
Tandem Computers Inc.
19333 Vallco Parkway
Cupertino, Calif. 94014
tel. 408-285-6000
http://www.tandem.com
NonStop SQL/MP provides data warehouse support in a parallel RDBMS environment.
Versant Object Technology Corp.
1380 Willow Road
Menlo Park, Calif. 94025
tel. 415-329-7500
http://www.versant.com
Versant ODBMS offers client-side processing with server-side control.
W3apps Inc.
310 Lake Crescent
Fort Lauderdale, Fla. 33326
tel. 954-389-8429
http://www.w3apps.com
Jeevan is a platform-independent ODBMS program running under Java.
One testament to the staying power of conventional relational
database management system technology is the growing interest in data mining and data
warehousing.
In data warehousing, an organization's PC- and mainframe
RDBMS data is placed in a single data warehouse in which the data is often measured in
terabytes. Because of the huge size of the databases, mainframes and 64-bit platforms are
typically the workhorses.
In data mining, important or significant traits are unknown
beforehand and must be discovered by working within massive databases that have millions
of records. Data mining takes advantage of fast computers and new software to dig out
often unrelated data in ways that had been impossible.
Because of the database size and the processing power needed to
do queries in a reasonable time, the systems usually run on mainframes.
The size of the databases mined also makes data mining an
unsuitable application for object-oriented DBMSes.
The attraction of data warehousing and data mining is mostly for
businesses seeking more marketing data.
But they have many government applications in scientific areas,
law enforcement and the intelligence community.
Better and faster hardware puts data mining within reach for
agencies. In fact, any agency charged with determining trends can, if it has sufficient
legacy data to make the results meaningful, use data mining.
Using knowledge discovery and data-mining techniques, an agency
can examine factors and trends in population, pollution control, resource management and
criminal activities.
Data mining is already a mainstream trend. Look for it to grow in
importance.
John McCormick, a free-lance writer and computer consultant, has been working with
computers since the early 1960s.