Once upon a time, network management was one of the most straightforward of tasks: Monitor the health of key network devices like routers, hubs, switches and servers, and alert someone when something went down. But in the age of distributed applications, Web services, and cross-organizational processes and dependencies, that approach doesn't fly anymore.

Sure, there are those elements of network management that will never go away: for example, discovery of networked devices and support for standard management protocols such as the Simple Network Management Protocol. But security issues have raised the stakes on even these basic features. Problems with security in earlier versions of SNMP have made support for SNMP version 3 an important check-off for any network management system. And the shift toward IPv6 networks in some agencies also dictates management tools that can take advantage of, or at least tolerate, the new protocol.

Moreover, the way network operations are viewed on a daily basis has changed. The cornerstone for network management had long been the International Organization for Standardization's FCAPS model. FCAPS is an acronym for Fault, Configuration, Accounting, Performance and Security, the categories the model uses to break down network management tasks. While FCAPS is still useful in some areas of network management, it doesn't necessarily reflect the demands being placed on those systems today.

Two years ago, Forrester Research's Jean-Pierre Garbani declared FCAPS "relatively useless in categorizing the current production of infrastructure management solutions because it originated at a time when most applications were located on a mainframe and when PCs were running terminal emulations." Networks have become even more networked, and everything on them depends even more heavily on connectivity.

When the network touches everything, network management becomes more than keeping track of the health of switches in a wiring closet.
A single fault can trigger hundreds, or even thousands, of alerts as it takes down dependent network services. Network administrators and help desks end up wading through a like number of trouble tickets generated by alerts and irate users, trying to find the root of the trouble.

Security issues also have had an increasing impact on network management. A denial-of-service attack or network worm can cause major disruptions to network-based applications. And a degradation of a network connection that carries user authentication data can cause other security problems, or prevent access by users. Unauthorized devices, such as rogue wireless access points, can create a whole new set of security threats.

And to top it all off, network managers must deal with the dark art of monitoring service level agreements. While many network, service and application contracts now include some form of SLA, it's often difficult to accurately monitor compliance with those SLAs without creating another problem: too much management traffic. But not enough monitoring can wind up costing agencies thousands or even millions of dollars for services they didn't get.

To deal with all these emerging issues, IT managers at government agencies (like their peers in the private sector) are increasingly looking for network management systems that help them keep critical business processes going, and do it with fewer people.

As FCAPS has fallen into a lower level of regard, many managers have shifted their focus to another model with a slightly less-aged perspective on networks: the Information Technology Infrastructure Library, originally developed by the United Kingdom's Office of Government Commerce. Rather than dealing with a breakdown of administrative tasks, ITIL uses a process-model view of information infrastructure and is intended to promote "best practices" in IT management.
ITIL has gained worldwide acceptance and has been incorporated into a number of network management products, such as Hewlett-Packard's OpenView platform.

"We've seen some of the leading agencies start to talk about the evolution of network management," said Trent Waterhouse, vice president of marketing for Computer Associates International Inc. "It hasn't shown up in a tender request yet, but it's definitely becoming part of their planning. Agencies have already invested in products that tell them when they have a problem. They're saying, 'That's nice, but it doesn't go far enough.'"

That desire is reflected in the direction taken by leading management software vendors. The latest generation of network management platforms doesn't just provide multiple ways to alert managers when something goes wrong; it also provides more intelligence about what the actual cause is and what services it will affect. The result, in theory, is less downtime for critical systems and fewer trouble tickets for IT departments to run down in order to fix a problem.

Nearly all the major network management platforms (and at least one open-source solution) provide or will soon provide some means of event correlation: a way of intelligently reducing the number of alerts sent to network operations staff by consolidating them as close to the root cause as possible. Most systems can also trigger automated responses that try to correct problems as they happen, performing what's being called "automated problem resolution." In some cases, these responses can be as simple as a script triggered by an event; in others, they may require complex rule logic written in a tool specific to the management platform.

While there are parallels among the features and the goals of these products, there is substantial variation in execution.
Some vendors, such as Computer Associates, IBM Tivoli and Micromuse, while providing basic network management products, sell much of their advanced functionality as part of a larger platform for consolidated operations. The benefits of these systems, such as automation based on network events and sophisticated rule- and policy-based systems for driving automated actions, come as part of a much larger platform, or as add-on modules.

Other software vendors, such as Hewlett-Packard and Enterasys, take more of a "bus" approach to their products, creating software products that stand alone but can be integrated with other components of their suites through a common framework. "That way, customers can install and get value quickly, then add onto their installations later," said Bill Emmett, chief solutions manager for Hewlett-Packard. "It's a good way to get quick return on your investment."

Then there's the open-source approach. OpenNMS, for example, is a Java-based open-source network management tool. Professional support for the software is available from the OpenNMS Group, a network management consulting and support group based in Pittsboro, N.C. Support contracts are based not on how many systems are installed, or how many nodes are managed, but on how many administrators need access to support services.

While the company is small, the software is getting a great deal of attention. In August, OpenNMS was named Best Systems Management Tool at LinuxWorld in San Francisco, beating out products from IBM Tivoli and Novell Inc. By relying on open standards and open source, the community around OpenNMS has been able to rapidly expand its functionality and customize it for a number of different applications.
And the software is being downloaded at a rate of about 4,000 copies per month, according to OpenNMS Group's Tarus Balog.

The group's paying customers include universities and government agencies, and some telecommunications firms that appreciate having the ability to go down into the code of the software and make their own customizations. Swisscom, Switzerland's telecommunications company, "manages 80,000 network devices from one instance of OpenNMS," said Balog.

Considering it only costs $99 a year for two network managers to get support for OpenNMS, that's pretty cost-effective management.
Better management, fewer people
Similar goals, varied execution
Open-source network management
S. Michael Gallagher is an independent technology consultant based in Baltimore.