The life & times of data

 



The Army Surface Deployment and Distribution Command faced an age-old problem: The amount of data it needed to keep was growing faster than the storage system designed to hold it. But instead of purchasing more expensive hardware, the command devised a tiered system for handling the data, wherein older information could be automatically offloaded to cheaper storage devices.

Tiered storage is not a new concept. Administrators have long moved older material over to tape drives in order to save space on more expensive disk drives. But a confluence of factors (cheaper storage technologies, more sophisticated software and an explosion of data) may be pushing agencies toward making more complex plans for saving data. Vendors call it information lifecycle management.

"A lot of storage software is going that way," said Andrew Ferguson, who oversees the administrative storage system at the Energy Department's Brookhaven National Laboratory in Upton, N.Y.

In an information lifecycle management system, an organization will establish a process for moving data from one storage medium to another, and then automate that process using ILM software.

"Information has value and that value changes over time," said David Goulden, an executive vice president of customer operations of EMC Corp. of Hopkinton, Mass. The trick is to develop a strategy to balance the value of that information with the cost needed to manage it. That is where ILM software can help.

Headquartered in Fort Eustis, Va., the Army Surface Deployment and Distribution Command coordinates the movement of personnel and equipment around the globe. When the conflict in Iraq flared up early last year, the command's ranks ballooned with reservists. As a result, the administrative storage space (7.5T of data split between Microsoft Windows-based file servers and Network Appliance Filers) bulged with files, from e-mails to PowerPoint presentations.

Rose DuBoise, a technical adviser to the command, was already adding new drives to the system every three months before increased ranks stressed the system even further. Prior to making yet more storage purchases, DuBoise took a step back and looked at ways of characterizing her data, or describing it based on a number of attributes. Using ManageTone 2004 Lifecycle Suite, from Overtone Software Inc., of Bethesda, Md., she characterized the command's data by size, age, access history and origin.

What she found was interesting: More than 60 percent of the data she was keeping had not been accessed in over 90 days.

With this in mind, DuBoise procured a NearStore R200 from Network Appliance Inc. of Sunnyvale, Calif. NearStore, which is a disk-based "nearline" storage system, doesn't offer the same response speed as network attached storage units, but it costs considerably less. By configuring the Overtone software to periodically sort through data, DuBoise was able to direct older, infrequently accessed data to the NearStore unit, saving storage costs. The software left pointers to the files' new location in their original directories, so most users weren't even aware the files had moved, and they could open them as they had before.

Information lifecycle management may be a new buzzword, but the concept has been around for a while.

Hierarchical storage management had similar goals, said Jeremy Burton, senior vice president and chief marketing officer of Veritas Software Corp. of Mountain View, Calif. With HSM, administrators could offload little-used data to tape units. The end users could still call up the data, although it might take a few seconds longer to come back to the desktop. That was an acceptable tradeoff for the savings in purchasing tape over more expensive disk-based solutions.

The difference between HSM and ILM is that HSM was a technical answer to a problem of managing data, whereas ILM takes a larger, intelligent view of the entire process. What data needs to be accessed quickly? Which data do you need to store, for legal reasons, but don't have much use for? The software allows for greater nuance in making decisions about how and where to store data, and a greater ability to automate movement of data across all media, not just tape.

"If we don't manage it somehow, it will be useless," Burton said of most organizations' tremendous growth in data. But, he said, "you don't want to micromanage files. What you save in disk space, you'll lose in labor costs."

In addition, ILM software helps users get a handle on what type of data they have.

Greg Hilsenrath, vice president of business development for Overtone, said when he meets with customers, he often finds that they don't know the characteristics of their datasets. Is most of the storage space taken up by e-mail archives? By MP3 music files? ILM software can summarize what types of data reside on storage systems, allowing administrators to make better decisions about how to provision storage and what sorts of additional storage might be needed.

For the South Carolina Department of Transportation, ILM has reduced the amount of primary storage it needs and allowed the agency to set up an additional backup site for disaster recovery, according to Lee Foster, system manager for the agency.

At present all the agency's data, what Foster calls the "active set," resides in an 11T fiber-attached EMC Clariion CX600 storage array. It includes everything the agency has created, from e-mail to large design files of bridges.

Foster is in the process of configuring Veritas NetBackup software to separate the data into three different levels: that which will remain in the active set, old data that will move to an existing 160T tape-based Dell PowerVault, and data that hasn't been read or changed in 90 days. This mid-term data will be automatically offloaded onto a 4T serial ATA-based storage system, which the agency recently purchased. Users can still access data in this location, though not as quickly as data stored in the active set.

Developing these tiers will "reduce our active set tremendously," Foster said.

Reducing the active set of data affords the agency some other benefits in addition to reducing its storage costs. A smaller data set means that the storage volume will be defragmented more easily, and backed up more quickly, Foster said. Foster expects that once the system is up and running he could do full replications on a weekly basis.

Reducing the size of the active set will also allow the agency to set up a near-line storage facility, about 100 miles from the main facility, for use in disaster recovery. The new facility will employ IP-based storage and RAID cabinets, with Veritas Storage Replicator software backing everything up from the agency's primary fiber-attached storage system. Using IP and disk-based storage for off-site disaster recovery means the DOT can be back online quicker in the event of a system disruption.

"We can have a lot faster access to our data than if it were stored on tape," Foster said.

As continuity of operations becomes more important to agencies of all kinds, ILM will play a prominent role.
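
The age-based tiering that both agencies describe can be illustrated with a short sketch. The Python below is not how ManageTone or NetBackup work internally; it is a minimal illustration that assumes a three-tier layout like the one Foster outlines and uses the 90-day idle threshold mentioned above. The one-year archive cutoff, the tier names and every identifier (FileRecord, choose_tier, bytes_per_tier, stale_fraction) are hypothetical, chosen only for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# The 90-day threshold comes from the article. The one-year archive
# cutoff is a hypothetical value chosen only for illustration.
NEARLINE_AFTER = timedelta(days=90)
ARCHIVE_AFTER = timedelta(days=365)


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical description of one stored file."""
    path: str
    size_bytes: int
    last_access: datetime


def choose_tier(record: FileRecord, now: datetime) -> str:
    """Pick a storage tier from how long the file has sat idle."""
    idle = now - record.last_access
    if idle > ARCHIVE_AFTER:
        return "archive"   # e.g. tape
    if idle > NEARLINE_AFTER:
        return "nearline"  # e.g. cheaper disk
    return "active"        # primary storage


def bytes_per_tier(records, now: datetime) -> dict:
    """Total the bytes that would land in each tier."""
    totals = {"active": 0, "nearline": 0, "archive": 0}
    for record in records:
        totals[choose_tier(record, now)] += record.size_bytes
    return totals


def stale_fraction(records, now: datetime) -> float:
    """Share of stored bytes idle for longer than the nearline threshold."""
    totals = bytes_per_tier(records, now)
    all_bytes = sum(totals.values())
    if all_bytes == 0:
        return 0.0
    return (totals["nearline"] + totals["archive"]) / all_bytes


if __name__ == "__main__":
    now = datetime(2004, 10, 1)
    sample = [
        FileRecord("briefing.ppt", 100, datetime(2004, 9, 20)),
        FileRecord("old-mail.pst", 300, datetime(2004, 5, 1)),
        FileRecord("bridge-design.dgn", 600, datetime(2002, 1, 1)),
    ]
    print(bytes_per_tier(sample, now))
    print(f"{stale_fraction(sample, now):.0%} of bytes idle for more than 90 days")
```

A report like this corresponds to the characterization step DuBoise performed before buying hardware: it shows how much data would leave primary storage under a given policy without moving anything. Actually relocating files, and leaving pointers behind for users, is a separate step that the sketch does not attempt.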

The Market: ILM vendors

EMC Corp., Hopkinton, Mass. EMC has centered its software development and marketing on information lifecycle management, tweaking 30 products to this approach. The company has just released Celerra FileMover as a feature for its network attached storage products. Administrators can use FileMover to schedule when data is to be moved to NAS devices. The company has also just introduced a number of enhancements for its EMC Symmetrix DMX Series of networked storage systems, including the ability to mirror data across numerous sites and the ability to toggle between real-time and near-real-time backup of data.


Hewlett-Packard Co., Palo Alto, Calif. HP has a wide range of ILM-based products. Earlier this year, the company introduced its Reference Information Storage System, a storage system based on a building-block architecture of individual storage nodes, each with a dedicated processor, search engine and management tools.


IBM Corp., Armonk, N.Y. Big Blue offers a number of products for data management, including programs for message monitoring and archiving of DB2-stored messages. IBM's Tivoli Storage Manager offers fine-grained policy control over which files are backed up.


Network Appliance Inc., Sunnyvale, Calif. NetApp views the emergence of ILM as a perfect fit for its NearStore storage appliance line, competitively priced backup units that allow files to be accessed by end-users. The company's Data Fabric Manager software, working in conjunction with NetApp's storage operating system Data ONTAP, provides the application interfaces for software vendors and organizations to craft custom data management programs. The company itself offers an ILM package.


Overtone Software Inc., Bethesda, Md. Overtone's ManageTone Lifecycle Suite software can move data, residing on either Microsoft Windows-based or Unix systems, to storage media, using such criteria as creation date and the type of content or regulatory requirements. The software can store data on a secondary storage application from Network Appliance and keep that data visible to users.


Sand Technology, Boston. Sand has added information lifecycle management capabilities to its data warehouse analysis software. The software allows administrators to compress infrequently used data into read-only files that can be stored on cheaper storage mechanisms, while still remaining available for analysis.


Veritas Software Corp., Mountain View, Calif. Earlier this year, Veritas acquired KVault Software Ltd. of Berkshire, U.K. Using policies set by the administrator, KVault's software indexes and archives data held by Microsoft Exchange and other office-oriented applications and can work with SAN, NAS and other storage architectures. It will eventually replace Veritas' own Data Lifecycle Manager.

Greg Hilsenrath of Overtone says agencies often don't know the characteristics of their datasets.

Dan Gross

Tiered-storage approach looks at the long-term value of data and future needs for access to it