Lustre to battle corruption

 

Ever have this problem? You want to get a large piece of important data from the server to the storage system. All the systems involved seem to send the data correctly, and it is written to the disk without any error messages. Yet when the data is read later, it has been corrupted. How did that happen?

"It's staggering the risks that can happen along the entire data path," said Peter Bojanic, director of Lustre engineering at Sun Microsystems. He said that although there are plenty of studies about disk reliability, few have been done on the subject of data reliability on the network, a problem that is increasing.

"In some large deployments, we experienced inexplicable data corruption to files on the disk," he said. In one setup, after a lengthy investigation, the technology staff found that the network cards were flipping bits.

The topic came up while we were talking about the future of the Lustre global file system. A year ago, Sun acquired Cluster File Systems Inc., the former maintainer of Lustre. We thought it would be a good time to catch up with the company to find out what's in store for the file system.

Bojanic said one of the most interesting opportunities Sun sees for Lustre is using it alongside Sun's 128-bit next-generation file system, ZFS, to provide advanced data integrity.

Advanced data integrity means the system as a whole, rather than its individual components, can guarantee that the data has stayed intact. "When you write data to the disk, you know it can be repaired if anything goes wrong," he said.

Lustre is widely used in the high-performance computing community because of its ability to pool massive numbers of storage disks into a single file system. A metadata server keeps track of all file names, directories, permissions and file layouts. A client seeking data consults the metadata server for the location and then retrieves the data directly from the appropriate storage server.
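The two-step read path described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class names, the `lookup` and `read` methods, the server name `oss1` and the file path are all hypothetical and do not correspond to Lustre's actual interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class StorageServer:
    """Hypothetical storage server holding file data keyed by object ID."""
    name: str
    objects: dict = field(default_factory=dict)

    def read(self, object_id: str) -> bytes:
        return self.objects[object_id]


@dataclass
class MetadataServer:
    """Hypothetical metadata server: maps a path to (server name, object ID)."""
    layouts: dict = field(default_factory=dict)

    def lookup(self, path: str) -> tuple:
        return self.layouts[path]


def client_read(path: str, mds: MetadataServer, servers: dict) -> bytes:
    # Step 1: ask the metadata server where the data lives.
    server_name, object_id = mds.lookup(path)
    # Step 2: fetch the data directly from that storage server,
    # without routing the bytes through the metadata server.
    return servers[server_name].read(object_id)


# Illustrative setup with made-up names and contents.
oss = StorageServer("oss1", {"obj-42": b"simulation results"})
mds = MetadataServer({"/scratch/run1.dat": ("oss1", "obj-42")})
data = client_read("/scratch/run1.dat", mds, {"oss1": oss})
```

The point of the split is that the metadata server handles only the small lookup, while the bulk data moves between the client and the storage server.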

Six of the top 10 computers in the semiannual Top 500 list of the world's most powerful supercomputers use Lustre, Bojanic said. For instance, the Energy Department's Lawrence Livermore National Laboratory uses Lustre for its BlueGene/L system.

Pairing Lustre with ZFS makes sense. The most recent version of Lustre introduced what Bojanic called over-the-wire checksumming, a quick numerical tally done at the beginning and end of a data transmission. If the checksum at the end of the journey matches the one at the start, then the data hasn't changed en route.
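The idea of comparing a checksum at both ends of a transfer can be illustrated with a short Python sketch. The article does not say which checksum algorithm Lustre uses, so CRC-32 from the standard `zlib` module is an assumption here, and the `send`, `receive` and `flip_bit` helpers are hypothetical names for illustration only.

```python
import zlib


def send(payload: bytes) -> tuple:
    """Sender side: compute a checksum before the data leaves."""
    return payload, zlib.crc32(payload)


def receive(payload: bytes, sent_checksum: int) -> bool:
    """Receiver side: recompute the checksum and compare with the sender's."""
    return zlib.crc32(payload) == sent_checksum


def flip_bit(payload: bytes, bit_index: int) -> bytes:
    """Simulate a faulty network card flipping one bit in transit."""
    corrupted = bytearray(payload)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


data, checksum = send(b"important results")
intact_ok = receive(data, checksum)                  # data arrived unchanged
corrupt_ok = receive(flip_bit(data, 3), checksum)    # one bit flipped en route
```

With an intact payload the two checksums match; after a single flipped bit they do not, which is how the receiver learns that the data changed along the way.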

One of the nifty features of ZFS is that it runs its own checksum to ensure that the disk controller doesn't change data as it is written. So it would seem like an obvious thing to put the two operations together, which is what Bojanic and his team are doing. Lustre can even pass its checksum to ZFS so the storage system doesn't waste time calculating a new, and presumably unchanged, checksum.
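A rough sketch of that handoff, in Python: a storage layer that accepts a checksum already computed upstream and computes its own only when none is supplied. Everything here is an assumption for illustration. `ChecksummedStore` is a hypothetical stand-in, not the ZFS interface, and CRC-32 is used only because it ships with Python's standard library.

```python
import zlib


class ChecksummedStore:
    """Toy stand-in for a checksumming storage layer (not ZFS's real interface)."""

    def __init__(self):
        self._blocks = {}
        self.checksums_computed = 0  # counts checksums computed at write time

    def write(self, key: str, data: bytes, checksum=None) -> None:
        if checksum is None:
            # No checksum was handed down, so compute one here.
            checksum = zlib.crc32(data)
            self.checksums_computed += 1
        self._blocks[key] = (data, checksum)

    def read(self, key: str) -> bytes:
        data, checksum = self._blocks[key]
        # Verify on read so silent corruption is reported, not returned.
        if zlib.crc32(data) != checksum:
            raise IOError(f"checksum mismatch for block {key!r}")
        return data


payload = b"block of file data"
wire_checksum = zlib.crc32(payload)   # computed once, on the network side

store = ChecksummedStore()
store.write("blk0", payload, checksum=wire_checksum)  # reuse, don't recompute
result = store.read("blk0")
```

Because the same checksum covers both the network hop and the write to storage, a mismatch at read time points to corruption somewhere along the whole path rather than in one component.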

Bojanic said the Lustre/ZFS combo has been working, in an early alpha state, on Linux, and the team members hope to have a production version soon for Linux and Solaris.

Other new Lustre features

While we had Bojanic on the phone, we asked what else was happening with the file system. Version 2.0 should be released next spring, he said.

That version will attack another growing concern associated with Lustre: using the file system over a wide-area network (WAN). To that end, Version 2.0 will contain full support for the Kerberos network authentication protocol. (Version 1.8 partially implements Kerberos.) Lustre can use Kerberos to check credentials across a network.

Sun is also adding data-replication features, which should ease the process of doing backups, Bojanic said.

The Lustre team is tackling a number of other interesting problems. Chief among them is management. Lustre has long had a reputation for being difficult to maintain. Bojanic admitted that this has been a concern, though he argued that Lustre has improved in that area. The latest version allows users to format file systems in a manner similar to formatting local file systems on a Unix system. And the team is building a new browser-based console to simplify matters even further.

Another area of focus is performance. Lustre is known for its high throughput: It can deliver up to 90 percent of the raw disk bandwidth, or the bandwidth of the connection between the server and the disk, Bojanic said. The development team is exploring ways to use caching to improve the performance even further, at least for datasets that are read from multiple machines.
