Science DMZ: faster, more secure high-performance computing
Built near the network perimeter, a Science DMZ is a portion of the network optimized for high-performance scientific applications rather than for general-purpose business systems.
University and government scientists increasingly depend on high-performance computing resources, which means they need access to ever-larger datasets and a way to collaborate with widely dispersed teams. To create an environment for such compute-intensive work, USDA’s Agricultural Research Service is expected later in June to award a contract for the construction of a Science DMZ network.
The Department of Agriculture is just the latest in a growing number of government agencies to use the concept. While many organizations deploy a DMZ (after the term “demilitarized zone”) to harden their regular business networks using security devices such as firewalls, Science DMZs have special needs that require their own specific designs. And it’s not something that can be created with high-speed connections alone.
The firewalls that protect email, Web browsing and other applications can cause packet loss in TCP/IP networks, for example, which can dramatically slow data speeds. For business applications that slowdown may be nothing more than a temporary annoyance, but for scientific organizations, which need to shift many gigabytes of data at a time, it can be catastrophic. After all, what’s the use of electronic transmission when overnight FedEx will get your data to its target faster? Likewise, routers and switches without enough high-speed memory to handle large bursts in traffic can cause similar packet loss and reduction in data speeds.
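To see why even slight packet loss matters so much, a rough estimate helps. The sketch below uses the well-known Mathis approximation for steady-state TCP throughput, which the article itself does not cite; it is a simplified model for classic loss-based congestion control, and the segment size, round-trip time and loss rates in the usage note are illustrative assumptions only.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=1.0):
    """Approximate upper bound on TCP throughput in bits per second.

    Mathis approximation: rate <= (MSS / RTT) * (C / sqrt(p)).
    The constant c is taken as 1.0 here (an assumption; published
    values vary with the acknowledgment strategy).
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


def transfer_hours(size_bytes, rate_bps):
    """Hours needed to move size_bytes at a sustained rate_bps."""
    if rate_bps <= 0:
        raise ValueError("rate_bps must be positive")
    return size_bytes * 8 / rate_bps / 3600
```

With a 1,460-byte segment, a 50-millisecond round trip and a loss rate of 0.01 percent, the estimate comes to about 23 megabit/sec no matter how fast the underlying link is. At that rate a 1-terabyte dataset would take roughly 95 hours to move, which is the kind of arithmetic behind the FedEx comparison.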
And because scientific data has the same security needs as any other government data, the question is how to provide the special conditions that scientists need to do their jobs, while at the same time making sure the data is sufficiently protected. To do that, the scientific data has to be handled separately from the data that regular business applications send over the local-area network.
A Science DMZ, which nevertheless is part of the agency’s overall network topology, is typically located at or close to the agency’s network perimeter, in many cases tied directly to the router that connects the research institution to the wide-area network. That placement gives science data the most direct, and therefore fastest, path onto the wide-area network.
Inside the Science DMZ there may still be many of the same devices as the LAN would have, except they are either specially built and of much better quality, or they are specially configured to handle the volumes of data that the science applications produce. Firewall input buffers have to be a lot larger than those of LAN firewalls, for example, because they need to handle far higher burst volumes of data.
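How much larger is "a lot larger"? The article gives no figure, but one common rule of thumb, used here purely as an assumption, is to size buffers against the bandwidth-delay product: the amount of data that can be in flight on a path at once. The link speeds and round-trip times in the usage note are illustrative.

```python
def bandwidth_delay_product_bytes(bandwidth_bps, rtt_s):
    """Bytes that can be in flight on a path: bandwidth * round-trip time."""
    if bandwidth_bps < 0 or rtt_s < 0:
        raise ValueError("bandwidth_bps and rtt_s must be non-negative")
    return bandwidth_bps * rtt_s / 8
```

A 1 gigabit/sec LAN link with a 1-millisecond round trip works out to about 125 kilobytes in flight, while a 10 gigabit/sec wide-area path with a 50-millisecond round trip works out to about 62.5 megabytes, a difference of several hundred times.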
Because a smaller set of applications produces the data on the DMZ, firewalls could even be eliminated, with traffic instead filtered at switches or routers based on IP addresses or TCP ports and used in conjunction with intrusion detection systems. The Science DMZ also needs dedicated servers called Data Transfer Nodes that are specially designed and configured for science data transfer.
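Filtering by address and port is simple in principle, which is why it can run at line rate on a router or switch. The sketch below illustrates the idea in Python rather than in any real router configuration language; the `Rule` class, the allow list, the documentation-range networks and the port numbers are all hypothetical placeholders.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    """One hypothetical allow entry: a source network and a TCP port."""
    src_net: str
    dst_port: int


# Placeholder allow list; the networks and ports are illustrative only.
ALLOW_RULES = [
    Rule("192.0.2.0/24", 2811),
    Rule("198.51.100.0/24", 443),
]


def is_allowed(src_ip, dst_port, rules=ALLOW_RULES):
    """Return True if any rule matches both source address and port."""
    addr = ip_address(src_ip)
    return any(
        addr in ip_network(rule.src_net) and dst_port == rule.dst_port
        for rule in rules
    )
```

Anything that matches no rule is dropped by default. Because a Science DMZ carries only a handful of known applications between known collaborators, a short list like this can cover its traffic, which is not true of a general business network.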
A critical requirement for any Science DMZ is a performance monitoring system to quickly catch and mitigate problems that can slow the data flow. Such systems often use perfSONAR, the Performance focused Service Oriented Network monitoring ARchitecture. It is a set of tools, developed collaboratively by various organizations including the Department of Energy, that can continually check for packet loss or increases in latency across the network.
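The kind of check such monitoring performs can be sketched in a few lines. This is not perfSONAR code and does not use its interfaces; the function name, the sample format and the thresholds are assumptions made for illustration.

```python
from statistics import mean


def check_path(samples, baseline_rtt_ms, max_loss_rate=0.0, rtt_factor=1.5):
    """Flag packet loss or a latency increase on one network path.

    samples: list of (packets_sent, packets_lost, rtt_ms) tuples from
    repeated test runs (a hypothetical format).
    Returns a list of alert strings, empty if the path looks healthy.
    """
    if not samples:
        raise ValueError("samples must not be empty")

    sent = sum(s for s, _, _ in samples)
    lost = sum(l for _, l, _ in samples)
    loss_rate = lost / sent if sent else 0.0
    avg_rtt_ms = mean(r for _, _, r in samples)

    alerts = []
    if loss_rate > max_loss_rate:
        alerts.append("packet loss")
    if avg_rtt_ms > baseline_rtt_ms * rtt_factor:
        alerts.append("latency increase")
    return alerts
```

The default loss threshold of zero is deliberate in this sketch: on a path meant for bulk science transfers, even a tiny loss rate is worth investigating, whereas a business network might tolerate it unnoticed.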
There can also be many different designs of a Science DMZ, depending on what applications it will service. However, they all include, more or less, the same components. Another defining characteristic is that Science DMZs are very flexible when it comes to incorporating emerging technologies, such as 100 gigabit/sec Ethernet.
The concept of a Science DMZ is not that new, but the urgency of dealing with rapidly increasing levels of scientific data means that networks distinct from overloaded general business networks are becoming a necessity, and various organizations are pushing for their development.
For example, staff at the DOE’s Energy Sciences Network, which connects scientists at more than 40 DOE sites, have become evangelists for Science DMZs throughout government, and the National Science Foundation in 2012 issued a solicitation for proposals from universities to upgrade their network infrastructures with Science DMZs.