Preparing for the CDM dashboard: Big data and the cloud
Big data and analytics are empowering federal agencies to make decisions faster than ever, but the added capabilities require a more stringent cloud security and data protection strategy.
Agencies likely have the Continuous Diagnostics and Mitigation (CDM) program on their minds as a result of the Office of Management and Budget’s November memo requiring that they share cyber risk data with a new federal dashboard. Specifically, the memo notes, agencies are required to “report the status of their information security programs to OMB,” with inspectors general conducting “annual independent assessments of those programs.”
What should an agency’s “information security program” include? With agencies’ increased emphasis on big data throughout their operations, if the program does not include a strategy for big data and cloud security, it is incomplete.
Additionally, OMB expects agencies to certify their implementation of the CDM program’s Data Quality Management Plan and be ready to exchange data with the dashboard by the end of the 2021 fiscal year. Agencies will start that data exchange off on better footing when their information security program thoroughly considers the security of big data and the cloud. After all, it’s easier to measure security risks once the fundamental elements have been buttoned down.
Big data and the cloud are the usual suspects in cybersecurity risk. Data and analytics continue to empower federal agencies to make decisions and complete analyses faster than ever before. However, the added capabilities also require a more stringent information security and data protection strategy.
Big data environments were made for fault tolerance and large data lakes, but they were not architected with security in mind. Data protection for these large environments, especially in the cloud, should emphasize encryption and access controls that keep data protected and alert security personnel when it comes under attack. Encryption of data lakes that does not interfere with performance, analytics and architecture is paramount. FIPS-level protection for key management should also be required.
Security risks related to big data have grown by an order of magnitude over the last few years because the technology has become more common in agencies’ daily operations. The Cloud Security Alliance estimates that data volume is doubling every two years, and with that growth comes more cybersecurity risk.
Cyber risk measurements related to big data demand a 360-degree view of an agency’s strategy for data security in the cloud. That means understanding the complete lifecycle of data -- producing, collecting, analyzing, storing, sharing and managing it -- and how to protect it at every stage. As data access becomes nearly real-time and search tools grow more sophisticated, new threats can emerge with alarming speed.
Understanding an agency’s risk criteria, in compliance with OMB guidance, means federal IT professionals at all levels (but especially CIOs) must make sure their agencies’ security strategies are aligned with cloud-based technologies.
Here are three points agencies should consider as they work to secure their big data in the cloud. Once these have been addressed, it becomes far easier to take the longer view OMB requires.
1. Keep keys and data separate
Security becomes harder to manage once data is in the cloud. Agencies should not blindly assume that cloud service providers will meet their compliance requirements or security expectations.
That’s why IT professionals must build safeguards into their data before routing it to the cloud. End-to-end encryption is one way to go, as long as encryption choices are tailored to specific workloads.
In a decentralized context, agencies might encrypt data on users’ PCs before sending it on to the cloud. For a more network-centric approach, they could set policies to encrypt data in folders or files on their way to the cloud.
In either case, agencies must maintain control of their encryption keys -- especially with public clouds. Public clouds rely on general-purpose security, and that won’t be enough to meet agency requirements.
Encryption keys can be managed in the cloud, but that puts keys and the data they protect in the same place, which invites the compromise of both. To truly keep data secure, agencies need direct control of their encryption keys, which means keeping keys separate from where the data resides -- for example, keeping keys in an on-premises hardware appliance.
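The separation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: OnPremKeyStore and CloudBucket are hypothetical classes invented for this example, and a toy one-time pad stands in for a vetted cipher so the sketch runs with the standard library alone. The point is the layout: the cloud object carries ciphertext and a key ID, while the key itself never leaves the agency's side.

```python
import secrets


class OnPremKeyStore:
    """Hypothetical stand-in for an on-premises key appliance; keys never leave it."""

    def __init__(self):
        self._keys = {}

    def encrypt(self, plaintext: bytes) -> tuple[str, bytes]:
        key_id = secrets.token_hex(8)
        # Toy one-time pad: a random key as long as the data, used exactly once.
        # Assumption for illustration only; a real deployment would use a vetted cipher.
        key = secrets.token_bytes(len(plaintext))
        self._keys[key_id] = key
        ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
        return key_id, ciphertext

    def decrypt(self, key_id: str, ciphertext: bytes) -> bytes:
        key = self._keys[key_id]
        return bytes(c ^ k for c, k in zip(ciphertext, key))


class CloudBucket:
    """Hypothetical cloud store: holds ciphertext and a key ID, never the key."""

    def __init__(self):
        self.objects = {}

    def put(self, name: str, key_id: str, ciphertext: bytes) -> None:
        self.objects[name] = {"key_id": key_id, "ciphertext": ciphertext}

    def get(self, name: str) -> dict:
        return self.objects[name]


keystore = OnPremKeyStore()
bucket = CloudBucket()

# Encrypt before the data leaves the agency, then upload only the ciphertext.
record = b"case file 1042: applicant interview notes"
key_id, ciphertext = keystore.encrypt(record)
bucket.put("case-1042", key_id, ciphertext)

# Reading the data back requires a round trip to the on-premises key store.
stored = bucket.get("case-1042")
recovered = keystore.decrypt(stored["key_id"], stored["ciphertext"])
```

Anyone who compromises the bucket alone gets ciphertext and an opaque key ID, which is the property the paragraph above argues for.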
2. “De-identify” data for better privacy
Cloud data can come from a wide range of sources, and the explosive growth of digital technologies means that data is more often than not connected to personal information. An agency’s cloud-based security posture, therefore, is about more than protecting data; it’s also about protecting the personal information associated with that data.
A comprehensive security strategy must take into account technologies or practices to de-identify data so that only the metadata is carried into cloud-based domains outside agency control. Obvious ways to de-identify data would be to remove personal markers (birthdate, age, gender, Social Security numbers) and location identifiers (ZIP code, city or town).
Despite the clear risks, these markers continue to be associated with big data in the cloud. Agencies must either remove the markers or use encryption to make them unreadable. That’s an easy way to reduce the risk of compromising privacy when data is collected and shared.
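As a rough sketch of what removing the markers might look like before a record is shared, consider the following Python. The field names, the de_identify helper and the sample record are all hypothetical. One swapped-in technique should be named plainly: instead of encrypting the Social Security number, the sketch replaces it with a keyed hash (HMAC-SHA-256) so records about the same person can still be linked. That is pseudonymization rather than encryption, and it only protects the value as long as the secret stays out of the cloud.

```python
import hashlib
import hmac

# Hypothetical field names; an agency's real schema will differ.
DIRECT_IDENTIFIERS = {"name", "ssn", "birthdate", "age", "gender"}
LOCATION_IDENTIFIERS = {"zip_code", "city"}


def pseudonym(value: str, secret: bytes) -> str:
    """Keyed hash so records can still be linked without exposing the value."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def de_identify(record: dict, secret: bytes) -> dict:
    """Drop personal and location markers; keep a keyed pseudonym as the join key."""
    drop = DIRECT_IDENTIFIERS | LOCATION_IDENTIFIERS
    cleaned = {k: v for k, v in record.items() if k not in drop}
    cleaned["subject_ref"] = pseudonym(record["ssn"], secret)
    return cleaned


# Example-only secret; in practice it would be held on premises, apart from the data.
secret = b"example-only-secret-kept-on-premises"

raw = {
    "ssn": "000-12-3456",  # deliberately invalid sample value
    "birthdate": "1980-01-01",
    "gender": "F",
    "zip_code": "00000",
    "city": "Springfield",
    "benefit_type": "housing",
    "claim_amount": 1200,
}

safe = de_identify(raw, secret)
```

Only benefit_type, claim_amount and the opaque subject_ref would travel to the cloud in this sketch. Dropping fields is not a complete privacy guarantee on its own, since combinations of the remaining fields can still narrow down individuals.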
3. Know where data comes from
While the cloud is continually generating mountains of new digital data, agencies must still manage a considerable amount of analog information. When these historical records and archived data are digitized for the cloud, they pose security policy challenges. The parameters once used to keep analog information private are likely not consistent with today’s cloud security requirements once that information is digitized, so knowing where it comes from is important.
Agencies may need to build in additional security safeguards when pulling these newly digitized data repositories into their big data and cloud equations. One approach might be to use digital signatures to verify the identity of the person creating the data. That helps with understanding where the data comes from, which in turn maintains integrity of the data when it is sent to the cloud and protects against unauthorized alteration during the transfer.
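The provenance check described above can be illustrated with a short Python sketch. An important caveat: Python's standard library has no public-key signing, so this example uses an HMAC-SHA-256 tag as a stand-in. An HMAC relies on a shared secret and detects alteration, but it is not a true digital signature and does not prove which individual created the data; an agency would use an asymmetric scheme with per-person keys for that. The seal and verify helpers, the key and the sample record are hypothetical.

```python
import hashlib
import hmac
import json


def seal(record: dict, creator: str, key: bytes) -> dict:
    """Attach the creator's identity and an integrity tag before upload."""
    payload = json.dumps({"creator": creator, "record": record}, sort_keys=True)
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify(sealed: dict, key: bytes) -> bool:
    """Recompute the tag after transfer; any alteration makes it mismatch."""
    expected = hmac.new(
        key, sealed["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking how much of the tag matched.
    return hmac.compare_digest(expected, sealed["tag"])


# Example-only key; in practice it would be held by the agency, not the cloud provider.
key = b"example-only-key-held-by-the-agency"

sealed = seal({"box": 17, "title": "1974 permit ledger"}, "records.clerk", key)

# An untouched record verifies.
intact = verify(sealed, key)

# A record altered in transit fails verification.
tampered = dict(sealed, payload=sealed["payload"].replace("1974", "1984"))
altered_detected = not verify(tampered, key)
```

Because the creator field is covered by the tag, changing who is recorded as the source breaks verification just as changing the record itself does.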
Agencies should start with the basics and put together a cyber strategy for keeping cloud-based big data secure. From there, it becomes easier to maintain more complete information security programs, refine measurements for CDM, and in turn share those measurements across the federal government.