The predictive analytics arms race
Agencies are mining data to fight fraud, but the criminals may have better algorithms.
Big data analytics are transforming government's ability to fight fraud, but the technical and policy obstacles are still significant, and the bad guys are using analytics too.
That was the message at a May 23 panel discussion on analytics in Cambridge, Md. Hilary Cronin, the Education Department's program manager for risk management and monitoring, and Caryl Brzymialkiewicz, chief data officer and assistant inspector general at the Department of Health and Human Services' Office of Inspector General, both count fraud-spotting as a key mission for their data teams. They and the other panelists, 18F Director of Analytics and Data Programs Johan Bos-Beijer and National Oceanic and Atmospheric Administration Lead Analyst for Open Data Services Dave McClure, discussed what such analytics require, as well as other issues facing agencies that wrangle massive datasets.
The panel discussion, part of ACT-IAC's annual Management of Change conference, was on the record, but not for individual attribution.
The federal government issues some $600 billion a year in grant money, the participants noted, and a small data team can find both actionable leads for criminal investigators and "hot spots" where an agency can make changes to prevent future fraud.
The amount of available data related to grants is massive, they said, and anything from an unusual address to the sequence of medical procedure claims can be a red flag. "There are thousands of ways that you could be looking at the data," one said.
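To make that idea concrete, here is a minimal, hypothetical sketch of the kind of rule-based red-flag screening the panelists described. The records, field names, thresholds and rules below are invented for illustration only and do not reflect any agency's actual data, systems or models.

```python
# Illustrative sketch only: a toy rule-based screen over made-up claim records.
# Real agency pipelines operate on far larger datasets and far richer models.
from collections import Counter

# Hypothetical claim records: (claimant_id, mailing_address, procedure_sequence)
claims = [
    ("C001", "12 Oak St",  ["office_visit", "x_ray"]),
    ("C002", "99 Pine Ave", ["office_visit", "mri", "surgery"]),
    ("C003", "12 Oak St",  ["office_visit", "x_ray"]),
    ("C004", "12 Oak St",  ["office_visit", "x_ray"]),
    ("C005", "7 Elm Rd",   ["surgery", "office_visit"]),  # surgery billed before any visit
]

ADDRESS_THRESHOLD = 3  # arbitrary cutoff chosen for this example


def flag_shared_addresses(records, threshold=ADDRESS_THRESHOLD):
    """Flag addresses that appear on an unusually large number of claims."""
    counts = Counter(address for _, address, _ in records)
    return {addr for addr, n in counts.items() if n >= threshold}


def flag_odd_sequences(records):
    """Flag claims where surgery is billed before (or without) any office visit."""
    flagged = []
    for claimant, _, sequence in records:
        if "surgery" not in sequence:
            continue
        visit_pos = sequence.index("office_visit") if "office_visit" in sequence else len(sequence)
        if sequence.index("surgery") < visit_pos:
            flagged.append(claimant)
    return flagged


if __name__ == "__main__":
    print("Hot-spot addresses:", flag_shared_addresses(claims))
    print("Unusual procedure ordering:", flag_odd_sequences(claims))
```

In practice, analysts layer many such rules alongside statistical and machine-learning models; the point of the sketch is simply how an unusual address or an out-of-order procedure sequence can surface as a lead worth a closer look.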
Yet "the fraud community is investing more money than we are to get ahead of us," another panelist noted. "They have better modeling and better algorithms right now than we do."
Still, while "the technologies are certainly not trivial," as one panelist put it, the tech itself was rarely the limiting factor for in-house analytics. (NOAA has, however, launched research projects with several commercial cloud providers to find better ways of sharing the agency's massive datasets with outside users.) More often, he said, what's pushing the "bleeding edge are the business processes."
The other speakers agreed, with one citing the inherent tension between open data ambitions and the Privacy Act as an example. "I have seen cases where IGs went to other IGs under the law enforcement banner," she said. "It falls under routine use and should have had no challenge to sharing leads on folks who are taking advantage of the system for fraud ... that should have been the easiest easy button, rally-around-the-problem effort, and yet everyone's lawyers interpreted it differently."
And data criminals may well be looking for more than credit card numbers and digital identities to steal, another panelist noted. He cited the 2015 breach of Office of Personnel Management databases that exposed nearly 22 million records and argued that identity theft or potential blackmail were not the biggest risks. Attached to those background-check records, he noted, "are your high school, your spouse, your exes, your neighbors, who [you] went to school with and didn't get along with, your doctor's reports, your children's stuff..."
"Guess what?" he said. "That's an analytics model. ... There are tons of patterns in there. So someone who wants to be devious will take that model and extrapolate it to 300 million citizens. And you can get a pretty good idea of how people behave ... if you intentionally spark a reaction."