Machine learning a growing force against online fraud
Sift Science uses new machine learning techniques to spot unusual fraud patterns as soon as they appear on the network.
A group of ex-Google employees has started a company that wants to expand the use of big data to spot fraud — a blight that costs taxpayers over $125 billion a year, and affects public-sector agencies involved in payments, collections and benefits — before it occurs.
San Francisco-based Sift Science says it has developed an algorithm that uses machine-learning techniques to stay ahead of new fraud tactics as they are introduced into its customers’ networks.
“Many anti-fraud technologies follow a set number, maybe 175 to 225 rules, against which to measure user behavior,” Sift Science co-founder Brandon Ballinger told GigaOm.
“The problem is fraudsters don’t follow the rules and change all the time.”
In contrast, machine learning is the ability of the system to process new patterns and react without specific rules or cues.
The Sift system trains itself by learning patterns of fraud as they appear on its customers’ sites, which become part of the machine-learning network. The company says its systems already incorporate over 1 million fraud patterns that help alert users to deceptive activity.
“As more sites join, it will learn more patterns to help everybody fight fraud more accurately,” Ballinger added.
Sift Science uses APIs that allow customers to report events on their sites, then applies large-scale machine learning to assign a fraud score to each of the sites’ users, according to the company’s description. The collection process has three principal components: on-page activity gathered by a JavaScript snippet, transactions reported from a customer’s server using Sift Science’s REST API, and labels of known fraudsters.
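To make the three inputs concrete, here is a minimal sketch in Python of how page activity, server-side transactions and fraud labels might be combined into a per-user score. It is not Sift Science's API or algorithm: the `EventStore` class, its method names, the two features and the smoothed-rate scoring are all hypothetical, chosen only to illustrate the idea of learning from labeled examples rather than fixed rules.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UserRecord:
    page_events: list = field(default_factory=list)    # on-page activity
    transactions: list = field(default_factory=list)   # reported by the site's server
    is_fraud: Optional[bool] = None                     # label, if the site supplied one


class EventStore:
    """Hypothetical in-memory store; a real service would sit behind an API."""

    def __init__(self):
        self.users = defaultdict(UserRecord)

    def record_page_event(self, user_id, name):
        self.users[user_id].page_events.append(name)

    def record_transaction(self, user_id, amount):
        self.users[user_id].transactions.append(amount)

    def label(self, user_id, is_fraud):
        self.users[user_id].is_fraud = is_fraud

    def features(self, user_id):
        # Two invented features, purely for illustration.
        rec = self.users[user_id]
        return {
            "many_transactions": len(rec.transactions) >= 3,
            "no_page_activity": len(rec.page_events) == 0,
        }

    def score(self, user_id):
        """Average, over features, of the smoothed fraud rate among
        labeled users who share this user's feature value."""
        target = self.features(user_id)
        rates = []
        for name, value in target.items():
            fraud = total = 0
            for other_id, rec in list(self.users.items()):
                if rec.is_fraud is None or other_id == user_id:
                    continue  # skip unlabeled users and the user being scored
                if self.features(other_id)[name] == value:
                    total += 1
                    fraud += rec.is_fraud
            rates.append((fraud + 1) / (total + 2))  # add-one smoothing
        return sum(rates) / len(rates)


store = EventStore()
for uid in ("f1", "f2"):                 # labeled fraudsters
    for amount in (20.0, 35.0, 50.0):
        store.record_transaction(uid, amount)
    store.label(uid, True)
for uid in ("g1", "g2"):                 # labeled legitimate users
    store.record_page_event(uid, "viewed_item")
    store.record_transaction(uid, 20.0)
    store.label(uid, False)

for amount in (10.0, 10.0, 10.0):        # new user resembling the fraudsters
    store.record_transaction("x", amount)
store.record_page_event("y", "viewed_item")  # new user resembling the others
store.record_transaction("y", 15.0)

print(store.score("x"), store.score("y"))
```

In this toy data the unlabeled user who behaves like the labeled fraudsters receives the higher score. Adding more labeled users changes the scores without anyone writing a new rule, which is the contrast with rule-based systems that Ballinger describes.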
Running analytics on the collected patterns has revealed some unusual signs of probable fraud, according to the company. For instance, an auction item that a seller lists in all capital letters is four times more likely to be fraudulent.
And people with Yahoo.com e-mail accounts are five times more likely to create a fake account than those using Gmail.com, according to Sift Science records. The company recently opened the fraud detection service to the public for testing.
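A figure such as "four times more likely" is a ratio of two fraud rates: the rate among items that show the signal divided by the rate among items that do not. The Python sketch below shows that arithmetic. The function names and the listing counts are invented for illustration and are not Sift Science data.

```python
def relative_likelihood(listings, has_signal):
    """Fraud rate among listings with the signal divided by the
    fraud rate among listings without it."""
    with_total = with_fraud = without_total = without_fraud = 0
    for listing in listings:
        if has_signal(listing):
            with_total += 1
            with_fraud += listing["fraud"]
        else:
            without_total += 1
            without_fraud += listing["fraud"]
    if with_total == 0 or without_fraud == 0:
        raise ValueError("need listings with the signal and baseline fraud cases")
    # (with_fraud / with_total) / (without_fraud / without_total),
    # cross-multiplied to keep the arithmetic on whole numbers.
    return (with_fraud * without_total) / (with_total * without_fraud)


def is_all_caps(listing):
    # str.isupper() is True when every cased character is uppercase.
    return listing["title"].isupper()


# Made-up counts: 8 of 100 all-caps listings are fraudulent,
# against 20 of 1,000 ordinary listings.
listings = (
    [{"title": "ROLEX WATCH CHEAP", "fraud": True}] * 8
    + [{"title": "ROLEX WATCH CHEAP", "fraud": False}] * 92
    + [{"title": "Vintage watch", "fraud": True}] * 20
    + [{"title": "Vintage watch", "fraud": False}] * 980
)

ratio = relative_likelihood(listings, is_all_caps)
print(ratio)
```

With these invented counts the fraud rates are 8 percent and 2 percent, so the ratio comes to four. A ratio like this measures association only; it does not mean that any particular all-caps listing is fraudulent.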
Machine learning as a tool to fight fraud has been developing for a while. In 2006, researchers at Stanford University broke down several mathematical methods, concluding that, “machine learning methods are quite easily able to outperform current industry standards in detecting fraud.”
RSA, the security unit of EMC, has been using analytics to combat online fraud for years, and attributes an increase in detection rates over the past few years to greater use of machine learning, according to a report in the Guardian.
Machine learning systems also are becoming more common among government systems, including the Federal Aviation Administration’s Next Generation Air Transportation System and the Energy Department’s Smart Grid program to create an interactive national power delivery system.
Public-sector agencies and programs that are vulnerable to fraud, waste and abuse are becoming more dependent on analytic techniques to identify fraud.
Recently the state of Michigan deployed an SAS Analytics suite as the engine for its Enterprise Fraud Detection System to go after fraud in its unemployment insurance programs.