How NIST helps facial recognition make better matches
The National Institute of Standards and Technology's Face Recognition Vendor Test evaluates the algorithms used in facial recognition technologies.
As use of facial recognition technology creeps into daily life -- for everything from unlocking cell phones and bank accounts to verifying identities of international passengers at airports -- the accuracy of its algorithms is increasingly critical.
Since 2000, the National Institute of Standards and Technology's Face Recognition Vendor Test (FRVT) has provided independent evaluations of commercially available and prototype facial recognition technologies to help determine where and how the technology can best be deployed.
The accuracy of a facial recognition algorithm cannot be stated as a single percentage, according to Patrick Grother, a NIST computer scientist who administers FRVT testing and helps compile the resulting reports.
The standard way to report the accuracy of these algorithms is by plotting two numbers, Grother explained. On one axis is the false match rate (FMR) -- the likelihood of an input image being matched with the wrong person. The other axis measures the false non-match rate (FNMR) -- the likelihood that the algorithm fails to identify a match when a clear match is present. Both factors are essential for evaluating the performance of an algorithm.
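To make that tradeoff concrete, here is a minimal Python sketch -- not NIST's evaluation code, and the comparison data and the error_rates helper are hypothetical -- showing how FMR and FNMR could be computed from a batch of scored face comparisons at a given decision threshold.

```python
# Illustrative sketch only (not FRVT code): compute false match rate (FMR)
# and false non-match rate (FNMR) from (similarity_score, same_person) pairs.

def error_rates(comparisons, threshold):
    impostor = [s for s, same in comparisons if not same]  # different people
    genuine = [s for s, same in comparisons if same]       # same person

    # FMR: fraction of impostor pairs the system wrongly accepts as a match.
    fmr = sum(s >= threshold for s in impostor) / len(impostor)
    # FNMR: fraction of genuine pairs the system wrongly rejects.
    fnmr = sum(s < threshold for s in genuine) / len(genuine)
    return fmr, fnmr

# Hypothetical scores: raising the threshold pushes FMR down but FNMR up,
# which is the tradeoff plotted on the two axes Grother describes.
pairs = [(0.92, True), (0.40, False), (0.75, True), (0.81, False), (0.30, False)]
print(error_rates(pairs, threshold=0.8))  # -> (0.333..., 0.5)
```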
The risks these numbers signal depend on the application of the technology. Grother used mobile phone verification as an example. “If a system fails to make a match with the right person [a high FNMR], the subject will make a further attempt, but it is an inconvenience,” he wrote in an email. “Whereas the higher the FMR, the higher the insecurity of the system. If a facial recognition system makes a match with the wrong person, they would be granted access – the system is insecure.”
But these numbers mean something different in the context of identifying a bad actor. In that use case, a high FMR means a higher chance of a false accusation, and a high FNMR means the algorithm is ineffective. “If a facial recognition system fails to make a match with a known threat even though a photo of that threat is in its database, it is ineffective,” he wrote.
Similar tradeoffs happen with other biometric verification methods like fingerprints and irises, but because the input images are generated using dedicated devices – iris and fingerprint scanners – the resulting accuracy of the system is generally higher, Grother said.
NIST had been conducting FRVT testing every two to four years, but as technology has rapidly improved, the private sector has begun asking for more frequent evaluations.
Starting in 2017, NIST told developers they could submit algorithms at any time to be tested. The testing was limited, however, to one-to-one facial recognition technologies, which match one input face to one known face, often called a verification application. In February, NIST added testing of one-to-many applications, in which a single input image is run against a database of images to find a match -- often called an identification application.
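The distinction between the two modes can be sketched in a few lines of hypothetical Python; verify, identify and match_score are illustrative names, not part of any FRVT submission interface.

```python
# Illustrative sketch only: one-to-one verification vs. one-to-many identification.
# match_score stands in for whatever similarity function an algorithm provides.

def verify(probe, enrolled_template, match_score, threshold):
    """One-to-one: does the probe image match this single known face?"""
    return match_score(probe, enrolled_template) >= threshold

def identify(probe, gallery, match_score, threshold):
    """One-to-many: search a database (gallery) for the best match, if any."""
    best_id, best_score = None, float("-inf")
    for person_id, template in gallery.items():
        score = match_score(probe, template)
        if score > best_score:
            best_id, best_score = person_id, score
    return best_id if best_score >= threshold else None
```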
NIST runs the algorithms against six different categories of photos: visa photos, mugshots, photos in the wild, webcam images, selfies and child exploitation photos. Each of these categories provides different challenges for facial recognition algorithms. NIST runs the algorithms millions of times, revealing how often the facial recognition programs are right and how often they’re wrong.
NIST's testing uses sequestered image databases that haven't been seen by researchers, which is different from environments where an algorithm can be tweaked over and over until it runs smoothly against one database. “That is really the gold standard for doing testing because it reduces the opportunities to cheat,” Grother said.
These NIST tests are at least partially responsible for the improvement in facial recognition technology over the past couple of decades, according to Brian Martin, a senior director of research and technology at IDEMIA, a provider of trusted identities.
“It makes everyone compete to make a better face recognition technology,” Martin said. “I think that if it were not for the NIST tests, face recognition technology probably would not be at the state where it is today.”
Going through the FRVT process is important for another simple reason: There aren’t many alternatives. “It’s the only independent testing of these commercial technologies,” Martin explained.
Both Martin and Grother said facial recognition technology is improving at the speed of Moore’s Law, if not faster.
“Error rates came down substantially,” Grother said about the improvements NIST has witnessed since its first report came out in 2000.
Not much has changed on the hardware end other than the ubiquity of cameras, which has helped build large photo databases, Grother said. But the photo-matching algorithms and software have gone through monumental improvements thanks to research in deep learning and convolutional neural networks used for analyzing visual imagery, Martin said.
NIST is now studying the accuracy of facial recognition technology across different demographics -- like gender and race -- to look for bias. NIST usually looks at observational data, but for the demographic testing, it has set up an experiment that will show any differences from one demographic to the next.
The report with the results of this experiment will be released later this year, Grother said.