Where traditional DNA testing fails, algorithms take over

 

Powerful software is solving more crimes and raising new questions about due process.

This story was originally published by ProPublica.

Late on a hot August night in 2014, Syracuse, N.Y., police tried to pull over a car driving without headlights. The driver and passenger fled into a darkened park. As the officers chased them on foot, they said they heard a gunshot. The cops never caught the suspects, but recovered a loaded handgun.

The police connected the abandoned car to its owner, and arrested him, but could not tie him to the handgun without a DNA match. The mixture of DNA on the handgun was too complicated for forensic scientists to analyze with conventional methods, a representative from the Onondaga County crime lab later testified. There were at least four people’s DNA present and possibly five or six.

So the District Attorney’s office outsourced the analysis to Cybergenetics, a private company that makes TrueAllele, a “probabilistic genotyping” software program. Where traditional DNA analysis involves manually and visually interpreting DNA markers, TrueAllele runs DNA data through complex statistical algorithms to calculate the likelihood that a particular person’s DNA is present in a mixture, compared to a random person’s DNA.
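
To make the idea of a likelihood ratio concrete, here is a minimal Python sketch. It is not TrueAllele's algorithm, which is proprietary and built for complex mixtures. It assumes the simplest textbook case: a single-source sample that matches the suspect at every marker, with markers treated as independent. The marker names, allele labels and frequencies are hypothetical values chosen for illustration, not real population data.

```python
from math import prod

# Hypothetical allele frequencies for two made-up markers (NOT real population data).
ALLELE_FREQS = {
    "marker_1": {"A": 0.1, "B": 0.2},
    "marker_2": {"C": 0.05},
}


def random_match_probability(marker, genotype):
    """Chance that an unrelated random person has this genotype at one marker.

    Assumes Hardy-Weinberg proportions: p^2 for two copies of the same
    allele, 2pq for two different alleles.
    """
    first, second = genotype
    freqs = ALLELE_FREQS[marker]
    if first == second:
        return freqs[first] ** 2
    return 2 * freqs[first] * freqs[second]


def likelihood_ratio(profile):
    """Likelihood ratio across all markers in a matching single-source profile.

    Numerator: P(evidence | suspect is the source), taken as 1 in this toy model.
    Denominator: P(evidence | an unrelated random person is the source).
    Markers are assumed independent, so per-marker ratios are multiplied.
    """
    return prod(
        1.0 / random_match_probability(marker, genotype)
        for marker, genotype in profile.items()
    )


# Illustrative profile shared by the evidence sample and the suspect.
profile = {"marker_1": ("A", "B"), "marker_2": ("C", "C")}
lr = likelihood_ratio(profile)
print(f"Likelihood ratio: about {lr:,.0f}")
```

With these made-up frequencies the ratio comes out to roughly 10,000, meaning the evidence would be about 10,000 times more probable if the suspect were the source than if an unrelated random person were. Mixtures with several contributors, like the one on the Syracuse handgun, call for far more elaborate statistical modeling than this sketch.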

Developers of tools like TrueAllele say that they remove human bias from the equation, delivering accurate, consistent results with the exactitude and cold remove of a calculator.

But critics worry that they undermine an important aspect of due process. The accused, defense attorneys, judges and jurors typically don’t have access to the tools’ often proprietary inner workings and, thus, the ability to question their conclusions. As one attorney wrote in a brief arguing that TrueAllele’s developer should have to reveal and explain its source code, “The Petitioner cannot cross-examine a computer.”

In the Syracuse case, TrueAllele indicated that the DNA on the gun was a likely match to Frank Thomas, the 19-year-old who owned the car. Prosecutors had previously offered Thomas a deal if he pleaded guilty to a gun possession charge, but Thomas had maintained his innocence.

The TrueAllele analysis was the only physical evidence presented at trial connecting Thomas to the gun. Dr. Mark Perlin, TrueAllele’s developer, testified that a match between the DNA on the gun and Thomas’ DNA was “1.78 trillion times more probable than a coincidental match to an unrelated African American person” and “892 billion times more probable than a coincidental match to an unrelated Hispanic person.” The attorney for Thomas, who is black and Hispanic, pressed Perlin to share the tool’s source code so that his results could be independently verified. Perlin argued this was unnecessary and irrelevant.

In March, Thomas was found guilty of criminal possession of a weapon, reckless endangerment and menacing a police officer. He was sent to prison for 15 and a half years. He’s appealing his conviction.

For the past year, ProPublica has been investigating and reverse-engineering various algorithms as part of a series called “Machine Bias.” We’ve found that these complex pieces of software are helping to guide decisions in an ever-growing number of realms, including criminal justice, in ways that are often little understood and sometimes unfair.

DNA evidence is the gold standard of forensic science. Even as other techniques, from bite-mark analyses to fire patterns, have come under question, DNA has remained the most unassailable and most objective form of proof that someone did, or did not, commit a crime.

The emergence of algorithmic analysis programs, however, is creating a new frontier of DNA science. The tools are so new and expensive that only a handful of local crime labs use them regularly. But as law enforcement looks to DNA more and more frequently to solve even minor crimes, that seems almost certain to change.

Perlin says that, while he resists turning over code, he takes pains to demonstrate how TrueAllele works when it’s used in a criminal trial, giving attorneys and judges access to test the software themselves. “‘Here’s the car, here’s the keys -- drive it,’” he said he tells them.

Perlin started building TrueAllele for casework in 1999, a few years after working on the Human Genome Project. He has a bachelor’s degree in chemistry, PhDs in math and computer science, and a medical degree. In the early 2000s, his company helped clear the backlog of DNA samples waiting to be interpreted for the government databank in the UK and later used TrueAllele to help identify victims’ remains at the World Trade Center site after September 11. TrueAllele was used for the first time in a criminal case in 2009 and now encompasses some 170,000 lines of computer code.

Cybergenetics offers police, prosecutors and defenders an appealing business model: It takes on their most difficult DNA cases and provides preliminary results for free. If those results indicate a likely statistical match, customers pay only when they want Cybergenetics to run a complete analysis and write a report that can be used at trial. Cybergenetics also licenses its software for crime labs to use themselves. Labs in the Commonwealth of Virginia, Baltimore, Kern County in California, and Beaufort and Richland counties in South Carolina all license TrueAllele.

“Our laboratory does a lot of property crime, which involves a lot of weak samples and mixtures,” said John Barron, senior forensic scientist at the Richland County Sheriff’s Department. “It’s a more complete analysis of the mixture versus manually using [conventional DNA] thresholds, so it’s fairer to both the prosecution and the defense. We use it quite a bit.”

Since TrueAllele came on the scene, other companies have developed software to compete with it. The U.S. Army and the FBI use STRmix, developed by a New Zealand-Australia collaborative and sold in the U.S. by NicheVision, as do several public crime labs across the nation. New York City’s Office of the Chief Medical Examiner recently announced that it will switch to STRmix in 2017.

In recent years, these powerful tools have enabled prosecutors to make cases with evidence that would have otherwise been difficult or impossible to interpret. TrueAllele solved a string of armed robberies from “touch” DNA swabbed from a store counter. STRmix solved another robbery by analyzing the sweat inside a sneaker.

The software isn’t only a tool for prosecutors: The Indiana Innocence Project used TrueAllele to help free a man who had been in prison since 1991 for a violent rape that DNA proved he did not commit.

Still, probabilistic genotyping remains on the outer edge of scientific acceptance. The White House released a report in September by the President’s Council of Advisors on Science and Technology (PCAST) that called probabilistic genotyping an improvement over traditional methods of analyzing complex mixtures of DNA, but concluded the tools “still require scientific scrutiny.”

Studies have established the validity of the available software only in certain circumstances (such as a DNA mixture of three contributors), the report asserted. The authors cite a case in upstate New York in which TrueAllele and STRmix were used to analyze the same DNA data and came to different conclusions. (The judge in that case ultimately did not admit the DNA evidence at trial.)

The PCAST report also noted that independent research is especially needed. Most of the studies published on TrueAllele and STRmix in peer-reviewed journals have been done by the developers of the tools.

“Appropriate evaluation of the proposed methods should consist of studies by multiple groups, not associated with the software developers, that investigate the performance and define the limitations of programs by testing them on a wide range of mixtures with different properties,” the PCAST report says.

Perlin, TrueAllele’s creator, and John Buckleton, one of the creators of STRmix, both objected. “Your Report cannot unilaterally impose a novel notion of ‘independent authorship’ for peer-review,” wrote Perlin in an open letter, explaining that having a developer as part of a team of authors is the norm in scientific publishing. Buckleton wrote that the internal validation studies performed by jurisdictions using STRmix should be proof enough that it works.

Some makers of probabilistic genotyping software allow other programmers to use and modify their code. LRmix, software created by a pair of scientists in the Netherlands, EuroForMix, created by a Norwegian team, and Lab Retriever, a non-commercial program available under the Creative Commons license and uploaded to GitHub, are among the free, open-source tools available.

Beyond offering transparency, this approach can help expose problems. A significant bug was discovered and fixed in LikeLTD, an open-source Australian probabilistic genotyping program, because of outside scrutiny.

But TrueAllele and STRmix remain proprietary. A coding error in STRmix was only discovered in the midst of a criminal trial where prosecutors sought to include its faulty results as evidence. (Its makers say the error was minor and was quickly fixed.)

Defendants’ requests to get access to TrueAllele’s source code have consistently been denied, leading the Electronic Privacy Information Center, an advocacy group, to kick off a FOIA campaign to obtain whatever information is publicly available from the jurisdictions that use it.

Some who advocate for defendants see plenty of upside to probabilistic genotyping tools, even without the benefits of full transparency. Greg Hampikian, a professor of biology at Boise State University who leads the Idaho Innocence Project, said the project has begun using TrueAllele to help exonerate wrongly convicted people.

“Microsoft Excel doesn’t release its code either, but we can test it and see that it works, and that’s what we care about,” Hampikian said.

Judges have endorsed the business interests cited by makers of probabilistic genotyping software in ruling that they do not have to hand over their source code.

The defendant in the first case using a TrueAllele analysis appealed his conviction, based in part on his inability to understand enough about the software to challenge it. Pennsylvania Judge Jack Panella denied the appeal, saying the defendant had no right to the formula behind the software. “TrueAllele is proprietary software; it would not be possible to market TrueAllele if it were available for free,” Panella wrote.

Dr. Dan Krane, a professor of biology at Wright State University and frequent expert witness, said he figured defendants’ right to confront their accusers would outweigh companies’ right to make money.

“I suppose these are both Constitutional principles, but I thought one would trump the other,” Krane said. “And that’s not what’s happening here.”
