Can big data predict violent behavior?


A team from Harvard Medical School is applying machine learning models to U.S. Army personnel datasets to predict violent behavior.

Using machine learning to analyze more than three terabytes of data on military personnel, researchers at Harvard Medical School were able to predict the 5 percent of U.S. Army soldiers who later committed one-third of all violent workplace crimes between 2004 and 2009. When applied to data from 2011 to 2013, the machine-learning model was even more accurate, predicting the 5 percent of soldiers who would commit 50.5 percent of violent crimes.
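The "5 percent who committed one-third" figures describe how concentrated the outcomes are among the people a model ranks as highest risk. The following is a minimal sketch of how such a share could be computed, not the researchers' code; the function name `top_fraction_capture` and the toy data are assumptions made for illustration.

```python
def top_fraction_capture(scores, outcomes, fraction=0.05):
    """Share of all events found among the highest-scoring `fraction` of people."""
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    total_events = sum(outcomes)
    if total_events == 0:
        return 0.0
    # Rank people from highest to lowest predicted risk.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    # Always inspect at least one person, even in very small samples.
    cutoff = max(1, int(len(ranked) * fraction))
    captured = sum(outcome for _, outcome in ranked[:cutoff])
    return captured / total_events


# Toy data (hypothetical): 20 people and 3 events. The one person in the
# top 5 percent by score accounts for one of the three events.
scores = [0.9] + [0.1] * 19
outcomes = [1] + [0] * 17 + [1, 1]
share = top_fraction_capture(scores, outcomes)
print(round(share, 3))
```

In this toy case the top 5 percent captures one-third of the events, which mirrors the shape of the 2004 to 2009 result without reproducing any real data.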

That research, sponsored by the Department of Defense, fused data from several military datasets, including the Army’s Study to Assess Risk and Resilience in Servicemembers.

The team created a consolidated dataset that drew on many different military sources, including crime data, deployment data and the soldiers’ electronic medical records, said Ronald Kessler, professor of health care policy at HMS and the principal investigator on the project.

According to Kessler, simply getting the data ready for analysis was a major part of the job.  “With these big data jobs, 85 percent of the work is usually data management and the other 15 percent is the data analysis,” he said.  In fact, the project had four people working for three to four years just preparing the data.  When different agencies and offices have collected data over decades, Kessler noted, “variables and names get changed.” And the people who originally structured the data are either gone or have forgotten how it was put together.
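One small piece of that data management work, reconciling variables whose names changed over the years, might look something like the sketch below. Every column name here is hypothetical, and the approach (a lookup table of legacy names) is an assumption about how such cleanup could be done, not a description of the project's pipeline.

```python
# Hypothetical mapping from legacy column names to a single canonical name.
CANONICAL_NAMES = {
    "rank_cd": "rank_code",
    "RANK": "rank_code",
    "deploy_dt": "deployment_date",
    "DeploymentDate": "deployment_date",
}


def harmonize(record, mapping=CANONICAL_NAMES):
    """Return a copy of `record` with known legacy keys renamed.

    Keys that are not in the mapping pass through unchanged. If two legacy
    keys map to the same canonical name with different values, raise an
    error rather than silently picking one.
    """
    cleaned = {}
    for key, value in record.items():
        canonical = mapping.get(key, key)
        if canonical in cleaned and cleaned[canonical] != value:
            raise ValueError(f"conflicting values for {canonical!r}")
        cleaned[canonical] = value
    return cleaned


old_style = {"RANK": "E4", "soldier_id": 7}
new_style = {"rank_code": "E4", "soldier_id": 7}
print(harmonize(old_style) == harmonize(new_style))
```

Raising on conflicts, instead of overwriting, keeps a human in the loop for exactly the cases where nobody remembers how the data was put together.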

“Now it’s a little more organized,” he added.  “We have a couple of people who do only data management – updating the data, cleaning the data – and who don’t do any statistical analysis at all.”

As for the statistical analysis, the massive amount of data itself presents a challenge. The project is interested not just in soldiers’ personal data and crime data; it also takes note of when things happen. Accordingly, the unit of analysis is a “person-month.” With data on both 975,000 regular Army soldiers and 750,000 members of the National Guard and Reserves collected over five years, that means 32 million person-months in the project file. “And in each of those 32 million person-months, we have a couple thousand variables,” said Kessler. “So for a given year, it’s 32 million records, times 2,000 variables, times 12 months.”
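To make the person-month idea concrete, here is a small sketch that expands one person's service interval into one record per calendar month. The function name, the identifier "S001" and the dates are invented for illustration. The last lines take "a couple thousand" as exactly 2,000 and multiply it by the 32 million person-months quoted above, purely as a rough size estimate.

```python
from datetime import date


def person_months(person_id, start, end):
    """Yield one (person_id, year, month) record per calendar month,
    from the month of `start` through the month of `end`, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield (person_id, year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1


# Hypothetical soldier in service from mid-November 2004 to early February 2005.
records = list(person_months("S001", date(2004, 11, 15), date(2005, 2, 3)))
print(records)
# [('S001', 2004, 11), ('S001', 2004, 12), ('S001', 2005, 1), ('S001', 2005, 2)]

# Rough size of the full table, assuming exactly 2,000 variables per record.
cells = 32_000_000 * 2_000
print(f"{cells:,} values")
# 64,000,000,000 values
```

Each row can then carry the variables as they stood in that month, which is what lets the analysis account for timing.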

And each variable changes not only with each individual, but over time.  “Did you just get demoted?” Kessler asked, rhetorically.  “Being demoted could be a risk factor, but it’s not a risk factor for the rest of your life.  It’s maybe only a risk factor in the month after a demotion.  That’s the highest risk, but then it goes down with time.”
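A risk factor that peaks near an event and then fades can be encoded as a feature that shrinks with the number of months elapsed. The sketch below uses exponential decay with a three-month half-life; both the decay shape and the half-life are assumptions chosen for illustration, since the article does not say how the team modeled this.

```python
def months_since(event_ym, current_ym):
    """Whole months from one (year, month) pair to another.

    The result is negative if the event lies in the future."""
    return (current_ym[0] - event_ym[0]) * 12 + (current_ym[1] - event_ym[1])


def decayed_indicator(event_ym, current_ym, half_life_months=3.0):
    """Feature equal to 1.0 in the event month that halves every
    `half_life_months` afterwards, and 0.0 before the event."""
    elapsed = months_since(event_ym, current_ym)
    if elapsed < 0:
        return 0.0  # the event has not happened yet in this person-month
    return 0.5 ** (elapsed / half_life_months)


# Hypothetical demotion in November 2008, evaluated in several person-months.
demotion = (2008, 11)
for ym in [(2008, 10), (2008, 11), (2009, 2), (2009, 11)]:
    print(ym, round(decayed_indicator(demotion, ym), 4))
```

A model fed this feature sees a strong signal right after the demotion and almost none a year later, matching the intuition in Kessler's example.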

The specific machine-learning algorithms applied to the data depend on a number of factors.  According to Kessler, there are more than 50 algorithms available and none of them are appropriate for every application, “so the issue is selecting the right one for an application.”

“You have such an enormous number of variables, you’ll always find something that predicts anything,” said Kessler.  “The question is how stable it is.” 

Accordingly, the team had to do a great deal of cross-validation. “It takes a lot of computer time,” he noted. “It might take three or four days for something to converge.”
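The idea behind cross-validation is to check whether a predictor holds up on data that was set aside, not just on the data where it was found. The sketch below is a generic k-fold split written with the standard library only. It is not the team's setup, and the synthetic predictor, outcome rates and helper names are all assumptions made for illustration.

```python
import random


def k_fold_splits(n_records, k, seed=0):
    """Yield (train_indices, test_indices) for k non-overlapping folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def event_rate_gap(indices, predictor, outcome):
    """Outcome rate where the predictor is 1 minus the rate where it is 0."""
    exposed = [outcome[i] for i in indices if predictor[i] == 1]
    unexposed = [outcome[i] for i in indices if predictor[i] == 0]
    if not exposed or not unexposed:
        return None  # undefined when a fold lacks one of the two groups
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


# Synthetic data: a binary predictor that raises the outcome rate
# from 10 percent to 30 percent (made-up numbers).
rng = random.Random(1)
predictor = [rng.randint(0, 1) for _ in range(1000)]
outcome = [1 if rng.random() < (0.3 if p else 0.1) else 0 for p in predictor]

splits = list(k_fold_splits(len(predictor), k=5))
gaps = [event_rate_gap(test, predictor, outcome) for _, test in splits]
print(gaps)
```

If the gap keeps roughly the same sign and size in every held-out fold, the predictor looks stable; if it swings wildly or flips sign, it was probably a chance finding.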

What the project is finding at this stage, Kessler stressed, is correlation, not causation.  “If we discover that being a left-handed midget is a significant predictor of suicide – which, by the way, it’s not – then that’s in the model, as long as it’s a stable predictor,” he said.  “We’re trying to be agnostic as to what causes what.”

It’s important to be cautious in interpreting causal links, Kessler stressed.  “We’ve tried to play down the importance of any one predictor.”

Even within the 5 percent of soldiers at highest risk for committing violent crimes, there could be three or four subgroups.  A 17-year-old who came into the Army from a bad family environment and a soldier with 28 years of service who had never been married and who was involuntarily discharged may both be at risk, he said, but “the issues for those two people are very different.”

The next step in the project is to understand how those subgroups differ by sharpening the understanding of causal links. “We want to be able to help the clinicians identify those at most risk,” said Kessler.

The project will move beyond assessing risk for violence within the military.  In the future, for example, Kessler said he expects to apply the methods being developed to finding appropriate treatments for depression.  “With depression, there’s no one kind of treatment that works best for every person,” he said.  “Can we, without having to go through long trial and error, figure out which kinds of things are going to work for which kinds of patients?”

That would be especially valuable, he noted, because part of the problem with depression is that individuals give up on treatments easily when they don’t produce results. “The vast majority of patients will be helped eventually by some treatment,” Kessler said. “We’re trying to develop the same kind of complicated models [as in the military project] to use information about the individual to pick the best bet for a treatment.”
