Sponsor content
Textual Chemistry: Getting to Know Citizens Through Their Words
Presented by Teradata
How applying analytics to unstructured data can help state and local governments better understand (and serve) citizens
Some of history’s greatest conundrums might have been avoided if humans were just a little better at one small task: understanding one another.
Despite gradual progress, disconnects between what citizens want and what leaders perceive them to want still cause inefficiencies in government. Fortunately, that gap is starting to close.
State and local governments have long known that they sit on troves of valuable information. This is big data, and it’s not a new concept. Some of that information, though, filed away in the recesses of administrative buildings, holds the secret to better understanding: text.
“Text tells what’s on people’s minds, and that’s why it’s got value,” says Mark Turner, Senior Analytic Data Scientist at Teradata.
Unstructured data—information that lacks the computability of numerical data but is descriptive and insightful despite its complexity—is an oft-overlooked asset in many state and local governments. It can range in form from survey comments to e-mail correspondence to public posts on social media. In the context of a municipality, it’s where citizens express themselves.
But state and local governments can analyze this data and fit it into a structure that produces value.
“When it comes to ingesting text, the first step is to turn it into something that can be analyzed,” says Peeter Kivestu, Industry Consultant at Teradata. “By doing text analysis, governments can get insights into what’s being talked about in the same way you can sum up and analyze numbers or statistics.”
At ground level, the work starts with frequency analysis, a very literal count of the most commonly used words in any set of textual data. At this point, Turner says, the data is crude. For example, a first-run frequency analysis might indicate that a large number of citizens use the word “dogs” in e-mails to city government.
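As a rough illustration of that first pass, the sketch below counts word frequencies across a handful of messages. It is plain Python, not the Teradata Aster product, and the sample messages and function names are invented for the example.

```python
import re
from collections import Counter

# Hypothetical sample messages, invented for illustration only.
MESSAGES = [
    "The dogs in the park are off leash again.",
    "Hot dogs at the ballpark cost too much.",
    "I love the new dog park, and so do my dogs.",
]


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(messages):
    """Count how often each word appears across all messages."""
    counts = Counter()
    for message in messages:
        counts.update(tokenize(message))
    return counts


freq = word_frequencies(MESSAGES)
print(freq.most_common(5))
```

In this toy sample, "dogs" ranks near the top, but so does "the". That is the crudeness Turner describes: a raw count cannot yet say which words matter or what they refer to.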
The analysis grows from there.
The technology, the Teradata Aster product in this case, can then indicate which multiple-word phrases appear most often. If “hot dogs” appears with a similarly high frequency, it’s reasonable to assume citizens are talking about food, not canines.
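One simple way to surface multiple-word phrases is to count adjacent word pairs, often called bigrams. The sketch below assumes that approach; the article does not say how Aster does it internally, and the messages and helper names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample messages, invented for illustration only.
MESSAGES = [
    "Hot dogs at the ballpark cost too much.",
    "The hot dogs downtown are great.",
    "My dogs love the park.",
]


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bigram_frequencies(messages):
    """Count adjacent word pairs within each message."""
    counts = Counter()
    for message in messages:
        tokens = tokenize(message)
        # zip pairs each token with its successor: (t0, t1), (t1, t2), ...
        counts.update(zip(tokens, tokens[1:]))
    return counts


bigrams = bigram_frequencies(MESSAGES)
print(bigrams.most_common(3))
```

Here the pair ("hot", "dogs") accounts for two of the three uses of "dogs", which is the kind of evidence that points toward food rather than pets.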
Then, through sentiment analysis, textual data can start to take quantifiable shape. The words “love” and “despise,” for example, can be reasonably classified to show positive and negative sentiment.
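A minimal version of this idea is a word list: tally positive and negative words in each message and compare the totals. The sketch below assumes tiny, made-up word lists. Real sentiment lexicons are far larger, and this approach ignores negation such as "do not love".

```python
import re

# Tiny illustrative word lists; assumptions for this sketch, not a real lexicon.
POSITIVE = {"love", "great", "enjoy"}
NEGATIVE = {"despise", "hate", "awful"}


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return positive word count minus negative word count."""
    tokens = tokenize(text)
    positives = sum(1 for token in tokens if token in POSITIVE)
    negatives = sum(1 for token in tokens if token in NEGATIVE)
    return positives - negatives


def classify(text):
    """Label a message as positive, negative, or neutral by its score."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for message in ["I love the new park.", "I despise hot dogs.", "The meeting is on Tuesday."]:
    print(classify(message), "-", message)
```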
“Once we’ve classified a certain set of messages, the next step is finding what they’re positive or negative about,” Turner says. This happens through association. In the hot dog example, results could show the target phrase “hot dogs” appearing with high frequency alongside “despise,” an indication that a given population has something against America’s most beloved mystery meat.
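Association of this kind can be approximated by checking which sentiment words show up in the same message as a target phrase. The sketch below assumes message-level co-occurrence as the measure; the word lists, messages, and function names are hypothetical, and a production system would likely use something more refined.

```python
import re
from collections import Counter

# Tiny illustrative word lists; assumptions for this sketch, not a real lexicon.
POSITIVE = {"love", "enjoy"}
NEGATIVE = {"despise", "hate"}

# Hypothetical sample messages, invented for illustration only.
MESSAGES = [
    "I despise hot dogs.",
    "We hate the hot dogs at the stadium.",
    "I love my dogs.",
    "Hot dogs are fine.",
]


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def contains_phrase(tokens, phrase_tokens):
    """Return True if phrase_tokens appears as a consecutive run in tokens."""
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens for i in range(len(tokens) - n + 1))


def sentiment_by_target(messages, target):
    """Tally sentiment words in messages that mention the target phrase."""
    target_tokens = tokenize(target)
    counts = Counter()
    for message in messages:
        tokens = tokenize(message)
        if not contains_phrase(tokens, target_tokens):
            continue  # Skip messages that never mention the target.
        for token in tokens:
            if token in POSITIVE:
                counts["positive"] += 1
            elif token in NEGATIVE:
                counts["negative"] += 1
    return counts


result = sentiment_by_target(MESSAGES, "hot dogs")
print(dict(result))
```

In this sample, the message about loving "my dogs" is excluded because it never mentions the target phrase, so the tally for "hot dogs" reflects only the two negative messages.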
What is initially a rough sketch becomes a statistically significant picture in the aggregate.
“These analytics find the important data—the signal—and ignore the unimportant parts—the noise,” Turner says.
That signal is the key to state and local governments understanding their constituencies, and it’s what takes the implications of textual data far beyond ballpark franks. For their citizens’ collective thoughts on roadwork, policy or other city events, leaders need only look to the public forum to find an opportunity for analysis.
“Enormous insight can come from starting small,” Kivestu says. “Look at two pieces of information together that you never have before, and go from there.”
Analytics projects can start with an existing problem, but most often they arise from questions about results and outliers when two pieces of data are brought together. These engagements are learning processes for those involved, Kivestu says, so newcomers should not be afraid of trying, because they’re likely to uncover questions no one knew to ask before.
Though most state and local governments are harboring a wealth of information, the process doesn’t require an overwhelming amount of data to turn out results. What it does require is curiosity and a willingness to inquire in ways that might be new or unfamiliar.
“Analytics is all about asking, what’s the question? And the next question? And the next question after that?” Kivestu says.
It won’t be long before you’ve found insights that push you closer to an answer.
This content is made possible by our sponsor. The editorial staff of Route Fifty was not involved in its preparation.