Harnessing the power of machine learning for improved decision-making
Using both supervised and unsupervised learning, analysts can find new patterns in their data and validate those patterns against mission relevance.
Across government, IT managers are looking to harness the power of artificial intelligence and machine learning techniques (AI/ML) to extract and analyze data to support mission delivery and better serve citizens.
Practically every large federal agency is executing some type of proof of concept or pilot project related to AI/ML technologies. The government’s AI toolkit is diverse and spans the federal administrative state, according to a report commissioned by the Administrative Conference of the United States (ACUS). Nearly half of the 142 federal agencies canvassed have experimented with AI/ML tools, the report, Government by Algorithm: Artificial Intelligence in Federal Administrative Agencies, states.
Moreover, AI tools are already improving agency operations across the full range of governance tasks, including enforcing regulatory mandates, adjudicating government benefits and privileges, monitoring and analyzing risks to public safety and health, providing weather forecasting information and extracting information from the trove of government data to address consumer complaints.
Agencies with mature data science practices are further along in their AI/ML exploration. However, because agencies are at different stages in their digital journeys, many federal decision-makers still struggle to understand AI/ML. They need a better grasp of the skill sets and best practices required to use AI/ML tools to derive meaningful insights from data.
Understanding how AI/ML works
AI mimics human cognitive functions such as the ability to sense, reason, act and adapt, giving machines the ability to act intelligently. Machine learning is a component of AI that involves training algorithms or models, which then make predictions about data they have not yet observed. ML models are not programmed like conventional algorithms. They are trained using data -- such as words, log data, time series data or images -- and make predictions about which actions to perform.
Within the field of machine learning, there are two main types of tasks: supervised and unsupervised.
With supervised learning, data analysts have prior knowledge of what the output values for their samples should be. The AI system is specifically told what to look for, so the model is trained until it can detect underlying patterns and relationships. For example, an email spam filter is a machine learning program that can learn to flag spam after being given examples of spam emails that are flagged by users and examples of regular non-spam emails. The examples the system uses to learn are called the training set.
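As an illustration of the spam filter idea, here is a minimal sketch of supervised learning using only the Python standard library. It assumes a tiny, made-up training set and a simple naive Bayes word-count model; the emails, the labels "spam" and "ham", and the function names are all hypothetical, and a real filter would use far more data and a tested library.

```python
import math
from collections import Counter

# Hypothetical labeled training set: (email text, label) pairs.
TRAINING_SET = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("limited offer win cash now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("budget report attached for review", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def predict(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-score."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_examples = sum(label_counts.values())
    scores = {}
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_examples)
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, label_counts = train(TRAINING_SET)
print(predict("claim your free cash prize", word_counts, label_counts))
print(predict("agenda for the budget meeting", word_counts, label_counts))
```

The key point is that nobody writes a rule such as "flag emails containing the word prize". The model infers which words signal spam from the labeled examples alone.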
Unsupervised learning looks for previously undetected patterns in a dataset with no pre-existing labels and with a minimum of human supervision. For instance, data points with similar characteristics can be automatically grouped into clusters for anomaly detection, such as in fraud detection or identifying defective mechanical parts in predictive maintenance.
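The grouping idea can be sketched in a few lines. The example below is a deliberately simple stand-in for a real clustering algorithm: it groups one-dimensional values wherever they sit close together and treats any value left in a cluster by itself as a possible anomaly. The transaction amounts, the `max_gap` threshold and the function names are assumptions made for illustration.

```python
def cluster_by_gap(values, max_gap):
    """Group sorted values; start a new cluster when the gap exceeds max_gap."""
    clusters = []
    for value in sorted(values):
        if clusters and value - clusters[-1][-1] <= max_gap:
            clusters[-1].append(value)
        else:
            clusters.append([value])
    return clusters


def find_anomalies(values, max_gap, min_cluster_size=2):
    """Return values that ended up in clusters smaller than min_cluster_size."""
    clusters = cluster_by_gap(values, max_gap)
    return [v for cluster in clusters if len(cluster) < min_cluster_size
            for v in cluster]


# Hypothetical transaction amounts in dollars; no labels are supplied.
amounts = [102.0, 98.5, 101.2, 99.9, 250.3, 248.7, 251.0, 5000.0]

clusters = cluster_by_gap(amounts, max_gap=10.0)
anomalies = find_anomalies(amounts, max_gap=10.0)
print(clusters)
print(anomalies)
```

No one told the program which amounts were normal. The groups emerge from the data, and the isolated value is surfaced for a human to review.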
Supervised, unsupervised in action
It is not a matter of which approach is better. Both supervised and unsupervised learning are needed for machine learning to be effective.
Both approaches were applied recently to help a large defense financial management and comptroller office resolve over $2 billion in unmatched transactions in an enterprise resource planning system. Many tasks required significant manual effort, so the organization implemented a robotic process automation (RPA) solution to automatically access data from various financial management systems and process transactions without human intervention. However, RPA fell short when data variances exceeded the tolerance for matching data and documents, so AI/ML techniques were used to resolve the unmatched transactions.
The data analyst team used supervised learning, drawing on the preexisting rules that had resulted in these transactions. The team was then able to provide additional value by applying unsupervised ML techniques to find patterns in the data that they had not previously been aware of.
To see how AI/ML can help agencies better manage data, it is worth considering these three steps:
- Start with existing domain-specific knowledge. This includes processes and rules that employees handle manually. The analysts working with the defense comptroller office knew transactions open for more than five days were a problem that had to be addressed. From that knowledge, they built a model to find new features for identifying unmatched transactions.
- Automate to find new patterns. Applying automation and unsupervised learning, the team found that reference numbers were incorrect, something they were not aware of at the outset of the project.
- Validate patterns against business relevance. Determine which ones are valuable from a business perspective. Each pattern must be validated against common-sense checks, because a pattern may be a statistical anomaly. Analysts might discover many patterns, but that does not mean all of them are valuable.
Data analysts should think of these steps as a continuous loop. If the output from unsupervised learning is meaningful, they can incorporate it into the supervised learning modeling. Thus, they are involved in a continuous learning process as they explore the data together.
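One pass through the three steps might look like the sketch below. It is a simplified illustration, not the method the comptroller team used: the records, field names (`days_open`, `ref_ok`, `system`) and thresholds are all hypothetical, and a plain frequency comparison stands in for the unsupervised pattern search.

```python
# Hypothetical transaction records; every field name and value is illustrative.
RECORDS = [
    {"id": 1, "days_open": 2, "ref_ok": True, "system": "A"},
    {"id": 2, "days_open": 7, "ref_ok": False, "system": "A"},
    {"id": 3, "days_open": 9, "ref_ok": False, "system": "B"},
    {"id": 4, "days_open": 1, "ref_ok": True, "system": "B"},
    {"id": 5, "days_open": 6, "ref_ok": False, "system": "A"},
    {"id": 6, "days_open": 3, "ref_ok": True, "system": "A"},
    {"id": 7, "days_open": 8, "ref_ok": True, "system": "B"},
    {"id": 8, "days_open": 10, "ref_ok": True, "system": "C"},
]


def is_problem(record, max_days_open=5):
    """Step 1: domain rule. Transactions open too long need attention."""
    return record["days_open"] > max_days_open


def find_patterns(records, features):
    """Step 2: for each feature value, measure how often it is a problem."""
    patterns = []
    for feature in features:
        for value in sorted({r[feature] for r in records}, key=str):
            group = [r for r in records if r[feature] == value]
            rate = sum(is_problem(r) for r in group) / len(group)
            patterns.append({"feature": feature, "value": value,
                             "support": len(group), "problem_rate": rate})
    return patterns


def validate(patterns, baseline_rate, min_support=3, min_lift=1.5):
    """Step 3: keep patterns that are frequent and well above the baseline."""
    return [p for p in patterns
            if p["support"] >= min_support
            and p["problem_rate"] >= min_lift * baseline_rate]


baseline = sum(is_problem(r) for r in RECORDS) / len(RECORDS)
candidates = find_patterns(RECORDS, ["ref_ok", "system"])
kept = validate(candidates, baseline)
for pattern in kept:
    print(pattern)
```

In this made-up data, system "C" has a perfect problem rate but only one record, so the support check rejects it as a possible fluke. A surviving pattern would still need review by someone who knows the business before it is fed back into the supervised model as a new feature.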
Avoiding pitfalls
It is important for IT teams to realize they cannot just feed data into machine learning models and expect useful results, especially with unsupervised learning, which is a little more art than science. That is where humans really need to be involved. Analysts should also avoid overfitting models in an attempt to derive too much insight.
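Overfitting is easiest to see by comparing accuracy on the training data with accuracy on data the model has not seen. The sketch below uses invented one-dimensional data in which two training labels are deliberately wrong. A model that memorizes every training point (nearest neighbor) is compared with a simple single-threshold model; all names and values are assumptions for illustration.

```python
# Hypothetical 1-D data. The underlying rule is "label is 1 when x >= 5".
# Two training labels (x=3 and x=8) are deliberately wrong to simulate noise.
train = [(1, 0), (2, 0), (3, 1), (3.5, 0), (4, 0),
         (6, 1), (7, 1), (8, 0), (9, 1)]
held_out = [(1.5, 0), (2.9, 0), (3.2, 0), (6.5, 1), (7.9, 1), (8.3, 1)]


def nearest_neighbor_predict(x, data):
    """Memorizing model: copy the label of the closest training point."""
    return min(data, key=lambda point: abs(point[0] - x))[1]


def fit_threshold(data):
    """Simple model: pick the single cut-off with the best training accuracy."""
    candidates = sorted(x for x, _ in data)
    return max(candidates,
               key=lambda t: sum((x >= t) == bool(y) for x, y in data))


def accuracy(predict, data):
    """Fraction of examples in data that predict labels correctly."""
    return sum(predict(x) == y for x, y in data) / len(data)


threshold = fit_threshold(train)


def memorizer(x):
    return nearest_neighbor_predict(x, train)


def simple(x):
    return int(x >= threshold)


for name, model in [("memorizer", memorizer), ("threshold", simple)]:
    print(name,
          "train:", round(accuracy(model, train), 2),
          "held-out:", round(accuracy(model, held_out), 2))
```

The memorizing model scores perfectly on the training data because it has also learned the two noisy labels, and it does worse than the simpler model on the held-out examples. Checking models against data set aside in advance is one practical guard against reading too much into the training set.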
Remember: AI/ML and RPA are meant to augment humans in the workforce, not to replace people with autonomous robots or chatbots. To be effective, agencies must strategically organize around the right people, processes and technologies to harness the power of innovative technologies such as AI/ML to achieve the performance they need at scale.