Data quality: Whose job is it?
A recent survey suggests a significant disconnect in many organizations between the people creating data and those managing it.
The IT department is usually held responsible for maintaining data quality, but the people who actually enter the data are not. “Data quality responsibility, for the most part, is not assigned to those directly engaged in its capture,” according to a survey on enterprise data quality by 451 Research.
When perceptions of the two groups -- those “accountable” and those “responsible” -- are misaligned, data quality suffers, the report said, and IT departments must employ multiple cleansing technologies to compensate, even as the volume of data grows.
Clean data is important for business. Sixty-five percent of respondents said that nearly half of business value can be lost to poor data quality, and 29 percent thought the consequences were even worse.
Bad data comes from many sources, but the biggest cause is improper data entry by employees. Data migration or conversion projects come in second, followed by mixed entries by multiple users, changes to source systems, system errors, data entry by customers and external data.
Problems caused by poor data quality are numerous. It forces staff to spend extra time reconciling data and contributes to extra costs, lost revenue, delays in deploying new systems and bad decision making.
Despite recognition of the problem, respondents’ confidence in their organization’s data quality management (DQM) practices is not high. Less than half (40 percent) of respondents to the survey were very confident that all the data sources required for their purposes had been aggregated prior to cleansing. Only 50 percent believed their organization’s data quality and DQM practices were either slightly better than satisfactory, or at least “good enough.”
And while data aggregation and cleansing might seem ripe for automation, the tools and practices currently employed for ensuring data quality vary widely. “Many respondents (37.5 percent) also reported that dependency management of any kind for analytics is not automated and involves manual effort,” noted the report. Others dealt with data quality issues after the fact -- fixing errors found after running reports. Another 8.5 percent of respondents avoided DQM completely, favoring a “hope for the best” approach.
The list of DQM tools and services most needed by IT staff closely matched the list of what’s being used: tools for big data, data validation/data cleansing, master data management and monitoring. However, the report noted, using a variety of general- and special-purpose tools can create additional complexity.
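For readers unfamiliar with what validation and cleansing tools actually do, the sketch below shows the kind of checks they automate: trimming stray whitespace, rejecting records that break simple rules and removing duplicates. The column names, the rules and the use of the pandas library are illustrative assumptions, not anything drawn from the 451 Research report.

```python
import pandas as pd

# Hypothetical records, chosen only to show typical validation/cleansing steps.
records = pd.DataFrame({
    "agency_id": ["  A-100", "A-101", None, "A-100"],
    "email": ["clerk@example.gov", "not-an-email", "admin@example.gov", "clerk@example.gov"],
    "amount": ["1200", "-50", "300", "1200"],
})

# Normalize text fields: trim whitespace so "  A-100" and "A-100" match.
records["agency_id"] = records["agency_id"].str.strip()

# Validate: flag rows that break simple rules (missing ID, malformed email,
# negative amount). Real DQM tools apply far richer, configurable rule sets.
records["amount"] = pd.to_numeric(records["amount"], errors="coerce")
valid = (
    records["agency_id"].notna()
    & records["email"].str.contains(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", na=False)
    & (records["amount"] >= 0)
)

# Cleanse: drop invalid rows and exact duplicates, then report what survived.
clean = records[valid].drop_duplicates()
print(f"kept {len(clean)} of {len(records)} rows")
```

Run on this toy data, the script keeps one of four rows: one record fails the email and amount rules, one is missing its ID, and one is a duplicate of the surviving row. Enterprise tools apply the same idea across many sources and rule sets, which is where the complexity the report describes comes from.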
The problem of managing data is only going to get worse. Nearly all respondents (95 percent) expected the number of data sources and the volume of data to increase this year. Almost 70 percent expected those volumes to grow by up to 70 percent, and nearly 30 percent expected growth of 75 percent to nearly 300 percent. This anticipated growth will only exacerbate DQM issues, the report said, “particularly if data quality is already less than satisfactory.”
The gap between those who enter the data and those responsible for its capture and use “leads to a lack of empathy between the two constituencies and thus, we suspect, largely accounts for the laissez-faire attitude of the respondents” regarding data quality management, the report concluded.
To close that gap and improve the overall quality of data, the relationship between the two groups must become more transparent.