What's holding back Hadoop?
A survey of data management experts finds that a shortage of qualified staff and the lack of a clearly defined business case are the top obstacles to deploying the open source big data solution.
Hadoop -- the open-source, distributed programming framework that relies on parallel processing to store and analyze both structured and unstructured data -- has been the talk of big data for several years now. And while a recent survey of IT, business intelligence and data warehousing leaders found that 60 percent will have Hadoop in production by 2016, deployment remains a daunting task.
TDWI -- which, like GCN, is owned by 1105 Media -- polled data management professionals in both the public and private sectors, who reported that staff expertise and the lack of a clear business case topped their list of barriers to implementation:
Barriers to implementation | Respondents who checked each category |
Inadequate skills or difficulty of finding skilled staff |
Lack of compelling business case |
Lack of business sponsorship |
Lack of data governance |
Security for Hadoop data |
Lack of metadata management |
Excessive hand coding required of Hadoop |
Cost of staffing Hadoop admin/development |
Cost of implementing a new technology |
Difficulty of architecting big data analytic system |
Immature support for ANSI-standard SQL |
Interoperability with existing systems or tools |
Software tools are few and immature |
Enterprise-class manageability |
Not enough information on how to get started |
Slow pace of hand-coded development |
Cannot make big data usable for end users |
Handling data in real time |
Existing user-defined DW architecture |
Poor quality of Hadoop data |
Software tools need higher-level language support |
Hadoop's high operational expenses |
Enterprise-class availability |
Other |
The respondents did, however, see a wide range of uses to justify the deployment efforts, including:
HDFS applications | Respondents who checked each category |
Complementary extension of a data warehouse |
Data exploration and discovery |
Data staging for data warehousing and data integration |
Data lake |
Queryable archive for non-traditional data |
Computational platform and sandbox for analytics |
Enterprise data hub (for both new and traditional data) |
Business intelligence (reporting, dashboards) |
Queryable archive for traditional enterprise data |
Operational data store (ODS) |
Repository for content, records management |
Operational application support (apps on Hadoop data) |
Don't know |
Other |
And just 6 percent said Hadoop deployments were not in their organization's plans at all:
[Chart: When do you expect to have HDFS in production?]
The full report, which also includes best practices and implementation trends, is available here.