Reality Check | Metadata's new name is TED
Commentary: The classic definition of metadata, "data about data," blurs the distinction between data and metadata. A better name for metadata might be "Targeted Extrinsic Description."
"WHAT'S IN A NAME? That which we call a rose by any other name would smell as sweet."
Excuse me, Juliet! I beg to differ with our young, star-crossed lover because names do matter. Poor naming can cause a lot of heartache. Just ask Moon Unit Zappa or Jermajesty Jackson (you can't make this stuff up). It can also cause confusion, uncertainty and misdirection.
That has been the case with metadata for the past decade. For example, what is the No. 1 metadata repository on the market? You might be shocked to learn there is no leader because homegrown databases are the most popular option. That fact hints at a problem in need of a better solution.
At a recent conference, I presented a controversial slide titled "Metadata is Not Data." The point is that our classic definition of metadata, "data about data," blurs the distinction between data and metadata. To demonstrate that distinction, let's examine my favorite example of metadata: the iPod. An iPod's purpose is to play music that is encoded in data files. Each song is encoded via an audio format in one file of data. The device can perform its primary function (playing music) without any metadata. The metadata is the extrinsic, or external, description of each distinct data resource. In this case, each file is a resource, and its metadata includes the name of the song, the genre of music, the band that created the song, and when it was created. In other words, metadata is what makes the data useful.
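The iPod example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, field names and sample songs are hypothetical placeholders, and `play` is a stand-in that does not decode real audio. The point it tries to show is that playback touches only the data, while choosing what to play touches only the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SongDescription:
    """Extrinsic description of one audio file (hypothetical field names)."""
    title: str
    artist: str
    genre: str
    year: int


@dataclass(frozen=True)
class Track:
    audio: bytes                                   # the data: encoded music
    description: Optional[SongDescription] = None  # the metadata: optional


def play(track: Track) -> int:
    """Stand-in for playback: returns the number of audio bytes consumed.

    It never looks at track.description.
    """
    return len(track.audio)


def pick_by_genre(library: list[Track], genre: str) -> list[Track]:
    """Choosing what to play relies entirely on the description."""
    return [t for t in library
            if t.description is not None and t.description.genre == genre]


# Placeholder library: the bytes and song details are made up.
library = [
    Track(b"\x00\x01\x02", SongDescription("Song A", "Band A", "jazz", 1959)),
    Track(b"\x03\x04", SongDescription("Song B", "Band B", "rock", 1971)),
    Track(b"\x05\x06\x07\x08"),  # playable, but undescribed
]
```

Calling `play(library[2])` works even though that track has no description, whereas `pick_by_genre(library, "jazz")` can never return it.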
Metadata is what helps you decide what songs you want to play at a particular moment. So it should be clear that there are significant differences between metadata and data in our example. But their names make you think they are similar. In fact, metadata has data in its name. The mistake here is that you don't differentiate concepts by their similarities; you differentiate them by their differences. A name that blurs those differences makes them harder to understand. The resulting ambiguity stalls progress in metadata design, use and commercial products.
Techies have a difficult time swallowing this pill because there is commonality between metadata and data. Both can be stored in a database, and they share a tradition whereby all the field names of our tables in a database (similar to column names in a spreadsheet) are called metadata. This granular distinction between the value of a data item (for example, the number 42) and the field name of that data item (for example, age) is explained by calling the raw number data and the field name metadata. This is a technical misunderstanding of data because it makes the incorrect assumption that data has no intrinsic structure. In an information-processing system, data does not and never will consist of random values. That would be called noise. Thus, by definition, data consists of enough intrinsic structure to achieve its primary function. And in this one basic misunderstanding lies the root cause of the ambiguity. Not understanding the intrinsic/extrinsic divide of data processing has delayed our ability to turn data into information.
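One way to picture the intrinsic/extrinsic divide is the minimal Python sketch below. It rests on assumptions: the record layout, the `height_cm` field and the keys in `description` are hypothetical, chosen only to extend the age-42 example. The field names and layout are what the program needs to read the bytes at all, so they belong to the data; the description sits outside and is never consulted during decoding.

```python
import struct

# Hypothetical record layout (an assumption for illustration): a 1-byte
# unsigned age followed by a 2-byte unsigned height in centimeters.
RECORD = struct.Struct("<BH")
FIELD_NAMES = ("age", "height_cm")


def decode(raw: bytes) -> dict[str, int]:
    """Interpret raw bytes using the record's intrinsic structure."""
    return dict(zip(FIELD_NAMES, RECORD.unpack(raw)))


raw = RECORD.pack(42, 180)   # three bytes of stored data
record = decode(raw)         # {'age': 42, 'height_cm': 180}

# Extrinsic description of the record as a resource (hypothetical keys and
# values). The code above decodes the bytes without ever consulting it.
description = {
    "subject": "example person record",
    "purpose": "illustration only",
}
```

Bytes that do not fit the layout, such as a lone `b"\x2a"`, make `decode` raise `struct.error`: without the intrinsic structure, the value 42 is just noise.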
So what is the solution to our clarity crisis? Let's begin with a better name for metadata. I propose TED, which stands for Targeted Extrinsic Description. Of course, this name always gets a chuckle because it's not a cool acronym and is presented with tongue firmly planted in cheek. Regardless of whether the acronym works, it accurately describes the concept of metadata. Either way, the search for a new name that embodies this concept must continue so we can improve metadata design as a distinct technical discipline. This is important because improving metadata design is the cornerstone of consumer-driven information production. And those information consumers (your employees and citizens) don't need more data. They need information.
Daconta (mdaconta@acceleratedim.com) is chief technology officer at Accelerated Information Management and former metadata program manager at the Homeland Security Department. His latest book is Information as Product: How to Deliver the Right Information to the Right Person at the Right Time.