DARPA's dielets and the prominence of provenance
DARPA is using data provenance to track counterfeit electronic parts infiltrating the military supply chain.
The Defense Advanced Research Projects Agency is looking for proposals to develop a “dielet,” an electronic component to authenticate the provenance of other electronic components. DARPA wants to use the tool to help track counterfeit electronic parts infiltrating the military supply chain.
DARPA’s project turns on the concept of provenance and the metadata necessary to capture it. Provenance metadata is a cornerstone of metadata collection for every digital object in an organization. Here’s why.
The most common definition of provenance is the origin or source of something. Provenance is also the history of ownership of an object, and it is especially used to establish the authenticity of works of art. Likewise, data provenance is the origin and ownership history of computerized data.
There are two main aspects of data provenance: ownership and usage. Metadata is used to capture the provenance of a particular data item that could be an individual file, archive, data set or data package. For example, a Word document may have a long chain of ownership via editors and reviewers before becoming a finished product. Capturing that lineage information establishes the provenance of that particular data object.
The purpose of the lineage metadata is to establish the authenticity of an artifact by understanding its chain of ownership. The art community has a long history of using provenance in exactly this way to establish the authenticity of artwork. Without the provenance metadata, there is no trust.
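The chain-of-ownership idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names (CustodyEvent, ProvenanceRecord), the field names and the sample people and dates are all hypothetical and do not come from any DARPA or standards document.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class CustodyEvent:
    """One link in the chain: who handled the object and in what role."""
    agent: str
    role: str        # e.g. "author", "editor", "reviewer" (hypothetical roles)
    timestamp: str   # ISO 8601 string, kept simple for this sketch


@dataclass
class ProvenanceRecord:
    """Lineage metadata attached to a single data object."""
    object_id: str
    events: List[CustodyEvent] = field(default_factory=list)

    def record(self, agent: str, role: str, timestamp: str) -> None:
        """Append one custody event to the object's lineage."""
        self.events.append(CustodyEvent(agent, role, timestamp))

    def chain_of_ownership(self) -> List[str]:
        """Return agents ordered by timestamp, origin first."""
        ordered = sorted(self.events, key=lambda e: e.timestamp)
        return [e.agent for e in ordered]

    def origin(self) -> str:
        """Return the earliest known agent, i.e. the source of the object."""
        return self.chain_of_ownership()[0]


# Hypothetical example: a Word document passing from author to reviewers.
report = ProvenanceRecord("report.docx")
report.record("bob", "editor", "2014-03-03T14:30:00")
report.record("alice", "author", "2014-03-01T09:00:00")
report.record("carol", "reviewer", "2014-03-05T11:15:00")

print(report.chain_of_ownership())
print(report.origin())
```

Sorting by timestamp rather than by insertion order means the chain still reads origin first when events are logged out of sequence, as they are in the example.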
The second important use case for provenance is capturing an artifact’s change process so that the artifact can be reconstructed at any point in its lifecycle. I am currently designing a system to meet this requirement.
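One simple way to meet a reconstruction requirement like this is to log every change and replay the log up to the desired version. The sketch below assumes the artifact can be modeled as a dictionary of fields; the ChangeLog class and its method names are hypothetical, and this is not a description of the system the author is designing.

```python
from typing import Any, Dict, List, Tuple

Change = Tuple[str, Any]  # (field name, new value)


class ChangeLog:
    """Records field-level changes so any past version can be rebuilt."""

    def __init__(self, initial: Dict[str, Any]) -> None:
        self._initial = dict(initial)      # version 0, copied defensively
        self._changes: List[Change] = []

    def apply(self, field_name: str, value: Any) -> int:
        """Log one change and return the version number it produces."""
        self._changes.append((field_name, value))
        return len(self._changes)

    def reconstruct(self, version: int) -> Dict[str, Any]:
        """Replay the first `version` changes on top of the initial state."""
        if not 0 <= version <= len(self._changes):
            raise ValueError(f"unknown version: {version}")
        state = dict(self._initial)
        for name, value in self._changes[:version]:
            state[name] = value
        return state


# Hypothetical artifact: metadata for a document moving through review.
log = ChangeLog({"title": "Draft", "status": "new"})
v1 = log.apply("status", "in review")
v2 = log.apply("title", "Final Report")

print(log.reconstruct(0))
print(log.reconstruct(v1))
print(log.reconstruct(v2))
```

Replaying a log keeps storage small but makes reconstruction slower as the history grows; storing periodic full snapshots is a common alternative when artifacts change often.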
To design this type of metadata, there are several data standards to choose from. In 2013, the W3C published a provenance data model (PROV) along with representations in the Web Ontology Language (OWL), a mapping to Dublin Core terms (a popular metadata vocabulary) and an Extensible Markup Language (XML) schema. This standard is compatible with the linked data models being used in several government transparency initiatives.
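To give a feel for the W3C model, the sketch below writes a few provenance statements as plain subject-predicate-object triples using terms from the PROV namespace (Entity, Agent, wasAttributedTo, wasDerivedFrom). The example.org identifiers and the derivation_chain helper are hypothetical, and a real system would more likely use an RDF library than Python tuples.

```python
from typing import List, Tuple

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace for this sketch

Triple = Tuple[str, str, str]

triples: List[Triple] = [
    # Three revisions of one document, each typed as a PROV Entity.
    (EX + "report-v1", RDF_TYPE, PROV + "Entity"),
    (EX + "report-v2", RDF_TYPE, PROV + "Entity"),
    (EX + "report-v3", RDF_TYPE, PROV + "Entity"),
    # The people responsible, typed as PROV Agents.
    (EX + "alice", RDF_TYPE, PROV + "Agent"),
    (EX + "bob", RDF_TYPE, PROV + "Agent"),
    # Ownership: who each revision is attributed to.
    (EX + "report-v1", PROV + "wasAttributedTo", EX + "alice"),
    (EX + "report-v2", PROV + "wasAttributedTo", EX + "bob"),
    (EX + "report-v3", PROV + "wasAttributedTo", EX + "alice"),
    # Lineage: each revision is derived from the previous one.
    (EX + "report-v2", PROV + "wasDerivedFrom", EX + "report-v1"),
    (EX + "report-v3", PROV + "wasDerivedFrom", EX + "report-v2"),
]


def derivation_chain(entity: str, statements: List[Triple]) -> List[str]:
    """Follow wasDerivedFrom links from an entity back to its origin."""
    parents = {s: o for s, p, o in statements if p == PROV + "wasDerivedFrom"}
    chain = [entity]
    while chain[-1] in parents and parents[chain[-1]] not in chain:
        chain.append(parents[chain[-1]])
    return chain


lineage = derivation_chain(EX + "report-v3", triples)
print(lineage)
```

Because the statements are ordinary triples, they can be merged with other linked data that uses the same identifiers, which is what makes the standard a fit for linked data publishing.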
Establishing lineage, pedigree and provenance is so important that DARPA is paying to deliver it even for our computing hardware. Your organization should follow suit in order to deliver trust, reconstruction and high quality for your information consumers.
Michael C. Daconta (mdaconta@incadencecorp.com or @mdaconta) is the Vice President of Advanced Technology at InCadence Strategic Solutions and the former Metadata Program Manager for the Homeland Security Department. His new book is titled The Great Cloud Migration: Your Roadmap to Cloud Computing, Big Data and Linked Data.