Linked data takes a leap forward
The new version of the W3C Resource Description Framework will help government data managers publish their data as effective and usable linked data.
On Feb. 25, the World Wide Web Consortium RDF working group released eight new “recommendations” (aka standards in W3C-speak) and four new explanatory “notes” for a new version (1.1) of the Resource Description Framework. RDF forms the foundation of linked open data, which, in turn, is the foundation for many government open data initiatives. Additionally, linked data is a graph data model that is the underpinning for social graphs, knowledge graphs and Microsoft’s new Office Graph.
So what does this mean for open data and government transparency?
First, here is a summary of what was released:
RDF 1.1 data model specifications. Three recommendations form the heart of the new RDF 1.1 specification: RDF 1.1 Concepts & Abstract Syntax, RDF Schema 1.1 and RDF 1.1 Semantics. The RDF data model was enhanced with the notion of RDF datasets and some new data types. The model itself is very simple: every statement is a triple made up of a subject, a predicate and an object. For example, you would say that “spiderman is the enemy of the green goblin” like this:
<http://example.org/#spiderman> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/#green-goblin>.
The linking nature of RDF is twofold. First, as seen in the example above, each thing (like spiderman) receives a unique uniform resource identifier (URI) as its name, and that URI can itself be a link (a URL). Second, the three-part structure can be read as two nodes in a graph, with the relationship as the link between them.
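The graph reading of a triple can be sketched in a few lines of Python. This is only an illustration that uses plain tuples and the standard library; the helper name to_graph is hypothetical and is not part of any RDF toolkit.

```python
from collections import defaultdict

# The spiderman statement from the example above, held as a
# (subject, predicate, object) tuple of URIs.
SPIDERMAN = "http://example.org/#spiderman"
GREEN_GOBLIN = "http://example.org/#green-goblin"
ENEMY_OF = "http://www.perceive.net/schemas/relationship/enemyOf"

triples = [(SPIDERMAN, ENEMY_OF, GREEN_GOBLIN)]


def to_graph(triples):
    """Index triples as a graph: node -> list of (link label, neighbor node)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = to_graph(triples)
for link, neighbor in graph[SPIDERMAN]:
    # The predicate is the labeled link between the two nodes.
    print(f"{SPIDERMAN} --[{link}]--> {neighbor}")
```

The subject and object become the nodes, and the predicate becomes the label on the link that joins them.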
New serialization formats. TriG, N-Triples, Turtle, N-Quads, RDF/XML and JSON-LD are all alternative syntaxes for representing the RDF data model. Having this range of serialization formats is a major boon for RDF because it clearly separates the model from any particular implementation format (aka syntax).
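To illustrate how one model can be written in several syntaxes, here is a rough Python sketch that renders the same spiderman triple as N-Triples, N-Quads and Turtle. The helper functions are hypothetical and handle only URI-valued terms, the graph name is an invented example URI, and the Turtle helper assumes the predicate's local name is a simple word that needs no escaping.

```python
S = "http://example.org/#spiderman"
P = "http://www.perceive.net/schemas/relationship/enemyOf"
O = "http://example.org/#green-goblin"

# Hypothetical graph name, used only to show the fourth element of a quad.
GRAPH = "http://example.org/graphs/comics"


def to_ntriples(s, p, o):
    """One statement per line, every URI written out in full."""
    return f"<{s}> <{p}> <{o}> ."


def to_nquads(s, p, o, graph):
    """N-Triples plus a fourth term naming the graph the statement belongs to."""
    return f"<{s}> <{p}> <{o}> <{graph}> ."


def to_turtle(s, p, o, prefix, namespace):
    """Abbreviate the predicate with a prefix when it falls in the namespace."""
    if p.startswith(namespace):
        predicate = f"{prefix}:{p[len(namespace):]}"
    else:
        predicate = f"<{p}>"
    return f"@prefix {prefix}: <{namespace}> .\n<{s}> {predicate} <{o}> ."


print(to_ntriples(S, P, O))
print(to_nquads(S, P, O, GRAPH))
print(to_turtle(S, P, O, "rel", "http://www.perceive.net/schemas/relationship/"))
```

All three outputs describe the same statement; only the notation changes.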
Explanatory (and non-normative) notes. These guides help practitioners understand and implement the standards. The RDF 1.1 Primer is especially useful in this regard.
So, what do these changes and enhancements to RDF mean?
First, it is important to highlight the JSON-LD serialization format. JSON is a very simple and popular data format, especially in modern Web applications. Furthermore, JSON is a concise format (much more so than XML) that is well-suited to represent the RDF data model. One sign of its traction is Google’s adoption of JSON-LD for marking up data in Gmail, Search and Google Now.
Second, just as RDF was rebranded as “linked data” to capitalize on the popularity of social graphs, RDF is now bringing its strong semantics to other communities by separating the model from the syntax. In other words, if the mountain won’t come to Muhammad, then Muhammad must go to the mountain. Here is an example of JSON-LD:
{
"@context": "http://json-ld.org/contexts/person.jsonld",
"@id": "http://dbpedia.org/resource/John_Lennon",
"name": "John Lennon",
"born": "1940-10-09",
"spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
}
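Because JSON-LD is ordinary JSON, any JSON parser can read it. The Python sketch below loads the John Lennon document (with the spouse URL quoted, as JSON requires for string values) and turns it into triples. The term-to-URI table is an assumed stand-in for what the remote @context would supply, and the loop is a hand-rolled simplification, not the JSON-LD expansion algorithm. A real application would use a JSON-LD processor, and the date would normally carry a datatype that this sketch omits.

```python
import json

doc_text = """
{
  "@context": "http://json-ld.org/contexts/person.jsonld",
  "@id": "http://dbpedia.org/resource/John_Lennon",
  "name": "John Lennon",
  "born": "1940-10-09",
  "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
}
"""

# ASSUMPTION: stand-in for the remote context. Each term maps to a
# predicate URI and a flag saying whether the value is itself a URI.
ASSUMED_TERMS = {
    "name": ("http://xmlns.com/foaf/0.1/name", False),
    "born": ("http://schema.org/birthDate", False),
    "spouse": ("http://schema.org/spouse", True),
}


def to_triples(doc, terms):
    """Turn a flat JSON-LD-style object into (subject, predicate, object) strings."""
    subject = f"<{doc['@id']}>"
    triples = []
    for key, value in doc.items():
        if key.startswith("@"):
            continue  # skip keywords such as @context and @id
        predicate, is_uri = terms[key]
        # URIs go in angle brackets; other values become quoted literals.
        obj = f"<{value}>" if is_uri else json.dumps(value)
        triples.append((subject, f"<{predicate}>", obj))
    return triples


doc = json.loads(doc_text)
for s, p, o in to_triples(doc, ASSUMED_TERMS):
    print(s, p, o, ".")
```

The "@id" value becomes the subject of every statement, and each remaining key becomes a predicate once the context gives it a URI.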
Finally, in relation to government transparency, these updates will make it easier for the government to publish public data (and federal records) in a format that enables and preserves a “chain of authority.” Let me explain this concept more fully as it is important.
Online information, by itself and without corroboration, is inherently untrustworthy. Think about it – anyone, authoritative or not, can assert anything on the Internet. For instance, in this article (which is on the Internet) I can write: “I just spotted Bigfoot in my backyard eating a ham sandwich!” Ludicrous, yes – but it’s now out there as an assertion, available to search engines and increasing the noise in our data.
On the other hand, public data published as linked data can be linked to additional detail, to information about the source and to information about the qualifications of the source. In turn, that data can be further linked to additional corroborative and supportive evidence. This is especially important for government, where so much information is based on supporting policy, regulation and law. These must all be linked together for any particular information instance to be considered authoritative.
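A chain of authority can be pictured as a walk along such links. The Python sketch below is purely illustrative: every URI in it is made up, the basedOn predicate is a hypothetical vocabulary term rather than one defined by RDF 1.1, and the code assumes each item points to at most one supporting source.

```python
# Hypothetical predicate and resources, invented for this illustration.
BASED_ON = "http://example.org/vocab#basedOn"

triples = [
    ("http://example.org/data/record-1", BASED_ON, "http://example.org/policy/policy-7"),
    ("http://example.org/policy/policy-7", BASED_ON, "http://example.org/regulation/reg-3"),
    ("http://example.org/regulation/reg-3", BASED_ON, "http://example.org/law/statute-12"),
]


def chain_of_authority(start, triples, link=BASED_ON):
    """Follow `link` statements from `start` and return the URIs visited, in order."""
    next_hop = {s: o for s, p, o in triples if p == link}
    chain = [start]
    seen = {start}
    while chain[-1] in next_hop:
        following = next_hop[chain[-1]]
        if following in seen:  # stop rather than loop forever on a cycle
            break
        chain.append(following)
        seen.add(following)
    return chain


for uri in chain_of_authority("http://example.org/data/record-1", triples):
    print(uri)
```

Under these assumptions, a published record can be traced through policy and regulation back to the law that supports it, while an assertion with no such links (like the Bigfoot sighting) yields a chain of one.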
Trust, by the citizens of a nation, is paramount for government data. With these new RDF 1.1 linked data standards, trust and transparency have taken another leap forward.
Michael C. Daconta (mdaconta@incadencecorp.com or @mdaconta) is the Vice President of Advanced Technology at InCadence Strategic Solutions and the former Metadata Program Manager for the Homeland Security Department. His new book is titled The Great Cloud Migration: Your Roadmap to Cloud Computing, Big Data and Linked Data.