I attended the LODLAM meeting last week in Wellington, New Zealand, as a relative newcomer to the concept of linked open data.

One of the questions the meeting tried to answer was “what are the best use cases for linked data?” “What is a use case that is compelling enough to warrant our investment in creating linked data?”

One comparison made* was that of Charles Darwin making his notes on the voyage of the Beagle. Did Darwin merely record a list of points in a tabular notebook? No, he recorded his observations in journal form, and his ideas were slowly organised over the years. In his journal of assertions, Darwin found connections and common themes emerging over time.

Linked open data is a series of assertions in the form of subject/predicate/object. Ideally these assertions are not confined to isolated sets of data but reach across open sets of data on the web. Complex topics are made up of assertions of varying degrees of authority. In that sense, linked open data may be a better representation of knowledge on the web.
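To make the subject/predicate/object idea concrete, here is a minimal sketch in plain Python. It is only an illustration, not how any particular linked data toolkit works: the `Triple` type, the `objects_of` helper, and every `example.org` identifier are made up for this post.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One assertion: a subject, a predicate, and an object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers; a real data set would use published URIs.
DARWIN = "http://example.org/person/charles-darwin"
BEAGLE = "http://example.org/ship/hms-beagle"
SAILED_ON = "http://example.org/vocab/sailedOn"
KEPT = "http://example.org/vocab/kept"
JOURNAL = "http://example.org/work/beagle-journal"

triples = [
    Triple(DARWIN, SAILED_ON, BEAGLE),
    Triple(DARWIN, KEPT, JOURNAL),
]


def objects_of(data, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [t.obj for t in data if t.subject == subject and t.predicate == predicate]


ships = objects_of(triples, DARWIN, SAILED_ON)
```

Because each assertion stands alone, assertions from different data sets can simply be added to the same list, which is roughly what linking across open data sets amounts to.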

There are problems with each of the three parts of the linked open data triple. The subject may require disambiguation: which “Hamilton” are we referring to? We need some consistency in the language used for the predicate: how do we describe relationships between people? The objects that we refer to may not be trustworthy: how do we choose sets of data that we are confident referring to? Perhaps these problems with linked open data aren’t problems after all. It’s a system that reflects the uncertainties of the real world.
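Two of these problems, disambiguating a subject and deciding which sources to trust, can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `Assertion` type, the candidate table, the source names, and the `example.org` identifiers are invented, and real systems handle this in far more sophisticated ways.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    """A triple plus a record of who asserted it."""
    subject: str
    predicate: str
    obj: str
    source: str


# Hypothetical identifiers for two different places called "Hamilton".
HAMILTON_NZ = "http://example.org/place/hamilton-new-zealand"
HAMILTON_ON = "http://example.org/place/hamilton-ontario"
LOCATED_IN = "http://example.org/vocab/locatedIn"

# One plain-text label can point at several distinct subjects.
CANDIDATES = {"Hamilton": [HAMILTON_NZ, HAMILTON_ON]}

assertions = [
    Assertion(HAMILTON_NZ, LOCATED_IN, "http://example.org/place/new-zealand", "national-gazetteer"),
    Assertion(HAMILTON_ON, LOCATED_IN, "http://example.org/place/canada", "national-gazetteer"),
    Assertion(HAMILTON_NZ, LOCATED_IN, "http://example.org/place/australia", "anonymous-upload"),
]


def candidates_for(label):
    """List every subject a label might refer to."""
    return CANDIDATES.get(label, [])


def from_trusted(data, trusted_sources):
    """Keep only the assertions made by sources we have chosen to trust."""
    return [a for a in data if a.source in trusted_sources]


trusted = from_trusted(assertions, {"national-gazetteer"})
```

Note that the untrusted assertion is filtered out of one view rather than deleted. The data keeps the disagreement, and each consumer decides which sources to believe.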

This reminded me of Aaron Cope’s paper “The Interpretation of Bias”. By reverse-geocoding photo locations in Flickr, maps could be drawn which better represented the disputed boundaries of each named place.

In the same way, linked open data may give us a way to better describe grey knowledge – all of the objects that are grey around the edges.

* Please comment if you recall who made this comparison.

