Everybody who’s even slightly interested in Open Data has probably heard of the five-star Open Data scheme attributed to Tim Berners-Lee. At the end of it, the goal we all want to reach, is LOD: Linked Open Data. Except that if we ever manage to create those holy-grail data sets, we’d better not ask whether they are ever used.

Because, to be honest: hand developers and data people some XML and RDF and they will prefer to run away. At least I do.
Academics know that as well. So a new concept emerged: it’s now LOUD (Linked Open Usable Data) instead of LOD. And the U makes me excited, because many discussions in digital cultural heritage and digital archiving are about data collection and ontologies, and not so much about actually using the data again.
In my first Prototype Fund project „Remove NA – A knowledge graph about queer history“ I wrote about data modeling: „There is a tension between the most representative model of a chaotic reality and the tidiest possible data, which are then no longer accurate, but with which it is easier to work.“
And a reasonable data model is central for reusability. Reusing not in the sense of looking at a single entry on a website, but in the ways software developers, data scientists and data practitioners would do it: using APIs, writing queries, being able to build something without having to parse some XML first.
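To make that friction concrete, here is a minimal sketch, with an invented record and field names, of reading the same fact from a namespaced XML serialization versus a plain JSON response:

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record, once as namespaced XML, once as JSON.
XML_RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Stonewall Inn</dc:title>
</record>"""

JSON_RECORD = '{"title": "Stonewall Inn"}'

# XML: the developer must know the namespace URI before anything works.
root = ET.fromstring(XML_RECORD)
xml_title = root.find("{http://purl.org/dc/elements/1.1/}title").text

# JSON: one call, one key, no prior ceremony.
json_title = json.loads(JSON_RECORD)["title"]

assert xml_title == json_title == "Stonewall Inn"
```

The point is not that XML is unusable, but that every extra convention a developer must learn before the first line of working code is a reason to give up.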
A rule of thumb: the more exceptions there are, and the more detailed each special case is, the more difficult it will be for outsiders to use the data. And LOUD seems to take this perspective seriously: make it as easy as possible to work with the data, because otherwise you can save yourself the effort entirely.
A few days ago I stumbled upon Julien A. Raemy’s dissertation and his description of LOUD, which is also based on the work of Robert Sanderson. I’ll just quote that whole part, because it reads like a missing link between an idealized data world and practical data work.
LOUD Design Principles
One of the main purposes of LOUD is to make the data more easily accessible to software developers, who play a key role in interacting with the data and building software and services on top of it, and to some extent to academics. As such, striking a delicate balance between the dual imperatives of data completeness and accuracy, which depend on the underlying ontological construct, and the pragmatic considerations of scalability and usability, becomes imperative.
Similar to Tim Berners-Lee’s Five Star Open Data Deployment Scheme, Robert Sanderson listed five design principles that underpin LOUD:
A. The right Abstraction for the audience
B. Few Barriers to entry
C. Comprehensible by introspection
D. Documentation with working examples
E. Few Exceptions, instead many consistent patterns
A. The right Abstraction for the audience
Developers do not need the same level of access to data as ontologists, in the same way that a driver does not need the same level of access to the inner workings of their car as a mechanic. Use cases and requirements should drive the interoperability layer between systems, not ontological purity.
B. Few Barriers to entry
It should be easy to get started with the data and build something. If it takes a long time to understand the model, ontology, SPARQL query syntax and so forth, then developers will look for easier targets. Conversely, if it is easy to start and incrementally improve, then more people will use the data.
C. Comprehensible by introspection
The data should be understandable to a large degree simply by looking at it, rather than requiring the developer to read the ontology and vocabularies. Using JSON-LD lets us talk to the developer in their language, which they already understand.
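As an illustration, the document below is a made-up JSON-LD snippet loosely modeled on IIIF-style data, not a real record. A developer can treat it as ordinary JSON and read it without ever opening the ontology:

```python
import json

# A hypothetical JSON-LD document: self-describing keys, readable as plain JSON.
DOC = """{
  "@context": "http://iiif.io/api/presentation/3/context.json",
  "id": "https://example.org/object/42",
  "type": "Manifest",
  "label": {"en": ["Protest banner, 1979"]}
}"""

obj = json.loads(DOC)

# No ontology lookup needed: the keys say what they mean.
print(obj["type"])            # Manifest
print(obj["label"]["en"][0])  # Protest banner, 1979
```

The `@context` still anchors every key to a formal vocabulary, but nobody has to read it just to get started.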
D. Documentation with working examples
You can never intuit all of the rules for the data. Documentation clarifies the patterns that the developer can expect to encounter, such that they can implement robustly. Example use cases allow contextualization for when the pattern will be encountered, and working examples let you drop the data into the system to see if it implements that pattern correctly.
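In Python, doctests are one lightweight way to keep documentation and working examples in sync. The function below is a hypothetical illustration of the idea, not part of any LOUD specification:

```python
def get_label(resource, lang="en"):
    """Return the first label of a resource in the given language.

    The documented pattern doubles as a working, runnable example:

    >>> get_label({"label": {"en": ["Banner"], "de": ["Transparent"]}})
    'Banner'
    >>> get_label({"label": {"de": ["Transparent"]}}, lang="de")
    'Transparent'
    """
    return resource["label"][lang][0]

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # drop the documented data into the code and check it
```

This is exactly the "working examples" loop Sanderson describes: the documentation shows the pattern, and running it confirms the implementation still honors it.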
E. Few Exceptions, instead many consistent patterns
Every exception that you have in an API (and hence ontology) is another rule that the developer needs to learn in order to use the system. Every exception is jarring, and requires additional code to manage. While not everything is homogeneous, a set of patterns that manage exceptions well is better than many custom fields.
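A sketch of what this means in code, with invented records and field names: when every resource follows the same id/type/label pattern, one generic function covers them all, and no branch per special case is needed:

```python
# Hypothetical records that all follow one consistent pattern.
RESOURCES = [
    {"id": "person/1", "type": "Person", "label": "Audre Lorde"},
    {"id": "place/7", "type": "Place", "label": "Berlin"},
    {"id": "event/3", "type": "Event", "label": "Christopher Street Day 1979"},
]

def describe(resource):
    # One rule for every type: no exceptions for the developer to learn.
    return f'{resource["type"]}: {resource["label"]} ({resource["id"]})'

for r in RESOURCES:
    print(describe(r))
```

Each custom field per type would instead force an `if type == ...` branch into every consumer of the data, which is the cost Sanderson warns about.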