Thursday, 11 June 2015

Semantic Web for the Working Ontologist: chapter 10

I'm absurdly busy at the moment, and was therefore rather relieved that this week's chapter "SKOS—managing vocabularies with RDFS-Plus" only ran to 12 pages.

SKOS – its focus – stands for Simple Knowledge Organization System. It is
"an area of work developing specifications and standards to support the use of knowledge organization systems (KOS) such as thesauri, classification schemes, subject heading lists and taxonomies within the framework of the Semantic Web" (see http://www.w3.org/2004/02/skos/).

The authors guide us through "an example of using SKOS": AGROVOC, the thesaurus developed by the United Nations Food and Agriculture Organization "for organizing documents about agriculture". Naturally, this needs to be multilingual, with no one language taking precedence over any other, so individual resources are identified by numbers and given a variety of labels, each carrying a language tag (optional in Turtle strings, and drawn from XML).

Labels in SKOS are more sophisticated than in RDFS. It defines three different types of label: preferred (skos:prefLabel), alternative (skos:altLabel) and hidden (skos:hiddenLabel). Each is an rdfs:subPropertyOf rdfs:label, "which provides a human readable version of the name of each resource".


For instance…

agrovoc_7030 rdfs:label "Sheep"@en .
agrovoc_7030 rdfs:label "Ovin"@fr .
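
As a rough illustration (not from the book), here is how those rdfs:label triples could fall out of the sub-property declarations. It is a plain-Python sketch rather than a real RDF library: triples are tuples, and a language-tagged literal is a (text, lang) pair. I'm assuming the two labels above are asserted as skos:prefLabel, and the hidden label "Sheeps" is a made-up example of a common misspelling.

```python
# Sub-property declarations: each SKOS label property is a
# rdfs:subPropertyOf rdfs:label.
SUBPROPERTY_OF = {
    "skos:prefLabel": "rdfs:label",
    "skos:altLabel": "rdfs:label",
    "skos:hiddenLabel": "rdfs:label",
}

asserted = {
    # Assumption: the labels in the post are asserted as preferred labels.
    ("agrovoc_7030", "skos:prefLabel", ("Sheep", "en")),
    ("agrovoc_7030", "skos:prefLabel", ("Ovin", "fr")),
    # Hypothetical hidden label, for illustration only.
    ("agrovoc_7030", "skos:hiddenLabel", ("Sheeps", "en")),
}


def infer_labels(triples):
    """Apply the RDFS rule: if p subPropertyOf q and (s p o), then (s q o)."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in SUBPROPERTY_OF:
            inferred.add((s, SUBPROPERTY_OF[p], o))
    return inferred


def labels_for(triples, subject, lang):
    """Return the rdfs:label texts for a subject in one language."""
    return sorted(
        o[0]
        for s, p, o in triples
        if s == subject and p == "rdfs:label" and o[1] == lang
    )


print(labels_for(infer_labels(asserted), "agrovoc_7030", "en"))
```

Before inference there are no rdfs:label triples at all; afterwards every kind of SKOS label also shows up as a plain rdfs:label, which is what lets label-aware tools that know nothing about SKOS still display something sensible.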


We are all agrovoc_7030. Also NAL:38846.

The "semantic relations" SKOS defines correspond to what the authors describe as "familiar terms" derived from thesaurus standards. These include skos:broader, skos:narrower and skos:related. skos:broader and skos:narrower are closely related to, but not identical with, the subclass relationship expressed by rdfs:subClassOf and its inverse (RDFS defines rdfs:subClassOf but no superClassOf property).
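
A small sketch of how broader and narrower fit together, again in plain Python rather than a real RDF toolkit. The concept identifiers (ex:sheep and friends) are hypothetical. In the SKOS specification skos:narrower is the inverse of skos:broader, and skos:broader on its own is not transitive; following the links all the way up corresponds to skos:broaderTransitive.

```python
# (child, parent) pairs standing in for "child skos:broader parent".
# Hypothetical concept identifiers, for illustration only.
broader = {
    ("ex:sheep", "ex:livestock"),
    ("ex:livestock", "ex:animals"),
}


def narrower_pairs(broader_pairs):
    """skos:narrower is the inverse of skos:broader, so swap each pair."""
    return {(parent, child) for child, parent in broader_pairs}


def broader_chain(concept, broader_pairs):
    """Follow broader links upwards (assumes one parent each and no cycles)."""
    parent_of = dict(broader_pairs)
    chain = []
    while concept in parent_of:
        concept = parent_of[concept]
        chain.append(concept)
    return chain


print(narrower_pairs(broader))
print(broader_chain("ex:sheep", broader))
```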

One of the best things about SKOS is the way it allows different vocabularies to be linked, including the many controlled vocabularies and thesauri that were developed before computers existed. Because it is RDF-based, every term has a URI. "This makes it possible to make statements about how a term in one vocabulary relates to a term in another", and SKOS provides skos:exactMatch, skos:narrowMatch, skos:broadMatch and skos:closeMatch for the purpose, so that different people can make their own judgements about the nature of the relationships between different entities: Anyone can say Anything about Any topic.
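
To see what a mapping statement buys you, here is a minimal sketch. The exactMatch between the two sheep identifiers is my assumption for the sake of the example, not something the chapter or either thesaurus is quoted as asserting, and ex:ovine is an invented term from an invented third vocabulary. In the SKOS specification skos:exactMatch and skos:closeMatch are symmetric, so a mapping can be read from either end.

```python
mappings = {
    # Assumed for illustration: the two sheep concepts match exactly.
    ("agrovoc_7030", "skos:exactMatch", "NAL:38846"),
    # Hypothetical looser link to a made-up vocabulary.
    ("agrovoc_7030", "skos:closeMatch", "ex:ovine"),
}

# Mapping relations that SKOS declares symmetric.
SYMMETRIC = {"skos:exactMatch", "skos:closeMatch", "skos:relatedMatch"}


def matches(concept, triples, relation):
    """Find concepts linked to `concept` by `relation`, in either direction
    when the relation is symmetric."""
    found = set()
    for s, p, o in triples:
        if p != relation:
            continue
        if s == concept:
            found.add(o)
        elif o == concept and p in SYMMETRIC:
            found.add(s)
    return sorted(found)


print(matches("NAL:38846", mappings, "skos:exactMatch"))
```

Because the statements are just more triples, someone else could publish a different judgement (say, closeMatch instead of exactMatch) without touching either thesaurus.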

A major facet of SKOS is its inclusion of "the notion of Concept Scheme", which Allemang and Hendler define as
"a largely informal collection of concepts, corresponding roughly to a particular thesaurus or knowledge organization system."
So, for instance, where different thesauri have different identifiers for sheep, such as AGROVOC, where a sheep is agrovoc_7030, and the US National Agricultural Library (NAL) thesaurus, where a sheep is NAL:38846, we can specify which thesaurus each belongs to using skos:inScheme followed by the URI for the scheme. The significance of this is that the "explicit statement of a concept scheme makes this relationship explicit and queryable with SPARQL".
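
A rough stand-in for that kind of query, written as a plain-Python filter rather than actual SPARQL. The scheme identifiers (ex:agrovocScheme, ex:nalScheme) are placeholders I've invented; the real schemes have their own URIs. The filter selects roughly what a SPARQL pattern of the form "?concept skos:inScheme <scheme>" would.

```python
triples = {
    # Placeholder scheme identifiers, for illustration only.
    ("agrovoc_7030", "skos:inScheme", "ex:agrovocScheme"),
    ("NAL:38846", "skos:inScheme", "ex:nalScheme"),
}


def concepts_in(scheme, data):
    """Return every concept explicitly stated to be in the given scheme."""
    return sorted(
        s for s, p, o in data if p == "skos:inScheme" and o == scheme
    )


def scheme_of(concept, data):
    """Return the schemes a concept is stated to belong to."""
    return sorted(
        o for s, p, o in data if s == concept and p == "skos:inScheme"
    )


print(concepts_in("ex:agrovocScheme", triples))
print(scheme_of("NAL:38846", triples))
```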

Also useful in SKOS is the idea of a Top Concept. Whilst it's always possible to identify the concept at the top of a concept tree with a SPARQL query, it's "convenient to indicate it". There's a handy graph illustrating AGROVOC's Top Concepts on its FAQ page.
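
Here is a sketch of the "work it out yourself" route: a top concept is simply one that has nothing broader than it. The concept identifiers are hypothetical, and this is plain Python standing in for the SPARQL query. SKOS lets you record the answer explicitly with skos:hasTopConcept (or its inverse, skos:topConceptOf) so nobody has to recompute it.

```python
# (child, parent) pairs standing in for "child skos:broader parent".
# Hypothetical concept identifiers, for illustration only.
broader = {
    ("ex:sheep", "ex:livestock"),
    ("ex:livestock", "ex:animals"),
    ("ex:wheat", "ex:plants"),
}


def top_concepts(broader_pairs):
    """Return concepts that never appear as the child in a broader link."""
    children = {child for child, _ in broader_pairs}
    concepts = children | {parent for _, parent in broader_pairs}
    return sorted(concepts - children)


print(top_concepts(broader))
```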

Finally, the authors explore why it is that SKOS has become so popular. They identify four major factors:
  1. simplicity
  2. "the ease with which a vocabulary can be transformed from other systems into SKOS"
  3. the fact that, before SKOS, there wasn't a standard way to represent a vocabulary in digital form
  4. "translation into SKOS turns … locally unique identifiers into globally unique identifiers", making it much easier to relate disparate vocabularies to each other.
Next week - Basic OWL
