Thursday 25 June 2015

Semantic Web for the Working Ontologist: chapter 11

Because I've enjoyed leaving everyone on tenterhooks after the cliffhanger at the end of chapter 10, and to prolong the joy, from now on, I'm going to alternate blogging a chapter of this book every fortnight, with musing on other subjects*

Anyway, this week's chapter covers basic OWL, and focuses on the use of owl:Restriction, which the authors tell us – almost breathless with excitement – "opens up whole new vistas in modeling capabilities".

In OWL, a Restriction is a special type of class: one that's defined "by describing the individuals it contains… in terms of existing properties and classes".

The chapter focuses on three kinds of restrictions:
owl:allValuesFrom – every value an individual has for a given property must come from a specified class
owl:someValuesFrom – at least one value of a given property must come from a specified class
owl:hasValue – the property must have one particular, specified value.

There's a subtle difference between the first two of these that isn't immediately obvious. someValuesFrom "is defined as a restriction class such that there is at least one member of a class with a particular property". That means that "there must be such a member". allValuesFrom, however, "means 'if there are any members, then they all must have this property'", which doesn't "imply that there are any members". This will, apparently, be important later in the book (there are, gentle reader, still 5 chapters to go). Oh, and these three restrictions are often shortened to the keywords all, some, and value, using the Manchester Syntax (developed at the University of Manchester).
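To convince myself of the some/all difference, here's a minimal sketch in plain Python. It isn't OWL, and it isn't how a reasoner works (it just checks the data it's given); the property plays_for, the class all_star_teams and the three individuals are all invented for illustration.

```python
# Toy data: each individual maps to the values of one property.
# Every name here is hypothetical, not taken from the book.
plays_for = {
    "kaneda": {"all_star_team"},
    "shiro": {"all_star_team", "pub_team"},
    "nobody": set(),  # has no values for the property at all
}
all_star_teams = {"all_star_team"}


def some_values_from(values, cls):
    """At least one value of the property belongs to cls."""
    return any(v in cls for v in values)


def all_values_from(values, cls):
    """Every value of the property belongs to cls (true when there are none)."""
    return all(v in cls for v in values)


for name, values in plays_for.items():
    print(name,
          some_values_from(values, all_star_teams),
          all_values_from(values, all_star_teams))
```

The individual with no values at all passes the "all" check but fails the "some" check, which is exactly the subtlety the authors flag.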

Because it's so useful and frequently used, owl:hasValue – the type of restriction that "effectively turns specific instance descriptions into class descriptions" – is "identified in the OWL standard in its own right". It is viewed "as a design pattern" and given a name: the Class-Individual Mirror pattern. It is used to describe "the relationship of an individual to a set—this is the set of all things that relate to this individual in a certain way."
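The "set of all things that relate to this individual in a certain way" is easy to picture as a filter over triples. This is only a rough sketch in Python, with made-up triples and a made-up written_by property; a real OWL reasoner would infer class membership rather than filter a list.

```python
# Hypothetical (subject, property, object) triples.
triples = [
    ("hamlet", "written_by", "shakespeare"),
    ("macbeth", "written_by", "shakespeare"),
    ("emma", "written_by", "austen"),
]


def has_value(triples, prop, individual):
    """The set of subjects whose `prop` has the value `individual`."""
    return {s for s, p, o in triples if p == prop and o == individual}


# The "class" that mirrors the individual shakespeare.
works_by_shakespeare = has_value(triples, "written_by", "shakespeare")
print(sorted(works_by_shakespeare))
```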

Having demonstrated the application of restrictions in the design of a questionnaire, the authors go on to explore ways of ensuring steak isn't classified as a vegetarian food, and preventing a steak-eater from being described as a vegetarian, and, then, to explore 'relationship transfer'. Relationship transfer happens when "Everything related to A by property p should also be related to B but by property q".

not suitable for vegetarians
So, for instance, everyone who has bought an NUS Extra Card† will also be a student affiliated to an educational institution (or a work-based apprentice - but let's not split hairs here). Or, as in the authors' examples, "Everyone who plays for the All Star team is governed by the league's contract" and "Every work in the Collected Works of Shakespeare was written by Shakespeare".
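Relationship transfer can be sketched as a single rule over triples: wherever something is related to A by p, add a statement relating it to B by q. The Python below is my own illustration of that rule, not the authors' OWL; the names plays_for, governed_by and the individuals are assumptions.

```python
def transfer(triples, p, a, q, b):
    """Return the triples plus (x, q, b) for every x where (x, p, a) holds."""
    inferred = {(s, q, b) for s, prop, o in triples if prop == p and o == a}
    return set(triples) | inferred


# Hypothetical data loosely based on the All Star team example.
triples = {
    ("kaneda", "plays_for", "all_star_team"),
    ("shiro", "plays_for", "pub_team"),
}

result = transfer(triples, "plays_for", "all_star_team",
                  "governed_by", "league_contract")
print(sorted(result))
```

Only the All Star player picks up the new governed_by statement; the pub team player is left alone.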

Finally, they describe the difference between rdfs:subClassOf and owl:equivalentClass. The former is "a simple IF/THEN relation", the latter "an IF and only IF relation". In other words, it is two IF/THEN relations: "one going each way".
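If you think of a class as the set of its known members, the one-way versus two-way distinction looks like this. It's a simplification (an OWL class is defined by inference rules, not by a fixed list of members), and the example sets are invented.

```python
def sub_class_of(a, b):
    """IF something is in a THEN it is in b."""
    return a <= b


def equivalent_class(a, b):
    """Two IF/THEN relations, one going each way."""
    return sub_class_of(a, b) and sub_class_of(b, a)


# Hypothetical class memberships.
steaks = {"sirloin", "ribeye"}
meat_dishes = {"sirloin", "ribeye", "sausage"}
beef_steaks = {"ribeye", "sirloin"}

print(sub_class_of(steaks, meat_dishes))      # one direction holds
print(equivalent_class(steaks, meat_dishes))  # the reverse direction fails
print(equivalent_class(steaks, beef_steaks))  # both directions hold
```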

And, in their summary, Dean Allemang and Jim Hendler explain the importance of restrictions. Essentially, "they can be used to model complex relationships between properties, classes, and individuals". This means that "interactions of multiple specifications can be understood and even processed automatically." And, while our brains are busy working out the possibilities that opens up, I'll finish by saying that next week I'll be blogging/thinking more about reader journeys.

I also won't make the mistake I made this week of blogging on 2 computers and accidentally overwriting the finished version by saving an earlier draft and having to rewrite. Grrrrr.

* oh, all right, it's mostly because I'm very busy at the moment, and this allows me to blog about some of the stuff I'm doing

† and if you're eligible and haven't, it's well worth it

Thursday 18 June 2015

We interrupt this blog…

… because I've been distracted from reading about the semantic web by playing with something made with its tools, that does something similar to what I'm aiming to do with Hayleyworld.

The BBC's Home Front Story Explorer is designed for both current and new Home Front listeners wanting to catch up on stories they've missed, recap, or focus on specific characters/storylines. Based on this model (See the News Storyline Ontology developed with The Guardian, the Press Association and digital media consultancy Ontoba)…

Screen-grabbed from http://www.bbc.co.uk/ontologies/storyline.

 …the Explorer is clear, clean and attractively-designed, with elegant and evocative illustrations.

The Beeb's approach is ideally suited to long-running radio dramas, where the broadcast programmes cut between characters and plot lines, are built from distinct scenes or "moments" (not necessarily the same thing) and stories ebb and flow over months, years or even, in the case of The Archers, decades. One advantage is that – as here – the length of the re-ordered material can be determined either by content – ie the extent of an individual storyline – or by listener choice, instead of the time constraints of a broadcast slot.

Also, listeners can explore as and where they listen: although the two-screen (or one screen + radio) approach trumps this particular experimental iteration, as exploring too deeply takes you to a different webpage, and the clip stops playing…

What's exciting about this is both the control it gives the listener and the fact that using semantic web technologies to build it means that, over time, its extent and depth can be increased by federating more data. At the moment, for instance, it includes limited "on this day" snippets of historical information, which could be extended into details, images and other AV content, and a description of the geographical (real and imagined) location of the clip, but no map.

At present, there are no cast or creative details or images. I'd have liked to be able to explore these too, but wonder if they should be kept separate to enable us to immerse ourselves more fully in Home Front's time and place? Perhaps the ideal would be to be able to toggle in and out of the story world: to be able to choose either to immerse yourself in its reality, or to zoom out for all the who, what and how behind-the-scenes information… I'd love that…

Visit the BBC's Home Front Story Explorer
Lead producer Tristan Fearne explains the background to the Explorer

Thursday 11 June 2015

Semantic Web for the Working Ontologist: chapter 10

I'm absurdly busy at the moment, and was therefore rather relieved that this week's chapter "SKOS—managing vocabularies with RDFS-Plus" only ran to 12 pages.

SKOS – its focus – stands for Simple Knowledge Organization System. It is
"an area of work developing specifications and standards to support the use of knowledge organization systems (KOS) such as thesauri, classification schemes, subject heading lists and taxonomies within the framework of the Semantic Web" (see http://www.w3.org/2004/02/skos/)
The authors guide us through "an example of using SKOS" - AGROVOC: the thesaurus developed by the United Nations Food and Agriculture Organization "for organizing documents about agriculture". Naturally, this needs to be multilingual, with no one language taking precedence over any other, so individual resources are identified by numbers, with a variety of labels, including a language tag (optional in Turtle strings, and drawn from XML).

Labels in SKOS are more sophisticated than in RDFS. It defines three different types of label: preferred, alternative and hidden (skos:prefLabel, skos:altLabel and skos:hiddenLabel). Each is an rdfs:subPropertyOf rdfs:label, "which provides a human readable version of the name of each resource".


For instance…

agrovoc_7030 rdfs:label "Sheep"@en
agrovoc_7030 rdfs:label "Ovin"@fr
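Because each SKOS label property is a sub-property of rdfs:label, anything stated with a SKOS label also counts as an rdfs:label. Here's a rough Python sketch of that inference. I've assumed the two labels above are preferred labels, and the hidden label is an invented misspelling, so treat the data as illustrative only.

```python
# Sub-property relationships defined by SKOS.
SUB_PROPERTY_OF = {
    "skos:prefLabel": "rdfs:label",
    "skos:altLabel": "rdfs:label",
    "skos:hiddenLabel": "rdfs:label",
}

# (subject, property, (text, language tag)); partly invented data.
triples = {
    ("agrovoc_7030", "skos:prefLabel", ("Sheep", "en")),
    ("agrovoc_7030", "skos:prefLabel", ("Ovin", "fr")),
    ("agrovoc_7030", "skos:hiddenLabel", ("Sheeps", "en")),  # hypothetical
}


def infer_labels(triples):
    """Add an rdfs:label triple for every triple that uses a SKOS label."""
    inferred = {(s, SUB_PROPERTY_OF[p], o)
                for s, p, o in triples if p in SUB_PROPERTY_OF}
    return triples | inferred


def labels(triples, subject, lang):
    """All rdfs:label texts for a subject in one language, sorted."""
    return sorted(text for s, p, (text, tag) in triples
                  if s == subject and p == "rdfs:label" and tag == lang)


closed = infer_labels(triples)
print(labels(closed, "agrovoc_7030", "en"))
print(labels(closed, "agrovoc_7030", "fr"))
```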


we are all agrovoc_7030. Also NAL:38846 
The "semantic relations" SKOS defines correspond to what the authors describe as "familiar terms" derived from thesaurus standards. These include skos:broader, skos:narrower and skos:related. skos:broader and skos:narrower are closely related – but not identical – to the superclass and subclass relationships expressed with rdfs:subClassOf (RDFS has no superClassOf property of its own).

One of the best things about SKOS is the way it allows different vocabularies to be linked: including the many controlled vocabularies and thesauri that were developed before computers existed. Because it is RDF-based, every term has a URI. "This makes it possible to make statements about how a term in one vocabulary relates to a term in another", and SKOS provides skos:exactMatch, skos:narrowMatch, skos:broadMatch and skos:closeMatch, so that different people can make individual judgements about the nature of the relationships between different entities: Anyone can say Anything about Any topic.

A major facet of SKOS is its inclusion of "the notion of Concept Scheme", which Allemang and Hendler define as
"a largely informal collection of concepts, corresponding roughly to a particular thesaurus or knowledge organization system."
So, for instance, where different thesauri have different identifiers for sheep, like AGROVOC and the US National Agricultural Library (NAL), where a sheep is NAL:38846, we can specify which thesaurus each belongs to using skos:inScheme followed by the URI for the scheme. The significance of this is that the "explicit statement of a concept scheme makes this relationship explicit and queryable with SPARQL".
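To see why an explicit scheme is handy, here's a small Python stand-in for the kind of question you might ask with SPARQL. The scheme identifiers are placeholders rather than the real URIs, and the skos:exactMatch link between the two sheep is my assumption for the sake of the example.

```python
# Hypothetical triples; scheme names are placeholders, not real URIs.
triples = {
    ("agrovoc_7030", "skos:inScheme", "scheme:AGROVOC"),
    ("NAL:38846", "skos:inScheme", "scheme:NAL"),
    ("agrovoc_7030", "skos:exactMatch", "NAL:38846"),
}


def concepts_in(triples, scheme):
    """Every concept explicitly stated to be in the given scheme."""
    return {s for s, p, o in triples if p == "skos:inScheme" and o == scheme}


def matches_in(triples, concept, scheme):
    """Exact matches of `concept` that belong to `scheme`."""
    matched = {o for s, p, o in triples
               if s == concept and p == "skos:exactMatch"}
    return matched & concepts_in(triples, scheme)


print(concepts_in(triples, "scheme:AGROVOC"))
print(matches_in(triples, "agrovoc_7030", "scheme:NAL"))
```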

Also useful in SKOS is the idea of a Top Concept. Whilst it's always possible to identify the concept at the top of a concept tree with a SPARQL query, it's "convenient to indicate it". There's a handy graph illustrating AGROVOC's Top Concepts on its FAQ page.

Finally, the authors explore why it is that SKOS has become so popular. They identify four major factors:
  1. simplicity
  2. "the ease with which a vocabulary can be transformed from other systems into SKOS"
  3. the fact that, before SKOS there wasn't a standard way to represent a vocabulary in digital form
  4. "translation into SKOS turns … locally unique identifiers into globally unique identifiers", making it much easier to relate disparate vocabularies to each other.
Next week - Basic OWL

Thursday 4 June 2015

Semantic Web for the Working Ontologist: chapter 9

This week we're on to "Using RDFS-Plus in the wild", where the authors introduce us to two real-world applications of the principles and practices explained in chapters 1-8.

Both "are about setting up an infrastructure for a particular web community". They are Data.gov, the US government project to put public information online in machine-readable form, and FOAF – the acronym for Friend of a Friend – which is "a machine-readable ontology describing persons, their activities and their relations to other people and objects." (quote from Wikipedia).

The subset of US government data the authors focus on is the list of cases "about alleged violations of Title VIII of the Fair Housing Act" filed with the Office of Fair Housing/Equal Opportunity (FHEO). The original data gives a case number, and a list of possible reasons for the complaint – each of which is marked with either a 0 or a 1, to indicate whether or not it is the basis of the complaint – as well as the county that the complaint is filed in/relates to. The authors take us through ways of using RDFS and SPARQL to organise and process this data to make it more meaningful and useful, enabling human beings to work out where, for instance, alleged violations of Title VIII of the Fair Housing Act cluster, and which reasons form the basis of the most cases.
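The general idea of turning rows of 0/1 flags into statements you can count can be sketched in a few lines of Python. This is not the authors' RDFS/SPARQL approach, just the same shape of transformation; the rows, county names and basis names below are all invented.

```python
from collections import Counter

# Invented rows in the shape the chapter describes: a case number, a county,
# and one 0/1 flag per possible basis. The basis names are assumptions.
cases = [
    {"case": "001", "county": "Adams", "race": 1, "disability": 0, "familial_status": 0},
    {"case": "002", "county": "Adams", "race": 0, "disability": 1, "familial_status": 1},
    {"case": "003", "county": "Baker", "race": 1, "disability": 0, "familial_status": 0},
]
BASES = ("race", "disability", "familial_status")


def to_triples(cases):
    """Turn each flag set to 1 into a (case, 'has_basis', basis) statement."""
    triples = []
    for row in cases:
        triples.append((row["case"], "filed_in", row["county"]))
        triples.extend((row["case"], "has_basis", b)
                       for b in BASES if row[b] == 1)
    return triples


triples = to_triples(cases)
cases_per_basis = Counter(o for s, p, o in triples if p == "has_basis")
cases_per_county = Counter(o for s, p, o in triples if p == "filed_in")
print(cases_per_basis.most_common())
print(cases_per_county.most_common())
```

Once the flags are statements rather than columns, "which reasons come up most?" and "where do cases cluster?" become the same kind of counting question.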

FOAF – started in 2000 – "began with a simple observation"
If we are to support social networks on the Web, individuals must be able to take control of their own data, host it as they please, manage it using whatever tools they please, but still interact with other users, regardless of the choices these other users make. (p196)
This, more than Data.gov, demonstrates the adaptability of RDF, SPARQL and their associated languages, standards, frameworks, technologies – or however they should be described.

What's key is that "because Anyone can say Anything about Any topic, FOAF allows anyone to make novel statements about people, projects, and so on and to relate these statements to other statements already made."

Everything, in other words, can be connected to everything else, but by machines, and in ways that, ultimately, make sense to humans.

I think my inner hippie is showing…

My inner hippie. Yup. That's what I'm like on the inside.