Thursday, 6 August 2015

Semantic Web for the Working Ontologist: Chapter 14

In this week's chapter, Dean Allemang and Jim Hendler cover "Good and bad modeling practices". To be clear, some of the "bad" modeling practices they include are bad only in the context of the semantic web: they are standard in object systems and only become problematic when ported into an open environment where Anyone can say Anything about Anything, as opposed to a closed data system.

The chapter opens by outlining three ways to start model-building:
  1. "find models on the Web that suit your needs": that way you don't end up wasting time and other resources redoing work that somebody's already done
  2. "leverage information assets that already have value for your organization": the information you're working with is likely to already be "vetted"
  3. start from scratch, using "standard engineering practices … including the development of requirements definitions and test cases."
Whichever route you choose, there are questions that must be answered: is this model useful? What do we need this model to do? "This poses two issues for the modeler: How do I express my intended purpose for a model? How do I determine whether a model satisfies some purpose?" One way to do this is to frame "competency questions" – i.e., questions the model will need to answer – before developing it.
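To make the idea of a competency question a bit more concrete, here is a minimal sketch in plain Python. It is not from the book: the triples, the property names (`type`, `writtenBy`) and the helper functions are all made up for illustration, and a real project would more likely use an RDF store and SPARQL.

```python
# Hypothetical toy model: each statement is a (subject, predicate, object) tuple.
TRIPLES = {
    ("Hamlet", "type", "Play"),
    ("Macbeth", "type", "Play"),
    ("Hamlet", "writtenBy", "Shakespeare"),
    ("Macbeth", "writtenBy", "Shakespeare"),
    ("Shakespeare", "type", "Playwright"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return {
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


def works_by(triples, author):
    """Competency question: which works did this author write?"""
    return sorted(s for s, _, _ in match(triples, p="writtenBy", o=author))


def can_answer(triples, question, expected):
    """A competency question 'passes' if the model gives the expected answer."""
    return question(triples) == expected


# Written down before modeling, then run against the finished model.
print(can_answer(TRIPLES, lambda t: works_by(t, "Shakespeare"),
                 ["Hamlet", "Macbeth"]))
```

The point of the sketch is only that the question and its expected answer are fixed before the model is built, so the model can be checked against them afterwards.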


The AAA assumption adds a massive element of complexity to the entire process, because
"On the Semantic Web, it is expected that a model will be merged with other information, often from unanticipated sources. This means that the design of a semantic model must not only respond to known requirements … but also express a range of variation that anticipates to some extent the organization of the information with which it might be merged."
All a bit mind-boggling, really.
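A rough way to picture the merging problem, again as a sketch with invented data rather than anything from the chapter: if each source is a set of triples, merging is just a set union, and the merged graph can say things about a resource that neither source anticipated on its own. The source names `LIBRARY` and `THEATRE` and all the triples are hypothetical, and real RDF merging has extra wrinkles (blank nodes, for one) that this ignores.

```python
# Hypothetical sources; each statement is a (subject, predicate, object) tuple.
LIBRARY = {
    ("Hamlet", "type", "Play"),
    ("Hamlet", "writtenBy", "Shakespeare"),
}
THEATRE = {
    ("Hamlet", "type", "Production"),
    ("Hamlet", "performedAt", "GlobeTheatre"),
}


def merge(*sources):
    """Merge any number of triple sets; duplicate statements collapse."""
    merged = set()
    for source in sources:
        merged |= source
    return merged


def types_of(triples, subject):
    """List every class the subject is asserted to belong to."""
    return sorted(o for s, p, o in triples if s == subject and p == "type")


MERGED = merge(LIBRARY, THEATRE)
print(types_of(MERGED, "Hamlet"))
```

After the merge, "Hamlet" is both a Play and a Production, which is exactly the kind of variation a model designed only around the first source would not have planned for.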

The advice the authors give for dealing with this involves quoting the March Hare from Alice in Wonderland: "say what you mean and mean what you say".

Which translates into ensuring that:
  • the names you use for entities are meaningful
  • you follow simple conventions (such as starting class and individual names with uppercase letters, property names with lowercase letters and naming classes with singular, rather than plural, nouns)
  • you plan carefully in order to distinguish classes from individuals (this can be tricky)
Once assembled, your shiny new model can be tested by ensuring it answers the competency questions framed beforehand. Analysing "the inferences that the model entails" can determine "whether it maintains consistent answers to possible competency questions from multiple sources."
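The naming conventions in the list above are simple enough to check mechanically. Here is a small sketch of how that might look; the function `check_name` and its `kind` labels are my own invention, and the plural test is a deliberately crude heuristic (it just looks for a trailing "s") rather than real linguistics.

```python
def check_name(name, kind):
    """Return a list of convention warnings for a local name.

    kind is one of "class", "individual" or "property" (hypothetical labels).
    """
    if not name:
        return ["name is empty"]
    problems = []
    if kind in ("class", "individual") and not name[0].isupper():
        problems.append("should start with an uppercase letter")
    if kind == "property" and not name[0].islower():
        problems.append("should start with a lowercase letter")
    # Crude heuristic: a trailing single "s" suggests a plural noun.
    if kind == "class" and name.endswith("s") and not name.endswith("ss"):
        problems.append("looks plural; prefer a singular noun")
    return problems


for name, kind in [("Play", "class"), ("plays", "class"), ("WrittenBy", "property")]:
    print(name, kind, check_name(name, kind))
```

A check like this catches slips in capitalisation and pluralisation, but it cannot help with the hard part, which is deciding whether something should be a class or an individual in the first place.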




The remainder of the chapter is taken up with analysis of four common modeling errors:
  1. Rampant classism – where everything is defined as a class, even if it should be an individual
  2. Exclusivity – the flawed assumption that "the only candidates for membership in a subclass are those things that are already known to be members of a superclass".
  3. Objectification – where a system is built for the web that "has the same meaning and behaviour as an object system", which doesn't take into account "AAA, Open World and Nonunique Naming"
  4. Creeping conceptualization – when good modelers go bad (oh, ok, just get carried away) and "the idea of 'design for reuse' gets confused with 'say everything you can'" as modelers try to anticipate every conceivable use for their model and model all conceivable uses.
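The exclusivity error is easier to see with a worked example. The sketch below applies two RDFS-style inference rules (subclass propagation and property domain) to a handful of made-up triples; the names `Conductor`, `Musician`, `conducts` and `Kate` are hypothetical, and the `infer` function is a toy, not a real reasoner.

```python
def infer(triples):
    """Apply two RDFS-style rules until nothing new appears.

    Rule 1: (x type C) and (C subClassOf D)  =>  (x type D)
    Rule 2: (x p y)    and (p domain C)      =>  (x type C)
    """
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p == "type":
                for c, q, d in inferred:
                    if q == "subClassOf" and c == o:
                        new.add((s, "type", d))
            for prop, q, c in inferred:
                if q == "domain" and prop == p:
                    new.add((s, "type", c))
        if new <= inferred:
            return inferred
        inferred |= new


# Nobody has said that Kate is a Musician.
MODEL = {
    ("Conductor", "subClassOf", "Musician"),
    ("conducts", "domain", "Conductor"),
    ("Kate", "conducts", "Orchestra1"),
}

RESULT = infer(MODEL)
print(("Kate", "type", "Conductor") in RESULT)
print(("Kate", "type", "Musician") in RESULT)
```

Kate lands in the subclass first (because she conducts something) and only then in the superclass, so membership of the superclass was never a precondition for membership of the subclass.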
Ultimately, the authors say the way of telling if you've built a model that is useful and conforms to the assumptions inherent in the Semantic Web is "by making sure that the inferences it supports are useful and meaningful". Which seems slightly tautological, but hey, what do I know?



Next week I'm on holiday, and, in a shock break with tradition, am staying somewhere with no wifi. So I won't be blogging. Or – probably – coping with the lack of connectivity. Back in a fortnight…
