Thursday, 27 August 2015

HayleyWorld: brief musings on content and design

If you're new to this blog, and aren't familiar with my zoeography project, you may want to read one or more of my introductory posts for context before getting stuck in here.

So, what's new about it? – what I'm trying to do
A "zoeography"? – why I've invented a new term for this form
Introducing William Hayley – er, introduces my subject, William Hayley.

I've written three posts about thinking through the reader journey (one, two, three), and in this post I want to think through a couple of other elements of the project:

Firstly, content.

I've sorted and edited into the first person all the nuggets of text from Hayley's Memoirs that will form the backbone of the narrative. I've also keyworded and tagged them with their original Memoir volume and page numbers, dates, places and the people they refer to. Most of them will fit into the main HayleyWorld keyword topics, which, at the moment, are scheduled to be (in alphabetical order):
  • children
  • death
  • education
  • friendship
  • love
  • madness
  • marriage
  • medicine
  • poetry 
  • politics
  • theatre
These may change: it's possible that some subjects might have insufficient content, while others may need to be broken down into sub-topics. Some might overlap to such an extent that, having gone through one, a reader might be left with nothing substantial when they get to the other. This may turn out to be the case with, say, love and marriage and/or children and education. There's also a possibility that madness may turn out to be a subset of medicine. This probably wouldn't be a problem for anyone reading madness before medicine, but would be if the order's reversed.
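
To make the ordering worry concrete, here's a minimal Python sketch. Everything in it is an assumption for illustration: the `Extract` class, its field names and the sample data are invented, not taken from the actual HayleyWorld content. It measures how much of one topic a reader has already seen after going through another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extract:
    """One nugget of text plus its tags (field names are hypothetical)."""
    ref: str              # stand-in for a Memoirs volume/page reference
    keywords: frozenset   # the topic keywords attached to this extract


def overlap_ratio(extracts, topic_a, topic_b):
    """Fraction of topic_b's extracts already seen by a reader of topic_a."""
    in_b = [e for e in extracts if topic_b in e.keywords]
    if not in_b:
        return 0.0
    shared = [e for e in in_b if topic_a in e.keywords]
    return len(shared) / len(in_b)


# Invented sample data, purely for illustration.
extracts = [
    Extract("sample-1", frozenset({"love", "marriage"})),
    Extract("sample-2", frozenset({"marriage"})),
    Extract("sample-3", frozenset({"madness", "medicine"})),
    Extract("sample-4", frozenset({"medicine"})),
    Extract("sample-5", frozenset({"medicine", "death"})),
]

# How much of madness is left after medicine, and vice versa?
madness_after_medicine = overlap_ratio(extracts, "medicine", "madness")
medicine_after_madness = overlap_ratio(extracts, "madness", "medicine")
```

With this invented data, reading medicine first uses up all of madness (a ratio of 1.0), while reading madness first leaves two thirds of medicine unread.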

There will also be some extracts that fall outside the main topics. Given that the reader journey is partly driven by choices of these topics, how can I ensure these don't get lost? Do I need to ensure they don't get lost: if they don't fit in, are they superfluous? How will I know?

At the moment, I'm wondering about giving at least some of these another keyword – asides – and maybe treating them as comments and anecdotes for William Hayley to drop into the conversation, not quite randomly.

I'm also thinking about the look of the app. Hayley's words, thoughts and feelings will take centre page: mine, and others' – with the exception of his first wife Eliza's – will, I hope, appear as marginalia. In addition, I'm planning pop-up content, which will include Hayley's poems, extracts of unpublished correspondence and the like. Most, but not all, of this will be reader-controlled.

A while back, one of Contentment's designers mocked up a few ideas. I particularly like this one, which allows for up to two voices (or subjects) in the margin.

It's challenging to adapt something like that for mobile, while keeping the 18th century feel. I'm wondering if exploring the art of the 18th century portrait miniature (or even the transient late eighteenth/early nineteenth century fashion for eye miniatures) might yield some ideas. But the main issue is how we can make the thing feel comfortable and enjoyable to read, whilst also keeping a sense of Hayley's time and place…

Will hopefully have more to report in a couple of weeks.

Next week: the final chapter of Semantic Web for the Working Ontologist.

Thursday, 20 August 2015

Semantic Web for the Working Ontologist: Chapter 15

This week: "Expert modeling in OWL". In this – the penultimate chapter – Allemang and Hendler provide a brief outline of four subsets of OWL 2, each of which is tailored to particular modeling requirements. They also describe, tantalisingly, how the OWL 2 standard "is rich in modeling constructs that go beyond the scope of this book".

OWL 2 is backward compatible with 1, which means that all the modeling techniques taught in this book remain valid, and that the additional constructs are additions to/refinements of, rather than replacements for OWL 1 constructs and practices.

Each of the four OWL 2 subsets uses the "same set of modeling constructs": ie, the same properties and classes, the same syntax. They differ in that each is tailored to serve a different purpose.

OWL 2 DL – D is for decidability
For projects where decidability is key. A system "is decidable if there exists an effective method such that for every formula in the system the method is capable of deciding whether the formula is valid (is a theorem) in the system or not." In other words, it's designed for applications where precise and discrete definitions of things/entities are crucial.

OWL 2 DL is designed to enable modelers to create algorithms that can "determine which classes [in a given model] are equivalent to other classes, which classes are subclasses of other classes, and which individuals are members of which classes."
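
As a rough illustration of the kind of question such an algorithm answers, here is a toy Python sketch. It is not an OWL reasoner: it handles only subclass links, and the class names `Mammal` and `Animal` are invented for the example.

```python
def superclasses(cls, subclass_of):
    """All classes reachable from cls by following subclass links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def classify(individual_types, subclass_of):
    """Map each individual to its asserted classes plus inferred superclasses."""
    result = {}
    for individual, asserted in individual_types.items():
        inferred = set(asserted)
        for cls in asserted:
            inferred |= superclasses(cls, subclass_of)
        result[individual] = inferred
    return result


# Hypothetical model: Cow is a subclass of Mammal, Mammal of Animal.
subclass_of = {"Cow": {"Mammal"}, "Mammal": {"Animal"}}
individual_types = {"Elsie": {"Cow"}}

memberships = classify(individual_types, subclass_of)
```

Here `memberships["Elsie"]` comes out as Cow, Mammal and Animal, even though only Cow was stated.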

OWL 2 EL – E is for executable
For projects that are mostly about federating data from a variety of sources in order to provide an "integrated picture of some sort of domain". In this type of modeling – used, for instance, by search engines that need to provide good rather than perfect answers (partly because they need to take into account the fact that humans ask good rather than perfect questions) – "the model describes how information can be transformed into a uniform structure".

So OWL 2 EL is designed to "improve computational complexity".

OWL 2 QL – Q is for Query
This subset of OWL 2 is designed for working with/leveraging relational databases that require "fast responses to queries" applied to huge, specified data sets.
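
One way to picture this: a query asked at the ontology level is expanded, using the model, into ordinary SQL, so the database itself needs no reasoning. The sketch below is a toy version that assumes a single hypothetical `type_assertion` table and handles only subclass axioms; real OWL 2 QL rewriting covers considerably more.

```python
import sqlite3


def subclasses_of(cls, subclass_of):
    """cls plus every class declared, directly or indirectly, beneath it."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for child, parent in subclass_of:
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def rewrite_instances_query(cls, subclass_of):
    """Rewrite 'find all instances of cls' as plain SQL plus its parameters."""
    classes = sorted(subclasses_of(cls, subclass_of))
    placeholders = ", ".join("?" for _ in classes)
    sql = (
        "SELECT DISTINCT individual FROM type_assertion "
        f"WHERE class_name IN ({placeholders}) ORDER BY individual"
    )
    return sql, classes


# Invented ontology and data: Cow is a subclass of Animal; Dog is unrelated.
subclass_of = [("Cow", "Animal")]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE type_assertion (individual TEXT, class_name TEXT)")
conn.executemany(
    "INSERT INTO type_assertion VALUES (?, ?)",
    [("Elsie", "Cow"), ("Lulu", "Cow"), ("Rex", "Dog")],
)

sql, params = rewrite_instances_query("Animal", subclass_of)
animals = [row[0] for row in conn.execute(sql, params)]
```

Nothing in the table says that Elsie or Lulu is an Animal; the rewritten SQL finds them anyway because the subclass knowledge has been folded into the query.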

"Queries against an OWL 2 QL ontology and corresponding data can be rewritten faithfully into SQL".

OWL 2 RL – R is for Rules
This subset of OWL 2 is restricted to enable compatibility with rules-based processing. This is particularly useful for multipart properties – where properties relate to each other in ways other than the hierarchical. For instance, the concept of "aunt". I can only be an aunt if I am the sister of someone who is a parent. "Aunt" is thus a multipart property because it is made up of more than one property: parent and sister in that – daisy-chained – order.
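
The daisy-chaining can be sketched in Python as composing two relations. The individuals below are invented, and the function is a toy stand-in for what a rules engine would do; in OWL 2 itself this kind of chain is declared with `owl:propertyChainAxiom`.

```python
def chain(first, second):
    """Compose two relations: (a, c) whenever (a, b) in first and (b, c) in second."""
    return {(a, c) for (a, b) in first for (b2, c) in second if b == b2}


# Invented individuals, purely for illustration.
has_parent = {("Me", "Mum"), ("Cousin", "Uncle")}
has_sister = {("Mum", "AuntJane"), ("Uncle", "Mum")}

# hasAunt is hasParent followed by hasSister, in that order.
has_aunt = chain(has_parent, has_sister)
```

The order matters: chaining sister then parent would pick out a different relationship altogether.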

Interestingly, multipart predicates were left out of OWL 1 "because they were thought to cause undecidability". But more recent work has demonstrated that "under certain conditions" this need not be the case.

The authors illustrate multipart predicates with an example model of "A child should have the same species as its parent":
":Elsie :hasParent :Lulu
:Lulu :hasSpecies :Cow"
So "we can infer that
 :Elsie :hasSpecies :Cow"
Elsie and Lulu. Or two other cows.
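
The inference quoted above can be reproduced with a few lines of Python. This is only a sketch of the rule being applied repeatedly until nothing new turns up; the triples are plain tuples rather than real RDF, and `Daisy` is an invented extra calf to show the rule carrying across generations.

```python
def infer_species(triples):
    """Apply 'a child has the same species as its parent' until nothing new appears."""
    inferred = set(triples)
    while True:
        new = {
            (child, "hasSpecies", species)
            for (child, p1, parent) in inferred if p1 == "hasParent"
            for (subject, p2, species) in inferred
            if p2 == "hasSpecies" and subject == parent
        }
        if new <= inferred:
            return inferred
        inferred |= new


triples = {
    ("Elsie", "hasParent", "Lulu"),
    ("Lulu", "hasSpecies", "Cow"),
    ("Daisy", "hasParent", "Elsie"),  # invented third generation
}

result = infer_species(triples)
```

Only Lulu's species is stated, but `result` also contains the inferred species for Elsie and, one step further on, for Daisy.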

Incidentally, the authors also briefly discuss metamodeling ("using a model to describe another model"), recommending the use of the Class-Individual Mirror pattern for this.

Next week - um, not sure yet. Might write about anxiety, connectedness and correspondence. Or something else. May be suffering from undecidability…

Thursday, 6 August 2015

Semantic Web for the Working Ontologist: Chapter 14

In this week's chapter, Dean Allemang and Jim Hendler cover "Good and bad modeling practices". To be clear, some of the "bad" modeling practices they include are bad only in the context of the semantic web: they are standard in object systems and only become problematic when ported into an open environment where Anyone can say Anything about Anything, as opposed to a closed data system.

The chapter opens by outlining three ways to start model-building:
  1. "find models on the Web that suit your needs" - that way you don't end up wasting time and other resources redoing work that somebody's already done
  2. "leverage information assets that already have value for your organization": the information you're working with is likely to already be "vetted"
  3. start from scratch, using "standard engineering practices … including the development of requirements definitions and test cases".
Whichever route you choose, there are questions that must be answered: is this model useful? What do we need this model to do? "This poses two issues for the modeler: How do I express my intended purpose for a model? How do I determine whether a model satisfies some purpose?" One way to do this is to frame "competency questions" – ie, questions the model will need to answer – before developing it.

The AAA assumption adds a massive element of complexity to the entire process, because
"On the Semantic Web, it is expected that a model will be merged with other information, often from unanticipated sources. This means that the design of a semantic model must not only respond to known requirements … but also express a range of variation that anticipates to some extent the organization of the information with which it might be merged."
All a bit mind-boggling, really.

The advice the authors give for dealing with this involves quoting the March Hare from Alice in Wonderland: "say what you mean and mean what you say".

Which translates into ensuring that
  • the names you use for entities are meaningful
  • you follow simple conventions (such as starting class and individual names with uppercase letters, property names with lowercase letters and naming classes with singular, rather than plural, nouns)
  • you plan carefully in order to distinguish classes from individuals (this can be tricky)
Once assembled, your shiny new model can be tested by ensuring it answers the competency questions framed beforehand. Analysing "the inferences that the model entails" can determine "whether it maintains consistent answers to possible competency questions from multiple sources".
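
A minimal sketch of that testing step, assuming the model is just a set of (subject, predicate, object) tuples. The `answers` helper, the sample triples and the questions are all invented for illustration; a real project would more likely run queries against a proper triple store.

```python
def answers(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return {
        (s, p, o) for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    }


# Invented sample model.
model = {
    ("Elsie", "hasParent", "Lulu"),
    ("Elsie", "hasSpecies", "Cow"),
    ("Lulu", "hasSpecies", "Cow"),
}

# Each competency question pairs a plain-English prompt with a check on the model.
competency_questions = {
    "What species is Elsie?":
        lambda m: {o for (_, _, o) in answers(m, "Elsie", "hasSpecies")} == {"Cow"},
    "Who is Elsie's parent?":
        lambda m: {o for (_, _, o) in answers(m, "Elsie", "hasParent")} == {"Lulu"},
}

# Any question listed here is one the model cannot yet answer correctly.
failures = [q for q, check in competency_questions.items() if not check(model)]
```

Framing the questions first and keeping them as checks means they can be rerun every time the model changes or is merged with data from somewhere else.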

The remainder of the chapter is taken up with analysis of four common modeling errors:
  1. Rampant classism – where everything is defined as a class, even if it should be an individual
  2. Exclusivity – the flawed assumption that "the only candidates for membership in a subclass are those things that are already known to be members of a superclass".
  3. Objectification – where a system is built for the web that "has the same meaning and behaviour as an object system", which doesn't take into account "AAA, Open World and Nonunique Naming"
  4. Creeping conceptualization – when good modelers go bad (oh, ok, just get carried away) and "the idea of 'design for reuse' gets confused with 'say everything you can'" as modelers try to anticipate every conceivable use for their model and model all conceivable uses.
Ultimately, the authors say the way of telling if you've built a model that is useful and conforms to the assumptions inherent in the Semantic Web is "by making sure that the inferences it supports are useful and meaningful". Which seems slightly tautological, but hey, what do I know?

Next week I'm on holiday, and, in a shock break with tradition, am staying somewhere with no wifi. So I won't be blogging. Or – probably – coping with the lack of connectivity. Back in a fortnight…