Semiological, or almost entirely?
Mike Harper:
Semiotics, which is clearly older than the semantic web, tells us you can’t always map signs to real world objects. You can do it for things like, say, the Taj Mahal, but not for things like democracy, justice etc. So they map to concepts. Trouble is, you’re talking really about what’s inside someone else’s head. And you can’t really be sure what that is. So, the argument goes, stuff like RDF is just “syntactic sugar”. It’s neatly structured but can’t escape the fact that the tags, URNs etc have to have an agreed meaning ... I can’t bring myself to agree with this completely. In practice people seem to get by. I think there must be a feedback loop involved. If you interpret a statement about X and act on it, and your interpretation is wrong, and the interpretation matters in this case, something bad will probably happen. You will then revise your understanding of what is meant by X.
This is all good phenomenological stuff - see the Schutz quote above. One of Schutz's great arguments was that there is no definitional God's eye view - there is only human social experience, including the experience of making and using signs.
So surely the semantic web can work in small ways where all parties are agreed on the meaning of the vocabulary.
The trouble is - as Clay pointed out back here - that if you've got that level of agreement among all participants you don't need the semantics. If you're all using the same schema anyway, your respective schemas don't need to describe themselves - and if they do need to describe themselves, there needs to be a common language they can do it in, and hence a higher level of shared context.
What you can do is say "I'm using [x] to mean $FOO, which is a subtype of $BAR but does not overlap with $BAZ; how about you?" Or rather, "On 2005-06-03, writing in Manchester (England/UK/EU), I used [x] to mean $FOO..." and so on. That, to me, is (or rather will be) where it gets interesting - the point is not to encode semiotics but to encode semantics in such a way that the semiotics can be inferred.
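To make that kind of declaration a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `TermUse` record, its field names and the `conflicts` check are my own illustration, not part of RDF or any existing vocabulary, and the second party's declaration is invented purely for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TermUse:
    """One party's declared use of a term at a given time and place (hypothetical record)."""
    term: str                 # the sign itself, e.g. "[x]"
    meaning: str              # the concept the author says it stands for
    subtype_of: frozenset     # broader concepts this meaning falls under
    disjoint_with: frozenset  # concepts it explicitly does not overlap
    stated_on: date
    stated_in: str


def conflicts(mine: TermUse, yours: TermUse) -> list:
    """List the reasons two declared uses cannot be treated as equivalent."""
    if mine.term != yours.term:
        return ["different terms"]
    problems = []
    if mine.meaning != yours.meaning:
        problems.append(
            f"{mine.term} means {mine.meaning} to one party and {yours.meaning} to the other"
        )
    if yours.meaning in mine.disjoint_with or mine.meaning in yours.disjoint_with:
        problems.append("one party's meaning is explicitly excluded by the other")
    return problems


# "On 2005-06-03, writing in Manchester (England/UK/EU), I used [x] to mean $FOO..."
mine = TermUse("[x]", "FOO", frozenset({"BAR"}), frozenset({"BAZ"}),
               date(2005, 6, 3), "Manchester (England/UK/EU)")
# An invented reply from someone who uses the same sign for $BAZ.
yours = TermUse("[x]", "BAZ", frozenset(), frozenset(),
                date(2005, 6, 3), "elsewhere")

for problem in conflicts(mine, yours):
    print(problem)
```

The check doesn't settle what [x] 'really' means; all it does is put the two declarations side by side so that the disagreement can't go unnoticed.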
Or rather, in such a way that the semiotics can't not be inferred. Which they need to be. Once you get away from the physical sciences and their geek spinoffs, it's very, very hard to reach a final level of granularity. You can map the physical contours of France in exactly the same way that you can map Britain - and with enough data you could map Britain 100 years ago and map France 100 years ago in exactly the same way. What you can't do is chart the number of suicides or street thefts or families in poverty or users of illegal drugs or asylum applications or hospital admissions in Britain and compare them with the figures for Britain 100 years ago, let alone with French figures. This is not because the data isn't there, but (in all those cases) because it's the product of a complex set of social interactions - and, as such, it doesn't have a stable meaning, in time or in space.
This is what I mean about inferring semiotics: figures on 'drug use', to take the most obvious example, are produced in particular ways and classified using particular criteria, which correspond to patterns of public health and law enforcement activity as well as to broader social attitudes. The data doesn't contain or express those attitudes and patterns of activity - but if you don't know about them it's effectively meaningless. ("Hey, look, there are twice as many people using drugs! Oh, wait, there are twice as many substances classified as drugs. Never mind.") The only way forward, it seems to me, is to (as it were) factory-stamp data with the conditions of its production, as far as they can be established: "this source on 'drugs' covers this period in this jurisdiction, and consequently uses definitions derived from this legislation, including this but excluding this and this".
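As a rough illustration of what factory-stamping might look like, here is a minimal Python sketch. All of it is assumption on my part: the `Stamp` and `Figure` records, the field names, and the placeholder values ("Act A", "substance-1", the counts) are invented for the example and don't come from any real dataset or standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Stamp:
    """Conditions of production for a figure (hypothetical fields)."""
    jurisdiction: str
    period_start: date
    period_end: date
    legislation: str      # where the definitions come from
    includes: frozenset   # what the definition covers
    excludes: frozenset   # what it leaves out; recorded for the reader, not checked below


@dataclass(frozen=True)
class Figure:
    label: str
    value: int
    stamp: Stamp


def definitional_differences(a: Figure, b: Figure) -> list:
    """Describe how the definitions behind two figures differ."""
    notes = []
    if a.stamp.jurisdiction != b.stamp.jurisdiction:
        notes.append(f"jurisdiction: {a.stamp.jurisdiction} vs {b.stamp.jurisdiction}")
    if a.stamp.legislation != b.stamp.legislation:
        notes.append(f"legislation: {a.stamp.legislation} vs {b.stamp.legislation}")
    added = b.stamp.includes - a.stamp.includes
    dropped = a.stamp.includes - b.stamp.includes
    if added:
        notes.append("newly included: " + ", ".join(sorted(added)))
    if dropped:
        notes.append("no longer included: " + ", ".join(sorted(dropped)))
    return notes


def change_ratio(a: Figure, b: Figure) -> float:
    """Return b.value / a.value, but only if the two figures count the same thing."""
    notes = definitional_differences(a, b)
    if notes:
        raise ValueError("figures are not directly comparable: " + "; ".join(notes))
    return b.value / a.value


# Placeholder data: same jurisdiction, but the later figure counts more substances.
earlier = Figure("drug use", 1000, Stamp(
    "Jurisdiction J", date(2000, 1, 1), date(2000, 12, 31),
    "Act A", frozenset({"substance-1"}), frozenset({"substance-2"})))
later = Figure("drug use", 2000, Stamp(
    "Jurisdiction J", date(2005, 1, 1), date(2005, 12, 31),
    "Act B", frozenset({"substance-1", "substance-2"}), frozenset()))

for note in definitional_differences(earlier, later):
    print(note)
```

With the stamp attached, the "twice as many people using drugs" comparison fails loudly in `change_ratio` instead of producing a number, and the notes say why.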
That's what I'd like to do, anyway.
5 Comments:
Reading this makes me stand back and rethink what it means to encode data semantically. In fact I think it points to the limits of what could be achieved.
Isn't it a bit like the problem of translating material between languages where the cultures are very different, or ancient languages where the culture has disappeared? Some meaning is bound to be lost.
I like your point about drug use. My favourite statistic is the percentage of people who are obese, which shifts alarmingly whenever anyone changes the definition of obese.
By Anonymous, at 5/6/05 20:40
Isn't it a bit like the problem of translating material between languages where the cultures are very different, or ancient languages where the culture has disappeared? Some meaning is bound to be lost.
Very like. (Even in living languages, it's surprisingly hard to find one-to-one correspondences. You end up with situations like the European Union defining 'jam' as something made from fruit, then including a footnote saying that for the purposes of the definition 'fruit' includes carrots - they make a lot of carrot jam in Portugal, apparently.) But yes, there's more variation than you might think within a single language, particularly when people start counting things & shaping definitions according to what they want to count: it's not so much apple-the-fruit vs Apple-the-company as "sales of all Apple products" vs "sales of all Apple hardware" vs "sales of all Apple computers" vs "sales of all Macs"...
I like your point about drug use. My favourite statistic is the percentage of people who are obese, which shifts alarmingly whenever anyone changes the definition of obese.
We had a live example of this during the recent election campaign: the Conservatives claimed that police recorded crime figures had shot up in the last few years. Which, indeed, they had. They'd gone up because the police had been told to record more things as crimes; police figures had for many years been substantially lower than the figures given by victim surveys, and the government wanted to try to close the gap. It was quite a clever move by the Conservatives, as it was difficult for the government to point out that there hadn't (necessarily) been a real increase in crime without acknowledging that police figures had previously been too low - kind of "when did you stop cooking the figures?"
By Phil, at 6/6/05 10:15
Mike - I suspect I've just explained a load of stuff you already knew. Sorry about that - I was convinced you were American for some reason.
By Phil, at 6/6/05 10:20
No problem, it might be the .com domain that did it. I'm actually a Yorkshireman.
It is interesting to hear about the crime figures, because I don't usually pay much attention to what our politicians say.
By Anonymous, at 6/6/05 23:25
Yeah i don't know what you are trying to say but Nirvana is awesome!! They are the best ever!!
By Anonymous, at 13/6/05 13:00