Tagging ambiguities
Sunday, November 16, 2008

I blogged a couple of months ago about taking part in the ReScope study on delicious usage. It's encouraged me to reflect on the way I tag.

Open tagging leads to relatively chaotic tag choices. This is often raised as a criticism of the "wisdom of crowds" approach to tagging, where a community-owned folksonomy of tags emerges (as opposed to a centrally managed taxonomy).

Marieke Guy and Emma Tonkin published a paper on this in D-Lib Magazine, entitled "Folksonomies: Tidying up Tags?", in which they explored the extent to which tags used in delicious and flickr were based on dictionary words. They drew up a list of some of the possible ways that tag usage can be ambiguous.

Polysemy relates to a word being used in different contexts with different but related shades of meaning. So, for example, in my own tagging practice I have sometimes used the tag "identity" to refer to the ways that subjects construct and assert their own identities, with particular reference to activity on the web. Other times I have used it in a more technical sense, around the validation and authentication of tokens to confer access to resources.


Homonyms are words which are spelled the same but actually have different roots and are therefore not related. For example, I have used the tag "owl" to refer to the bird that hunts by night (and in particular to the Owl of Minerva, which Hegel says does not take flight until darkness falls). But I've also used it in its sense as an acronym for Web Ontology Language.

There is also a problem with synonyms. For example, I have used the two tags "QR" and "qrcode" where it would have been more useful to have used a single tag (and I will unify on "qrcode").

Guy and Tonkin also identified problems where different users chose to categorise similar phenomena at very different levels. I have done this myself: sometimes I have tagged an event generically as "conference", but other times I have used a shared, specific "channel"-style tag for it, such as "altc2008".

Finally, there are differences in spelling, use of plurals, simple data entry mistakes and so on.
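
Some of this tidying up can be done mechanically. Here is a minimal sketch in Python of how I might fold case differences, stray whitespace and known synonyms into a single preferred tag before counting. The `SYNONYMS` map and the function names are my own illustrative assumptions, not anything delicious provides.

```python
from collections import Counter

# Hypothetical synonym map: variant tag -> preferred tag.
# Only the QR/qrcode pair comes from my own tagging; extend as needed.
SYNONYMS = {"qr": "qrcode"}


def normalise(tag: str) -> str:
    """Trim and lowercase a tag, then map known variants to the preferred form."""
    cleaned = tag.strip().lower()
    return SYNONYMS.get(cleaned, cleaned)


def tag_counts(tags):
    """Count tags after normalisation, so variants are tallied together."""
    return Counter(normalise(t) for t in tags)


counts = tag_counts(["QR", "qrcode", "owl", "Owl ", "identity"])
print(counts.most_common())
```

Note what this does not do: "owl" the bird and "owl" the ontology language still end up in the same bucket. Polysemy and homonyms need a human (or at least some context) to untangle; only the spelling-level noise and agreed synonyms are easy to clean up this way.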

And these are just the issues that arise with one single user tagging over a three-year period.

Delicious had 1 million registered users in Sept 2006 and 5.3 million in Nov 2008, with 180 million unique URLs saved.

So clearly there's a rich and complex tag soup out there.

And yet ... there is absolutely no way that I would use a formal classification system to add metadata to the web pages I am interested in. Folksonomy is noisy, but at least there's some signal in there.

Owl of Minerva image from: http://commons.wikimedia.org/wiki/File:Owl_of_Minerva.jpg
Licensed under: http://en.wikipedia.org/wiki/GNU_Free_Documentation_License
