Issue 12073

Decide on open issues regarding Tags

Reporter: lfrancke
Type: Task
Summary: Decide on open issues regarding Tags
Priority: Minor
Resolution: Fixed
Status: Closed
Created: 2012-10-24 00:35:13.16
Updated: 2013-12-06 13:54:36.199
Resolved: 2013-12-06 13:54:36.168
        
Description: Open issues from the discussion listed on the Wiki http://dev.gbif.org/wiki/display/POR/Tags

* case sensitivity of namespaces and predicates
* rename predicate to key
* when no predicate is given, where does the value go?

Additionally, I find the TagNamespace API confusing, especially {{getPrefix}} and the URL stuff.

Examples of Tagging APIs:

* Delicious (https://delicious.com/) was the first service to make tags popular. It uses simple labels, which can be private or public.
* Flickr has two kinds of tags (https://secure.flickr.com/help/tags)
** Machine tags
** Simple tags/labels
* OpenStreetMap (https://wiki.openstreetmap.org/wiki/API_v0.6#Tags) has key-value pairs, which are case-sensitive and mirror Java's Map semantics (one value per key)
* Tumblr/Wordpress etc. (http://www.tumblr.com/docs/en/api/v2) have tags as simple string "labels"
* The Wikipedia article: https://en.wikipedia.org/wiki/Tag_(metadata)
* HTML meta tags have a name and content
    


Author: mdoering@gbif.org
Created: 2012-10-24 10:06:58.107
Updated: 2012-10-24 10:06:58.107
        
The namespace prefix is a good catch, we never sorted that out.
Initially we thought of a namespace like the one in XML, which is a URI. Because it's nasty to deal with in many cases, we decided to allow a prefix, basically a unique alias, as an alternative way to represent it. So "harvesting.gbif.org" might be just "hit", and you can add the same tag with hit:ignore=true or harvesting.gbif.org:ignore=true
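
A minimal sketch of the alias idea, written in Python for brevity. The {{PREFIXES}} table and both function names are hypothetical and not part of the existing TagNamespace API. It also assumes the namespace itself contains no colon, which would not hold for a full URI.

```python
# Hypothetical registry mapping a short prefix (alias) to its full namespace.
PREFIXES = {"hit": "harvesting.gbif.org"}


def canonical_namespace(name: str) -> str:
    """Return the full namespace for a known prefix, or the name unchanged."""
    return PREFIXES.get(name, name)


def canonical_tag(tag: str) -> str:
    """Rewrite 'namespace:predicate=value' so the namespace is the full form."""
    namespace, sep, rest = tag.partition(":")
    if not sep:
        return tag  # no namespace part, nothing to resolve
    return f"{canonical_namespace(namespace)}:{rest}"
```

With this, hit:ignore=true and harvesting.gbif.org:ignore=true resolve to the same stored tag.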

Definitely needs further discussion.
    


Author: mdoering@gbif.org
Created: 2012-10-24 11:40:30.918
Updated: 2012-10-24 11:40:30.918
        
Our tag API is mostly modelled after Flickr machine tags.
Quoting the "What are machine tags?" section of
http://www.flickr.com/help/tags/

---
Machine tags always have 3 parts just like the Upcoming example "upcoming:event=428084".

namespace:predicate=value

a namespace, i.e. upcoming [who is going to care about this tag]
a predicate, i.e. event [what does this apply to]
a value, i.e. 123456 [which one is this]
To see this in another example, you can record location information by entering latitude and longitude as geo:lat=12.345678 & geo:lon=12.345678
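
For illustration, the three-part syntax above could be parsed roughly like this. This is a sketch only: the {{MachineTag}} type, the {{parse_machine_tag}} function and the exact character rules in the pattern are assumptions, not Flickr's or our implementation.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# Assumption: namespace and predicate may not contain ':' or '=';
# the value is everything after the first '='.
_PATTERN = re.compile(r"^([^:=]+):([^:=]+)=(.*)$")


def parse_machine_tag(text: str) -> Optional[MachineTag]:
    """Split 'namespace:predicate=value'; return None for anything else."""
    match = _PATTERN.match(text)
    if match is None:
        return None
    return MachineTag(*match.groups())
```

A plain label such as "sunset" does not match the pattern, so the caller has to decide separately how to treat it.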
    


Author: mdoering@gbif.org
Created: 2012-10-24 11:41:35.367
Updated: 2012-10-24 11:41:35.367
        
The Flickr page linked above also says:
"Namespaces and predicates are case-insensitive"
    


Author: mdoering@gbif.org
Created: 2012-10-24 11:43:17.562
Updated: 2012-10-24 11:43:48.375
        
The official announcement Flickr made about machine tags in the Flickr API group:
http://www.flickr.com/groups/api/discuss/72157594497877875
    


Author: mdoering@gbif.org
Comment: http://tagaholic.me/2009/03/26/what-are-machine-tags.html
Created: 2012-10-24 11:47:50.163
Updated: 2012-10-24 11:47:50.163


Author: mdoering@gbif.org
Created: 2012-10-24 11:50:03.624
Updated: 2012-10-24 11:50:19.355
        
jQuery library for processing machine tags:
https://github.com/cldwalker/machinetag.js
    


Author: lfrancke@gbif.org
Created: 2012-10-24 11:59:34.363
Updated: 2012-10-24 12:05:53.333
        
You really really like the Flickr API, eh?

I've added links to the issue.

I think the problem with all three open issues comes from the fact that we all have different backgrounds. The term _tag_ is overloaded with at least three different concepts, and we're trying to use all three of them at the same time, so we need to find a model that supports that.

* RDF-like tags (see Flickr)
* Key-value semantics (OpenStreetMap)
* Labels

None of these is right or wrong. They all do different things and we need all of them.
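
To make the difference between the three concepts concrete, here is a rough sketch in Python. All tag names and values are made up for illustration.

```python
# Labels: plain strings with no further structure.
labels = {"birds", "europe"}

# Key-value semantics: a map with one value per key, so setting a key
# again replaces the earlier value.
key_values = {}
key_values["crawl.attempt"] = "1"
key_values["crawl.attempt"] = "2"

# RDF-like tags: a set of (namespace, predicate, value) triples, where the
# same namespace and predicate may appear with several values.
triples = set()
triples.add(("dwc", "country", "Denmark"))
triples.add(("dwc", "country", "Germany"))
```

After this runs, {{key_values}} holds a single entry while {{triples}} holds two, which is the behavioural difference the terminology has to cover.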

* The portal mostly exposes simple labels
* The crawler (and, I suspect, a lot of other tools to follow, like the DwC Validator) needs key-value semantics
* I'm not actually sure who needs the RDF kind of semantics

For my use of _tags_, the terminology of predicate and value is wrong no matter how you look at it. For other uses it may be correct.

Maybe it all boils down to just one simple question:
* Which of these use cases do we want to support using _tags_?
** If we decide that we want all of them, we'll never get a terminology or even semantics that suits all of them, and we need to live with that
** If we decide to drop something (like the key-value stuff), where does that live?

Edit: I think the other issues (case sensitivity, no predicate given) come down to exactly the same thing. Having the value as the key in a key-value approach is clearly wrong. Similarly, I wouldn't expect my key-values to be changed at all. For labels and RDF this might be different again.
    


Author: mdoering@gbif.org
Created: 2012-10-24 12:14:50.973
Updated: 2012-10-24 12:14:50.973
        
Lars, you are correct: there are at least these three different types of tagging, and we modelled the current one after Flickr's machine tags.
The reason is that it's already in use in our community, where people tag images with DwC terms, for example; see EOL's Flickr group.

We internally at least also want to tag datasets with DwC terms to give some machine-readable context to things humans can derive from the metadata descriptions. So we need more than just a key-value map for that.

We felt that plain-text keywords/tags are not worth keeping separate and could easily be integrated into the structured three-part tags. I still think this is a reasonable thing to do, in particular because a structured tag can always be represented as a single string and can therefore be passed around, even in applications that only recognize simple tags, without losing its structure.
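
A sketch of that round trip, in Python for brevity. The {{Tag}} type and both functions are hypothetical. Putting a bare label into the value field is only one possible answer to the open "no predicate given" question and is assumed here purely for illustration.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    value: str
    namespace: Optional[str] = None
    predicate: Optional[str] = None


def to_string(tag: Tag) -> str:
    """Serialize a tag; a plain label is written as just its value."""
    if tag.namespace is None or tag.predicate is None:
        return tag.value
    return f"{tag.namespace}:{tag.predicate}={tag.value}"


def from_string(text: str) -> Tag:
    """Read 'namespace:predicate=value', falling back to a plain label."""
    head, eq, value = text.partition("=")
    namespace, colon, predicate = head.partition(":")
    if eq and colon and namespace and predicate:
        return Tag(value, namespace, predicate)
    return Tag(text)  # assumption: the whole string becomes the value
```

An application that only understands simple tags can store the output of {{to_string}} unchanged, and {{from_string}} recovers the structure later.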
    


Author: trobertson@gbif.org
Created: 2012-10-24 12:24:21.169
Updated: 2012-10-24 12:24:21.169
        
I have written my thoughts as an example API:
  http://dev.gbif.org/code/cru/CR-POR-86

It is not complete (e.g. the service will throw permission exceptions), but I wanted to keep it simple for illustrative purposes.
    


Author: lfrancke@gbif.org
Created: 2012-10-24 12:32:13.856
Updated: 2012-10-24 12:32:13.856
        
{quote}We internally at least also want to tag datasets with DwC terms to give some machine-readable context to things humans can derive from the metadata descriptions. So we need more than just a key-value map for that.{quote}

What does that mean? We can tag datasets with DwC terms in all three "concepts".
    


Author: fmendez@gbif.org
Comment: Tags were implemented in Registry 2.0
Created: 2013-12-06 13:54:36.199
Updated: 2013-12-06 13:54:36.199