Issue 14338

Revise how we handle failed field validation for non-essential fields

14338
Reporter: kbraak
Type: Task
Summary: Revise how we handle failed field validation for non-essential fields
Priority: Major
Status: Open
Created: 2013-11-07 15:08:35.958
Updated: 2014-04-02 17:41:58.341
        
Description: We experienced a crawl today that failed due to bad metadata.

Specifically, the resource contact's position was "-", which is too short according to the 2 character minimum assertion on its corresponding column in the registry database schema. The contact position is non-essential compared to the contact name or email address.

We should decide which field validation failures should prevent persistence (and ultimately crawling), or whether to just replace the bad values with null if they fail validation.
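
To make the two options concrete, here is a minimal sketch of one possible policy, not the registry's actual code. It assumes a hypothetical Contact record, assumes that name and email are the essential fields and position is the non-essential one, and applies the 2 character minimum to all three purely for illustration. An essential field that fails validation blocks persistence; a non-essential one is replaced with null.

```python
from dataclasses import dataclass, replace
from typing import Optional

MIN_LENGTH = 2  # the 2 character minimum mentioned in this issue

# Assumption: which fields are essential is a policy choice still to be decided.
ESSENTIAL_FIELDS = ("name", "email")
NON_ESSENTIAL_FIELDS = ("position",)


@dataclass(frozen=True)
class Contact:
    """Hypothetical stand-in for a registry contact."""
    name: Optional[str] = None
    email: Optional[str] = None
    position: Optional[str] = None


def is_valid(value: Optional[str]) -> bool:
    # A missing value is allowed; a present value must meet the minimum length.
    return value is None or len(value) >= MIN_LENGTH


def sanitize(contact: Contact) -> Contact:
    """Reject bad essential fields, null out bad non-essential ones."""
    for name in ESSENTIAL_FIELDS:
        if not is_valid(getattr(contact, name)):
            raise ValueError(f"essential contact field '{name}' failed validation")
    nulled = {
        name: None
        for name in NON_ESSENTIAL_FIELDS
        if not is_valid(getattr(contact, name))
    }
    # Return a copy with the failing non-essential fields set to None.
    return replace(contact, **nulled)
```

With this policy, a contact whose position is "-" would be persisted with a null position instead of failing the whole crawl.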

Of course it would be ideal if the IPT imposed the same field validation as the registry does.

It would also be nice if the crawler could validate the metadata just like it validates its occurrence-id/triplet. That way, if the metadata doesn't validate, the crawler could halt the crawl and explain why.
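
A rough sketch of such a pre-crawl check follows. It is only an illustration: the field key "contact.position", the CrawlHalted exception and both function names are hypothetical, and the real crawler would need to use whatever constraints the registry actually enforces.

```python
class CrawlHalted(Exception):
    """Hypothetical error raised when metadata fails the pre-crawl check."""


def validation_errors(metadata, min_lengths):
    """Return one human-readable reason per field shorter than its minimum."""
    errors = []
    for field, min_len in min_lengths.items():
        value = metadata.get(field)
        # Missing values are not treated as failures in this sketch.
        if value is not None and len(value) < min_len:
            errors.append(f"{field}={value!r} is shorter than {min_len} characters")
    return errors


def check_before_crawl(metadata, min_lengths):
    """Halt before crawling and report every reason the metadata is invalid."""
    errors = validation_errors(metadata, min_lengths)
    if errors:
        raise CrawlHalted("metadata failed validation: " + "; ".join(errors))
```

Calling check_before_crawl({"contact.position": "-"}, {"contact.position": 2}) would raise CrawlHalted with a message naming the offending field and value, so the reason for the halt is visible without inspecting the database error.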





Author: kbraak@gbif.org
Comment: An issue for the IPT to impose the same field validation as the Registry has been added in Google Code: https://code.google.com/p/gbif-providertoolkit/issues/detail?id=1038
Created: 2013-12-12 16:05:26.824
Updated: 2013-12-12 16:05:26.824