Issue 17107

US-style verbatim dates not parsed correctly by provider

17107
Reporter: rdmpage
Assignee: cgendreau
Type: Bug
Summary: US-style verbatim dates not parsed correctly by provider
Description: Occurrences provided by TTU Mammals, such as http://www.gbif.org/occurrence/911715813, have incorrect dates. In the example record, GBIF interprets the date as "Jan 1, 2001 12:00:00 AM". The verbatim record gives the day as "7" and the verbatim date as "7/19/2001", in other words July 19, 2001. It looks like the error originates with the data from the provider, and I've posted an issue about this on VertNet's GitHub (https://github.com/ttu-vertnet/ttu-mammals/issues/9), but it is worth noting here as well.
Priority: Major
Status: Open
Created: 2015-02-07 11:56:33.315
Updated: 2016-02-05 16:49:11.527


Author: mdoering@gbif.org
Comment: Unfortunately, date interpretation in DwC is rather complex because of the many forms a date can take. We look primarily at the eventDate, which is given here as just "2001". As the month is missing, adding the day alone does not help when we try to create a proper timestamp in the interpreted record. We should probably look into the verbatimEventDate content and try to interpret it when the eventDate or the year/month/day fields are not fully populated.
Created: 2015-02-07 13:21:24.933
Updated: 2015-02-07 13:21:24.933
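
The fallback suggested in the comment above could look roughly like the following. This is only a sketch, not GBIF's actual interpretation code: the function name `interpret_date`, the dict-based record, and the hard-coded US-style `%m/%d/%Y` pattern for verbatimEventDate are all assumptions made for illustration. Only the field names (eventDate, year, month, day, verbatimEventDate) come from the discussion.

```python
from datetime import date, datetime
from typing import Optional


def _parse(value: Optional[str], fmt: str) -> Optional[date]:
    """Return a date if value matches fmt exactly, otherwise None."""
    if not value:
        return None
    try:
        return datetime.strptime(value.strip(), fmt).date()
    except ValueError:
        return None


def interpret_date(record: dict) -> Optional[date]:
    """Hypothetical helper: try progressively less reliable date fields."""
    # 1. A complete eventDate (YYYY-MM-DD) wins; a bare "2001" does not match.
    parsed = _parse(record.get("eventDate"), "%Y-%m-%d")
    if parsed is not None:
        return parsed

    # 2. Use year/month/day only when all three are present and valid.
    try:
        return date(int(record["year"]), int(record["month"]), int(record["day"]))
    except (KeyError, TypeError, ValueError):
        pass

    # 3. Fall back to verbatimEventDate.
    #    ASSUMPTION: the value is US-style month/day/year.
    return _parse(record.get("verbatimEventDate"), "%m/%d/%Y")


# Field values as described for the example record in this issue.
example = {"eventDate": "2001", "day": "7", "verbatimEventDate": "7/19/2001"}
print(interpret_date(example))
```

A real implementation would have to decide the day/month order per dataset rather than assume it, and would probably flag the record so that users can see the date was taken from the verbatim field.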


Author: rdmpage
Comment: Yes, I think looking at the verbatim date (in this case "7/19/2001") would help. If the data provider is from the US, and a number of dates in the same dataset have a middle number outside the range [1-12] while the first number is never bigger than 12, then it's pretty clear that the dates are in US format. The TTU dataset has a bunch of issues; see also the presence of easting and northing values in the verbatim lat and long fields (https://github.com/ttu-vertnet/ttu-mammals/issues/11). Just a world of hurt!
Created: 2015-02-07 13:28:28.607
Updated: 2015-02-07 13:28:28.607
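
The dataset-level heuristic described in the comment above might be sketched as follows. The function name `infer_date_order`, the `min_evidence` threshold, and the restriction to slash-separated `N/N/YYYY` strings are assumptions for illustration; the provider-country check mentioned in the comment is left out.

```python
import re
from typing import Iterable, Optional

# ASSUMPTION: only slash-separated dates with a four-digit year are considered.
_SLASHED = re.compile(r"^\s*(\d{1,2})/(\d{1,2})/(\d{4})\s*$")


def infer_date_order(verbatim_dates: Iterable[str], min_evidence: int = 1) -> Optional[str]:
    """Hypothetical helper: guess "MDY" or "DMY" for a whole dataset.

    Returns None when the evidence is missing or contradictory.
    """
    first_over_12 = 0   # first number cannot be a month
    second_over_12 = 0  # middle number cannot be a month
    for text in verbatim_dates:
        match = _SLASHED.match(text or "")
        if match is None:
            continue
        first, second = int(match.group(1)), int(match.group(2))
        if first > 12:
            first_over_12 += 1
        if second > 12:
            second_over_12 += 1

    # US style: the middle number exceeds 12 at least min_evidence times
    # and the first number never does.
    if second_over_12 >= min_evidence and first_over_12 == 0:
        return "MDY"
    # The mirror case: day first, month second.
    if first_over_12 >= min_evidence and second_over_12 == 0:
        return "DMY"
    return None


print(infer_date_order(["7/19/2001", "3/4/1999", "12/25/2000"]))
```

Returning None for mixed or ambiguous datasets keeps the heuristic conservative: a date such as "1/2/2001" on its own says nothing about the order, so it is better left uninterpreted than guessed.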