Comments (5)
The data extraction module now supports this metric. Sample report output:
{
  "8137b32e-f762-11e1-a439-00145eb45e9a": {
    "NUMBER_OF_RECORDS": 756426,
    "BASISOFRECORDS": {
      "UNKNOWN": 6,
      "OBSERVATION": 11660,
      "PRESERVED_SPECIMEN": 744760
    },
    "TAXON_MATCHES": {
      "TAXON_MATCH_HIGHERRANK": 63987,
      "TAXON_MATCH_FUZZY": 17272,
      "TAXON_MATCH_COMPLETE": 607443,
      "TAXON_NOT_PROVIDED": 67724
    }
  }
}
(I took the liberty of making the constants uppercase for the sake of consistency.)
from gbif-dataset-metrics.
@niconoe I think we're missing TAXON_MATCH_NONE for records that have a taxon that could not be matched. At least there is a column taxon_match_none in the CartoDB table created by @peterdesmet.
Test data is written to CartoDB. Only taxon_match_none is still missing: all values for that column are set to 0.
Well, I just had a quick look and it seems the code supports it, but most records that trigger this issue at GBIF have no scientificName at all: http://www.gbif.org/occurrence/search?ISSUE=TAXON_MATCH_NONE
As I blindly implemented Peter's algorithm above, I think we will return TAXON_NOT_PROVIDED for those (unlike the GBIF services, this algorithm puts each row in a single category. Is that desirable?). Also, the data extractor (so far) doesn't return a TAXON_MATCH_* counter at all if no record falls into that category.
Should Peter's algorithm be changed? Would you like the report to contain TAXON_MATCH_NONE: 0 (instead of nothing)?
Thanks!
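To make the question concrete, here is a minimal sketch of single-category counting. It is not the extractor's actual code: the record shape (a dict with `scientificName` and `issues` keys), the function names and the precedence order among the TAXON_MATCH_* issues are all assumptions made for illustration. It only shows the two behaviours described above: a record without a scientificName is counted as TAXON_NOT_PROVIDED even if it also carries TAXON_MATCH_NONE, and categories without records are absent from the result.

```python
from collections import Counter

# Hypothetical precedence order; the real one is defined by the algorithm
# discussed earlier in this issue and may differ.
TAXON_MATCH_ISSUES = (
    "TAXON_MATCH_NONE",
    "TAXON_MATCH_HIGHERRANK",
    "TAXON_MATCH_FUZZY",
)


def categorize(record):
    """Return exactly one taxon-match category for a record (assumed to be a dict)."""
    # A missing or empty scientificName wins over any issue flag.
    if not record.get("scientificName"):
        return "TAXON_NOT_PROVIDED"
    issues = record.get("issues", ())
    for issue in TAXON_MATCH_ISSUES:
        if issue in issues:
            return issue
    return "TAXON_MATCH_COMPLETE"


def count_taxon_matches(records):
    """Count records per category; categories with no records are not listed."""
    return dict(Counter(categorize(record) for record in records))
```

With this approach, a dataset whose only TAXON_MATCH_NONE records also lack a scientificName produces no TAXON_MATCH_NONE key at all.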
OK, no, then everything is fine. The aggregator fills in zeros for tags that are not found, so there is no need to add that to the extractor.
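For reference, the zero-filling could look roughly like the sketch below. This is an illustration under assumptions, not the aggregator's real code: the function name and the list of expected tags are hypothetical (the tag names are taken from the sample report and the CartoDB column mentioned above).

```python
# Hypothetical list of tags the aggregator expects for taxon matching.
EXPECTED_TAXON_MATCH_TAGS = (
    "TAXON_MATCH_COMPLETE",
    "TAXON_MATCH_FUZZY",
    "TAXON_MATCH_HIGHERRANK",
    "TAXON_MATCH_NONE",
    "TAXON_NOT_PROVIDED",
)


def fill_missing_tags(counts, expected=EXPECTED_TAXON_MATCH_TAGS):
    """Return a new dict with every expected tag, using 0 for tags not in counts."""
    return {tag: counts.get(tag, 0) for tag in expected}


# Counts as reported by the extractor in the sample output above.
extracted = {
    "TAXON_MATCH_HIGHERRANK": 63987,
    "TAXON_MATCH_FUZZY": 17272,
    "TAXON_MATCH_COMPLETE": 607443,
    "TAXON_NOT_PROVIDED": 67724,
}
filled = fill_missing_tags(extracted)
```

Here `filled` contains TAXON_MATCH_NONE with a value of 0 while the other counts are unchanged, which is why the extractor can keep omitting empty categories.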