trias-project / indicators
📈 Alien species indicators
Home Page: https://trias-project.github.io/indicators/
License: MIT License
This issue describes how data will be filtered to get to the data frame described in #17. Some of these will be tackled in the unified checklist.
Filters:
I used the first underscore (_) to distinguish pathway level 1 from pathway level 2 (see pathway table). This works for all pathways except natural_dispersal, which is a level 1 pathway and should not be split into two levels.
Thanks to @SanderDevisscher for finding this bug. I am working on a fix.
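A minimal sketch of the splitting rule described above, in base R. The function name split_pathway() and the level1_only argument are hypothetical and only illustrate the idea; the example pathway strings other than natural_dispersal are made up.

```r
# Split a pathway string at the FIRST underscore into level 1 and level 2.
# Pathways listed in `level1_only` (an assumed argument) are never split.
split_pathway <- function(pathway, level1_only = c("natural_dispersal")) {
  can_split <- grepl("_", pathway, fixed = TRUE) & !(pathway %in% level1_only)
  level1 <- ifelse(can_split, sub("_.*$", "", pathway), pathway)
  level2 <- ifelse(can_split, sub("^[^_]*_", "", pathway), NA_character_)
  data.frame(
    pathway_level1 = level1,
    pathway_level2 = level2,
    stringsAsFactors = FALSE
  )
}

pathways <- split_pathway(
  c("escape_horticulture", "natural_dispersal", "release_biological_control")
)
print(pathways)
```

Only the first underscore is used as separator, so a multi-word level 2 such as biological_control stays intact.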
As noticed while commenting #57 , we can download shapefiles of Belgian regions from this federal site: https://data.gov.be/en/dataset/fb1e2993-2020-428c-9188-eb5f75e284b9
I have just compared the shapefiles from our server, as uploaded by @timadriaens, with the downloaded ones, via R.
I see the following:
I transformed both shapefiles to WGS84 before plotting them with mapview/leaflet, of course.
@timadriaens, can you please check it with GIS software?
In #48 we found that data are published with a delay of approximately two years, so we limited the analysis to data up to 2017. However, it would be useful to screen for appearing species (which is done by decision rules, see #49 (comment)) up to the present year. This way we can provide more effective, policy-relevant output.
Be sure to change the checklist indicators so that all of the pipelines count taxa (key), not species (speciesKey).
This issue will follow the part of issue #21 dedicated to occurrenceStatus (point 3).
Based on what @qgroom wrote on that issue (see #21 (comment)) and what he told me today, we decided to change this filter.
I take taxa with occurrenceStatus not equal to ABSENT, EXCLUDED or EXTINCT. In R words:
df %>% filter(!status %in% c("ABSENT", "EXCLUDED", "EXTINCT"))
@peterdesmet, @LienReyserhove, @timadriaens, anything to add?
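For reference, a base R sketch of the same filter on a toy data frame. The keys and the status values PRESENT and DOUBTFUL are made-up example data; only the three excluded terms come from the discussion above.

```r
# Toy checklist: `status` holds the occurrenceStatus of each taxon
taxa <- data.frame(
  key = c(1, 2, 3, 4, 5),
  status = c("PRESENT", "ABSENT", "DOUBTFUL", "EXTINCT", NA),
  stringsAsFactors = FALSE
)

# Terms that remove a taxon from the indicators
excluded_status <- c("ABSENT", "EXCLUDED", "EXTINCT")

# %in% returns FALSE for NA, so taxa without a status are kept
taxa_kept <- taxa[!taxa$status %in% excluded_status, ]
print(taxa_kept)
```

Note that taxa without any occurrenceStatus pass this filter; whether that is desired is a separate decision.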
Usage:
gbif_download_meta("../data/output/gbif_downloads.csv")
The script would:
gbif_download_key
occ_download_meta() to get status information for all downloads in the list
Example of a response you'll get: http://api.gbif.org/v1/occurrence/download/0000251-150304104939900
@timadriaens found that the list is no longer up to date.
He will update it by modifying the file data/input/eu_concern_species.tsv.
... to the reduced scope of indicators (rather than the full pipeline).
From e-mail of @amyjsdavis:
By the way, Diederik and I are very enthusiastic about the data cube and your approach to handling spatial uncertainty. We ask that you consider adding one attribute that identifies whether the grid cell contains at least one presence with an "exact" location, or whether the presences in the grid cell have all been randomly assigned. This attribute will be very useful to our modeling if it turns out that we need to reduce the amount of the uncertainty in our model.
Notes regarding GBIF match for modelling species:
Podarcis sicula (https://www.gbif.org/species/2469233) considered synonym of Podarcis siculus. I propose to update our spelling to Podarcis siculus. OK?
Persicaria wallichii (https://www.gbif.org/species/6391908) considered synonym of Koenigia polystachya (https://www.gbif.org/species/8848208) which has a whole number of synonyms (see on the left). Do we keep our restricted Persicaria wallichii taxon concept or widen to Koenigia polystachya?
Aspius aspius (https://www.gbif.org/species/2360181) considered synonym of Leuciscus aspius (https://www.gbif.org/species/5851603) with a number of synonyms. Keep restricted or widen?
Astacus leptodactylus (https://www.gbif.org/species/4417551) considered synonym of Pontastacus leptodactylus (https://www.gbif.org/species/8946295) with 3 synonyms. Keep restricted or widen?
Mimulus guttatus matches with 2:
Mimulus guttatus Fisch. (https://www.gbif.org/species/7887942) considered synonym of Erythranthe lutea (L.) G.L.Nesom (https://www.gbif.org/species/7730307) with a number of synonyms.
Mimulus guttatus Fisch. ex DC. (https://www.gbif.org/species/6070603) considered synonym of different species Erythranthe guttata (DC.) G.L.Nesom (https://www.gbif.org/species/7346102) with a number of synonyms.
Which one of the two is it? I'll then add that author. Also: keep restricted or widen?
Hi, this came up when checking an unverified, false record of a supposedly new alien species for Belgium in the wnm.be data (Vespa orientalis). Waarnemingen and observations publish all records with IdentificationVerificationStatus on GBIF (which is OK!). However, for the pipeline, the models, etc., it is imperative that we only use validated occurrences. Therefore, the pipeline needs a step that subsets data based on IdentificationVerificationStatus.
Perhaps we can do some sort of sensitivity analysis to see how this impacts (I'm sure there is no time)...
Question to @damianooldoni @peterdesmet @qgroom @SoVDH, anticipating that many datasets/records on GBIF may not even have an IdentificationVerificationStatus: what do we do if that field is not filled?
Temporal trends in first record rates of alien species (number of first records per year or per x-year interval), based on Seebens et al. 2017. This indicator provides information on the number of new introductions over time, for instance the rate of increase of alien species introductions and the accumulation rate of alien species (Rabitsch et al. 2016). The information will be updated and refined as the checklist is further supplemented.
The data retrieved by GBIF would be organized in a (tidy) data.frame
key | nubKey | scientificName | datasetKey | species | genus | family | order | class | phylum | kingdom | rank | speciesKey | taxonomicStatus | acceptedKey | accepted | locationId | locality | country | status | first_observed | last_observed | establishmentMeans | native range | origin | invasion stage | habitat | pathway_level1 | pathway_level2 |
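---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|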
This data should be already filtered as explained in #21 .
This data output will serve as input for the plots based on group_by-like pipes, e.g.
df %>%
group_by(class) %>%
count()
It is suggested to wrap this series of group_by() calls in a function for plotting, because of the high number of combinations of information.
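A minimal sketch of such a helper in base R, so that any combination of grouping columns can be counted with one call. The function name count_taxa_by() and the toy data are hypothetical; the column names class and kingdom come from the data frame described above.

```r
# Count rows (taxa) for any combination of grouping columns.
# `group_cols` is a character vector of column names present in `df`.
count_taxa_by <- function(df, group_cols) {
  counts <- as.data.frame(
    table(df[group_cols]),
    responseName = "n",
    stringsAsFactors = FALSE
  )
  # table() also returns combinations that do not occur; drop them
  counts[counts$n > 0, , drop = FALSE]
}

# Toy data with made-up taxa
df <- data.frame(
  kingdom = c("Animalia", "Animalia", "Animalia", "Plantae"),
  class = c("Aves", "Aves", "Insecta", "Magnoliopsida"),
  stringsAsFactors = FALSE
)

by_class <- count_taxa_by(df, "class")
by_kingdom_class <- count_taxa_by(df, c("kingdom", "class"))
print(by_class)
print(by_kingdom_class)
```

The result is a plain data frame with one row per observed combination, which can be passed directly to a plotting function.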
See issue #18 about temporal information (first_observed and last_observed) in checklists.
The decision rules described in the documentation of apply_decision_rules() from the trias package should be added to https://trias-project.github.io/indicators/07_occurrence_indicators_modelling.html#33_decision_rules
To be done during update of indicators before end 2020.
According to the download list, the last GBIF download (https://doi.org/10.15468/dl.9unif7) used the eu_concern_species.tsv list to query for taxa. However, only 38 of the 49 species were queried. These 11 are missing:
checklist_scientificName | backbone_taxonKey |
---|---|
Alopochen aegyptiacus | 2498252 |
Elodea nuttallii | 5329212 |
Gunnera tinctoria | 2984306 |
Heracleum mantegazzianum | 3034825 |
Heracleum persicum | 3628745 |
Myriophyllum aquaticum | 5361785 |
Myriophyllum heterophyllum | 5361762 |
Parthenium hysterophorus | 3086784 |
Pennisetum setaceum | 2706134 |
Persicaria perfoliata | 4033648 |
Pueraria lobata | 9035634 |
@damianooldoni can you figure out what might cause this? Is it a character limit on the querystring in the URL?
At this stage of programming, it is important to use the same terms to refer to taxa in all pipelines.
Do we use nubKey and taxonomicStatus as in pipeline get_taxa, or gbif_taxonKey and gbif_species_status as in pipeline occurrence?
I would personally opt for nubKey and taxonomicStatus, as used in checklists.
I would also rename the columns of the input file eu_concern_species.tsv accordingly.
New species on one of the checklists without any occurrences should actually appear in the list as "appearing". An example is the recently discovered Procambarus acutus, which was meanwhile added to the macroinvertebrates checklist but does not yet appear in any of our datasets on GBIF (pending publication of the records...). Is it possible to retrieve those in the indicator pipeline, @damianooldoni, e.g. for the separate list of appearing spp?
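One possible way to find such taxa is to compare the checklist keys with the keys present in the occurrence data. A minimal base R sketch, assuming both sides share a taxonKey column; all keys below are made up for illustration.

```r
# Toy checklist (hypothetical keys)
checklist <- data.frame(
  taxonKey = c(101, 102, 103),
  scientificName = c("Species A", "Species B", "Procambarus acutus"),
  stringsAsFactors = FALSE
)

# Taxon keys for which at least one occurrence is available (hypothetical)
occurrence_taxon_keys <- c(101, 102)

# Checklist taxa without any occurrence: candidates for the "appearing" list
without_occurrences <- checklist[!checklist$taxonKey %in% occurrence_taxon_keys, ]
print(without_occurrences)
```

Whether every taxon found this way should be labelled "appearing" is still a decision to make, e.g. based on the date it was added to the checklist.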
When trying to match taxon data from gbif (using code described here) with distribution data from the "include distribution regions" - branch it seems the taxonKeys do not match.
Below a subset of 2 species to illustrate the issue:
Species | taxonKey in TaxonData | taxonKey in Distributions |
---|---|---|
Procambarus clarkii | 152543866 | 140563018 |
Vespa velutina | 152544481 | 148438120 |
PS: I used the following code to read the distribution data:
library(readr)
distributions_unified <- read_csv("https://raw.githubusercontent.com/trias-project/unified-checklist/include-distribution-regions/data/interim/distributions_unified.csv")
@damianooldoni FYI:
Species | taxon_key | records in BE | type |
---|---|---|---|
Pieris rapae | 1920496 | 175197 | butterfly |
Pararge aegeria | 8049830 | 145023 | butterfly |
Vanessa atalanta | 1898286 | 153714 | butterfly |
Aglais io | 4535827 | 130204 | butterfly |
Anas platyrhynchos | 9761484 | 101445 | bird |
Rutilus rutilus | 2359706 | 90766 | fish |
Cirsium arvense | 3113414 | 40449 | plant |
Based on the discussion in #46, we would like to know how many of our cells are occupied at a certain rank level (in our case we thought about kingdom, but this can easily be extended to any other rank). As we are working at 1km scale (~50k cells) and on a time span of around 50 years, this means sending a lot of queries to GBIF, which is not recommended (see the interesting discussion here: ropensci/rgbif#320).
So, how do we get these aggregated(!) numbers reasonably fast and without sending thousands of queries?
If I see how fast the GBIF tool observation trend works, I think a possible solution is in the air.
This is the link to the repo behind the tool: https://github.com/gbif/species-population.
Moreover, GBIF already has an API for doing it and they are writing the documentation for it: see gbif/species-population#6. However, the issue is quite old and their plans have very likely changed.
Analyzing the requests my computer makes while searching for Branta canadensis against higher taxon Anatidae, I see some requests which are linked to binary files with .mvt extension:
and if we zoom a lot:
Notice the /v2 instead of /v1, the GBIF API standard version. So, the tool loads some binary files (downloadable, just click on the links) and queries them. The directory structure is very likely linked to zoom level and geographical area. I tried to open one of these files in R via readBin() but it didn't work. I am afraid I need to know the file structure beforehand.
Another way: in the issue cited before (ropensci/rgbif#320) Tim Robertson mentioned an SQL API which is under development. Is that the right alternative to get what we want?
@stijnvanhoey, @peterdesmet : what do you think we can implement and what's at GBIF side? Thanks!
New species have been added to the EU concern list. The file data/input/eu_concern_species.tsv has been updated. Still, the field entry_into_force is empty, as the new version of the list is not yet officially published. See #51.
@timadriaens: please add the date for the missing taxa when it becomes available. Thanks.
It seems WP3 and WP4 both need to clean GBIF occurrence data.
We should :
How to make plots with covariates more clear?
More in @ToonVanDaele 's repo: ToonVanDaele/trias-test#10
The data output we get by merging taxonomic information with the distribution and description extensions is "fake" tidy. I write fake because most of the description is stored following the Entity-Attribute-Value (EAV) model (thanks @stijnvanhoey for finding the right word for it).
For each taxon it looks like the table below:
taxonKey | type | description |
---|---|---|
141264591 | pathway | cbd_2014_pathway:escape_horticulture |
141264591 | native range | Southern America (WGSRPD:8) |
141264591 | origin | vagrant |
I already tidy the pathways by creating two new columns, pathway_level1 and pathway_level2:
taxonKey | type | description | pathway_level1 | pathway_level2 |
---|---|---|---|---|
141264591 | pathway | cbd_2014_pathway:escape_horticulture | escape | horticulture |
141264591 | native_range | Southern America (WGSRPD:8) | NA | NA |
141264591 | origin | vagrant | NA | NA |
However, I think this half-tidy approach can create confusion, because it mixes the EAV model with the typical (non-EAV) representation!
Would it be better to have the description data 100% tidy, like the table below?
taxonKey | pathway_level1 | pathway_level2 | origin | native_range |
---|---|---|---|---|
141264591 | escape | horticulture | vagrant | Southern America (WGSRPD:8) |
I think it would be more understandable. And it would make plot workflow easier I think. What do you think, @stijnvanhoey, @SanderDevisscher and @Yasmine-Verzelen ? If you agree then I will implement it in the workflow and I will export a new test data.frame so that you all can use it asap. Thanks.
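A minimal sketch of how the EAV-style description table could be turned into the fully tidy form, using base R reshape(). It assumes exactly one description per type and taxon; taxa with several pathways would need extra handling, as would level 1 only pathways such as natural_dispersal.

```r
# Description data in EAV form (values taken from the example above)
descriptions <- data.frame(
  taxonKey = c(141264591, 141264591, 141264591),
  type = c("pathway", "native_range", "origin"),
  description = c(
    "cbd_2014_pathway:escape_horticulture",
    "Southern America (WGSRPD:8)",
    "vagrant"
  ),
  stringsAsFactors = FALSE
)

# One row per taxon, one column per type
wide <- reshape(
  descriptions,
  idvar = "taxonKey",
  timevar = "type",
  direction = "wide"
)
# reshape() names the new columns "description.<type>"; strip that prefix
names(wide) <- sub("^description\\.", "", names(wide))

# Split the pathway at the first underscore after removing the vocabulary prefix
pathway <- sub("^cbd_2014_pathway:", "", wide$pathway)
has_level2 <- grepl("_", pathway, fixed = TRUE)
wide$pathway_level1 <- sub("_.*$", "", pathway)
wide$pathway_level2 <- ifelse(has_level2, sub("^[^_]*_", "", pathway), NA_character_)
wide$pathway <- NULL

print(wide)
```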
Download based on the modelling species is available at https://doi.org/10.15468/dl.6cljf9
No occurrences were found in Belgium for:
The following taxa are listed under a different species name (i.e. the name of the species GBIF considers the accepted one). scientific name is still the original one + only taxa were returned for the name we looked for, as we wanted:
Astacus leptodactylus → Pontastacus leptodactylus
Mimulus guttatus Fisch. ex DC. → Erythranthe guttata
And to know what synonym names were lumped in with the download, check this file (yellow rows): modelling_species_in_download.xlsx
Note: for most (all?) of these, it's very good that they get lumped, because e.g. there are no occurrences in Belgium published under Neovison vison: all are published as Mustela vison, which we wouldn't get if GBIF didn't look for synonyms.
Let me know if you consider all the above acceptable.
As discussed with @damianooldoni, he will create an Rmd named grid_size_effect.Rmd
coordinateUncertaintyInMeters which is radius of grid
Table:
Year | Occ | Coordunc | 100km | 10km | 1km | 1km (downscaling) | 1km (circle based) |
---|---|---|---|---|---|---|---|
2011 | count | mode | % (count) | % (count) | % (count) | % (count) | % (count) |
2012 | count | mode | % (count) | % (count) | % (count) | % (count) | % (count) |
2013 | count | mode | % (count) | % (count) | % (count) | % (count) | % (count) |
2014 | count | mode | % (count) | % (count) | % (count) | % (count) | % (count) |
2015 | count | mode | % (count) | % (count) | % (count) | % (count) | % (count) |
Totals | total | mode | count | count | count | count | count |
Chart:
Comparing % (AOO) of 1km, 1km (downscaling), 1km (circle based) per year
@amyjsdavis, @DiederikStrubbe: is there a link where I can download a worldwide grid with cell size 1x1km? This is the link for EU countries, which you both already know, but I can't find anything similar at world level. For Belgium, I use these shapefiles combined with the function st_read() from the sf package, which is very flexible about file formats. If no link is available, do you perhaps have something I can use as well? Thanks!
The assessment of data publishing delay is very important for a correct use of segmented regression or any other analysis for studying the emerging status of alien species. If we see a noticeable decrease in data publication, then we can assume that the decrease in the number of occurrences at species level is not realistic. An easy but effective way to detect it is to get the number of occurrences with geographic coordinates in Belgium for each kingdom during the last years.
You can manually search via GBIF site if you want. Here below a link to get all occurrences for:
year = 2017
kingdom = Animalia
hasCoordinate=TRUE
country=BE
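The same filters can be expressed as a query against the GBIF occurrence search API. The sketch below only builds the URL (no request is sent). The function name is hypothetical, and taxon key 1 is assumed to be the GBIF Backbone key of kingdom Animalia; with limit = 0, the total should be available in the count field of the JSON response.

```r
# Build a GBIF occurrence search URL for counting georeferenced occurrences.
# `taxon_key = 1` is assumed to be the backbone key of kingdom Animalia.
build_occurrence_count_url <- function(year, taxon_key, country = "BE") {
  query <- c(
    year = format(year, scientific = FALSE),
    taxonKey = format(taxon_key, scientific = FALSE),
    hasCoordinate = "true",
    country = country,
    limit = "0"  # no records needed, only the total count
  )
  paste0(
    "https://api.gbif.org/v1/occurrence/search?",
    paste(names(query), query, sep = "=", collapse = "&")
  )
}

url_animalia_2017 <- build_occurrence_count_url(year = 2017, taxon_key = 1)
print(url_animalia_2017)
```

Looping over years and kingdoms gives one small request per combination, which stays far below the number of queries a cell-by-cell approach would need.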
The graphs are shown below. You can make them on your laptop by using the code in this gist.
You can see that two years is a good threshold for publishing delay. Only Fungi do better in Belgium. With the popularity of citizen science projects like iNaturalist, this delay will likely decrease in the future. For now, 2016 seems the last year to use for segmented regression. At the end of 2019, we could then include 2017.
@qgroom, @timadriaens, @ToonVanDaele, what do you think?
Hi, just a detail for the occurrence indicator graphs Y-axis legend: Occupancy, in principle, is mostly used for the probability that a site is occupied (and expressed between 0 and 1) cf. site occupancy models. What we look at in TrIAS is Area of Occupancy the way IUCN use it to quantify range size for species. This is an interesting paper about the concept.
@damianooldoni we should probably change the legend for the Y axis to "Area of Occupancy (km2)" to avoid confusion.
It would be nice to have some very basic data in Harmonia on which the more advanced indicators/indexes are based. This always helps interpreting trend graphs. The ones I can think of are:
I believe for now these are 'byproducts' of the pipeline, perhaps we should think of a suitable output (simple barplot graph).
This issue describes a part of the general workflow for assessing the emerging status of alien species, as discussed on Friday, 15 Feb 2019 by @damianooldoni , @timadriaens and @ToonVanDaele .
We start from the output of the occ-processing repository called cube_belgium.csv, as mentioned in trias-project/occ-cube-alien#3. This file contains occurrences with (at least) the following key columns:
taxonKey
speciesKey
kingdomKey
year
CELLCODE (grid id from European Environment)
Grouping by speciesKey and year, we get the number of occurrences per year (x: year, y: n_occs). We work at year level; no more detailed temporal information is used. The research effort bias of area of occupancy (AOO) is already corrected at this stage (for details about research effort bias correction, see #46). Working at species level may not always be appropriate; this is discussed separately (see trias-project/unified-checklist#35).
AOO and occurrences are time series (x: year, y: occurrences or y: AOO).
Although we could have data before 1950, we start analysis from 1950, the birth date of invasion ecology (cit. @timadriaens 😃 ).
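A minimal sketch of preparing such a yearly time series for one species: it starts in 1950 and sets years without occurrences to zero, since only years with occurrences are present in the cube. The function name and the aoo column name are assumptions; n_occs follows the naming used above. The extraction of limit cases is not covered here.

```r
# Complete a yearly series from `first_year` on; missing years get zero.
# `ts` is assumed to have columns year, n_occs and aoo for a single species.
fill_missing_years <- function(ts, first_year = 1950, last_year = max(ts$year)) {
  all_years <- data.frame(year = seq(first_year, last_year))
  filled <- merge(all_years, ts[ts$year >= first_year, ], by = "year", all.x = TRUE)
  filled$n_occs[is.na(filled$n_occs)] <- 0
  filled$aoo[is.na(filled$aoo)] <- 0
  filled
}

# Toy series with made-up numbers; 1948 is dropped, 1951 and 1952 become zero
ts <- data.frame(
  year = c(1948, 1950, 1953),
  n_occs = c(1, 2, 5),
  aoo = c(1, 1, 3)
)
ts_filled <- fill_missing_years(ts)
print(ts_filled)
```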
After extracting the limit cases, we set occ and AOO to zero for years with no occurrences, as only years with occurrences are present in the cube. Segmented regression will be applied to the AOO and occ time series separately. So, for each of the two time series and for each year, the slope of the last segment and its confidence interval are evaluated as a categorical variable. We can have three situations:
For each year and species we can then apply a decision table to define the emerging status of the species:
AOO | n. occurrences | emerging status |
---|---|---|
decrease | decrease | not emerging |
decrease | stable | not emerging |
decrease | increase | potentially emerging |
stable | decrease | not emerging |
stable | stable | not emerging |
stable | increase | potentially emerging |
increase | decrease | potentially emerging |
increase | stable | potentially emerging |
increase | increase | emerging |
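The decision table above can be encoded as a lookup matrix. This is only a sketch: the object and function names are hypothetical, and the trend labels are assumed to be exactly decrease, stable and increase.

```r
trend_levels <- c("decrease", "stable", "increase")

# Rows: AOO trend, columns: trend in number of occurrences
decision_table <- matrix(
  c("not emerging",         "not emerging",         "potentially emerging",
    "not emerging",         "not emerging",         "potentially emerging",
    "potentially emerging", "potentially emerging", "emerging"),
  nrow = 3,
  byrow = TRUE,
  dimnames = list(aoo = trend_levels, occ = trend_levels)
)

# Vectorised lookup: one status per (aoo_trend, occ_trend) pair
emerging_status <- function(aoo_trend, occ_trend) {
  stopifnot(all(aoo_trend %in% trend_levels), all(occ_trend %in% trend_levels))
  unname(decision_table[cbind(aoo_trend, occ_trend)])
}

status <- emerging_status(
  aoo_trend = c("decrease", "stable", "increase", "increase"),
  occ_trend = c("increase", "stable", "decrease", "increase")
)
print(status)
```

Keeping the rules in a single matrix makes it easy to review them against the table and to change one cell without touching the lookup code.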
This will end up in an output like this:
species | year | emerging status |
---|---|---|
A | 1950 | potentially emerging |
... | ... | ... |
A | 2012 | not emerging |
A | 2013 | pot. emerging |
A | 2014 | pot. emerging |
A | 2015 | pot. emerging |
A | 2016 | emerging |
... | ... | ... |
B | 2012 | not emerging |
B | 2013 | pot. emerging |
B | 2014 | emerging |
B | 2015 | emerging |
B | 2016 | not emerging |
Next steps: how do we aggregate these emerging labels in order to estimate the general emerging status of a species?
My two cents: as our analysis is future oriented, the emerging status in the recent past should definitely weigh more in the final decision than the status in the distant past.
@ToonVanDaele, @timadriaens: please comment if you think I missed something or if you have new thoughts about it.
Shouldn't this be ncells = sum(pa_obs)?
I propose to adapt the current columns type and description in the preprocessing phase (after GBIF download and before scripting indicators):
type | description | key |
---|---|---|
native range | cultivated origin | 141264581 |
origin | introduced | 141264581 |
pathway | cbd_2014:escape_horticulture | 141264581 |
native range | temperate Asia | 141264583 |
origin | vagrant | 141264583 |
pathway | cbd_2014:escape_horticulture | 141264583 |
to three separate columns:
native range | origin | pathway | key |
---|---|---|---|
cultivated origin | introduced | cbd_2014:escape_horticulture | 141264581 |
temperate Asia | vagrant | cbd_2014:escape_horticulture | 141264583 |
This will simplify the implementation of #17 #20 and require minimal adaptation to #19...
Usage:
gbif_download(
taxa="https://raw.githubusercontent.com/trias-project/alien-plants-belgium/afbd2805de77afd79fb74669c403d40f1416661b/data/processed/taxon.csv",
country="BE", # default
output="../data/output/gbif_downloads.csv" # default
)
The script would:
checklist:
gbif_nubKey
gbif_nubKey contain numbers only
country:
output:
data/output/gbif_downloads.csv:
gbif_download_key: GBIF download UUID
input_checklist: checklist (parameter)
input_country: country (parameter)
Hi, upon wanting to check the emergence status of Atlantic ivy (Hedera hibernica), I noticed differences in taxon matching of several datasets:
In GRISS Belgium, there is only Hedera hibernica (G.Kirchn.) Bean (such as in the Manual of Alien Plants) (gbif key 8410115).
Fact: 3 different gbif keys for the same thing. Consequence: the species does not pop up as emerging in the trias indicator flow, whilst probably every field person will say it is clearly emerging.
Solution?
some ideas to visually do more with introduction pathways of alien species in the checklist based on a similar exercise by Van Wilgen & Wilson 2017 (The status of Biological Invasions and their management in South Africa). I know that for TrIAS we have opted for a tabular view but just to show some possibilities for future use - planning to code these graphs for NARA-T see this issue:
Number of species per pathway (ordered by CBD level1 and level2)
evolution of pathways in time (based on first introduction date)
It would be nice to draft the pathway 'indicator' table for the species of the Union list as this links directly to policy on action plans etc.
On https://github.com/trias-project/pipeline/blob/master/data/output/checklist_taxa.tsv GitHub indicates that it cannot display the file nicely because "Illegal quoting in line 2710.". That line contains:
Sedum "Autumn Joy" (S. spectabile Boreau x telephium L.)
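On the R side, such a line can be read without trouble by disabling quote handling, so that double quotes are treated as ordinary characters. A minimal sketch with base R read.delim(); the two-column layout and the second row are made up for illustration, and this does not change how GitHub renders the file.

```r
# Two-line TSV excerpt; the first data row contains literal double quotes
tsv_text <- paste(
  "key\tscientificName",
  "1\tSedum \"Autumn Joy\" (S. spectabile Boreau x telephium L.)",
  "2\tElodea nuttallii",
  sep = "\n"
)

# quote = "" switches off quote interpretation entirely
taxa <- read.delim(text = tsv_text, quote = "", stringsAsFactors = FALSE)
print(taxa$scientificName)
```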
This indicator measures the trends of all alien species introductions (publication of first observations). At the national level this indicator is useful to measure the trends in the presence/occurrence of alien (and potentially invasive) species and inform decisions to do with prevention of alien species introduction and the management and control of invasive species causing impacts on biodiversity and ecosystems. It is based on the same information for #17 but is an alternative representation in line with international policy indicators on (invasive) alien species such as EU headline indicators, SEBI, Aichi, IPBES.
Data needs and data output are the same as #17.
A lineplot is envisaged with colours for breakdowns. Visualizing uncertainty due to use of time periods is discussed in #18
Another very interesting indicator that we did not touch upon yet because of time constraints, is the proportion of occurrences of an alien species in protected areas. This is imo a very important one that can directly inform risk assessment but also risk management evaluations.
Ideally, this is a geographic subset of the weighted trend, but we need not make things too complicated. So, can we, based on the Area of Occupancy we already have, do an intersect with the N2000 network for Belgium (the official delimitation is on our gis server but probably best to take the one on EEA website)? It could be reported per year (e.g. since 2004, when Europe agreed on the Belgian N2000 areas) which is more interesting internationally and for other countries.
The goal of this issue is to discuss the best way to compensate for the research effort bias. Based on interesting discussions with @qgroom and @timadriaens, I am working on two different ideas:
This issue starts from point 4 of @qgroom's comment on issue 40:
Occupancy values are clearly sensitive to research effort. To improve it we need to aggregate along years. The (lack of) research effort is also a source of underestimation of occupancy: some areas are scanned in a certain year, others in other years. So, the question is: how many years should we use to get the optimal aggregation span? To find out, we should plot occupancy vs number of aggregation years (aggregation span). Hopefully we get the same curve as the curve of occurrence vs research effort (search literature for references). We should see a kind of saturation point. Do it species by species (year by year): the saturation point will probably differ among taxa. That would obviously be a problem, as we aim to keep it simple, i.e. using a single aggregation span. Of course, we will have to make a decision at the end (the number of years used for temporal aggregation should not be too high, otherwise we lose policy relevance), but investigation is needed.
I am investigating the research effort bias correction on branch research_effort_bias_corrction, more specifically in ./src/_research_effort_bias.Rmd.
Below are some plots showing occupancy vs. time window from 2007 to 2018. I don't see a clear saturation curve valid for all species and all years. I think correcting the research effort bias by working on the temporal dimension will not be effective, as we also want to use the temporal dimension to detect changes in occupancy.
@timadriaens and I discussed a possible alternative yesterday: why not work on the spatial scale? We can calculate yearly occupancy by dividing the number of occupied cells at species level by the number of occupied cells at kingdom level, instead of dividing by the number of cells of Belgium. This way we remove all cells not showing any research effort at all. I am still working on making this calculation feasible (technical discussion opened in #47).
Meanwhile, ideas and comments about temporal and/or spatial solution are welcome!
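To make the spatial idea concrete, here is a minimal base R sketch on a toy occurrence cube. The function name and the toy cube are hypothetical; it assumes one row per species, year and cell, with columns year, CELLCODE, speciesKey and kingdomKey.

```r
# Yearly occupancy of a species relative to the cells occupied by its kingdom
relative_occupancy <- function(cube, species_key) {
  is_species <- cube$speciesKey == species_key
  kingdom_key <- unique(cube$kingdomKey[is_species])
  stopifnot(length(kingdom_key) == 1)
  years <- sort(unique(cube$year[is_species]))
  do.call(rbind, lapply(years, function(y) {
    in_year <- cube$year == y
    n_species <- length(unique(cube$CELLCODE[in_year & is_species]))
    n_kingdom <- length(unique(cube$CELLCODE[in_year & cube$kingdomKey == kingdom_key]))
    data.frame(
      year = y,
      n_cells_species = n_species,
      n_cells_kingdom = n_kingdom,
      occupancy = n_species / n_kingdom
    )
  }))
}

# Toy cube: two species of the same kingdom, made-up keys and cell codes
cube <- data.frame(
  year = c(2015, 2015, 2015, 2015, 2015, 2016, 2016),
  CELLCODE = c("c1", "c2", "c2", "c3", "c4", "c1", "c2"),
  speciesKey = c(1, 1, 2, 2, 2, 1, 2),
  kingdomKey = c(10, 10, 10, 10, 10, 10, 10),
  stringsAsFactors = FALSE
)

occupancy_species_1 <- relative_occupancy(cube, species_key = 1)
print(occupancy_species_1)
```

In 2015 the species occupies 2 of the 4 cells with any record for its kingdom, so its relative occupancy is 0.5, regardless of how many cells Belgium has in total.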
Originally reported in trias-project/alien-macroinvertebrates#25
Physella acuta is a species in the Alien macroinvertebrates checklist (alien-macroinvertebrates-checklist:taxon:57). Haitia acuta should be a synonym of it, so that a search would also find occurrence records under that name (including 167 occurrences in the Alien macroinvertebrates occurrence dataset, e.g. PB:Ugent:AqE:2342). The GBIF backbone, however, considers both to be accepted species:
Physella acuta thus won't return records for Haitia acuta.
Note: Haitia acuta is the only species in the macroinvertebrate occurrence dataset that is not listed by the same name in the checklist.
The indicator shows the number of non-native plant and animal species introduced in Belgium via a certain pathway. It is based on a checklist of alien species, composed of various existing sources and databases. The information will be updated and refined as the checklist is further supplemented. The available information on introduction pathways was organized following the Convention on Biological Diversity standard (CBD 2014).
This indicator uses the same data output as issue #17. It is a specific group by (per pathway) on the same dataframe.
Based on previous conversations, we found that no worldwide grid is available and it is actually not needed for risk assessment (RA). We need it at European level for RA and at Belgian level for indicators. Based on what I can find from European Environmental Agency page I cannot find any grid at 1x1km resolution at European level, only 10x10km and 100x100km.
@DiederikStrubbe, @amyjsdavis : do you need to calculate occupancy at European level at 1x1km or is it fine for you to get 1x1km resolution at Belgian level?
If you really need it, do you have ideas on how to get it? Maybe we can build it by joining all reference grids at country level, but we will have a lot of problems at country borders, where duplicates will occur. Something to think about... Let me know. If you have a shapefile for it, could you please upload it in ./data/external/? Thanks.
Based on remarks of @qgroom :
PRESERVED_SPECIMEN should also be included in the basis of record.
verbatimCoordinateSystem field.
coordinateUncertaintyInMeters is it possible to tell the coordinate uncertainty from the number of significant digits in the long/lat?
coordinateUncertaintyInMeters vs. time would be useful ancillary information.
We currently have scientificNames with:
2772890 134087647 Loncomelos brevistylus (Wolfner ) Dost<U+00E1>l 9ff7d317-609b-4c08-bd86-3bc404b77c42 Loncomelos brevistylum (Wolfner) Dost<U+00E1>l Plantae SYNONYM 2772885 Ornithogalum pyramidale L.
As you notice, these issues appear in the checklist_scientificName and backbone_scientificName.
checklist_taxa[433,3]: "Loncomelos brevistylus (Wolfner ) Dost\u00e1l"
So, must be related to how the data are read from the API (because of our own code or rgbif).
This issue follows point 4 of #21 (comment)
Based on discussion with @timadriaens and @qgroom, we plan to filter on establishmentMeans equal to one of the following terms: INTRODUCED, NATURALISED, INVASIVE, ASSISTED COLONISATION. These are the new terms, which will be proposed soon and hopefully approved and implemented. Is there a problem with doing that?
Request of @timadriaens: add analysis at protected area level.
A tabular file as output:
protected_area_id | type | year | taxonKey | obs | ncells | coverage |
---|---|---|---|---|---|---|
BE3748432 | A | 2010 | 124245 | 34 | 15 | 0.30 |
BE3748432 | A | 2011 | 124245 | 44 | 29 | 0.58 |
BE3748432 | A | 2012 | 124245 | 58 | 32 | 0.64 |
BE3748432 | A | 2013 | 124245 | 69 | 36 | 0.72 |
BE3748432 | A | 2014 | 124245 | 88 | 43 | 0.86 |
BE3748432 | A | 2015 | 124245 | 92 | 45 | 0.90 |
BE3748432 | A | 2016 | 124245 | 95 | 47 | 0.94 |
BE3748432 | A | 2017 | 124245 | 99 | 48 | 0.96 |
From e-mail of Thomas Verleye:
We have pre-selected 5 species for PRA based on initial data availability and trends in occurrences (when we’ll face additional bottlenecks regarding data availability or expert identification we might reduce this number during the coming weeks, tbc):
@damianooldoni: would it be possible to create a GAM for C. fornicata & M. leidyi? Thanks in advance!