azizka / sampbias

Sampbias is a method and tool to 1) visualize the distribution of occurrence records and species in any user-provided dataset, 2) quantify the biasing effect of geographic features related to human accessibility, such as proximity to cities, rivers or roads, and 3) create publication-level graphs of these biasing effects in space.

R 98.82% TeX 1.18%

sampbias's Introduction

sampbias 2.0.0

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

Sampbias has been updated to version 2.0 on GitHub to adapt to the retirement of sp and raster. The update may not be compatible with analysis pipelines built with versions <= 1.x.

Sampbias is a statistical method to evaluate and visualise geographic sampling biases in species distribution datasets, implemented as an R package.

Description

Species occurrence datasets derived from biological collections or human observations are widely used in biological sciences, including ecology, conservation, systematics and evolution. However, such data are often geographically biased, with remote areas being strongly undersampled. Although spatial and taxonomic biases are widely recognised by the scientific community, few attempts have been made to quantify their strength and to distinguish among different sources of bias. The implications of not considering biases in biodiversity research have not yet been thoroughly assessed, but are likely to be substantial. Therefore, it is advisable that any study dealing with species occurrence data - either carefully validated or directly downloaded - should assess the biases covered by this package.

Sampbias is a method and tool to 1) visualize the distribution of occurrence records and species in any user-provided dataset, 2) quantify the biasing effect of geographic features related to human accessibility, such as proximity to cities, rivers or roads, and 3) create publication-level graphs of these biasing effects in space.

The results of sampbias can be used to identify priorities for further collection or digitalisation efforts, provide bias surfaces for species distribution modelling, or assess the reliability of scientific results based on publicly available species distribution data.

Examples

Example datasets for sampbias and a tutorial on how to use it are provided with the package.

For the impatient

# installing the package from GitHub
install.packages("devtools")
library(devtools)
install_github("azizka/sampbias")
library(sampbias)

# reading a tab-separated csv file as downloaded from GBIF and provided in the example data folder
example.in <- read.csv(system.file("extdata", "mammals_borneo.csv",
                                   package = "sampbias"), sep = "\t")

# running sampbias with default settings
example.out <- calculate_bias(x = example.in)
summary(example.out)
plot(example.out)

sampbias's People

Contributors

azizka, brunovilela, dsilvestro

sampbias's Issues

Unable to project results through space

Hi there!
I am trying to use sampbias R package to estimate the bias of roads and project the result through space (dataset of 33,805 species occurrences across Tasmania).

The bias calculation runs without any issue and I am able to summarize the results, but I get the following error when projecting the sampbias output: Error in `[.data.frame`(ras, , ord) : undefined columns selected (please see attached screenshot).

Could you please advise on what I must change/do to deal with this error? I am using R version 4.0.4 (2021-02-15); sampbias 1.0.4 (I installed sampbias and its dependencies on March 25th, 2021).

Bias_road_error

issue with calculate_bias in equal area grid

Hi everyone,

I would like to run the “calculate_bias” function in an equal-area projection at a 5 km resolution for Africa.

I created the dummy raster in both R and ArcGIS, but I keep getting these errors when I run calculate_bias: "Error in .memtrimlayer(x, padding = padding, values = values, ...) : only NA values found" and "Error in compareRaster(x, mask) : different extent".

Apparently, the function only runs with the example raster "ea_raster". Does anyone have any suggestions, please?

Thanks for the help!

problem with function 'coordinates'

Hi Alex,

I am trying to use the function calculate_bias for a terrestrial dataset, and keep getting the following message:


Error in h(simpleError(msg, call)) :
error in evaluating the argument 'obj' in selecting a method for function 'coordinates': undefined columns selected


I've downloaded the latest version of the package and restarted the session (see session info below), but the error doesn't change. It seems to be related to the 'sp' package.

Many thanks for your help!

Best,

Dylan

R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_CL.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=es_CL.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=es_CL.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=es_CL.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] sampbias_1.0.4 rgeos_0.5-5 raster_3.3-13 speciesgeocodeR_2.0-10 tmap_3.2
[6] sf_0.9-6 forcats_0.5.0 stringr_1.4.0 purrr_0.3.4 readr_1.3.1
[11] tidyr_1.1.2 tibble_3.0.3 tidyverse_1.3.0 dggridR_2.0.4 dplyr_1.0.2
[16] ggplot2_3.3.2 rgdal_1.5-16 sp_1.4-4

loaded via a namespace (and not attached):
[1] nlme_3.1-149 fs_1.5.0 lubridate_1.7.9 RColorBrewer_1.1-2 httr_1.4.2 tools_4.0.2
[7] backports_1.1.10 R6_2.4.1 vegan_2.5-6 KernSmooth_2.23-17 DBI_1.1.0 mgcv_1.8-33
[13] colorspace_1.4-1 permute_0.9-5 withr_2.3.0 gridExtra_2.3 tidyselect_1.1.0 leaflet_2.0.3
[19] compiler_4.0.2 leafem_0.1.3 cli_2.0.2 rvest_0.3.6 xml2_1.3.2 scales_1.1.1
[25] classInt_0.4-3 digest_0.6.25 base64enc_0.1-3 dichromat_2.0-0 pkgconfig_2.0.3 htmltools_0.5.0
[31] dbplyr_1.4.4 htmlwidgets_1.5.1 rlang_0.4.7 readxl_1.3.1 rstudioapi_0.11 generics_0.0.2
[37] jsonlite_1.7.1 crosstalk_1.1.0.1 magrittr_1.5 geosphere_1.5-10 Matrix_1.2-18 Rcpp_1.0.5
[43] munsell_0.5.0 fansi_0.4.1 viridis_0.5.1 ape_5.4-1 abind_1.4-5 lifecycle_0.2.0
[49] stringi_1.5.3 leafsync_0.1.0 MASS_7.3-53 tmaptools_3.1 grid_4.0.2 blob_1.2.1
[55] parallel_4.0.2 crayon_1.3.4 lattice_0.20-41 cowplot_1.1.0 stars_0.4-3 haven_2.3.1
[61] splines_4.0.2 hms_0.5.3 pillar_1.4.6 codetools_0.2-16 reprex_0.3.0 XML_3.99-0.5
[67] picante_1.8.2 glue_1.4.2 packrat_0.5.0 modelr_0.1.8 png_0.1-7 vctrs_0.3.4
[73] cellranger_1.1.0 gtable_0.3.0 assertthat_0.2.1 lwgeom_0.2-5 broom_0.7.0 e1071_1.7-3
[79] class_7.3-17 viridisLite_0.3.0 units_0.6-7 cluster_2.1.0 ellipsis_0.3.1
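
An "undefined columns selected" error of this kind can arise when a data frame is subset by column names it does not contain. The following is a minimal sketch of a pre-flight check, assuming the input needs the columns species, decimalLongitude and decimalLatitude (the names found in GBIF downloads and in another report on this page). check_occ_columns is a hypothetical helper, not part of sampbias, and whether missing columns are the cause of this particular report is not confirmed.

```r
# Hypothetical helper (not part of sampbias): report which expected
# columns are missing from an occurrence table before running the analysis.
check_occ_columns <- function(x,
                              required = c("species",
                                           "decimalLongitude",
                                           "decimalLatitude")) {
  missing_cols <- setdiff(required, names(x))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy input with differently named coordinate columns (made-up values)
occ <- data.frame(species = c("a", "b"),
                  lon = c(13.7, 10.5),
                  lat = c(51.2, 54.0))

# Renaming to the expected names makes the check pass
names(occ)[names(occ) == "lon"] <- "decimalLongitude"
names(occ)[names(occ) == "lat"] <- "decimalLatitude"
check_occ_columns(occ)
```

Running the check before calculate_bias() turns a cryptic downstream error into a message that names the missing columns.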

cropping of raster

Hi,
just testing out the function calculate_bias(), which threw this error:
" Error in strsplit(row.names(xy), " ") : non-character argument "

I narrowed it down to the following lines in dis_rast.R:
gaz.crop <- lapply(gaz, function(k) {
raster::crop(k, cut.off)
})

My area didn't have any rivers in it, so the cropping function failed. How about rewriting it to account for this?
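
The suggested change could look roughly like the sketch below: wrap the cropping step so that layers with nothing inside the study extent are dropped instead of being passed on. Here crop_fun stands in for the real cropping call, and the toy layers are made up so that the snippet runs without any spatial package. It is an illustration of the idea, not the package's actual fix.

```r
# Wrap a cropping function so that a failed crop yields NULL instead of an error.
safe_crop <- function(layer, crop_fun, ...) {
  tryCatch(crop_fun(layer, ...), error = function(e) NULL)
}

# Stand-in for the real cropping call: keeps values inside [lo, hi]
# and fails when nothing is left (assumption, for illustration only).
toy_crop <- function(layer, lo, hi) {
  kept <- layer[layer >= lo & layer <= hi]
  if (length(kept) == 0) stop("nothing left after cropping")
  kept
}

# Made-up "gazetteers": rivers lie entirely outside the study extent
gaz <- list(roads = c(1, 5, 9), rivers = c(20, 30))

gaz.crop <- lapply(gaz, safe_crop, crop_fun = toy_crop, lo = 0, hi = 10)

# Drop the layers that have no features in the study area
gaz.crop <- Filter(Negate(is.null), gaz.crop)
names(gaz.crop)
```

Filtering on NULL also covers cropping functions that return NULL rather than raising an error when nothing overlaps.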

Issues with occurrences using Sampbias

Hi everyone,

I am currently trying to debias plant occurrences from the GBIF data pool.

Using BBike roads, railways, and other human points/amenities as bias sources, I tried to run Sampbias.

However, although I feel like I have tried everything when formatting and importing the occurrence CSV/TXT files, RStudio (v3.6.3) and Sampbias always give me an error.

The first TXT file contains the script I adapted. Do you know what is wrong?

The second TXT file contains the errors I get: one "big" error, apparently because column names are not recognized (although the data can be plotted correctly), and one warning when importing the SHP points and lines for "human bias".

The archive also contains the input data: a CSV of Betula GBIF occurrences and SHP files of the human infrastructure I want to test.

Thanks for considering this message!
Romain

Sampbias_Data_Script_Errors.zip

Error while using default settings calculate_bias

Thanks for your work and for making this available as an R package!

I have been trying to use it on a dataset I have at hand, and I get the following error message:

b_oo <- calculate_bias(oo_df)
Adjusting to terrestrial surface...
 Done.

Creating occurrence raster...
 Done

Calculating distance raster...
Error in .local(x, y, ...) : 
  RasterLayer has no NA cells (for which to compute a distance)
In addition: Warning messages:
1: In calculate_bias(oo_df) : 'gaz' not found, using standard gazetteers
2: In dis_rast(gaz = gaz, ras = occ.out, buffer = buffer) :
  Evening buffer. Buffer set to 2

Here is the str of oo_df:

'data.frame':	1696015 obs. of  3 variables:
 $ species         : Factor w/ 26 levels "Bachstelze","Baumpieper",..: 24 24 24 24 24 24 24 24 24 24 ...
 $ decimalLongitude: num  13.73 10.55 10.57 8.31 8.32 ...
 $ decimalLatitude : num  51.2 54 48.3 53.2 53.2 ...

unable crop for signature "sf"

Hi everyone,

I'm currently trying to deal with a problem. I'm testing the function calculate_bias() with freshwater gastropods from Brazil.
I'm receiving this error message: "unable to find an inherited method for function ‘crop’ for signature ‘"sf"’".
I think that's because I'm using the geobr package to load Brazilian shapefiles directly in R, and the package is built on sf.
Is there an alternative to resolve this? Is there any method to easily restrict the samples in calculate_bias()?
Maybe a link to rnaturalearth can help me?

Thanks for building this package. It is extremely necessary!
Thanks for considering this message.
Brunno.

cannot make custom gazetteers

Hi there, I am trying to use this package to examine whether the reported abundance of a species at a particular location depends on the distance of that location from the closest sampling station. Thus, I need to create a custom gazetteer that contains all the sampling stations. I cannot find instructions or examples on how to make a custom gazetteer. Please help me create one. Thanks.

I tried using the following code to create a raster, gaz_sta, and then running calculate_bias() with gaz = out, but it didn't work. I also tried running calculate_bias() with gaz = gaz_sta, and that also didn't work.

stations <- read_csv([insert attached file]) %>%
dplyr::select(Sta_ID, DLat_Dec, DLon_Dec) %>%
mutate(latitude = DLat_Dec,
longitude = DLon_Dec) %>%
dplyr::select(-DLat_Dec, -DLon_Dec)

sp_stations <- SpatialPointsDataFrame(coords = stations[,3:2], data = data.frame(stations[,1]))

lin <- data.frame(long = seq(min(stations$longitude), max(stations$longitude), by = 1),
lat = seq(min(stations$latitude), max(stations$latitude), by = min(stations$latitude) - max(stations$latitude) / 24))

lin <- sp::SpatialLinesDataFrame(sl = sp::SpatialLines(list(sp::Lines(sp::Line(lin), ID="B1"))),
data = data.frame("B", row.names = "B1"))

gaz_sta <- list(point.structure = sp_stations, lines.structure = lin)

ras <- raster::raster(raster::extent(min(stations$longitude),max(stations$longitude),min(stations$latitude),max(stations$latitude)),
res = min(stations$latitude) - max(stations$latitude) / 24)

out <- dis_rast(gaz_sta, ras)

this doesn't work: example.out <- calculate_bias(x = test_s2, gaz = out, res = 0.1)

this doesn't work, either: example.out <- calculate_bias(x = test_s2, gaz = gaz_sta, res = 0.1)

Station_ID.csv
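
One thing worth checking in the code above is the step size. In R, division binds more tightly than subtraction, so min(x) - max(x) / 24 is evaluated as min(x) - (max(x) / 24) rather than as one 24th of the latitude range. Whether this is what makes calculate_bias() fail is not certain. The sketch below uses made-up latitudes to show the difference.

```r
# Made-up station latitudes, for illustration only
latitude <- c(30.5, 32.0, 34.25, 36.5)

# As written in the report: parsed as min - (max / 24)
step_as_written <- min(latitude) - max(latitude) / 24

# Probably intended: the latitude range split into 24 equal steps
step_intended <- (max(latitude) - min(latitude)) / 24

step_as_written
step_intended

# With the range-based step, seq() yields 25 evenly spaced values
lat_seq <- seq(min(latitude), max(latitude), by = step_intended)
length(lat_seq)
```

The same expression is used for the raster resolution in the report, so both places would need the parentheses.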

cannot derive coordinates from non-numeric matrix

My input data is the output of the R package CoordinateCleaner. The majority of the species occurrence data has worked in sampbias, but a handful of species get the following error:

Creating occurrence raster...
Adjusting to terrestrial surface...
Calculating distance raster...
Error in .local(obj, ...) :
cannot derive coordinates from non-numeric matrix

From my understanding, it is telling me that the coordinates are not in a numeric format. However, when I check using class(example.in$decimalLatitude), it shows as being numeric.

Any idea what is going on?

H.
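
Since class() on a single column only tells part of the story, a few extra checks on the records of a failing species can help narrow things down. This is a generic diagnostic sketch in base R with made-up data. diagnose_coords is a hypothetical helper, and none of the checks is known to be the actual cause of the error.

```r
# Hypothetical helper: summarise properties of an occurrence table that
# can trip up spatial functions.
diagnose_coords <- function(x,
                            lon = "decimalLongitude",
                            lat = "decimalLatitude") {
  list(
    table_class  = class(x),               # e.g. tibble vs. plain data.frame
    n_records    = nrow(x),
    lon_numeric  = is.numeric(x[[lon]]),
    lat_numeric  = is.numeric(x[[lat]]),
    n_missing    = sum(is.na(x[[lon]]) | is.na(x[[lat]])),
    out_of_range = sum(abs(x[[lon]]) > 180 | abs(x[[lat]]) > 90, na.rm = TRUE)
  )
}

# Made-up records: one missing latitude, one impossible longitude
example.in <- data.frame(species = c("a", "a", "a"),
                         decimalLongitude = c(13.7, 200, 8.3),
                         decimalLatitude = c(51.2, 54.0, NA))

str(diagnose_coords(example.in))

# Converting to a plain data.frame and dropping incomplete rows
clean <- as.data.frame(example.in)
clean <- clean[complete.cases(clean[, c("decimalLongitude", "decimalLatitude")]), ]
```

Comparing the summary for a species that works with one that fails may show which property differs.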

The number of biases to plot needs to be a factor of 1000

Hi,

First of all, this is a great package, thanks!

Playing around, I have found that it is not possible to use the function plot.sampbias() when the number of biases to be plotted is not a factor of 1000. Specifically, I was trying to plot biases based on three spatial features, and I got the following error message:

Warning: longer object length is not a multiple of shorter object length
Error in data.frame(dist = plo2_dist, rate = plo2_w[4] * exp(-plo2_w[5:(length(plo2_w) - : arguments imply differing number of rows: 1000, 3

To solve this, I had to change the current line 48 in plot.sampbias.R, which establishes the number of points to be plotted along the X axis from:
plo2_dist <- seq(1,1000,length.out=1000)

to:
plo2_dist <- seq(1,999,length.out=999)

This way, since 999 is a multiple of 3, the code worked with no error messages.

I think that this issue should be easy to fix by adding an if statement. Plots do not change perceptibly when the plo2_dist value is 1000 vs. 999. With this fix, the code would allow plotting 1 to 5 biases (it would again not work for 6 biases, but that may be too many to plot anyway).

Cheers and congratulations again on the package!
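
The 999-point workaround succeeds because data.frame() recycles a shorter column only when the longer length is an exact multiple of it. A long-format table avoids the dependence on the number of biases altogether. The sketch below is base R with made-up weights. The exponential decay form is inferred from the expression in the error message, and all variable names other than plo2_dist are hypothetical. It is not the package's plotting code.

```r
# Distances along the x axis, as in the report
plo2_dist <- seq(1, 1000, length.out = 1000)

# Made-up values: one decay weight per bias and a baseline rate
w <- c(roads = 0.010, rivers = 0.005, cities = 0.002)
q <- 2

# 1000 is not a multiple of 3, so data.frame() refuses to recycle
res <- tryCatch(data.frame(dist = plo2_dist, bias = names(w)),
                error = function(e) conditionMessage(e))
res

# Long format: one row per (distance, bias) combination, for any number of biases
curves <- expand.grid(dist = plo2_dist, bias = names(w),
                      stringsAsFactors = FALSE)
curves$rate <- unname(q * exp(-w[curves$bias] * curves$dist))

nrow(curves)
```

Because each row carries its own bias label, the table can be passed to a plotting function grouped by bias without any length matching.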

enable user-provided rasters

Enable the users to provide custom rasters to calculate_bias and dis_rast to enable different cartographic projections
