boyanangelov / sdmbench

Benchmarking Species Distribution Models

License: MIT License

R 86.20% CSS 0.05% TeX 13.75%

benchmarking ecology machine-learning r sdm

sdmbench's Issues

Consider rgbif::occ_data?

from JOSS review

I noticed that in get_benchmarking_data() you use rgbif::occ_search. You might consider rgbif::occ_data instead: it has the same user interface (parameters), but internally it only spends time getting and parsing the occurrence data and completely drops the other data that you don't want.

Also note that occ_search/occ_data use the GBIF occurrence API, which is limited to 200K results for any one query. So if you know or think you might need more than that, you should use the download API via rgbif::occ_download and friends.

Also note that we now have an interface to the GBIF maps API via rgbif::map_fetch, which is a super fast way to get a raster map of occurrences. You're not getting the actual occurrence data, just a quick summary of it (see https://www.gbif.org/developer/maps). I don't know if this is appropriate for your use case(s), but it's worth knowing about.
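A minimal sketch of what the swap to occ_data might look like, assuming rgbif is installed and that only the coordinates are needed downstream. The helper name fetch_occurrences and the choice of columns are hypothetical and not taken from sdmbench.

```r
# Hypothetical helper (not part of sdmbench): fetch georeferenced GBIF
# records for one species with rgbif::occ_data() instead of occ_search().
fetch_occurrences <- function(species, limit = 1200) {
  if (!requireNamespace("rgbif", quietly = TRUE)) {
    stop("The rgbif package is required for this sketch.")
  }
  res <- rgbif::occ_data(
    scientificName = species,
    hasCoordinate = TRUE,  # skip records without coordinates
    limit = limit
  )
  occ <- res$data
  # Return an empty frame with the same columns when nothing comes back.
  if (is.null(occ) || nrow(occ) == 0) {
    return(data.frame(decimalLongitude = numeric(0),
                      decimalLatitude = numeric(0)))
  }
  # Keeping only coordinates is an assumption about what is needed later.
  as.data.frame(occ[, c("decimalLongitude", "decimalLatitude")])
}

# Usage (needs network access):
# occ <- fetch_occurrences("Loxodonta africana", limit = 1200)
```

Queries that could exceed the occurrence API limit would still need the download API mentioned above.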

package-level helpfile

Something I always look for when getting started with a new package is a package-level helpfile, so I can do ?sdmbench and see some general information about the package, how to get started, and see a roadmap to the available functions.

sdmbench doesn't have one of these yet, but it would be great if it did. You can create one of these with roxygen by documenting a NULL in a script in the R/ folder, like spocc does here.
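As a rough sketch, a package-level helpfile documented on a NULL could look like the block below. The file name R/sdmbench-package.R and the section wording are assumptions; the function names are the ones listed in the "function examples" issue further down.

```r
# Hypothetical file: R/sdmbench-package.R
# roxygen2 turns this block into the ?sdmbench help page.

#' sdmbench: Benchmarking Species Distribution Models
#'
#' A short overview of the package and how to get started goes here.
#'
#' @section Main functions:
#' \itemize{
#'   \item \code{\link{run_sdmbench}}
#'   \item \code{\link{get_benchmarking_data}}
#'   \item \code{\link{partition_data}}
#'   \item \code{\link{benchmark_sdm}}
#'   \item \code{\link{get_best_model_results}}
#'   \item \code{\link{plot_sdm_map}}
#' }
#'
#' @docType package
#' @name sdmbench
NULL
```

After regenerating the documentation (for example with devtools::document()), ?sdmbench should show this page.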

R version dependency required?

Hi Boyan, I'm Nick, one of the reviewers of sdmbench for JOSS. I'm starting my code review now, so will post issues as they come up, then I'll summarise them in the review thread later.

I just went to install the package following the readme instructions:

devtools::install_github("boyanangelov/sdmbench")

and it failed to install with the error:

ERROR: this R is version 3.4.3, package 'sdmbench' requires R >= 3.4.4

which is because there's a dependency on a very recent version of R in DESCRIPTION.

Is that dependency on a super-recent R version really necessary?

If not, it might be worth removing this entry altogether.

If it is necessary, it would be worth mentioning that in the installation information.
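To illustrate why the install failed, here is a small sketch of the version comparison involved. It is not R's actual install-time code, and the relaxed minimum of 3.4.0 is only a hypothetical value for comparison.

```r
# Compare an installed R version against a minimum required version,
# in the spirit of a 'Depends: R (>= x.y.z)' entry in DESCRIPTION.
meets_requirement <- function(installed, required) {
  package_version(installed) >= package_version(required)
}

meets_requirement("3.4.3", "3.4.4")  # FALSE: the situation in the error above
meets_requirement("3.4.3", "3.4.0")  # TRUE: hypothetical relaxed minimum

# Check the running session against a given minimum:
getRversion() >= "3.4.4"
```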

undocumented code objects

R CMD check flagged some exported functions that are not documented:

‘customPredictFunGBM’ ‘customPredictFunKSVM’ ‘customPredictFunLogreg’ ‘customPredictFunMultinom’ ‘customPredictFunNB’ ‘customPredictFunXGB’

Perhaps these were exported by mistake?

Add possibility of using custom user observation data

Add a parameter to filter GBIF occurrences by country

error in benchmark_sdm

After #4 was resolved I continued trying the examples in the README, and ran into a problem:

benchmarking_data <- get_benchmarking_data("Loxodonta africana", limit = 1200, climate_resolution = 10)
bmr <- benchmark_sdm(benchmarking_data$df_data,
                         learners = learners,
                         dataset_type = "block",
                         sample = FALSE)
#> Error in benchmark_sdm(benchmarking_data$df_data, learners = learners,  :
#>   Assertion on 'blocking' failed: Must have length 11199, but has length 0.

Session Info

Session info -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 setting  value
 version  R version 3.5.1 Patched (2018-08-12 r75119)
 system   x86_64, darwin15.6.0
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 tz       US/Pacific
 date     2018-09-24

Packages ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 package      * version    date       source
 .py            0.0.16     <NA>       local
 abind          1.4-5      2016-07-21 cran (@1.4-5)
 assertthat     0.2.0      2017-04-11 CRAN (R 3.5.0)
 backports      1.1.2      2017-12-13 CRAN (R 3.5.0)
 base         * 3.5.1      2018-08-13 local
 BBmisc         1.11       2017-03-10 cran (@1.11)
 bindr          0.1.1      2018-03-13 CRAN (R 3.5.0)
 bindrcpp       0.2.2      2018-03-29 CRAN (R 3.5.0)
 broom          0.5.0      2018-07-17 CRAN (R 3.5.1)
 checkmate      1.8.5      2017-10-24 CRAN (R 3.5.0)
 class          7.3-14     2015-08-30 CRAN (R 3.5.1)
 colorspace     1.3-2      2016-12-14 CRAN (R 3.5.0)
 compiler       3.5.1      2018-08-13 local
 crayon         1.3.4      2017-09-16 CRAN (R 3.5.0)
 crul           0.6.0      2018-07-10 cran (@0.6.0)
 curl           3.2        2018-03-28 CRAN (R 3.5.0)
 CVST           0.2-2      2018-05-26 cran (@0.2-2)
 data.table     1.11.4     2018-05-27 CRAN (R 3.5.0)
 datasets     * 3.5.1      2018-08-13 local
 ddalpha        1.3.4      2018-06-23 cran (@1.3.4)
 DEoptimR       1.0-8      2016-11-19 CRAN (R 3.5.0)
 devtools       1.13.6     2018-06-27 CRAN (R 3.5.0)
 digest         0.6.17     2018-09-12 CRAN (R 3.5.1)
 dimRed         0.1.0      2017-05-04 cran (@0.1.0)
 dismo          1.1-4      2017-01-09 CRAN (R 3.5.0)
 docopt         0.6        2018-08-03 CRAN (R 3.5.1)
 dplyr          0.7.6      2018-06-29 CRAN (R 3.5.0)
 DRR            0.0.3      2018-01-06 cran (@0.0.3)
 fastmatch      1.1-0      2017-01-28 CRAN (R 3.5.0)
 geoaxe         0.1.0      2016-02-19 CRAN (R 3.5.0)
 geometry       0.3-6      2015-09-09 cran (@0.3-6)
 ggplot2        3.0.0      2018-07-03 CRAN (R 3.5.0)
 glue           1.3.0      2018-07-17 CRAN (R 3.5.1)
 gower          0.1.2      2017-02-23 cran (@0.1.2)
 graphics     * 3.5.1      2018-08-13 local
 grDevices    * 3.5.1      2018-08-13 local
 grid           3.5.1      2018-08-13 local
 gtable         0.2.0      2016-02-26 CRAN (R 3.5.0)
 httpcode       0.2.0      2016-11-14 cran (@0.2.0)
 httr           1.3.1      2017-08-20 CRAN (R 3.5.0)
 ipred          0.9-7      2018-08-14 cran (@0.9-7)
 jsonlite       1.5        2017-06-01 CRAN (R 3.5.0)
 kernlab        0.9-27     2018-08-10 CRAN (R 3.5.0)
 lattice        0.20-35    2017-03-25 CRAN (R 3.5.1)
 lava           1.6.3      2018-08-10 cran (@1.6.3)
 lazyeval       0.2.1      2017-10-29 CRAN (R 3.5.0)
 lubridate      1.7.4      2018-04-11 CRAN (R 3.5.0)
 magic          1.5-9      2018-09-17 CRAN (R 3.5.1)
 magrittr       1.5        2014-11-22 CRAN (R 3.5.0)
 MASS           7.3-50     2018-04-30 CRAN (R 3.5.1)
 Matrix         1.2-14     2018-04-13 CRAN (R 3.5.1)
 memoise        1.1.0      2017-04-21 CRAN (R 3.5.0)
 methods      * 3.5.1      2018-08-13 local
 mlr            2.13       2018-08-28 CRAN (R 3.5.0)
 munsell        0.5.0      2018-06-12 CRAN (R 3.5.0)
 nlme           3.1-137    2018-04-07 CRAN (R 3.5.1)
 nnet           7.3-12     2016-02-02 CRAN (R 3.5.1)
 oai            0.2.2.9315 2018-05-31 local (ropensci/oai@NA)
 parallel       3.5.1      2018-08-13 local
 parallelMap    1.3        2015-06-10 cran (@1.3)
 ParamHelpers   1.11       2018-06-25 cran (@1.11)
 pillar         1.3.0      2018-07-14 CRAN (R 3.5.0)
 pkgconfig      2.0.2      2018-08-16 CRAN (R 3.5.1)
 pls            2.7-0      2018-08-21 CRAN (R 3.5.1)
 plyr           1.8.4      2016-06-08 CRAN (R 3.5.0)
 prodlim        2018.04.18 2018-04-18 cran (@2018.04)
 purrr          0.2.5      2018-05-29 CRAN (R 3.5.0)
 qlcMatrix      0.9.7      2018-04-20 CRAN (R 3.5.0)
 R6             2.2.2      2017-06-17 CRAN (R 3.5.0)
 raster         2.6-7      2017-11-13 CRAN (R 3.5.0)
 Rcpp           0.12.18    2018-07-23 CRAN (R 3.5.1)
 RcppRoll       0.3.0      2018-06-05 cran (@0.3.0)
 recipes        0.1.3      2018-06-16 cran (@0.1.3)
 rgbif          1.0.2.9421 2018-09-24 local (ropensci/rgbif@43cf71c)
 rgdal          1.3-4      2018-08-03 CRAN (R 3.5.0)
 rgeos          0.3-28     2018-06-08 CRAN (R 3.5.0)
 rlang          0.2.2      2018-08-16 CRAN (R 3.5.0)
 robustbase     0.93-2     2018-07-27 CRAN (R 3.5.0)
 rpart          4.1-13     2018-02-23 CRAN (R 3.5.1)
 rtichoke       0.2.1      <NA>       local
 scales         1.0.0      2018-08-09 CRAN (R 3.5.1)
 scrubr         0.1.3.9321 2018-05-08 local (ropensci/scrubr@NA)
 sdmbench     * 0.1.2      2018-09-24 local (boyanangelov/sdmbench@cb00187)
 sfsmisc        1.1-2      2018-03-05 cran (@1.1-2)
 slam           0.1-43     2018-04-23 CRAN (R 3.5.0)
 sp             1.3-1      2018-06-05 CRAN (R 3.5.0)
 sparsesvd      0.1-4      2018-02-15 CRAN (R 3.5.0)
 splines        3.5.1      2018-08-13 local
 stats        * 3.5.1      2018-08-13 local
 stringi        1.2.4      2018-07-20 CRAN (R 3.5.0)
 stringr        1.3.1      2018-05-10 CRAN (R 3.5.0)
 survival       2.42-6     2018-07-13 CRAN (R 3.5.1)
 tibble         1.4.2      2018-01-22 CRAN (R 3.5.0)
 tidyr          0.8.1      2018-05-18 CRAN (R 3.5.0)
 tidyselect     0.2.4      2018-02-26 CRAN (R 3.5.0)
 timeDate       3043.102   2018-02-21 cran (@3043.10)
 tools          3.5.1      2018-08-13 local
 triebeard      0.3.0      2016-08-04 CRAN (R 3.5.0)
 urltools       1.7.1      2018-08-03 CRAN (R 3.5.1)
 utils        * 3.5.1      2018-08-13 local
 whisker        0.3-2      2013-04-28 CRAN (R 3.5.0)
 withr          2.1.2      2018-03-15 CRAN (R 3.5.0)
 xml2           1.2.0      2018-01-24 CRAN (R 3.5.0)

spellcheck

devtools::spell_check() flags a few spelling mistakes in the documentation, e.g.:

WORD	FOUND IN
acessed	benchmark_sdm.Rd:20
inbalanced	benchmark_sdm.Rd:17
leanring	train_dl.Rd:10
occurence	benchmark_sdm.Rd:11, customPredictFun.Rd:12, evaluate_dl.Rd:12
occurences	partition_data.Rd:12
partitionined	partition_data.Rd:19
vlaue	get_benchmarking_data.Rd:18

goodpractice checks

I used the goodpractice package to run some automated code checks of sdmbench. Below are some super minor style/robustness issues it flagged that you could change if you wanted to.

On two lines (here and here) you used = for assignment, instead of <- which you used everywhere else.

On quite a few lines of the shiny server file there are more than 80 characters per line, which makes it a bit difficult to read (particularly here). It would be worth reformatting that code to have shorter lines.

On this line of the shiny server, you used the code pattern: 1:length(x), but seq_along(x) is (very mildly) preferable (in general), since in the case x has length 0 (e.g. an empty list), 1:length(x) returns c(1L, 0L), but seq_along returns integer(0).
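The difference is easy to see with an empty list. This is a small standalone sketch, and the helper names are made up for illustration rather than taken from the shiny server code.

```r
# Count how many times a for loop runs for a given way of building indices.
count_iterations <- function(x, index_fun) {
  n <- 0L
  for (i in index_fun(x)) n <- n + 1L
  n
}

colon_index <- function(x) 1:length(x)  # the pattern flagged by goodpractice

empty <- list()
count_iterations(empty, colon_index)  # 2: the loop runs over c(1L, 0L)
count_iterations(empty, seq_along)    # 0: the loop is skipped

# With a non-empty input both approaches agree.
full <- list("a", "b", "c")
count_iterations(full, colon_index)   # 3
count_iterations(full, seq_along)     # 3
```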

That's all it found though, and these are very minor, which is great! I run those checks on packages quite regularly, and the reports are rarely as short as that.

vignette

from JOSS review:

I couldn't install while also building the vignette:

devtools::install_github("boyanangelov/sdmbench", build_vignettes=TRUE, force = TRUE)
* checking for file ‘/private/var/folders/fc/n7g_vrvn0sx_st0p8lxb3ts40000gn/T/Rtmph6H98S/devtools35453b08bf41/boyanangelov-sdmbench-f34ed13/DESCRIPTION’ ... OK
* preparing ‘sdmbench’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... ERROR
sh: /usr/local/bin/virtualenv: /usr/local/opt/python/bin/python3.6: bad interpreter: No such file or directory
Quitting from lines 12-18 (sdmbench_vignette.Rmd)
Error: processing vignette 'sdmbench_vignette.Rmd' failed with diagnostics:
Error 126 occurred creating virtualenv at ~/.virtualenvs/r-tensorflow
Execution halted
Installation failed: Command failed (1)

I guess it's a python virtualenv problem, but I'm not sure how to solve it. The reason I bring it up is that the vignette here https://boyanangelov.com/materials/sdmbench_vignette.html has examples that aren't in line with the version of the package at v0.1.2, e.g.,

benchmarking_data <- get_benchmarking_data("Ornithorhynchus anatinus", limit = 1200, bioclim_resolution = 10)
#> Error in get_benchmarking_data("Ornithorhynchus anatinus", limit = 1200,  :
#>   unused argument (bioclim_resolution = 10)

Another example: I had to change benchmarking_data$raster_data$bioclim_data$bio1 to benchmarking_data$raster_data$climate_variables$bio1 to get the first plot example to work.

Coordinates clean function

Add TSS and Kappa metrics

Uncertainty map for predictions

function examples

The following functions are mentioned in the package-level helpfile (note the links don't work), so will probably be used regularly, but they have only very minimal (or no) examples:
run_sdmbench()
get_benchmarking_data()
partition_data()
benchmark_sdm()
get_best_model_results()
plot_sdm_map()

It would be helpful to have more involved examples, with comments, showing the effects of the different arguments, and how you might use them in different situations. Scott's spocc package is again a good example of this, the major functions all have multiple examples.

Your examples seem to have very long lines too, so it's hard to read them. I'd recommend sticking to an 80-characters-per-line limit for these too.
