boyanangelov / sdmbench

Benchmarking Species Distribution Models

License: MIT License

R 86.20% CSS 0.05% TeX 13.75%
sdm ecology machine-learning benchmarking r

sdmbench's Issues

error in benchmark_sdm

After #4 was resolved, I continued trying the examples in the README and ran into a problem:

benchmarking_data <- get_benchmarking_data("Loxodonta africana", limit = 1200, climate_resolution = 10)
bmr <- benchmark_sdm(benchmarking_data$df_data,
                         learners = learners,
                         dataset_type = "block",
                         sample = FALSE)
#> Error in benchmark_sdm(benchmarking_data$df_data, learners = learners,  :
#>   Assertion on 'blocking' failed: Must have length 11199, but has length 0.
Session Info
Session info -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 setting  value
 version  R version 3.5.1 Patched (2018-08-12 r75119)
 system   x86_64, darwin15.6.0
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 tz       US/Pacific
 date     2018-09-24

Packages ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 package      * version    date       source
 .py            0.0.16     <NA>       local
 abind          1.4-5      2016-07-21 cran (@1.4-5)
 assertthat     0.2.0      2017-04-11 CRAN (R 3.5.0)
 backports      1.1.2      2017-12-13 CRAN (R 3.5.0)
 base         * 3.5.1      2018-08-13 local
 BBmisc         1.11       2017-03-10 cran (@1.11)
 bindr          0.1.1      2018-03-13 CRAN (R 3.5.0)
 bindrcpp       0.2.2      2018-03-29 CRAN (R 3.5.0)
 broom          0.5.0      2018-07-17 CRAN (R 3.5.1)
 checkmate      1.8.5      2017-10-24 CRAN (R 3.5.0)
 class          7.3-14     2015-08-30 CRAN (R 3.5.1)
 colorspace     1.3-2      2016-12-14 CRAN (R 3.5.0)
 compiler       3.5.1      2018-08-13 local
 crayon         1.3.4      2017-09-16 CRAN (R 3.5.0)
 crul           0.6.0      2018-07-10 cran (@0.6.0)
 curl           3.2        2018-03-28 CRAN (R 3.5.0)
 CVST           0.2-2      2018-05-26 cran (@0.2-2)
 data.table     1.11.4     2018-05-27 CRAN (R 3.5.0)
 datasets     * 3.5.1      2018-08-13 local
 ddalpha        1.3.4      2018-06-23 cran (@1.3.4)
 DEoptimR       1.0-8      2016-11-19 CRAN (R 3.5.0)
 devtools       1.13.6     2018-06-27 CRAN (R 3.5.0)
 digest         0.6.17     2018-09-12 CRAN (R 3.5.1)
 dimRed         0.1.0      2017-05-04 cran (@0.1.0)
 dismo          1.1-4      2017-01-09 CRAN (R 3.5.0)
 docopt         0.6        2018-08-03 CRAN (R 3.5.1)
 dplyr          0.7.6      2018-06-29 CRAN (R 3.5.0)
 DRR            0.0.3      2018-01-06 cran (@0.0.3)
 fastmatch      1.1-0      2017-01-28 CRAN (R 3.5.0)
 geoaxe         0.1.0      2016-02-19 CRAN (R 3.5.0)
 geometry       0.3-6      2015-09-09 cran (@0.3-6)
 ggplot2        3.0.0      2018-07-03 CRAN (R 3.5.0)
 glue           1.3.0      2018-07-17 CRAN (R 3.5.1)
 gower          0.1.2      2017-02-23 cran (@0.1.2)
 graphics     * 3.5.1      2018-08-13 local
 grDevices    * 3.5.1      2018-08-13 local
 grid           3.5.1      2018-08-13 local
 gtable         0.2.0      2016-02-26 CRAN (R 3.5.0)
 httpcode       0.2.0      2016-11-14 cran (@0.2.0)
 httr           1.3.1      2017-08-20 CRAN (R 3.5.0)
 ipred          0.9-7      2018-08-14 cran (@0.9-7)
 jsonlite       1.5        2017-06-01 CRAN (R 3.5.0)
 kernlab        0.9-27     2018-08-10 CRAN (R 3.5.0)
 lattice        0.20-35    2017-03-25 CRAN (R 3.5.1)
 lava           1.6.3      2018-08-10 cran (@1.6.3)
 lazyeval       0.2.1      2017-10-29 CRAN (R 3.5.0)
 lubridate      1.7.4      2018-04-11 CRAN (R 3.5.0)
 magic          1.5-9      2018-09-17 CRAN (R 3.5.1)
 magrittr       1.5        2014-11-22 CRAN (R 3.5.0)
 MASS           7.3-50     2018-04-30 CRAN (R 3.5.1)
 Matrix         1.2-14     2018-04-13 CRAN (R 3.5.1)
 memoise        1.1.0      2017-04-21 CRAN (R 3.5.0)
 methods      * 3.5.1      2018-08-13 local
 mlr            2.13       2018-08-28 CRAN (R 3.5.0)
 munsell        0.5.0      2018-06-12 CRAN (R 3.5.0)
 nlme           3.1-137    2018-04-07 CRAN (R 3.5.1)
 nnet           7.3-12     2016-02-02 CRAN (R 3.5.1)
 oai            0.2.2.9315 2018-05-31 local (ropensci/oai@NA)
 parallel       3.5.1      2018-08-13 local
 parallelMap    1.3        2015-06-10 cran (@1.3)
 ParamHelpers   1.11       2018-06-25 cran (@1.11)
 pillar         1.3.0      2018-07-14 CRAN (R 3.5.0)
 pkgconfig      2.0.2      2018-08-16 CRAN (R 3.5.1)
 pls            2.7-0      2018-08-21 CRAN (R 3.5.1)
 plyr           1.8.4      2016-06-08 CRAN (R 3.5.0)
 prodlim        2018.04.18 2018-04-18 cran (@2018.04)
 purrr          0.2.5      2018-05-29 CRAN (R 3.5.0)
 qlcMatrix      0.9.7      2018-04-20 CRAN (R 3.5.0)
 R6             2.2.2      2017-06-17 CRAN (R 3.5.0)
 raster         2.6-7      2017-11-13 CRAN (R 3.5.0)
 Rcpp           0.12.18    2018-07-23 CRAN (R 3.5.1)
 RcppRoll       0.3.0      2018-06-05 cran (@0.3.0)
 recipes        0.1.3      2018-06-16 cran (@0.1.3)
 rgbif          1.0.2.9421 2018-09-24 local (ropensci/rgbif@43cf71c)
 rgdal          1.3-4      2018-08-03 CRAN (R 3.5.0)
 rgeos          0.3-28     2018-06-08 CRAN (R 3.5.0)
 rlang          0.2.2      2018-08-16 CRAN (R 3.5.0)
 robustbase     0.93-2     2018-07-27 CRAN (R 3.5.0)
 rpart          4.1-13     2018-02-23 CRAN (R 3.5.1)
 rtichoke       0.2.1      <NA>       local
 scales         1.0.0      2018-08-09 CRAN (R 3.5.1)
 scrubr         0.1.3.9321 2018-05-08 local (ropensci/scrubr@NA)
 sdmbench     * 0.1.2      2018-09-24 local (boyanangelov/sdmbench@cb00187)
 sfsmisc        1.1-2      2018-03-05 cran (@1.1-2)
 slam           0.1-43     2018-04-23 CRAN (R 3.5.0)
 sp             1.3-1      2018-06-05 CRAN (R 3.5.0)
 sparsesvd      0.1-4      2018-02-15 CRAN (R 3.5.0)
 splines        3.5.1      2018-08-13 local
 stats        * 3.5.1      2018-08-13 local
 stringi        1.2.4      2018-07-20 CRAN (R 3.5.0)
 stringr        1.3.1      2018-05-10 CRAN (R 3.5.0)
 survival       2.42-6     2018-07-13 CRAN (R 3.5.1)
 tibble         1.4.2      2018-01-22 CRAN (R 3.5.0)
 tidyr          0.8.1      2018-05-18 CRAN (R 3.5.0)
 tidyselect     0.2.4      2018-02-26 CRAN (R 3.5.0)
 timeDate       3043.102   2018-02-21 cran (@3043.10)
 tools          3.5.1      2018-08-13 local
 triebeard      0.3.0      2016-08-04 CRAN (R 3.5.0)
 urltools       1.7.1      2018-08-03 CRAN (R 3.5.1)
 utils        * 3.5.1      2018-08-13 local
 whisker        0.3-2      2013-04-28 CRAN (R 3.5.0)
 withr          2.1.2      2018-03-15 CRAN (R 3.5.0)
 xml2           1.2.0      2018-01-24 CRAN (R 3.5.0)

Consider rgbif::occ_data?

from JOSS review

I noticed that get_benchmarking_data() uses rgbif::occ_search. You might consider rgbif::occ_data instead: it has the same user interface (parameters), but internally it only spends time fetching and parsing the occurrence data and drops the other data that you don't need.

Also note that occ_search/occ_data use the GBIF occurrence API, which is limited to 200K results for any one query. If you know or think you might need more than that, you should use the download API via rgbif::occ_download and friends.

Also note that we now have an interface to the GBIF maps API via rgbif::map_fetch, which is a very fast way to get a raster map of occurrences. You don't get the actual occurrence data, only a quick summary of it; see https://www.gbif.org/developer/maps. I don't know whether this is appropriate for your use case(s), but it is worth knowing about.
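As a rough illustration of the occ_data suggestion, here is a minimal sketch. It assumes the rgbif package is installed and that only georeferenced records are wanted; `fetch_occurrences` is a hypothetical helper name, not something sdmbench provides, and the `hasCoordinate` filter is an assumption about what the package needs.

```r
# Sketch only: requires the rgbif package and network access to GBIF.
# `fetch_occurrences` is a hypothetical helper, not part of sdmbench.
fetch_occurrences <- function(scientific_name, limit = 1200) {
  res <- rgbif::occ_data(
    scientificName = scientific_name,
    hasCoordinate = TRUE,  # assumption: keep only georeferenced records
    limit = limit
  )
  # occ_data() returns a list with `meta` and `data`;
  # `data` holds the occurrence records
  res$data
}

# Example call (commented out because it queries the GBIF API):
# occ <- fetch_occurrences("Loxodonta africana", limit = 1200)
# head(occ[, c("decimalLongitude", "decimalLatitude")])
```

Because occ_data takes the same parameters as occ_search, swapping one for the other should mostly be a matter of changing the function name, though that would need checking against how the result is used downstream.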

vignette

from JOSS review:

I couldn't install while also building the vignette

devtools::install_github("boyanangelov/sdmbench", build_vignettes=TRUE, force = TRUE)
* checking for file ‘/private/var/folders/fc/n7g_vrvn0sx_st0p8lxb3ts40000gn/T/Rtmph6H98S/devtools35453b08bf41/boyanangelov-sdmbench-f34ed13/DESCRIPTION’ ... OK
* preparing ‘sdmbench’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... ERROR
sh: /usr/local/bin/virtualenv: /usr/local/opt/python/bin/python3.6: bad interpreter: No such file or directory
Quitting from lines 12-18 (sdmbench_vignette.Rmd)
Error: processing vignette 'sdmbench_vignette.Rmd' failed with diagnostics:
Error 126 occurred creating virtualenv at ~/.virtualenvs/r-tensorflow
Execution halted
Installation failed: Command failed (1)

I guess this is a Python virtualenv problem, but I'm not sure how to solve it. The reason I bring it up is that the vignette at https://boyanangelov.com/materials/sdmbench_vignette.html has examples that aren't in line with the package at v0.1.2, e.g.:

benchmarking_data <- get_benchmarking_data("Ornithorhynchus anatinus", limit = 1200, bioclim_resolution = 10)
#> Error in get_benchmarking_data("Ornithorhynchus anatinus", limit = 1200,  :
#>   unused argument (bioclim_resolution = 10)

Another example: I had to change benchmarking_data$raster_data$bioclim_data$bio1 to benchmarking_data$raster_data$climate_variables$bio1 to get the first plot example to work.
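One quick way to spot this kind of mismatch between a vignette and the installed version is to inspect the function's formal arguments. The sketch below uses a stand-in function so that it runs without sdmbench installed; the stub's first argument name is hypothetical, and only `limit` and `climate_resolution` are taken from the README call.

```r
# Stand-in for the installed function; the real signature may differ.
# `scientific_name` is a hypothetical argument name.
get_benchmarking_data_stub <- function(scientific_name, limit,
                                       climate_resolution) {
  NULL
}

# List the argument names the function actually accepts
accepted <- names(formals(get_benchmarking_data_stub))

# Check the argument name used in the vignette against that list
vignette_arg_ok <- "bioclim_resolution" %in% accepted
readme_arg_ok <- "climate_resolution" %in% accepted
```

With the real package loaded, `names(formals(get_benchmarking_data))` or `args(get_benchmarking_data)` would give the same information directly.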

goodpractice checks

I used the goodpractice package to run some automated code checks of sdmbench. Below are some super minor style/robustness issues it flagged that you could change if you wanted to.

On two lines (here and here) you used = for assignment, instead of <- which you used everywhere else.

On quite a few lines of the shiny server file there are more than 80 characters per line, which makes it a bit difficult to read (particularly here). It would be worth reformatting that code to have shorter lines.

On this line of the shiny server, you used the code pattern 1:length(x), but seq_along(x) is (very mildly) preferable in general: when x has length 0 (e.g. an empty list), 1:length(x) returns c(1L, 0L), whereas seq_along(x) returns integer(0).
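To make the difference concrete, here is a small base R sketch; `count_iterations` is a helper written only for this illustration.

```r
x <- list()  # an empty input, e.g. no results selected

bad_idx <- 1:length(x)    # counts down from 1 to 0
good_idx <- seq_along(x)  # empty index vector

# Count how many times a for loop over the index would run
count_iterations <- function(idx) {
  n <- 0L
  for (i in idx) {
    n <- n + 1L
  }
  n
}

bad_runs <- count_iterations(bad_idx)    # loop body runs twice
good_runs <- count_iterations(good_idx)  # loop body never runs
```

In the 1:length(x) case the loop body would then try to access x[[1]] on an empty list, which is where the error would surface.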

That's all it found though, and these are very minor, which is great! I run those checks on packages quite regularly, and the reports are rarely as short as that.

function examples

The following functions are mentioned in the package-level helpfile (note the links don't work), so will probably be used regularly, but they have only very minimal (or no) examples:
run_sdmbench()
get_benchmarking_data()
partition_data()
benchmark_sdm()
get_best_model_results()
plot_sdm_map()

It would be helpful to have more involved examples, with comments, showing the effects of the different arguments, and how you might use them in different situations. Scott's spocc package is again a good example of this, the major functions all have multiple examples.

Your examples seem to have very long lines too, so it's hard to read them. I'd recommend sticking to an 80-characters-per-line limit for these too.

spellcheck

devtools::spell_check() flags a few spelling mistakes in the documentation, e.g.:

WORD           FOUND IN
acessed        benchmark_sdm.Rd:20
inbalanced     benchmark_sdm.Rd:17
leanring       train_dl.Rd:10
occurence      benchmark_sdm.Rd:11, customPredictFun.Rd:12, evaluate_dl.Rd:12
occurences     partition_data.Rd:12
partitionined  partition_data.Rd:19
vlaue          get_benchmarking_data.Rd:18

R version dependency required?

Hi Boyan, I'm Nick, one of the reviewers of sdmbench for JOSS. I'm starting my code review now, so will post issues as they come up, then I'll summarise them in the review thread later.


I just went to install the package following the readme instructions:

devtools::install_github("boyanangelov/sdmbench")

and it failed to install with the error:

ERROR: this R is version 3.4.3, package 'sdmbench' requires R >= 3.4.4

which is because there's a dependency on a very recent version of R in DESCRIPTION.

Is that dependency on a super-recent R version really necessary?

If not, it might be worth removing this entry altogether.

If it is necessary, it would be worth mentioning that in the installation information.
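For reference, the check that fails here is a plain version comparison against the `Depends: R (>= ...)` entry in DESCRIPTION. A minimal base R sketch of that comparison, using the two version numbers from the error message:

```r
# Versions taken from the error message above
installed <- package_version("3.4.3")
required <- package_version("3.4.4")

# The install is rejected because this comparison is FALSE
meets_requirement <- installed >= required

# The same comparison for whatever R session is currently running
current_ok <- getRversion() >= required
```

Relaxing or dropping the `R (>= 3.4.4)` entry in DESCRIPTION would make the comparison pass on 3.4.3, assuming nothing in the package actually depends on the newer release.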

package-level helpfile

Something I always look for when getting started with a new package is a package-level helpfile, so I can do ?sdmbench and see some general information about the package, how to get started, and see a roadmap to the available functions.

sdmbench doesn't have one of these yet, but it would be great if it did. You can create one of these with roxygen by documenting a NULL in a script in the R/ folder, like spocc does here.
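A minimal sketch of what such a file could look like is below. The file name and the wording are assumptions; the title comes from the repository description and the function names are sdmbench's main exported functions.

```r
# Hypothetical file: R/sdmbench-package.R

#' sdmbench: Benchmarking Species Distribution Models
#'
#' A short overview of the package would go here, followed by a
#' roadmap to the main functions:
#' \itemize{
#'   \item \code{\link{run_sdmbench}}
#'   \item \code{\link{get_benchmarking_data}}
#'   \item \code{\link{partition_data}}
#'   \item \code{\link{benchmark_sdm}}
#'   \item \code{\link{get_best_model_results}}
#'   \item \code{\link{plot_sdm_map}}
#' }
#'
#' @docType package
#' @name sdmbench
NULL
```

After running devtools::document(), ?sdmbench should then open this page.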

undocumented code objects

R CMD check flagged some exported functions that are not documented:

‘customPredictFunGBM’ ‘customPredictFunKSVM’ ‘customPredictFunLogreg’ ‘customPredictFunMultinom’ ‘customPredictFunNB’ ‘customPredictFunXGB’

Perhaps these were exported by mistake?
