
geodata's Introduction

rspatial

R package with data sets used in the material on the https://rspatial.org website to teach spatial data analysis with R.

You can install the package like this:

remotes::install_github("rspatial/rspatial")

geodata's People

Contributors

aramburumerlos, dybovsky98, iod-ine, nowosad, rhijmans


geodata's Issues

GADM update for upcoming version 4.1: Kyrgyzstan borders

Hello,
I know that the new version 4.1 is expected to be released next month (April 2022). I wanted to point out that the Kyrgyzstan border needs an update; in fact, it has needed one for quite a long time. In the picture below, the yellow shapefile is from the national dataset and the blue is GADM level 1 (but this concerns all levels). You can see that along the eastern border there are small areas that belong not to Kyrgyzstan but to China.
[image: national dataset (yellow) vs. GADM level 1 (blue)]
Here is Google Maps for comparison, showing borders that follow the national dataset.
[image: Google Maps view of the same border]

I am just asking to include this in the upcoming update. Thank you!

'sp_occurrence' documentation suggestions

In the 'sp_occurrence' help file's (not run) examples, plot(gs) tries to plot an entire data frame, which results in "Error in plot.new() : figure margins too large". I think it should be plot(gs$lon, gs$lat).
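A hedged sketch of the corrected example (the query below is illustrative, standing in for whatever the help file actually uses to build 'gs'):

library(geodata)
# illustrative query; 'gs' stands for the data frame from the help-file example
gs <- sp_occurrence(genus = "Daboia", species = "mauritanica")
# plot the coordinate columns rather than the whole data frame
plot(gs$lon, gs$lat, xlab = "longitude", ylab = "latitude")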

It would also be very useful to provide a usage example for 'args'. From the argument description, it's difficult to understand how exactly it should be specified. For example, is it possible to download only records where the Year column is either NA or within a given range of values, or more recent than a given year?

Cheers and thanks again for the great package!

sp_occurrence suggests unsupported * wildcard

The documentation for sp_occurrence suggests using an * as a wildcard:

\item{species}{character. species name. Use '*' to download the entire genus. Append '*' to the species name to get all naming variants (e.g. with and without species author name) and sub-taxa }

This doesn't work for me, and after some searching I think the API doesn't support wildcards: https://discourse.gbif.org/t/searching-on-catalogue-number/3202

If I try the example from https://rspatial.org/sdm/2_sdm_occdata.html#importing-occurrence-data I get zero results:

library(geodata)
sp_occurrence("solanum", "acaule*", download=FALSE)

[1] 0

Similarly,

sp_occurrence("solanum", "*", download=FALSE)

[1] 0

Perhaps I don't understand, as I'm not familiar with the GBIF API. But if this should work, I think there's something missing in the documentation, and possibly also the rspatial tutorial, to explain it.

WorldClim: .check_cmip6() does not allow for the time period 2081–2100

Unless there is something I missed, and a good reason not to include them, the function cmip6_world() currently does not allow the download of future climate data from WorldClim for the period 2081–2100. They are however available, as can be seen from the description page or from a (randomly selected) download page.

It all comes down to modifying the .check_cmip6() function (line 150 in the worldclim.R file) with:

    stopifnot(time %in% c("2021-2040", "2041-2060", "2061-2080", "2081-2100"))
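For reference, a hedged example of a call this change would enable (model and resolution chosen arbitrarily; assumes the stopifnot() fix above is in place):

library(geodata)
# request the 2081-2100 period that .check_cmip6() currently rejects
fut <- cmip6_world(model = "MRI-ESM2-0", ssp = "585", time = "2081-2100",
                   var = "bioc", res = 10, path = tempdir())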

`cmip6_tile()` ignores `res` argument

When running cmip6_tile(), no matter what you provide as the res argument (even values that supposedly aren't allowed), it downloads the layers at 0.5 arc-min resolution:

wc.fut <- geodata::cmip6_tile('MRI-ESM2-0',
                              '585',
                              '2061-2080',
                              var = 'bioc',
                              res = 2.5,  # but tried also res=10, res=115, etc.
                              path = tempdir(),
                              lon = 80.69787,
                              lat = 7.621672)

res(wc.fut)
# [1] 0.008333333 0.008333333

A fix would be greatly appreciated, especially because the download at this resolution takes a looong time :)
Regards!

Issue with trim

It looks like a newer version of terra no longer supports character input to the trim function, resulting in the following error and affecting a number of the geodata functions:

country <- "Kenya"
### works 
### geodata_0.3-1 terra_1.3-9
> toupper(trim(country[1]))
[1] "KENYA"

### doesn't work 
### geodata_0.3-1 terra_1.4-3  
> toupper(trim(country[1]))
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘trim’ for signature ‘"character"’
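A possible workaround (an assumption, not an official fix): base R's trimws() accepts character vectors, unlike the newer terra::trim(), which is raster-only.

# trim whitespace with base R instead of terra::trim()
country <- " Kenya "
toupper(trimws(country[1]))
# [1] "KENYA"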

World maps with different countries at different resolutions

Hi,

I've used world() to download world maps at all five resolutions, and I notice that the available countries differ between resolutions. There are 252 polygons at resolution 2:

> w2
 class       : SpatVector 
 geometry    : polygons 
 dimensions  : 252, 2  (geometries, attributes)
 extent      : -180, 180, -90, 83.65833  (xmin, xmax, ymin, ymax)
 coord. ref. : +proj=longlat +datum=WGS84 +no_defs 
 names       : GID_0      NAME_0
 type        : <chr>       <chr>
 values      :   ABW       Aruba
                 AFG Afghanistan
                 AGO      Angola

But only 231 at the highest resolution. In particular, the following countries are missing:

> w2$NAME_0[! w2$NAME_0 %in% w1$NAME_0]
[1] "Gibraltar"                           
[2] "Monaco"                              
[3] "Maldives"                            
[4] "Marshall Islands"                    
[5] "Tuvalu"                              
[6] "United States Minor Outlying Islands"
[7] "Paracel Islands"

I checked the polygons, and Gibraltar isn't subsumed within Spain, it's just missing from the resolution = 1 map. It would make sense if the highest resolution maps included some countries that were too small to represent in the lower resolution layers, but this seems to be the opposite - shouldn't everything that appears in the second highest resolution map also appear in the highest resolution map?

I'm not sure if this is a geodata issue, or something upstream at GADM?

download a subregion

I want to download a map of West Africa and wonder if that is possible with geodata. Otherwise, is it possible to download several individual countries at once, or to download two contiguous countries and merge them?
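Multi-country download is not a built-in feature as far as I know, but a sketch of a workaround is to fetch the countries one by one and combine them (the ISO3 codes below are an illustrative West African subset):

library(geodata)
iso <- c("SEN", "MLI", "BFA")  # Senegal, Mali, Burkina Faso
vlist <- lapply(iso, function(cc) gadm(country = cc, level = 0, path = tempdir()))
west_africa <- do.call(rbind, vlist)  # terra's rbind() combines SpatVectors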

'sp_occurrence' now returns only one row

This was previously working fine (except for the 'geo' argument, mentioned in this other issue), but now it's returning only one row, even if the console output says there are more:

occ_gbif <- sp_occurrence(genus = "Daboia", species = "mauritanica")
# trying URL 'https://api.gbif.org/v1/occurrence/search?scientificname=Daboia+mauritanica&limit=1&coordinatestatus=true'
# Content type 'application/json' length 3922 bytes
# ==================================================
# downloaded 3922 bytes
# 
# 105 records found
# 0-105
# 105 records downloaded

nrow(occ_gbif)
# [1] 1

str(occ_gbif)
# 'data.frame':	1 obs. of  73 variables:
#  $ acceptedScientificName       : chr "Daboia mauritanica Gray, 1849"
#  $ acceptedTaxonKey             : int 5789511
# [...]

`sp_occurrence` changes original coordinate column names

I often like to have students use geodata::sp_occurrence instead of the more widely used rgbif functions, because it's simpler to install and the resulting object is simpler to handle. However, this breaks downstream code that uses the original GBIF column names "decimalLongitude" and "decimalLatitude", and makes it difficult to have everyone use whichever function they prefer and then use the remaining scripts seamlessly. Is there a good reason for geodata::sp_occurrence to change these names to 'lon' and 'lat'? Could an option be provided for keeping the original column names? Cheers

The package cannot open connection to UC Davis database website

Hi, I was trying to run the gadm function in the geodata package, but since yesterday it's been returning connection errors:

Error in file(con, "r") : 
  cannot open the connection to 'https://geodata.ucdavis.edu/gadm/gadm4.1.txt'
In addition: Warning message:
In file(con, "r") :
  URL 'https://geodata.ucdavis.edu/gadm/gadm4.1.txt': Timeout of 6000 seconds was reached
trying URL 'https://geodata.ucdavis.edu/gadm/gadm4.1/pck/gadm41_MOZ_1_pk.rds'
Error in utils::download.file(url = url, destfile = filename, quiet = quiet,  : 
  cannot open URL 'https://geodata.ucdavis.edu/gadm/gadm4.1/pck/gadm41_MOZ_1_pk.rds'
download failed

Here's my code:
moz <- geodata::gadm(country="MOZ", level=1, path=tempdir())

Not sure if this is because the server is down or something else?

Error "download failed" when using `geodata::worldclim_global()`

Download fails when using

file_path <- paste0(dirname(here::here()), "/silvoarable_review/DATASET.FROM.SCRIPT/")

geodata::worldclim_global(var = 'bio', res = 2.5, path = file_path)

Error message:

trying URL 'https://geodata.ucdavis.edu/climate/worldclim/2_1/base/wc2.1_2.5m_bio.zip'
Content type 'application/zip' length 658405521 bytes (627.9 MB)
===========
downloaded 149.5 MB

Warning: downloaded length 156811264 != reported length 658405521
Warning: URL 'https://geodata.ucdavis.edu/climate/worldclim/2_1/base/wc2.1_2.5m_bio.zip': Timeout of 60 seconds was reached
Error in utils::download.file(url = url, destfile = filename, quiet = quiet,  :
  download from 'https://geodata.ucdavis.edu/climate/worldclim/2_1/base/wc2.1_2.5m_bio.zip' failed
Error in .downloadDirect(paste0(.wcurl, "base/", zip), pzip, ...) : 
  download failed

However, the old function from the raster package seems to work fine (?):

raster::getData(name = 'worldclim', var = 'bio', res = 2.5, path = file_path)

Warning: getData will be removed in a future version of raster. Please use the geodata package instead
trying URL 'https://biogeo.ucdavis.edu/data/climate/worldclim/1_4/grid/cur/bio_2-5m_bil.zip'
Content type 'application/zip' length 129319755 bytes (123.3 MB)
==================================================
downloaded 123.3 MB

class      : RasterStack 
dimensions : 3600, 8640, 31104000, 19  (nrow, ncol, ncell, nlayers)
resolution : 0.04166667, 0.04166667  (x, y)
extent     : -180, 180, -60, 90  (xmin, xmax, ymin, ymax)
crs        : +proj=longlat +datum=WGS84 +no_defs 
names      :  bio1,  bio2,  bio3,  bio4,  bio5,  bio6,  bio7,  bio8,  bio9, bio10, bio11, bio12, bio13, bio14, bio15, ... 
min values :  -278,     9,     8,    64,   -86,  -559,    53,  -278,  -501,  -127,  -506,     0,     0,     0,     0, ... 
max values :   319,   213,    96, 22704,   489,   258,   725,   376,   365,   382,   289, 10577,  2437,   697,   265, ... 

Would it be possible to add a function to download GPW's population count data?

Currently, geodata only has a function for downloading the Gridded Population of the World (GPW) population density data, via geodata::population(). Is it possible to get the GPW population count data added as a function, or perhaps as an argument within the population() call? This would be very helpful for some reproducible code I am trying to write.

Issues with the installation of geodata

Greetings,

I have terra version 1.5-17 installed on macOS Ventura 13.3.1. I want to install the geodata package using the following command:

remotes::install_github("rspatial/geodata")

but it requires version 1.6-41 of the terra package. I was unable to update terra to the version required by geodata. Any help?

Best regards,
KD

worldclim_tile does not return correct WorldClim bioclimatic layers

The function worldclim_tile, to download climate data from WorldClim (version 2.1) in tiles, does not return the correct WorldClim bioclimatic layers.

worldclim_tile gets layers from UC Davis (https://geodata.ucdavis.edu/climate/worldclim/2_1/tiles/tile/) and not from WorldClim directly (https://www.worldclim.org/data/worldclim21.html), so it seems like something went wrong when preparing the tiles.

Here is an example:

bioclim <- worldclim_tile(
  var = "bio",
  res = 0.5,
  lon = 51.19113,
  lat = 25.28342,
  path = tempdir(),
  version = "2.1"
)

summary(bioclim)

 tile_32_wc2.1_30s_bio_13 tile_32_wc2.1_30s_bio_14
 Min.   :32.94            Min.   : 33.4
 1st Qu.:43.99            1st Qu.:146.4
 Median :51.59            Median :444.0
 Mean   :56.53            Mean   :425.7
 3rd Qu.:70.71            3rd Qu.:651.2
 Max.   :91.41            Max.   :961.8
 NA's   :31810            NA's   :31810

Note the values for tile_32_wc2.1_30s_bio_13 (Precipitation of Wettest Month) are lower than the values for tile_32_wc2.1_30s_bio_14 (Precipitation of Driest Month). The layers downloaded directly from WorldClim seem to be OK.

Session Info:
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] mapview_2.11.0.9006 geodata_0.5-8 sf_1.0-14 terra_1.7-39 janitor_2.2.0
[6] lubridate_1.9.2 forcats_1.0.0 stringr_1.5.0 dplyr_1.1.2 purrr_1.0.1
[11] readr_2.1.4 tidyr_1.3.0 tibble_3.2.1 ggplot2_3.4.2 tidyverse_2.0.0

loaded via a namespace (and not attached):
[1] tidyselect_1.2.0 lattice_0.20-45 snakecase_0.11.0 colorspace_2.1-0 vctrs_0.6.2
[6] generics_0.1.3 stats4_4.2.0 htmltools_0.5.6 base64enc_0.1-3 utf8_1.2.3
[11] rlang_1.1.1 e1071_1.7-13 pillar_1.9.0 glue_1.6.2 withr_2.5.0
[16] DBI_1.1.3 sp_2.0-0 lifecycle_1.0.3 munsell_0.5.0 gtable_0.3.3
[21] raster_3.6-23 htmlwidgets_1.6.2 codetools_0.2-18 tzdb_0.4.0 fastmap_1.1.1
[26] crosstalk_1.2.0 class_7.3-20 fansi_1.0.4 leafem_0.2.0 Rcpp_1.0.11
[31] KernSmooth_2.23-20 satellite_1.0.4 scales_1.2.1 classInt_0.4-9 leaflet_2.1.2
[36] hms_1.1.3 png_0.1-8 digest_0.6.33 stringi_1.7.12 grid_4.2.0
[41] cli_3.6.1 tools_4.2.0 magrittr_2.0.3 proxy_0.4-27 pkgconfig_2.0.3
[46] timechange_0.2.0 rstudioapi_0.15.0 R6_2.5.1 units_0.8-3 compiler_4.2.0

Server down for maintenance?

Hi, we are trying to use the geodata package but when attempting to get worldclim data, the following error appears:

The geodata server is down for maintenance.
It is expected to be back online on April 13, 2023.

It's past that date, so we're not sure what is going on.

Thanks!

Indicate WorldClim version in docs

Hi,

I'm starting to switch my code over to geodata from raster::getData. Comparing the data collected by each function, I was surprised that the layers didn't match - variables are recorded on different scales, with different extents, and different ranges.

After looking at the URLs for each function, I see now that getData downloads WorldClim version 1.4, and worldclim_global downloads version 2.1. Perhaps you could add this to the help page for worldclim_global? It might help others making the switch and finding the results of their analyses have changed.

Thanks for all your work building and sharing these tools; they're indispensable for our work.

Best,

Tyler

`elevation*` and others: delete .zip file after use

Functions that download raster maps, such as the elevation* functions and population, download a .zip file and then save a .tif file alongside it. The .zip does not seem to be needed afterwards, so I'd suggest adding unlink(pzip) at the end of these functions to save users disk space.
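A sketch of the suggested change (the variable name pzip follows this issue; placement at the end of each download helper, once the .tif has been written):

# remove the downloaded archive; the extracted .tif is all that is kept
if (file.exists(pzip)) {
  unlink(pzip)
}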

Citing GBIF data properly

Hello, I am writing from GBIF.

I am doing a small outreach to those R packages that use GBIF occurrence search.

Under the terms of the GBIF data user agreement, users who download data agree to cite a DOI. Good citation also rewards data-publishing institutions and individuals by reinforcing the value of sharing open data and demonstrating its impact to their funders.

https://docs.ropensci.org/rgbif/articles/gbif_citations.html
https://www.gbif.org/citation-guidelines

Unfortunately, when using the occurrence search, rather than the occurrence download, one does not receive a citable DOI.

Because occurrence search is easier for some users to use, we have created something called derived datasets, which allows users to create a citable DOI after they have pulled the data from the GBIF public API.

https://www.gbif.org/derived-dataset

As a package maintainer, GBIF would appreciate it if you could remind users, in the documentation or with warning messages, to cite GBIF-mediated data properly, perhaps by linking to one of these articles:

https://docs.ropensci.org/rgbif/articles/gbif_citations.html
https://www.gbif.org/citation-guidelines
https://www.gbif.org/derived-dataset

It is also important to remind users to keep the datasetKey column, because this allows proper attribution to the original data providers.
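As an illustration, a hedged sketch of the kind of reminder sp_occurrence() could print (the wording is hypothetical, not an existing message in geodata):

# hypothetical citation reminder, e.g. printed after a successful download
message(
  "Please cite GBIF-mediated data with a DOI; see\n",
  "https://www.gbif.org/citation-guidelines and consider registering a\n",
  "derived dataset (https://www.gbif.org/derived-dataset). Keep the\n",
  "'datasetKey' column to allow attribution to the original data providers."
)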

`gadm` silently downloads only the first element of `country`

The code below doesn't complain, but silently downloads only the first country:

countries <- gadm(country = c("Portugal", "Spain", "France"), level = 1, path = tempdir())
unique(countries$COUNTRY)
# [1] "Portugal"

It would be helpful to emit at least a warning message about that. Cheers!
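A workaround sketch until multi-country input is supported (or warned about): download each country separately and combine.

library(geodata)
cl <- lapply(c("Portugal", "Spain", "France"),
             function(cc) gadm(country = cc, level = 1, path = tempdir()))
countries <- do.call(rbind, cl)  # terra's rbind() combines SpatVectors
unique(countries$COUNTRY)
# expected: "Portugal" "Spain" "France"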

`sp_occurrence` obscure error when species not in GBIF

If we misspell a taxon name, sp_occurrence adequately returns 0 when download=FALSE, i.e. zero records on GBIF for the specified taxon. But if download=TRUE (the default), a puzzling error message appears:

sp_occurrence(genus = "Pathera", species = "leo", download = FALSE)
# [1] 0

sp_occurrence(genus = "Pathera", species = "leo")
# Error in sp_occurrence(genus = "Pathera", species = "leo") : 
  start <= end is not TRUE

Could it instead emit a message indicating what the problem is? Cheers!
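Until then, a small user-side guard (sketch) avoids the confusing failure:

library(geodata)
# with download = FALSE the function returns the record count, per above
n <- sp_occurrence(genus = "Pathera", species = "leo", download = FALSE)
if (n == 0) {
  stop("no GBIF records found for this taxon; check the spelling")
}
occ <- sp_occurrence(genus = "Pathera", species = "leo")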

elevation_3s() fails with "EXPIRED_CERTIFICATE"

Trying to run the elevation_3s() fails with:

Error in utils::download.file(url = url, destfile = filename, quiet = quiet,  : 
  cannot open URL 'https://srtm.csi.cgiar.org/wp-content/uploads/files/srtm_5x5/TIFF/srtm_44_06.zip'
download failed

When trying to access the download URL directly I get:

NET::ERR_CERT_DATE_INVALID
Subject: srtm.csi.cgiar.org
Issuer: R3
Expires on: 2 Nov 2022

Is there anyone we can contact to address this problem, and install a new certificate?

worldclim_country() and worldclim_tile() always fetch 30 arc-second rasters, despite "res" argument

I've been finding that the worldclim_country and worldclim_tile functions always download the 30 arc-second version of the rasters, despite specifying res to be something else. This happens even after I clean out the previously downloaded files and restart R.

library(geodata)
bios <- geodata::worldclim_country('MDG', var='bio', res=10, path=getwd())

bios

class       : SpatRaster 
dimensions  : 1740, 900, 19  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : 43, 50.5, -26, -11.5  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
source      : MDG_wc2.1_30s_bio.tif 
names       : wc2.1~bio_1, wc2.1~bio_2, wc2.1~bio_3, wc2.1~bio_4, wc2.1~bio_5, wc2.1~bio_6, ... 
min values  :    11.29167,     6.12500,    54.04762,    90.91286,        18.6,         2.0, ... 
max values  :    28.05000,    16.55833,    77.26189,   327.99295,        36.8,        21.2, ... 

Same thing for:

  • other variables (tmin, tmax, prec)
  • worldclim_tile

However, worldclim_global does get the correct resolution rasters when res is specified.

> sessionInfo()
R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8    LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                           LC_TIME=English_United States.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] geodata_0.4-11 terra_1.6-17  

loaded via a namespace (and not attached):
[1] compiler_4.2.1   tools_4.2.1      Rcpp_1.0.9       codetools_0.2-18

Question about the soil_world function

Sorry for taking your time, but I have the following problem when using the soil_world function. For example:
gph <- soil_world(var="bdod", depth=100, path=tempdir())
reports an error:
Error in soil_world(var = "bdod", depth = 100, path = tempdir()) :
file not yet available: bdod_60-100cm_mean_30s.tif
The above error occurs when depth is 100 or 200; it does not appear when depth is set to other values.

Reinstalling the package didn't work either. Looking forward to your reply.

'population' and 'elevation_global' invalid URLs

Hello,
The functions below are currently returning "no such file or directory" or "invalid 'url' argument" errors. Maybe their files have been renamed or (re)moved on the server?

elev <- elevation_global(res = 10, path = tempdir())
pop <- population(year = 2020, res = 10, path = tempdir())

How to use the new CMIP6 data with geodata

Hi

I am trying to adapt the example from here (https://rstudio-pubs-static.s3.amazonaws.com/224303_df34f170cd9144cda6477ae8232887f7.html), which uses the (now obsolete) CMIP5 data, to the latest WorldClim models (http://www.worldclim.org/) using the new CMIP6 data.

For retrieving the data, I am using the geodata package.

Here is the script as far as I could run it:

[edited to just show the relevant part]

library(geodata)
# Get climate data
currentEnv <- worldclim_global(var="bio", res=2.5, path=getwd())
currentEnv <- dropLayer(currentEnv, c("bio2", "bio3", "bio4", "bio10", "bio11", "bio13", "bio14", "bio15"))
#Error in (function (classes, fdef, mtable)  : 
#  unable to find an inherited method for function ‘dropLayer’ for signature ‘"SpatRaster"’

Could a working example of using the latest CMIP6 data with geodata be provided?
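The immediate error is that dropLayer() comes from the raster package and does not apply to terra's SpatRaster; subsetting by layer does the same job. A hedged sketch (dropping by layer position, since the layer names in the download differ from the plain "bio2" style; check names(currentEnv) first):

library(geodata)
currentEnv <- worldclim_global(var = "bio", res = 2.5, path = tempdir())
# keep all layers except bio 2, 3, 4, 10, 11, 13, 14, 15 (by position)
drop <- c(2, 3, 4, 10, 11, 13, 14, 15)
keep <- setdiff(seq_len(terra::nlyr(currentEnv)), drop)
currentEnv <- currentEnv[[keep]]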

Feature request: marine variables

Dear Robert,
By now 'geodata' already downloads a great set of environmental variables, but it's still missing the marine environment. When you get a chance for another enhancement, it would be great if 'geodata' could also access the Bio-ORACLE (https://bio-oracle.org/) dataset of present and future marine variables.
Cheers!

No data in 30s bioclim

When downloading 30 arcsecond WorldClim data using the cmip6_world function, the resulting raster files are empty (no data?). All values are NA or NaN.

future <- cmip6_world(model = "CanESM5",
                      ssp = "585",  
                      time = "2061-2080",
                      var = "bioc", 
                      res = 0.5,
                      path = ".")

summary(future)
     wc2_1            wc2_2            wc2_3            wc2_4            wc2_5            wc2_6            wc2_7       
 Min.   : NA      Min.   : NA      Min.   : NA      Min.   : NA      Min.   : NA      Min.   : NA      Min.   : NA     
 1st Qu.: NA      1st Qu.: NA      1st Qu.: NA      1st Qu.: NA      1st Qu.: NA      1st Qu.: NA      1st Qu.: NA     
 Median : NA      Median : NA      Median : NA      Median : NA      Median : NA      Median : NA      Median : NA     
 Mean   :NaN      Mean   :NaN      Mean   :NaN      Mean   :NaN      Mean   :NaN      Mean   :NaN      Mean   :NaN     
 3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA     
 Max.   : NA      Max.   : NA      Max.   : NA      Max.   : NA      Max.   : NA      Max.   : NA      Max.   : NA     
 NA's   :100352   NA's   :100352   NA's   :100352   NA's   :100352   NA's   :100352   NA's   :100352   NA's   :100352  
     wc2_8            wc2_9            wc2_10           wc2_11           wc2_12           wc2_13           wc2_14      
 Min.   : NA      Min.   : NA      Min.   : NA      Min.   : NA      Min.   : NA      Min.   : NA      Min.   : NA     
 1st Qu.: NA      1st Qu.: NA      1st Qu.: NA      1st Qu.: NA      1st Qu.: NA      1st Qu.: NA      1st Qu.: NA     
 Median : NA      Median : NA      Median : NA      Median : NA      Median : NA      Median : NA      Median : NA     
 Mean   :NaN      Mean   :NaN      Mean   :NaN      Mean   :NaN      Mean   :NaN      Mean   :NaN      Mean   :NaN     
 3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA     
 Max.   : NA      Max.   : NA      Max.   : NA      Max.   : NA      Max.   : NA      Max.   : NA      Max.   : NA     
 NA's   :100352   NA's   :100352   NA's   :100352   NA's   :100352   NA's   :100352   NA's   :100352   NA's   :100352  
     wc2_15           wc2_16           wc2_17           wc2_18           wc2_19      
 Min.   : NA      Min.   : NA      Min.   : NA      Min.   : NA      Min.   : NA     
 1st Qu.: NA      1st Qu.: NA      1st Qu.: NA      1st Qu.: NA      1st Qu.: NA     
 Median : NA      Median : NA      Median : NA      Median : NA      Median : NA     
 Mean   :NaN      Mean   :NaN      Mean   :NaN      Mean   :NaN      Mean   :NaN     
 3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA     
 Max.   : NA      Max.   : NA      Max.   : NA      Max.   : NA      Max.   : NA     
 NA's   :100352   NA's   :100352   NA's   :100352   NA's   :100352   NA's   :100352  

Downloads failing for model="GFDL-ESM4", ssp="585", res=2.5, time="2041-2060", tmin and tmax

I am trying to run the following lines of code:

cur.tmax <- cmip6_world(ssp="585", model=cur.gcm, time="2041-2060", var="tmax", res=2.5)
cur.tmin <- cmip6_world(ssp="585", model=cur.gcm, time="2041-2060", var="tmin", res=2.5)

and receive the following error:

trying URL 'https://geodata.ucdavis.edu/cmip6/2.5m/GFDL-ESM4/ssp585/wc2.1_2.5m_tmax_GFDL-ESM4_ssp585_2041-2060.tif'
download failed

(equivalent error for tmin).

Downloading those files directly from the URL also does not work. The precipitation variable works okay, and all other tmin and tmax files work for the specified time period and ssp using other GCMs.

`worldclim_country()` for more than one country

When the country argument for worldclim_country() has length > 1, only the first element is used and the others are silently ignored.

tavg <- geodata::worldclim_country(country = c("Luxembourg", "Netherlands", "Belgium"), var = "tavg", path = tempdir())
terra::plot(tavg[[1]])
terra::plot(vect(system.file("ex/lux.shp", package="terra")), add = TRUE)  # only Luxembourg was taken

I think there should be at least a warning about this, or (ideally) the function could take more than one country at a time.
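A workaround sketch in the meantime: fetch each country separately, then mosaic the rasters.

library(geodata)
ctrs <- c("Luxembourg", "Netherlands", "Belgium")
rl <- lapply(ctrs, function(cc) {
  worldclim_country(country = cc, var = "tavg", path = tempdir())
})
tavg <- do.call(terra::merge, rl)  # one raster covering all three countries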

worldclim_country is full of NaNs for US

It works for a smaller country like Iceland:
wc <- worldclim_country("Iceland", "bio", ".")
wc

class       : SpatRaster
dimensions  : 840, 1920, 12  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -28, -12, 60, 67  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source      : ISL_wc2.1_30s_tavg.tif
names       : ISL_wavg_1, ISL_wavg_2, ISL_wavg_3, ISL_wavg_4, ISL_wavg_5, ISL_wavg_6, ...
min values  :       -8.8,       -9.8,      -11.2,      -10.2,       -6.1,       -2.7, ...
max values  :        6.3,        5.9,        5.4,        5.9,        7.7,       10.8, ...

while
wc4 <- worldclim_country("US", "bio", ".")
wc4

class       : SpatRaster
dimensions  : 6540, 43140, 19  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -179.5, 180, 18.5, 73  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source      : USA_wc2.1_30s_bio.tif
names       : wc2.1bio_1, wc2.1bio_2, wc2.1bio_3, wc2.1bio_4, wc2.1bio_5, wc2.1bio_6, ...

Notice the lack of min and max values, and the extent that doesn't make sense. If I take the summary, it is nothing but NaNs:

summary(wc4)

 wc2.1_30s_bio_1  wc2.1_30s_bio_2  wc2.1_30s_bio_3
 Min.   : NA      Min.   : NA      Min.   : NA
 1st Qu.: NA      1st Qu.: NA      1st Qu.: NA
 Median : NA      Median : NA      Median : NA
 Mean   :NaN      Mean   :NaN      Mean   :NaN
 3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA
 Max.   : NA      Max.   : NA      Max.   : NA
 NA's   :100812   NA's   :100812   NA's   :100812
 wc2.1_30s_bio_4  wc2.1_30s_bio_5  wc2.1_30s_bio_6
 Min.   : NA      Min.   : NA      Min.   : NA
 1st Qu.: NA      1st Qu.: NA      1st Qu.: NA
 Median : NA      Median : NA      Median : NA
 Mean   :NaN      Mean   :NaN      Mean   :NaN
 3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA
 Max.   : NA      Max.   : NA      Max.   : NA
 NA's   :100812   NA's   :100812   NA's   :100812
 wc2.1_30s_bio_7  wc2.1_30s_bio_8  wc2.1_30s_bio_9
 Min.   : NA      Min.   : NA      Min.   : NA
 1st Qu.: NA      1st Qu.: NA      1st Qu.: NA
 Median : NA      Median : NA      Median : NA
 Mean   :NaN      Mean   :NaN      Mean   :NaN
 3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA
 Max.   : NA      Max.   : NA      Max.   : NA
 NA's   :100812   NA's   :100812   NA's   :100812
 wc2.1_30s_bio_10 wc2.1_30s_bio_11 wc2.1_30s_bio_12
 Min.   : NA      Min.   : NA      Min.   : NA
 1st Qu.: NA      1st Qu.: NA      1st Qu.: NA
 Median : NA      Median : NA      Median : NA
 Mean   :NaN      Mean   :NaN      Mean   :NaN
 3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA
 Max.   : NA      Max.   : NA      Max.   : NA
 NA's   :100812   NA's   :100812   NA's   :100812
 wc2.1_30s_bio_13 wc2.1_30s_bio_14 wc2.1_30s_bio_15
 Min.   : NA      Min.   : NA      Min.   : NA
 1st Qu.: NA      1st Qu.: NA      1st Qu.: NA
 Median : NA      Median : NA      Median : NA
 Mean   :NaN      Mean   :NaN      Mean   :NaN
 3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA
 Max.   : NA      Max.   : NA      Max.   : NA
 NA's   :100812   NA's   :100812   NA's   :100812
 wc2.1_30s_bio_16 wc2.1_30s_bio_17 wc2.1_30s_bio_18
 Min.   : NA      Min.   : NA      Min.   : NA
 1st Qu.: NA      1st Qu.: NA      1st Qu.: NA
 Median : NA      Median : NA      Median : NA
 Mean   :NaN      Mean   :NaN      Mean   :NaN
 3rd Qu.: NA      3rd Qu.: NA      3rd Qu.: NA
 Max.   : NA      Max.   : NA      Max.   : NA
 NA's   :100812   NA's   :100812   NA's   :100812
 wc2.1_30s_bio_19
 Min.   : NA
 1st Qu.: NA
 Median : NA
 Mean   :NaN
 3rd Qu.: NA
 Max.   : NA
 NA's   :100812
Warning message:
[summary] used a sample

'sp_occurrence' with 'geo=TRUE' still returns NA coordinates

The 'geo' argument says "only records that have a georeference (longitude and latitude values) will be downloaded", but this doesn't seem to be the case:

gf <- sp_occurrence(genus = "Daboia", species = "mauritanica", geo = F)
gt <- sp_occurrence(genus = "Daboia", species = "mauritanica", geo = T)
all.equal(gf, gt)  # TRUE
gt[ , c("lon", "lat")]  # NA coordinates still there

Cheers!

geodata downloads fail with 'Couldn't connect to server'

All geodata functions (at least the several ones I've tried) have been giving 'Couldn't connect to server' errors since yesterday. Looks like the server is down?

soil_ph <- soil_world(var = "phh2o", depth = 5, stat = "mean", path = tempdir())
# Error in file(con, "r") : 
#   cannot open the connection to 'https://geodata.ucdavis.edu/geodata/soil/soilgrids/files.txt'
# In addition: Warning message:
# In file(con, "r") :
#   URL 'https://geodata.ucdavis.edu/geodata/soil/soilgrids/files.txt': status was 'Couldn't connect to server'

gadm(country="FRA", level=1, path=tempdir())
# Error in file(con, "r") : 
#   cannot open the connection to 'https://geodata.ucdavis.edu/gadm/gadm3.6.txt'
# In addition: Warning message:
# In file(con, "r") :
#   URL 'https://geodata.ucdavis.edu/gadm/gadm3.6.txt': status was 'Couldn't connect to server'

`gadm` requires specifying `level`, though it should be 1 by default

The "Usage" section of the gadm() help file says level=1 by default, but the function fails if level isn't provided:

countries <- gadm(country = "Portugal", path = tempdir())

Error in gadm(country = "Portugal", path = tempdir()) : 
  provide a "level=" argument; levels can be 0, 1, or 2 for most countries, and higher for some

...whereas for world() the "Usage" defaults are working as expected. Cheers

soil types unavailable?

Hi, I'm trying to get the soil_world data, specifically to map soil classification in the Mediterranean and Anatolian regions.

No matter the soil type, I'm getting:

file not yet available: Gleysols_30s.tif

Feature request: Global 1-km Consensus Land Cover

Could you consider adding a procedure for downloading and processing rasters from EarthEnv, namely the "Global 1-km Consensus Land Cover"?
This resource provides rasters at a coarser resolution than ESA WorldCover, and there is no discrete classification into particular classes. Instead, it provides a set of 12 quantitative rasters, each of which represents the probability of a land area being in one of these classes.
This resource is widely used in research; the related article (DOI: 10.1111/geb.12182) has 353 citations.

Broken URLs in soil_af_isda()

Hello,

Considering the function soil_af_isda(), the URLs to download some of the variables appear to be broken, namely: bdr, clay, C.tot, N.tot, sand, silt and texture. All other variables are working as expected.

Thanks for your generosity.

`world()` returning zero geometries

This was working fine with 'geodata' version 0.4.1, but after updating to 0.4.3, world() now returns an object with no geometries or attributes:

world(resolution = 5, level = 0, path = tempdir())

The same thing is happening for other values of 'resolution' too. gadm() seems to keep working normally, though.

International and state borders in water create issues when determining centroids

I am able to get the center latitude and longitude of each US county and Canadian census geographic unit (CGU). However, there are some false counties that arise through the process, particularly in the Great Lakes region. In this region, there are some random polygons within the Great Lakes (state/international borders) that produce these false counties. Is there a way to remove these aquatic polygons so that I only get the central latitude and longitude of the land polygons?

library(ggplot2)
library(dplyr)
library(geodata)
library(sf)

# Gets the polygons of each US county and Canadian CGU

aa <- gadm(country = c('CAN', 'USA'), level = 2, path=tempdir()) %>%
  st_as_sf()

# Gets the central latitude and longitude of each county and CGU

centroids_aa <- aa %>% 
  st_centroid(of_largest_polygon = TRUE)

# Map of centroids (there are obviously incorrect counties and CGUs for sites within the Great Lakes) 

ggplot() +
  geom_sf(data = aa,
          mapping = aes(geometry = geometry),
          color = "black") +
  geom_sf(data = centroids_aa,
          mapping = aes(geometry = geometry),
          color = "black",
          fill = 'red',
          shape = 21,
          size = 2) +
  coord_sf(xlim = c(-93, -81.5), ylim = c(41.5, 50), expand = F) +
  theme_bw() +
  theme(plot.margin = grid::unit(c(2,2,0,2), "mm"),
        text = element_text(size = 16),
        axis.text.x = element_text(size = 14, color = "black"),
        axis.text.y = element_text(size = 14, color = "black"),
        panel.grid.major = element_blank())

Link to a Stack Overflow question on the same issue.

Bioclim variables mis-identified in worldclim_tile() output

I'm using worldclim_tile() to obtain Bioclim layers at 0.5-min resolution for the first time, and I've run into what seems to be incorrect labeling of the layers. Running

BClim <- worldclim_tile(var="bio", res=0.5, lat=33.75, lon=-118, path="data/")

Returns a RasterStack with layers labeled in order from 1 to 19. However, the values in those layers don't align with anything like what they should be, given the identity of the variables. I first noticed it in comparing bio12 (annual precipitation) and bio13 (precip in the wettest month) — the former should, arithmetically, always be larger than the latter. However, the range of values for the bio12 layer in the stack downloaded above is about 5 to 25, and the range for the bio13 layer is about 26 to 66. (Squinting at hist() outputs to get those.) Subtracting the bio13 layer values from bio12 layer values gets me between about -54 and -8.

Looking across the 19 layers with plot() shows ranges of values that look like they should correspond to Bioclim variables, but not the ones labeled.

plot(BClim)

[screenshot: plot() of the 19 downloaded layers]

As one final check, I went back to getData() from raster, which pulls down WorldClim 1.4 instead of 2.1, and obtained values that make much more sense for the labeled variables.

BClim2 <- getData(name="worldclim", var="bio", res=0.5, lat=33.75, lon=-118, path="data/")
plot(BClim2)

[screenshot: plot() of the WorldClim 1.4 layers]
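A quick numeric check along the same lines (a sketch, assuming BClim is the raster returned by worldclim_tile() above, and indexing by position since the labels themselves are in question): bio12 minus bio13 should never be negative.

library(terra)
d <- BClim[[12]] - BClim[[13]]           # annual precip minus wettest-month precip
terra::global(d, "range", na.rm = TRUE)  # the minimum should be >= 0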

Download failing for crop_spam()

Unable to download SPAM crop data:

z <- geodata::crop_spam(crop = "whea", path = tempdir(), var = "harv_area")

trying URL 'https://s3.amazonaws.com/mapspam/2010/v1.1/geotiff/spam2010v1r1_global_harv_area.geotiff.zip'
download failed

At least one of the GCMs has a different scale (range of data)

Dear Robert,

I just realised that one of the GCMs (among the 8 I checked) has a totally different range of data (it seems rescaled) than the current-time data, and also compared to the other GCMs, which would really cause an issue in climate change studies (projections). The problematic GCM is "BCC-CSM2-MR". I checked several others, c("CanESM5", "CNRM-CM6-1", "CNRM-ESM2-1", "IPSL-CM6A-LR", "MIROC-ES2L", "MIROC6", "MRI-ESM2-0"), and they were OK.

Cheers,
Babak

create 'path' if it doesn't exist

Not a big deal, but the functions with a 'path' argument throw an error if "dir.exists(path) is not TRUE". Unless there's a reason otherwise, it would be useful to add something like if (!dir.exists(path)) dir.create(path, recursive = TRUE) to them.

Corrupted `cmip6_files.txt` in `.check_cmip6`

I'm running a GIS practical using geodata and had an odd issue crop up twice: once myself during testing and once for a student. The cmip6_files.txt file used in .check_cmip6 somehow got truncated. I don't know the mechanism by which this happened, but the resulting empty file then causes the cmip6_* commands to fail with:

utils::read.table(tmpfile, sep = "_") :

Of course the ideal solution is to find out what truncated it (I don't think it is anything I or the student did, as it is quite an obscure file to accidentally truncate), but maybe .check_cmip6 could defensively wrap the file load in a try statement?

geodata/R/worldclim.R, lines 184 to 195 at commit 35c7cc8:

tmpfile <- file.path(tempdir(), "cmip6_files.txt")
if (!file.exists(tmpfile)) {
    suppressWarnings(try(utils::download.file(paste0(.c6url, "files.txt"), tmpfile, quiet=TRUE), silent=TRUE))
}
if (file.exists(tmpfile)) {
    ff <- utils::read.table(tmpfile, sep="_")
    i <- ff[,1] == var & ff[,2] == model & ff[,3] == paste0("ssp", ssp) & ff[,4] == paste0(time, ".tif")
    if (sum(i) != 1) {
        stop("This dataset is not available")
    }
}
}
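A defensive variant along those lines (a sketch, not the package's actual fix): wrap the read in try() so a truncated cmip6_files.txt is discarded rather than producing an obscure read.table() error.

# defensively read the cached file list; discard it if it is corrupted
ff <- try(utils::read.table(tmpfile, sep = "_"), silent = TRUE)
if (inherits(ff, "try-error") || nrow(ff) == 0) {
  unlink(tmpfile)  # remove the truncated cache so the next call re-downloads it
  stop("could not read the list of available CMIP6 files; please try again")
}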

cmip6_tile doesn't seem to work - HTTP status was '404 Not Found'

Based on the help file, I believe I've specified all of the inputs correctly, but the function does not seem to be able to retrieve any of these files:

cmip6_tile(-105, 70, "CanESM5", "245", "2041-2060", "bioc", path=tempdir(), res = 2.5)
trying URL 'https://geodata.ucdavis.edu/cmip6/tiles/CanESM5/ssp245/wc2.1_30s_bioc_CanESM5_ssp245_2041-2060_tile-3.tif'
Error in utils::download.file(url = url, destfile = filename, quiet = quiet, :
cannot open URL 'https://geodata.ucdavis.edu/cmip6/tiles/CanESM5/ssp245/wc2.1_30s_bioc_CanESM5_ssp245_2041-2060_tile-3.tif'
In addition: Warning messages:
1: In utils::download.file(url = url, destfile = filename, quiet = quiet, :
downloaded length 0 != reported length 282
2: In utils::download.file(url = url, destfile = filename, quiet = quiet, :
cannot open URL 'https://geodata.ucdavis.edu/cmip6/tiles/CanESM5/ssp245/wc2.1_30s_bioc_CanESM5_ssp245_2041-2060_tile-3.tif': HTTP status was '404 Not Found'
Error in .downloadDirect(turl, outfname[i], ...) : download failed

cmip6_tile(-105, 70, "CanESM5", "245", "2041-2060", "bioc", path=tempdir(), res = 10)
trying URL 'https://geodata.ucdavis.edu/cmip6/tiles/CanESM5/ssp245/wc2.1_30s_bioc_CanESM5_ssp245_2041-2060_tile-3.tif'
Error in utils::download.file(url = url, destfile = filename, quiet = quiet, :
cannot open URL 'https://geodata.ucdavis.edu/cmip6/tiles/CanESM5/ssp245/wc2.1_30s_bioc_CanESM5_ssp245_2041-2060_tile-3.tif'
In addition: Warning messages:
1: In utils::download.file(url = url, destfile = filename, quiet = quiet, :
downloaded length 0 != reported length 282
2: In utils::download.file(url = url, destfile = filename, quiet = quiet, :
cannot open URL 'https://geodata.ucdavis.edu/cmip6/tiles/CanESM5/ssp245/wc2.1_30s_bioc_CanESM5_ssp245_2041-2060_tile-3.tif': HTTP status was '404 Not Found'
Error in .downloadDirect(turl, outfname[i], ...) : download failed

cmip6_tile(-60, 45, "CanESM5", "245", "2041-2060", "bioc", path=tempdir(), res = 10)
trying URL 'https://geodata.ucdavis.edu/cmip6/tiles/CanESM5/ssp245/wc2.1_30s_bioc_CanESM5_ssp245_2041-2060_tile-17.tif'
Error in utils::download.file(url = url, destfile = filename, quiet = quiet, :
cannot open URL 'https://geodata.ucdavis.edu/cmip6/tiles/CanESM5/ssp245/wc2.1_30s_bioc_CanESM5_ssp245_2041-2060_tile-17.tif'
In addition: Warning messages:
1: In utils::download.file(url = url, destfile = filename, quiet = quiet, :
downloaded length 0 != reported length 282
2: In utils::download.file(url = url, destfile = filename, quiet = quiet, :
cannot open URL 'https://geodata.ucdavis.edu/cmip6/tiles/CanESM5/ssp245/wc2.1_30s_bioc_CanESM5_ssp245_2041-2060_tile-17.tif': HTTP status was '404 Not Found'
Error in .downloadDirect(turl, outfname[i], ...) : download failed

cmip6_tile(0, 0, "CanESM5", "245", "2041-2060", "bioc", path=tempdir(), res = 10)
trying URL 'https://geodata.ucdavis.edu/cmip6/tiles/CanESM5/ssp245/wc2.1_30s_bioc_CanESM5_ssp245_2041-2060_tile-43.tif'
Error in utils::download.file(url = url, destfile = filename, quiet = quiet, :
cannot open URL 'https://geodata.ucdavis.edu/cmip6/tiles/CanESM5/ssp245/wc2.1_30s_bioc_CanESM5_ssp245_2041-2060_tile-43.tif'
In addition: Warning messages:
1: In utils::download.file(url = url, destfile = filename, quiet = quiet, :
downloaded length 0 != reported length 282
2: In utils::download.file(url = url, destfile = filename, quiet = quiet, :
cannot open URL 'https://geodata.ucdavis.edu/cmip6/tiles/CanESM5/ssp245/wc2.1_30s_bioc_CanESM5_ssp245_2041-2060_tile-43.tif': HTTP status was '404 Not Found'
Error in .downloadDirect(turl, outfname[i], ...) : download failed

feature request: queries for country_codes

I have a suggestion for the convenience function country_codes. It works fine as is, but it might be nice to allow users to query it directly for a particular country or region.

Here's my idea:

country_codes <- function(query = NULL) {
    path <- system.file(package = "geodata")
    # d <- utils::read.csv(paste(path, "/ex/countries.csv", sep=""), stringsAsFactors=FALSE, encoding="UTF-8")
    res <- readRDS(file.path(path, "ex/countries.rds"))
    if (is.null(query)) {
        res
    } else {
        hits <- apply(res, 1, function(x) any(grepl(query, x, ignore.case = TRUE)))
        res[hits, ]
    }
}

The default behaviour is unchanged. But if I want to quickly look up a particular location, I can do:

> country_codes("mexico")
      NAME ISO3 ISO2 NAME_ISO NAME_FAO                      NAME_LOCAL
145 Mexico  MEX   MX   MEXICO   Mexico México|Estados Unidos Mexicanos
    SOVEREIGN       UNREGION1 UNREGION2     continent
145    México Central America  Americas North America

country_codes("mx")
      NAME ISO3 ISO2 NAME_ISO NAME_FAO                      NAME_LOCAL
145 Mexico  MEX   MX   MEXICO   Mexico México|Estados Unidos Mexicanos
    SOVEREIGN       UNREGION1 UNREGION2     continent
145    México Central America  Americas North America

> country_codes("samoa")
              NAME ISO3 ISO2       NAME_ISO       NAME_FAO     NAME_LOCAL
6   American Samoa  ASM   AS AMERICAN SAMOA American Samoa American Samoa
197          Samoa  WSM   WS          SAMOA          Samoa          Samoa
        SOVEREIGN UNREGION1 UNREGION2 continent
6   United States Polynesia   Oceania   Oceania
197         Samoa Polynesia   Oceania   Oceania

> country_codes("guinea")
                 NAME ISO3 ISO2          NAME_ISO          NAME_FAO
71  Equatorial Guinea  GNQ   GQ EQUATORIAL GUINEA Equatorial Guinea
96      Guinea-Bissau  GNB   GW     GUINEA-BISSAU     Guinea-Bissau
97             Guinea  GIN   GN            GUINEA            Guinea
175  Papua New Guinea  PNG   PG  PAPUA NEW GUINEA  Papua New Guinea
           NAME_LOCAL         SOVEREIGN      UNREGION1 UNREGION2 continent
71  Guinea Ecuatorial Equatorial Guinea  Middle Africa    Africa    Africa
96       Guiné-Bissau     Guinea-Bissau Western Africa    Africa    Africa
97             Guinee            Guinea Western Africa    Africa    Africa
175    Papua Niu Gini  Papua New Guinea      Melanesia   Oceania   Oceania

Not anything you can't do with an extra line or two of code, but maybe useful nevertheless?

Thanks for all your work, I'm really enjoying terra et al. as I update my scripts!

Best,

Tyler
