brry / rdwd
download climate data from DWD (German Weather Service)
Home Page: https://bookdown.org/brry/rdwd
Hello, I am struggling to download data from "ftp://opendata.dwd.de/climate_environment/CDC/grids_germany".
Am I doing something fundamentally wrong?
Thanks for help!
# Indexing
radfiles <- indexFTP("daily/Project_TRY/air_temperature_mean/",
                     base = gridbase, sleep = 1)
# Remove PDF files
radfiles <- radfiles[-(1:2)]
If I do not prepend the full URL, I get an error message that dataDWD() wants a URL beginning with ftp://...
radfiles <- paste('ftp://opendata.dwd.de/climate_environment/CDC/grids_germany/',
                  radfiles, sep = '')
# Pick smaller subset for speed
radfiles <- radfiles[1:3]
data <- dataDWD(radfiles, dir = './DWDdata/Data/', sleep = 5)
And this is the output I am getting:
dataDWD -> dirDWD: adding to directory 'C:/Users/Massimo/Documents/DWDdata/Data'
dataDWD -> newFilename: creating 3 files:
C:/Users/Massimo/Documents/DWDdata/Data/ftp:__opendata.dwd.de_climate_environment_CDC_grids_germany_daily_Project_TRY_air_temperature_mean__TT_199501_daymean.nc.gz
C:/Users/Massimo/Documents/DWDdata/Data/ftp:__opendata.dwd.de_climate_environment_CDC_grids_germany_daily_Project_TRY_air_temperature_mean__TT_199502_daymean.nc.gz
(and 1 more)
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 11s
Error: in dataDWD -> readDWD -> checkFile : The 3 files C:/Users/Massimo/Documents/DWDdata/Data/ftp:__opendata.dwd.de_climate_environment_CDC_grids_germany_daily_Project_TRY_air_temperature_mean__TT_199501_daymean.nc.gz, C:/Users/Massimo/Documents/DWDdata/Data/ftp:__opendata.dwd.de_climate_environment_CDC_grids_germany_daily_Project_TRY_air_temperature_mean__TT_199502_daymean.nc.gz (and 1 others)
do not exist. Current getwd: C:/Users/Massimo/Documents/DWDdata/Data
Consider the implications of setting the default call to download.file with mode="wb".
https://bookdown.org/brry/rdwd/raster-data.html#binary-file-errors
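For reference, a sketch of how this download might work without manually pasting the base URL, assuming dataDWD's base and joinbf arguments behave as documented (joinbf=TRUE joins base and file for the download while keeping a sanitized local filename that readDWD can then locate):

```r
library(rdwd)
# index the grid folder (gridbase is exported by rdwd):
radfiles <- indexFTP("daily/Project_TRY/air_temperature_mean/",
                     base = gridbase, sleep = 1)
radfiles <- radfiles[-(1:2)]   # drop description files
radfiles <- radfiles[1:3]      # small subset for speed
# base + joinbf instead of paste(): download and local filename stay consistent
data <- dataDWD(radfiles, base = gridbase, joinbf = TRUE,
                dir = "./DWDdata/Data", sleep = 5)
```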
First of all, thanks for this amazing package!
I only found a small inconvenience when downloading data with force not being TRUE but a vector such as c(Inf, 6). I left a comment on a PR at the corresponding line where I think the bug is: f45b080#r136216003
Dear Berry,
I noticed that the historical link for the air_temperature variable is not working:
selectDWD(id = 5906,res = "hourly",var = 'air_temperature',per = 'h')
"ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/historical/stundenwerte_TU_05906_19480101_20191231_hist.zip"
The above FTP link does not exist on the DWD OpenData FTP portal.
When I checked the FTP data portal, the data up to the end of 2020 has now been moved to the historical folder:
ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/historical/stundenwerte_TU_05906_19480101_20201231_hist.zip
This link works. Can you update the selectDWD function to give appropriate links?
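Until the shipped index catches up, a possible workaround (assuming selectDWD's current argument still triggers a fresh indexing of the FTP folder instead of using the packaged fileIndex) could be:

```r
library(rdwd)
# current=TRUE re-reads the FTP folder, so the returned link
# reflects the file currently on the server:
link <- selectDWD(id = 5906, res = "hourly", var = "air_temperature",
                  per = "historical", current = TRUE)
clim <- dataDWD(link)
```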
Dear brry,
first, thanks so much for the handy package! I started using it today and encountered an error whose origins I could not determine. The following is my code:
download <- selectDWD(id = stations, res = "daily", outvec = TRUE)
res1 <- dataDWD(file = download, dir = "Y:/This/Is/My/Path/dataDWD")
R keeps telling me:
dataDWD -> dirDWD: adding to directory 'Y:/This/Is/My/Path/dataDWD'
Error: dataDWD -> newFilename: The following folder does not exist: ''
This happens even when I use:
res1 <- dataDWD(file = download)
Then it creates the folder correctly at the current getwd() location, but then refuses to fill it.
Do you have any idea what might help?
In all documentation SeeAlso sections, point to the appropriate part of the website.
Hi Berry,
thanks for your nice package, it simplifies access to DWD database a lot.
When playing around with monthly "kl" data I realized that there are multiple files for the same station ID on the server (usually with overlapping time periods), and there are even differences between the fileIndex of the package and an index created using the function indexFTP. Here's a reprex to show what I mean:
suppressWarnings(library("tidyverse"))
suppressWarnings(library("rdwd"))
data(fileIndex)
ind1 = fileIndex %>%
filter(res == "monthly" & var == "kl" & per == "historical") %>%
filter(str_detect(string = path, pattern = ".\\.zip$")) %>%
as_tibble()
ind1 %>% filter(gdata::duplicated2(id))
#> # A tibble: 10 x 8
#> res var per id start end ismeta path
#> <chr> <chr> <chr> <int> <date> <date> <lgl> <chr>
#> 1 month~ kl histo~ 116 1899-06-01 1966-12-31 FALSE monthly/kl/historical~
#> 2 month~ kl histo~ 116 1899-06-01 1973-05-31 FALSE monthly/kl/historical~
#> 3 month~ kl histo~ 896 1963-01-01 2018-12-31 FALSE monthly/kl/historical~
#> 4 month~ kl histo~ 896 1991-01-01 2018-12-31 FALSE monthly/kl/historical~
#> 5 month~ kl histo~ 1568 1958-05-01 1970-11-30 FALSE monthly/kl/historical~
#> 6 month~ kl histo~ 1568 1958-05-01 1982-12-31 FALSE monthly/kl/historical~
#> 7 month~ kl histo~ 1762 1897-01-01 2006-12-31 FALSE monthly/kl/historical~
#> 8 month~ kl histo~ 1762 1991-01-01 2006-12-31 FALSE monthly/kl/historical~
#> 9 month~ kl histo~ 5424 1949-03-01 2018-12-31 FALSE monthly/kl/historical~
#> 10 month~ kl histo~ 5424 2007-06-01 2018-12-31 FALSE monthly/kl/historical~
ind2 = indexFTP(folder = "monthly/kl/historical", quiet = TRUE) %>%
str_subset(pattern = ".\\.zip$") %>%
enframe() %>%
mutate(id = str_sub(string = value, start = 38, end = 42),
id = as.numeric(id))
ind2 %>% filter(gdata::duplicated2(id))
#> # A tibble: 51 x 3
#> name value id
#> <int> <chr> <dbl>
#> 1 46 monthly/kl/historical/monatswerte_KL_00231_18790101_19740430_his~ 231
#> 2 47 monthly/kl/historical/monatswerte_KL_00231_18810101_19740430_his~ 231
#> 3 85 monthly/kl/historical/monatswerte_KL_00410_19960601_20181130_his~ 410
#> 4 86 monthly/kl/historical/monatswerte_KL_00410_19960601_20191231_his~ 410
#> 5 149 monthly/kl/historical/monatswerte_KL_00729_19030701_19801231_his~ 729
#> 6 150 monthly/kl/historical/monatswerte_KL_00729_19030701_19900131_his~ 729
#> 7 164 monthly/kl/historical/monatswerte_KL_00840_19891201_20191231_his~ 840
#> 8 165 monthly/kl/historical/monatswerte_KL_00840_19950301_20191231_his~ 840
#> 9 187 monthly/kl/historical/monatswerte_KL_00954_19980101_20190531_his~ 954
#> 10 188 monthly/kl/historical/monatswerte_KL_00954_19980101_20191231_his~ 954
#> # ... with 41 more rows
Created on 2020-06-04 by the reprex package (v0.3.0)
devtools::session_info()
#> - Session info ---------------------------------------------------------------
#> setting value
#> version R version 3.6.1 (2019-07-05)
#> os Windows 10 x64
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate German_Germany.1252
#> ctype German_Germany.1252
#> tz Europe/Berlin
#> date 2020-06-04
#>
#> - Packages -------------------------------------------------------------------
#> package * version date lib source
#> abind 1.4-5 2016-07-21 [1] CRAN (R 3.6.0)
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.2)
#> backports 1.1.7 2020-05-13 [1] CRAN (R 3.6.3)
#> berryFunctions 1.18.2 2019-05-01 [1] CRAN (R 3.6.3)
#> bitops 1.0-6 2013-08-17 [1] CRAN (R 3.6.0)
#> blob 1.2.1 2020-01-20 [1] CRAN (R 3.6.3)
#> broom 0.5.6 2020-04-20 [1] CRAN (R 3.6.3)
#> callr 3.4.3 2020-03-28 [1] CRAN (R 3.6.3)
#> cellranger 1.1.0 2016-07-27 [1] CRAN (R 3.6.1)
#> cli 2.0.2 2020-02-28 [1] CRAN (R 3.6.3)
#> colorspace 1.4-1 2019-03-18 [1] CRAN (R 3.6.1)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.1)
#> DBI 1.1.0 2019-12-15 [1] CRAN (R 3.6.3)
#> dbplyr 1.4.4 2020-05-27 [1] CRAN (R 3.6.3)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.1)
#> devtools 2.3.0 2020-04-10 [1] CRAN (R 3.6.3)
#> digest 0.6.25 2020-02-23 [1] CRAN (R 3.6.3)
#> dplyr * 1.0.0 2020-05-29 [1] CRAN (R 3.6.3)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 3.6.3)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.1)
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 3.6.3)
#> forcats * 0.5.0 2020-03-01 [1] CRAN (R 3.6.3)
#> fs 1.4.1 2020-04-04 [1] CRAN (R 3.6.3)
#> gdata 2.18.0 2017-06-06 [1] CRAN (R 3.6.3)
#> generics 0.0.2 2018-11-29 [1] CRAN (R 3.6.1)
#> ggplot2 * 3.3.1 2020-05-28 [1] CRAN (R 3.6.3)
#> glue 1.4.1 2020-05-13 [1] CRAN (R 3.6.3)
#> gtable 0.3.0 2019-03-25 [1] CRAN (R 3.6.1)
#> gtools 3.8.2 2020-03-31 [1] CRAN (R 3.6.3)
#> haven 2.3.1 2020-06-01 [1] CRAN (R 3.6.3)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.6.1)
#> hms 0.5.3 2020-01-08 [1] CRAN (R 3.6.3)
#> htmltools 0.4.0 2019-10-04 [1] CRAN (R 3.6.1)
#> httr 1.4.1 2019-08-05 [1] CRAN (R 3.6.1)
#> jsonlite 1.6.1 2020-02-02 [1] CRAN (R 3.6.3)
#> knitr 1.28 2020-02-06 [1] CRAN (R 3.6.3)
#> lattice 0.20-38 2018-11-04 [2] CRAN (R 3.6.1)
#> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 3.6.3)
#> lubridate 1.7.8 2020-04-06 [1] CRAN (R 3.6.3)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.1)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.1)
#> modelr 0.1.8 2020-05-19 [1] CRAN (R 3.6.3)
#> munsell 0.5.0 2018-06-12 [1] CRAN (R 3.6.1)
#> nlme 3.1-140 2019-05-12 [2] CRAN (R 3.6.1)
#> pbapply 1.4-2 2019-08-31 [1] CRAN (R 3.6.1)
#> pillar 1.4.4 2020-05-05 [1] CRAN (R 3.6.3)
#> pkgbuild 1.0.8 2020-05-07 [1] CRAN (R 3.6.3)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.1)
#> pkgload 1.1.0 2020-05-29 [1] CRAN (R 3.6.3)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 3.6.3)
#> processx 3.4.2 2020-02-09 [1] CRAN (R 3.6.3)
#> ps 1.3.3 2020-05-08 [1] CRAN (R 3.6.3)
#> purrr * 0.3.4 2020-04-17 [1] CRAN (R 3.6.3)
#> R6 2.4.1 2019-11-12 [1] CRAN (R 3.6.1)
#> Rcpp 1.0.4.6 2020-04-09 [1] CRAN (R 3.6.3)
#> RCurl 1.98-1.2 2020-04-18 [1] CRAN (R 3.6.3)
#> rdwd * 1.3.1 2020-02-18 [1] CRAN (R 3.6.3)
#> readr * 1.3.1 2018-12-21 [1] CRAN (R 3.6.1)
#> readxl 1.3.1 2019-03-13 [1] CRAN (R 3.6.1)
#> remotes 2.1.1 2020-02-15 [1] CRAN (R 3.6.3)
#> reprex 0.3.0 2019-05-16 [1] CRAN (R 3.6.1)
#> rlang 0.4.6 2020-05-02 [1] CRAN (R 3.6.3)
#> rmarkdown 2.2 2020-05-31 [1] CRAN (R 3.6.3)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.1)
#> rvest 0.3.5 2019-11-08 [1] CRAN (R 3.6.1)
#> scales 1.1.1 2020-05-11 [1] CRAN (R 3.6.3)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.1)
#> stringi 1.4.6 2020-02-17 [1] CRAN (R 3.6.2)
#> stringr * 1.4.0 2019-02-10 [1] CRAN (R 3.6.1)
#> testthat 2.3.2 2020-03-02 [1] CRAN (R 3.6.3)
#> tibble * 3.0.1 2020-04-20 [1] CRAN (R 3.6.3)
#> tidyr * 1.1.0 2020-05-20 [1] CRAN (R 3.6.3)
#> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 3.6.3)
#> tidyverse * 1.3.0 2019-11-21 [1] CRAN (R 3.6.3)
#> usethis 1.6.1 2020-04-29 [1] CRAN (R 3.6.3)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 3.6.1)
#> vctrs 0.3.0 2020-05-11 [1] CRAN (R 3.6.3)
#> withr 2.2.0 2020-04-20 [1] CRAN (R 3.6.3)
#> xfun 0.14 2020-05-20 [1] CRAN (R 3.6.3)
#> xml2 1.3.2 2020-04-23 [1] CRAN (R 3.6.3)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 3.6.3)
#>
#> [1] C:/Users/hendr/Documents/R/win-library/3.6
#> [2] C:/Program Files/R/R-3.6.1/library
Any idea why this happens? Content-wise this is probably more a question for the DWD, I assume, but why the indexFTP result differs from the index provided with the package might be a question for you.
Thanks for any feedback you might have on this issue.
Best,
Hendrik
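Not an explanation for the duplicates, but as a stopgap one could keep only the file with the latest end date per station, sketched here in the tidyverse style of the reprex above (slice_max/slice_min require dplyr >= 1.0.0):

```r
library(dplyr)
library(rdwd)
data(fileIndex)
ind <- fileIndex %>%
  filter(res == "monthly", var == "kl", per == "historical",
         grepl("\\.zip$", path))
# per station id keep the row(s) with the latest end date,
# then break remaining ties by the earliest start date:
dedup <- ind %>%
  group_by(id) %>%
  slice_max(end, with_ties = TRUE) %>%
  slice_min(start, with_ties = FALSE) %>%
  ungroup()
```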
Hi
thank you for your great package!
A suggestion for improvement: the res="10_minutes" option for selectDWD works, but there are no minutes in the MESS_DATUM column right now (all set to zero, instead of :00, :10, :20, :30, :40, :50).
library(rdwd)
link <- selectDWD("Kiel-Holtenau", res="10_minutes", var="air_temperature", per="recent")
file <- dataDWD(link, read=FALSE)
air_temperature <- readDWD(file, varnames=TRUE)
air_temperature
Maybe the fix is at line 250 in readDWD.R, adding:
if(nch==12) "%Y%m%d%H%M" # for 201804270000
https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/10_minutes/
Kind regards
Christof
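Until the format string is handled in readDWD, the minutes can be recovered by parsing the raw 12-character timestamps with the format suggested above; a minimal sketch (the example values are hypothetical):

```r
# parse raw 12-character MESS_DATUM strings including minutes:
x <- c("201804270000", "201804270010", "201804270020")  # hypothetical raw values
as.POSIXct(x, format = "%Y%m%d%H%M", tz = "GMT")
# to obtain raw values, read the unzipped file with
# read.table(..., colClasses = "character") before any date conversion
```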
Consider whether to use R.cache for dir management.
https://stat.ethz.ch/pipermail/r-package-devel/2019q4/004783.html
Open questions:
getwd()
Add use case to website
library(rdwd)
# select data
index <- indexFTP(folder="annual/air_temperature_max", base=gridbase)
index <- index[-(1:2)] # exclude description files
index <- index[as.numeric(substr(index,62,65))>=2013] # after year 2013
index
# download & read data:
tempmax <- dataDWD(index, base=gridbase, joinbf=TRUE)
names(tempmax) <- substr(names(tempmax), 62, 65)
# visual data & projection check:
plotRadar(tempmax[[1]], proj="seasonal",extent="seasonal",
main="Annual grid of monthly averaged daily maximum air temperature (2m) - 2013")
# raster stack
tempmax_stack <- raster::stack(tempmax)
tempmax_stack <- projectRasterDWD(tempmax_stack, proj="seasonal",extent="seasonal")
tempmax_stack
raster::plot(tempmax_stack, zlim=range(raster::cellStats(tempmax_stack, range)) )
loc <- data.frame(x=12.65295, y=53.06547) # Aussichtspunkt Kyritz-Ruppiner Heide
raster::extract(tempmax_stack, loc)
dataDWD and readDWD shall gain the argument hr: an integer code to automatically merge historical and recent datasets. If set, returns a data.frame instead of a list.
See this use case.
0 (default): ignore this argument
1: message("merging n files") + sort by hr (if hr is given) + merge + remove rownames.
2: also remove duplicated dates from recent
3: also remove columns QN3,QN4,eor
4: also remove column STATIONS_ID
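Until such an argument exists, the hr=2 behaviour could be approximated manually; a sketch assuming hist and recent are data.frames as returned by readDWD, both with a MESS_DATUM column:

```r
# merge historical and recent station data, dropping dates from
# 'recent' that are already covered by 'hist' (the hr=2 behaviour):
merge_hr <- function(hist, recent) {
  recent <- recent[!recent$MESS_DATUM %in% hist$MESS_DATUM, ]
  out <- rbind(hist, recent)
  rownames(out) <- NULL  # the hr=1 cleanup step
  out
}
```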
Sub-daily datasets are not extracted correctly, as seen in my example.
library(rdwd)
link <- selectDWD(id=10381, res="subdaily", var="standard_format", per="historical")
file <- dataDWD(link, read=FALSE)
clim <- readDWD(file, varnames=FALSE)
Hi Berry,
With the help of your package one can download data from
ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate.
Is there a way to download data from
ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/phenology/ ?
How can I tell rdwd to look into that folder instead?
Thank you very much,
Best regards,
Etienne
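The base argument of indexFTP and dataDWD should make this possible; a sketch (the phenology subfolder name used here is an assumption, and selectDWD itself won't help since its indexes only cover the climate folder):

```r
library(rdwd)
phenbase <- "ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/phenology"
# list files in one subfolder (the folder name is an assumption):
phenfiles <- indexFTP("annual_reporters/crops", base = phenbase, dir = tempdir())
# joinbf=TRUE combines base and file name into the download URL:
files <- dataDWD(phenfiles[1], base = phenbase, joinbf = TRUE,
                 dir = tempdir(), read = FALSE)
```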
Hi,
selectDWD loads wrong filenames.
link <- selectDWD(res="daily", var="kl", per="rh", outvec = TRUE)
Due to the following?
https://www.dwd.de/EN/climate_environment/cdc/cdc_node.html
CDC OpenData
Due to DWD data service restructuring, the CDC-FTP Server is moved to DWD OpenData, allowing access via ftp and https.
The new access is: https://opendata.dwd.de/climate_environment/CDC/ and ftp://opendata.dwd.de/climate_environment/CDC/.
Most data sets, especially the German station data, have been copied already and are now updated synchronously with the FTP-server. For a period of transition, both the new and the old addresses are valid.
Deadline for termination of address ftp-cdc is: 1st June 2019.
First: After many years I finally managed to wrap my head around this DWD data stuff and got something useful (for me) running.
So thank you very much for making these strangely complex datasets available to the (nearly) normal people.
But now that I want to put it into practice, I realize that there seem to be problems downloading historical datasets. This case was not foreseen and now my script got stuck:
dataDWD -> dirDWD: adding to directory '/home/bernd/Projekte/R/rdwd'
dataDWD: 1 file already existing and not downloaded again: 'hourly_air_temperature_historical_stundenwerte_TU_05149_20050101_20221004_hist.zip'
Now downloading 3 files...
dataDWD -> newFilename: creating 3 files:
/home/bernd/Projekte/R/rdwd/hourly_air_temperature_historical_stundenwerte_TU_02600_20050301_20221231_hist.zip
/home/bernd/Projekte/R/rdwd/hourly_air_temperature_historical_stundenwerte_TU_05705_19480101_20221231_hist.zip
(and 1 more)
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01s
Warning message:
3 Downloads have failed (out of 4). Setting read=FALSE. download.file errors:
cannot open URL 'ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/historical/stundenwerte_TU_02600_20050301_20221231_hist.zip'
cannot open URL 'ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/historical/stundenwerte_TU_05705_19480101_20221231_hist.zip'
cannot open URL 'ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/historical/stundenwerte_TU_01107_20041201_20221231_hist.zip'
Actually, the zips are there and can be downloaded manually.
I only ran into this issue late, because I developed the script with recent data only.
Queries for historical weather records using selectDWD return URLs that cannot be opened. In the past those URLs worked, but they seem outdated now. I am not able to figure out where these URLs are generated, but perhaps the example can help in solving the issue.
## example
library(rdwd)
library(magrittr)
stations <- c("Belm", "Salzuflen, Bad", "Bielefeld-Deppendorf")
url <- lapply(stations, selectDWD, res = 'daily', var = 'kl', per = 'h')
url
All URLs end with 20171231 and do not exist as shown below for the first.
[[1]]
[1] "ftp://ftp-cdc.dwd.de/pub/CDC/observations_germany/climate/daily/kl/historical/tageswerte_KL_00342_20101201_20171231_hist.zip"
[[2]]
[1] "ftp://ftp-cdc.dwd.de/pub/CDC/observations_germany/climate/daily/kl/historical/tageswerte_KL_04371_19350101_20171231_hist.zip"
[[3]]
[1] "ftp://ftp-cdc.dwd.de/pub/CDC/observations_germany/climate/daily/kl/historical/tageswerte_KL_07106_20060901_20171231_hist.zip"
dataDWD(url[[1]], quiet = T)
Error in download.file(url = file[i], destfile = outfile[i], quiet = TRUE, :
cannot open URL 'ftp://ftp-cdc.dwd.de/pub/CDC/observations_germany/climate/daily/kl/historical/tageswerte_KL_00342_20101201_20171231_hist.zip'
In addition: Warning message:
In download.file(url = file[i], destfile = outfile[i], quiet = TRUE, :
InternetOpenUrl failed: ''
The correct URLs end with 20181231, and dataDWD can handle those:
out <- lapply(url, stringr::str_replace, "20171231", "20181231") %>%
lapply(., dataDWD, quiet = T)
head(out[[1]])
STATIONS_ID MESS_DATUM QN_3 FX FM QN_4 RSK RSKF SDK SHK_TAG NM VPM PM TMK UPM TXK TNK TGK eor
1 342 2010-12-01 10 NA NA 3 0.0 4 3.0 NA NA NA NA -7.0 NA -4.9 -9.1 -9.2 eor
2 342 2010-12-02 10 7.0 NA 3 0.5 4 0.0 0 7.7 3.1 996.5 -7.4 90 -6.5 -8.6 -8.2 eor
3 342 2010-12-03 10 7.6 NA 3 0.1 4 0.0 1 6.9 3.4 998.0 -5.6 85 -3.9 -7.5 -9.7 eor
4 342 2010-12-04 10 12.0 NA 3 4.7 4 3.6 1 5.0 4.1 1000.5 -3.2 84 -0.8 -4.9 -6.6 eor
5 342 2010-12-05 10 13.1 NA 3 2.5 4 0.0 3 7.8 6.2 989.3 0.7 97 1.4 -1.2 -1.1 eor
6 342 2010-12-06 10 5.5 NA 3 0.1 4 0.0 1 7.9 5.8 989.7 0.2 95 1.6 -4.5 -6.3 eor
From sessionInfo():
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] magrittr_1.5 rdwd_0.11.0
In the below example, the standard data import works as expected
lnk = selectDWD(id = 3379, res = "hourly", var = "air_temperature", per = "r")
dat1 = dataDWD(lnk, dir = file.path(tempdir(), "DWDdata")) # works
whereas fast reading using fread = TRUE fails
dat2 = dataDWD(lnk, dir = file.path(tempdir(), "DWDdata"), fread = TRUE) # error
with the following error message
Error in data.table::fread(paste("unzip -p", file, fp), na.strings = na9(), :
freadMain: NAstring << -9999>> has whitespace at the beginning or end
Here's my sessionInfo():
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C LC_TIME=German_Germany.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] mapview_2.4.8 rdwd_0.10.2 DWD_0.1.0 sf_0.6-3
loaded via a namespace (and not attached):
[1] xfun_0.3 pbapply_1.3-4 lattice_0.20-35 colorspace_1.3-2 htmltools_0.3.6
[6] stats4_3.5.1 viridisLite_0.3.0 yaml_2.1.19 base64enc_0.1-3 e1071_1.6-8
[11] later_0.7.3 withr_2.1.2 DBI_1.0.0 sp_1.3-1 RColorBrewer_1.1-2
[16] plyr_1.8.4 munsell_0.5.0 raster_2.6-7 htmlwidgets_1.2 devtools_1.13.6
[21] memoise_1.1.0 latticeExtra_0.6-28 knitr_1.20 httpuv_1.4.4.2 crosstalk_1.0.0
[26] class_7.3-14 Rcpp_0.12.17 xtable_1.8-2 promises_1.0.1 scales_0.5.0
[31] classInt_0.2-3 satellite_1.0.1 plotrix_3.7-2 leaflet_2.0.1 webshot_0.5.0
[36] jsonlite_1.5 abind_1.4-5 mime_0.5 png_0.1-7 digest_0.6.15
[41] Orcs_1.0.0 bookdown_0.7 shiny_1.1.0 berryFunctions_1.17.0 grid_3.5.1
[46] RPostgreSQL_0.6-2 rgdal_1.3-3 tools_3.5.1 bitops_1.0-6 magrittr_1.5
[51] RCurl_1.95-4.11 data.table_1.11.4 spData_0.2.9.0 R6_2.2.2 units_0.6-0
[56] compiler_3.5.1
nwp_t2m_base <- "ftp://opendata.dwd.de/weather/nwp/icon-d2/grib/03/p"
nwp_urls <- indexFTP("", base=nwp_t2m_base, dir=tempdir())
nwp_file <- dataDWD(nwp_urls[6], base=nwp_t2m_base, dir=tempdir(), joinbf=TRUE, dbin=TRUE, read=FALSE)
nwp_data <- readDWD(nwp_file, quiet=TRUE)
Fails with this message from rgdal::readGDAL:
**.grib2 is a grib file, but no raster dataset was successfully identified.
Rename localtestdir to the shorter locdir.
Use it in the website, examples, etc.
Move it to a location outside the 'rdwd' package folder and outside Dropbox.
readDWD has the argument fread to read datasets through data.table::fread, which is significantly faster than base unzip + read.table.
Early on (2017), the default was fread=NA (i.e. conditional on the availability of data.table).
Some users sent emails about errors on Windows OS so I changed the default to FALSE (272b947).
For speedup, it would be nice to have the default NA again.
This will be set experimentally to see if new complaints arise.
With the new 5-minute data (April 2022), the fileIndex and derived indexes are getting very big.
I think rdwd cannot be published automatically on CRAN with the check NOTE "installed size is 5.3Mb (R 3.5, data 1.4)".
Ideas on package size reduction are welcome!
In readDWD, change the long list of arguments (that decide which subfunction to call) to a single argument, e.g. called "type".
There shall be a separate (also vectorized) function to determine it from the filename, e.g. "get_type".
It shall have a nice table as documentation.
Throughout the homepage, redirected man pages like
https://www.rdocumentation.org/packages/rdwd/topics/fileIndex
should be replaced with
https://www.rdocumentation.org/packages/rdwd/topics/index
Something like
helplink <- function(doc, topic=doc) paste0("[",doc,"](https://www.rdocumentation.org/packages/rdwd/topics/",topic,")")
should do the trick, and then I just need to check manually which manpages have an alias.
selectDWD(id=1050, res=c("daily","monthly"), var="kl", per="hr")
currently returns
dwdbase/daily/kl/historical/**_hist.zip
and dwdbase/monthly/kl/recent/**_akt.zip
Should per="hr" expand to all combinations?
Does this extend to id vectors?
install.packages("rdwd")
Installing package into ‘/home/rubensrangel/R/i686-pc-linux-gnu-library/3.2’
(as ‘lib’ is unspecified)
Warning in install.packages :
package ‘rdwd’ is not available (for R version 3.2.3)
Hi Berry!
Cool package! I'm just using it to look at some precipitation data. When searching the available datasets for stations using metaInfo(), however, it seems that the time span printed for the historical and recent (or now) folders is the same. Shouldn't from/to indicate what data is actually available in the respective folder? Below is the example output I get for the station Langerwisch:
metaInfo(id=2863)
rdwd station id 2863 with 6 files. Name: Langerwisch, State: Brandenburg
res var per hasfile from to lat long ele
1 annual more_precip historical TRUE 1974-01-01 2020-12-31 52.3175 13.0679 40
2 daily more_precip historical TRUE 1974-01-01 2021-05-31 52.3175 13.0679 40
3 monthly more_precip historical TRUE 1974-01-01 2021-04-30 52.3175 13.0679 40
4 annual more_precip recent TRUE 1974-01-01 2020-12-31 52.3175 13.0679 40
5 daily more_precip recent TRUE 1974-01-01 2021-05-31 52.3175 13.0679 40
6 monthly more_precip recent TRUE 1974-01-01 2021-04-30 52.3175 13.0679 40
Erwin
'C:/Users/A1059380/AppData/Local/Temp/RtmpcbhB7B/hourly_air_temperature_historical_stundenwerte_TU_00003_19500401_20110331_hist.zip'
Warning: 1 Download has failed (out of 1). download.file error:
the 'wininet' method for ftp:// URLs is defunct
I get this error when I use selectDWD() then dataDWD(). I have R 4.2 installed.
Dear Berry,
I noticed that several links in your bookdown documentation of the package to the OpenData platform of DWD are not working properly. The reason seems to be that the URLs have changed from ftp://opendata.dwd.de to https://opendata.dwd.de/
For instance, in 1. Intro it must be https://opendata.dwd.de/climate_environment/CDC/Readme_intro_CDC_ftp.pdf rather than ftp://opendata.dwd.de/climate_environment/CDC/Readme_intro_CDC_ftp.pdf. The same also applies, e.g., to all the links provided in section 4 Available Datasets.
Best regards
Till
Apparently, the DWD has updated some filenames for historical data. I wanted to download the historical monthly numbers for weather stations in Saxony, but received errors for 31 out of 53 stations. Here is an example for Geringswalde-Altgeringswalde:
> library(rdwd)
> link <- selectDWD(id=131, res="monthly", var="kl", per="historical")
> file <- dataDWD(link, read=FALSE)
dataDWD -> dirDWD: creating directory '/home/steinbac/DWDdata'
dataDWD -> newFilename: creating 1 file:
'/home/steinbac/DWDdata/monthly_kl_historical_monatswerte_KL_00131_20041101_20181231_hist.zip'
Warning message:
1 Download has failed (out of 1).
download.file error:
cannot open URL 'ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/monthly/kl/historical/monatswerte_KL_00131_20041101_20181231_hist.zip'
When checking the URL and comparing it to the CDC folder, I observe that monatswerte_KL_00131_20041101_20181231_hist.zip is now monatswerte_KL_00131_20041101_20191231_hist.zip at this link. So a year's worth of data was added.
> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora 31 (Workstation Edition)
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rdwd_1.3.1
loaded via a namespace (and not attached):
[1] compiler_3.6.3 parallel_3.6.3 tools_3.6.3
[4] berryFunctions_1.18.2 abind_1.4-5 pbapply_1.4-2
terra, the future replacement of the raster package, should be implemented where possible, as the usage of proj4 strings now throws warnings through rgdal. WKT is now recommended, and terra seems to do that internally just fine.
Maybe keep the raster argument with default FALSE and add a terra argument.
Maybe make projectRasterDWD dependent on the input class.
fn <- "misc/localdata/16_DJF_grids_germany_seasonal_air_temp_mean_188216.asc"
r <- raster::raster(fn)
rt <- terra::rast(nrows=raster::nrow(r), ncols=raster::ncol(r), extent=r@extent)
terra::values(rt) <- raster::values(r)
terra::plot(rt)
library(terra)
terra::plot(rt) # only works after calling library...
pseas <- "+proj=tmerc +lat_0=0 +lon_0=9 +k=1 +x_0=3500000 +y_0=0 +ellps=bessel +datum=potsdam +units=m +no_defs"
ptar <- "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"
terra::crs(rt) <- pseas
rp <- terra::project(rt, ptar)
terra::plot(rp)
addBorders()
Hi there, I am using your great package for downloading wind data for a bunch of weather stations. Everything is working fine; however, the list output of dataDWD only provides the time of the full hour. That means the precision of the 10-minute resolution is kind of lost for analysis. Am I doing something wrong? I checked the documentation of dataDWD and also the source code and could not find a clue as to whether I have misspecified something. My output looks like this (the data type inside the list is double (S3: POSIXct, POSIXt)):
res1[["10_minutes_wind_now_10minutenwerte_wind_00011_now_2"]][["MESS_DATUM"]]
"2020-02-25 00:00:00 GMT"
"2020-02-25 00:00:00 GMT"
"2020-02-25 00:00:00 GMT"
"2020-02-25 00:00:00 GMT"
"2020-02-25 00:00:00 GMT"
"2020-02-25 00:00:00 GMT"
"2020-02-25 01:00:00 GMT"
"2020-02-25 01:00:00 GMT"
"2020-02-25 01:00:00 GMT"
"2020-02-25 01:00:00 GMT"
"2020-02-25 01:00:00 GMT"
"2020-02-25 01:00:00 GMT"
...
This is my code:
download <- selectDWD(id = 11, res = "10_minutes", outvec = TRUE, var = "wind", per = "now")
res1 <- dataDWD(file = download, dir = here::here("wetter"), force = TRUE, quiet = TRUE)
Of course, I could just add the minutes manually after downloading. But I first wanted to make sure there is no other way of doing so.
Consider using polite for downloading files.
Maybe this is already conceptually fine with no re-downloads and CURL handle in https://github.com/brry/rdwd/blob/master/R/indexFTP.R#L121
Hello, opendata.dwd.de seems to also support https, not just ftp. Would it be possible to add this option, since https is much better supported and not restricted on more secure systems?
devtools::install_github("brry/rdwd")
Downloading GitHub repo brry/rdwd@master
Skipping 2 packages not available: berryFunctions, pbapply
Installing 2 packages: berryFunctions, pbapply
Installing packages into ‘/home/rubensrangel/R/i686-pc-linux-gnu-library/3.2’
(as ‘lib’ is unspecified)
Error: (converted from warning) packages ‘berryFunctions’, ‘pbapply’ are not available (for R version 3.2.3)
link <- selectDWD("XXXXXX", res="hourly", var="air_temperature", per="recent")
file <- dataDWD(link, read=FALSE, dir=paste(getwd(),"/data/Weather_Hour/",sep = ""), quiet=TRUE)
clim <- readDWD(file,type = "data" )
I am getting the error:
Der Befehl "unzip" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
English version - The command "unzip" is either misspelled or
could not be found.
I am not able to find an installation of unzip for Windows.
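If the system call comes from data.table::fread shelling out to "unzip -p" (an assumption, since the traceback isn't shown), a possible workaround is to force R's internal unzip:

```r
# fread=FALSE makes readDWD use R's built-in unzip + read.table
# instead of an external "unzip" executable:
clim <- readDWD(file, type = "data", fread = FALSE)
```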
Dear Berry,
I would like to first thank you for writing rdwd. I find the package quite interesting. I started using it a couple of days ago, and I have found an issue when downloading data that I hope you might help me with.
Let me illustrate this with a little example. By running the following code:
library(rdwd)
link <- selectDWD(c("Potsdam"), res="hourly", var="sun", per="hist")
pdata <- dataDWD(link)
I get this message:
Warning message:
1 Download has failed (out of 1). Setting read=FALSE.
download.file error:
cannot open URL 'ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/sun/historical/stundenwerte_SD_03987_18930101_20181231_hist.zip'
If I manually change the last piece of the string "link" from 20181231 to 20191231, it works.
Is this a problem of my installation? (I am using R 4.0.1, with RStudio 1.2.5042 and rdwd 1.3.1 in Windows 7).
I tried updating to rdwd 1.3.13 via the updateRdwd command, but I get this error:
Error : (converted from warning) C:/Users/mmorenozam/AppData/Local/Temp/RtmpMNcjbL/R.INSTALL6484896499/rdwd/man/plotRadar.Rd:46: unknown macro '\rcode'
ERROR: installing Rd objects failed for package 'rdwd'
I have installed berryFunctions 1.19.3, and still cannot update.
Waiting for your answer,
Best
Mauricio
This package depends on (depends, imports or suggests) raster and one or more of the retiring packages rgdal, rgeos or maptools (https://r-spatial.org/r/2022/04/12/evolution.html). Since raster 3.6.3, all use of external FOSS library functionality has been transferred to terra, making the retiring packages very likely redundant. It would help greatly if you could remove dependencies on the retiring packages as soon as possible.
Hi, I'm working on windows and when reading file with readDWD I get the following error:
'unzip' is not recognized as an internal or external command,
operable program or batch file.
I have both utils and ncdf4 libraries installed, and separately the unzip function works.