brry / rdwd
download climate data from DWD (German Weather Service)
Home Page: https://bookdown.org/brry/rdwd
Hello, I am struggling to download data from "ftp://opendata.dwd.de/climate_environment/CDC/grids_germany".
Am I doing something fundamentally wrong?
Thanks for help!
# Indexing
radfiles <- indexFTP("daily/Project_TRY/air_temperature_mean/",
                     base = gridbase, sleep = 1)
# Remove PDF files
radfiles <- radfiles[-(1:2)]
If I do not prepend the full URL, I get an error message that dataDWD() wants a URL beginning with ftp://...
radfiles <- paste('ftp://opendata.dwd.de/climate_environment/CDC/grids_germany/',
                  radfiles, sep = '')
# Pick smaller subset for speed
radfiles <- radfiles[1:3]
data <- dataDWD(radfiles, dir = './DWDdata/Data/', sleep = 5)
And this is the output I am getting:
dataDWD -> dirDWD: adding to directory 'C:/Users/Massimo/Documents/DWDdata/Data'
dataDWD -> newFilename: creating 3 files:
C:/Users/Massimo/Documents/DWDdata/Data/ftp:__opendata.dwd.de_climate_environment_CDC_grids_germany_daily_Project_TRY_air_temperature_mean__TT_199501_daymean.nc.gz
C:/Users/Massimo/Documents/DWDdata/Data/ftp:__opendata.dwd.de_climate_environment_CDC_grids_germany_daily_Project_TRY_air_temperature_mean__TT_199502_daymean.nc.gz
(and 1 more)
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 11s
Error: in dataDWD -> readDWD -> checkFile : The 3 files C:/Users/Massimo/Documents/DWDdata/Data/ftp:__opendata.dwd.de_climate_environment_CDC_grids_germany_daily_Project_TRY_air_temperature_mean__TT_199501_daymean.nc.gz, C:/Users/Massimo/Documents/DWDdata/Data/ftp:__opendata.dwd.de_climate_environment_CDC_grids_germany_daily_Project_TRY_air_temperature_mean__TT_199502_daymean.nc.gz (and 1 others)
do not exist. Current getwd: C:/Users/Massimo/Documents/DWDdata/Data
Consider the implications of setting the default call to download.file with mode="wb".
https://bookdown.org/brry/rdwd/raster-data.html#binary-file-errors
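For reference, a sketch of how this download might work without manually pasting the base URL, assuming dataDWD's base and joinbf arguments behave as documented (joinbf=TRUE joins base and file for the download while keeping a sanitized local filename that readDWD can then locate):

```r
library(rdwd)
# index the grid folder (gridbase is exported by rdwd):
radfiles <- indexFTP("daily/Project_TRY/air_temperature_mean/",
                     base = gridbase, sleep = 1)
radfiles <- radfiles[-(1:2)]   # drop description files
radfiles <- radfiles[1:3]      # small subset for speed
# base + joinbf instead of paste(): download and local filename stay consistent
data <- dataDWD(radfiles, base = gridbase, joinbf = TRUE,
                dir = "./DWDdata/Data", sleep = 5)
```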
First of all, thanks for this amazing package!
I only found a small inconvenience when downloading data with force not being TRUE but a vector such as c(Inf, 6). I left a comment on a PR at the corresponding line where I think the bug is: f45b080#r136216003
Dear Berry,
I noticed that the historical link for the air_temperature variable is not working:
selectDWD(id = 5906,res = "hourly",var = 'air_temperature',per = 'h')
"ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/historical/stundenwerte_TU_05906_19480101_20191231_hist.zip"
The above FTP link does not exist on the DWD OpenData FTP portal.
When I checked the FTP data portal, the data up to the end of 2020 has now been moved to the historical folder:
ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/historical/stundenwerte_TU_05906_19480101_20201231_hist.zip
This link works. Can you update the selectDWD function to give appropriate links?
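Until the shipped index catches up, a possible workaround (assuming selectDWD's current argument still triggers a fresh indexing of the FTP folder instead of using the packaged fileIndex) could be:

```r
library(rdwd)
# current=TRUE re-reads the FTP folder, so the returned link
# reflects the file currently on the server:
link <- selectDWD(id = 5906, res = "hourly", var = "air_temperature",
                  per = "historical", current = TRUE)
clim <- dataDWD(link)
```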
Dear brry,
first, thanks so much for the handy package! I started using it today and encountered an error whose origins I could not determine. The following is my code:
download <- selectDWD(id = stations, res = "daily", outvec = TRUE)
res1 <- dataDWD(file = download, dir = "Y:/This/Is/My/Path/dataDWD")
R keeps telling me:
dataDWD -> dirDWD: adding to directory 'Y:/This/Is/My/Path/dataDWD'
Error: dataDWD -> newFilename: The following folder does not exist: ''
This happens even when I use:
res1 <- dataDWD(file = download)
Then it creates the folder correctly at the current getwd() location, but then refuses to fill it.
Do you have any idea what might help?
In all documentation SeeAlso sections, point to the appropriate part of the website.
Hi Berry,
thanks for your nice package, it simplifies access to DWD database a lot.
When playing around with monthly "kl" data I realized that there are multiple files for the same station ID on the server (usually with overlapping time periods), and there are even differences between the fileIndex of the package and an index created using the function indexFTP. Here's a reprex to show what I mean:
suppressWarnings(library("tidyverse"))
suppressWarnings(library("rdwd"))
data(fileIndex)
ind1 = fileIndex %>%
filter(res == "monthly" & var == "kl" & per == "historical") %>%
filter(str_detect(string = path, pattern = ".\\.zip$")) %>%
as_tibble()
ind1 %>% filter(gdata::duplicated2(id))
#> # A tibble: 10 x 8
#> res var per id start end ismeta path
#> <chr> <chr> <chr> <int> <date> <date> <lgl> <chr>
#> 1 month~ kl histo~ 116 1899-06-01 1966-12-31 FALSE monthly/kl/historical~
#> 2 month~ kl histo~ 116 1899-06-01 1973-05-31 FALSE monthly/kl/historical~
#> 3 month~ kl histo~ 896 1963-01-01 2018-12-31 FALSE monthly/kl/historical~
#> 4 month~ kl histo~ 896 1991-01-01 2018-12-31 FALSE monthly/kl/historical~
#> 5 month~ kl histo~ 1568 1958-05-01 1970-11-30 FALSE monthly/kl/historical~
#> 6 month~ kl histo~ 1568 1958-05-01 1982-12-31 FALSE monthly/kl/historical~
#> 7 month~ kl histo~ 1762 1897-01-01 2006-12-31 FALSE monthly/kl/historical~
#> 8 month~ kl histo~ 1762 1991-01-01 2006-12-31 FALSE monthly/kl/historical~
#> 9 month~ kl histo~ 5424 1949-03-01 2018-12-31 FALSE monthly/kl/historical~
#> 10 month~ kl histo~ 5424 2007-06-01 2018-12-31 FALSE monthly/kl/historical~
ind2 = indexFTP(folder = "monthly/kl/historical", quiet = TRUE) %>%
str_subset(pattern = ".\\.zip$") %>%
enframe() %>%
mutate(id = str_sub(string = value, start = 38, end = 42),
id = as.numeric(id))
ind2 %>% filter(gdata::duplicated2(id))
#> # A tibble: 51 x 3
#> name value id
#> <int> <chr> <dbl>
#> 1 46 monthly/kl/historical/monatswerte_KL_00231_18790101_19740430_his~ 231
#> 2 47 monthly/kl/historical/monatswerte_KL_00231_18810101_19740430_his~ 231
#> 3 85 monthly/kl/historical/monatswerte_KL_00410_19960601_20181130_his~ 410
#> 4 86 monthly/kl/historical/monatswerte_KL_00410_19960601_20191231_his~ 410
#> 5 149 monthly/kl/historical/monatswerte_KL_00729_19030701_19801231_his~ 729
#> 6 150 monthly/kl/historical/monatswerte_KL_00729_19030701_19900131_his~ 729
#> 7 164 monthly/kl/historical/monatswerte_KL_00840_19891201_20191231_his~ 840
#> 8 165 monthly/kl/historical/monatswerte_KL_00840_19950301_20191231_his~ 840
#> 9 187 monthly/kl/historical/monatswerte_KL_00954_19980101_20190531_his~ 954
#> 10 188 monthly/kl/historical/monatswerte_KL_00954_19980101_20191231_his~ 954
#> # ... with 41 more rows
Created on 2020-06-04 by the reprex package (v0.3.0)
devtools::session_info()
#> - Session info ---------------------------------------------------------------
#> setting value
#> version R version 3.6.1 (2019-07-05)
#> os Windows 10 x64
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate German_Germany.1252
#> ctype German_Germany.1252
#> tz Europe/Berlin
#> date 2020-06-04
#>
#> - Packages -------------------------------------------------------------------
#> package * version date lib source
#> abind 1.4-5 2016-07-21 [1] CRAN (R 3.6.0)
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.2)
#> backports 1.1.7 2020-05-13 [1] CRAN (R 3.6.3)
#> berryFunctions 1.18.2 2019-05-01 [1] CRAN (R 3.6.3)
#> bitops 1.0-6 2013-08-17 [1] CRAN (R 3.6.0)
#> blob 1.2.1 2020-01-20 [1] CRAN (R 3.6.3)
#> broom 0.5.6 2020-04-20 [1] CRAN (R 3.6.3)
#> callr 3.4.3 2020-03-28 [1] CRAN (R 3.6.3)
#> cellranger 1.1.0 2016-07-27 [1] CRAN (R 3.6.1)
#> cli 2.0.2 2020-02-28 [1] CRAN (R 3.6.3)
#> colorspace 1.4-1 2019-03-18 [1] CRAN (R 3.6.1)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.1)
#> DBI 1.1.0 2019-12-15 [1] CRAN (R 3.6.3)
#> dbplyr 1.4.4 2020-05-27 [1] CRAN (R 3.6.3)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.1)
#> devtools 2.3.0 2020-04-10 [1] CRAN (R 3.6.3)
#> digest 0.6.25 2020-02-23 [1] CRAN (R 3.6.3)
#> dplyr * 1.0.0 2020-05-29 [1] CRAN (R 3.6.3)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 3.6.3)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.1)
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 3.6.3)
#> forcats * 0.5.0 2020-03-01 [1] CRAN (R 3.6.3)
#> fs 1.4.1 2020-04-04 [1] CRAN (R 3.6.3)
#> gdata 2.18.0 2017-06-06 [1] CRAN (R 3.6.3)
#> generics 0.0.2 2018-11-29 [1] CRAN (R 3.6.1)
#> ggplot2 * 3.3.1 2020-05-28 [1] CRAN (R 3.6.3)
#> glue 1.4.1 2020-05-13 [1] CRAN (R 3.6.3)
#> gtable 0.3.0 2019-03-25 [1] CRAN (R 3.6.1)
#> gtools 3.8.2 2020-03-31 [1] CRAN (R 3.6.3)
#> haven 2.3.1 2020-06-01 [1] CRAN (R 3.6.3)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.6.1)
#> hms 0.5.3 2020-01-08 [1] CRAN (R 3.6.3)
#> htmltools 0.4.0 2019-10-04 [1] CRAN (R 3.6.1)
#> httr 1.4.1 2019-08-05 [1] CRAN (R 3.6.1)
#> jsonlite 1.6.1 2020-02-02 [1] CRAN (R 3.6.3)
#> knitr 1.28 2020-02-06 [1] CRAN (R 3.6.3)
#> lattice 0.20-38 2018-11-04 [2] CRAN (R 3.6.1)
#> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 3.6.3)
#> lubridate 1.7.8 2020-04-06 [1] CRAN (R 3.6.3)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.1)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.1)
#> modelr 0.1.8 2020-05-19 [1] CRAN (R 3.6.3)
#> munsell 0.5.0 2018-06-12 [1] CRAN (R 3.6.1)
#> nlme 3.1-140 2019-05-12 [2] CRAN (R 3.6.1)
#> pbapply 1.4-2 2019-08-31 [1] CRAN (R 3.6.1)
#> pillar 1.4.4 2020-05-05 [1] CRAN (R 3.6.3)
#> pkgbuild 1.0.8 2020-05-07 [1] CRAN (R 3.6.3)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.1)
#> pkgload 1.1.0 2020-05-29 [1] CRAN (R 3.6.3)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 3.6.3)
#> processx 3.4.2 2020-02-09 [1] CRAN (R 3.6.3)
#> ps 1.3.3 2020-05-08 [1] CRAN (R 3.6.3)
#> purrr * 0.3.4 2020-04-17 [1] CRAN (R 3.6.3)
#> R6 2.4.1 2019-11-12 [1] CRAN (R 3.6.1)
#> Rcpp 1.0.4.6 2020-04-09 [1] CRAN (R 3.6.3)
#> RCurl 1.98-1.2 2020-04-18 [1] CRAN (R 3.6.3)
#> rdwd * 1.3.1 2020-02-18 [1] CRAN (R 3.6.3)
#> readr * 1.3.1 2018-12-21 [1] CRAN (R 3.6.1)
#> readxl 1.3.1 2019-03-13 [1] CRAN (R 3.6.1)
#> remotes 2.1.1 2020-02-15 [1] CRAN (R 3.6.3)
#> reprex 0.3.0 2019-05-16 [1] CRAN (R 3.6.1)
#> rlang 0.4.6 2020-05-02 [1] CRAN (R 3.6.3)
#> rmarkdown 2.2 2020-05-31 [1] CRAN (R 3.6.3)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.1)
#> rvest 0.3.5 2019-11-08 [1] CRAN (R 3.6.1)
#> scales 1.1.1 2020-05-11 [1] CRAN (R 3.6.3)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.1)
#> stringi 1.4.6 2020-02-17 [1] CRAN (R 3.6.2)
#> stringr * 1.4.0 2019-02-10 [1] CRAN (R 3.6.1)
#> testthat 2.3.2 2020-03-02 [1] CRAN (R 3.6.3)
#> tibble * 3.0.1 2020-04-20 [1] CRAN (R 3.6.3)
#> tidyr * 1.1.0 2020-05-20 [1] CRAN (R 3.6.3)
#> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 3.6.3)
#> tidyverse * 1.3.0 2019-11-21 [1] CRAN (R 3.6.3)
#> usethis 1.6.1 2020-04-29 [1] CRAN (R 3.6.3)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 3.6.1)
#> vctrs 0.3.0 2020-05-11 [1] CRAN (R 3.6.3)
#> withr 2.2.0 2020-04-20 [1] CRAN (R 3.6.3)
#> xfun 0.14 2020-05-20 [1] CRAN (R 3.6.3)
#> xml2 1.3.2 2020-04-23 [1] CRAN (R 3.6.3)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 3.6.3)
#>
#> [1] C:/Users/hendr/Documents/R/win-library/3.6
#> [2] C:/Program Files/R/R-3.6.1/library
Any idea why this happens? Content-wise this is probably more a question for the DWD, I assume, but why the indexFTP result differs from the index provided with the package might be a question for you.
Thanks for any feedback you might have on this issue.
Best,
Hendrik
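Not an explanation for the duplicates, but as a stopgap one could keep only the file with the latest end date per station, sketched here in the tidyverse style of the reprex above (slice_max/slice_min require dplyr >= 1.0.0):

```r
library(dplyr)
library(rdwd)
data(fileIndex)
ind <- fileIndex %>%
  filter(res == "monthly", var == "kl", per == "historical",
         grepl("\\.zip$", path))
# per station id keep the row(s) with the latest end date,
# then break remaining ties by the earliest start date:
dedup <- ind %>%
  group_by(id) %>%
  slice_max(end, with_ties = TRUE) %>%
  slice_min(start, with_ties = FALSE) %>%
  ungroup()
```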
Hi
thank you for your great package!
A suggestion for improvement: the res="10_minutes" option for selectDWD works, but there are no minutes in the MESS_DATUM column right now (all set to zero, instead of :00, :10, :20, :30, :40, :50).
library(rdwd)
link <- selectDWD("Kiel-Holtenau", res="10_minutes", var="air_temperature", per="recent")
file <- dataDWD(link, read=FALSE)
air_temperature <- readDWD(file, varnames=TRUE)
air_temperature
Maybe the fix is at line 250 in readDWD.R, adding:
if(nch==12) "%Y%m%d%H%M" # for 201804270000
https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/10_minutes/
Kind regards
Christof
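Until the format string is handled in readDWD, the minutes can be recovered by parsing the raw 12-character timestamps with the format suggested above; a minimal sketch (the example values are hypothetical):

```r
# parse raw 12-character MESS_DATUM strings including minutes:
x <- c("201804270000", "201804270010", "201804270020")  # hypothetical raw values
as.POSIXct(x, format = "%Y%m%d%H%M", tz = "GMT")
# to obtain raw values, read the unzipped file with
# read.table(..., colClasses = "character") before any date conversion
```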
Consider whether to use R.cache for dir management.
https://stat.ethz.ch/pipermail/r-package-devel/2019q4/004783.html
Open questions:
getwd()
Add use case to website
library(rdwd)
# select data
index <- indexFTP(folder="annual/air_temperature_max", base=gridbase)
index <- index[-(1:2)] # exclude description files
index <- index[as.numeric(substr(index,62,65))>=2013] # after year 2013
index
# download & read data:
tempmax <- dataDWD(index, base=gridbase, joinbf=TRUE)
names(tempmax) <- substr(names(tempmax), 62, 65)
# visual data & projection check:
plotRadar(tempmax[[1]], proj="seasonal",extent="seasonal",
main="Annual grid of monthly averaged daily maximum air temperature (2m) - 2013")
# raster stack
tempmax_stack <- raster::stack(tempmax)
tempmax_stack <- projectRasterDWD(tempmax_stack, proj="seasonal",extent="seasonal")
tempmax_stack
raster::plot(tempmax_stack, zlim=range(raster::cellStats(tempmax_stack, range)) )
loc <- data.frame(x=12.65295, y=53.06547) # Aussichtspunkt Kyritz-Ruppiner Heide
raster::extract(tempmax_stack, loc)
dataDWD and readDWD shall gain the argument hr: an integer code to automatically merge historical and recent datasets. If set, returns a data.frame instead of a list.
See this use case.
0 (default): ignore this argument
1: message("merging n files") + sort by hr (if hr is given) + merge + remove rownames.
2: also remove duplicated dates from recent
3: also remove columns QN3,QN4,eor
4: also remove column STATIONS_ID
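Until such an argument exists, the hr=2 behaviour could be approximated manually; a sketch assuming hist and recent are data.frames as returned by readDWD, both with a MESS_DATUM column:

```r
# merge historical and recent station data, dropping dates from
# 'recent' that are already covered by 'hist' (the hr=2 behaviour):
merge_hr <- function(hist, recent) {
  recent <- recent[!recent$MESS_DATUM %in% hist$MESS_DATUM, ]
  out <- rbind(hist, recent)
  rownames(out) <- NULL  # the hr=1 cleanup step
  out
}
```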
Sub-daily datasets are not extracted correctly, as seen in my example.
library(rdwd)
link <- selectDWD(id=10381, res="subdaily", var="standard_format", per="historical")
file <- dataDWD(link, read=FALSE)
clim <- readDWD(file, varnames=FALSE)
Hi Berry,
With the help of your package one can download data from
ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate.
Is there a way to download data from
ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/phenology/ ?
How can I tell rdwd to look into that folder instead?
Thank you very much,
Best regards,
Etienne
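The base argument of indexFTP and dataDWD should make this possible; a sketch (the phenology subfolder name used here is an assumption, and selectDWD itself won't help since its indexes only cover the climate folder):

```r
library(rdwd)
phenbase <- "ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/phenology"
# list files in one subfolder (the folder name is an assumption):
phenfiles <- indexFTP("annual_reporters/crops", base = phenbase, dir = tempdir())
# joinbf=TRUE combines base and file name into the download URL:
files <- dataDWD(phenfiles[1], base = phenbase, joinbf = TRUE,
                 dir = tempdir(), read = FALSE)
```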
Hi,
selectDWD loads wrong filenames.
link <- selectDWD(res="daily", var="kl", per="rh", outvec = TRUE)
Due to the following?
https://www.dwd.de/EN/climate_environment/cdc/cdc_node.html
CDC OpenData
Due to DWD data service restructuring, the CDC-FTP Server is moved to DWD OpenData, allowing access via ftp and https.
The new access is: https://opendata.dwd.de/climate_environment/CDC/ and ftp://opendata.dwd.de/climate_environment/CDC/.
Most data sets, especially the German station data, have been copied already and are now updated synchronously with the FTP-server. For a period of transition, both the new and the old addresses are valid.
Deadline for termination of address ftp-cdc is: 1st June 2019.
First: After many years I finally managed to wrap my head around this DWD data stuff and got something useful (for me) running.
So thank you very much for making these strangely complex datasets available to the (nearly) normal people.
But now that I want to put it into practice, I realize that there seem to be problems downloading historical datasets. This case was not foreseen and now my script got stuck:
dataDWD -> dirDWD: adding to directory '/home/bernd/Projekte/R/rdwd'
dataDWD: 1 file already existing and not downloaded again: 'hourly_air_temperature_historical_stundenwerte_TU_05149_20050101_20221004_hist.zip'
Now downloading 3 files...
dataDWD -> newFilename: creating 3 files:
/home/bernd/Projekte/R/rdwd/hourly_air_temperature_historical_stundenwerte_TU_02600_20050301_20221231_hist.zip
/home/bernd/Projekte/R/rdwd/hourly_air_temperature_historical_stundenwerte_TU_05705_19480101_20221231_hist.zip
(and 1 more)
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01s
Warning message:
3 Downloads have failed (out of 4). Setting read=FALSE. download.file errors:
cannot open URL 'ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/historical/stundenwerte_TU_02600_20050301_20221231_hist.zip'
cannot open URL 'ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/historical/stundenwerte_TU_05705_19480101_20221231_hist.zip'
cannot open URL 'ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/historical/stundenwerte_TU_01107_20041201_20221231_hist.zip'
Actually, the zips are there and can be downloaded manually.
I only ran into this issue late, because I developed the script with recent data only.
Queries for historical weather records using selectDWD return URLs that cannot be opened. In the past those URLs worked, but they seem outdated now. I am not able to figure out where these URLs are generated, but perhaps the example can help in solving the issue.
## example
library(rdwd)
library(magrittr)
stations <- c("Belm", "Salzuflen, Bad", "Bielefeld-Deppendorf")
url <- lapply(stations, selectDWD, res = 'daily', var = 'kl', per = 'h')
url
All URLs end with 20171231 and do not exist as shown below for the first.
[[1]]
[1] "ftp://ftp-cdc.dwd.de/pub/CDC/observations_germany/climate/daily/kl/historical/tageswerte_KL_00342_20101201_20171231_hist.zip"
[[2]]
[1] "ftp://ftp-cdc.dwd.de/pub/CDC/observations_germany/climate/daily/kl/historical/tageswerte_KL_04371_19350101_20171231_hist.zip"
[[3]]
[1] "ftp://ftp-cdc.dwd.de/pub/CDC/observations_germany/climate/daily/kl/historical/tageswerte_KL_07106_20060901_20171231_hist.zip"
dataDWD(url[[1]], quiet = T)
Error in download.file(url = file[i], destfile = outfile[i], quiet = TRUE, :
cannot open URL 'ftp://ftp-cdc.dwd.de/pub/CDC/observations_germany/climate/daily/kl/historical/tageswerte_KL_00342_20101201_20171231_hist.zip'
In addition: Warning message:
In download.file(url = file[i], destfile = outfile[i], quiet = TRUE, :
InternetOpenUrl failed: ''
The correct URLs end with 20181231, and dataDWD can handle those:
out <- lapply(url, stringr::str_replace, "20171231", "20181231") %>%
lapply(., dataDWD, quiet = T)
head(out[[1]])
STATIONS_ID MESS_DATUM QN_3 FX FM QN_4 RSK RSKF SDK SHK_TAG NM VPM PM TMK UPM TXK TNK TGK eor
1 342 2010-12-01 10 NA NA 3 0.0 4 3.0 NA NA NA NA -7.0 NA -4.9 -9.1 -9.2 eor
2 342 2010-12-02 10 7.0 NA 3 0.5 4 0.0 0 7.7 3.1 996.5 -7.4 90 -6.5 -8.6 -8.2 eor
3 342 2010-12-03 10 7.6 NA 3 0.1 4 0.0 1 6.9 3.4 998.0 -5.6 85 -3.9 -7.5 -9.7 eor
4 342 2010-12-04 10 12.0 NA 3 4.7 4 3.6 1 5.0 4.1 1000.5 -3.2 84 -0.8 -4.9 -6.6 eor
5 342 2010-12-05 10 13.1 NA 3 2.5 4 0.0 3 7.8 6.2 989.3 0.7 97 1.4 -1.2 -1.1 eor
6 342 2010-12-06 10 5.5 NA 3 0.1 4 0.0 1 7.9 5.8 989.7 0.2 95 1.6 -4.5 -6.3 eor
From sessionInfo():
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] magrittr_1.5 rdwd_0.11.0
In the below example, the standard data import works as expected
lnk = selectDWD(id = 3379, res = "hourly", var = "air_temperature", per = "r")
dat1 = dataDWD(lnk, dir = file.path(tempdir(), "DWDdata")) # works
whereas fast reading using fread = TRUE fails
dat2 = dataDWD(lnk, dir = file.path(tempdir(), "DWDdata"), fread = TRUE) # error
with the following error message
Error in data.table::fread(paste("unzip -p", file, fp), na.strings = na9(), :
freadMain: NAstring << -9999>> has whitespace at the beginning or end
Here's my sessionInfo():
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C LC_TIME=German_Germany.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] mapview_2.4.8 rdwd_0.10.2 DWD_0.1.0 sf_0.6-3
loaded via a namespace (and not attached):
[1] xfun_0.3 pbapply_1.3-4 lattice_0.20-35 colorspace_1.3-2 htmltools_0.3.6
[6] stats4_3.5.1 viridisLite_0.3.0 yaml_2.1.19 base64enc_0.1-3 e1071_1.6-8
[11] later_0.7.3 withr_2.1.2 DBI_1.0.0 sp_1.3-1 RColorBrewer_1.1-2
[16] plyr_1.8.4 munsell_0.5.0 raster_2.6-7 htmlwidgets_1.2 devtools_1.13.6
[21] memoise_1.1.0 latticeExtra_0.6-28 knitr_1.20 httpuv_1.4.4.2 crosstalk_1.0.0
[26] class_7.3-14 Rcpp_0.12.17 xtable_1.8-2 promises_1.0.1 scales_0.5.0
[31] classInt_0.2-3 satellite_1.0.1 plotrix_3.7-2 leaflet_2.0.1 webshot_0.5.0
[36] jsonlite_1.5 abind_1.4-5 mime_0.5 png_0.1-7 digest_0.6.15
[41] Orcs_1.0.0 bookdown_0.7 shiny_1.1.0 berryFunctions_1.17.0 grid_3.5.1
[46] RPostgreSQL_0.6-2 rgdal_1.3-3 tools_3.5.1 bitops_1.0-6 magrittr_1.5
[51] RCurl_1.95-4.11 data.table_1.11.4 spData_0.2.9.0 R6_2.2.2 units_0.6-0
[56] compiler_3.5.1
nwp_t2m_base <- "ftp://opendata.dwd.de/weather/nwp/icon-d2/grib/03/p"
nwp_urls <- indexFTP("", base=nwp_t2m_base, dir=tempdir())
nwp_file <- dataDWD(nwp_urls[6], base=nwp_t2m_base, dir=tempdir(), joinbf=TRUE, dbin=TRUE, read=FALSE)
nwp_data <- readDWD(nwp_file, quiet=TRUE)
Fails with this message from rgdal::readGDAL:
**.grib2 is a grib file, but no raster dataset was successfully identified.
Rename localtestdir to the shorter locdir.
Use it in the website, examples, etc.
Move it to a location outside the 'rdwd' package folder and outside Dropbox.
readDWD has the argument fread to read datasets through data.table::fread, which is significantly faster than base unzip + read.table.
Early on (2017), the default was fread=NA (i.e. conditional on the availability of data.table).
Some users sent emails about errors on Windows OS so I changed the default to FALSE (272b947).
For speedup, it would be nice to have the default NA again.
This will be set experimentally to see if new complaints arise.
With the new 5-minute data (April 2022), the fileIndex and derived indexes are getting very big.
I think rdwd cannot be published automatically on CRAN with the check NOTE "installed size is 5.3Mb (R 3.5, data 1.4)".
Ideas on package size reduction are welcome!
In readDWD, change the long list of arguments (that decide which subfunction to call) to a single argument, e.g. called "type".
There shall be a separate (also vectorized) function to determine it from the filename, e.g. "get_type".
It shall have a nice table as documentation.
Throughout the homepage, redirected man pages like
https://www.rdocumentation.org/packages/rdwd/topics/fileIndex
should be replaced with
https://www.rdocumentation.org/packages/rdwd/topics/index
Something like
helplink <- function(doc, topic=doc) paste0("[",doc,"](https://www.rdocumentation.org/packages/rdwd/topics/",topic,")")
should do the trick, and then I just need to check manually which manpages have an alias.
selectDWD(id=1050, res=c("daily","monthly"), var="kl", per="hr")
currently returns
dwdbase/daily/kl/historical/**_hist.zip
and dwdbase/monthly/kl/recent/**_akt.zip
Should per="hr" expand to all combinations?
Does this extend to id vectors?
install.packages("rdwd")
Installing package into ‘/home/rubensrangel/R/i686-pc-linux-gnu-library/3.2’
(as ‘lib’ is unspecified)
Warning in install.packages :
package ‘rdwd’ is not available (for R version 3.2.3)
Hi Berry!
Cool package! I'm just using it to look at some precipitation data. When searching the available datasets for stations using metaInfo(), however, it seems that the time span printed for the historical and recent (or now) folders is the same. Shouldn't from/to indicate what data is actually available in the respective folder? Below is the example output I get for the station Langerwisch:
metaInfo(id=2863)
rdwd station id 2863 with 6 files. Name: Langerwisch, State: Brandenburg
res var per hasfile from to lat long ele
1 annual more_precip historical TRUE 1974-01-01 2020-12-31 52.3175 13.0679 40
2 daily more_precip historical TRUE 1974-01-01 2021-05-31 52.3175 13.0679 40
3 monthly more_precip historical TRUE 1974-01-01 2021-04-30 52.3175 13.0679 40
4 annual more_precip recent TRUE 1974-01-01 2020-12-31 52.3175 13.0679 40
5 daily more_precip recent TRUE 1974-01-01 2021-05-31 52.3175 13.0679 40
6 monthly more_precip recent TRUE 1974-01-01 2021-04-30 52.3175 13.0679 40
Erwin
'C:/Users/A1059380/AppData/Local/Temp/RtmpcbhB7B/hourly_air_temperature_historical_stundenwerte_TU_00003_19500401_20110331_hist.zip'
Warning: 1 Download has failed (out of 1). download.file error:
the 'wininet' method for ftp:// URLs is defunct
I get this error when I use selectDWD() then dataDWD(). I have R 4.2 installed.
Dear Berry,
I noticed that several links in your bookdown documentation of the package to the OpenData platform of DWD are not working properly. The reason seems to be that the URLs have changed from ftp://opendata.dwd.de to https://opendata.dwd.de/
For instance, in 1. Intro it must be https://opendata.dwd.de/climate_environment/CDC/Readme_intro_CDC_ftp.pdf rather than ftp://opendata.dwd.de/climate_environment/CDC/Readme_intro_CDC_ftp.pdf. The same also applies, e.g., to all the links provided in section 4 Available Datasets.
Best regards
Till
Apparently, the DWD has updated some filenames for historical data. I wanted to download the historical monthly numbers for weather stations in Saxony, but received errors for 31 out of 53 stations. Here is an example for Geringswalde-Altgeringswalde:
> library(rdwd)
> link <- selectDWD(id=131, res="monthly", var="kl", per="historical")
> file <- dataDWD(link, read=FALSE)
dataDWD -> dirDWD: creating directory '/home/steinbac/DWDdata'
dataDWD -> newFilename: creating 1 file:
'/home/steinbac/DWDdata/monthly_kl_historical_monatswerte_KL_00131_20041101_20181231_hist.zip'
Warning message:
1 Download has failed (out of 1).
download.file error:
cannot open URL 'ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/monthly/kl/historical/monatswerte_KL_00131_20041101_20181231_hist.zip'
When checking the URL and comparing it to the CDC folder, I observe that monatswerte_KL_00131_20041101_20181231_hist.zip is now monatswerte_KL_00131_20041101_20191231_hist.zip at this link. So a year's worth of data was added.
> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora 31 (Workstation Edition)
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rdwd_1.3.1
loaded via a namespace (and not attached):
[1] compiler_3.6.3 parallel_3.6.3 tools_3.6.3
[4] berryFunctions_1.18.2 abind_1.4-5 pbapply_1.4-2
terra, the future replacement of the raster package, should be implemented where possible, as the usage of proj4 strings now throws warnings through rgdal. WKT is now recommended, and terra seems to do that internally just fine.
Maybe keep the raster argument with default FALSE and add a terra argument.
Maybe make projectRasterDWD dependent on the input class.
fn <- "misc/localdata/16_DJF_grids_germany_seasonal_air_temp_mean_188216.asc"
r <- raster::raster(fn)
rt <- terra::rast(nrows=raster::nrow(r), ncols=raster::ncol(r), extent=r@extent)
terra::values(rt) <- raster::values(r)
terra::plot(rt)
library(terra)
terra::plot(rt) # only works after calling library...
pseas <- "+proj=tmerc +lat_0=0 +lon_0=9 +k=1 +x_0=3500000 +y_0=0 +ellps=bessel +datum=potsdam +units=m +no_defs"
ptar <- "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"
terra::crs(rt) <- pseas
rp <- terra::project(rt, ptar)
terra::plot(rp)
addBorders()
Hi there, I am using your great package for downloading wind data for a bunch of weather stations. Everything is working fine; however, the list output of dataDWD only provides the time of the full hour. That means the precision of the 10-minute resolution is kind of lost for analysis. Am I doing something wrong? I checked the documentation of dataDWD and also the source code and could not find a clue as to whether I have misspecified something. My output looks like this (the data type inside the list is double (S3: POSIXct, POSIXt)):
res1[["10_minutes_wind_now_10minutenwerte_wind_00011_now_2"]][["MESS_DATUM"]]
"2020-02-25 00:00:00 GMT"
"2020-02-25 00:00:00 GMT"
"2020-02-25 00:00:00 GMT"
"2020-02-25 00:00:00 GMT"
"2020-02-25 00:00:00 GMT"
"2020-02-25 00:00:00 GMT"
"2020-02-25 01:00:00 GMT"
"2020-02-25 01:00:00 GMT"
"2020-02-25 01:00:00 GMT"
"2020-02-25 01:00:00 GMT"
"2020-02-25 01:00:00 GMT"
"2020-02-25 01:00:00 GMT"
...
This is my code:
download <- selectDWD(id = 11, res = "10_minutes", outvec = TRUE, var = "wind", per = "now")
res1 <- dataDWD(file = download, dir = here::here("wetter"), force = TRUE, quiet = TRUE)
Of course, I could just add the minutes manually after downloading. But I first wanted to make sure there is no other way of doing so.
Consider using polite for downloading files.
Maybe this is already conceptually fine with no re-downloads and CURL handle in https://github.com/brry/rdwd/blob/master/R/indexFTP.R#L121
Hello, opendata.dwd.de seems to also support https, not just ftp. Would it be possible to add this option, since https is much better supported and not restricted on more secure systems?
devtools::install_github("brry/rdwd")
Downloading GitHub repo brry/rdwd@master
Skipping 2 packages not available: berryFunctions, pbapply
Installing 2 packages: berryFunctions, pbapply
Installing packages into ‘/home/rubensrangel/R/i686-pc-linux-gnu-library/3.2’
(as ‘lib’ is unspecified)
Error: (converted from warning) packages ‘berryFunctions’, ‘pbapply’ are not available (for R version 3.2.3)
link <- selectDWD("XXXXXX", res="hourly", var="air_temperature", per="recent")
file <- dataDWD(link, read=FALSE, dir=paste(getwd(),"/data/Weather_Hour/",sep = ""), quiet=TRUE)
clim <- readDWD(file,type = "data" )
I am getting the error:
Der Befehl "unzip" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
English version - The command "unzip" is either misspelled or
could not be found.
I am not able to find an installation of unzip for Windows.
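If the system call comes from data.table::fread shelling out to "unzip -p" (an assumption, since the traceback isn't shown), a possible workaround is to force R's internal unzip:

```r
# fread=FALSE makes readDWD use R's built-in unzip + read.table
# instead of an external "unzip" executable:
clim <- readDWD(file, type = "data", fread = FALSE)
```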
Dear Berry,
I would like to first thank you for writing rdwd. I find the package quite interesting. I started using it a couple of days ago, and I have found an issue when downloading data that I hope you might help me with.
Let me illustrate this with a little example. By running the following code:
library(rdwd)
link <- selectDWD(c("Potsdam"), res="hourly", var="sun", per="hist")
pdata <- dataDWD(link)
I get this message:
Warning message:
1 Download has failed (out of 1). Setting read=FALSE.
download.file error:
cannot open URL 'ftp://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/sun/historical/stundenwerte_SD_03987_18930101_20181231_hist.zip'
If I manually change the last piece of the string "link" from 20181231 to 20191231, it works.
Is this a problem of my installation? (I am using R 4.0.1, with RStudio 1.2.5042 and rdwd 1.3.1 in Windows 7).
I tried updating to rdwd 1.3.13 via the updateRdwd command, but I get this error:
Error : (converted from warning) C:/Users/mmorenozam/AppData/Local/Temp/RtmpMNcjbL/R.INSTALL6484896499/rdwd/man/plotRadar.Rd:46: unknown macro '\rcode'
ERROR: installing Rd objects failed for package 'rdwd'
I have installed berryFunctions 1.19.3, and still cannot update.
Waiting for your answer,
Best
Mauricio
This package depends on (depends, imports or suggests) raster and one or more of the retiring packages rgdal, rgeos or maptools (https://r-spatial.org/r/2022/04/12/evolution.html). Since raster 3.6.3, all use of external FOSS library functionality has been transferred to terra, making the retiring packages very likely redundant. It would help greatly if you could remove dependencies on the retiring packages as soon as possible.
Hi, I'm working on windows and when reading file with readDWD I get the following error:
'unzip' is not recognized as an internal or external command,
operable program or batch file.
I have both utils and ncdf4 libraries installed, and separately the unzip function works.