R client for working with ERDDAP servers

Home Page: https://docs.ropensci.org/rerddap

rerddap's Introduction

rerddap


rerddap is a general purpose R client for working with ERDDAP servers.

Package Docs: https://docs.ropensci.org/rerddap/

Installation

From CRAN

install.packages("rerddap")

Or the development version from GitHub

remotes::install_github("ropensci/rerddap")
library("rerddap")

Some users may see an installation error asking them to install one or more additional packages; for example, you may need DBI, in which case run install.packages("DBI") before installing rerddap.

Background

ERDDAP is a data server built on top of OPeNDAP that serves much NOAA data. You can get gridded data via griddap (https://upwell.pfeg.noaa.gov/erddap/griddap/documentation.html), which lets you query gridded datasets, or table data via tabledap (https://upwell.pfeg.noaa.gov/erddap/tabledap/documentation.html), which lets you query tabular datasets. The two interfaces are similar in how you work with them, but differ in some details; rerddap aims to present a consistent interface to both data types.
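A rough sketch of how the two interfaces map onto ERDDAP request URLs (the dataset ids and constraints below are made up for illustration; only the base URL is the upwell server mentioned above): griddap requests index each dimension with bracketed ranges, while tabledap requests list variables plus SQL-like constraints.

```r
# Sketch: the URL shapes behind griddap vs tabledap requests.
# Dataset ids and constraints are illustrative only.
base <- "https://upwell.pfeg.noaa.gov/erddap"

# griddap: bracketed [(start):stride:(stop)] ranges over each dimension
grid_url <- sprintf(
  "%s/griddap/%s.csv?%s", base, "someGriddedDataset",
  "sst[(2013-01-01):1:(2013-01-31)][(20.0):1:(30.0)][(-80.0):1:(-70.0)]"
)

# tabledap: comma-separated variables plus &var>=value style constraints
table_url <- sprintf(
  "%s/tabledap/%s.csv?%s", base, "someTabularDataset",
  "time,latitude,longitude,temp&time>=2013-01-01"
)

cat(grid_url, table_url, sep = "\n")
```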

NetCDF

rerddap supports NetCDF format, which is the default for the griddap() function. NetCDF is a binary file format, and has a much smaller footprint on your disk than CSV. Being binary, the files are harder to inspect directly, but the ncdf4 package makes it easy to pull data out of, and write data back into, a NetCDF file. Note that the file extension for NetCDF files is .nc. For small files the choice between NetCDF and CSV won't make much of a difference, but it will with large files.
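The size difference is easy to see with a base-R analogue (raw binary doubles standing in for NetCDF; this is not actual NetCDF I/O, for which you would use ncdf4's nc_open()/ncvar_get()): writing the same numbers as text costs considerably more bytes.

```r
# Compare on-disk size of the same data as text (CSV) vs raw binary.
# This is an analogue of the NetCDF-vs-CSV tradeoff, not NetCDF itself.
x <- runif(100000)

csv_file <- tempfile(fileext = ".csv")
bin_file <- tempfile(fileext = ".bin")

write.csv(data.frame(x = x), csv_file, row.names = FALSE)
writeBin(x, bin_file)  # 8 bytes per double

file.size(csv_file)  # text representation: far more bytes per value
file.size(bin_file)  # 800000 bytes
```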

Caching

Downloaded data files are cached in a single directory on your machine, managed by the hoardr package. When you call griddap() or tabledap(), we construct an MD5 hash from the base URL and any query parameters, so each distinct query is cached separately. We then look in the cache directory for a matching hash: if there's a match, we use that file on disk; if not, we make an HTTP request for the data to the ERDDAP server you specify.
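The cache-key idea can be sketched in base R: hash the full request (base URL plus query) and use the digest to look up a cached file. (rerddap itself manages the cache via hoardr; this standalone sketch uses tools::md5sum, which hashes files, so the key string is written to a temp file first.)

```r
# Sketch of a cache key: MD5 of base URL + query parameters.
cache_key <- function(base_url, query) {
  tf <- tempfile()
  on.exit(unlink(tf))
  writeLines(paste0(base_url, "?", query), tf)
  unname(tools::md5sum(tf))
}

k1 <- cache_key("https://upwell.pfeg.noaa.gov/erddap/griddap/x.nc",
                "time[(2013-01-01):1:(2013-01-31)]")
k2 <- cache_key("https://upwell.pfeg.noaa.gov/erddap/griddap/x.nc",
                "time[(2014-01-01):1:(2014-01-31)]")

nchar(k1)          # 32-character hex digest
identical(k1, k2)  # FALSE: different queries are cached separately
```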

ERDDAP servers

You can get a data.frame of known ERDDAP servers with the servers() function. Most serve some kind of NOAA data, though a few serve non-NOAA data. If you know of other ERDDAP servers, send a pull request, or let us know.
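The result can be filtered like any data.frame. Below is a hand-built two-row stand-in (the names are invented; the URLs are servers that appear elsewhere in these docs), just to show the shape:

```r
# Stand-in for the data.frame that servers() returns
# (the real output has more rows; server names here are invented).
srv <- data.frame(
  name = c("NOAA Upwell", "NOAA CoastWatch"),
  url  = c("https://upwell.pfeg.noaa.gov/erddap/",
           "https://coastwatch.pfeg.noaa.gov/erddap/"),
  stringsAsFactors = FALSE
)

# Pick a server by name
srv$url[grepl("CoastWatch", srv$name)]
```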

Meta

  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for rerddap in R by running citation(package = 'rerddap')
  • Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

rerddap's People

Contributors

btupper, douglatornell, jeroen, kant, kwilcox, maelle, rmendels, sckott


rerddap's Issues

Fail better

At least for the function erddap_GET(), which currently just uses stop_for_status

Leverage sf?

cc @sckott @rmendels

The fairly new sf package is gaining lots of traction, and there is even a new geom_sf() coming to ggplot2

Assuming we can represent griddap/tabledap data structures using simple features, we can leverage/augment some already existing "automatic plotting methods" (e.g., plot.sf() and geom_sf()).

I'm thinking that (over the next week or so) I will attempt to write tabledap/griddap_csv/griddap_nc methods for the sf::st_as_sf() generic, so that generating plots could be as simple as:

library(rerddap)
library(sf)
library(ggplot2)

d <- st_as_sf(tabledap('erdCinpKfmBT'))
plot(d)
ggplot() + geom_sf(data = d)

Does this sound reasonable?

Query to get back just dimension vars

After playing around a bit:

It looks like with dimension variables (e.g., lat, lon), you can't query by a dimension variable without also returning that variable. For example, this call works:

http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdQSwind3day.htmlTable?latitude[(-15.0):1:(15.0)],longitude[(0.0):1:(360.0)]

We get back both lat and lon; we can't, e.g., get back just lat while still constraining lon.

Also, I haven't yet found a way to query for just a dimension variable (e.g., latitude) while also getting a measurement variable, e.g., x_wind here (this call doesn't work):

http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdQSwind3day.htmlTable?x_wind[(-15.0):1:(15.0)],latitude[(-15.0):1:(15.0)]

So anyway, I think we can allow the first situation, where you get back only dimension variables and no measured variables.
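For reference, the working dimension-only call above can be assembled programmatically from its parts:

```r
# Rebuild the working dimension-only query (the first URL above) from parts.
base <- "http://coastwatch.pfeg.noaa.gov/erddap/griddap"
dataset <- "erdQSwind3day"
dims <- c("latitude[(-15.0):1:(15.0)]", "longitude[(0.0):1:(360.0)]")

url <- sprintf("%s/%s.htmlTable?%s", base, dataset, paste(dims, collapse = ","))
url
```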

Set options

It would be nice to be able to set the url at least, if not other options, since it's annoying to pass the url parameter on every function call
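One possible shape for this, using a base-R option consulted as the default for the url argument (the option name rerddap.default.url and the functions below are hypothetical, just to illustrate the mechanics):

```r
# Sketch: let users set a default server once via options().
# The option name "rerddap.default.url" is made up for illustration.
options(rerddap.default.url = "https://upwell.pfeg.noaa.gov/erddap/")

default_url <- function() {
  getOption("rerddap.default.url",
            default = "https://upwell.pfeg.noaa.gov/erddap/")
}

# A function can then take url = default_url() instead of requiring it
some_fxn <- function(url = default_url()) url
some_fxn()
```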

add to docs: griddap and maybe tabledap

something like

if users run into this error, it likely means they are hitting a size limit, and they should reduce the amount of data they are requesting, either by space, time, or variables.

"this error" being the one like this:

HTTP Status 500 - There was a (temporary?) problem. Wait a minute, then try again. (In a browser, click the Reload button.)
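A sketch of what the friendlier failure could look like: catch the error and, when the message looks like ERDDAP's 500, append the advice about shrinking the request (the wrapper name and wording here are hypothetical, not rerddap's actual behavior):

```r
# Sketch: translate ERDDAP's HTTP 500 into actionable advice.
with_advice <- function(expr) {
  tryCatch(expr, error = function(e) {
    msg <- conditionMessage(e)
    if (grepl("HTTP Status 500", msg, fixed = TRUE)) {
      stop(msg, "\nYou may be hitting a size limit: try requesting ",
           "less data (smaller space/time range, fewer variables).",
           call. = FALSE)
    }
    stop(e)
  })
}

# Simulated server failure
res <- tryCatch(
  with_advice(stop("HTTP Status 500 - There was a (temporary?) problem.")),
  error = conditionMessage
)
cat(res)
```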

Crosswalk for non-standard variable names

from @rmendels

As for not knowing coordinate names: I can look at the names passed by a user, and I can look at the names in the ERDDAP server, but matching them up could be problematic. For example, ERDDAP good practice is to use "longitude", "latitude", "time", etc., but someone could have "lon", "lat", "time_series", and the question is whether I just bomb the user out or what. Also "altitude" can be things like "sigma depths". More design decisions than anything.
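One way to handle the common cases without bombing the user out: a small crosswalk mapping known non-standard names to ERDDAP's preferred ones, passing everything else through (the mappings below are just the examples from the comment above; this is a sketch, not rerddap's implementation):

```r
# Crosswalk from common non-standard coordinate names to ERDDAP's
# preferred names; unknown names pass through unchanged.
crosswalk <- c(lon = "longitude", lat = "latitude", time_series = "time")

normalize_dims <- function(nms) {
  out <- crosswalk[nms]
  out[is.na(out)] <- nms[is.na(out)]  # no mapping: keep as-is
  unname(out)
}

normalize_dims(c("lon", "lat", "time", "sigma"))
```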

change ncdf to ncdf4

message from Ripley:

which depend on package ncdf. The latter is now deprecated and scheduled to be removed from CRAN in Jan 2016. It has only been kept this long because ncdf4 was not available for Windows; it now is (on CRAN extras, a default repository for binary installs). Please convert your package to use either ncdf4 or RNetCDF as soon as possible, and definitely by early January. (For the second group this could just mean removing references to ncdf, as the package passes its checks without it.)

Parsing nc grid data screws up

out <- info('erdMH1chla8day')
griddap(out, time = c('2013-02-01', '2013-02-29'),
    latitude = c(-22.8, -21.8),
    longitude = c(39.8, 40.8))

was giving wrong output

I think because we were laying out the lat/long data incorrectly
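The layout pitfall: R arrays (including values read via ncdf4) are column-major, with the first dimension varying fastest, so the long-form coordinate columns must be built in the same order. expand.grid matches this convention, since its first factor also varies fastest. (A sketch of the principle, with made-up values near the coordinates above; whether longitude or latitude comes first depends on the file.)

```r
# R arrays store the first dimension fastest; expand.grid does too,
# so pairing as.vector(arr) with expand.grid(dim1, dim2) lines up.
lons <- c(39.8, 40.3, 40.8)    # assumed first dimension of the array
lats <- c(-22.8, -22.3, -21.8)

arr <- array(1:9, dim = c(length(lons), length(lats)))

long_form <- expand.grid(longitude = lons, latitude = lats)
long_form$value <- as.vector(arr)

# arr[2, 1] is the value at (lons[2], lats[1]); the long form must agree:
long_form$value[long_form$longitude == lons[2] &
                long_form$latitude == lats[1]]  # 2, same as arr[2, 1]
```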

Support advanced searches

ROMS data doesn't work correctly

library(rerddap)
testInfo<-info('glos_tds_976e_41ad_58ec')

dimargs <- list(ny=c(31,33), nx=c(150,152))

##  this fails
extract<-griddap(x = testInfo, ny=c(31,33), nx=c(150,152))

## this works
extract<-griddap(testInfo,ny=c(31,33), nx=c(150,152), read=FALSE)

# The first fails because, in "melting" the grid to "long form", the code
# explicitly assumes latitude and longitude. As another example, extracting
# from a ROMS model:
testInfo <- info('whoi_f2e9_92f8_cca9')

### this fails
extract <- griddap(testInfo,time=c('2007-09-06','2007-09-06'), 
                   eta_rho=c(0,2),xi_rho=c(20,22), fields='bestrew')
### this works
extract <- griddap(testInfo,time=c('2007-09-06','2007-09-06'), 
                   eta_rho=c(0,2),xi_rho=c(20,22), fields='bestrew', read=FALSE)

# I am pretty certain the dimension checks won't work correctly should I
# give one outside the actual range, because those also, as best I can tell
# from looking at the code, are specific to lat-lon-time.
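A dimension-agnostic melt would avoid hard-coding lat/lon/time: take the coordinate values as a named list in the array's dimension order and build the long form generically (a sketch under that assumption, not rerddap's actual implementation):

```r
# Sketch: melt a grid to long form for arbitrary dimension names,
# e.g. ROMS eta_rho/xi_rho, without assuming lat/lon/time.
melt_grid <- function(arr, dims) {
  # dims: named list of coordinate values, in the array's dimension order
  stopifnot(identical(unname(dim(arr)), unname(lengths(dims))))
  out <- do.call(expand.grid, dims)  # first dimension varies fastest
  out$value <- as.vector(arr)
  out
}

arr <- array(seq_len(9), dim = c(3, 3))
d <- melt_grid(arr, list(eta_rho = 0:2, xi_rho = 20:22))
head(d, 3)
```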
