
Functionality to clip, subset, plot and write EU-MOHP data


eumohpclipr

Lifecycle: experimental | R-CMD-check | Project Status: Active - The project has reached a stable, usable state and is being actively developed. | License: MIT + file LICENSE | codecov

The goal of eumohpclipr is to provide users of the EU-MOHP data set with functions to

  1. eumohp_clip(): Clip the raster .tif files to a custom area of interest and select a required subset of the data.
  2. eumohp_plot(): Plot the clipped and subsetted data relatively quickly by using stars proxy objects.
  3. eumohp_write(): Write the clipped and subsetted data to disk as .tif files. This reduces file sizes to the required spatial extent.

The EU-MOHP data set provides temporally static and spatially contiguous environmental predictors, intended predominantly for machine learning models in hydrologic and hydrogeological modelling and mapping tasks. It can be used along with other environmental predictors, such as land use and land cover data, soil maps, geological maps, digital elevation models, etc.

Installation

You can install the development version of eumohpclipr from GitHub with:

# install.packages("devtools")
devtools::install_github("MxNl/eumohpclipr")

Example

Prerequisites

To use this package with data, you need to download the EU-MOHP data set from the data hosting platform HydroShare, in the latest or the required version. After the download, the compressed .7z files must be extracted and stored in the same directory.

Load the package

library(here)
#> here() starts at D:/Data/github/eumohpclipr
library(eumohpclipr)

Clipping and Subsetting

Define the directory where the EU-MOHP GeoTIFF (.tif) files are stored.

eumohp_directory <- here::here(
  "..",
  "macro_mohp_feature_test",
  "macro_mohp_feature",
  "output_data"
)

On my local computer, this directory contains all the extracted files downloaded as described above. Change it to the corresponding directory on your machine.
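Before clipping, it can help to confirm that the directory really contains GeoTIFF files. The following is a minimal sketch in base R. The helper count_tif_files() is hypothetical and not part of eumohpclipr, and a temporary directory with empty placeholder files (with made-up names) stands in for the real data directory.

```r
# Hypothetical helper (not part of eumohpclipr): count the .tif files
# in a directory and stop early if the directory is missing or empty.
count_tif_files <- function(directory) {
  if (!dir.exists(directory)) {
    stop("Directory does not exist: ", directory)
  }
  tif_files <- list.files(directory, pattern = "\\.tif$", ignore.case = TRUE)
  if (length(tif_files) == 0) {
    stop("No .tif files found in: ", directory)
  }
  length(tif_files)
}

# Demonstration with a temporary directory and empty placeholder files.
# The file names are made up and do not follow the EU-MOHP naming scheme.
demo_directory <- file.path(tempdir(), "eumohp_demo")
dir.create(demo_directory, showWarnings = FALSE)
invisible(file.create(file.path(demo_directory, c("a.tif", "b.tif", "notes.txt"))))

n_tif <- count_tif_files(demo_directory)
```

With the real data, the same check would be run as count_tif_files(eumohp_directory).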

Specifying the spatial extent of the clipped result via the argument: countries

eumohp_clipped_countries <- eumohp_clip(
  directory_input = eumohp_directory,
  countries = c("germany", "denmark"),
  buffer = 1E4,
  hydrologic_order = 1:4,
  abbreviation_measure = c("dsd", "lp"),
  eumohp_version = "v013.1.1"
)

The resulting object eumohp_clipped_countries holds a list of clipped and subsetted stars proxy objects. This list can later be passed to the functions eumohp_plot() or eumohp_write().

We can have a look at the length of the list eumohp_clipped_countries.

eumohp_clipped_countries |> length()
#> [1] 8

In this case, eumohp_clipped_countries contains 8 stars proxy objects because we requested 4 hydrologic orders (hydrologic_order = 1:4) and 2 measures (abbreviation_measure = c("dsd", "lp")). 4 * 2 = 8.
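The count can be reproduced independently of the package by enumerating the requested combinations. This base R sketch only illustrates the arithmetic for this example; it makes no claim about how eumohp_clip() orders or names the list elements internally.

```r
# All combinations of the requested hydrologic orders and measures.
requested_layers <- expand.grid(
  hydrologic_order = 1:4,
  abbreviation_measure = c("dsd", "lp"),
  stringsAsFactors = FALSE
)

# One row per combination, matching the list length in this example.
n_layers <- nrow(requested_layers)
n_layers
#> [1] 8
```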

There are also other options to specify the area of interest.

Specifying the spatial extent of the clipped result via the argument: custom_sf_polygon

eumohp_clipped_customsfpolygon <- eumohp_clip(
  directory_input = eumohp_directory,
  custom_sf_polygon = .test_custom_sf_polygon() |> dplyr::summarise(),
  buffer = 1E4,
  hydrologic_order = 1:4,
  abbreviation_measure = c("dsd", "lp"),
  eumohp_version = "v013.1.1"
)

Specifying the spatial extent of the clipped result via the argument: region_name_spatcov

eumohp_clipped_regionnamespatcov <- eumohp_clip(
  directory_input = eumohp_directory,
  region_name_spatcov = c("france", "turkey", "italy2"),
  hydrologic_order = 1:4,
  abbreviation_measure = c("dsd", "lp"),
  eumohp_version = "v013.1.1"
)

Here, the buffer argument cannot be applied, because using the EU-MOHP raster files directly to set the spatial extent already covers their maximum extent.

Plotting

You can plot the clipped and subsetted data with eumohp_plot().

eumohp_clipped_countries |> 
  eumohp_plot(downsample = 50)
#> Warning: Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).

You don’t have to provide the downsample argument, as it has a default value. But if your area of interest is quite large, a higher value for this argument reduces the time to plot.

Analogously, for the second example:

eumohp_clipped_customsfpolygon |> 
  eumohp_plot(downsample = 1)
#> Warning: Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).

Analogously, for the third example:

eumohp_clipped_regionnamespatcov |> 
  eumohp_plot(downsample = 10)
#> Warning: Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).

Writing Results to Disk

In terms of run time and memory, writing the data is the critical step and can be very expensive. It is therefore recommended to run it in parallel mode on a computer that has sufficient memory and can be left running for a few hours or days.

Write the data in sequential mode (not recommended)

eumohp_clipped_countries |>
  eumohp_write(directory_output = here("..", "output_test"))

Write the data in parallel mode (recommended)

future::plan(future::multisession, 
             workers = ceiling(length(eumohp_clipped_countries) / 3))

eumohp_clipped_countries |>
  eumohp_write(directory_output = here("..", "output_test"),
               parallel = TRUE)
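The number of workers in the call above follows from the length of the list. As a small worked example in base R, assuming the list of 8 stars proxy objects from the countries example:

```r
# Number of stars proxy objects in the countries example.
n_objects <- 8

# Same rule as in the plan() call: roughly one worker per three objects.
n_workers <- ceiling(n_objects / 3)
n_workers
#> [1] 3
```

Once writing has finished, future::plan(future::sequential) switches back to sequential processing.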

Citation

citation("eumohpclipr")
#> 
#> To cite package 'eumohpclipr' in publications use:
#> 
#>   Maximilian Nölscher (2022). eumohpclipr: Clipping the EU-MOHP data
#>   set to a selected country. R package version 0.0.0.9000.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {eumohpclipr: Clipping the EU-MOHP data set to a selected country},
#>     author = {Maximilian Nölscher},
#>     year = {2022},
#>     note = {R package version 0.0.0.9000},
#>   }
