GithubHelp home page GithubHelp logo

hypertidy / controlledburn Goto Github PK

View Code? Open in Web Editor NEW
8.0 2.0 1.0 1.24 MB

don't materialize, just rasterize!

Home Page: https://hypertidy.github.io/controlledburn

License: Other

R 16.77% C++ 79.73% C 3.50%
compression lazy polygons raster

controlledburn's Introduction

controlledburn

R-CMD-check

The goal of controlledburn is to rasterize without materializing any pixel values.

A very fast rasterization algorithm for polygons in fasterize does the following:

  • restructures polygons to edges
  • considers only non-horizontal polygon edges and indexes them in y,x order (slope always down)
  • determines all affected raster rows and scans these for edge start and end column
  • burns polygon value or identity into every pixel between those starts and ends

controlledburn avoids that last step. With these rasterization functions the return value is not a set of pixels or a side-effect of a materialized raster file but simply those row and column start and end indexes.

This is an expression of the cell abstraction wish item fasterize/issues/11.

TODO

  • move to cpp11
  • rasterize lines fasterize/issues/30
  • formats for import (wk, geos, grd, rct, triangles etc.)
  • streaming with wkb/xy unpack with wk
  • provide output options (see next section)
  • port back into fasterize, with options for efficiently writing out to a tiled and sparse GeoTIFF
  • points is too easy, see vaster::cell_from_xy
  • name the package
  • copy logic from fasterize, and remove Armadillo array handling
  • remove use of raster objects, in favour of input extent and dimension
  • remove all trace of the raster package
  • implement return of the ‘yline, xpix’ and polygon ID info to user (see below)
  • make return of ylin,xpix structure efficient (CollectorList.h ftw)

Lines is working but still has some problems. For polygons it’s pretty good, see fasterize #6 for a remaining issue.

Outputs

Internal function burn_polygon() has arguments sf, extent, dimension. The first is a sf polygons data frame, extent is c(xmin, xmax, ymin, ymax) and dimension is c(ncol, nrow). In fasterize you need an actual non-materialized RasterLayer object, but all that was really used for was the six numbers extent, dimension and as the shell for the in-memory output.

The output of burn_polygon() is a list of four-element indexes start,end,row,poly_id - these are zero-based atm because they reflect the underlying C++ code. Examples shown here flatten this to a 4-column matrix (and add 1).

These options are still in play for what the interface/s could do:

  • record presence of polygon (this is all we have currently) OR ID of polygon
  • tools to format this meaningfully, and plot lazily (see example for quick plot)
  • tools to materialize as actual raster data

I also found this real world example for which this is relevant, discussed in PROJ for very fast lookup for large non-materialized (highly compressed) grids by Thomas Knudsen:

OSGeo/PROJ#1461 (comment)

Installation

You can install the development version of controlledburn like so:

remotes::install_github("hypertidy/controlledburn")

Example

This is a basic example, this is fast, and shows that it works.

pols <- silicate::inlandwaters
library(vaster)
#> 
#> Attaching package: 'vaster'
#> The following object is masked from 'package:stats':
#> 
#>     ts
## define a raster (xmin, xmax, ymin, ymax), (ncol, nrow)
ext <- unlist(lapply(silicate::sc_vertex(pols), range))
dm <- c(50, 40)
r <- controlledburn:::burn_polygon(pols, extent = ext,
                               dimension = dm)

## our index is triplets of start,end,line where the polygon edge was detected - 
## this essentially an rle by scanline of start,end polygon coverage
index <- matrix(unlist(r, use.names = F), ncol = 4L, byrow = TRUE) + 1 ## plus one because 0-index internally

## plot just the start and ends of each scanline detected
xy0 <- vaster::xy_from_cell(dm, ext, vaster::cell_from_row_col(dm, index[,c(3, 3)], index[,1:2]))
plot(xy0)
plot(silicate::PATH0(pols), add = TRUE)

## expand out to every cell
  
cr <- do.call(rbind, apply(index, 1,\(.x) cbind(seq(.x[1], .x[2]), .x[3], .x[4])))
xy <- vaster::xy_from_cell(dm, ext, vaster::cell_from_row_col(dm, cr[,2], cr[,1]))
plot(xy, pch = 19, cex = .3, col = palr::d_pal(cr[,3]))
plot(silicate::PATH0(pols), add = TRUE)

It scales to very large tasks, with small output.

dm <- c(500000, 400000)
system.time(r <- controlledburn:::burn_polygon(pols, extent = ext,
                               dimension = dm))
#>    user  system elapsed 
#>   0.913   0.068   0.982
length(r)
#> [1] 989153

## consider a prod(dm) raster of type double (or even bool) obviously would compress again but why churn
pryr::object_size(r)  
#> 71.22 MB

The following is inefficient, but shows that we get the right result.

dm <- c(500, 400)
system.time(r <- controlledburn:::burn_polygon(pols, extent = ext,
                               dimension = dm))
#>    user  system elapsed 
#>   0.002   0.001   0.003

index <- matrix(unlist(r, use.names = F), ncol = 4L, byrow = TRUE) + 1 ## plus one because 0-index internally

## now go inefficient, this is every column,row index, then converted to cell, converted to xy
cr <- do.call(rbind, apply(index, 1, \(.x) cbind(seq(.x[1], .x[2]), .x[3], .x[4])))
xy <- vaster::xy_from_cell(dm, ext, vaster::cell_from_row_col(dm, cr[,2], cr[,1]))
plot(xy, pch = ".", col = palr::d_pal(cr[,3]))

rr <- terra::rast(cbind(xy, 0), type = "xyz")

rr[terra::cellFromXY(rr, xy)] <- 1
terra::plot(rr ,col = "firebrick", asp = NA)
plot(silicate::SC0(pols), add = TRUE)

Code of Conduct

Please note that the controlledburn project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

controlledburn's People

Contributors

mdsumner avatar olivroy avatar richardscottoz avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Forkers

richardscottoz

controlledburn's Issues

lines

this example

l <- sf::st_cast(silicate::inlandwaters, "MULTILINESTRING")
library(vaster)
## define a raster (xmin, xmax, ymin, ymax), (ncol, nrow)
v <- silicate::sc_coord(l)
ext <- unlist(lapply(v, range)) #+ c(-1, 1, -1, 1) * .1

dm <- c(10000, 40000)
#ext <- c(0, dm[1], 0, dm[2])/dm

## less than a second
system.time({
r <- controlledburn:::burn_line(l, extent = ext,dimension = dm)
})
pp <- function(listindex, dimension, extent = NULL) {
  if (is.null(extent)) extent <- c(0, dimension[1], 0, dimension[2])
  idx <- do.call(rbind, listindex) + 1
  vaster::xy_from_cell(dimension, extent, vaster::cell_from_row_col(dimension, idx[,2], idx[,1]))
}
plot(pp(r, dm, ext), pch = ".")


# tr <- terra::rast(terra::ext(ext), ncols = dm[1], nrows = dm[2])
# vct <- terra::vect(l)
# tr <- terra::rasterize(vct, tr)

image

now committed.

  • not correct yet, it's got a hack for the initial y discretization (I've had enough getting confused for the moment)
  • _polygon and _edge versions of functions are now established
  • there is burn_polygon and burn_line unexported functions
  • currently we only allow sf lines for burn_line, but that could accept polygons trivially
  • consider, move to a model of storing cell numbers not yline,xcol ?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.