mapbox / rio-cloudmask

Rasterio plugin for identifying clouds in multi-spectral satellite imagery

License: MIT License

Python 100.00%
satellite imagery rasterio

rio-cloudmask's Introduction

rio-cloudmask

Rasterio plugin for identifying clouds in multi-spectral satellite imagery.

This project is based largely on the research by Zhu and Woodcock, as well as the subsequent fmask and cfmask software implementations.

Why build our own? The CFmask software produces excellent results but is designed to be part of a larger USGS processing framework, thus bringing with it some implementation overhead and assumptions that prevent easy integration with other systems. In short, we need a pip installable, numpy-based tool that works with GDAL raster formats and integrates well with Rasterio data processing pipelines.

Example

Given this input data from Landsat 8 (LC80130312015295LGN00)

[image: rgb]

Assuming we've already derived Top of Atmosphere (TOA) reflectance and brightness temperatures using rio-toa, we can use those to create a uint8 mask suitable for use as an alpha band in an RGBA image:

rio cloudmask LC8*_B[2-7]_toa.tif LC8*_B9_toa.tif LC8*_B10_toa.tif -o test.tif

[image: mask]

Status

The first iteration of the cloudmask algorithm implements the potential cloud layer.

Still to do...

  • cloud shadow and snow detection (section 3.1 in Zhu, Woodcock 2012 with subsequent changes from 2015 paper) for Landsat 8.

  • Landsat 4-7 (TM/ETM+) sensors lack the cirrus band, which is a critical component of high-quality cloud masks. However, the algorithm could be adjusted in the future by optionally omitting the cirrus tests.

  • Sentinel 2 does not include a thermal band, which is heavily used by this implementation. In the future, we may adjust the algorithm (per Zhu, Woodcock 2015, section 2.2.2) to account for this and allow for use with Sentinel 2 data.

  • The object-based cloud and shadow matching (per Zhu, Woodcock 2012, section 3.2) may be implemented at a later time if needed.

See also

Another Python implementation: Python Fmask

rio-cloudmask's Issues

How to handle potential snow

The output of rio-cloudmask is somewhat more opinionated than the original algorithm in that we must boil the entire analysis down to a binary mask. Currently we just ignore the potential snow layer. Should we do something with it?
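
One way to frame the question is to make the snow layer an explicit option at the point where the layers are collapsed into the binary mask. The following is only a sketch under stated assumptions, not the project's code: `to_alpha`, `psl` and `mask_snow` are hypothetical names, and the convention that 0 means masked and 255 means clear is assumed from the README's description of the output as an alpha band.

```python
import numpy as np

def to_alpha(pcl, pcsl, psl=None, mask_snow=False):
    """Collapse boolean layers into a uint8 alpha band.

    Assumed convention: 0 = masked (transparent), 255 = clear (opaque).
    pcl / pcsl are the potential cloud and cloud shadow layers;
    psl (hypothetical) is the potential snow layer.
    """
    masked = np.asarray(pcl, dtype=bool) | np.asarray(pcsl, dtype=bool)
    if mask_snow and psl is not None:
        # Only fold snow into the mask when the caller asks for it
        masked = masked | np.asarray(psl, dtype=bool)
    return np.where(masked, 0, 255).astype(np.uint8)
```

With `mask_snow=False` the snow layer is ignored, which matches the current behaviour described above.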

Reduce memory footprint

There are a few global operations (scene-wide quantiles, others?) that preclude a simplistic block iteration approach. But we should be able to trade a bit of speed, maybe by doing two passes, in order to reduce the memory footprint, which can exceed 5 GB. We should profile memory usage and think of some ways to reduce it.
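
As a rough illustration of the two-pass idea, a scene-wide quantile can be approximated in a first pass by accumulating a fixed-bin histogram tile by tile, so that only one tile is in memory at a time; a second pass could then apply the resulting threshold per tile. This is a sketch, not the project's implementation: `iter_windows`, `blockwise_quantile` and the `read_block` callable are hypothetical names, and the value range and bin count are assumptions that bound the accuracy of the result.

```python
import numpy as np

def iter_windows(shape, block=512):
    """Yield (row_slice, col_slice) tiles that cover a 2D array of the given shape."""
    rows, cols = shape
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            yield slice(r, min(r + block, rows)), slice(c, min(c + block, cols))

def blockwise_quantile(read_block, shape, q, value_range=(0.0, 1.0),
                       bins=10000, block=512):
    """Approximate a scene-wide quantile from a histogram accumulated per tile.

    read_block(window) is a hypothetical callable returning the pixels of one
    tile. Values outside value_range are not counted.
    """
    counts = np.zeros(bins, dtype=np.int64)
    for window in iter_windows(shape, block):
        data = np.asarray(read_block(window), dtype=float)
        data = data[np.isfinite(data)]  # ignore NaN/inf pixels
        hist, _ = np.histogram(data, bins=bins, range=value_range)
        counts += hist
    total = counts.sum()
    if total == 0:
        return np.nan  # no valid pixels inside value_range
    cdf = np.cumsum(counts)
    idx = int(np.searchsorted(cdf, q * total))  # first bin where the CDF reaches q
    edges = np.linspace(value_range[0], value_range[1], bins + 1)
    return edges[min(idx, bins - 1) + 1]  # upper edge of that bin

# Usage with an in-memory array standing in for windowed reads from disk
scene = np.random.default_rng(0).random((300, 200))
approx = blockwise_quantile(lambda w: scene[w], scene.shape, 0.825, block=64)
```

The result is accurate to roughly one bin width, so the trade-off is between histogram size and precision rather than between scene size and memory.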

More features?

@perrygeo I prefer your clean framework here a million times over the prohibitive interfaces of the other implementations of FMask. 😄

But are you planning on adding some of the missing features / want contributions in that direction? You mentioned some under https://github.com/mapbox/rio-cloudmask#status

In particular, I need cloud shadow detection and full Sentinel 2 support.

I can offer some, but maybe not all of the work. So if this is high enough on your agenda, we could maybe do this together rather soon and perhaps also get some more current users of python-fmask interested...

What do you say?

Expose kwargs and options for filter sizes

From the cloudmask function:

        # remove cloud shadow outliers
        pcsl = minimum_filter(pcsl, size=(5, 5))

        # grow around the edges
        pcl = maximum_filter(pcl, size=(21, 21))
        pcsl = maximum_filter(pcsl, size=(13, 13))

These should be configurable from the CLI and the cloudmask function. Naming TBD
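
A possible shape for this, sketched with `scipy.ndimage` and the defaults taken from the snippet above. Since the naming is still to be decided, `buffer_masks` and the keyword names `min_filter`, `max_filter` and `shadow_max_filter` are placeholders only.

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def buffer_masks(pcl, pcsl, min_filter=(5, 5), max_filter=(21, 21),
                 shadow_max_filter=(13, 13)):
    """Clean and grow the cloud (pcl) and cloud shadow (pcsl) layers.

    The keyword names are hypothetical; the defaults mirror the
    hard-coded sizes quoted in this issue.
    """
    # remove cloud shadow outliers
    pcsl = minimum_filter(pcsl, size=min_filter)
    # grow around the edges
    pcl = maximum_filter(pcl, size=max_filter)
    pcsl = maximum_filter(pcsl, size=shadow_max_filter)
    return pcl, pcsl

# Usage: a single cloud pixel grown with the default and a smaller window
pcl = np.zeros((50, 50), dtype=np.uint8)
pcl[25, 25] = 1
pcsl = pcl.copy()
grown_default, cleaned_shadow = buffer_masks(pcl, pcsl)
grown_small, _ = buffer_masks(pcl, pcsl, max_filter=(3, 3))
```

A CLI could then map options onto these keywords, but the option names would need to be agreed on first.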

Error in brightness test

Hello,

I found a small typo in the algorithm.

According to the paper, it is band 5 that is used for the brightness test, i.e. swir1 and not nir as coded in rio-cloudmask. This messes up a lot of things over water.

Many thanks for the clean code style!
Thomas

Property-based tests

We should test each function in equations.py using hypothesis to suss out any edge cases.
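
A minimal sketch of what such a test could look like, assuming the `hypothesis` package is installed. The function under test is a local stand-in based on the corrected `variability_prob` quoted in a later issue on this page, not an import from `equations.py`, and the input range of -1 to 1 is an assumption.

```python
import numpy as np
from hypothesis import given, strategies as st

def variability_prob(ndvi, ndsi, whiteness):
    """Stand-in for the function under test (two-step fmax variant)."""
    ndi_max = np.fmax(np.absolute(ndvi), np.absolute(ndsi))
    return 1.0 - np.fmax(ndi_max, whiteness)

# Assumed input domain: finite values between -1 and 1
unit = st.floats(min_value=-1.0, max_value=1.0, allow_nan=False)

@given(ndvi=unit, ndsi=unit, whiteness=unit)
def test_variability_prob_uses_all_inputs(ndvi, ndsi, whiteness):
    result = variability_prob(ndvi, ndsi, whiteness)
    # All three inputs must take part in the maximum
    assert result == 1.0 - max(abs(ndvi), abs(ndsi), whiteness)
    # The maximum is never negative, so the probability never exceeds 1
    assert result <= 1.0
```

Calling `test_variability_prob_uses_all_inputs()` directly, or collecting it with a test runner, lets hypothesis generate the inputs.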

Handle absence of TIRS Band 10

When the TIRS band 10 is all-zero, as is the case with e.g. LC80460282016097LGN00, equations 7 and 8 are filled with NaNs:

    /Users/mperry/env/mapbox35/lib/python3.5/site-packages/numpy/lib/nanfunctions.py:1001: RuntimeWarning: All-NaN slice encountered
      warnings.warn("All-NaN slice encountered", RuntimeWarning)

This leads to the masking of the entire scene, as if there were 100% clouds.

The fmask algorithm is capable of dealing with the absence of a thermal band. We just need to

  • figure out why some seemingly arbitrary scenes have blank thermal bands (is this a known TIRS issue @celoyd?)
  • adjust the implementation to fall back to a non-thermal mode.
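
A first step towards such a fallback could be a guard that detects a blank thermal band before the thermal equations are evaluated. This is a sketch only: `thermal_band_is_blank` is a hypothetical helper, and treating 0 as the nodata value is an assumption that may not hold once the band has been converted to brightness temperature.

```python
import numpy as np

def thermal_band_is_blank(tirs1, nodata=0.0):
    """Return True when no pixel holds a finite value other than `nodata`."""
    tirs1 = np.asarray(tirs1, dtype=float)
    usable = np.isfinite(tirs1) & (tirs1 != nodata)
    return not usable.any()
```

A caller could branch on the result to choose a non-thermal code path; the sketch does not implement that path.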

Fails when there are no clear sky land pixels

Proximate failure: the temp_land function returns a single nan, which prevents tlow, thigh = temp_land(pcps, water, tirs1) from unpacking (two values are expected).

But the bigger question is, what should the algorithm do in this case? It's not well defined in the publication so I'll have to dig into the implementation.
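
The proximate failure alone could be avoided by always returning a pair, as in the sketch below. It does not answer the bigger question of what the algorithm should do. `temp_land_or_nan` is a hypothetical name, the body is not the project's `temp_land`, and the default percentiles (17.5 and 82.5) are assumptions that should be checked against the publication.

```python
import numpy as np

def temp_land_or_nan(pcps, water, tirs1, low=17.5, high=82.5):
    """Return (tlow, thigh) for clear-sky land pixels, or (nan, nan) if none exist."""
    pcps = np.asarray(pcps, dtype=bool)
    water = np.asarray(water, dtype=bool)
    # Clear-sky land: neither a potential cloud pixel nor water
    clearsky_land = ~(pcps | water)
    values = np.asarray(tirs1, dtype=float)[clearsky_land]
    values = values[np.isfinite(values)]
    if values.size == 0:
        # Always return two values so that tuple unpacking keeps working
        return np.nan, np.nan
    tlow, thigh = np.percentile(values, [low, high])
    return float(tlow), float(thigh)
```

Downstream code would still have to decide how to treat a `(nan, nan)` result.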

use of np.fmax with 3 arrays in equations.variability_prob

Hi @perrygeo,
I love all the work you've done and put on github, it has taught me a lot and made my life of working with landsat data easier!

I'm putting your implementation of fmask equations in a class and noticed a possible misuse of numpy.fmax:

    def variability_prob(ndvi, ndsi, whiteness):
        return 1.0 - np.fmax(np.absolute(ndvi), np.absolute(ndsi), whiteness)

np.fmax only compares two arrays at a time (a third positional argument is treated as the out array, not as another input), e.g. in the following code the indices at which c is greatest are ignored:

>>> a
array([[177,  44,  99],
       [198, 108, 206],
       [180,  10,  35]])
>>> b
array([[101,  64,  53],
       [  5, 169,  82],
       [206, 111, 252]])
>>> c
array([[149,  87,  36],
       [210, 111,  83],
       [189, 208, 106]])
>>> np.fmax(a, b, c)
array([[177,  64,  99],
       [198, 169, 206],
       [206, 111, 252]])

This is fixed with

        ndi_max = np.fmax(np.absolute(self.ndvi), np.absolute(self.ndsi))
        f_max =  1.0 - np.fmax(ndi_max, whiteness)
        return f_max

I'm planning on continued work with fmask. If you are open to PRs, I might work on equations.py (and tests) and separate the functions from my fmask class.

Cheers
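
For reference, an equivalent way to take the element-wise maximum over all three arrays in one call is the ufunc's `reduce` method over a stacked array. This is a sketch of an alternative to the two-step fix quoted above, written as a plain function rather than a method.

```python
import numpy as np

def variability_prob(ndvi, ndsi, whiteness):
    """1 minus the element-wise maximum of |ndvi|, |ndsi| and whiteness."""
    stacked = np.stack([np.absolute(ndvi), np.absolute(ndsi), whiteness])
    # fmax.reduce folds the pairwise maximum along the stacking axis
    return 1.0 - np.fmax.reduce(stacked, axis=0)

# The a, b, c arrays from the example in this issue
a = np.array([[177, 44, 99], [198, 108, 206], [180, 10, 35]])
b = np.array([[101, 64, 53], [5, 169, 82], [206, 111, 252]])
c = np.array([[149, 87, 36], [210, 111, 83], [189, 208, 106]])
all_max = np.fmax.reduce(np.stack([a, b, c]), axis=0)
```

Here `all_max` includes the positions where `c` holds the largest value, and none of the input arrays is overwritten.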
