image-rs / imageproc
Image processing operations
License: MIT License
This might help: http://members.chello.at/~easyfilter/bresenham.html
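For reference, the integer-only error-accumulating line variant described on that page can be sketched in a few lines. This is a rough translation for illustration, not imageproc's implementation:

```rust
/// Integer Bresenham line: returns the pixel coordinates on the line
/// from (x0, y0) to (x1, y1), inclusive of both endpoints.
fn bresenham_line(mut x0: i32, mut y0: i32, x1: i32, y1: i32) -> Vec<(i32, i32)> {
    let dx = (x1 - x0).abs();
    let dy = -(y1 - y0).abs();
    let sx = if x0 < x1 { 1 } else { -1 };
    let sy = if y0 < y1 { 1 } else { -1 };
    let mut err = dx + dy;
    let mut points = Vec::new();
    loop {
        points.push((x0, y0));
        if x0 == x1 && y0 == y1 {
            break;
        }
        let e2 = 2 * err;
        if e2 >= dy {
            err += dy;
            x0 += sx; // step in x
        }
        if e2 <= dx {
            err += dx;
            y0 += sy; // step in y
        }
    }
    points
}
```

The same page extends this error-term scheme to circles, ellipses and Béziers, which is why it is a useful starting point here.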
When this library is sufficiently well-featured and stable we should remove the overlap between it and the image crate. Everything in image::imageops looks like it naturally belongs here, but before doing that we need to:
Check that the relevant modules are complete.
The current implementation is very naive and slow. Implement the block-based version from https://www.vision.ee.ethz.ch/en/publications/papers/proceedings/eth_biwi_00446.pdf (or at least something based on that - their pseudo-code looks to be buggy).
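Short of the paper's full block-based scheme, even a separable running-sum implementation brings the cost down to O(1) per pixel per dimension. A 1-D sketch, illustrative only, with clamp-to-edge padding and truncating integer division:

```rust
/// 1-D box filter using a running window sum: O(1) per sample however
/// large the radius. A separable 2-D box filter would run this over
/// rows, then columns. Not the paper's block-based algorithm.
fn box_filter_1d(src: &[u8], radius: usize) -> Vec<u8> {
    if src.is_empty() {
        return Vec::new();
    }
    let n = src.len() as isize;
    let window = (2 * radius + 1) as u32;
    let r = radius as isize;
    // Clamp-to-edge padding: out-of-range indices read the nearest edge sample.
    let at = |i: isize| src[i.clamp(0, n - 1) as usize] as u32;
    // Sum of the window centred on index 0.
    let mut sum: u32 = (-r..=r).map(|i| at(i)).sum();
    let mut out = Vec::with_capacity(src.len());
    for i in 0..n {
        out.push((sum / window) as u8);
        // Slide the window one sample to the right.
        sum += at(i + r + 1);
        sum -= at(i - r);
    }
    out
}
```

Note the truncating division; a real implementation might want rounding instead.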
Hello @theotherphil,
I am interested in helping with this library, as I want to work on a rust based computer vision project, but I'm curious to know what license the library intends to use?
Thank you,
Aaron
It would be nice to have Bézier curve support (and maybe some other curve variants) in the imageproc::drawing module. I'd like to take a stab at implementing this.
A naive implementation would be extremely slow, so we should probably implement something like this: http://www.gipsa-lab.grenoble-inp.fr/~laurent.condat/publis/condat_resreport_NLmeansv3.pdf.
We don't have to make this polished, just less rough than the code is now.
Quite a few image operations are amenable to property-based testing. Write utils to allow:
The latter should also catch any issues with combinations of image/kernel/whatever dimensions that aren't handled by a smattering of hand-written test cases.
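A minimal sketch of what such a util might look like, using a hand-rolled xorshift generator rather than a real property-testing crate like quickcheck (all names here are hypothetical):

```rust
/// Horizontal flip of a row-major grayscale buffer; the operation under test.
fn flip_horizontal(image: &[u8], width: usize) -> Vec<u8> {
    image
        .chunks(width)
        .flat_map(|row| row.iter().rev().copied())
        .collect()
}

/// Tiny xorshift PRNG so the sketch needs no external crates.
fn xorshift(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Check a property over many randomly sized, randomly filled images.
/// Returns true iff the property held for every generated input.
fn holds_for_random_images(prop: impl Fn(&[u8], usize) -> bool) -> bool {
    let mut state = 0x9e3779b97f4a7c15;
    (0..100).all(|_| {
        let width = (xorshift(&mut state) % 16 + 1) as usize;
        let height = (xorshift(&mut state) % 16 + 1) as usize;
        let image: Vec<u8> = (0..width * height)
            .map(|_| xorshift(&mut state) as u8)
            .collect();
        prop(&image, width)
    })
}
```

Randomising both dimensions is what catches the odd/even and degenerate-size cases that hand-written tests tend to miss.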
The original definition of Haar-like filters requires that all regions of a single filter have the same dimensions. However, ours allow each region to vary in size.
Fix this.
Original paper: http://wearables.cc.gatech.edu/paper_of_week/viola01rapid.pdf
Might also want to add support for the extended set defined here: https://pdfs.semanticscholar.org/72e0/8cf12730135c5ccd7234036e04536218b6c1.pdf
I've moved support for the extended feature set to a new issue: #185
Helpful for sanity checking that we've got it right this time: https://stackoverflow.com/questions/1707620/viola-jones-face-detection-claims-180k-features
Should include at least Gaussian, shot and salt-and-pepper noise.
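A sketch of what the salt-and-pepper variant might look like; the function name and signature are illustrative, not a proposed API:

```rust
/// Salt-and-pepper noise: each pixel is independently replaced by 0
/// ("pepper") or 255 ("salt") with total probability `rate`.
/// `seed` makes the output reproducible.
fn salt_and_pepper(image: &[u8], rate: f64, seed: u64) -> Vec<u8> {
    // Tiny xorshift PRNG so the sketch has no dependencies; a real
    // version would take an rng from the rand crate instead.
    let mut state = seed.max(1); // xorshift state must be non-zero
    let mut next_f64 = move || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        state as f64 / u64::MAX as f64
    };
    image
        .iter()
        .map(|&p| {
            let u = next_f64();
            if u < rate / 2.0 {
                0 // pepper
            } else if u < rate {
                255 // salt
            } else {
                p // unchanged
            }
        })
        .collect()
}
```

Gaussian noise would follow the same shape, adding a normal sample per channel and clamping back to [0, 255].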
Could you add some additional lines about Project Goals / Non-Goals to the README.me? Maybe where some help is requested too?
There are currently quite a few places, both here and in the image library, where we do clunky things to get at, operate on, and assign subpixels of a given Pixel type.
We need at least:
Try to replace all current iterations over pixel channels with something nicer.
All the current filter operations treat pixels outside the input image as if we'd extended the boundary pixels indefinitely (i.e. padding "by continuity"). Affine transformations handle pixels whose pre-image is outside the input image by setting them to a user-provided default value.
The former is possibly limiting, and the latter is a bit clunky as you always have to provide a default even when you don't really care.
Come up with a sensible policy for handling boundary conditions and document it. Do we need to allow filter users to extend with zero/by symmetry/some other method?
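One possible shape for such a policy, sketched as an enum plus a bounds-aware accessor (the names are hypothetical, not existing imageproc types):

```rust
/// How to produce a value for coordinates outside the image.
#[derive(Clone, Copy)]
enum BorderMode {
    /// Repeat the nearest edge pixel ("by continuity", the current filter behaviour).
    Clamp,
    /// Treat everything outside the image as a fixed value.
    Constant(u8),
    /// Mirror the image across its edges.
    Reflect,
}

/// Fetch a pixel from a row-major width x height grayscale buffer,
/// applying the border policy when (x, y) falls outside the image.
fn pixel_at(image: &[u8], width: i64, height: i64, x: i64, y: i64, mode: BorderMode) -> u8 {
    if x >= 0 && x < width && y >= 0 && y < height {
        return image[(y * width + x) as usize];
    }
    match mode {
        BorderMode::Constant(v) => v,
        BorderMode::Clamp => {
            image[(y.clamp(0, height - 1) * width + x.clamp(0, width - 1)) as usize]
        }
        BorderMode::Reflect => {
            // Reflect across the edge: -1 -> 0, -2 -> 1, len -> len - 1, ...
            let reflect = |v: i64, len: i64| {
                if v < 0 {
                    -v - 1
                } else if v >= len {
                    2 * len - v - 1
                } else {
                    v
                }
            };
            image[(reflect(y, height) * width + reflect(x, width)) as usize]
        }
    }
}
```

Filters could then take an optional BorderMode defaulting to Clamp, and affine transforms could stop demanding a default value the caller doesn't care about.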
It's currently pretty poor.
Benchmarking:
Profiling and improving existing functions:
Idioms/best practices:
See https://mht.technology/post/content-aware-resize/
filter3x3 already supports this, so we'd just need to decide how to combine the per-channel magnitudes. It's not obvious what the correct approach is here.
Every operation that can be reasonably implemented in place should be. A non-mutating wrapper should be provided for each of these. Document this convention clearly.
Avoiding mutation is nice, but if users want to create pipelines of operations we shouldn't force them to allocate a potentially massive intermediate for every step.
e.g.
fn snargle_mut(image: &mut I, params: Params) {
// update image in place
}
fn snargle(image: &I, params: Params) -> I {
    let mut out = copy(image);
    snargle_mut(&mut out, params);
    out
}
Optional extra: write a macro to automatically generate the non-mutating version from the mutating one.
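A rough sketch of such a macro, specialised to Vec&lt;u8&gt; buffers to stay self-contained (a real version would be generic over image and parameter types):

```rust
/// Generate a non-mutating wrapper around an in-place operation.
macro_rules! non_mutating {
    ($name:ident, $name_mut:ident, $params:ty) => {
        fn $name(image: &Vec<u8>, params: $params) -> Vec<u8> {
            let mut out = image.clone();
            $name_mut(&mut out, params);
            out
        }
    };
}

/// Example in-place operation: invert every pixel.
fn invert_mut(image: &mut Vec<u8>, _params: ()) {
    for p in image.iter_mut() {
        *p = 255 - *p;
    }
}

// Expands to `fn invert(image: &Vec<u8>, params: ()) -> Vec<u8> { ... }`.
non_mutating!(invert, invert_mut, ());
```

This keeps the "create copy, mutate copy, return copy" convention in one place instead of hand-writing it per function.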
Hello!
There are some unreleased features I'd like to make use of. Could you cut a new version and push to crates.io please? :)
Thanks,
Louis
Looks like there's an entire source file set aside for building rectangles, except that there's no accessible constructor in it for either Rect or RectPosition. Since each is built from the other, that means you can't actually construct either one.
I may have missed something, of course.
Is there a way to build a rectangle? If not, is this intended?
I think hog::cell_histograms does not take HogOptions.signed into account: the gradient orientation d is not normalized to the [0, pi) interval when we compute unsigned gradients. When we subsequently compute the orientation bin indices from d / interval, the assumption of Interpolation::from_position_wrapping (that the left index is within bounds) is violated. This resulted in an 'index out of bounds' panic when I tried to compute the HOG descriptor of an image with unsigned gradients: I specified 9 unsigned gradient orientation bins, but the indices in o_inter were {17, 0} when this happened.
I think this can be fixed by inserting something like
if !spec.options.signed && d >= f32::consts::PI {
d = d - f32::consts::PI;
}
before we calculate o_inter. The = in >= is important to avoid the index out of bounds error, by ensuring that d ends up strictly less than pi.
The ae.utils.graphics D library uses a neat trick: several image operations just return a new View (an immutable handle to an image, expressed using template constraints, roughly an immutable variant of the GenericImage trait) whose pixel accessors read a pixel of the input image and apply an appropriate transformation to it.
e.g.
/// Return a view which applies a predicate over the
/// underlying view's pixel colors.
template colorMap(alias pred)
{
    alias fun = unaryFun!(pred, false, "c");

    auto colorMap(V)(auto ref V src)
        if (isView!V)
    {
        alias OLDCOLOR = ViewColor!V;
        alias NEWCOLOR = typeof(fun(OLDCOLOR.init));

        struct Map
        {
            V src;

            @property int w() { return src.w; }
            @property int h() { return src.h; }

            /*auto ref*/ NEWCOLOR opIndex(int x, int y)
            {
                return fun(src[x, y]);
            }
        }

        return Map(src);
    }
}
Rotation methods do something similar by just returning input_image[warp(pixel_coordinates)]. I'm sure there are dozens of Haskell libraries that perform similar tricks.
Is it possible to support something similar here? Would we even want to? There are potentially nice performance advantages to be had. One example using ae.utils.graphics shows LLVM optimising four 1/4 turns into a no-op. Maybe real-world operations would be less susceptible to nice optimisation, but if we only use a subset of the output pixels this could still be good for performance. We'd have to think about how to make latent images play nicely with manifest images, and when/how to dispatch differently based on whether we have one or the other.
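A rough Rust translation of the colorMap trick above, assuming a hypothetical View trait (not the GenericImage trait from the image crate):

```rust
/// A hypothetical read-only view trait, loosely an immutable variant
/// of the image crate's GenericImage.
trait View {
    type Pixel;
    fn width(&self) -> u32;
    fn height(&self) -> u32;
    fn get(&self, x: u32, y: u32) -> Self::Pixel;
}

/// Lazy per-pixel map: applies `f` on access, allocating nothing.
struct Map<V, F> {
    src: V,
    f: F,
}

impl<V: View, P, F: Fn(V::Pixel) -> P> View for Map<V, F> {
    type Pixel = P;
    fn width(&self) -> u32 { self.src.width() }
    fn height(&self) -> u32 { self.src.height() }
    fn get(&self, x: u32, y: u32) -> P { (self.f)(self.src.get(x, y)) }
}

/// A concrete backing buffer so the sketch is self-contained.
struct Buffer { w: u32, h: u32, data: Vec<u8> }

impl View for Buffer {
    type Pixel = u8;
    fn width(&self) -> u32 { self.w }
    fn height(&self) -> u32 { self.h }
    fn get(&self, x: u32, y: u32) -> u8 { self.data[(y * self.w + x) as usize] }
}
```

Composing two Maps is just nesting the structs, and since everything is statically dispatched the optimiser has a chance to fuse the accessors.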
Alternatively, but vastly larger in scope, we could support something more like halide, where the user specifies their transformations as pure function on images viewed as infinite grids, and separately specifies boundaries, chunking, parallelism, and laziness via a schedule. There's probably no nice middle ground for that one though - it seems like there's a pretty stark contrast between writing concrete implementations by hand and a fully-fledged DSL that generates awesome code for you.
Might be as easy as just sticking the [bench] tests behind a feature flag.
The results returned by imageproc::contrast::otsu_level are not at all consistent with those from the Python library skimage (see skimage.filters.threshold_otsu). E.g. when given one of my sample images as input, the imageproc function returned an Otsu level of 86 whereas the skimage function returned 171. With a value range of 0-255 this is a huge difference. The results also differ (although much less so) for the sample image used in the Rust test case.
I tried to reimplement the code given on this page in Rust (see below) and ended up with the same result as the Python library, which makes me wonder whether there's something wrong with the imageproc implementation.
For reference, here's my translation into Rust of that Java code:
// Based on http://www.labbookpages.co.uk/software/imgProc/otsuThreshold.html#java
pub fn otsu_level(image: &GrayImage) -> u8 {
    // Calculate the grayscale histogram.
    let hist = histogram(image); // from imageproc
    // Total number of pixels.
    let total = image.width() * image.height();
    // Sum of t * hist[t] over all intensity levels t.
    let sum = hist
        .iter()
        .enumerate()
        .fold(0f64, |sum, (t, h)| sum + (t as u32 * h) as f64);
    let mut sum_b = 0f64;
    let mut w_b = 0u32;
    let mut var_max = 0f64;
    let mut threshold = 0u8;
    for (t, h) in hist.iter().enumerate() {
        // Background weight.
        w_b += h;
        if w_b == 0 {
            continue;
        }
        // Foreground weight.
        let w_f = total - w_b;
        if w_f == 0 {
            break;
        }
        sum_b += (t as u32 * h) as f64;
        // Background and foreground means.
        let m_b = sum_b / w_b as f64;
        let m_f = (sum - sum_b) / w_f as f64;
        // Between-class variance.
        let var_between = w_b as f64 * w_f as f64 * (m_b - m_f).powi(2);
        // Keep the threshold that maximises it.
        if var_between > var_max {
            var_max = var_between;
            threshold = t as u8;
        }
    }
    threshold
}
I noticed a small inconsistency between two related function signatures when I was using this library. Of course, changing the signature would break anyone using these functions, and I'm not sure what the policy would be regarding that.
imageproc::drawing::draw_antialiased_line_segment takes tuples of type i32, but imageproc::drawing::draw_line_segment takes tuples of type f32. It would seem appropriate for both to take f32 tuples.
As an aside, I'm new to Rust, and I'm interested in contributing to this library. I'd like to start by adding some more testing, as well as some more example code, since those are mentioned in the README.
Add more preconditions to check that inputs are valid (mainly dimension checks). Maybe write a few helpers. Document best practices.
Add binary feature matching using the hamming distance. The hamming crate may be useful.
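A brute-force sketch of what this might look like, with descriptors stored as u64 words; the hamming crate could replace hamming_distance with a faster implementation:

```rust
/// Hamming distance between two binary descriptors of equal length,
/// i.e. the number of differing bits.
fn hamming_distance(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// For each query descriptor, return (index of nearest train
/// descriptor, distance) by exhaustive search.
fn match_binary(queries: &[Vec<u64>], train: &[Vec<u64>]) -> Vec<(usize, u32)> {
    queries
        .iter()
        .map(|q| {
            train
                .iter()
                .enumerate()
                .map(|(i, t)| (i, hamming_distance(q, t)))
                .min_by_key(|&(_, d)| d)
                .expect("train set must be non-empty")
        })
        .collect()
}
```

A real matcher would likely also apply a ratio test or distance threshold to reject ambiguous matches.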
Functions and methods that take a matrix represented as a slice, such as filter3x3 and Kernel::new, are currently missing documentation on how the data should be arranged. Should the kernel argument for filter3x3 be [a11, a12, a13, a21, a22, ..] (row-major order) or [a11, a21, a31, a12, ..] (column-major order)?
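If the answer is row-major, which matches how image buffers themselves are laid out, a doc example along these lines would settle it (a hypothetical helper for illustration, not part of the API):

```rust
/// Row-major layout: element (row, col) of a `width`-column kernel
/// lives at index `row * width + col`.
fn kernel_at(kernel: &[f32], width: usize, row: usize, col: usize) -> f32 {
    kernel[row * width + col]
}
```

Whichever convention is chosen, stating it once in the filter3x3 and Kernel::new docs would remove the guesswork.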
Cheers :)
Hello @theotherphil,
I noticed that when compiling the project, Git was picking up a number of unneeded files. So I found a Rust ignore file, https://github.com/github/gitignore/blob/master/Rust.gitignore, which is shown below. I think it would be a good start that we could expand on as needed.
# Compiled files
*.o
*.so
*.rlib
*.dll
# Executables
*.exe
# Generated by Cargo
/target/
# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
# More information here http://doc.crates.io/guide.html#cargotoml-vs-cargolock
Cargo.lock
Thoughts?
They're currently just f32s with no indication of scale (0 to 1? 0 to 255?).
cc @archer884 as I spotted this from a comment on his code that uses this function!
The Canny algorithm 'traces' edges in an image, starting with edges >= high_threshold and continuing to trace while the edge intensity is >= low_threshold.
The edge intensity is sqrt(dx^2 + dy^2), where dx and dy are defined using Sobel operators. The thresholds should be roughly the same scale as the pixel intensities (0-255). When documenting this we'll need to add some examples and clearly document the minimum and maximum possible values.
It would be interesting to look at ways to draw TTF fonts (or just any text) onto supported images. Let me know what you think!
Take a look at https://github.com/dylanede/rusttype as well
Lots of the functions in this library are embarrassingly parallel. How should we parallelise them? Ideally with as little impact on the function bodies as possible. Image-chunk-iterators plus https://github.com/nikomatsakis/rayon?
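For comparison, here's what the image-chunk approach looks like with plain std scoped threads, splitting the buffer into bands of whole rows. A sketch only: rayon's par_chunks_mut would express the same thing without the manual band arithmetic.

```rust
use std::thread;

/// Apply `f` to every row of a row-major grayscale buffer, processing
/// bands of rows on separate threads.
fn parallel_map_rows(image: &mut [u8], width: usize, f: impl Fn(&mut [u8]) + Sync) {
    if image.is_empty() {
        return;
    }
    let threads = 4;
    let height = image.len() / width;
    let rows_per_band = (height + threads - 1) / threads;
    let f = &f; // shared reference so every thread can call it
    thread::scope(|s| {
        // chunks_mut hands each thread a disjoint &mut band, so no locking is needed.
        for band in image.chunks_mut(rows_per_band * width) {
            s.spawn(move || {
                for row in band.chunks_mut(width) {
                    f(row);
                }
            });
        }
    });
}
```

Per-pixel operations fit this pattern directly; filters would additionally need overlapping reads at band boundaries, which is where it gets less tidy and a library like rayon starts to pay off.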
@softprops was looking for something like ImageMagick's TintImage function:
https://github.com/ImageMagick/ImageMagick/blob/master/MagickCore/fx.c.
Visual effects seem like something that should eventually go in a separate crate, but for now it should be pretty easy to add a module containing some ImageMagick-y effects.
Add BRIEF feature descriptor extraction, as described in Calonder10.
To avoid the boilerplate of writing the same "create copy, mutate copy, return copy" code every time.
The following causes an arithmetic overflow:
fn chessboard(width: u32, height: u32) -> GrayImage {
    ImageBuffer::from_fn(width, height, |x, y| {
        if (x + y) % 2 == 0 { Luma([255u8]) } else { Luma([0u8]) }
    })
}
#[test]
fn test_connected_components_eight_chessboard() {
let image = chessboard(30, 30);
let components = connected_components(&image, Eight);
let max_component = components.pixels().map(|p| p[0]).max();
assert_eq!(max_component, Some(1u32));
}
Hello,
I'm working on some computer vision problems, and I plan on switching my current MATLAB prototypes over to Rust in the coming months. As such, I would love to make use of the full-fledged image processing crate that doesn't exist yet...
I realize you're in the early stages, but as soon as you have a proposed API and a list of functions/algorithms you'd like implemented, I'd be happy to help. There will be things that I'll probably write myself sooner rather than later which could have a home here, e.g. an implementation of Otsu's method for thresholding.
Also, what is the proposed scope of the library? I'm interested in frequency-space filters, image features like SIFT/SURF, image registration, edge detection, etc.