imageproc's People

Contributors

akonneker, archer884, arthmis, at1as, bvssvni, cellfusion, cospectrum, crazymykl, gitter-badger, hepek, huderlem, johntitor, lazareviczoran, mikemoraned, morgane55440, naisuuuu, nelsonjchen, nicokoch, palladinium, paolobarbolini, razrfalcon, selaux, sinhpham, stephanemagnenat, surban, tafia, th3charlie, theotherphil, tianyishi2001, tinou98

imageproc's Issues

Integration with image

When this library is sufficiently well-featured and stable, we should remove the overlap between it and the image crate. Everything in image::imageops looks like it naturally belongs here, but before doing that we need to:

  1. find uses of those operations in other crates (this is hopefully as easy as just using cargo-crusader),
  2. get agreement from the authors of the image crate!
  3. decide how to handle DynamicImage. This currently forwards each of the stable-and-fairly-short list of imageops functions to those on the corresponding GenericImage. It's not sensible to just keep growing this list forever, but the simplest use case of load -> op -> save should remain as simple as possible (a sketch of that flow follows this list).
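For reference, here is a minimal sketch of that load -> op -> save flow, assuming a recent version of the image crate and imageproc's filter::gaussian_blur_f32; the file paths are placeholders:

use image::open;
use imageproc::filter::gaussian_blur_f32;

// Load an image, run one imageproc operation on it, and save the result.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let gray = open("input.png")?.to_luma8();
    let blurred = gaussian_blur_f32(&gray, 2.0);
    blurred.save("output.png")?;
    Ok(())
}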

Add usage examples to docs

Checked modules are complete.

  • affine
  • contrast
  • corners
  • definitions
  • drawing
  • edges
  • filter
  • gradients
  • haar
  • hog
  • hough
  • integral_image
  • local_binary_patterns
  • map
  • math
  • morphology
  • noise
  • pixelops
  • property_testing
  • rect
  • region_labelling
  • stats
  • suppress
  • union_find
  • utils
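For reference, a usage example in one module's docs might look roughly like the sketch below (it uses draw_line_segment_mut from the drawing module; the doc text and assertion are illustrative rather than taken from the current docs):

/// Draws as much of the line segment between start and end as lies inside the image.
///
/// # Examples
/// ```
/// use image::{GrayImage, Luma};
/// use imageproc::drawing::draw_line_segment_mut;
///
/// let mut image = GrayImage::new(10, 10);
/// draw_line_segment_mut(&mut image, (0.0, 0.0), (9.0, 9.0), Luma([255u8]));
///
/// // The start of the segment has been painted white.
/// assert_eq!(image.get_pixel(0, 0), &Luma([255u8]));
/// ```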

License?

Hello @theotherphil,
I am interested in helping with this library, as I want to work on a Rust-based computer vision project. What license does the library intend to use?

Thank you,
Aaron

Add support for drawing curves

It would be nice to have Bézier curve support (and maybe some other curve variants) in the imageproc::drawing module. I'd like to take a stab at implementing this.
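As a rough sketch of the underlying maths (not a proposed API), a quadratic Bézier curve can be evaluated at a fixed number of steps and the resulting points handed to whatever plotting routine the drawing module ends up using, e.g. by joining consecutive points with draw_line_segment_mut:

// Evaluate B(t) = (1 - t)^2 * p0 + 2(1 - t)t * p1 + t^2 * p2 at evenly spaced
// values of t, handing each point to a caller-supplied plotting function.
fn for_each_quadratic_bezier_point<F: FnMut(f32, f32)>(
    p0: (f32, f32),
    p1: (f32, f32),
    p2: (f32, f32),
    steps: u32,
    mut plot: F,
) {
    for i in 0..=steps {
        let t = i as f32 / steps as f32;
        let u = 1.0 - t;
        let x = u * u * p0.0 + 2.0 * u * t * p1.0 + t * t * p2.0;
        let y = u * u * p0.1 + 2.0 * u * t * p1.1 + t * t * p2.1;
        plot(x, y);
    }
}

Cubic curves would work the same way with one more control point and the corresponding Bernstein weights.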

Initial crate release

We don't have to make this polished, just less rough than the code is now.

  • Try to reduce signature bloat a little. Unfortunately I think we're stuck with the I: 'static, I::Pixel: 'static, <I::Pixel as Pixel>::Subpixel: 'static spam for now (but I'd be very happy to be shown wrong); see the sketch after this list.
  • Check the TODOs for anything easily fixable.
  • Profile performance. I've still not looked at this, but I fear the performance is pretty woeful. If we're 2x slower than, say, OpenCV that's fine for now. If we're 100x slower not so much.
  • Make the existing functions nicer to call, probably with a mix of functions taking loads of options and functions providing sensible defaults for some of those options that forward to the verbose ones.
  • Stop spamming everything directly into master. I'll create a fork and move my future commits to go via pull requests.
  • Set up continuous integration. Anyone want to help with this one?
  • Better readme - give usage examples, and briefly discuss future plans and how to contribute.
  • Add a license.
  • Build on stable.
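For context, the signature bloat mentioned above looks roughly like the sketch below: even a trivial generic function ends up repeating the 'static bounds (the function itself is illustrative only):

use image::{GenericImage, GenericImageView, ImageBuffer, Pixel};

// Illustrative only: copies its input, but still needs the repeated 'static bounds.
fn copy_image<I>(image: &I) -> ImageBuffer<I::Pixel, Vec<<I::Pixel as Pixel>::Subpixel>>
where
    I: GenericImage + 'static,
    I::Pixel: 'static,
    <I::Pixel as Pixel>::Subpixel: 'static,
{
    let mut out = ImageBuffer::new(image.width(), image.height());
    for (x, y, p) in image.pixels() {
        out.put_pixel(x, y, p);
    }
    out
}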

Quickcheck

Quite a few image operations are amenable to property-based testing. Write utils to allow:

  1. Easy specification of properties.
  2. Generation of test images of sufficiently small dimensions for some specified set of pixel types.

The latter should also catch any issues with combinations of image/kernel/whatever dimensions that aren't handled by a smattering of hand-written test cases.
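As a sketch of the sort of property this would enable, using the quickcheck crate directly (the property and the deterministic image construction below are placeholders until proper image-generation utils exist):

use image::{imageops, GrayImage, Luma};
use quickcheck::quickcheck;

quickcheck! {
    // Placeholder property: flipping an image horizontally twice is the identity.
    // The image is built deterministically from small generated inputs since we
    // don't yet have Arbitrary instances for image types.
    fn flip_horizontal_twice_is_identity(width: u8, height: u8, seed: u8) -> bool {
        let (w, h) = (u32::from(width % 16 + 1), u32::from(height % 16 + 1));
        let image = GrayImage::from_fn(w, h, |x, y| Luma([(x as u8) ^ (y as u8) ^ seed]));
        let twice = imageops::flip_horizontal(&imageops::flip_horizontal(&image));
        twice.into_raw() == image.into_raw()
    }
}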

Haar filters are incorrect

The original definition of Haar-like filters requires that all regions of a single filter have the same dimensions. However, our implementation allows each region to vary in size.

Fix this.

Original paper: http://wearables.cc.gatech.edu/paper_of_week/viola01rapid.pdf

Might also want to add support for the extended set defined here: https://pdfs.semanticscholar.org/72e0/8cf12730135c5ccd7234036e04536218b6c1.pdf
I've moved support for the extended feature set to a new issue: #185.

Helpful for sanity checking that we've got it right this time: https://stackoverflow.com/questions/1707620/viola-jones-face-detection-claims-180k-features

Project Goals? Non-Goals?

Could you add some lines about project goals and non-goals to the README.md? Maybe also mention where help is most wanted?

Pixel API

There are currently quite a few places, both here and in the image library, where we do clunky things to get at, operate on, and assign subpixels of a given Pixel type.

We need at least:

  • Type level operations on Pixels, e.g. get the equivalent type where all channels are of a given floating point type,
  • A wider set of arithmetic operators available on Pixels (so that we can just add p and q, rather than get all their channel values, add those, and set as channels of a pixel r),
    • A handy set of bounds to express when certain operations are possible, e.g. expressing when filtering by a float-valued kernel is appropriate.

Try to replace all current iterations over pixel channels with something nicer.
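For illustration, channel-wise addition today looks roughly like the sketch below, using Pixel::map2 from the image crate; the goal would be to make something closer to p + q possible:

use image::{Pixel, Rgb};

fn main() {
    let p = Rgb([10u8, 20u8, 30u8]);
    let q = Rgb([1u8, 2u8, 3u8]);

    // Today: spell out the per-channel operation explicitly.
    let r = p.map2(&q, |a, b| a.saturating_add(b));
    assert_eq!(r, Rgb([11u8, 22u8, 33u8]));

    // Desired: something like `let r = p + q;` once suitable arithmetic
    // operators and bounds exist for pixel types.
}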

Boundary conditions

All the current filter operations treat pixels outside the input image as if we'd extended the boundary pixels indefinitely (i.e. padding "by continuity"). Affine transformations handle pixels whose pre-image is outside the input image by setting them to a user-provided default value.

The former is possibly limiting, and the latter is a bit clunky as you always have to provide a default even when you don't really care.

Come up with a sensible policy for handling boundary conditions and document it. Do we need to allow filter users to extend with zero/by symmetry/some other method?
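One possible shape for such a policy, purely as a hypothetical sketch (none of these names exist in the crate):

// Hypothetical boundary-handling policy.
enum BorderMode<P> {
    // Extend the boundary pixels indefinitely (the current filter behaviour).
    Replicate,
    // Treat everything outside the image as a fixed value
    // (what affine transformations currently make the caller supply).
    Constant(P),
    // Mirror the image across its edges.
    Reflect,
    // Treat everything outside the image as zero.
    Zero,
}

Filters and warps could then take a BorderMode argument, with a sensible default so that callers who don't care aren't forced to choose.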

Performance

It's currently pretty poor.

Benchmarking:

  • Make sure we have a sensible suite of benchmarks for all functions (a sketch follows this list).
  • Write a tool to profile vs a selection of standard libraries (vlfeat, opencv, etc.).
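For example, a benchmark using the criterion crate might look like the sketch below (this would live under benches/ with criterion as a dev-dependency; the function and image size are arbitrary choices):

use criterion::{criterion_group, criterion_main, Criterion};
use image::{GrayImage, Luma};
use imageproc::filter::gaussian_blur_f32;

// Benchmark one representative operation on a fixed synthetic image.
fn bench_gaussian_blur(c: &mut Criterion) {
    let image = GrayImage::from_fn(200, 200, |x, y| Luma([(x % 7 + y % 3) as u8]));
    c.bench_function("gaussian_blur_f32 200x200 sigma=3", |b| {
        b.iter(|| gaussian_blur_f32(&image, 3.0))
    });
}

criterion_group!(benches, bench_gaussian_blur);
criterion_main!(benches);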

Profiling and improving existing functions:

  • Do some!
  • Understand the cost of Rust operations/what LLVM can optimise away. For example, as well as being syntactically horrific, something like this looks expensive: (*out.get_pixel_mut(x, y))[0] += out.get_pixel(x, y - 1)[0]; I think that in theory this should all be optimised away, but maybe it's not.
  • Bounds checking, safe vs unsafe. If we're doing something other than just iterating through all pixels in order, then presumably eliding bounds checking is a lot trickier. Do we want to use unsafe everywhere? That seems a bit... unsafe.
  • Is there any benefit in exposing raw scanlines via a DirectImage trait (as opposed to GenericImage, which just allows access to single pixels)?
  • Do we need any more abstractions for iterating over pixels/regions to let us get performance benefits safely?
  • Actually do the work to make all the existing functions reasonably performant.

Idioms/best practices:

  • Write up what we discovered from the profiling. Clearly document which operations are expensive, and how to avoid common pitfalls.

In place vs functional operations

Every operation that can be reasonably implemented in place should be. A non-mutating wrapper should be provided for each of these. Document this convention clearly.

Avoiding mutation is nice, but if users want to create pipelines of operations we shouldn't force them to allocate a potentially massive intermediate for every step.

e.g.

fn snargle_mut<I: GenericImage>(image: &mut I, params: Params) {
    // update image in place
}

fn snargle<I: GenericImage + Clone>(image: &I, params: Params) -> I {
    let mut out = image.clone();
    snargle_mut(&mut out, params);
    out
}

Optional extra: write a macro to automatically generate the non-mutating version from the mutating one.
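A rough sketch of such a macro, specialised to a concrete image type for simplicity (Params, snargle_mut and define_functional_wrapper are placeholders; a fully generic version would need to thread the type parameters through):

use image::GrayImage;

struct Params;

fn snargle_mut(image: &mut GrayImage, params: Params) {
    // update image in place
    let _ = (image, params);
}

// Generate the non-mutating wrapper from the in-place function.
macro_rules! define_functional_wrapper {
    ($name:ident, $name_mut:ident) => {
        fn $name(image: &GrayImage, params: Params) -> GrayImage {
            let mut out = image.clone();
            $name_mut(&mut out, params);
            out
        }
    };
}

define_functional_wrapper!(snargle, snargle_mut);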

No obvious way to create a rectangle

Looks like there's an entire source file set aside for building rectangles, except that there is no accessible constructor in it for either Rect or RectPosition, which means, since they build one another, that you can't actually build either one.

I may have missed something, of course.

Is there a way to build a rectangle? If not, is this intended?

index out of bounds - hog::cell_histograms always computes signed gradient orientation

I think hog::cell_histograms does not take into account HogOptions.signed - the gradient orientation d is not normalized to the [0, pi) interval when we compute unsigned gradients. When we subsequently compute the orientation bin indices from d / interval, the assumption of Interpolation::from_position_wrapping (that the left index is within bounds) will be violated. This resulted in 'index out of bounds' when I tried to compute the HOG descriptor of an image with unsigned gradients: I specified 9 unsigned gradient orientation bins but the indices in o_inter were {17, 0} when this happened.

I think this can be fixed by inserting something like

if !spec.options.signed && d >= f32::consts::PI {
   d = d - f32::consts::PI;
}

before we calculate o_inter. The = in >= is important: it avoids the index out of bounds error by ensuring that d ends up strictly less than pi.

Latent images/lazy image operations

The ae.utils.graphics D library uses a neat trick where several image operations just return a new View (an immutable handle to an image, expressed using template constraints roughly like an immutable variant of the GenericImage trait) whose pixel accessors read a pixel of the input image and then apply an appropriate transformation to it.

e.g.

/// Return a view which applies a predicate over the
/// underlying view's pixel colors.
template colorMap(alias pred)
{
    alias fun = unaryFun!(pred, false, "c");

    auto colorMap(V)(auto ref V src)
        if (isView!V)
    {
        alias OLDCOLOR = ViewColor!V;
        alias NEWCOLOR = typeof(fun(OLDCOLOR.init));

        struct Map
        {
            V src;

            @property int w() { return src.w; }
            @property int h() { return src.h; }

            /*auto ref*/ NEWCOLOR opIndex(int x, int y)
            {
                return fun(src[x, y]);
            }
        }

        return Map(src);
    }
}

Rotation methods do something similar by just returning input_image[warp(pixel_coordinates)]. I'm sure there are dozens of Haskell libraries that perform similar tricks.

Is it possible to support something similar here? Would we even want to? There are potentially nice performance advantages to be had. One example using ae.utils.graphics shows LLVM optimising four 1/4 turns into a no-op. Maybe real-world operations would be less susceptible to nice optimisation, but if we only use a subset of the output pixels this could still be good for performance. We'd have to think about how to make latent images play nicely with manifest images, and when/how to dispatch differently based on whether we have one or the other.
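As a very rough Rust sketch of the idea (the names are illustrative; a real version would presumably implement GenericImageView itself so that views compose and can be consumed by existing functions):

use image::GenericImageView;

// A lazy view that applies a function to each pixel on access,
// rather than eagerly producing a new buffer.
struct MapView<I, F> {
    source: I,
    f: F,
}

impl<I, F> MapView<I, F>
where
    I: GenericImageView,
    F: Fn(I::Pixel) -> I::Pixel,
{
    fn new(source: I, f: F) -> Self {
        MapView { source, f }
    }

    fn dimensions(&self) -> (u32, u32) {
        self.source.dimensions()
    }

    // Pixels are computed on demand; no output buffer is allocated.
    fn get_pixel(&self, x: u32, y: u32) -> I::Pixel {
        (self.f)(self.source.get_pixel(x, y))
    }
}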

Alternatively, but vastly larger in scope, we could support something more like halide, where the user specifies their transformations as pure functions on images viewed as infinite grids, and separately specifies boundaries, chunking, parallelism, and laziness via a schedule. There's probably no nice middle ground for that one though - it seems like there's a pretty stark contrast between writing concrete implementations by hand and a fully-fledged DSL that generates awesome code for you.

Otsu Level implementation incorrect?

The results returned by imageproc::contrast::otsu_level are not at all consistent with those from the python library skimage (see skimage.filters.threshold_otsu). E.g. when given one of my sample images as input, the imageproc function returned an Otsu level of 86, whereas the skimage function returned 171. Given the value range of 0-255, this is a huge difference. The results differ (although much less so) for the sample image used in the Rust code's test case too.

I tried to reimplement the Java code from http://www.labbookpages.co.uk/software/imgProc/otsuThreshold.html#java in Rust (see below) and I ended up with the same result as the python library, which makes me wonder whether there's something wrong with the imageproc implementation.

For reference, here's my translation into Rust of that Java code:

// Based on http://www.labbookpages.co.uk/software/imgProc/otsuThreshold.html#java
pub fn otsu_level(image: &GrayImage) -> u8 {
    // calculate histogram
    let hist = histogram(image);  // from imageproc
    let width = image.width();
    let height = image.height();

    // total number of pixels
    let total = width * height;

    let sum = hist.iter().enumerate()
        .fold(0f64, |sum, (t, h)| sum + (t as u32 * h) as f64);

    let mut sumB = 0f64;
    let mut wB = 0u32;
    let mut wF = 0u32;

    let mut varMax = 0f64;
    let mut threshold = 0u8;

    for (t, h) in hist.iter().enumerate() {
        // weight background
        wB = wB + h;
        if wB == 0 {
            continue
        };

        // weight foreground
        wF = total - wB;
        if wF == 0 { break };

        sumB += (t as u32 * h) as f64;

        // mean background
        let mB = sumB / (wB as f64);

        // mean foreground
        let mF = (sum - sumB) / (wF as f64);

        // calculate between class variance
        let varBetween = (wB as f64) * (wF as f64) * (mB - mF).powi(2);

        // check if new maximum found
        if varBetween > varMax {
            varMax = varBetween;
            threshold = t as u8;
        }
    }
    threshold
}

draw_line_segment() signature inconsistency

I noticed a small inconsistency between two related function signatures when I was using this library. Of course, changing the signature would break anyone using these functions, and I'm not sure what the policy would be regarding that.

imageproc::drawing::draw_antialiased_line_segment takes tuples of type i32, but imageproc::drawing::draw_line_segment takes tuples of type f32.

It would seem appropriate for both to take f32 tuples.

As an aside, I'm new to Rust, and I'm interested in contributing to this library. I'd like to start by adding some more testing, as well as some more example code, since those are mentioned in the README.

Preconditions

Add more preconditions to check that inputs are valid (mainly dimension checks). Maybe write a few helpers. Document best practices.
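As a sketch of the kind of helper this might mean (the name and message are placeholders):

use image::GenericImageView;

// Panic with a clear message if two images don't have identical dimensions.
fn assert_dimensions_match<I, J>(left: &I, right: &J)
where
    I: GenericImageView,
    J: GenericImageView,
{
    assert_eq!(
        left.dimensions(),
        right.dimensions(),
        "image dimensions do not match"
    );
}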

Document row/column-major order

Functions and methods that require passing in a matrix represented as a slice, such as filter3x3 and Kernel::new, are currently missing documentation regarding how the data should be arranged.

Example

Should the kernel argument for filter3x3 be [a11, a12, a13, a21, a22, ..] (row-major order) or [a11, a21, a31, a12, ..] (column-major order)?
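To make the question concrete, here is what the two orderings would look like for the horizontal Sobel kernel (this only illustrates the question; it doesn't assert which convention the library actually uses):

fn main() {
    // The horizontal Sobel kernel as a 3x3 matrix:
    //   -1  0  1
    //   -2  0  2
    //   -1  0  1

    // Row-major: rows are laid out one after another.
    let row_major = [-1i32, 0, 1, -2, 0, 2, -1, 0, 1];

    // Column-major: columns are laid out one after another.
    let column_major = [-1i32, -2, -1, 0, 0, 0, 1, 2, 1];

    let _ = (row_major, column_major);
}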

Cheers :)

Git Ignore File

Hello @theotherphil,
I noticed that when compiling the project, Git was picking up a number of unneeded files. So I found a Rust ignore file, https://github.com/github/gitignore/blob/master/Rust.gitignore, which is shown below. I think it would be a good start that we could expand on as needed.

# Compiled files
*.o
*.so
*.rlib
*.dll

# Executables
*.exe

# Generated by Cargo
/target/

# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
# More information here http://doc.crates.io/guide.html#cargotoml-vs-cargolock
Cargo.lock

Thoughts?

Document canny high and low threshold more clearly

They're currently just f32s with no indication of scale (0 to 1? 0 to 255?).

cc @archer884 as I spotted this from a comment on his code that uses this function!

The canny algorithm 'traces' edges in an image, starting with edges whose intensity is >= high_threshold and continuing to trace while the edge intensity is >= low_threshold.

The edge intensity is sqrt(dx^2 + dy^2), where dx and dy are defined using Sobel operators. The thresholds should be on roughly the same scale as the pixel intensities (0-255). When documenting this we'll need to add some examples and clearly document the min and max possible values.
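A doc example might end up looking something like this sketch, with thresholds chosen on the 0-255 scale (the paths and threshold values are placeholders):

use image::open;
use imageproc::edges::canny;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let gray = open("input.png")?.to_luma8();
    // Keep tracing edges while intensity >= 50.0; only start tracing at edges
    // with intensity >= 100.0.
    let edges = canny(&gray, 50.0, 100.0);
    edges.save("edges.png")?;
    Ok(())
}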

Panic in connected_components

The following causes an arithmetic overflow:

fn chessboard(width: u32, height: u32) -> GrayImage {
    ImageBuffer::from_fn(width, height, |x, y| {
        if (x + y) % 2 == 0 { Luma([255u8]) } else { Luma([0u8]) }
    })
}

#[test]
fn test_connected_components_eight_chessboard() {
    let image = chessboard(30, 30);
    let components = connected_components(&image, Eight);
    let max_component = components.pixels().map(|p| p[0]).max();
    assert_eq!(max_component, Some(1u32));
}

Proposed list of algorithms / package layout

Hello,

I'm working on some computer vision problems, and I plan on switching my current matlab prototypes over to rust in the coming months. As such, I would love to make use of the full-fledged image processing crate that doesn't exist yet...

I realize you're in the early stages, but as soon as you have a proposed API and a list of functions/algorithms you'd like implemented, I'd be happy to help. There will be things that I'll probably write myself sooner rather than later which could have a home here, e.g. an implementation of Otsu's method for thresholding.

Also, what is the proposed scope of the library? I'm interested in frequency-space filters, image features like SIFT/SURF, image registration, edge detection, etc.
