image-rs / imageproc
Image processing operations
License: MIT License
This might help: http://members.chello.at/~easyfilter/bresenham.html
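For reference, the integer-only error-accumulating line variant described on that page can be sketched in a few lines. This is a rough translation for illustration, not imageproc's implementation:

```rust
/// Integer Bresenham line: returns the pixel coordinates on the line
/// from (x0, y0) to (x1, y1), inclusive of both endpoints.
fn bresenham_line(mut x0: i32, mut y0: i32, x1: i32, y1: i32) -> Vec<(i32, i32)> {
    let dx = (x1 - x0).abs();
    let dy = -(y1 - y0).abs();
    let sx = if x0 < x1 { 1 } else { -1 };
    let sy = if y0 < y1 { 1 } else { -1 };
    let mut err = dx + dy;
    let mut points = Vec::new();
    loop {
        points.push((x0, y0));
        if x0 == x1 && y0 == y1 {
            break;
        }
        let e2 = 2 * err;
        if e2 >= dy {
            err += dy;
            x0 += sx; // step in x
        }
        if e2 <= dx {
            err += dx;
            y0 += sy; // step in y
        }
    }
    points
}
```

The same page extends this error-term scheme to circles, ellipses and Béziers, which is why it is a useful starting point here.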
When this library is sufficiently well-featured and stable we should remove the overlap between it and the image crate. Everything in image::imageops looks like it naturally belongs here, but before doing that we need to:
Check that the relevant modules are complete.
The current implementation is very naive and slow. Implement the block-based version from https://www.vision.ee.ethz.ch/en/publications/papers/proceedings/eth_biwi_00446.pdf (or at least something based on that - their pseudo-code looks to be buggy).
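Short of the paper's full block-based scheme, even a separable running-sum implementation brings the cost down to O(1) per pixel per dimension. A 1-D sketch, illustrative only, with clamp-to-edge padding and truncating integer division:

```rust
/// 1-D box filter using a running window sum: O(1) per sample however
/// large the radius. A separable 2-D box filter would run this over
/// rows, then columns. Not the paper's block-based algorithm.
fn box_filter_1d(src: &[u8], radius: usize) -> Vec<u8> {
    if src.is_empty() {
        return Vec::new();
    }
    let n = src.len() as isize;
    let window = (2 * radius + 1) as u32;
    let r = radius as isize;
    // Clamp-to-edge padding: out-of-range indices read the nearest edge sample.
    let at = |i: isize| src[i.clamp(0, n - 1) as usize] as u32;
    // Sum of the window centred on index 0.
    let mut sum: u32 = (-r..=r).map(|i| at(i)).sum();
    let mut out = Vec::with_capacity(src.len());
    for i in 0..n {
        out.push((sum / window) as u8);
        // Slide the window one sample to the right.
        sum += at(i + r + 1);
        sum -= at(i - r);
    }
    out
}
```

Note the truncating division; a real implementation might want rounding instead.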
Hello @theotherphil,
I am interested in helping with this library, as I want to work on a rust based computer vision project, but I'm curious to know what license the library intends to use?
Thank you,
Aaron
It would be nice to have Bézier curve support (and maybe some other curve variants) in the imageproc::drawing module. I'd like to take a stab at implementing this.
A naive implementation would be extremely slow, so we should probably implement something like this: http://www.gipsa-lab.grenoble-inp.fr/~laurent.condat/publis/condat_resreport_NLmeansv3.pdf.
We don't have to make this polished, just less rough than the code is now.
Quite a few image operations are amenable to property-based testing. Write utils to allow:
The latter should also catch any issues with combinations of image/kernel/whatever dimensions that aren't handled by a smattering of hand-written test cases.
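A minimal sketch of what such a util might look like, using a hand-rolled xorshift generator rather than a real property-testing crate like quickcheck (all names here are hypothetical):

```rust
/// Horizontal flip of a row-major grayscale buffer; the operation under test.
fn flip_horizontal(image: &[u8], width: usize) -> Vec<u8> {
    image
        .chunks(width)
        .flat_map(|row| row.iter().rev().copied())
        .collect()
}

/// Tiny xorshift PRNG so the sketch needs no external crates.
fn xorshift(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Check a property over many randomly sized, randomly filled images.
/// Returns true iff the property held for every generated input.
fn holds_for_random_images(prop: impl Fn(&[u8], usize) -> bool) -> bool {
    let mut state = 0x9e3779b97f4a7c15;
    (0..100).all(|_| {
        let width = (xorshift(&mut state) % 16 + 1) as usize;
        let height = (xorshift(&mut state) % 16 + 1) as usize;
        let image: Vec<u8> = (0..width * height)
            .map(|_| xorshift(&mut state) as u8)
            .collect();
        prop(&image, width)
    })
}
```

Randomising both dimensions is what catches the odd/even and degenerate-size cases that hand-written tests tend to miss.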
The original definition of Haar-like filters requires that all regions of a single filter have the same dimensions. However, ours allow each region to vary in size.
Fix this.
Original paper: http://wearables.cc.gatech.edu/paper_of_week/viola01rapid.pdf
Might also want to add support for the extended set defined here: https://pdfs.semanticscholar.org/72e0/8cf12730135c5ccd7234036e04536218b6c1.pdf
I've moved support for the extended feature set to a new issue: #185
Helpful for sanity checking that we've got it right this time: https://stackoverflow.com/questions/1707620/viola-jones-face-detection-claims-180k-features
Should include at least Gaussian, shot and salt-and-pepper noise.
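A sketch of what the salt-and-pepper variant might look like; the function name and signature are illustrative, not a proposed API:

```rust
/// Salt-and-pepper noise: each pixel is independently replaced by 0
/// ("pepper") or 255 ("salt") with total probability `rate`.
/// `seed` makes the output reproducible.
fn salt_and_pepper(image: &[u8], rate: f64, seed: u64) -> Vec<u8> {
    // Tiny xorshift PRNG so the sketch has no dependencies; a real
    // version would take an rng from the rand crate instead.
    let mut state = seed.max(1); // xorshift state must be non-zero
    let mut next_f64 = move || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        state as f64 / u64::MAX as f64
    };
    image
        .iter()
        .map(|&p| {
            let u = next_f64();
            if u < rate / 2.0 {
                0 // pepper
            } else if u < rate {
                255 // salt
            } else {
                p // unchanged
            }
        })
        .collect()
}
```

Gaussian noise would follow the same shape, adding a normal sample per channel and clamping back to [0, 255].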
Could you add some additional lines about Project Goals / Non-Goals to the README.me? Maybe where some help is requested too?
There are currently quite a few places, both here and in the image library, where we do clunky things to get at, operate on, and assign subpixels of a given Pixel type.
We need at least:
Try to replace all current iterations over pixel channels with something nicer.
All the current filter operations treat pixels outside the input image as if we'd extended the boundary pixels indefinitely (i.e. padding "by continuity"). Affine transformations handle pixels whose pre-image is outside the input image by setting them to a user-provided default value.
The former is possibly limiting, and the latter is a bit clunky as you always have to provide a default even when you don't really care.
Come up with a sensible policy for handling boundary conditions and document it. Do we need to allow filter users to extend with zero/by symmetry/some other method?
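One possible shape for such a policy, sketched as an enum plus a bounds-aware accessor (the names are hypothetical, not existing imageproc types):

```rust
/// How to produce a value for coordinates outside the image.
#[derive(Clone, Copy)]
enum BorderMode {
    /// Repeat the nearest edge pixel ("by continuity", the current filter behaviour).
    Clamp,
    /// Treat everything outside the image as a fixed value.
    Constant(u8),
    /// Mirror the image across its edges.
    Reflect,
}

/// Fetch a pixel from a row-major width x height grayscale buffer,
/// applying the border policy when (x, y) falls outside the image.
fn pixel_at(image: &[u8], width: i64, height: i64, x: i64, y: i64, mode: BorderMode) -> u8 {
    if x >= 0 && x < width && y >= 0 && y < height {
        return image[(y * width + x) as usize];
    }
    match mode {
        BorderMode::Constant(v) => v,
        BorderMode::Clamp => {
            image[(y.clamp(0, height - 1) * width + x.clamp(0, width - 1)) as usize]
        }
        BorderMode::Reflect => {
            // Reflect across the edge: -1 -> 0, -2 -> 1, len -> len - 1, ...
            let reflect = |v: i64, len: i64| {
                if v < 0 {
                    -v - 1
                } else if v >= len {
                    2 * len - v - 1
                } else {
                    v
                }
            };
            image[(reflect(y, height) * width + reflect(x, width)) as usize]
        }
    }
}
```

Filters could then take an optional BorderMode defaulting to Clamp, and affine transforms could stop demanding a default value the caller doesn't care about.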
It's currently pretty poor.
Benchmarking:
Profiling and improving existing functions:
Idioms/best practices:
See https://mht.technology/post/content-aware-resize/
filter3x3 already supports this, so we'd just need to decide how to combine the per-channel magnitudes. It's not obvious what the correct approach is here.
Every operation that can be reasonably implemented in place should be. A non-mutating wrapper should be provided for each of these. Document this convention clearly.
Avoiding mutation is nice, but if users want to create pipelines of operations we shouldn't force them to allocate a potentially massive intermediate for every step.
e.g.
fn snargle_mut(image: &mut I, params: Params) {
// update image in place
}
fn snargle(image: &I, params: Params) -> I {
    let mut out = copy(image);
    snargle_mut(&mut out, params);
    out
}
Optional extra: write a macro to automatically generate the non-mutating version from the mutating one.
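A rough sketch of such a macro, specialised to Vec&lt;u8&gt; buffers to stay self-contained (a real version would be generic over image and parameter types):

```rust
/// Generate a non-mutating wrapper around an in-place operation.
macro_rules! non_mutating {
    ($name:ident, $name_mut:ident, $params:ty) => {
        fn $name(image: &Vec<u8>, params: $params) -> Vec<u8> {
            let mut out = image.clone();
            $name_mut(&mut out, params);
            out
        }
    };
}

/// Example in-place operation: invert every pixel.
fn invert_mut(image: &mut Vec<u8>, _params: ()) {
    for p in image.iter_mut() {
        *p = 255 - *p;
    }
}

// Expands to `fn invert(image: &Vec<u8>, params: ()) -> Vec<u8> { ... }`.
non_mutating!(invert, invert_mut, ());
```

This keeps the "create copy, mutate copy, return copy" convention in one place instead of hand-writing it per function.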
Hello!
There are some unreleased features I'd like to make use of. Could you cut a new version and push to crates.io please? :)
Thanks,
Louis
Looks like there's an entire source file set aside for building rectangles, except that there's no accessible constructor in it for either Rect or RectPosition. Since each is built from the other, that means you can't actually construct either one.
I may have missed something, of course.
Is there a way to build a rectangle? If not, is this intended?
I think hog::cell_histograms does not take HogOptions.signed into account: the gradient orientation d is not normalized to the [0, pi) interval when we compute unsigned gradients. When we subsequently compute the orientation bin indices from d / interval, the assumption of Interpolation::from_position_wrapping (that the left index is within bounds) is violated. This resulted in an 'index out of bounds' panic when I tried to compute the HOG descriptor of an image with unsigned gradients: I specified 9 unsigned gradient orientation bins, but the indices in o_inter were {17, 0} when this happened.
I think this can be fixed by inserting something like
if !spec.options.signed && d >= f32::consts::PI {
d = d - f32::consts::PI;
}
before we calculate o_inter. The = in >= is important to avoid the index out of bounds error, by ensuring that d ends up strictly less than pi.
The ae.utils.graphics D library uses a neat trick: several image operations just return a new View (an immutable handle to an image, expressed using template constraints, roughly an immutable variant of the GenericImage trait) whose pixel accessors read a pixel of the input image and apply an appropriate transformation to it.
e.g.
/// Return a view which applies a predicate over the
/// underlying view's pixel colors.
template colorMap(alias pred)
{
    alias fun = unaryFun!(pred, false, "c");

    auto colorMap(V)(auto ref V src)
        if (isView!V)
    {
        alias OLDCOLOR = ViewColor!V;
        alias NEWCOLOR = typeof(fun(OLDCOLOR.init));

        struct Map
        {
            V src;

            @property int w() { return src.w; }
            @property int h() { return src.h; }

            /*auto ref*/ NEWCOLOR opIndex(int x, int y)
            {
                return fun(src[x, y]);
            }
        }

        return Map(src);
    }
}
Rotation methods do something similar by just returning input_image[warp(pixel_coordinates)]. I'm sure there are dozens of Haskell libraries that perform similar tricks.
Is it possible to support something similar here? Would we even want to? There are potentially nice performance advantages to be had. One example using ae.utils.graphics shows LLVM optimising four 1/4 turns into a no-op. Maybe real-world operations would be less susceptible to nice optimisation, but if we only use a subset of the output pixels this could still be good for performance. We'd have to think about how to make latent images play nicely with manifest images, and when/how to dispatch differently based on whether we have one or the other.
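A rough Rust translation of the colorMap trick above, assuming a hypothetical View trait (not the GenericImage trait from the image crate):

```rust
/// A hypothetical read-only view trait, loosely an immutable variant
/// of the image crate's GenericImage.
trait View {
    type Pixel;
    fn width(&self) -> u32;
    fn height(&self) -> u32;
    fn get(&self, x: u32, y: u32) -> Self::Pixel;
}

/// Lazy per-pixel map: applies `f` on access, allocating nothing.
struct Map<V, F> {
    src: V,
    f: F,
}

impl<V: View, P, F: Fn(V::Pixel) -> P> View for Map<V, F> {
    type Pixel = P;
    fn width(&self) -> u32 { self.src.width() }
    fn height(&self) -> u32 { self.src.height() }
    fn get(&self, x: u32, y: u32) -> P { (self.f)(self.src.get(x, y)) }
}

/// A concrete backing buffer so the sketch is self-contained.
struct Buffer { w: u32, h: u32, data: Vec<u8> }

impl View for Buffer {
    type Pixel = u8;
    fn width(&self) -> u32 { self.w }
    fn height(&self) -> u32 { self.h }
    fn get(&self, x: u32, y: u32) -> u8 { self.data[(y * self.w + x) as usize] }
}
```

Composing two Maps is just nesting the structs, and since everything is statically dispatched the optimiser has a chance to fuse the accessors.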
Alternatively, but vastly larger in scope, we could support something more like halide, where the user specifies their transformations as pure function on images viewed as infinite grids, and separately specifies boundaries, chunking, parallelism, and laziness via a schedule. There's probably no nice middle ground for that one though - it seems like there's a pretty stark contrast between writing concrete implementations by hand and a fully-fledged DSL that generates awesome code for you.
Might be as easy as just sticking the [bench] tests behind a feature flag.
The results returned by imageproc::contrast::otsu_level are not at all consistent with those from the Python library skimage (see skimage.filters.threshold_otsu). E.g. when given one of my sample images as input, the imageproc function returned an Otsu level of 86 whereas the skimage function returned 171. With a value range of 0-255 this is a huge difference. The results also differ (although much less so) for the sample image used in the Rust test case.
I tried to reimplement the code given on this page in Rust (see below) and ended up with the same result as the Python library, which makes me wonder whether there's something wrong with the imageproc implementation.
For reference, here's my translation into Rust of that Java code:
// Based on http://www.labbookpages.co.uk/software/imgProc/otsuThreshold.html#java
pub fn otsu_level(image: &GrayImage) -> u8 {
    // Calculate the grayscale histogram.
    let hist = histogram(image); // from imageproc
    // Total number of pixels.
    let total = image.width() * image.height();
    // Sum of t * hist[t] over all intensity levels t.
    let sum = hist
        .iter()
        .enumerate()
        .fold(0f64, |sum, (t, h)| sum + (t as u32 * h) as f64);
    let mut sum_b = 0f64;
    let mut w_b = 0u32;
    let mut var_max = 0f64;
    let mut threshold = 0u8;
    for (t, h) in hist.iter().enumerate() {
        // Background weight.
        w_b += h;
        if w_b == 0 {
            continue;
        }
        // Foreground weight.
        let w_f = total - w_b;
        if w_f == 0 {
            break;
        }
        sum_b += (t as u32 * h) as f64;
        // Background and foreground means.
        let m_b = sum_b / w_b as f64;
        let m_f = (sum - sum_b) / w_f as f64;
        // Between-class variance.
        let var_between = w_b as f64 * w_f as f64 * (m_b - m_f).powi(2);
        // Keep the threshold that maximises it.
        if var_between > var_max {
            var_max = var_between;
            threshold = t as u8;
        }
    }
    threshold
}
I noticed a small inconsistency between two related function signatures when I was using this library. Of course, changing the signature would break anyone using these functions, and I'm not sure what the policy would be regarding that.
imageproc::drawing::draw_antialiased_line_segment takes tuples of type i32, but imageproc::drawing::draw_line_segment takes tuples of type f32. It would seem appropriate for both to take f32 tuples.
As an aside, I'm new to Rust, and I'm interested in contributing to this library. I'd like to start by adding some more testing, as well as some more example code, since those are mentioned in the README.
Add more preconditions to check that inputs are valid (mainly dimension checks). Maybe write a few helpers. Document best practices.
Add binary feature matching using the hamming distance. The hamming crate may be useful.
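A brute-force sketch of what this might look like, with descriptors stored as u64 words; the hamming crate could replace hamming_distance with a faster implementation:

```rust
/// Hamming distance between two binary descriptors of equal length,
/// i.e. the number of differing bits.
fn hamming_distance(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// For each query descriptor, return (index of nearest train
/// descriptor, distance) by exhaustive search.
fn match_binary(queries: &[Vec<u64>], train: &[Vec<u64>]) -> Vec<(usize, u32)> {
    queries
        .iter()
        .map(|q| {
            train
                .iter()
                .enumerate()
                .map(|(i, t)| (i, hamming_distance(q, t)))
                .min_by_key(|&(_, d)| d)
                .expect("train set must be non-empty")
        })
        .collect()
}
```

A real matcher would likely also apply a ratio test or distance threshold to reject ambiguous matches.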
Functions and methods that take a matrix represented as a slice, such as filter3x3 and Kernel::new, are currently missing documentation on how the data should be arranged. Should the kernel argument for filter3x3 be [a11, a12, a13, a21, a22, ..] (row-major order) or [a11, a21, a31, a12, ..] (column-major order)?
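If the answer is row-major, which matches how image buffers themselves are laid out, a doc example along these lines would settle it (a hypothetical helper for illustration, not part of the API):

```rust
/// Row-major layout: element (row, col) of a `width`-column kernel
/// lives at index `row * width + col`.
fn kernel_at(kernel: &[f32], width: usize, row: usize, col: usize) -> f32 {
    kernel[row * width + col]
}
```

Whichever convention is chosen, stating it once in the filter3x3 and Kernel::new docs would remove the guesswork.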
Cheers :)
Hello @theotherphil,
I noticed that when compiling the project, Git was picking up a number of unneeded files. So I found a Rust ignore file, https://github.com/github/gitignore/blob/master/Rust.gitignore, which is shown below. I think it would be a good start that we could expand on as needed.
# Compiled files
*.o
*.so
*.rlib
*.dll
# Executables
*.exe
# Generated by Cargo
/target/
# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
# More information here http://doc.crates.io/guide.html#cargotoml-vs-cargolock
Cargo.lock
Thoughts?
They're currently just f32s with no indication of scale (0 to 1? 0 to 255?).
cc @archer884 as I spotted this from a comment on his code that uses this function!
The Canny algorithm 'traces' edges in an image, starting with edges >= high_threshold and continuing to trace while the edge intensity is >= low_threshold.
The edge intensity is sqrt(dx^2 + dy^2), where dx and dy are defined using Sobel operators. The thresholds should be roughly the same scale as the pixel intensities (0-255). When documenting this we'll need to add some examples and clearly document the minimum and maximum possible values.
It would be interesting to look at ways to draw TTF fonts (or just any text) onto supported images. Let me know what you think!
Take a look at https://github.com/dylanede/rusttype as well
Lots of the functions in this library are embarrassingly parallel. How should we parallelise them? Ideally with as little impact on the function bodies as possible. Image-chunk-iterators plus https://github.com/nikomatsakis/rayon?
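For comparison, here's what the image-chunk approach looks like with plain std scoped threads, splitting the buffer into bands of whole rows. A sketch only: rayon's par_chunks_mut would express the same thing without the manual band arithmetic.

```rust
use std::thread;

/// Apply `f` to every row of a row-major grayscale buffer, processing
/// bands of rows on separate threads.
fn parallel_map_rows(image: &mut [u8], width: usize, f: impl Fn(&mut [u8]) + Sync) {
    if image.is_empty() {
        return;
    }
    let threads = 4;
    let height = image.len() / width;
    let rows_per_band = (height + threads - 1) / threads;
    let f = &f; // shared reference so every thread can call it
    thread::scope(|s| {
        // chunks_mut hands each thread a disjoint &mut band, so no locking is needed.
        for band in image.chunks_mut(rows_per_band * width) {
            s.spawn(move || {
                for row in band.chunks_mut(width) {
                    f(row);
                }
            });
        }
    });
}
```

Per-pixel operations fit this pattern directly; filters would additionally need overlapping reads at band boundaries, which is where it gets less tidy and a library like rayon starts to pay off.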
@softprops was looking for something like ImageMagick's TintImage function:
https://github.com/ImageMagick/ImageMagick/blob/master/MagickCore/fx.c.
Visual effects seem like something that should eventually go in a separate crate, but for now it should be pretty easy to add a module containing some ImageMagick-y effects.
Add BRIEF feature descriptor extraction, as described in Calonder10.
To avoid the boilerplate of writing the same "create copy, mutate copy, return copy" code every time.
The following causes an arithmetic overflow:
fn chessboard(width: u32, height: u32) -> GrayImage {
    ImageBuffer::from_fn(width, height, |x, y| {
        if (x + y) % 2 == 0 { Luma([255u8]) } else { Luma([0u8]) }
    })
}
#[test]
fn test_connected_components_eight_chessboard() {
let image = chessboard(30, 30);
let components = connected_components(&image, Eight);
let max_component = components.pixels().map(|p| p[0]).max();
assert_eq!(max_component, Some(1u32));
}
Hello,
I'm working on some computer vision problems, and I plan on switching my current MATLAB prototypes over to Rust in the coming months. As such, I would love to make use of the full-fledged image processing crate that doesn't exist yet...
I realize you're in the early stages, but as soon as you have a proposed API and a list of functions/algorithms you'd like implemented, I'd be happy to help. There will be things that I'll probably write myself sooner rather than later which could have a home here, e.g. an implementation of Otsu's method for thresholding.
Also, what is the proposed scope of the library? I'm interested in frequency-space filters, image features like SIFT/SURF, image registration, edge detection, etc.