labforcomputationalvision / plenoptic

Visualize/test models for visual representation by synthesizing images.

Home Page: https://plenoptic.readthedocs.io/en/latest/

License: MIT License

plenoptic's Introduction

plenoptic

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

plenoptic is a python library for model-based synthesis of perceptual stimuli. For plenoptic, models are those of visual1 information processing: they accept an image as input, perform some computations, and return some output, which can be mapped to neuronal firing rate, fMRI BOLD response, behavior on some task, image category, etc. The intended audience is researchers in neuroscience, psychology, and machine learning. The generated stimuli enable interpretation of model properties through examination of features that are enhanced, suppressed, or discarded. More importantly, they can facilitate the scientific process, through use in further perceptual or neural experiments aimed at validating or falsifying model predictions.

Getting started

  • If you are unfamiliar with stimulus synthesis, see the conceptual introduction for an in-depth introduction.
  • If you understand the basics of synthesis and want to get started using plenoptic quickly, see the Quickstart tutorial.

Installation

The best way to install plenoptic is via pip.

$ pip install plenoptic

Our dependencies include pytorch and pyrtools. Installation should take care of them (along with our other dependencies) automatically, but if you have an installation problem (especially on a non-Linux operating system), it is likely that the problem lies with one of those packages. Open an issue and we'll try to help you figure out the problem!

See the installation page for more details, including how to set up a virtual environment and jupyter.

ffmpeg and videos

Several methods in this package generate videos. There are a number of backends available for saving the animations to file; see the matplotlib documentation for more details. In order to convert them to HTML5 for viewing (and thus, to view them in a jupyter notebook), you'll need ffmpeg installed and on your path as well. Depending on your system, this might already be installed, but if not, the easiest way is probably through conda (https://anaconda.org/conda-forge/ffmpeg): conda install -c conda-forge ffmpeg.

To change the backend, run matplotlib.rcParams['animation.writer'] = writer before calling any of the animate functions. If you try to set that rcParam with a random string, matplotlib will tell you the available choices.
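
For example, to select ffmpeg explicitly:

import matplotlib

# Select the ffmpeg writer for all subsequent calls to the animate functions.
matplotlib.rcParams['animation.writer'] = 'ffmpeg'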

Contents

Synthesis methods

  • Metamers: given a model and a reference image, stochastically generate a new image whose model representation is identical to that of the reference image. This method investigates what image features the model disregards entirely.
  • Eigendistortions: given a model and a reference image, compute the image perturbations that produce the smallest and largest changes in the model's response space. This method investigates the image features the model considers the least and most important.
  • Maximal differentiation (MAD) competition: given two metrics that measure distance between images and a reference image, generate pairs of images that optimally differentiate the metrics. Specifically, synthesize a pair of images that the first metric says are equi-distant from the reference while the second metric says they are maximally/minimally distant from the reference. Then synthesize a second pair with the roles of the two metrics reversed. This method allows for efficient comparison of two metrics, highlighting the aspects in which their sensitivities differ.
  • Geodesics: given a model and two images, synthesize a sequence of images that lie on the shortest ("geodesic") path in the model's representation space. This method investigates how a model represents motion and what changes to an image it considers reasonable.

Models, Metrics, and Model Components

  • Portilla-Simoncelli texture model, which measures the statistical properties of visual textures, here defined as "repeating visual patterns."
  • Steerable pyramid, a multi-scale oriented image decomposition. The basis functions are oriented (steerable) filters, localized in space and frequency. Among other uses, the steerable pyramid serves as a good representation from which to build a primary visual cortex model. See the pyrtools documentation for more details on image pyramids in general and the steerable pyramid in particular.
  • Structural Similarity Index (SSIM), a perceptual similarity metric that returns a number between -1 (totally different) and 1 (identical) reflecting how similar two images are. It is based on the images' luminance, contrast, and structure, which are computed convolutionally across the images.
  • Multiscale Structural Similarity Index (MS-SSIM), a perceptual similarity metric similar to SSIM, except that it operates at multiple scales (i.e., spatial frequencies).
  • Normalized Laplacian distance, a perceptual distance metric based on transformations associated with the early visual system: local luminance subtraction and local contrast gain control, at six scales.

Getting help

We communicate via several channels on Github:

  • Discussions is the place to ask usage questions, discuss issues too broad for a single issue, or show off what you've made with plenoptic.
  • If you've come across a bug, open an issue.
  • If you have an idea for an extension or enhancement, please post in the ideas section of discussions first. We'll discuss it there and, if we decide to pursue it, open an issue to track progress.
  • See the contributing guide for how to get involved.

In all cases, please follow our code of conduct.

Citing us

If you use plenoptic in a published academic article or presentation, please cite both the code (by the DOI) as well as the JOV paper. If you are not using the code, but just discussing the project, please cite the paper. You can click on Cite this repository on the right side of the GitHub page to get a copyable citation for the code, or use the following:

  • Code: DOI
  • Paper:
    @article{duong2023plenoptic,
      title={Plenoptic: A platform for synthesizing model-optimized visual stimuli},
      author={Duong, Lyndon and Bonnen, Kathryn and Broderick, William and Fiquet, Pierre-{\'E}tienne and Parthasarathy, Nikhil and Yerxa, Thomas and Zhao, Xinyuan and Simoncelli, Eero},
      journal={Journal of Vision},
      volume={23},
      number={9},
      pages={5822--5822},
      year={2023},
      publisher={The Association for Research in Vision and Ophthalmology}
    }

See the citation guide for more details, including citations for the different synthesis methods and computational models included in plenoptic.

Support

This package is supported by the Simons Foundation Flatiron Institute's Center for Computational Neuroscience.

Footnotes

  1. These methods also work with auditory models, such as in Feather et al., 2019, though we haven't yet implemented examples. If you're interested, please post in Discussions!

plenoptic's People

Contributors

billbrod, pehf, lyndond, bichidian, nikparth, balzaniedoardo, thomasyerxa, kbonnen, dylex, eerosim, hmd101, dherrera1911, theowoo, yochannah

plenoptic's Issues

Add preprocess function

Similar to po.load_images (currently only in #38 branch) but also:

  • accepts any of paths, arrays, or tensors
  • can make differentiable or not (default yes)
  • can use full dynamic range or not (divide by, e.g., np.iinfo(np.uint16).max)
  • set output range (default [0, 1], other standard case [-1, 1])
  • like load_images, should make sure we return a 4d tensor, optionally (and by default) convert to grayscale, and convert to torch.float32 (make the end dtype an option? not sure if we also need torch.float16 and torch.float64)

One of the difficulties of accepting arrays or tensors (rather than paths) is that they are unlikely to still be in their original dtype (e.g., the Einstein image is stored as an 8-bit image on disk but, depending on how it's loaded, could easily end up as a np.float32 array that still has a max of 255). That's mainly an issue when it comes to determining what the max value is, though this probably isn't a huge problem: we're likely to receive either something that has been re-ranged or something that still has its original values (it seems unlikely that someone would, e.g., load an 8-bit image, multiply its values by 5, and then pass it to this function), so we might be able to get away with a simple check: don't change anything whose pixel values all lie within the output range, treat anything with all positive values and a max between 1 and 255 as an 8-bit image, treat anything with all positive values and a max between 255 and 65535 as a 16-bit image, and raise an exception for anything else.
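
A minimal sketch of that check (a hypothetical helper, not part of the package):

import numpy as np


def infer_divisor(img, output_range=(0, 1)):
    """Guess what to divide by, following the heuristic above (hypothetical helper)."""
    lo, hi = output_range
    if img.min() >= lo and img.max() <= hi:
        return 1.0  # already within the output range, leave it untouched
    if img.min() >= 0 and img.max() <= np.iinfo(np.uint8).max:
        return float(np.iinfo(np.uint8).max)  # treat as an 8-bit image
    if img.min() >= 0 and img.max() <= np.iinfo(np.uint16).max:
        return float(np.iinfo(np.uint16).max)  # treat as a 16-bit image
    raise ValueError("Cannot infer the image's original dynamic range")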

Make FrontEnd more efficient

The FrontEnd model is a very useful one, and would be great to have in some examples, but it's right now so inefficient that synthesizing with it is very slow. How can we make it more efficient?

Discussed a bit with @pehf and my understanding is that the main issue is that it's convolving with 31x31 kernels in the signal domain (I haven't profiled it to investigate). If that's so, it will get slower as a function of image size. Could we not just take the Fourier transform of the kernel and the image, multiply them together, and take the inverse Fourier transform (like the way our steerable pyramid implementation works)?
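
A minimal sketch of the idea for a single-channel 2D image and kernel (note this implies circular boundary handling, unlike our current spatial-domain padding):

import torch


def fft_conv2d(image, kernel):
    """Circular 2D convolution via the FFT (image and kernel are 2D tensors)."""
    h, w = image.shape
    kh, kw = kernel.shape
    # Zero-pad the kernel to the image size, then roll its center to (0, 0).
    padded = torch.zeros_like(image)
    padded[:kh, :kw] = kernel
    padded = torch.roll(padded, shifts=(-(kh // 2), -(kw // 2)), dims=(0, 1))
    # Multiply in the frequency domain and invert: O(N log N) in the number of
    # pixels, independent of kernel size.
    return torch.fft.irfft2(torch.fft.rfft2(image) * torch.fft.rfft2(padded), s=(h, w))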

Replace our autodiff jvp/vjp functions with PyTorch 1.5 built-in jvp/vjp?

Issue: The new stable release of PyTorch 1.5 includes autograd methods to compute vector-jacobian products (VJP) and jacobian-vector products (JVP). We rolled our own methods to compute these products to synthesize Eigendistortions. Should we replace our methods in favour of PyTorch's built-in functions to possibly reduce redundant code?

Short answer: No.

Long answer: We use the power method (and the Lanczos algorithm, which is a form of power method) to synthesize Eigendistortions. This requires calling VJP and JVP thousands of times. The way we do this now is to compute 1 forward pass of the model, maintain its graph, then iteratively use our functions to perform N backwards passes on that graph to compute N VJP/JVPs (i.e., N+1 operations to compute N products). This contrasts with PyTorch's implementation of VJP/JVP, in which their methods perform a forward pass each time, thus requiring 2N operations to compute N products.
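
A minimal sketch of the "one forward pass, many backward passes" pattern using torch.autograd.grad directly (the model here is a stand-in):

import torch

# Stand-in model and input, just to illustrate reusing one forward graph.
model = torch.nn.Linear(16, 8)
x = torch.randn(16, requires_grad=True)

# One forward pass; keep its graph around for repeated backward passes.
y = model(x)

vjps = []
for _ in range(10):  # e.g., 10 power-method iterations
    v = torch.randn_like(y)  # vector to multiply against the Jacobian
    # retain_graph=True reuses the same forward graph for every product,
    # i.e., N + 1 operations for N VJPs instead of 2N.
    (vjp,) = torch.autograd.grad(y, x, grad_outputs=v, retain_graph=True)
    vjps.append(vjp)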

Fix notebook tqdm

Our progress bar is created by tqdm, which has a separate version for working in notebooks: from tqdm.notebook import tqdm instead of from tqdm import tqdm. We would need to be able to tell whether the library is being imported from a notebook or not and change which tqdm we import.

It looks like tqdm also has an auto option, which should figure this out for us.
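
For example, the following works in both a terminal and a notebook:

# tqdm.auto picks the notebook widget when running under Jupyter and the plain
# text bar otherwise, so a single import should work in both contexts.
from tqdm.auto import tqdm

for _ in tqdm(range(100)):
    pass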

Abstract classes for model and synthesis objects?

In pyrtools, we had a pyramid class that we never wanted anyone to use, but that all the pyramid objects inherited. This made it easy to share relevant methods between them and make sure they had comparable attributes.

Should we do a similar thing for model and synthesis objects? I've written a whole bunch of code for the ventral stream models and for metamer that I feel could be relevant for other models and synthesis objects, and it would make standardization of the API easier. Using an abstract parent class would make it easy to share these methods and make sure the attributes are consistent without requiring too much overhead (once the initial creation of the parent classes is finished...).

I've been meaning to abstract some of the stuff I've written for metamer and ventral stream models regardless. For example, the save and load methods (as well as the "reduced" version for the ventral stream model), and the display code. I've done a bit of work making the display code abstract already, but if we put it in a parent class, you'd have access to it for free.
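
A minimal sketch of what such a parent class might look like (class and method names are hypothetical, not a proposal for the final API):

import abc

import torch


class Synthesis(abc.ABC):
    """Hypothetical parent class holding the shared save/load/display machinery."""

    @abc.abstractmethod
    def synthesize(self):
        """Each synthesis method implements its own optimization loop."""

    def save(self, file_path, attrs):
        # Shared serialization: every subclass saves the listed attributes the same way.
        torch.save({k: getattr(self, k) for k in attrs}, file_path)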

Overload batch dimension?

Geodesic and eigendistortion only work on inputs with a single element in the batch dimension and then overload it: eigendistortion makes use of it for the different eigenvectors, geodesic for different steps in the path between the two anchor images. Should Synthesis, MADCompetition, and Metamer do something similar?

Create display tutorial

Create a tutorial showing how to use all the Synthesis display code. Show basic usage, how you can customize the size of the plot and its contents, and the fine-grained control allowed by axes_idx. Also, how animate works pretty easily.

For advanced usage, discuss update_plot?

Add GPU tests

We want to make sure that our code runs on GPUs with very little overhead.

Currently, there are two steps for that:

  1. Make sure everything runs on the GPU in the same manner. See metamer.py, steerable_pyramid_freq.py, pooling.py, and ventral_stream.py for my preferred way, but basically: none of our synthesis methods or models should set the device anywhere:

    • Models and synthesis methods should have a .to method, which moves all tensor attributes over to the given device/dtype, and then all of their methods should work regardless of which device they're on. This can be done by using things like torch.ones_like; if a new tensor needs to be created (and you can't use torch.ones_like or something like it), its device should be explicitly set to the device of that method's input. If the method has no input, check one of the tensor attributes. (See the sketch after this list.)
  2. Figure out how to make Travis CI work with CUDA. There's an open issue on this, so it might not be trivial, but they link an existing project which has a .travis.yml file we could try modifying.
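
A minimal sketch of the device-agnostic pattern described in point 1 (the function and its computation are hypothetical):

import torch


def example_forward(x, epsilon=1e-8):
    """Hypothetical method body that never hard-codes a device."""
    # torch.ones_like inherits x's device and dtype automatically...
    baseline = torch.ones_like(x) * epsilon
    # ...and when a fresh tensor is unavoidable, set its device (and dtype) from the input.
    ramp = torch.linspace(0, 1, x.shape[-1], device=x.device, dtype=x.dtype)
    return (x + baseline) * ramp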

Docs are broken

Maybe not surprising, but the docs are broken right now. Following the instructions outlined in CONTRIBUTING.md (with a fresh install of the plenoptic_docs environment), the make html command fails; the attached docs.log shows its output.

It's a bunch of errors, but probably the same thing over and over again.

Add `normalize_coefficients` method to Steerable Pyramid

This method would analytically normalize the steerable pyramid coefficients, which vary in their magnitude across scales. There are two reasons for this:

  1. Down-sampling between scales (when downsample=True). Because we downsample at each scale by a factor of two, the magnitude increases by four (two squared, two in each of the two dimensions).

  2. Natural images have more power in the lower frequencies than the higher frequencies, because they have 1/f power spectra. Therefore, we could up-weight the higher frequencies proportional to that. Should we correct for this? Make it an argument? When optimizing, you want them all to be approximately the same magnitude, but are there cases where you would want to correct for the first issue but not this?

Z-scoring the coefficients appears to be key for good metamer synthesis both for the Portilla-Simoncelli texture statistics and the PooledV1 model. If we do the above, that extra step might not be necessary.
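
A minimal sketch of the analytic part of this normalization, correcting only for the downsampling factor described in point 1 (the function name and the coefficient-dict layout are assumptions, not the actual API):

# Hypothetical sketch: undo the factor-of-four growth per scale caused by
# downsampling (the (scale, orientation) dict layout is an assumption).
def normalize_coefficients(coeffs):
    normalized = {}
    for (scale, orientation), coeff in coeffs.items():
        normalized[(scale, orientation)] = coeff / 4 ** scale
    return normalized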

Refactoring autodiff.py

Right now, these functions are just used in eigendistortion and are fine as written, but they could be helpful in other contexts: e.g., if you want to call torch.autograd.backward(output) and output is not a scalar, you need to pass a gradient vector, and what gets computed is a vector-Jacobian product (assuming I'm understanding the documentation). So we should make these functions easier to use in other contexts, such as our standard way of interacting with models: input is 4d and output is 3d or 4d (with possibly multiple batches and channels).
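
A minimal sketch of the non-scalar case using plain PyTorch calls (the model is a stand-in):

import torch

model = torch.nn.Conv2d(1, 3, kernel_size=3, padding=1)  # stand-in model
x = torch.randn(1, 1, 32, 32, requires_grad=True)  # 4d input
y = model(x)  # 4d, non-scalar output

# Because y is not a scalar, backward needs a "gradient" vector v; what ends up
# in x.grad is then the vector-Jacobian product of v with the model's Jacobian.
v = torch.ones_like(y)
y.backward(gradient=v)
print(x.grad.shape)  # same shape as x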

Add ability to move foveation location

The ventral_stream.py models (and pooling.py windows) can currently only fixate at the center of the image; it would be relatively simple to allow the fixation location to be a parameter.

Add plenoptic.imshow

When using our pyrtools.imshow, it's annoying to convert the tensors to arrays all the time (and call squeeze and all that), so let's create a wrapper around it that handles this automatically.

It should probably live in tools/display.py and have the same call signature as pyrtools.imshow, and it should call plenoptic/tools/data.to_numpy on each image and .squeeze() them. Not sure if it would need anything else.
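
A rough sketch of such a wrapper (assuming pyrtools imports as pt and that to_numpy lives at the path given above; the body is a guess, not the final API):

import pyrtools as pt

from plenoptic.tools.data import to_numpy


def imshow(images, **kwargs):
    """Hypothetical wrapper: convert tensors to squeezed arrays, then defer to pyrtools."""
    if not isinstance(images, list):
        images = [images]
    arrays = [to_numpy(im).squeeze() for im in images]
    return pt.imshow(arrays, **kwargs)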

Utilize symmetry of fft for real steerable pyramid

In the real steerable pyramid, utilize the benefits of the symmetric fft in rfft and irfft to make the computation more efficient. This will require utilizing onesided=True for these cases and then adjusting the mask sizes etc. to make the rest of the code compatible.

Reproduce SSIM/MSE MAD Competition example from paper

MAD Competition is working now, but in order to be certain about it, we want to synthesize some images that match examples from the MATLAB code (they won't be identical, but they should be in the ballpark).

Additionally, I found a weird issue with the example in the Simple_MAD notebook: when using po.add_noise to generate the initial image, the image generated would always lie along the forward or reverse diagonal (e.g., from base image [.5, .5] to [.6, .4]), which gives you L1 and L2 loss contours such that the circle is completely inscribed within the square.

In this case, MAD Competition (with the parameters set up in that notebook) completely failed to find any solution for it. Need to think about both why there appear to be such limited possible values and why MAD Competition has trouble here, but for now that means going back to the earlier way of adding noise. Should maybe add option to specify initial image?

FourMomentsClamper failures

The FourMomentsClamper for metamer synthesis doesn't seem to be working. In order to get it working on GPUs, I need to add a bunch of device=ch.device throughout it (see the ventral_stream branch), but now I'm running into CUDA error: an illegal memory access was encountered. I'm not sure whether this happens every time.

I sometimes have gotten even stranger errors: MAGMA geev : Argument 4 : illegal value at /opt/conda/cond-bld/pytorch_1556653114079/work/aten/src/THC/generic/THCTensorMathMagma.cu:220

So I'm not sure what to make of that. I don't understand enough of what's happening in that function (specifically, it's the modkurt function), but it would probably be worth cleaning it up so there isn't so much creation of new tensors throughout.

Check test coverage

Probably worth using something to check how complete our test coverage is (that is, are we not missing a test): see here for general discussion and pytest-cov for a library we could use.

Make eigendistortion accept batch/channel images

We want all of our synthesis methods to expect 4d images: (batch, channel, height, width) and expect the model outputs to be 3d or 4d: (batch, channel, y_1) or (batch, channel, y_1, y_2). Eigendistortions currently does not.

Right now, the synthesis methods are probably too memory-intensive to make this way of doing things reasonable. It would require some more thinking about how to parallelize across batches / channels that none of us need right now.

OMP: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized.

Describe the bug
Error:
OMP: Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized.

To Reproduce
[Note: this will only be reproducible on Mac OSX and only sometimes.]

import matplotlib.pyplot as plt
import plenoptic as po
import torch

model = po.simul.Texture_Statistics([256,256])
image = plt.imread('../data/nuts.pgm').astype(float)/255.
im0 = torch.tensor(image, requires_grad=True, dtype = torch.float32).squeeze().unsqueeze(0).unsqueeze(0)
c = po.RangeClamper([image.min(), image.max()])
M = po.synth.Metamer(im0, model)

producing the following message:

OMP: Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.

Note: If this error occurs during the use of a jupyter notebook, the kernel dies, producing the error message above in the Terminal and a corresponding error message in the Jupyter notebook.

System (please complete the following information):

  • OS: MacOSX 10.15.6
  • Python version 3.7
  • Pytorch version 1.6
  • Plenoptic version 0.1

Fix RNG state when resuming synthesis

If you run synthesis twice in a row, you'll pick up more or less where you left off (assuming you set initial_image=None and learning_rate=None on the second call; this is pending the merger of the current ventral_model branch), with one major caveat: the state of the random number generator. We require a seed and always set it at the beginning of the synthesis call. If you resume synthesis with the same object in the same session, we can just allow the user to set seed=None and, if seed is None, don't set it.

However, if we save a metamer object, load it, and then resume synthesis (which is not uncommon when synthesis takes a long time), we currently have no good way to resume the RNG state. Something like torch.random.fork_rng_state, or whatever it does under the hood (I can't find example code showing how to use it), is probably what we want. But I'm not sure how to handle devices with this.

Grabbing the CPU state would be easy; my preference would be to do something like the following: at the end of synthesis, set self.cpu_rng_state = torch.get_rng_state(), make sure the cpu_rng_state attribute gets saved by adding it to the list of attributes in the save function, and then, during load, call torch.set_rng_state(metamer.cpu_rng_state).
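
A minimal sketch of that save/restore flow (the Metamer stand-in below is just a placeholder object, not our actual class):

import torch


class Metamer:  # minimal stand-in object, just to show the save/restore calls
    pass


metamer = Metamer()

# End of synthesis: stash the CPU generator state on the object.
metamer.cpu_rng_state = torch.get_rng_state()

# ... save() would then include 'cpu_rng_state' in its list of saved attributes ...

# After load(), restore the generator before resuming synthesis.
torch.set_rng_state(metamer.cpu_rng_state)
assert torch.equal(torch.get_rng_state(), metamer.cpu_rng_state)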

However, grabbing the GPU states apparently takes a long time (see the warning in the function linked above), and we would only want to do it for the relevant devices. Currently the metamer object is not explicitly aware of which devices are relevant, which I prefer because it makes the code completely device-agnostic. However, it presents a problem here, and I see three solutions:

  1. Don't try to resume GPU rng state (current situation)
  2. Grab RNG state from all available GPUs (as the fork_rng function linked above does if devices isn't specified), and set them all.
  3. Figure out what devices are being used. I think this is the best solution, and my preference for how to handle it would be to check initial_image.device and model.device. Currently, we do not require the model's device to be set and so it's very possible that there is no device attribute (my ventral stream models have device attribute). We could start encouraging it and default to 2 if it's not present.

If we do something like 2 (or do that as the default in 3), then we should probably require this option to be enabled, rather than always doing it, since it apparently takes time. And regardless of whether we do 2 or 3, it should happen at the end of the synthesis call.

Test not-downsampled pyramid

We currently only test the downsampled version of the pyramid against earlier implementations. Can we add a test of the not-downsampled version as well? We should be able to either up or down-sample, respectively, the coefficients in order to check against each other, and that should hopefully (if we do it in the same manner) account for the difference in magnitudes.

replace instances with torch.tensor when applicable

Instantiating a tensor via torch.tensor should be avoided when torch.from_numpy or torch.as_tensor can be used instead. This is because torch.tensor always copies data, whereas the other two do not, or at least avoid copying when possible.
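
A quick illustration of the difference:

import numpy as np
import torch

arr = np.zeros(3)
copied = torch.tensor(arr)  # always copies the data
shared = torch.as_tensor(arr)  # reuses the numpy buffer when it can

arr[0] = 1.0
print(copied[0].item())  # 0.0 -- the copy did not see the change
print(shared[0].item())  # 1.0 -- the shared tensor did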

Look into pytest-notebook

pytest-notebook looks like a good way to re-run notebooks and check that their output hasn't changed. Could be useful to add that to our tests, since we want to make sure that we don't break the tutorial notebooks with any changes we make (see if we can select some cells to have different outputs maybe? would we want to know anytime the output changes or just if you can't run the notebooks anymore?)

Add geodesics

The basics of this have been completed, but it needs more work to finalize.

Pytorch-ify PooledVentralStream

The PooledVentralStream models are not quite pytorch-ic: they should have the different computations as layers, each of which is a torch.nn.Module, allowing for hooks (see here), and they should not store memory-intensive attributes. Attributes should only be metadata, and the models should have methods for converting the tensor output into the more structured representational form for visualization / understanding (but not store it as an attribute).
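
A minimal sketch of what that structure might look like (the layer names and computations are made up, purely to illustrate submodules and hooks):

import torch
from torch import nn


class PooledVentralStream(nn.Module):
    """Hypothetical structure: each computation is its own submodule."""

    def __init__(self):
        super().__init__()
        self.center_surround = nn.Conv2d(1, 1, kernel_size=9, padding=4)
        self.nonlinearity = nn.ReLU()
        self.pooling = nn.AvgPool2d(kernel_size=4)

    def forward(self, x):
        return self.pooling(self.nonlinearity(self.center_surround(x)))


model = PooledVentralStream()
# Because each stage is a submodule, users can attach forward hooks to any of them.
handle = model.nonlinearity.register_forward_hook(
    lambda mod, inp, out: print("nonlinearity output shape:", out.shape)
)
model(torch.randn(1, 1, 64, 64))
handle.remove()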

tools.rectangular_to_polar / tools.polar_to_rectangular fails tests

pytest for test_plenoptic.TestNonlinearities::test_coordinate_transform fails for unknown reason, likely to do with po.rescale.

Reproducible failed test (note torch.manual_seed(1) placed on the line before the second a = [...]).

def test_coordinatetransform(self):
    a = torch.randn(10, 5, 256, 256)
    b = torch.randn(10, 5, 256, 256)

    A, B = po.polar_to_rectangular(*po.rectangular_to_polar(a, b))

    assert torch.norm(a - A) < 1e-3
    assert torch.norm(b - B) < 1e-3

    torch.manual_seed(1)
    a = torch.rand(10, 5, 256, 256)
    b = po.rescale(torch.randn(10, 5, 256, 256), -np.pi / 2, np.pi / 2)

    A, B = po.rectangular_to_polar(*po.polar_to_rectangular(a, b))

    assert torch.norm(a - A) < 1e-3
    assert torch.norm(b - B) < 1e-3

The last assert statement, assert torch.norm(b - B) < 1e-3, which compares the angles before and after what should be an identity transformation, is the one that fails.

Add support for coarse_to_fine for steerable pyramid

Want the steerable pyramid to have support for coarse to fine optimization, which means that it should accept scales the way that the ventral stream models do. This will improve efficiency for those models and will help Portilla-Simoncelli coarse-to-fine as well.

Add color/channel support

Variety of stuff:

  • look for SSIM and color references
  • make sure synthesis methods can take and return multi-channel images (though what they do should depend on the model)
  • Integration with colour?

FrontEnd model produces artefactual eigendistortions near boundaries

In its current form, the FrontEnd model produces eigendistortions near the edges for several input images of varying crop size. We are currently using ReflectionPad2d boundary handling. This issue could possibly be resolved with a frequency-domain implementation, as suggested in existing issue #23, which would in effect implement circular boundary handling. Alternatively, we could leave our convs in the spatial domain and try various other boundary handling options.

I tried applying a circular diskmask to the image during the forward() call. In this case the eigendistortions just ended up at the edges of the circular mask.

The 31x31 conv2d weights we're using are pre-trained using a model that was trained on images of dim 384x512. I tried using images of this size as well to synthesize eigendistortions and still got eigendistortions near the edge.

Feature request: user-defined num_orientations in steerable_pyramid_freq

In simulate.canonical_computations.steerable_pyramid_freq.Steerable_Pyramid_Freq we should allow the user to define the number of filter orientations. This would obviate the need for the steer_coeffs method, and would have the responses of each oriented filter explicitly returned in the response tensor as additional channels.

Add MS-SSIM

We had an implementation of this, but removed it because of difficulties getting it implemented. The function is pasted below as a starting point.

This repo contains the MATLAB MS-SSIM code from Zhou Wang's website that's referenced in the code below; it should be used for generating values to match.

Things to watch out for:

  • When comparing the images curie.pgm and einstein.pgm from this repo, some of the mcs values are negative, which leads to issues because pytorch doesn't support complex values right now (we have workarounds in our steerable pyramid, where we put the real and imaginary in a 5th dimension, at the end). In matlab, the returned value is complex (.0289+.0666i) and I don't know enough about MS-SSIM to know if this is reasonable
  • mcs and mssims will be (5, b, c) tensors, where b and c are the numbers of batches and channels that we get when comparing img1 and img2. weights is a 1d tensor with 5 elements, and so mcs**weights (or, equivalently, torch.pow(mcs, weights)) will be a (5,b,5) (not sure what happens with channels) tensor, from which we want the diagonal of each batch. This is a little clunky and there's probably a better way to do it.
  • Would we want the ability to change the number of levels? Where would the weights come from then?

import torch
import torch.nn.functional as F

# Note: assumes the _ssim_parts helper used here is importable from our SSIM implementation.
def msssim(img1, img2, dynamic_range=1, normalize=False):
    device = img1.device
    weights = torch.FloatTensor([0.0448, 0.2856, 0.3001, 0.2363, 0.1333]).to(device)
    levels = weights.size()[0]
    mssims = []
    mcs = []
    for _ in range(levels):
        ssim_map, contrast_map, _ = _ssim_parts(img1, img2, dynamic_range=dynamic_range)
        mssims.append(ssim_map.mean((-1, -2)))
        mcs.append(contrast_map.mean((-1, -2)))

        img1 = F.avg_pool2d(img1, (2, 2))
        img2 = F.avg_pool2d(img2, (2, 2))

    mssims = torch.stack(mssims)
    mcs = torch.stack(mcs)

    # Normalize (to avoid NaNs during training unstable models, not compliant with original definition)
    if normalize:
        mssims = (mssims + 1) / 2
        mcs = (mcs + 1) / 2

    # This does not work as written -- a tensor with 5 elements raised to
    # another tensor with five elements returns a 5x5 tensor, from which we
    # want the diagonals. And some values in mcs can be negative, which leads
    # to difficulties
    pow1 = mcs ** weights
    pow2 = mssims ** weights
    # From Matlab implementation https://ece.uwaterloo.ca/~z70wang/research/iwssim/
    output = torch.prod(pow1[:-1] * pow2[-1])
    return output
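
One possible way around the broadcasting problem and the diagonal extraction, as a rough sketch with stand-in tensors (shapes as described above, positive values so the exponent is well-defined; not validated against the MATLAB reference):

import torch

# Stand-ins with the shapes described above: 5 levels, a batch of 2, 1 channel.
weights = torch.tensor([0.0448, 0.2856, 0.3001, 0.2363, 0.1333])
mcs = torch.rand(5, 2, 1)
mssims = torch.rand(5, 2, 1)

# Reshaping weights to (levels, 1, 1) makes the exponent apply per level,
# avoiding the unwanted 5x5 broadcast and the diagonal extraction entirely.
w = weights.view(-1, 1, 1)
pow1 = mcs ** w
pow2 = mssims ** w
output = torch.prod(pow1[:-1] * pow2[-1], dim=0)  # shape (batch, channel)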

Add support for relative threshold

For Synthesis (and its subclasses), we have two stopping criteria: either you reach max_iter or your (absolute) loss decreases by less than loss_thresh over the past loss_change_iter iterations. But this is an absolute number, which is going to differ wildly depending on the magnitude of your loss.

Would like to add support for a relative threshold, rel_loss_thresh, which checks whether the (absolute) loss has decreased by rel_loss_thresh * loss_prev over the past loss_change_iter iterations. On each iteration, check whether loss < loss_prev - rel_loss_thresh * loss_prev and, if so, update loss_prev = loss. Keep going until there have been loss_change_iter iterations without that, and then break.

This would go into Synthesis._check_for_stabilization, and would need to do a similar check with coarse_to_fine
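
A rough sketch of one way to phrase that check (the function name and the exact bookkeeping are assumptions):

def loss_stabilized(loss_history, rel_loss_thresh=1e-2, loss_change_iter=50):
    """Hypothetical check: stop when the loss hasn't dropped by a relative margin."""
    if len(loss_history) <= loss_change_iter:
        return False
    reference = loss_history[-loss_change_iter - 1]
    recent_best = min(loss_history[-loss_change_iter:])
    # No iteration in the recent window improved on the reference by the
    # required relative amount, so synthesis has stabilized.
    return recent_best > reference - rel_loss_thresh * reference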

Add MAD competition

  • add support for metrics
  • add tests
  • move more initialization to Synthesis superclass
  • check with coarse-to-fine, plot, and animate code

This is linked to #17, will be in same PR

Make public

We want to make public by July at the latest so Nikhil can share stuff related to his NeurIPS project.

Before that, we want to:

  • merge the open PR #19

(#38 may include breaking changes, #24 unnecessary)

Then we'll:

  • make a Github release (e.g., v0.1-neurips)
  • add alpha or WiP badge and language?
  • get a doi for that release (probably from zenodo)

Reorganize Documentation structure

In addition to the docstrings, examples, and tutorials we need, we also need some good basic documentation that explains the idea behind this package, points to the associated papers, and lays out the basic ideas. It should also cover the basic API and how to use the various abstractions / more general functionality (coarse-to-fine optimization, plotting, etc.). Those might not be necessary for end users, but they are necessary for us while we work on the core.

Some potentially helpful info: open source guides from Github, Mozilla Science Working Open Workshop.

Look into napari

Seems like a useful way to visualize nD arrays: https://ilovesymposia.com/2019/10/24/introducing-napari-a-fast-n-dimensional-image-viewer-in-python/. Also has explicit support for image pyramids: http://napari.org/tutorials/image.html#image-pyramids

Want to check whether we can subclass it / extend it like we did with pyrtools.imshow, in order to make sure:

  1. There's no interpolation or smoothing in the display of an array
  2. Arrays are displayed as either zoomed out by a power of two or zoomed in by an integer, no intermediate values (which would lead to interpolation)

It looks like it's built on top of VisPy rather than matplotlib; I don't know anything about the differences between them.

Get set up with Software Heritage ID

For citing, we'll ideally have a paper in JOSS (which will help publicize the project and get us some feedback). But we'd also like something that allows people to specify exactly what version they used. Software Heritage IDs seem like a good way to do that (this requires putting together a codemeta.json), so look into it.

Idea from this blog post, which recommends it.
