A containerized approach to making your own deep dream images
The only dependencies for this project are Singularity, Python 3.5 or newer, an internet connection, and optionally an NVIDIA GPU with CUDA support. The actual ML frameworks are installed and managed inside the Singularity container, removing the need to install them on your system.
To download the supported models and build the container with its dependencies, just run `make` and sit back and relax. The build process is network-bound, so unless you have a multi-gigabit connection, a `-j` flag is unlikely to speed it up.
To run the model on a set of images, execute `./main.py -s [source image directory] -d [destination image directory]`. The main program will then automatically bind all the necessary directories to the Singularity container and begin execution.
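For example, assuming your raw frames live in `./frames` and you want the results written to `./dreams` (both paths are hypothetical):

```
./main.py -s ./frames -d ./dreams
```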
Flag | Short Form | Argument Type | Default | Use |
---|---|---|---|---|
`--blend` | `-b` | None taken | N/A | When present, blends a bit of the previously generated image into the new image, giving some temporal stability when generating deep-dream videos. Assumes that the frames are in ASCII-betical order (see the sketch after this table) |
`--destination` | `-d` | Directory | None, must be set | Selects the directory that processed images will be written to |
`--guide` | `-g` | JPEG image | None | When present, uses the specified image as a stylistic guide. When not present, no style guide is used |
`--jitter` | `-j` | Integer | 32 | Maximum displacement for the random translation applied to the images on each step of the algorithm (see the sketch after this table) |
`--maximize` | `-l` | Layer | `inception_4c/output` | The layer within the neural net whose activation is maximized. Lower-level layers generate more geometric dreams; higher-level layers generate more recognizable objects |
`--model` | `-m` | File | `bvlc_googlenet/bvlc_googlenet.caffemodel` | The Caffe model to use for the generation of dreams |
`--no-gpu` | | None taken | N/A | When present, forces the neural network to run on the CPU. Currently this is accomplished by a separate conda environment within the container that has only a CPU build of Caffe installed |
`--num-iterations` | `-n` | Integer | 1 | The number of iterations of the algorithm to run on each image. The larger this number, the more pronounced the dream-like structures will be |
`--octave-count` | | Integer | 4 | The number of "octaves" to run on each image. Each octave detects and brings out patterns at an increasing scale |
`--octave-scale` | | Float | 1.3 | The amount to scale the size of features generated by each subsequent octave |
`--prototext` | `-p` | File | `bvlc_googlenet/deploy.prototxt` | The prototxt declaration of the Caffe model being used |
`--source` | `-s` | Directory | None, must be set | Selects the directory that raw images will be read from |
`--steps-per-octave` | | Integer | 10 | The number of min-max steps to take on each individual octave during image generation. Has a similar effect to `--num-iterations`, but is applied in a different order |
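To make the octave-related knobs concrete, here is a rough, self-contained sketch of how `--octave-count`, `--octave-scale`, `--steps-per-octave`, and `--jitter` typically fit together in DeepDream-style generators. This is illustrative only, not the project's actual code; `step()` is a placeholder for one activation-maximization update on the `--maximize` layer.

```python
import numpy as np
import scipy.ndimage as nd

def step(image):
    """Placeholder for one gradient-ascent step that boosts
    activations on the layer selected by --maximize."""
    return image

def dream(image, octave_count=4, octave_scale=1.3,
          steps_per_octave=10, jitter=32):
    """image: float HxWx3 array. Returns the dreamed image."""
    # Build a pyramid of progressively downscaled copies of the image.
    octaves = [image]
    for _ in range(octave_count - 1):
        octaves.append(nd.zoom(octaves[-1],
                               (1.0 / octave_scale, 1.0 / octave_scale, 1),
                               order=1))

    detail = np.zeros_like(octaves[-1])
    for octave in reversed(octaves):  # coarsest scale first
        # Carry the detail found so far up to the current octave's size.
        if detail.shape != octave.shape:
            h, w = octave.shape[:2]
            dh, dw = detail.shape[:2]
            detail = nd.zoom(detail, (h / dh, w / dw, 1), order=1)
        work = octave + detail
        for _ in range(steps_per_octave):
            # Random translation (jitter) so dream patterns don't lock
            # to a fixed pixel grid; undone after the step.
            ox, oy = np.random.randint(-jitter, jitter + 1, 2)
            work = np.roll(np.roll(work, ox, axis=1), oy, axis=0)
            work = step(work)
            work = np.roll(np.roll(work, -ox, axis=1), -oy, axis=0)
        detail = work - octave
    return image + detail
```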
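Similarly, `--blend` mixes part of the previous output frame into the next input frame; per the future-work notes below, the blend factor is currently chosen at random each frame. A minimal sketch of the idea (the function name and the factor range are assumptions, not the project's API):

```python
import numpy as np

def blend_previous(frame, previous_dream, rng=np.random):
    # Assumed range for the random per-frame blend factor.
    alpha = rng.uniform(0.0, 0.5)
    return (1.0 - alpha) * frame + alpha * previous_dream
```

Putting a few of the flags together, a hypothetical video run with a style guide and blending might look like:

```
./main.py -s ./frames -d ./dreams -g guide.jpg -b -n 2
```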
If you're okay with running CPU-only models, you can install WSL2, set up Singularity inside it, and proceed with life as normal.
If you want to run GPU-based models, you will first need to install all of the dependencies (a sample install command follows the list):
- CUDA: available for download, with instructions, from NVIDIA's website
- Anaconda: available from the Anaconda website
- caffe-gpu: Installable through anaconda
- numpy: Installable through anaconda
- scipy: Installable through anaconda
- pillow: Installable through anaconda
- ipython: Installable through anaconda
- protobuf: Installable through anaconda
- numba: Installable through anaconda
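For example, the conda-installable dependencies above can usually be pulled in with a single command (package and channel availability varies by platform, so treat this as an illustration):

```
conda install caffe-gpu numpy scipy pillow ipython protobuf numba
```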
Once the dependencies are installed, you can run the model directly, bypassing Singularity, via the `src/dream.py` file. Note that you will need to supply the following arguments (an example invocation follows the list):

- `--prototext`: The path to the prototxt declaration of the Caffe model; usually has a `.prototxt` file extension, but this can vary.
- `--caffemodel`: The path to the model data for the Caffe model; usually has a `.caffemodel` file extension, but this can vary.
- `--source`: The path to the folder of images that should be processed.
- `--destination`: The path to the folder where the processed images should be saved.

All other flags are optional and can be used to fine-tune image generation.
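For example, using the default model files from the table above and hypothetical source/destination directories:

```
python src/dream.py \
    --prototext bvlc_googlenet/deploy.prototxt \
    --caffemodel bvlc_googlenet/bvlc_googlenet.caffemodel \
    --source ./frames \
    --destination ./dreams
```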
Uhhhhhhh.... have fun. I make no guarantee that this project will work for you, but the steps listed in the "running on windows" section should get you most of the way there ¯\_(ツ)_/¯
Let me know if you get it working.
- Multi-GPU: Add support for multi-GPU. This will be easy when blend is off, but will require some fancy transition-sensitive allocation logic for blended videos
- GPU selection: Select what GPU (or GPUs if I get multi-GPU working) you want the model to use in multi-GPU systems
- Better blend options: Right now the blend factor is just randomly selected each frame. It would be nice to include things like constant factors or Gaussian distributions
- Style transfers: Spaghetti videos! I need to find the models for this...
- Single Image Support: For when you only want to process a single image instead of a whole directory
- Support for other image formats: Currently the program only supports jpegs... :(