sheiksadi / sam-lstm-resnet

Crop and bound-box interesting parts of images using saliency maps generated with Attentive Convolutional LSTM Residual Network

Home Page: https://pypi.org/project/sam-lstm/

License: MIT License

Python 1.07% Jupyter Notebook 98.93%
saliency-prediction human-eye-fixations saliency-attentive-model opencv python3 scikit-image scipy tensorflow2

sam-lstm-resnet's Introduction

Crop and bound-box interesting regions of an image (smart cropping) from saliency maps generated with the SAM-LSTM-RESNET model

This repository contains the reference code, written in Python 3, for generating saliency maps of images using a Convolutional LSTM ResNet (implemented with TensorFlow 2) and smartly cropping images based on these maps.

Demo

[Image: demo]

Getting Started

Try Now On Colab

Pip Installation

pip install sam-lstm==1.0.0

Dependencies

  • TensorFlow 2.9.0
  • SciPy 1.9.3
  • scikit-image 0.19.3
  • NumPy 1.23.4
  • OpenCV 2.9.0
  • CUDA (GPU)

Tip: Building the environment on your local machine from scratch can take hours. If you want to get started quickly, use Google Colab with a GPU runtime. It's free, and all of these libraries are preinstalled there.

Note: The code must be run on a GPU runtime; otherwise it will fail. In a future release, the code will be made compatible with CPU runtimes as well.
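Since the code fails without a GPU, it can help to check for one before importing the package. The snippet below is a minimal sketch, not part of sam_lstm: the helper name `gpu_available` is hypothetical, and it relies only on TensorFlow's `tf.config.list_physical_devices`.

```python
def gpu_available() -> bool:
    """Return True if TensorFlow is installed and can see at least one GPU."""
    try:
        # Imported lazily so the check itself does not crash without TensorFlow.
        import tensorflow as tf
    except ImportError:
        return False
    return len(tf.config.list_physical_devices("GPU")) > 0


if __name__ == "__main__":
    if gpu_available():
        print("GPU detected.")
    else:
        print("No GPU detected: switch to a GPU runtime before using sam_lstm.")
```

On Google Colab, the runtime type can be changed to GPU from the runtime settings if this check reports no device.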

All you need is two lines!

# Create a folder "samples" in the current directory
# Upload some images (.jpg, .png) in it
from sam_lstm import SalMap
SalMap.auto()

With just these two lines, sam_lstm will compile the LSTM-based Saliency Attentive Convolutional model and then generate raw saliency maps in the maps folder, colored overlay maps in the cmaps folder, images with bounding boxes in the boxes folder, and cropped images in the crops folder. All of this happens automatically. Just make sure you have .jpg/.jpeg/.png images in the samples folder.
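Before running the two lines above, you may want to confirm that the samples folder actually contains usable images. The following is a minimal sketch using only the standard library; the helper `list_sample_images` is hypothetical and not part of sam_lstm, and matching extensions case-insensitively is an assumption about what the package accepts.

```python
from pathlib import Path

# Extensions named in this README; case-insensitive matching is an assumption.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_sample_images(folder="samples"):
    """Return the image files found directly inside `folder`, sorted by path."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


if __name__ == "__main__":
    images = list_sample_images()
    if images:
        print(f"Found {len(images)} image(s) in 'samples'.")
    else:
        print("No .jpg/.jpeg/.png images found in 'samples'.")
```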

Training the weights

from sam_lstm import SalMap

# Directory where the dataset is stored, and where checkpoints are saved
dataset_path = "dataset"
checkpoint = "/content/drive/MyDrive/Checkpoints/"

# Uncomment these lines if on GOOGLE COLAB
# import os
# from google.colab import drive
# drive.mount('/content/drive')
# if not os.path.exists(checkpoint):
#     os.mkdir(checkpoint)

s = SalMap()
s.compile()
s.load_weights()
s.train(dataset_path, checkpoint, steps_per_epoch=1000)

With these lines, you can start training the models using the Salicon 2017 dataset (which will be downloaded into the dataset directory).

Credits

This work has been built on top of the following works:

  1. Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model by Cornia et al., 2018
  2. Python 2 implementation (using Keras+Theano) by @marcellacornia. Check here

Scope of work done by @SheikSadi

  1. Implemented the source code in Python 3, using the latest versions (as of November 2022) of TensorFlow and OpenCV. The original work by @marcellacornia was written in Python 2 and used the Theano backend for Keras, both of which are no longer supported by the community.
  2. Updated the preprocessing stage to be compatible with the Salicon 2017 dataset.
  3. Converted the work into an open-source Python package readily installable from PyPI.
  4. Added a cropping module that allows for smart cropping of images. I have written a Descent from Hilltop algorithm for finding the bounding boxes by which the images are cropped.
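To illustrate the general idea of turning a saliency map into a crop box, here is a minimal sketch. It is not the Descent from Hilltop algorithm used by this package, whose details are not described here. It swaps in a plain relative threshold around the saliency peak, and the function name `salient_bounding_box` and the parameter `rel_threshold` are hypothetical.

```python
import numpy as np


def salient_bounding_box(saliency, rel_threshold=0.5):
    """Bounding box of all pixels at or above `rel_threshold` times the peak.

    Returns (left, top, right, bottom) with right/bottom exclusive,
    or None if the map has no positive values.
    """
    saliency = np.asarray(saliency, dtype=float)
    peak = saliency.max()
    if peak <= 0:
        return None
    mask = saliency >= rel_threshold * peak
    rows = np.flatnonzero(mask.any(axis=1))  # rows containing salient pixels
    cols = np.flatnonzero(mask.any(axis=0))  # columns containing salient pixels
    return int(cols[0]), int(rows[0]), int(cols[-1]) + 1, int(rows[-1]) + 1
```

With a box in this form, an image array of the same height and width as the map can be cropped as `image[top:bottom, left:right]`.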

The Underlying Neural Network

[Figure: the underlying neural network]

Resources

  1. Training and validation dataset
  2. No Top Resnet50 weights (NCHW format)
  3. Pre-trained weights
