rdroste / unisal

Unified Image and Video Saliency Modeling (ECCV 2020)

Home Page: https://arxiv.org/abs/2003.05477

License: Apache License 2.0

Languages: Python 99.81%, TeX 0.19%
Topics: machine-learning, deep-learning, saliency-detection, saliency-prediction, visual-saliency, visual-salience, video-saliency-prediction, video-saliency, image-saliency, eccv2020

unisal's Introduction

Unified Image and Video Saliency Modeling

This repository provides the code for the paper:

Richard Droste, Jianbo Jiao and J. Alison Noble. Unified Image and Video Saliency Modeling. In: ECCV (2020).

If you use UNISAL, please cite the following BibTeX entry:

@inproceedings{drostejiao2020,
     author = {{Droste}, Richard and {Jiao}, Jianbo and {Noble}, J. Alison},
      title = "{Unified Image and Video Saliency Modeling}",
  booktitle = {Proceedings of the 16th European Conference on Computer Vision (ECCV)},
       year = {2020},
}

Performance overview

Comparison of UNISAL with current state-of-the-art methods on the DHF1K Benchmark


Method

UNISAL method overview


Dependencies

To install the dependencies into a new conda environment, simply run:

conda env create -f environment.yml
source activate unisal

Alternatively, you can install them manually:

conda create --name unisal
source activate unisal
conda install pytorch=1.0 torchvision cudatoolkit=9.2 -c pytorch
conda install opencv=3.4 -c conda-forge
conda install scipy
pip install fire==0.2 tensorboardX==1.6
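To verify the environment and check whether a CUDA-capable GPU is visible to PyTorch, a quick sanity check can help (generic PyTorch calls, not part of this repository):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"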

Demo

We provide demo code that generates saliency predictions for example files from the DHF1K, Hollywood-2, UCF-Sports, SALICON and MIT1003 datasets. The predictions are generated with the pretrained weights in training_runs/pretrained_unisal.
Follow these steps:

  1. Download the following example files and extract the contents into the examples folder of the repository:
    Google Drive: .zip file or .tar.gz file
    Baidu Pan: .zip file (password: mp3y) or .tar.gz file (password: ixdd)

  2. Generate the demo predictions for the examples by running python run.py predict_examples

The predictions are written to saliency sub-directories of the example folders.


Training, scoring and test set predictions

The code for training and scoring the model and to generate test set predictions is included.

Data

For training and test set predictions, the relevant datasets need to be downloaded.

Specify the paths of the downloaded datasets with the environment variables DHF1K_DATA_DIR, SALICON_DATA_DIR, HOLLYWOOD_DATA_DIR, UCFSPORTS_DATA_DIR, MIT300_DATA_DIR and MIT1003_DATA_DIR.
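For example, in bash, the paths could be set like this (all paths below are placeholders for your own download locations):

export DHF1K_DATA_DIR=/path/to/DHF1K
export SALICON_DATA_DIR=/path/to/SALICON
export HOLLYWOOD_DATA_DIR=/path/to/Hollywood-2
export UCFSPORTS_DATA_DIR=/path/to/UCF-Sports
export MIT300_DATA_DIR=/path/to/MIT300
export MIT1003_DATA_DIR=/path/to/MIT1003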

Training

To train the model, simply run:

python run.py train

By default, this function computes the scores of the DHF1K and SALICON validation sets and the Hollywood-2 and UCF Sports test sets after training is finished. The training data and scores are saved in the training_runs folder. Alternatively, the training path can be overridden with the environment variable TRAIN_DIR.
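For example, to keep training runs outside the repository (the path is a placeholder):

TRAIN_DIR=/path/to/my_training_runs python run.py train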

Scoring

Any trained model can be scored with:

python run.py score_model --train_id <name of training folder>

If --train_id is omitted, the provided pretrained model is scored. The scores are saved in the corresponding training folder.

Test set predictions

To generate predictions for the test set of each dataset, follow these steps:

  1. Specify the directory where the predictions should be saved with the environment variable PRED_DIR.

  2. Generate the predictions by running python run.py generate_predictions --train_id <name of training folder>

If --train_id is omitted, predictions of the provided pretrained model are generated.
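For example, with a placeholder output path:

PRED_DIR=/path/to/predictions python run.py generate_predictions --train_id <name of training folder>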

unisal's People

Contributors

jianbojiao, makbn, rdroste


unisal's Issues

Missing script to recreate DHF1K directories

Hello, I need to evaluate UniSal on DHF1K, but the original DHF1K download comes with a directory structure different from the one used by your dataset class. Could you share a script to reorganise the dataset, or share a link to download an already reorganised DHF1K dataset?

SALICON data.py error

Hello, I'm a rookie and I'm just getting into video saliency detection. When running "python run.py train", the following problem appears, showing that it is a problem when reading in the SALICON data, as follows.
[screenshot of the error]
What is the reason for this, and what improvements do I need to make?

Question about cudatoolkit and environment

Hi! I'm trying to run this on my laptop. When I try:

conda install pytorch=1.0 torchvision cudatoolkit=9.2 -c pytorch

I got this:

Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - cudatoolkit=9.2

Current channels:

  - https://conda.anaconda.org/pytorch/win-64
  - https://conda.anaconda.org/pytorch/noarch
  - https://repo.anaconda.com/pkgs/main/win-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/win-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/msys2/win-64
  - https://repo.anaconda.com/pkgs/msys2/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

May I ask what the CUDA requirements are for training this project? Is it possible to run it without a GPU or CUDA?

I even tried:

conda install pytorch=1.0 torchvision -c pytorch

I got a version incompatibility:

Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: \
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
Examining python=3.8 ... Examining torchvision ... Examining conflict for pytorch python torchvision ... failed [interleaved conda progress bars elided]

UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:

Specifications:

  - torchvision -> python[version='>=2.7,<2.8.0a0']

Your python: python=3.8

If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.

The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

Package ca-certificates conflicts for:
python=3.8 -> openssl[version='>=1.1.1e,<1.1.2a'] -> ca-certificates
torchvision -> python -> ca-certificates

Thank you so much for your help.

finetuning on a different dataset

Hi! thanks for releasing the code for unisal!

Is it possible to provide a minimal set of scripts and instructions to finetune the unisal model on a dataset other than the ones listed in the repository? I notice that data.py, train.py, etc. come with various commands to switch across the datasets you used to train unisal. It would be great to get some help with training on just one new dataset.

Thanks,
Ekta

unisal can't transfer onnx model

Dear author, your work is very good, but the network uses torch.unbind everywhere, which makes it impossible to export an ONNX model and use ONNX acceleration. Have you tried this? Thank you.
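(For context, torch.unbind removes one dimension and returns a tuple of slices, so it can be rewritten with plain indexing; whether that helps a given ONNX exporter depends on the opset, so the sketch below only illustrates the equivalence.)

import torch

x = torch.randn(2, 3, 4)
a = torch.unbind(x, dim=1)  # tuple of 3 tensors, each of shape (2, 4)
b = tuple(x[:, t] for t in range(x.size(1)))  # the same slices via indexing
assert all(torch.equal(u, v) for u, v in zip(a, b))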

Getting image heatmap from saliency map and original image

Is there any way to get a good heatmap from the saliency map and the original image? I have been trying blending techniques etc., but the results are not good: sometimes the colour scale shifts towards blue, and sometimes there is not enough heat over the attention points. Thresholding the saliency map is the main problem. Can anyone help?
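(A common baseline for this is to colorize the saliency map and alpha-blend it over the frame. Below is a minimal sketch using standard OpenCV calls; the file names are placeholders, and the snippet is not part of this repository.)

import cv2
import numpy as np

img = cv2.imread("frame.png")  # original image (BGR)
sal = cv2.imread("saliency.png", cv2.IMREAD_GRAYSCALE)
sal = cv2.resize(sal, (img.shape[1], img.shape[0]))

# stretch the map to the full 0-255 range before colorizing
sal = cv2.normalize(sal, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
heat = cv2.applyColorMap(sal, cv2.COLORMAP_JET)  # blue = low, red = high

overlay = cv2.addWeighted(img, 0.6, heat, 0.4, 0)  # alpha-blend the two images
cv2.imwrite("overlay.png", overlay)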

Evaluation Metric

Hi, thank you for sharing your excellent work. I have a question from running the code. I found that the evaluation value is calculated as the average over all videos instead of over all frames of all videos. This makes the evaluation value higher than the one calculated by the metrics used in other papers. Is my understanding right?
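(To illustrate the distinction the question raises: averaging per-video means weights every video equally, while averaging over all frames weights long videos more heavily. The numbers below are made up.)

import numpy as np

video_a = np.array([0.9, 0.9])  # 2 frames with high scores
video_b = np.array([0.5] * 8)   # 8 frames with lower scores

per_video = np.mean([video_a.mean(), video_b.mean()])  # (0.9 + 0.5) / 2 = 0.70
per_frame = np.concatenate([video_a, video_b]).mean()  # 5.8 / 10     = 0.58

print(per_video, per_frame)  # the two conventions disagree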

SALICON loaded with 0 samples in training

Hello, Mr. Droste. I changed the dataset paths and ran python run.py train, but it doesn't work. This is the traceback. I find that SALICON is loaded with 0 samples in training. Could you please help me?

(unisal) LHY2@node2:/storage/LHY2/Code/unisal$ OMP_NUM_THEREADS=1 CUDA_VISIBLE_DEVICES=6 python run.py train
++++++++ /storage/LHY2/Dataset/VideoSaliency/SALICON
600 videos loaded (100.0%)
0 videos are too short [(0.0%)]
0 videos are too long (0.0%)
DHF1K      train dataset loaded with 600 samples
946 videos loaded (43.2%)
1244 videos are too short (56.8%)
0 videos are too long (0.0%)
Hollywood  train dataset loaded with 946 samples
82 videos loaded (88.2%)
11 videos are too short (11.8%)
0 videos are too long (0.0%)
UCFSports  train dataset loaded with 82 samples
**SALICON    train dataset loaded with 0 samples**

Epoch   0, lr 0.04000
DHF1K, phase train batch size: 4
Hollywood, phase train batch size: 4
UCFSports, phase train batch size: 4
SALICON, phase train batch size: 32
Traceback (most recent call last):
  File "run.py", line 97, in <module>
    fire.Fire()
  File "/storage/LHY/anaconda/envs/unisal/lib/python3.7/site-packages/fire/core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/storage/LHY/anaconda/envs/unisal/lib/python3.7/site-packages/fire/core.py", line 471, in _Fire
    target=component.__name__)
  File "/storage/LHY/anaconda/envs/unisal/lib/python3.7/site-packages/fire/core.py", line 675, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "run.py", line 13, in train
    trainer.fit()
  File "/storage/LHY2/Code/unisal/unisal/train.py", line 237, in fit
    self.fit_epoch()
  File "/storage/LHY2/Code/unisal/unisal/train.py", line 265, in fit_epoch
    self.fit_phase()
  File "/storage/LHY2/Code/unisal/unisal/train.py", line 281, in fit_phase
    for src in sources}
  File "/storage/LHY2/Code/unisal/unisal/train.py", line 281, in <dictcomp>
    for src in sources}
  File "/storage/LHY2/Code/unisal/unisal/train.py", line 1067, in get_dataloader
    drop_last=True,
  File "/storage/LHY/anaconda/envs/unisal/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 802, in __init__
    sampler = RandomSampler(dataset)
  File "/storage/LHY/anaconda/envs/unisal/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 64, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integeral value, but got num_samples=0

On the pooling method of the backbone network

Great work! But I'm curious about the choice of the unusual pooling method in the MobileNet backbone. You didn't use a standard pooling method like average/max pooling or pooling by strided convolution, but instead directly slice out a quarter of the input. I think this stops the gradient for the other 75% of the inputs during backprop and makes those inputs useless, which doesn't seem to make sense. Is this an intentional design or an arbitrary choice?
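(To make the concern concrete: if the pooling is implemented as strided slicing, gradients indeed reach only the kept positions. A minimal sketch, assuming a slicing of the form x[..., ::2, ::2]; whether this matches the actual UNISAL backbone is an assumption.)

import torch

x = torch.randn(1, 8, 4, 4, requires_grad=True)
y = x[..., ::2, ::2]  # keep every second row/column: a quarter of the positions
y.sum().backward()

# only the kept positions receive any gradient
print((x.grad != 0).float().mean().item())  # prints 0.25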

Regarding the AUC performance on the SALICON testing set

Hi,

Thank you for sharing your awesome work.

I read your paper, and I noticed that you report AUC_Judd in Table 3.

However, when we submit the results to the SALICON official website, it seems that they only give the AUC_Borji value for the SALICON test set, as stated by @mjiang, one of the authors of the SALICON evaluation code.


If the AUC value from the SALICON website is AUC_Borji, how did you get the AUC_Judd value reported in Table 3 of your paper?

Could you please explain that? Thank you so much!

Best.
