
wgcban / hypertransformer

119 stars · 1 watcher · 16 forks · 19.75 MB

[CVPR'22] HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening

Home Page: https://www.wgcban.com/research#h.ar24vwqlm021

License: MIT License

Python 87.84% MATLAB 12.16%
pansharpening super-resolution hyperspectral-imaging image-fusion deep-learning multispectral-images transformers attention-mechanism


hypertransformer's Issues

Clerical errors in the paper

Hello! Thanks for your work! I think there are clerical errors in the second sentence of the second paragraph on page six of the paper. The exact words are "We use 10-th, 10-th, and 12-th spectral bands as the blue-band" and "Errur Relative Globale Adimensionnelle Desynthese (ERGAS)". I wonder if these are clerical errors.

Having trouble reproducing the score on the botswana4 dataset

Hello, thank you for the code and the work here. When I tried to reproduce the score on the botswana4 dataset, I only got roughly 26 dB PSNR with the two-stage training strategy, and the pretraining phase only reaches 16 dB PSNR. Do you have any idea what went wrong? Thank you.

Furthermore, when I test with the pretrained weights and config file you provided, I get a strange score, as shown below:
[screenshot of the reported metrics]

When using the final_prediction.mat you provided, the PSNR against the ground truth I generated is 29.84 dB. My dataset was generated using the MATLAB script process_botswana.m. Do you have any idea where things went wrong? Thank you.
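
For reference, a minimal way to reproduce that PSNR check (a sketch; the .mat variable names here are assumptions, not the repo's actual keys):

import numpy as np
from scipy.io import loadmat

# Load the provided prediction and the locally generated ground truth.
# The keys "pred" and "gt" are assumptions -- inspect loadmat(...).keys()
# to find the actual variable names in each file.
pred = loadmat("final_prediction.mat")["pred"].astype(np.float64)
gt = loadmat("ground_truth.mat")["gt"].astype(np.float64)

# Scale both cubes to [0, 1] with a shared maximum before comparing.
max_value = max(pred.max(), gt.max())
pred, gt = pred / max_value, gt / max_value

mse = np.mean((pred - gt) ** 2)
psnr = 10.0 * np.log10(1.0 / mse)  # peak signal is 1.0 after scaling
print(f"PSNR: {psnr:.2f} dB")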

A problem when reproducing large-scale-factor super-resolution

Thank you for your work on HyperTransformer! When I tried HyperTransformer on the CAVE dataset with a 32× scale factor for super-resolution, the loss was enormous. To make the code suitable for the 32× super-resolution task, I upsampled the MS_image by 4× and then fed it into the feature extractor in the backbone (i.e., self.SFE in the code). The HSIs in CAVE have been normalized to 0–1, but the numerical range of the reconstructed HSI is also huge, about 3e4.

Size of MS_image: 8×8×31; size of PAN_image (an RGB image): 32×32×3; batch size: 5.
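
Concretely, the 4× pre-upsampling step described above might look like this (a sketch; the bicubic mode and tensor layout are assumptions, not necessarily what the issue author used):

import torch
import torch.nn.functional as F

# Batch of low-resolution MS images: (B, C, H, W) = (5, 31, 8, 8)
ms_image = torch.rand(5, 31, 8, 8)

# Upsample 4x (8 -> 32) so the spatial size matches the 32x32 PAN/RGB input
# before the shallow feature extractor (self.SFE in the repo's backbone).
ms_up = F.interpolate(ms_image, scale_factor=4, mode="bicubic", align_corners=False)
print(ms_up.shape)  # torch.Size([5, 31, 32, 32])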

Training Epoch: 1 Loss: 3612763.764423077
Training Epoch: 2 Loss: 766998.2764423077
Training Epoch: 3 Loss: 350786.46514423075
Training Epoch: 4 Loss: 230237.5733173077
Training Epoch: 5 Loss: 184773.47115384616
Training Epoch: 6 Loss: 160125.31189903847
Training Epoch: 7 Loss: 137537.77524038462
Training Epoch: 8 Loss: 117618.86358173077
Training Epoch: 9 Loss: 106500.88341346153
Training Epoch: 10 Loss: 96507.72115384616

Furthermore, here is the output displayed in RGB; I want to know whether something is wrong:
https://github.com/Caoxuheng/imgs/blob/main/1.png

KeyError: 'multi_scale_loss' on training the backbone

Hi @wgcban,
thank you very much for publishing the code for the paper!

I was trying to train the backbone (first stage):

python train.py --config configs/config_HSIT_PRE.json

But the training script raises an error:

Traceback (most recent call last):
  File "train.py", line 346, in <module>
    train(epoch)
  File "train.py", line 198, in train
    if config[config["train_dataset"]]["multi_scale_loss"]:
KeyError: 'multi_scale_loss'

Is the configuration value for multi_scale_loss missing?
Which value do I need to set?
Thanks for your help! :D
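
A minimal workaround until the config is fixed (a sketch; defaulting the flag to off is an assumption about the intended behavior, not the maintainer's fix):

# Sketch of a defensive lookup for train.py (line 198); defaulting to False
# when the key is absent is an assumption about the intended behavior.
dataset_cfg = config[config["train_dataset"]]
if dataset_cfg.get("multi_scale_loss", False):
    pass  # compute the multi-scale loss here, as the original train() does

Alternatively, adding a "multi_scale_loss" entry (true or false) to the dataset section of configs/config_HSIT_PRE.json should avoid the KeyError.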

How to determine the parameters in process_xxx.m

Hello, I am very interested in your work. I want to ask how the parameters (i.e., 60:80, 1:29, and 1:100) in the process_xxx.m scripts were determined, for example: chikusei_pan = mean(chikusei(:,:,60:80), 3); Botswana_pan = mean(Botswana(:,:,1:29), 3); pavia_pan = mean(pavia(:,:,1:100), 3);
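
The pattern in those snippets is averaging a slice of spectral bands to synthesize a panchromatic image; the ranges are presumably chosen so the averaged bands approximate a panchromatic (visible-range) response for each sensor, though only the authors can confirm the exact choice. Equivalent logic in Python (a sketch; the cube shape is illustrative):

import numpy as np

# Hyperspectral cube of shape (H, W, bands); random values stand in for data.
cube = np.random.rand(256, 256, 145)

# Synthesize a panchromatic band by averaging the spectral bands inside the
# chosen window, mirroring Botswana_pan = mean(Botswana(:,:,1:29), 3).
# MATLAB's 1:29 (1-based, inclusive) maps to the NumPy slice 0:29.
pan = cube[:, :, 0:29].mean(axis=2)
print(pan.shape)  # (256, 256)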

Regarding the parameters in the config_HSIT.json file, the number of heads in multi-head attention, and the calculation of metrics

@wgcban
Hello sir,
thank you for your outstanding work and for providing the code for HyperTransformer. So that I can cite your paper properly, I have a few questions. First, the paper mentions that the best performance was achieved with 16 heads in the multi-head attention, but the best model you provide in config_HSIT.json uses 8 heads, and there are errors in the RGB parameters in the same file. Could you provide the correct best model and config_HSIT.json file? It is difficult to reproduce your method without them.
Second, for the calculation of the metrics, did you use the results generated by the code, or did you recalculate them with MATLAB?
Your response is very important to me, and I am grateful for your work.
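
For reference on the second question, ERGAS is conventionally defined as 100·(h/l) times the square root of the band-averaged squared relative RMSE. A sketch of that textbook definition in Python (not necessarily the exact code used for the paper's tables):

import numpy as np

def ergas(pred, gt, ratio=0.25):
    # ERGAS = 100 * (h/l) * sqrt(mean over bands of (RMSE_b / mean_b)^2),
    # where h/l is the PAN-to-MS resolution ratio (e.g. 1/4 for 4x sharpening).
    rmse_b = np.sqrt(np.mean((pred - gt) ** 2, axis=(0, 1)))  # per-band RMSE
    mean_b = np.mean(gt, axis=(0, 1))                          # per-band mean
    return 100.0 * ratio * np.sqrt(np.mean((rmse_b / mean_b) ** 2))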

Help: missing disp_rgb function in the MATLAB code

Hello sir, sorry to bother you! I have a problem with the MATLAB code: when I try to run the process_pavia.m script, I always get an error about the missing function disp_rgb. How can I fix it?
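
disp_rgb is presumably a small helper that renders an RGB composite from a hyperspectral cube. If it is missing from your MATLAB path, the equivalent logic is only a few lines; a Python sketch (the band indices you pass are up to you):

import numpy as np
import matplotlib.pyplot as plt

def disp_rgb(cube, r, g, b):
    # Stack three chosen bands of an (H, W, bands) cube into an RGB composite
    # and stretch it to [0, 1] for display.
    rgb = np.stack([cube[:, :, r], cube[:, :, g], cube[:, :, b]], axis=-1)
    rgb = (rgb - rgb.min()) / (rgb.max() - rgb.min() + 1e-12)
    plt.imshow(rgb)
    plt.axis("off")
    plt.show()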

A problem when reproducing the HyperTransformer code

Hello Chaminda,

Thank you for the code and your work on HyperTransformer. When I tried to reproduce the scores on the botswana4 dataset, the metrics I got were far from the expected values:

pretrain:
{"loss": 0.07473030593246222, "cc": 0.9290400743484497, "sam": 3.0020320415496826, "rmse": 0.021659649908542633, "ergas": 0.6483104825019836, "psnr": 27.892120361328125}

train:
{"loss": 0.08555552270263433, "cc": 0.8993576765060425, "sam": 3.35520076751709, "rmse": 0.02654602937400341, "ergas": 0.7366535663604736, "psnr": 26.585630416870117}

I also used the trained model you provided, and I got:
{"loss": 0.05360260047018528, "cc": 0.9539724588394165, "sam": 2.2932522296905518, "rmse": 0.01636636257171631, "ergas": 1.8692368268966675, "psnr": 30.393962860107422}

Both results are far from the expected values.

Then I checked the GitHub issues for HyperTransformer and changed "max_value": 8000 to "max_value": 9816 (I got this value from the MATLAB preprocessing code), and the pretrain metrics improved:

{"loss": 0.03306201007217169, "cc": 0.964657187461853, "sam": 1.863145351409912, "rmse": 0.01357241254299879, "ergas": 0.3927982747554779, "psnr": 32.014068603515625}

But they are still far from the expected values.

Do you know how to solve this problem?
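
For reference, one way to derive a dataset-specific max_value like the 9816 above (a sketch; the file and variable names are assumptions, so check the keys of the .mat produced by the MATLAB preprocessing first):

import numpy as np
from scipy.io import loadmat

# "Botswana" as file and variable name is an assumption -- inspect
# loadmat(...).keys() on the output of process_botswana.m to confirm.
data = loadmat("Botswana.mat")["Botswana"].astype(np.float64)
print(float(data.max()))  # use this as "max_value" in the dataset config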
