helloricky123 / siamese-rpn
Full reimplementation of Siamese-RPN; achieves 0.24 EAO on VOT2017.

License: MIT License

Python 100.00%
siamese pytorch siamrpn

siamese-rpn's Introduction

Siamese-RPN

This is a PyTorch implementation of SiameseRPN. This project is mainly based on SiamFC-PyTorch and DaSiamRPN.

For more details about SiamRPN, please refer to the paper High Performance Visual Tracking with Siamese Region Proposal Network by Bo Li, Junjie Yan, Wei Wu, Zheng Zhu, and Xiaolin Hu [1].

This repository includes training and tracking codes.

Results

This project achieves 0.626 AUC on OTB100 and beats DaSiamRPN on 46 videos. Test results of 50 trained models on OTB100 are available in eval_result.json. The best model is the one from epoch 38.

Data preparation:

You should first obtain the ILSVRC2015 VID dataset and the YouTube-BB dataset. This process is a little troublesome, and this part of the code has not been cleaned up yet. If anyone cleans it up, please open a pull request.

python bin/create_dataset_ytbid.py --vid-dir /PATH/TO/ILSVRC2015 --ytb-dir /PATH/TO/YT-BB --output-dir /PATH/TO/SAVE_DATA --num_threads 6

The command above produces the processed dataset; I have also uploaded it to Baidu Yun. Use this data to create the LMDB. Link: https://pan.baidu.com/s/1QnQEM_jtc3alX8RyZ3i4-g Password: myq4

python bin/create_lmdb.py --data-dir /PATH/TO/SAVE_DATA --output-dir /PATH/TO/RESULT.lmdb --num_threads 12

Training phase:

python bin/train_siamrpn.py --data_dir /PATH/TO/SAVE_DATA

Test phase:

First change data_path in test_OTB.py, then run:

python bin/test_OTB.py -ms /PATH/TO/MODEL -v cvpr2013

Environment:

python version == 3.6.5

pytorch version == 1.0.0

Model Download:

Pretrained model on Imagenet: https://drive.google.com/drive/folders/1HJOvl_irX3KFbtfj88_FVLtukMI1GTCR

Model with 0.626 AUC: https://pan.baidu.com/s/1vSvTqxaFwgmZdS00U3YIzQ (password: v91k)

Reference

[1] B. Li, J. Yan, W. Wu, Z. Zhu, and X. Hu. High Performance Visual Tracking with Siamese Region Proposal Network. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

siamese-rpn's People

Contributors

helloricky123

siamese-rpn's Issues

The anchors seem to be wrong

The total stride is not 8.
In generate_anchors.py, the anchors are tiled score_size * score_size times, while xx and yy are tiled num_anchor times; it seems the anchor centers should not be replaced directly with xx and yy...
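
For reference, here is a minimal sketch of anchor tiling in which the repetition order of the base anchors matches the repetition order of the center offsets, which is the consistency the issue is about. Values such as total_stride = 8 and score_size = 19 are assumptions, and this is not the repository's exact generate_anchors.py; base_anchors would be the (num_anchor, 4) array of [0, 0, w, h] boxes built from the configured ratios and scales.

    import numpy as np

    def tile_anchors(base_anchors, total_stride=8, score_size=19):
        """Tile (num_anchor, 4) base anchors [cx, cy, w, h] over the score map."""
        num_anchor = base_anchors.shape[0]
        # anchor-major layout: anchor 0 at every grid position, then anchor 1, ...
        anchors = np.tile(base_anchors, score_size * score_size).reshape((-1, 4)).astype(np.float32)

        ori = -(score_size // 2) * total_stride
        xx, yy = np.meshgrid([ori + total_stride * dx for dx in range(score_size)],
                             [ori + total_stride * dy for dy in range(score_size)])
        # repeat the whole grid once per anchor so it lines up with `anchors`
        xx = np.tile(xx.flatten(), (num_anchor, 1)).flatten()
        yy = np.tile(yy.flatten(), (num_anchor, 1)).flatten()

        anchors[:, 0] = xx
        anchors[:, 1] = yy
        return anchors  # shape: (num_anchor * score_size * score_size, 4)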

About instance_size

In config.py, instance_size = 271, but the size used in the paper is 255.
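
For context, the instance size determines the score-map size. Assuming the usual SiamRPN settings of exemplar_size = 127 and total_stride = 8, a quick check of both values:

    exemplar_size, total_stride = 127, 8   # usual SiamRPN settings (assumed here)

    for instance_size in (255, 271):
        score_size = (instance_size - exemplar_size) // total_stride + 1
        print(instance_size, '->', score_size)   # 255 -> 17, 271 -> 19

So 271 gives a 19x19 score map instead of the paper's 17x17, which matches the enlarged search region used in DaSiamRPN.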

How to download

For some reason, I can't download the YouTube-BB dataset. Does anyone know of a way to download it?

Noble action to open-source such work; some suggestions to improve

1. The architecture and loss function are based on RPN, yet the module is named siamfc. I think it would be more appropriate to rename "SiamFC-PyTorch/siamfc/" to "SiamFC-PyTorch/siamRPN/".

2. It would be nice to add benchmark or experiment results at the end of README.md, compared against your chosen baseline, like this guy's project does.

ps: judging from your dataloader implementation, your data source seems to be only ILSVRC-VID, which is smaller than the data the original SiamRPN authors used in their experiments. Given that, does your result still beat the baseline?

3. Can you explain config.warm_epoch vs config.epoch?

4. Can you explain where np.clip is used, and why np.clip rather than torch.clip? (A small illustration follows below.)

Best regards
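
On point 4: np.clip and torch.clamp perform the same elementwise bounding (torch.clip is only a later alias of torch.clamp). A tiny illustration, not tied to any particular line of this repository:

    import numpy as np
    import torch

    a = np.array([-3.0, 0.5, 9.0])

    # numpy: bound values elementwise to [0, 1]
    print(np.clip(a, 0.0, 1.0))                # -> 0.0, 0.5, 1.0

    # torch equivalent of np.clip
    t = torch.from_numpy(a)
    print(torch.clamp(t, min=0.0, max=1.0))    # -> 0.0, 0.5, 1.0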

data processing

Hi, thank you for the code. You generate the meta_data.pkl file in create_dataset_ytbid.py from YouTube-BB and ILSVRC2015, but the YouTube-BB dataset I obtained has already been processed and comes with its own meta_data.pkl. What should I do before running create_lmdb.py? Can I combine the two meta_data.pkl files directly, or should I set --data-dir to two paths? I hope you can give me some advice. Thank you very much!
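
One possible approach, assuming both meta_data.pkl files store a plain Python list of per-video records with the same schema (an assumption; check the actual structure first), is to merge them before building the LMDB. The paths below are placeholders:

    import pickle

    # Placeholder paths; replace with your own directories.
    with open('/PATH/TO/VID_DATA/meta_data.pkl', 'rb') as f:
        vid_meta = pickle.load(f)
    with open('/PATH/TO/YTBB_DATA/meta_data.pkl', 'rb') as f:
        ytbb_meta = pickle.load(f)

    # Only valid if both files hold lists of records in the same format.
    merged = vid_meta + ytbb_meta

    with open('/PATH/TO/SAVE_DATA/meta_data.pkl', 'wb') as f:
        pickle.dump(merged, f)

The image crops referenced by both files would also need to sit under the single --data-dir passed to create_lmdb.py.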

pretrained model

Hello, thank you for open-sourcing your code. I want to run it. Which pretrained model does your code use? Can you give me the download URL?

About the loss

You did good work. I also implemented the training code, but I always get a higher loss on the validation set than on the training set, and the performance on the OTB dataset is not good. Did you have this problem? What are the values of your regression and classification losses, respectively?

Evaluating on OTB100, I get 0.6016 AUC

First of all, thanks for your great work!
But when I use siam_38.pth to evaluate on OTB100, I only get 0.6016 AUC, and I don't know why.
I made a small change to your code: splitting 'Jogging' and 'Skating2' into 'Jogging-1'/'Jogging-2' and 'Skating2-1'/'Skating2-2'.
I also use a different PyTorch version, 0.4.0.
Could you tell me which parameters you adjusted to improve the tracker's performance?
Thanks

Where is the dataset_ssd directory?

I got this error message:

FileNotFoundError: [Errno 2] No such file or directory: '/dataset_ssd/ytb_vid_rpn_id/meta_data.pkl'

Where is the dataset_ssd directory in the 59version? Should I create such a directory manually, download meta_data.pkl from the Internet, and put it there?

Can't run on Win10

When I run demo_siamfc.py on Win10, it reports this error:

D:\code\SiamFC-PyTorch-master\SiamFC-PyTorch>python bin/demo_siamfc.py --video-dir D:/code/SiamFC-PyTorch-master/SiamFC-PyTorch/data/David --gpu-id 0 --model-path D:/code/SiamFC-PyTorch-master/SiamFC-PyTorch/models
Traceback (most recent call last):
File "bin/demo_siamfc.py", line 57, in
Fire(main)
File "C:\Users\Acer\AppData\Local\Programs\Python\Python35\lib\site-packages\fire\core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "C:\Users\Acer\AppData\Local\Programs\Python\Python35\lib\site-packages\fire\core.py", line 366, in _Fire
component, remaining_args)
File "C:\Users\Acer\AppData\Local\Programs\Python\Python35\lib\site-packages\fire\core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "bin/demo_siamfc.py", line 27, in main
tracker = SiamFCTracker(model_path, gpu_id)
File "D:\code\SiamFC-PyTorch-master\SiamFC-PyTorch\siamfc\tracker.py", line 23, in init
self.model.load_state_dict(torch.load(model_path))
File "C:\Users\Acer\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\serialization.py", line 365, in load
f = open(f, 'rb')
PermissionError: [Errno 13] Permission denied: 'D:/code/SiamFC-PyTorch-master/SiamFC-PyTorch/models'

Even after restarting the cmd window and running it as administrator, the error persists. I can't run demo_siamfc on Ubuntu because I haven't installed the GPU driver there. Please lend a hand as soon as you see this. Thanks.

ModuleNotFoundError: No module named 'siamfc'

In your bin/demo_siamrpn.py file there is a line of code like:

from siamfc import SiamRPNTracker

Can you provide the siamfc module so that we can run your code more easily?

Does the VOT toolkit support PyTorch 1.0?

Your PyTorch version is 1.0; is that supported by the VOT toolkit? In my case, some errors occurred when using PyTorch 0.4.1, and only 0.3.1 works well.

Training the 54version without yt_bb

Hi, I trained your 54version with VID only and got NaN loss; after reducing the LR it ran properly. But when I run test_OTB.py with my trained model, an error occurs due to data overflow. Why? Awaiting your reply.

Shouldn't bbox be 0-based before passing it to the crop method in ImageNet data generation?

bbox = np.array(
    [(bbox[2] + bbox[0]) / 2, (bbox[3] + bbox[1]) / 2,
     bbox[2] - bbox[0] + 1, bbox[3] - bbox[1] + 1])
instance_img, w, h, _ = get_instance_image(img, bbox,
                                            config.exemplar_size, instance_crop_size,
                                            config.context_amount,
                                            img_mean)

Hey! Thanks for sharing your work. In the lines above, I see that you transform the ImageNet bbox (1-based [xmin, ymin, xmax, ymax]?) into center-based width/height form ([cx, cy, w, h]). I think the center-based bbox you get will be 1-based:

 bbox = np.array( 
     [(bbox[2] + bbox[0]) / 2,   # <--- cx will be 1-based since xmin and xmax are 1-based?
      (bbox[3] + bbox[1]) / 2,
       bbox[2] - bbox[0] + 1, 
      bbox[3] - bbox[1] + 1]) 

I guess the bbox above should be a 0-based box before passing it to the get_instance_image method, which expects a 0-based centered box:

 bbox = np.array( 
     [(bbox[2] + bbox[0]) / 2 - 1, 
      (bbox[3] + bbox[1]) / 2 - 1,
       bbox[2] - bbox[0] + 1, 
      bbox[3] - bbox[1] + 1]) 

Otherwise the generated crops will not have the object exactly at the center; it will be off by one. What do you think?

some questions

Hi, thanks for your work. I have some questions about the code in the crop_and_pad function in utils:
if any([top, bottom, left, right]):
    te_im = np.zeros((r + top + bottom, c + left + right, k), np.uint8)  # 0 is better than 1 initialization
    te_im[top:top + r, left:left + c, :] = img
    if top:
        te_im[0:top, left:left + c, :] = img_mean
    if bottom:
        te_im[r + top:, left:left + c, :] = img_mean
    if left:
        te_im[:, 0:left, :] = img_mean
    if right:
        te_im[:, c + left:, :] = img_mean

Should it be changed to the following?
if any([top, bottom, left, right]):
    te_im = np.zeros((r + top + bottom, c + left + right, k), np.uint8)  # 0 is better than 1 initialization
    te_im[top:top + r, left:left + c, :] = img
    if top:
        te_im[0:top, :, :] = img_mean
    if bottom:
        te_im[r + top:, :, :] = img_mean
    if left:
        te_im[:, 0:left, :] = img_mean
    if right:
        te_im[:, c + left:, :] = img_mean
The other problem is that it takes a long time to run bin/create_dataset and bin/create_lmdb.
Thank you!

Is it a bug when gt_cx or gt_cy is less than zero?

The ground truth's cx and cy are sometimes less than zero.

I made a change in siamfc/dataset.py to also return the bbox info (original vs. with bbox returned):

# original:
def __getitem__(self, index):
    return exemplar_img, instance_img, regression_target, conf_target.astype(np.int64)

# with bbox returned:
def __getitem__(self, index):
    return exemplar_img, instance_img, regression_target, conf_target.astype(np.int64), \
        gt_cx, gt_cy, gt_w, gt_h

Then I debugged the bbox by printing it twice:


gt_cx,gt_cy ,gt_w, gt_h :: tensor([8.], dtype=torch.float64) tensor([-2.], dtype=torch.float64) \
tensor([83.1059], dtype=torch.float64) tensor([49.2817], dtype=torch.float64)

gt_cx,gt_cy ,gt_w, gt_h :: tensor([6.], dtype=torch.float64) tensor([7.5000], dtype=torch.float64)\
 tensor([76.8523], dtype=torch.float64) tensor([43.6093], dtype=torch.float64)

The error may occur in RandomCrop, where gt_cx = cx_o - cx at lines 136-137. The problem is there.

some questions about training

Thanks for your excellent work. At the beginning of training, cls_loss and reg_loss are NaN. Is this because I didn't load the pretrained model, or something else? If I want to change the Siamese network backbone, how should I prepare a new pretrained model?

Where can I get the pretrained AlexNet?

The AlexNet used in your work is not the same as the standard AlexNet, so how can I get the pretrained AlexNet for this work? Should I train it on the ImageNet dataset myself, or is there another way?
Thanks for your reply!

about the dataset

Thank you for your work! I don't have the YTB dataset. If I use the dataset you uploaded on 5.30, do I still need to download the YTB dataset?

some issues

Two issues:

  1. There seems to be no .flatten() in PyTorch 0.4.0.

  2. There is no reduction='sum' in PyTorch 0.4.0. (A compatibility sketch follows below.)

Thanks
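
A minimal compatibility sketch for PyTorch 0.4.0, assuming the two calls in question are a tensor .flatten() and an F.cross_entropy(..., reduction='sum') somewhere in the loss code (an assumption about where they appear, not a quote of this repository):

    import torch
    import torch.nn.functional as F

    x = torch.randn(2, 3, 4)
    # PyTorch >= 0.4.1: flat = x.flatten()
    # PyTorch 0.4.0 equivalent:
    flat = x.contiguous().view(-1)

    logits = torch.randn(8, 2)
    target = torch.randint(0, 2, (8,), dtype=torch.long)
    # PyTorch 1.0: loss = F.cross_entropy(logits, target, reduction='sum')
    # PyTorch 0.4.0 equivalent: disable averaging, keep the reduction
    loss = F.cross_entropy(logits, target, size_average=False, reduce=True)

On newer PyTorch the size_average/reduce arguments still work but emit a deprecation warning, so branching on torch.__version__ is another option.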

Why is the AUC lower when the pretrained model is loaded?

I loaded the pretrained model alexnet.pth when running train_siamrpn.py and got a lower AUC on the OTB dataset. Without the pretrained model and the youtube_bb dataset, I only get about 0.3 AUC on OTB. Could you give me some advice?

How to prepare YoutubeBB data?

Hi Ruiqi,
Can you please elaborate on how you prepared YouTube-BB before running create_dataset_ytbid.py? How did you download and store it in /mnt/diska1/YT-BB/v2/youtube_dection_frame_temp/ and /mnt/usershare/zrq/pytorch/lab/model/zhangruiqi/ytb_vid/benchmark.ytbb/xml?

Thank you!
Yiming

separate the two datasets

Hello, can you split the two datasets and upload them separately? The combined dataset is a bit big; I already have one of the datasets but not YTB. Thank you.

I ran your program and got only 0.1% AUC on OTB100

Thanks for your work. I downloaded your code and only changed data_path in test_OTB100 to my folder of OTB100 sequences, using the model siamrpn_38 that you provided, but when I run test_OTB100.py I get only 0.1% AUC, and I don't know why. Is there anything I forgot or did wrong?

Fixing the first three conv layers

Hello,
Two questions about training:

  1. Did you freeze the last convolution layer too in the training phase, or only the first three blocks listed below?
    keys_former3conv = ['featureExtract.0.weight', 'featureExtract.0.bias', 'featureExtract.1.weight',
    'featureExtract.1.bias', 'featureExtract.1.running_mean', 'featureExtract.1.running_var',
    'featureExtract.4.weight', 'featureExtract.4.bias', 'featureExtract.5.weight',
    'featureExtract.5.bias', 'featureExtract.5.running_mean', 'featureExtract.5.running_var',
    'featureExtract.8.weight', 'featureExtract.8.bias', 'featureExtract.9.weight',
    'featureExtract.9.bias', 'featureExtract.9.running_mean', 'featureExtract.9.running_var']

  2. What is the point of setting track_running_stats to False if you have already set requires_grad to False?
    'model.featureExtract[layer].track_running_stats = False'
    (A sketch illustrating both points follows below.)
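
A minimal sketch of what freezing the first three conv blocks typically looks like, assuming featureExtract is an AlexNet-style nn.Sequential whose conv/BN layers for those blocks sit at indices 0 through 9 (as the key names above suggest). This is an illustration, not the repository's exact code, but it shows why requires_grad alone is not enough: with requires_grad = False the BatchNorm layers still update running_mean/running_var during forward passes in train mode, which is what track_running_stats = False (or putting them in eval mode) is there to stop.

    import torch.nn as nn

    def freeze_former_3_conv(model):
        """Freeze the first three conv blocks of model.featureExtract (indices 0-9 assumed)."""
        for idx in range(10):
            layer = model.featureExtract[idx]
            for p in layer.parameters():
                p.requires_grad = False      # stop gradient updates to weights/biases
            if isinstance(layer, nn.BatchNorm2d):
                layer.eval()                 # stop updating running_mean / running_var

Note that a later model.train() call flips those BatchNorm layers back into train mode, whereas track_running_stats = False persists, which may be why the repository touches that flag instead. Remember to pass only the trainable parameters to the optimizer, e.g. filter(lambda p: p.requires_grad, model.parameters()).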

train process

"The ./siamrpn_30.pth is got after 20 epochs' 8e-4 lr and 30 epochs' 1e-2."
What does this mean?
Did you first train the model for 20 epochs at lr 8e-4 and 30 epochs at 1e-2, and then use the train.py script to train it again?
If we want to train it from scratch, how should we set the lr schedule? (A hypothetical sketch follows below.)
Thanks
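
Reading that sentence literally, a hypothetical two-stage schedule for training from scratch could look like the sketch below; only the 20 epochs at 8e-4 followed by the switch to 1e-2 are taken from the quote, and any decay inside each stage is a guess:

    import torch
    import torch.optim as optim

    model = torch.nn.Conv2d(3, 8, 3)              # stand-in for the real SiamRPN model
    optimizer = optim.SGD(model.parameters(), lr=8e-4, momentum=0.9)

    warm_epochs, total_epochs = 20, 50            # 20 warm-up epochs + 30 main epochs
    for epoch in range(total_epochs):
        lr = 8e-4 if epoch < warm_epochs else 1e-2
        for group in optimizer.param_groups:      # apply the stage's learning rate
            group['lr'] = lr
        # ... run one training epoch with `optimizer` here ...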

Question about training

I used your training code with the pretrained AlexNet to train SiamRPN, but the performance is pretty bad when using only the VID dataset. I use the test code provided by arbitularov. When testing your pretrained model on OTB2015, the success score is 0.577 and the precision score is 0.772. However, with the model saved during my own training, the performance is worse (success is about 0.45 and precision about 0.61). I'm wondering whether the large gap comes from the lack of the YouTube dataset or just training fluctuation.

about the 54version

I noticed that config.py in your 54version differs from the earlier version: the instance_size in the 54version is 271, while in the earlier version it is 255. Does this mean the data preprocessing needs to be done again when running the 54version of the code?

about the Proposal selection

Is the proposal selection in this code the same as in the "SiamRPN" paper or the "DaSiamRPN" paper?
I just cannot find the new proposal selection method in the foolwood/DaSiamRPN code.

what's wrong

File "/home/fc/文档/Siamese-RPN-master/59version/lib/utils.py", line 99, in crop_and_pad
te_im[:, 0:left, :] = img_mean
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

About dataset downloading link

Because of the limitations of Baidu Yun, could you upload the dataset mentioned in the README.md to Google Drive? Thanks.

About the youtube_BB dataset (111 GB)

The SiamRPN authors have published their training dataset; their youtube_BB dataset is about 400 GB. Has your dataset been cropped? Anyway, thanks for your work!

I want to ask some questions about the code in net/dataset.py

I read your code carefully, and I want to know the meaning or function of the following lines in dataset.py. Thank you so much.
" cy_o = (im_h - 1) / 2
cx_o = (im_w - 1) / 2
cy = cy_o + np.random.randint(- self.max_translate, self.max_translate + 1)
cx = cx_o + np.random.randint(- self.max_translate, self.max_translate + 1)
gt_cx = cx_o - cx
gt_cy = cy_o - cy"
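
One hedged reading of those lines (an interpretation, not the author's documentation): the crop center is jittered by up to max_translate pixels around the image center, and gt_cx/gt_cy record the target's offset relative to the jittered crop center, so they can legitimately be negative. A commented sketch with assumed example values:

    import numpy as np

    # Assumed example values; the real ones come from the loaded instance image and config.
    im_h, im_w, max_translate = 271, 271, 12

    cy_o = (im_h - 1) / 2    # image center, where the target is assumed to sit
    cx_o = (im_w - 1) / 2
    # jitter the crop center by up to max_translate pixels in each direction
    cy = cy_o + np.random.randint(-max_translate, max_translate + 1)
    cx = cx_o + np.random.randint(-max_translate, max_translate + 1)
    # target offset relative to the new (jittered) crop center; negative values
    # just mean the target lies left of / above the crop center
    gt_cx = cx_o - cx
    gt_cy = cy_o - cy
    print(gt_cx, gt_cy)      # each lies in [-max_translate, max_translate]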
