aofrancani / tsformer-vo

57.0 2.0 8.0 306 KB

Implementation of the paper "Transformer-based model for monocular visual odometry: a video understanding approach".

Home Page: https://arxiv.org/abs/2305.06121

License: MIT License

Python 100.00%

deep-learning monocular-visual-odometry transformer-models visual-odometry visual-slam

tsformer-vo's Issues

7DoF and 6DoF

How can I visualize the 7DoF and 6DoF trajectories separately? I would appreciate it if you could point me to the relevant part of the code.

An error occurs in pretrained_ViT: True

Thank you for sharing your great work.

When setting pretrained_ViT: True in args of train.py, the following error occurs. I confirmed that the ViT model was successfully downloaded. Can you tell me how to solve it?

Building model...
--- loading pretrained to start training ---
https://dl.fbaipublicfiles.com/deit/deit_small_patch16_224-cd65a155.pth

Downloading: "https://dl.fbaipublicfiles.com/deit/deit_small_patch16_224-cd65a155.pth" to /home/dmsai3/.cache/torch/hub/checkpoints/deit_small_patch16_224-cd65a155.pth
Traceback (most recent call last):
  File "train.py", line 239, in <module>
    model, args = build_model(args, model_params)
  File "/home/dmsai3/TSformer-VO/build_model.py", line 97, in build_model
    load_pretrained(model, num_classes=model_params["num_classes"],
  File "/home/dmsai3/TSformer-VO/timesformer/models/helpers.py", line 161, in load_pretrained
    elif num_classes != state_dict[classifier_name + '.weight'].size(0):
KeyError: 'head.weight'
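
The KeyError means that 'head.weight' is not a top-level key of the loaded state_dict. One possible cause (an assumption, not confirmed in this thread) is that the downloaded checkpoint stores its weights one level down, for example under a "model" key. The sketch below shows a way to unwrap such a checkpoint before the classifier head is inspected. The helper names and the list of wrapper keys are hypothetical and are not part of the repository.

```python
def unwrap_state_dict(checkpoint, wrapper_keys=("model", "state_dict")):
    """Return the flat parameter dict, unwrapping one level of nesting if present.

    `wrapper_keys` is a guess at common wrapper names, not a documented list.
    """
    for key in wrapper_keys:
        nested = checkpoint.get(key)
        if isinstance(nested, dict):
            return nested
    return checkpoint


def head_out_features(state_dict, classifier_name="head"):
    """Return the number of output rows of the classifier weight, or None if absent."""
    weight = state_dict.get(classifier_name + ".weight")
    return None if weight is None else len(weight)
```

Printing list(state_dict.keys())[:5] right after the checkpoint is loaded shows whether the weights are nested; if they are, passing the unwrapped dict on to the rest of the loading code should avoid the missing key.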

Dataset indexing issues

Hello, thank you very much for providing the code. I encountered the error 'ValueError: Length of values (14860) does not match length of index (14864)' while running train.py and have not been able to resolve it. How can I fix this? @aofrancani
The full error message is as follows:
python train.py
Using CUDA: True
Loading data...
Traceback (most recent call last):
  File "train.py", line 223, in <module>
    dataset = KITTI(window_size=args["window_size"], overlap=args["overlap"], transform=preprocess)
  File "/home/sy/TSformer-VO-main/datasets/kitti.py", line 59, in __init__
    data["frames"] = frames
  File "/home/sy/anaconda3/envs/tsformer-vo/lib/python3.8/site-packages/pandas/core/frame.py", line 3044, in __setitem__
    self._set_item(key, value)
  File "/home/sy/anaconda3/envs/tsformer-vo/lib/python3.8/site-packages/pandas/core/frame.py", line 3120, in _set_item
    value = self._sanitize_column(key, value)
  File "/home/sy/anaconda3/envs/tsformer-vo/lib/python3.8/site-packages/pandas/core/frame.py", line 3768, in _sanitize_column
    value = sanitize_index(value, self.index)
  File "/home/sy/anaconda3/envs/tsformer-vo/lib/python3.8/site-packages/pandas/core/internals/construction.py", line 747, in sanitize_index
    raise ValueError(
ValueError: Length of values (14860) does not match length of index (14864)
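
The ValueError says that the list assigned to data["frames"] has 14860 entries while the DataFrame has 14864 rows, so the image windows and the ground-truth rows are out of step by 4. A likely thing to check (an assumption, since the cause is not stated in this thread) is whether every sequence has as many image files as pose lines. The sketch below compares the two counts per sequence. The directory layout (sequences/<seq>/image_2/*.png and poses/<seq>.txt under data_dir) and the function name are hypothetical and may need adjusting to the local setup.

```python
from pathlib import Path


def find_length_mismatches(data_dir, image_subdir="image_2"):
    """Return {sequence: (n_images, n_poses)} for sequences whose counts differ.

    Assumed layout (hypothetical):
      <data_dir>/sequences/<seq>/<image_subdir>/*.png
      <data_dir>/poses/<seq>.txt   (one pose per non-empty line)
    """
    data_dir = Path(data_dir)
    mismatches = {}
    for pose_file in sorted((data_dir / "poses").glob("*.txt")):
        seq = pose_file.stem
        with pose_file.open() as f:
            n_poses = sum(1 for line in f if line.strip())
        image_dir = data_dir / "sequences" / seq / image_subdir
        n_images = len(list(image_dir.glob("*.png")))
        if n_images != n_poses:
            mismatches[seq] = (n_images, n_poses)
    return mismatches
```

Calling find_length_mismatches("data") and printing the result lists the sequences where frames are missing or extra; an empty dict would suggest the mismatch comes from how the windows are built rather than from the files on disk.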

How to obtain the values used in the undo-normalization step in plot_results.py

Hi, thanks for the great work. I saw that in plot_results.py you set

        mean_angles = np.array([1.7061e-5, 9.5582e-4, -5.5258e-5])
        std_angles = np.array([2.8256e-3, 1.7771e-2, 3.2326e-3])
        mean_t = np.array([-8.6736e-5, -1.6038e-2, 9.0033e-1])
        std_t = np.array([2.5584e-2, 1.8545e-2, 3.0352e-1])

How can I obtain these values, and is this step necessary in the evaluation stage?
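
As a rough illustration of what such a step does: if the ground-truth poses were standardized with per-dimension statistics during training (an assumption about the training pipeline), the network outputs are in normalized units and have to be mapped back with the same mean and standard deviation before trajectories are plotted or metrics are computed. The sketch below shows how statistics of this kind could be computed from an (N, 3) array of training targets and how normalization is undone. The function names are hypothetical; the constants are the translation values quoted above.

```python
import numpy as np

# Translation statistics quoted from plot_results.py in the question above.
mean_t = np.array([-8.6736e-5, -1.6038e-2, 9.0033e-1])
std_t = np.array([2.5584e-2, 1.8545e-2, 3.0352e-1])


def fit_stats(values):
    """Per-column mean and (population) standard deviation of an (N, 3) array."""
    values = np.asarray(values, dtype=float)
    return values.mean(axis=0), values.std(axis=0)


def normalize(values, mean, std):
    """Standardize targets: zero mean and unit variance per column."""
    return (np.asarray(values, dtype=float) - mean) / std


def denormalize(values, mean, std):
    """Undo normalize(): map normalized values back to the original units."""
    return np.asarray(values, dtype=float) * std + mean
```

With this convention, denormalize(pred_t, mean_t, std_t) recovers translations in the original units, and the same pattern applies to the angles with mean_angles and std_angles.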

Sequence 06

Why are the results on sequence 06 so poor? Are there any ways to improve them?

window_size

Hello, thank you very much for your many replies, and I apologize for bothering you again. I have a question: should window_size be the same in the three places shown in the picture (in train.py and kitti.py)? My understanding is that a value of 2 corresponds to VO1, 3 to VO2, and 4 to VO3. Is that correct?
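
For reference, the settings posted elsewhere on this page tie the model to window_size through "num_frames": args["window_size"] and "num_classes": 6 * (args["window_size"] - 1), which is one reason the value needs to agree between the dataset and the model. The sketch below spells out that relation and shows one way windows could be laid out for a given overlap. The sliding-window scheme (stride = window_size - overlap) is an assumption about kitti.py, and both function names are hypothetical.

```python
def model_dims(window_size):
    """Model sizes implied by window_size, following the posted model_params."""
    return {
        "num_frames": window_size,
        "num_classes": 6 * (window_size - 1),  # 6 DoF per consecutive frame pair
    }


def window_starts(n_frames, window_size, overlap):
    """Start indices of sliding windows, assuming stride = window_size - overlap."""
    stride = window_size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than window_size")
    return list(range(0, n_frames - window_size + 1, stride))
```

For example, model_dims(3) gives 12 outputs (two relative poses of 6 DoF each), so a dataset built with window_size 3 would not match a model configured for window_size 2.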

To achieve results more closely aligned with the paper

Hi. A few days ago, I encountered an error while attempting to run the pretrained_ViT model. I managed to resolve it through another issue. Actually, the reason I attempted to run the pretrained_ViT model was because the results of the non-pretrained model were inconsistent with the results in the paper provided in this GitHub repository. Therefore, after resolving the pretrained issue, I trained the model with pretrained_ViT set to True, and obtained results for sequences 01, 03, 04, 05, 06, 07, and 10 as follows:

Here are the settings in train.py:

args = {
    "data_dir": "data",
    "bsize": 4,  # batch size
    "val_split": 0.1,  # percentage to use as validation data
    "window_size": 2,  # number of frames in window
    "overlap": 1,  # number of frames overlapped between windows
    "optimizer": "Adam",  # optimizer [Adam, SGD, Adagrad, RAdam]
    "lr": 1e-5,  # learning rate
    "momentum": 0.9,  # SGD momentum
    "weight_decay": 1e-4,  # weight decay
    "epoch": 100,  # train iters each timestep
    "weighted_loss": None,  # float to weight angles in loss function
    "pretrained_ViT": True,  # load weights from pre-trained ViT
    "checkpoint_path": "checkpoints/Exp_vit_base_2",  # path to save checkpoint
    "checkpoint": None,  # checkpoint
}

# tiny  - patch_size=16, embed_dim=192, depth=12, num_heads=3
# small - patch_size=16, embed_dim=384, depth=12, num_heads=6
# base  - patch_size=16, embed_dim=768, depth=12, num_heads=12
model_params = {
    "dim": 768,
    "image_size": (192, 640),  #(192, 640),
    "patch_size": 16,
    "attention_type": 'divided_space_time',  # ['divided_space_time', 'space_only','joint_space_time', 'time_only']
    "num_frames": args["window_size"],
    "num_classes": 6 * (args["window_size"] - 1),  # 6 DoF for each frame
    "depth": 12,
    "heads": 12,
    "dim_head": 64,
    "attn_dropout": 0.1,
    "ff_dropout": 0.1,
    "time_only": False,
}

The results are similar to the pretrained models provided on GitHub, namely Model1, Model2, and Model3.

It seems like I might have made a mistake somewhere. Could you kindly advise on what I should correct?

"data_dir": "data",

In args, is data_dir supposed to point to the training folder, or just to the data folder containing the 11 KITTI sequences?

I look forward to your answer, cheers!
