henghuiding / mose-api
[ICCV 2023] MOSE: A New Dataset for Video Object Segmentation in Complex Scenes
Home Page: https://henghuiding.github.io/MOSE/
Thank you for your wonderful work! Could you share the annotation tool used to build the dataset? I would appreciate it if you could release the code for the annotation tool.
Thanks for your answer before!
I have another question about the unsupervised VOS setting, since I want to test some VIS methods on MOSE and need to follow the same protocol you used in the MOSE paper. From my understanding, unsupervised VOS with multiple objects = VIS (Video Instance Segmentation): both segment and track objects of predefined categories without a reference mask for the first frame. Is that right?
Therefore, when you ran the methods listed in Table 5, did you train them on the MOSE training set (only on those videos where the first frame is exhaustively labeled) and then test them on the validation set (only those videos with an exhaustively labeled first frame) without a reference mask for the first frame, which is the same way VIS methods are trained and tested? Is this the case?
Could you please clarify this? Thanks a lot!
Hi, thanks for the great dataset!
I am interested in the unsupervised VOS part. Although the metafiles for the training and validation sets include the first_frame_exhaustive_anno field to denote whether the first frame is exhaustively annotated, the evaluation server on CodaLab does not seem to report results for the unsupervised VOS setting specifically.
If that's the case, is there any other way to evaluate the unsupervised VOS setting, so that we can compare with the Table 5 results in the MOSE paper? Thank you!
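For what it's worth, the way I currently select those videos is the rough sketch below. I'm assuming the meta file is a JSON with a top-level "videos" dict and a per-video first_frame_exhaustive_anno flag; please correct me if the layout differs:

```python
import json

# Sketch only: assumes meta.json has the form
# {"videos": {video_id: {..., "first_frame_exhaustive_anno": <bool or "true"/"false">}}}.
with open("train/meta.json") as f:
    meta = json.load(f)

# Keep only videos whose first frame is flagged as exhaustively annotated.
unsup_videos = [
    vid for vid, info in meta["videos"].items()
    if str(info.get("first_frame_exhaustive_anno")).lower() == "true"
]
print(len(unsup_videos), "videos with an exhaustively annotated first frame")
```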
Hi,
I have recently noticed that some folders contain only one image, which is odd for a video dataset. For instance:
MOSE/train/JPEGImages/9eb92f21
Is this intended, or is the problem on my side?
Hello, thank you very much for your work. Is there any dataset for Unsupervised VOS?
Hello!
We appreciate the release of this dataset! It's fantastic.
I would just like to point out that gdown shows this error:
```
gdown https://drive.google.com/uc\?id\=10HYO-CJTaITalhzl_Zbz_Qpesh8F3gZR

Access denied with the following error:

    Cannot retrieve the public link of the file. You may need to change
    the permission to 'Anyone with the link', or have had many accesses.

You may still be able to access the file from the browser:

    https://drive.google.com/uc?id=10HYO-CJTaITalhzl_Zbz_Qpesh8F3gZR
```
Also, when downloading directly from Google Drive, the connection breaks every couple of minutes and the download eventually fails. Baidu shows download errors as well. The most stable option so far is OneDrive.
Thanks.
Hi, why are there "videos" in the dataset with only one frame?
For example: 330ac20d, 9eb92f21, a4287634, ce1ea47c.
I'm just curious whether there's a reason; in any case, thanks for this dataset.
Thanks for your work!
I want to ask about the training setting. Your paper said "We replace the training dataset of previous methods from YouTubeVOS with our MOSE and strictly follow their training settings on YouTube-VOS [3]."
Most previous works train on both YouTube-VOS and DAVIS in the main training stage after image pre-training.
Did you remove the DAVIS dataset? If so, is it because you found that removing DAVIS worked better?
Thanks a lot for your answer!
Since you uploaded the code for XMem, could you please also provide the train_datasets.py and eval_datasets.py of aot-benchmark for MOSE?
And did you change the training config, or keep it the same as for YTB, e.g.
self.DATA_MOSE_REPEAT = 1, self.DATA_RANDOM_GAP_MOSE = 3?
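To be concrete, what I have in mind is something like the sketch below, simply mirroring the existing YTB fields in an aot-benchmark style config; the MOSE field names, the dataset key, and the directory layout are my own guesses, not taken from your code:

```python
# Guesswork only: a MOSE entry built by analogy with the YTB settings
# in aot-benchmark's config style; names below are assumptions.
from configs.default import DefaultEngineConfig  # aot-benchmark base config, if I read the repo layout correctly


class EngineConfig(DefaultEngineConfig):
    def __init__(self, exp_name='default', model='AOTT'):
        super().__init__(exp_name, model)
        self.DATASETS = ['mose']              # train on MOSE instead of YouTube-VOS
        self.DIR_MOSE = './datasets/MOSE'     # assumed to contain JPEGImages/ and Annotations/
        self.DATA_MOSE_REPEAT = 1             # how many times MOSE is repeated per epoch
        self.DATA_RANDOM_GAP_MOSE = 3         # max random frame gap when sampling training clips
```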
Thanks a lot!
Are you using morphological ops like in the official DAVIS evaluation toolkit?
Because that implementation depends on the frame size, it can be very generous with objects that are not too big:
https://github.com/davisvideochallenge/davis2017-evaluation/blob/master/davis2017/metrics.py#L77
If you are using the same approach for the MOSE evaluation, you could at least report F at different threshold levels.
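For reference, my reading of the linked code is that the boundary-matching tolerance is taken as a fraction of the image diagonal, so it grows with the frame size. A rough paraphrase (not the exact implementation) of how that tolerance is computed:

```python
import numpy as np

# Paraphrase of how the DAVIS toolkit picks the boundary tolerance:
# with the default bound_th=0.008, the dilation radius is a fraction of the
# image diagonal, so larger frames get a larger tolerance in absolute pixels.
def boundary_tolerance_pixels(mask_shape, bound_th=0.008):
    if bound_th >= 1:
        return int(bound_th)  # interpreted as an absolute pixel radius
    return int(np.ceil(bound_th * np.linalg.norm(mask_shape)))

print(boundary_tolerance_pixels((480, 854)))    # ~8 px on a 480p frame
print(boundary_tolerance_pixels((1080, 1920)))  # ~18 px on a 1080p frame
```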
First of all, thank you for this amazing work! I would kindly ask you for a confirmation, since I am facing an issue checking data integrity with sha256sum.
I've downloaded the .tar.gz training file from OneDrive, as it is the suggested source.
The sums reported in the corresponding file in the OneDrive folder match the ones reported on GitHub, but after downloading the train archive multiple times I get a different value (always the same one) for that file, which does not match the one reported in the OneDrive file and on GitHub.
I've also tried downloading it from multiple PCs (Desktop + Windows + Chrome, Laptop + Arch Linux + Firefox), and I always obtain the same sha256sum, which differs from the reported one.
Am I doing something wrong, or is the checksum for train.tar.gz actually out of date?
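For completeness, the check I'm running is equivalent to the small snippet below; it just streams the archive and prints its SHA-256 (the filename is the one from the OneDrive folder):

```python
import hashlib

# Compute the SHA-256 of the downloaded archive in chunks,
# equivalent to running `sha256sum train.tar.gz` in a shell.
def sha256_of(path, chunk_size=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print(sha256_of("train.tar.gz"))
```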
Please feel free to raise any download issues here.
I trained STCN on MOSE, but got a different result from the one reported in your paper.
What I've done so far:
and the score I got on the MOSE CodaLab server was 0.2601784555.
Do you have any idea why this discrepancy appeared?
Thanks for the data. This is my first time using a dataset with a DAVIS-like structure. I want to turn it into a human-only segmentation dataset; is there any way I can extract all the humans from the segmentation annotations?
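In case it helps to clarify what I mean, below is a minimal sketch of pulling a single object ID out of a DAVIS/MOSE-style palette annotation. As far as I can tell the annotation PNGs store per-video object IDs rather than categories, so deciding which IDs correspond to humans would still need extra information; the path and the ID below are just placeholders.

```python
import numpy as np
from PIL import Image

# Hypothetical example: extract a binary mask for one object ID from a
# DAVIS/MOSE-style indexed-palette annotation PNG. The PNG stores object IDs,
# not categories, so which IDs are humans must be determined separately.
ann = Image.open("train/Annotations/<video_id>/00000.png")  # "P"-mode palette image
ids = np.array(ann)                 # per-pixel object IDs, 0 = background

object_id = 1                       # example ID, assumed here to be a person
mask = (ids == object_id).astype(np.uint8) * 255
Image.fromarray(mask).save("person_object_1.png")
```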