henghuiding / mose-api

[ICCV 2023] MOSE: A New Dataset for Video Object Segmentation in Complex Scenes

Home Page: https://henghuiding.github.io/MOSE/

Python 100.00%
benchmark complex-environment dataset iccv2023 video-object-segmentation video-segmentation

mose-api's Introduction

MOSE: A New Dataset for Video Object Segmentation in Complex Scenes

๐Ÿ [Homepage] โ€ƒ ๐Ÿ“„[Arxiv]

This repository contains information and tools for the MOSE dataset.

Download

[🔥 02.09.2023: Dataset has been released!]

โฌ‡๏ธ Get the dataset from:

📦 Or use gdown:

# train.tar.gz
gdown 'https://drive.google.com/uc?id=ID_removed_to_avoid_overaccesses_get_it_by_yourself'

# valid.tar.gz
gdown 'https://drive.google.com/uc?id=ID_removed_to_avoid_overaccesses_get_it_by_yourself'

# test set will be released when competition starts.

Please also check the SHA256 sums of the files to ensure data integrity (a small verification sketch follows the sums below):

3f805e66ecb576fdd37a1ab2b06b08a428edd71994920443f70d09537918270b train.tar.gz
884baecf7d7e85cd35486e45d6c474dc34352a227ac75c49f6d5e4afb61b331c valid.tar.gz
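
If sha256sum is not available, the sums can also be checked with a short Python snippet. This is a minimal sketch, not part of the official tools; it assumes the two archives are in the current directory.

# verify_checksums.py -- minimal sketch, assumes train.tar.gz / valid.tar.gz are in the current directory
import hashlib

EXPECTED = {
    "train.tar.gz": "3f805e66ecb576fdd37a1ab2b06b08a428edd71994920443f70d09537918270b",
    "valid.tar.gz": "884baecf7d7e85cd35486e45d6c474dc34352a227ac75c49f6d5e4afb61b331c",
}

for name, expected in EXPECTED.items():
    h = hashlib.sha256()
    with open(name, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
            h.update(chunk)
    print(name, "OK" if h.hexdigest() == expected else "MISMATCH")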

Evaluation

[🔥 02.16.2023: Our CodaLab competition is live now!]

Please submit your results on the CodaLab evaluation server.

File Structure

The dataset follows a structure similar to DAVIS and YouTube-VOS. It consists of two parts: JPEGImages, which holds the frame images, and Annotations, which contains the corresponding segmentation masks. Frame images are numbered with five-digit file names. Annotations are saved as palette-mode PNGs, as in DAVIS.

Please note that while annotations are provided for all frames in the training set, annotations for the validation set only include the first frame. A minimal loading sketch follows the directory tree below.

<train/valid.tar>
│
├── Annotations
│   │
│   ├── <video_name_1>
│   │   ├── 00000.png
│   │   ├── 00001.png
│   │   └── ...
│   │
│   ├── <video_name_2>
│   │   ├── 00000.png
│   │   ├── 00001.png
│   │   └── ...
│   │
│   └── <video_name_...>
│
└── JPEGImages
    │
    ├── <video_name_1>
    │   ├── 00000.jpg
    │   ├── 00001.jpg
    │   └── ...
    │
    ├── <video_name_2>
    │   ├── 00000.jpg
    │   ├── 00001.jpg
    │   └── ...
    │
    └── <video_name_...>
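
A minimal Python sketch for walking this structure and reading the palette-mode masks. This is an illustrative snippet, not part of the official API; it assumes the archive has been extracted to a local train/ directory and that NumPy and Pillow are installed.

import os

import numpy as np
from PIL import Image

root = "train"  # assumed path to the extracted train.tar.gz

for video in sorted(os.listdir(os.path.join(root, "JPEGImages"))):
    frame_dir = os.path.join(root, "JPEGImages", video)
    mask_dir = os.path.join(root, "Annotations", video)
    for frame_name in sorted(os.listdir(frame_dir)):
        frame = np.array(Image.open(os.path.join(frame_dir, frame_name)))  # H x W x 3 RGB frame
        mask_path = os.path.join(mask_dir, frame_name.replace(".jpg", ".png"))
        if not os.path.exists(mask_path):
            continue  # e.g. in the valid split only the first frame is annotated
        # Keep the PNG in palette mode: pixel values are object IDs, 0 is background (DAVIS convention)
        mask = np.array(Image.open(mask_path))  # H x W array of object IDs
        object_ids = [int(i) for i in np.unique(mask) if i != 0]
        # ... use frame, mask, and object_ids here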

BibTeX

Please consider citing MOSE if it helps your research.

@inproceedings{MOSE,
  title={{MOSE}: A New Dataset for Video Object Segmentation in Complex Scenes},
  author={Ding, Henghui and Liu, Chang and He, Shuting and Jiang, Xudong and Torr, Philip HS and Bai, Song},
  booktitle={ICCV},
  year={2023}
}

License

MOSE is licensed under a CC BY-NC-SA 4.0 License. The data of MOSE is released for non-commercial research purposes only.

mose-api's People

Contributors

changliu19, henghuiding

mose-api's Issues

Annotation tool

Thank you for your wonderful work! Could you share the annotation tool used to build the dataset? I would appreciate it if you could release the code for the annotation tool.

gdown is no longer working and the Google Drive download is very unstable

Hello!

We appreciate the release of this dataset! It's fantastic.
I would just like to point out that gdown is showing this error:

gdown https://drive.google.com/uc\?id\=10HYO-CJTaITalhzl_Zbz_Qpesh8F3gZR
Access denied with the following error:

Cannot retrieve the public link of the file. You may need to change
the permission to 'Anyone with the link', or have had many accesses.

You may still be able to access the file from the browser:

 https://drive.google.com/uc?id=10HYO-CJTaITalhzl_Zbz_Qpesh8F3gZR
And when downloading directly from Google Drive, the transfer breaks every couple of minutes and eventually fails. Baidu also shows errors when downloading. The most stable option so far is OneDrive.

Thanks.

Unsupervised VOS Setting

Thanks for your answer before!

I have another question about the unsupervised VOS setting, since I want to test some VIS methods on MOSE and need to follow the same protocol you used in the MOSE paper. From my understanding, unsupervised VOS with multiple objects = VIS (Video Instance Segmentation): both segment and track objects of predefined categories without a first-frame reference. Is that right?

Therefore, when you ran the methods listed in Table 5, did you train them on the MOSE training set (only on those videos where the first frame is exhaustively labeled) and then test them on the validation set (again only videos with an exhaustively labeled first frame) without a first-frame reference, which is the same way VIS methods are trained and tested? Is this the case?

Could you please clarify this? Thanks a lot!

Some folders only have one image

Hi,

I have recently realized that some folders only have one image, which is weird for a video dataset. For instance :

MOSE/train/JPEGImages/9eb92f21

Is this expected, or is the problem on my side?

DeAOT training & inference.

Since you uploaded the code for XMem, could you please also provide the train_datasets.py and eval_datasets.py of aot-benchmark for MOSE?
And did you change the training config, or is it the same as for YTB, e.g.
self.DATA_MOSE_REPEAT = 1, self.DATA_RANDOM_GAP_MOSE = 3?

Thanks a lot!

Training setting

Thanks for your work!

I want to ask about the training setting. Your paper said "We replace the training dataset of previous methods from YouTubeVOS with our MOSE and strictly follow their training settings on YouTube-VOS [3]."

Most previous works train on YouTube-VOS and DAVIS in the main training stage after image pre-training.
Did you remove the DAVIS dataset? If so, is it because you found that removing DAVIS works better?

Thanks a lot for your answer!

About the use of the dataset

Thanks for the data. This is my first time using a DAVIS-style dataset. I want to turn it into a human-segmentation dataset; is there any way I can extract all the humans from the segmentation annotations?

Question about the experimental result of STCN on table 3

I trained STCN on MOSE, but your paper had a different result.

What I've done so far:

  1. downloaded a pre-trained STCN model (the static-image pre-trained version)
  2. trained on MOSE only, with the same settings as stage 3 of STCN
  3. ran inference on the MOSE valid set using STCN's eval_generic.py
  4. uploaded the results to the MOSE CodaLab

and the score I got on MOSE codalab was 0.2601784555.

Do you have any idea why this discrepancy appears?

Why videos with only one frame?

Hi, why are there "videos" in the dataset with only one frame?
For example 330ac20d, 9eb92f21, a4287634, ce1ea47c.

I'm just curious whether there's a reason; otherwise, thanks for this dataset.

Possible Out-Of-Date SHA256sum for train.tar.gz

First of all, thank you for this amazing work! I would kindly ask you for confirmation, since I am facing an issue when checking data integrity with sha256sum.
I downloaded the .tar.gz training file from OneDrive, as it is the suggested source.
The sums reported in the corresponding file in the OneDrive folder match the ones reported on GitHub, but after downloading the train archive multiple times I always get the same value for that file, and it does not match the one reported in the OneDrive file or on GitHub.
I also tried downloading it from multiple PCs (Desktop + Windows + Chrome, Laptop + Arch Linux + Firefox), and I always obtain the same sha256sum, which differs from the one reported.
Am I doing something wrong, or is the sum value for train.tar.gz actually out of date?

Unsupervised VOS Evaluation

Hi, thanks for the great dataset!

I am interested in the unsupervised VOS part. Although the metafiles for the training and validation sets include the first_frame_exhaustive_anno field to denote whether the first frame is exhaustively annotated, the evaluation server on CodaLab does not seem to include specific results for the unsupervised VOS setting.

If that's the case, is there any other way to evaluate the unsupervised VOS setting so that we can compare with the Table 5 results in the MOSE paper? Thank you!

Unsupervised VOS

Hello, thank you very much for your work. Is there any dataset for unsupervised VOS?
