henghuiding / mose-api
[ICCV 2023] MOSE: A New Dataset for Video Object Segmentation in Complex Scenes
Home Page: https://henghuiding.github.io/MOSE/
Thank you for your wonderful work! Could you share the annotation tool used to build the dataset? I would appreciate it if you could release the code for the annotation tool.
Thanks for your answer before!
I have another question about the unsupervised VOS setting, since I want to test some VIS methods on MOSE and need to follow the same protocol you used in the MOSE paper. From my understanding, unsupervised VOS with multiple objects = VIS (Video Instance Segmentation): both segment and track objects of predefined categories without a reference mask for the first frame. Is that right?
Therefore, when you ran the methods listed in Table 5, did you train them on the MOSE training set (only on those videos where the first frame is exhaustively labeled) and then test them on the validation set (only those videos with an exhaustively labeled first frame) without a reference mask for the first frame, which is the same way VIS methods are trained and tested? Is this the case?
Could you please clarify this? Thanks a lot!
Hi, thanks for the great dataset!
I am interested in the unsupervised VOS part. Although the metafiles for the training and validation sets include the first_frame_exhaustive_anno field to denote whether the first frame is exhaustively annotated, the evaluation server on CodaLab does not seem to report results for the unsupervised VOS setting specifically.
If that's the case, is there any other way to evaluate the unsupervised VOS setting, so that we can compare with the Table 5 results in the MOSE paper? Thank you!
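For what it's worth, the way I currently select those videos is the rough sketch below. I'm assuming the meta file is a JSON with a top-level "videos" dict and a per-video first_frame_exhaustive_anno flag; please correct me if the layout differs:

```python
import json

# Sketch only: assumes meta.json has the form
# {"videos": {video_id: {..., "first_frame_exhaustive_anno": <bool or "true"/"false">}}}.
with open("train/meta.json") as f:
    meta = json.load(f)

# Keep only videos whose first frame is flagged as exhaustively annotated.
unsup_videos = [
    vid for vid, info in meta["videos"].items()
    if str(info.get("first_frame_exhaustive_anno")).lower() == "true"
]
print(len(unsup_videos), "videos with an exhaustively annotated first frame")
```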
Hi,
I have recently noticed that some folders contain only one image, which is odd for a video dataset. For instance:
MOSE/train/JPEGImages/9eb92f21
Is this intended, or is the problem on my side?
Hello, thank you very much for your work. Is there any dataset for Unsupervised VOS?
Hello!
We appreciate the release of this dataset! It's fantastic.
I would just like to point out that gdown shows this error:
```
gdown https://drive.google.com/uc\?id\=10HYO-CJTaITalhzl_Zbz_Qpesh8F3gZR

Access denied with the following error:

    Cannot retrieve the public link of the file. You may need to change
    the permission to 'Anyone with the link', or have had many accesses.

You may still be able to access the file from the browser:

    https://drive.google.com/uc?id=10HYO-CJTaITalhzl_Zbz_Qpesh8F3gZR
```
Also, when downloading directly from Google Drive, the connection breaks every couple of minutes and the download eventually fails. Baidu shows download errors as well. The most stable option so far is OneDrive.
Thanks.
Hi, why are there "videos" in the dataset with only one frame?
For example: 330ac20d, 9eb92f21, a4287634, ce1ea47c.
I'm just curious whether there's a reason; in any case, thanks for this dataset.
Thanks for your work!
I want to ask about the training setting. Your paper said "We replace the training dataset of previous methods from YouTubeVOS with our MOSE and strictly follow their training settings on YouTube-VOS [3]."
Most previous works train on both YouTube-VOS and DAVIS in the main training stage after image pre-training.
Did you remove the DAVIS dataset? If so, is it because you found that removing DAVIS worked better?
Thanks a lot for your answer!
Since you uploaded the code for XMem, could you please also provide the train_datasets.py and eval_datasets.py of aot-benchmark for MOSE?
And did you change the training config, or keep it the same as for YTB, e.g.
self.DATA_MOSE_REPEAT = 1, self.DATA_RANDOM_GAP_MOSE = 3?
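To be concrete, what I have in mind is something like the sketch below, simply mirroring the existing YTB fields in an aot-benchmark style config; the MOSE field names, the dataset key, and the directory layout are my own guesses, not taken from your code:

```python
# Guesswork only: a MOSE entry built by analogy with the YTB settings
# in aot-benchmark's config style; names below are assumptions.
from configs.default import DefaultEngineConfig  # aot-benchmark base config, if I read the repo layout correctly


class EngineConfig(DefaultEngineConfig):
    def __init__(self, exp_name='default', model='AOTT'):
        super().__init__(exp_name, model)
        self.DATASETS = ['mose']              # train on MOSE instead of YouTube-VOS
        self.DIR_MOSE = './datasets/MOSE'     # assumed to contain JPEGImages/ and Annotations/
        self.DATA_MOSE_REPEAT = 1             # how many times MOSE is repeated per epoch
        self.DATA_RANDOM_GAP_MOSE = 3         # max random frame gap when sampling training clips
```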
Thanks a lot!
Are you using morphological ops like in the official DAVIS evaluation toolkit?
Because that implementation depends on the frame size, it can be very generous with objects that are not too big:
https://github.com/davisvideochallenge/davis2017-evaluation/blob/master/davis2017/metrics.py#L77
If you are using the same approach for the MOSE evaluation, you could at least report F at different threshold levels.
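For reference, my reading of the linked code is that the boundary-matching tolerance is taken as a fraction of the image diagonal, so it grows with the frame size. A rough paraphrase (not the exact implementation) of how that tolerance is computed:

```python
import numpy as np

# Paraphrase of how the DAVIS toolkit picks the boundary tolerance:
# with the default bound_th=0.008, the dilation radius is a fraction of the
# image diagonal, so larger frames get a larger tolerance in absolute pixels.
def boundary_tolerance_pixels(mask_shape, bound_th=0.008):
    if bound_th >= 1:
        return int(bound_th)  # interpreted as an absolute pixel radius
    return int(np.ceil(bound_th * np.linalg.norm(mask_shape)))

print(boundary_tolerance_pixels((480, 854)))    # ~8 px on a 480p frame
print(boundary_tolerance_pixels((1080, 1920)))  # ~18 px on a 1080p frame
```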
First of all, thank you for this amazing work! I would kindly ask you for a confirmation, since I am facing an issue checking data integrity with sha256sum.
I've downloaded the .tar.gz training file from OneDrive, as it is the suggested source.
The sums reported in the corresponding file in the OneDrive folder match the ones reported on GitHub, but after downloading the train archive multiple times I get a different value (always the same one) for that file, which does not match the one reported in the OneDrive file and on GitHub.
I've also tried downloading it from multiple PCs (Desktop + Windows + Chrome, Laptop + Arch Linux + Firefox), and I always obtain the same sha256sum, which differs from the reported one.
Am I doing something wrong, or is the checksum for train.tar.gz actually out of date?
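For completeness, the check I'm running is equivalent to the small snippet below; it just streams the archive and prints its SHA-256 (the filename is the one from the OneDrive folder):

```python
import hashlib

# Compute the SHA-256 of the downloaded archive in chunks,
# equivalent to running `sha256sum train.tar.gz` in a shell.
def sha256_of(path, chunk_size=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print(sha256_of("train.tar.gz"))
```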
Please feel free to raise any download issues here.
I trained STCN on MOSE, but got a different result from the one reported in your paper.
What I've done so far:
and the score I got on the MOSE CodaLab server was 0.2601784555.
Do you have any idea why this discrepancy appeared?
Thanks for the data. This is my first time using a dataset with a DAVIS-like structure. I want to turn it into a human-only segmentation dataset; is there any way I can extract all the humans from the segmentation annotations?
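In case it helps to clarify what I mean, below is a minimal sketch of pulling a single object ID out of a DAVIS/MOSE-style palette annotation. As far as I can tell the annotation PNGs store per-video object IDs rather than categories, so deciding which IDs correspond to humans would still need extra information; the path and the ID below are just placeholders.

```python
import numpy as np
from PIL import Image

# Hypothetical example: extract a binary mask for one object ID from a
# DAVIS/MOSE-style indexed-palette annotation PNG. The PNG stores object IDs,
# not categories, so which IDs are humans must be determined separately.
ann = Image.open("train/Annotations/<video_id>/00000.png")  # "P"-mode palette image
ids = np.array(ann)                 # per-pixel object IDs, 0 = background

object_id = 1                       # example ID, assumed here to be a person
mask = (ids == object_id).astype(np.uint8) * 255
Image.fromarray(mask).save("person_object_1.png")
```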