bdd100k / bdd100k
Toolkit of BDD100K Dataset for Heterogeneous Multitask Learning - CVPR 2020 Oral Paper
Home Page: https://www.bdd100k.com
License: BSD 3-Clause "New" or "Revised" License
Hi,
I can see that the images in bdd100k are keyframes of the videos, taken every 10 seconds. I would like to know whether it is possible to get the car speed. I saw that the "info" download contains the IMU and GPS data (the GPS/IMU information recorded along with the videos). Is it possible to get that information per image as well? The video's json file and the image's json file have the same name, but the "info" json file covers many frames of the video, while the image json file always has a timestamp of 10000. Which entry in the "info" json file corresponds to the bdd100k image? Is there any way to obtain this information from the dataset?
Thanks
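One way to approach the per-image speed question, sketched under assumptions: if the "info" JSON of a video holds a list of GPS samples, each with a millisecond timestamp and a speed, and the keyframe sits at a known offset from the start of the recording (the 10000 timestamp mentioned above), then the speed for the image can be read from the sample closest in time. The field names `timestamp` and `speed`, the `start_ms` argument, and the function name are hypothetical and should be checked against the actual file.

```python
from typing import Dict, List, Optional


def nearest_sample(
    samples: List[Dict[str, float]],
    start_ms: float,
    offset_ms: float = 10000.0,
) -> Optional[Dict[str, float]]:
    """Return the sample whose timestamp is closest to start_ms + offset_ms.

    Assumes each sample is a dict with a "timestamp" key in milliseconds
    (hypothetical field name; adapt it to the real "info" JSON layout).
    """
    if not samples:
        return None
    target = start_ms + offset_ms
    return min(samples, key=lambda s: abs(s["timestamp"] - target))


# Toy data standing in for the GPS entries of one video.
toy_samples = [
    {"timestamp": 1000.0, "speed": 3.0},
    {"timestamp": 10500.0, "speed": 7.5},
    {"timestamp": 20000.0, "speed": 9.0},
]
closest = nearest_sample(toy_samples, start_ms=0.0)
print(closest["speed"])
```

If the samples are sparse around the keyframe, interpolating between the two neighbouring samples may be a better choice than taking the nearest one.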
Is there any overlap between the track ID of different video sequence annotation files?
After running:
python3 -m bdd100k.label.to_coco -l /redact/datasets/bdd100k/labels/det_20/det_train.json -o /redact/datasets/bdd100k/labels/det_20_coco/det_train.json --remove-ignore
from the cloned git directory, I receive the following traceback:
Traceback (most recent call last):
  File "/redact/anaconda3/envs/bdd100k/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/redact/anaconda3/envs/bdd100k/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/redact/workspace/pycharm_projects/bdd100k/bdd100k/label/to_coco.py", line 12, in <module>
    from scalabel.label.coco_typing import AnnType, GtType, ImgType, VidType
  File "/redact/anaconda3/envs/bdd100k/lib/python3.7/site-packages/scalabel/__init__.py", line 3, in <module>
    from . import bot, label, tools
  File "/redact/anaconda3/envs/bdd100k/lib/python3.7/site-packages/scalabel/label/__init__.py", line 3, in <module>
    from . import coco_typing, from_coco, io, to_coco, typing
  File "/redact/anaconda3/envs/bdd100k/lib/python3.7/site-packages/scalabel/label/to_coco.py", line 11, in <module>
    from pycocotools import mask as mask_utils  # type: ignore
  File "/redact/anaconda3/envs/bdd100k/lib/python3.7/site-packages/pycocotools/mask.py", line 3, in <module>
    import pycocotools._mask as _mask
  File "pycocotools/_mask.pyx", line 1, in init pycocotools._mask
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
It appears that this is due to an older version of numpy being installed; I will test this on other labels and see if it's specific to this label set.
I cloned the bdd100k repo and ran pip install -r requirement.txt. Next, I wanted to convert the BDD dataset annotations to COCO-style annotations, so I ran
python -m bdd100k.label.to_coco -m det -l bdd100k/labels/det_20/det_train.json -o bdd/labels/det_20/det_train_cocofmt.json
There are some ImportErrors like:
/bdd100k/bdd100k/label/to_coco.py", line 36, in <module> from scalabel.label.to_coco import ( ImportError: cannot import name 'load_coco_config' from 'scalabel.label.to_coco'
/bdd100k/bdd100k/label/to_coco.py", line 36, in <module> from scalabel.label.to_coco import ( ImportError: cannot import name 'process_category' from 'scalabel.label.to_coco'
This is due to PR #304 of the scalabel repo: the scripts in the bdd100k repo have not been updated to the new API.
I wanted to generate label masks from the json files for semantic segmentation. I tried running the command below following the documentation to get instance segmentation masks but all the output images have only one value for every pixel. What is the right process to get semantic segmentation ground truths from the JSON file? Also, do I need to convert instance to semantic segmentation or is there code to directly generate semantic masks?
python3 -m bdd100k.vis.labels --image-dir bdd100k/images/10k/val -l bdd100k/labels/bdd100k_labels_images_val.json -s 1 --instance -o bdd100k/mask/val/
I have downloaded the dataset, but I can't find bdd100k_labels_images_train.json.
@XiaLiPKU After installing all the dependencies, I get: ImportError: attempted relative import with no known parent package.
Could you please help me to resolve this issue?
Command used:
python to_coco.py -i /home/workspace/data/bdd100k_dataset/bdd100k/json/det_v2_val_release.json -o /home/workspace/data/bdd100k_dataset/det_v2_val_release_coco.json --remove-ignore
OS: Ubuntu 18.04
Python 3.7.5
Hi, I am trying to run the evaluation script for detection alone, to evaluate my model output. But I am getting the following error:
  File "bdd100k/bdd100k/eval/evaluate.py", line 133, in group_by_key
    groups[d[key]].append(d)
KeyError: 'category'
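The KeyError suggests that at least one entry passed to group_by_key has no "category" field. A minimal sketch of a pre-check that groups entries the same way but records which ones lack the key, so the offending predictions can be inspected before running the evaluation. The helper name `group_by_key_checked` and the toy entries are hypothetical; this is not part of the toolkit.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def group_by_key_checked(
    entries: List[Dict[str, Any]], key: str
) -> Tuple[Dict[Any, List[Dict[str, Any]]], List[int]]:
    """Group dicts by `key` and collect the indices of entries lacking it."""
    groups: Dict[Any, List[Dict[str, Any]]] = defaultdict(list)
    missing: List[int] = []
    for index, entry in enumerate(entries):
        if key in entry:
            groups[entry[key]].append(entry)
        else:
            missing.append(index)
    return dict(groups), missing


predictions = [
    {"name": "a.jpg", "category": "car", "score": 0.9},
    {"name": "b.jpg", "score": 0.4},  # no "category": would raise KeyError
]
groups, missing = group_by_key_checked(predictions, "category")
print(sorted(groups), missing)
```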
Why are there more than 3,000 pictures in 10k that are not in 100k? How do I count the attributes of the 10k data, such as weather and timeofday? Because my segmentation models trained on 10k perform poorly on night scenes, I want to collect this information for all the pictures.
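A minimal sketch of such a count, assuming the 100k label JSON is a list of frames that each carry a "name" and an "attributes" dict with keys such as "weather" and "timeofday". The function name and the toy data are hypothetical. Images of the 10k set that have no entry in the 100k labels cannot be counted this way and are reported separately.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Set


def count_attribute(
    frames: Iterable[Dict[str, Any]],
    attribute: str,
    keep_names: Set[str],
) -> Counter:
    """Count values of one attribute over frames whose name is in keep_names.

    Frames that lack the attribute are counted under "missing".
    """
    counts: Counter = Counter()
    for frame in frames:
        if frame["name"] not in keep_names:
            continue
        attributes = frame.get("attributes") or {}
        counts[attributes.get(attribute, "missing")] += 1
    return counts


# Toy stand-ins: frames would come from json.load() on the label file,
# names_10k from listing the 10k image folder.
label_frames = [
    {"name": "a.jpg", "attributes": {"weather": "snowy", "timeofday": "night"}},
    {"name": "b.jpg", "attributes": {"weather": "clear", "timeofday": "daytime"}},
    {"name": "c.jpg", "attributes": {"weather": "clear", "timeofday": "night"}},
]
names_10k = {"a.jpg", "c.jpg", "z.jpg"}  # "z.jpg" has no entry in the labels
timeofday_counts = count_attribute(label_frames, "timeofday", names_10k)
unmatched = names_10k - {frame["name"] for frame in label_frames}
print(timeofday_counts, unmatched)
```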
Since a BDD-to-COCO conversion script is already available in the repo, it would be great if a bdd2voc conversion script were added as well.
Hi, thanks for your good repository.
Is there any way to split the dataset into night time and day time?
I want to use the dataset for training a CycleGAN.
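One possible way to do the split, assuming each frame in the label JSON has a "name" and an "attributes" dict with a "timeofday" entry. The function name and the "unknown" fallback bucket are hypothetical choices for this sketch.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def split_by_timeofday(frames: Iterable[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each timeofday value to the list of image names carrying it."""
    buckets: Dict[str, List[str]] = defaultdict(list)
    for frame in frames:
        attributes = frame.get("attributes") or {}
        buckets[attributes.get("timeofday", "unknown")].append(frame["name"])
    return dict(buckets)


toy_frames = [
    {"name": "a.jpg", "attributes": {"timeofday": "night"}},
    {"name": "b.jpg", "attributes": {"timeofday": "daytime"}},
    {"name": "c.jpg"},  # no attributes at all
]
buckets = split_by_timeofday(toy_frames)
print(buckets)
```

The resulting name lists can then be used to copy or symlink images into separate folders for the two domains. Since the attribute labels are assigned manually, a visual spot check of each bucket is advisable.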
Hi,
I am trying to download the MOTS20 images via this link: http://dl.yf.io/bdd100k/mots20/bdd100k_seg_track_20_images.zip
and get a "403 Forbidden" error.
I can download the version in the tmp folder, but there the labels and images do not fit together.
Just one example:
This training image (seg_track\train\002b485a-3f6603f2\002b485a-3f6603f2-0000180.jpg)
should belong to this colormap label (seg_track_20\colormaps\train\002b485a-3f6603f2\002b485a-3f6603f2-0000180.png)
Is this a problem with the tmp version?
And how do I get access to the real one?
Hello, is there a preferred way to fix or report incorrect labels? They've been reported in previous issues [1] [2] and I've run into it as well. But not sure if there's an ideal process to report/fix these.
One example on timeofday in the 100k det_20 labels: image bdd100k/images/100k/train/0de6128f-d367dc4d.jpg, with a reported time-of-day of "night". Presumably this label is incorrect.
$ jq '.[] | select(.name == "0de6128f-d367dc4d.jpg") | .name,.attributes' bdd100k/labels/det_20/det_train.json
"0de6128f-d367dc4d.jpg"
{
  "weather": "snowy",
  "timeofday": "night",
  "scene": "city street"
}
I am planning on using the images that are not annotated for both the ins_seg and seg_track_20 tasks. If I were to annotate all of the train images present in the 100k train folder using a model trained using the 10k/seg_track_20 training set, would there be any images present there that are in the val/test sets of the ins_seg (10k) or seg_track_20 tasks? Or are the train/val/test splits all derived from the same videos, regardless of the task?
I found that 137 images in the train set have no annotation: they are absent from the annotation .json file.
Here is the list; tell me if the problem is reproducible.
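To reproduce such a list, a minimal sketch that compares the image file names with the "name" fields of the annotation frames, assuming the annotation file is a JSON list of frames. The function name and the toy data are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def images_without_annotation(
    image_names: Iterable[str], frames: Iterable[Dict[str, Any]]
) -> List[str]:
    """Return image file names that have no entry in the label frames."""
    annotated = {frame["name"] for frame in frames}
    return sorted(name for name in image_names if name not in annotated)


# Toy stand-ins: in practice image_names could come from os.listdir() on the
# image folder and frames from json.load() on the label file.
image_names = ["a.jpg", "b.jpg", "c.jpg"]
annotation_frames = [{"name": "a.jpg", "labels": []}, {"name": "c.jpg"}]
not_annotated = images_without_annotation(image_names, annotation_frames)
print(not_annotated)
```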
Hi! BDD is a great dataset! But is there any detailed documentation about lane line annotations? For instance, are they all straight lines?
cc @fyu
Could the authors (@fyu) kindly verify this?
I downloaded the latest annotations of instance segmentation along with the latest image set and wanted to train mask-rcnn.
I notice that there are no instances of 'pedestrian', which is category id 1.
Could someone please cross-check this?
Hello, I have been checking the labels for lane line detection. It seems that all lanes are marked as solid (i.e. for every pixel value of a mask, its 3rd bit is always 0). For example, check val/b1d0a191-de8948f6.jpg. I think there is no way to differentiate between the dashed and the solid lines in this image based on the annotations.
Hi, I followed https://doc.bdd100k.com/evaluate.html#detection to save my prediction results in the correct format but I get the following error when running bdd100k.eval.run:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/env/lib/python3.7/site-packages/scalabel/common/parallel.py", line 30, in run
    q_out.put((i, func(x[0])))
  File "/home/env/lib/python3.7/site-packages/scalabel/label/io.py", line 18, in parse
    return Frame(**humps.decamelize(raw_frame))
  File "/home/env/lib/python3.7/site-packages/scalabel/label/typing.py", line 96, in __init__
    super().__init__(**data)
  File "pydantic/main.py", line 400, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 100 validation errors for Frame
labels -> 0 -> id
  field required (type=value_error.missing)
labels -> 1 -> id
  field required (type=value_error.missing)
labels -> 2 -> id
  field required (type=value_error.missing)
labels -> 3 -> id
  field required (type=value_error.missing)
labels -> 4 -> id
  field required (type=value_error.missing)
labels -> 5 -> id
The issue seems to be that each label must also have an id. What should this id be set to?
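A possible workaround, sketched under the assumption that any unique string per label satisfies the validation (the error only states that the field is required). The helper `add_label_ids` is hypothetical and works on the plain dicts before they are written to the prediction JSON.

```python
import copy
from typing import Any, Dict, List


def add_label_ids(frames: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return a deep copy of frames in which every label has a string "id".

    Existing ids are kept; missing ones get the next unused counter value.
    """
    result = copy.deepcopy(frames)
    used = {
        label["id"]
        for frame in result
        for label in frame.get("labels") or []
        if "id" in label
    }
    counter = 0
    for frame in result:
        for label in frame.get("labels") or []:
            if "id" in label:
                continue
            while str(counter) in used:
                counter += 1
            label["id"] = str(counter)
            used.add(label["id"])
    return result


raw_frames = [
    {"name": "a.jpg", "labels": [{"category": "car"}, {"id": "0", "category": "bus"}]},
    {"name": "b.jpg", "labels": [{"category": "car"}]},
]
fixed_frames = add_label_ids(raw_frames)
print([label["id"] for frame in fixed_frames for label in frame["labels"]])
```

For tracking tasks the id may carry meaning (for example, identity across frames), so a plain counter like this is only a reasonable guess for per-image detection results.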
I found a few images and labels that do not match between the images/10k images for semantic segmentation and the corresponding bitmasks downloaded in labels/sem_seg/masks. The files are the following:
In images path but not in labels:
In labels path but not in images:
I am able to train a model successfully after removing these images but wanted to bring this to your attention.
Thanks.
I am trying to convert the ins_seg annotations to coco format using the instructions provided at https://doc.bdd100k.com/format.html#format-conversion . As of now, I am not interested in the scalabel components as I am using a larger detection repo to run experiments with this data; I ran the following code with the correct directories on my end:
python3 -m bdd100k.label.to_coco -m ins_seg|seg_track -i ${mask_base} -o ${out_path}
where mask_base is where my training bitmasks are, in this case. The output was a json file with no annotations. What could cause this issue? I have checked the bitmasks and they are all there. Also, I ran the scalabel version:
python3 -m bdd100k.label.to_coco -m ins_seg|seg_track -i ${in_path} -o ${out_path} -mb ${mask_base}
which is giving me a complete json file in the RLE format. The issue I am having with this annotation set is that it appears to be causing issues in my other code (which I believe is due to the scalabel content). I am still looking into this but was wondering if there was any advice you could provide to me?
I'm currently exploring the bdd100k dataset, and I've run into a very strange situation. I filtered out all images that do not have the "daytime" label for timeofday.
I still see some night images in the resulting set.
Some examples:
{"name": "2b2c44c7-958f0ba8.jpg", "attributes": {"weather": "undefined", "scene": "city street", "timeofday": "daytime"}}
{"name": "a535e685-201c994e.jpg", "attributes": {"weather": "clear", "scene": "highway", "timeofday": "daytime"}}
Check the images I've pasted here and you can see they're both night images.
How is this happening? Bad manual labeling?
I actually can not tell how many night images are "misclassified".
Dear authors,
May I ask if there exists a file called box_track_val_cocofmt.json under bdd100k/labels/box_track_20?
I was trying to run Quasi-Dense Tracking, which is based on bdd100k. However, it requires box_track_val_cocofmt.json.
I downloaded the detection labels but did not find this file.
Please ignore this if the json file never existed.
Thanks!
Hey there!
While running the script to convert the annotations from bdd100k format to COCO, the following exception is raised: cannot import name 'NPROC' from 'scalabel.common.parallel'. What is a possible solution to the problem?
There are no weather or location conditions for 10K datasets.
Thanks for making a good library.
By the way, about 4,000 images of the semantic segmentation dataset (10K) are not in the 100K json label files.
Therefore, it is not possible to know the weather or location condition of some images in the segmentation dataset (10K).
The current segmentation dataset (10K) of BDD100K is just a segmentation dataset. There is no information about various conditions such as weather.
Is there any way to infer the weather or location condition of the segmentation dataset (10K)?
Or can I get instance segmentation labels for 100K?
I get this error every time I run to_coco.py on Google Colab.
My code:
!python3 -m bdd100k.label.to_coco -i /content/bdd100k_label/labels/train.json -o /content/det_v2_train_release_coco.json
The error I got:
[2021-11-07 12:46:50,950 to_coco.py:546 main] Start format converting...
0% 0/69863 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/content/bdd100k/bdd100k/label/to_coco.py", line 557, in <module>
    main()
  File "/content/bdd100k/bdd100k/label/to_coco.py", line 547, in main
    frames = bdd100k_to_scalabel(dataset.frames, bdd100k_config)
  File "/content/bdd100k/bdd100k/label/to_scalabel.py", line 58, in bdd100k_to_scalabel
    image_anns.labels[i], bdd100k_config, cat_name2id
  File "/content/bdd100k/bdd100k/label/to_scalabel.py", line 35, in deal_bdd100k_category
    assert category_name in bdd100k_config.ignored_mapping
AssertionError
I have no idea how to solve this problem. Any help will be appreciated. Thanks!!
I am currently working with BDD10K and would like to evaluate on the private test set. The link to the submission site says “We are moving our evaluation servers to Codalab. Stay tuned!”. I was wondering about the timeline for the evaluation servers to be back up? Thanks!
I am trying to download the Segmentation labels from https://bdd-data.berkeley.edu/portal.html#download, but I get the following error:
<Error> <Code>AccessDenied</Code> <Message>Request has expired</Message> <X-Amz-Expires>2</X-Amz-Expires> <Expires>2021-12-16T09:15:24Z</Expires> <ServerTime>2021-12-16T09:15:25Z</ServerTime> <RequestId>D4GYVYX8CK4938PT</RequestId> <HostId>JxMxnuwGMRc3M4BB52LPNR9PGsTlM++MsXnbHDVxrSA/V7RKvEeIQtBvz2KgQ1QNB0H4zasxDn8=</HostId> </Error>
Some of the other downloads work sometimes, but sometimes give the same error. I am using Chrome on Windows.
Excuse me, I want to convert "ins_seg_train.json" to COCO. I used this command: "python -m bdd100k.label.to_coco -m ins_seg -i ins_seg_train.json -o ins_seg_train_coco.json",
but the program fails with this error:
[2021-11-11 13:53:21,270 to_coco.py:537 main] Loading annotations...
[2021-11-11 13:53:31,594 to_coco.py:546 main] Start format converting...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7000/7000 [00:00<00:00, 46697.44it/s]
[2021-11-11 13:53:31,746 to_coco.py:321 bdd100k2coco_ins_seg] Collecting annotations...
0%| | 0/7000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/stan/anaconda3/envs/labelme/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/stan/anaconda3/envs/labelme/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/stan/wkstion/labeltool/bdd100k/bdd100k/label/to_coco.py", line 557, in <module>
    main()
  File "/home/stan/wkstion/labeltool/bdd100k/bdd100k/label/to_coco.py", line 548, in main
    coco = convert_func(frames=frames, config=bdd100k_config.scalabel)
  File "/home/stan/wkstion/labeltool/bdd100k/bdd100k/label/to_coco.py", line 344, in bdd100k2coco_ins_seg
    image_anns.name.replace(".jpg", ".png"),
  File "/home/stan/anaconda3/envs/labelme/lib/python3.7/posixpath.py", line 80, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType
What should I do? Thanks!
Hi, I was trying to download the instance segmentation data, but all I see is semantic segmentation without differentiation among individual instances. Also, in the detection20 data, all poly2d fields are None. Please let me know if I am missing some other dataset.
Thanks!
There seems to be no config to convert the polygon annotations of lane markings to bitmasks (and running to_mask.py without a config argument complains about a missing lane_mark.toml).
Some (possibly many) images in the train set of bdd10k have no labels. For example, the images below from the bdd10k train set have no labels in 'bdd100k\labels\sem_seg\masks\train'.
78ac84ba-07bd30c2
52e3fd10-c205dec2
a5242a75-c9f4fb66
When I run bdd100k2coco.py, I get the following error:
"File "bdd100k2coco.py", line 130, in bdd100k2coco_det
image["file_name"] = frame['name']
TypeError: list indices must be integers or slices, not str"
Does anyone know why?
Thanks
Hello, I found some name inconsistencies in the JSON files, which cause errors when converting the format with the to_coco.py script.
Most of the 'name' values in the JSON files have the form 'videoName-frameNumber.jpg'.
However, the names in the following files have the form 'videoName/videoName-frameNumber.jpg'.
A list of files:
0062e803-38c0a33a.json
000f157f-dab3a407.json
006fdb67-f4820206.json
0062f18d-f8cd3a65.json
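Until the files are fixed, one workaround is to normalise the names before conversion. A minimal sketch, assuming the converter expects the shorter 'videoName-frameNumber.jpg' form used by most files; the helper names are hypothetical.

```python
import posixpath
from typing import Any, Dict, List


def normalize_name(name: str) -> str:
    """Drop any leading directory part from a frame name."""
    return posixpath.basename(name)


def normalize_frames(frames: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return shallow copies of the frames with normalised "name" values."""
    return [{**frame, "name": normalize_name(frame["name"])} for frame in frames]


sample_frames = [
    {"name": "0062e803-38c0a33a/0062e803-38c0a33a-0000001.jpg"},
    {"name": "000f157f-dab3a407-0000002.jpg"},
]
normalized = normalize_frames(sample_frames)
print([frame["name"] for frame in normalized])
```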
Can you update the bdd100k eval function on PyPI?
I've already downloaded the bdd100k images.
Could you please provide the camera intrinsic and extrinsic parameters?
That would be really helpful for my research.
Thank you very much!
Hello,
I am trying to convert the bdd100k instance segmentation using this command:
python3 -m bdd100k.label.to_coco -m ins_seg --only-mask -i ./bdd100k/labels/ins_seg/bitmasks/val -o ./ins_seg_val_cocofmt_v2.json
Also, tried this:
python3 -m bdd100k.label.to_coco -m ins_seg -i ./bdd100k/labels/ins_seg/polygons/ins_seg_val.json -o ./ins_seg_val_cocofmt_v3.json -mb ./bdd100k/labels/ins_seg/bitmasks/val
The conversion is successful in both cases and the annotation looks like this
That's not how COCO annotations usually look.
The segmentation field above contains a string encoding of the masks, and I am unsure whether that's expected.
Further, assuming it's correct, I tried to load the annotations using loader from DETR https://github.com/facebookresearch/detr/blob/091a817eca74b8b97e35e4531c1c39f89fbe38eb/datasets/coco.py#L36
The line linked above is supposed to do the conversion, but I get an error from pycocotools saying that it does not expect a string in the mask.
So I am unsure where the problem is. If the conversion to COCO is correct, shouldn't the loader work?
Note: I tried to convert the detections and they worked fine.
Thank you for any help you can provide.
When I used the script provided by the repository to convert the BDD format to COCO format, I found that, for the training set, the number of images in the COCO output was only 69,863 instead of the 70,000 in BDD. The number of images in the validation set remained unchanged.
I am looking for the mean and std of the dataset, similar to those for ImageNet. Are they available, or do I need to calculate them myself?
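If the values have to be computed, a minimal sketch using running sums, so that the whole dataset never has to be held in memory. It takes a stream of pixels (one sequence of channel values per pixel); decoding the images into pixels is left out, and the function name is hypothetical.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def channel_mean_std(
    pixels: Iterable[Sequence[float]], channels: int = 3
) -> Tuple[List[float], List[float]]:
    """Per-channel mean and population std from a stream of pixels."""
    count = 0
    total = [0.0] * channels
    total_sq = [0.0] * channels
    for pixel in pixels:
        count += 1
        for c in range(channels):
            total[c] += pixel[c]
            total_sq[c] += pixel[c] * pixel[c]
    if count == 0:
        raise ValueError("no pixels given")
    mean = [t / count for t in total]
    # Var[x] = E[x^2] - E[x]^2; clamp tiny negatives caused by rounding.
    std = [
        math.sqrt(max(sq / count - m * m, 0.0))
        for sq, m in zip(total_sq, mean)
    ]
    return mean, std


mean, std = channel_mean_std([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
print(mean, std)
```

For a real dataset the same sums are usually accumulated per image with an array library rather than pixel by pixel, and the pixel values are scaled to [0, 1] first if the result should be comparable to the commonly quoted ImageNet constants.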
Hi, I could not find information about the image sensor used to record the RGB images. Could you give me a link or the sensor type?
Thank you so much
Hi, I generated detection results for a subset of the categories in BDD100K and I would like to get the evaluation scores.
Just running the default evaluation code does not work as it generates a default value of -1 for all scores of all missing categories:
Evaluate category: traffic sign
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.37s).
Accumulating evaluation results...
DONE (t=0.01s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = -1.000
...
The -1 scores are included in the final mAP calculation which is therefore completely wrong.
I also tried removing the categories in question (traffic light and traffic sign) from the config file, but this results in an error from scalabel, as it assumes that at least one cat_extensions is set in the config:
  File "/home/user/env/lib/python3.7/site-packages/scalabel/label/to_coco.py", line 551, in load_coco_config
    categories, cat_extensions = cfgs["categories"], cfgs["cat_extensions"]
KeyError: 'cat_extensions'
How could I achieve this?
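One workaround that needs no change to the config is to post-process the per-category scores and average only over the categories that were actually evaluated. A minimal sketch, assuming the per-category AP values have been collected into a dict and that -1 marks a category without a valid score, as in the log above. The function name and the toy values are hypothetical; this is not what the toolkit does by default.

```python
from typing import Dict, Optional


def mean_ap_over_evaluated(ap_by_category: Dict[str, float]) -> Optional[float]:
    """Average AP over categories with a valid (non-negative) score.

    Returns None when no category has a valid score.
    """
    valid = [ap for ap in ap_by_category.values() if ap >= 0.0]
    if not valid:
        return None
    return sum(valid) / len(valid)


scores = {
    "car": 0.5,
    "pedestrian": 0.25,
    "traffic light": -1.0,  # not predicted: placeholder score
    "traffic sign": -1.0,
}
subset_map = mean_ap_over_evaluated(scores)
naive_map = sum(scores.values()) / len(scores)
print(subset_map, naive_map)
```

The naive average over all four entries is pulled below zero by the placeholders, whereas the filtered average only reflects the two evaluated categories. Results computed this way are not comparable to scores over the full category set.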
Hello there, thank you for the wonderful work on the dataset. I am trying to perform MOT evaluation on the validation set.
I have prepared the JSON files from validation set results from a custom trained model, and I am trying to run the following. However, there is no output from the program, there are no errors either.
python3 -m bdd100k.eval.run \
-t box_track \
-g /Users/admin/Desktop/bdd100kdata/bdd100k/labels/box_track_20/val \
-r /Users/admin/Desktop/bdd100kdata/bdd100k_epoch5/bdd \
--out_file /Users/admin/Desktop/bdd100kdata/bdd100K_epoch5/val_output \
--score_file /Users/admin/Desktop//bdd100kdata/bdd100k_epoch5/
In my own attempt to understand what is going on, I have placed print statements in the various files, and identified that in run.py, there are no more print outputs after load(args.gt, args.nproc) from the section of the code below. However, I'm not too sure what else I could do after this.
Perhaps you may have a solution? Thank you.
elif args.task == "box_track":
    print("Mode: Box Track")
    results = evaluate_track(
        acc_single_video_mot,
        gts=group_and_sort(
            bdd100k_to_scalabel(
                load(args.gt, args.nproc).frames, bdd100k_config
            )
        ),
        results=group_and_sort(
            bdd100k_to_scalabel(
                load(args.result, args.nproc).frames, bdd100k_config
            )
        ),
        config=bdd100k_config.scalabel,
        iou_thr=args.iou_thr,
        ignore_iof_thr=args.ignore_iof_thr,
        nproc=args.nproc,
    )
Hi @fyu, I saw that you had previously dealt with a similar issue. I am getting KeyError: 'labels' at the line if frame["labels"]: when running python -m bdd100k.label.to_coco -m det -i /home/sam/Desktop/labels-original/det_20/det_train.json -o /home/sam/Desktop/labels/det_20/det_train_cocofmt.json
I have looked through the to_coco.py file but can't figure out why this won't run. Any advice would be great.
It's reported in Table 6 of your paper that using the 7k BDD instance segmentation dataset could reach AP = 21.8. However, I cannot reproduce this under the default configuration of Mask R-CNN in Detectron2. Could you provide more implementation details? Thanks.
After visualising the labels, I observed that many objects have wrong categories: trucks are annotated as Car, and vans are annotated sometimes as Car and sometimes as Truck.
Is this due to manual (human) errors?
A few examples:
Further, some annotated objects cannot be recognised even by the human eye.
1st image: a truck is annotated as Car.
2nd image: again, a truck is annotated as Car.
3rd image: a car (a van in European countries) is annotated as Truck, although in most images such objects are annotated as Car.
4th & 5th images: objects are annotated that cannot be recognised by the human eye.
Is the BDD team trying to correct or improve the wrongly annotated objects?
run.py
have the same output typemain
in run.py
Your code doesn't work. In to_coco.py you are looking for key_name when you should be looking for keyName. Fix it and update your repo
When I try to convert the 10k image subset, the function bdd100k2coco_det() leads to the error:
if frame['labels']:
KeyError: 'labels'
when the following command is run:
python3 -m bdd100k.label.to_coco -m det -i "./labels/det_20/det_train.json" -o "./labels/det_20_coco/det_train.json"
An easy fix is to change the check
if frame['labels']:
to
if frame.get("labels", None):
Also, if the corresponding output directory is not found, the main() function complains. We could just add:
if not os.path.exists('/'.join(out_fn.split('/')[:-1])):
    os.mkdir('/'.join(out_fn.split('/')[:-1]))
before dumping output json to check for the output directory.
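A slightly more robust variant of the same idea, as a sketch: `os.mkdir` creates only one directory level and fails when intermediate directories are missing, and splitting on '/' is not portable, whereas `os.makedirs` with `exist_ok=True` handles both cases. The helper name `ensure_parent_dir` is hypothetical; `out_fn` stands for the output path as in the snippet above.

```python
import os
import tempfile


def ensure_parent_dir(out_fn: str) -> None:
    """Create the directory that will contain out_fn, if there is one."""
    parent = os.path.dirname(out_fn)
    if parent:  # empty when out_fn is a bare file name
        os.makedirs(parent, exist_ok=True)


# Demonstration inside a temporary directory.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "labels", "det_20_coco", "det_train.json")
    ensure_parent_dir(target)
    created = os.path.isdir(os.path.dirname(target))
print(created)
```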