greatv / labelme2yolo

Efficiently converts LabelMe's JSON format to the YOLOv5 dataset format.

License: MIT License

Rust 100.00%

labelme2yolo's Introduction

Labelme2YOLO

Labelme2YOLO efficiently converts LabelMe's JSON format to the YOLOv5 dataset format. It also supports YOLOv5/YOLOv8 segmentation datasets, making it simple to convert existing LabelMe segmentation datasets to YOLO format.

New Features

Export data as YOLO polygon annotations (for YOLOv5 and YOLOv8 segmentation).
You can now choose the output format of the label text. The two available options are polygon and bounding box (bbox).
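
To make the difference between the two formats concrete, here is a minimal Python sketch (not the tool's Rust implementation) of how one LabelMe shape could map to a polygon line and a bbox line. The function names normalize, to_polygon_line and to_bbox_line are hypothetical; the layouts (class followed by x y pairs for polygon; class, x_center, y_center, width, height for bbox, all normalized by image size) follow the usual YOLO conventions.

```python
def normalize(points, img_w, img_h):
    """Scale pixel coordinates into the 0..1 range expected by YOLO."""
    return [(x / img_w, y / img_h) for x, y in points]


def to_polygon_line(class_id, points, img_w, img_h):
    """class_id followed by normalized x y pairs, one pair per vertex."""
    flat = [c for pt in normalize(points, img_w, img_h) for c in pt]
    return " ".join([str(class_id)] + [f"{v:.6f}" for v in flat])


def to_bbox_line(class_id, points, img_w, img_h):
    """class_id x_center y_center width height of the enclosing box."""
    norm = normalize(points, img_w, img_h)
    xs = [x for x, _ in norm]
    ys = [y for _, y in norm]
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    xc, yc = min(xs) + w / 2, min(ys) + h / 2
    return " ".join([str(class_id)] + [f"{v:.6f}" for v in (xc, yc, w, h)])


triangle = [(100, 100), (300, 100), (200, 300)]  # made-up pixel coordinates
print(to_polygon_line(0, triangle, 400, 400))
print(to_bbox_line(0, triangle, 400, 400))
```

A polygon line grows with the number of vertices, while a bbox line always has five columns.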

Performance

Labelme2YOLO is implemented in Rust, which makes it significantly faster than equivalent Python implementations. In fact, it can be up to 100 times faster, allowing you to process large datasets more efficiently.

Installation

pip install labelme2yolo

Arguments

[LABEL_LIST]... Comma-separated list of labels in the dataset.

Options

-d, --json_dir <JSON_DIR> Directory containing LabelMe JSON files.

--val_size <VAL_SIZE> Proportion of the dataset to use for validation (between 0.0 and 1.0) [default: 0.2].

--test_size <TEST_SIZE> Proportion of the dataset to use for testing (between 0.0 and 1.0) [default: 0].

--output_format <OUTPUT_FORMAT> Output format for YOLO annotations: 'bbox' or 'polygon' [default: bbox] [aliases: format] [possible values: polygon, bbox].

--seed Seed for random shuffling [default: 42].

-h, --help Print help.

-V, --version Print version.

How to Use

1. Converting JSON files and splitting training, validation datasets

Place all LabelMe JSON files under labelme_json_dir, then run the following command:

labelme2yolo --json_dir /path/to/labelme_json_dir/

The tool generates the dataset labels and images in YOLO format in the following folders:

/path/to/labelme_json_dir/YOLODataset/labels/train/
/path/to/labelme_json_dir/YOLODataset/labels/val/
/path/to/labelme_json_dir/YOLODataset/images/train/
/path/to/labelme_json_dir/YOLODataset/images/val/
/path/to/labelme_json_dir/YOLODataset/dataset.yaml

2. Converting JSON files and splitting training, validation, and test datasets with --val_size and --test_size

Place all LabelMe JSON files under labelme_json_dir, then run the following command:

labelme2yolo --json_dir /path/to/labelme_json_dir/ --val_size 0.15 --test_size 0.15

The tool generates the dataset labels and images in YOLO format in the following folders:

/path/to/labelme_json_dir/YOLODataset/labels/train/
/path/to/labelme_json_dir/YOLODataset/labels/test/
/path/to/labelme_json_dir/YOLODataset/labels/val/
/path/to/labelme_json_dir/YOLODataset/images/train/
/path/to/labelme_json_dir/YOLODataset/images/test/
/path/to/labelme_json_dir/YOLODataset/images/val/
/path/to/labelme_json_dir/YOLODataset/dataset.yaml

How to build package/wheel

pip install maturin
maturin develop

labelme2yolo's Issues

Multiprocessing error on high core count machines

Due to the nature of Python's own multiprocessing module and its pool implementation, machines that report more than 64 logical cores with os.cpu_count() will crash during pool creation.
Our server has 128 cores which results in the following error:
ValueError: need at most 63 handles, got a sequence of length 129

A possible solution would be to change line 250 in l2y.py from:
with Pool(os.cpu_count()-1) as pool...
to
with Pool(min(os.cpu_count()-1, 63)) as pool....

Best Regards.
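
A note on the proposed cap: the error reports 129 handles for a pool of 127 workers, which suggests the pool needs about two handles beyond the worker count, so a cap of 63 workers might still exceed the limit. Below is a hedged sketch of a more conservative cap. The helper safe_worker_count and the constant 61 are assumptions (61 is the limit Python documents for concurrent.futures.ProcessPoolExecutor on Windows), not code from l2y.py.

```python
import os
import sys

# Assumption: 61 mirrors the Windows limit documented for
# concurrent.futures.ProcessPoolExecutor; it is not taken from l2y.py.
WINDOWS_MAX_WORKERS = 61


def safe_worker_count(cpu_count=None, platform=None):
    """Leave one core free, never go below 1, and cap the count on Windows."""
    cpu_count = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    platform = platform if platform is not None else sys.platform
    workers = max(1, cpu_count - 1)
    if platform == "win32":
        workers = min(workers, WINDOWS_MAX_WORKERS)
    return workers


print(safe_worker_count(cpu_count=128, platform="win32"))
```

The pool would then be created with Pool(safe_worker_count()) in place of Pool(os.cpu_count()-1).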

Error on convert only one file

When the folder contains only one file to convert, the app crashes.
It's an easy bug to reproduce.

Polygon Conversion issue for YOLOv8 Semantic Segmentation

When converting the annotation file below, I get some points with values greater than 1. What could be the reason for this?

{
	"version": "3.16.7",
	"flags": {},
	"shapes": [
		{
			"label": "cable",
			"line_color": null,
			"fill_color": null,
			"points": [
				[
					0.0,
					1185.46875
				],
				[
					12.261904761904772,
					1152.5223214285716
				],
				[
					964.6428571428571,
					831.09375
				],
				[
					2260.2140077821015,
					394.789640077821
				],
				[
					2870.136186770428,
					187.13521400778215
				],
				[
					3420.0,
					0.0
				],
				[
					3436.2840466926074,
					3.2830739299610895
				],
				[
					2952.7380952380956,
					174.84375
				],
				[
					2232.5,
					425.9598214285714
				],
				[
					1340.9533073929964,
					728.842412451362
				],
				[
					501.9455252918288,
					1016.1113813229573
				]
			],
			"shape_type": "polygon",
			"flags": {}
		}
	],
	"lineColor": [
		0,
		255,
		0,
		128
	],
	"fillColor": [
		255,
		0,
		0,
		128
	],
	"imagePath": "14_00354.jpg",
	"imageData": REMOVED,
	"imageHeight": 2160,
	"imageWidth": 3840
}

OUTPUT:

0 0.0 1.157684326171875 0.011974516369047629 1.1255100795200894 0.9420340401785714 0.811614990234375 2.2072402419747084 0.38553675788849706 2.8028673698929962 0.18274923242947474 3.33984375 0.0 3.3557461393482493 0.0032061268847276263 2.8835332961309526 0.170745849609375 2.18017578125 0.41597638811383925 1.309524714250973 0.7117601684095332 0.49018117704280156 0.9922962708232005

I'm using:

labelme2yolo Version: 0.0.9
Python 3.11.0
Windows 11

Also, 99 annotation files out of 922 have this issue, and the rest seem fine. I double-checked all the files with this issue, and all of the points are within the image size range. I attached the file as .txt (GitHub doesn't accept .json) so the issue can be reproduced if needed.
14_00354.txt
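
One observation from the numbers above: each output value equals the corresponding pixel coordinate divided by 1024 on both axes (for example, 1185.46875 / 1024 = 1.157684326171875), not by the imageWidth of 3840 or the imageHeight of 2160 stored in the JSON. That is consistent with the converter using different dimensions than the JSON declares, though this is an inference from the numbers rather than confirmed behavior. The sketch below, with the hypothetical helpers normalize_polygon and out_of_range, compares both normalizations on the first three vertices.

```python
def normalize_polygon(points, img_w, img_h):
    """Divide x by image width and y by image height."""
    return [(x / img_w, y / img_h) for x, y in points]


def out_of_range(norm_points):
    """Return the normalized points that fall outside the 0..1 range."""
    return [(x, y) for x, y in norm_points if not (0 <= x <= 1 and 0 <= y <= 1)]


# First three vertices from the annotation above.
points = [(0.0, 1185.46875), (12.261904761904772, 1152.5223214285716),
          (964.6428571428571, 831.09375)]

by_json_size = normalize_polygon(points, 3840, 2160)  # dimensions from the JSON
by_1024 = normalize_polygon(points, 1024, 1024)       # assumption: 1024 x 1024

print(out_of_range(by_json_size))  # empty when the JSON dimensions are used
print(by_1024[0])                  # matches the first pair in the reported output
```

Checking the actual pixel size of the affected image files against imageWidth and imageHeight in their JSON would confirm or rule out this explanation.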

I tested in the example folder of labelme, and the resulting folders were all empty.

labelme\examples\instance_segmentation\data_annotated

The folder in images and the folder in labels are both empty.

labelme2yolo some error

Here is my json
{ "version": "0.3.3", "flags": {}, "shapes": [], "imagePath": "KunChuansafebelt_dinglinhe20240119a_000119.jpg", "imageData": null, "imageHeight": 1080, "imageWidth": 1920 }
I'm trying to convert to a YOLO dataset using this command:
labelme2yolo --json_dir json --val_size 0.15 --test_size 0.15

I'm using:

what errors

I know the reason: the imageData field of the JSON is empty, so the image cannot be saved. I know how to work around it myself, but I would like the author to fix it in the tool too.
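
A hedged sketch of the fallback being requested: use the embedded imageData when it is present, otherwise read the file that imagePath points to. LabelMe stores imageData as a base64 string and imagePath relative to the JSON file; the helper name load_image_bytes is hypothetical, and this is not the tool's actual code.

```python
import base64
import json
from pathlib import Path


def load_image_bytes(json_path):
    """Return raw image bytes for one LabelMe annotation file."""
    json_path = Path(json_path)
    data = json.loads(json_path.read_text(encoding="utf-8"))
    if data.get("imageData"):
        # Embedded image: base64 text inside the JSON.
        return base64.b64decode(data["imageData"])
    # No embedded image (null or empty string): resolve imagePath
    # relative to the directory that holds the JSON file.
    image_path = json_path.parent / data["imagePath"]
    return image_path.read_bytes()
```

With this approach a JSON file whose imageData is null still yields an image, as long as the referenced file sits next to it.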

Doubt about the result format

YOLOv[X] uses the following format:

One row per object.
The columns are class, x_center, y_center, width, and height.
Box coordinates must be normalized to the dimensions of the image (i.e. have values between 0 and 1).
Class numbers are zero-indexed (start from 0).

That is five columns in total.
Why does labelme2yolo produce nine columns?

Ex.:
6 0.30028237951807224 0.6704261490406068 0.4816327811244979 0.6704261490406068 0.4816327811244979 0.8132195448460507 0.30028237951807224 0.8132195448460507
0 -0.000571034136546178 0.47830209727800094 0.12583458835341366 0.47830209727800094 0.12583458835341366 0.7038933511825078 -0.000571034136546178 0.7038933511825078
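
The nine columns are consistent with a class index followed by the four corner points (x y pairs) of a rectangle, i.e. the polygon layout rather than the five-column bbox layout. As a hedged workaround sketch, such a line can be collapsed to class, x_center, y_center, width, height. The helper polygon_line_to_bbox is hypothetical, and clamping to 0..1 is an assumption added to handle the slightly negative value in the second example.

```python
def polygon_line_to_bbox(line):
    """Convert 'class x1 y1 x2 y2 ...' into (class, xc, yc, w, h)."""
    parts = line.split()
    class_id = int(parts[0])
    coords = [min(1.0, max(0.0, float(v))) for v in parts[1:]]  # clamp to 0..1
    xs, ys = coords[0::2], coords[1::2]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    return (class_id, (x_min + x_max) / 2, (y_min + y_max) / 2,
            x_max - x_min, y_max - y_min)


line = ("6 0.30028237951807224 0.6704261490406068 0.4816327811244979 "
        "0.6704261490406068 0.4816327811244979 0.8132195448460507 "
        "0.30028237951807224 0.8132195448460507")
print(polygon_line_to_bbox(line))
```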

the pip package output a different file name

The labelme2yolo pip package produces different output from @rooneysh's labelme2yolo Python program.
The output text file names are different (randomly generated), while the original names each output text file after the file it was created from.
Also, when running it with output_format set to polygon, the values differ from the original's output with the --seg flag, which I'm inclined to believe is the correct one.

Retrain - Classes need to keep the old index and not be rediscovered

Great code, guy!

I need to retrain on my dataset. The problem is that I need to keep the old class indices for retraining. Your code discovers classes and adds them to the list in the order they are encountered.
Can you add a parameter to specify all classes before starting?
Thank you.
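
The Arguments section lists a [LABEL_LIST] argument for passing labels up front, which may serve this purpose. What a fixed ordering buys you can be sketched in Python as follows; build_class_map is a hypothetical helper, not the tool's code, and appending unlisted labels after the fixed ones is an assumption.

```python
def build_class_map(fixed_labels, discovered_labels=()):
    """Map labels to indices, keeping the fixed labels' order stable."""
    class_map = {label: idx for idx, label in enumerate(fixed_labels)}
    for label in discovered_labels:
        # Assumption: labels not listed up front are appended at the end,
        # so existing indices never shift.
        class_map.setdefault(label, len(class_map))
    return class_map


print(build_class_map(["car", "person"], ["person", "dog", "car"]))
```

Because the fixed labels are indexed first, the order in which annotation files are read no longer affects the class indices.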

Train, Validation and Test split overlap

I was reviewing the files that were split by labelme2yolo and noticed that the function creates splits with overlap.

In my case I have 1231 files, and I pass val_size=0.15 and test_size=0.1. The resulting split was:

train=922
validation=185
test=124

Issues:

The number of files per split seems fine.

But when I move all split files (images or annotations) into a single folder, the file manager reports that some files already exist, and I skipped those. This left a total of 1077 files, which means 154 files were not used.
I verified the duplicates with a Python script:

Overlap between Train and Val: 141 files
Overlap between Train and Test: 0 files
Overlap between Val and Test: 13 files

I'm currently trying to find out what is going wrong, but could you please check this as well?

Also this issue was reported in the original branch. rooneysh/Labelme2YOLO#5
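
For comparison, a minimal sketch of a split that cannot overlap: shuffle the file list once, then cut it into consecutive slices. This is not the tool's implementation; split_dataset and overlaps are hypothetical helpers, the defaults mirror the documented options (val_size 0.2, test_size 0, seed 42), and rounding the split sizes up is an assumption.

```python
import math
import random


def split_dataset(files, val_size=0.2, test_size=0.0, seed=42):
    """Shuffle once, then slice, so every file lands in exactly one split."""
    files = sorted(files)              # deterministic starting order
    random.Random(seed).shuffle(files)
    n_val = math.ceil(len(files) * val_size)
    n_test = math.ceil(len(files) * test_size)
    val = files[:n_val]
    test = files[n_val:n_val + n_test]
    train = files[n_val + n_test:]
    return train, val, test


def overlaps(train, val, test):
    """Count files shared between each pair of splits."""
    t, v, s = set(train), set(val), set(test)
    return {"train/val": len(t & v), "train/test": len(t & s), "val/test": len(v & s)}


names = [f"img_{i:04d}.json" for i in range(1231)]  # made-up file names
train, val, test = split_dataset(names, val_size=0.15, test_size=0.1)
print(len(train), len(val), len(test))
print(overlaps(train, val, test))
```

The overlaps helper can also be pointed at the file names in an existing YOLODataset to reproduce the counts reported above.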

Issue converting labelme JSON to YOLODataset

Here is my labelme JSON:

{ "version": "5.3.1", "flags": {}, "shapes": [ { "label": "Center", "points": [ [ 1400, 900 ], [ 1500, 900 ], [ 1500, 1000 ], [ 1400, 1000 ] ], "group_id": null, "description": "", "shape_type": "Rectangle", "flags": {} }, { "label": "LeftGaurd", "points": [ [ 1320, 920 ], [ 1420, 920 ], [ 1420, 1020 ], [ 1320, 1020 ] ], "group_id": null, "description": "", "shape_type": "Circle", "flags": {} }, { "label": "LeftTackle", "points": [ [ 1240, 920 ], [ 1340, 920 ], [ 1340, 1020 ], [ 1240, 1020 ] ], "group_id": null, "description": "", "shape_type": "Circle", "flags": {} } ], "imagePath": "img/latest-00-QBShotGun-WRLeftHashmarkOn-WRRightHashmarkOn-FeatherLeft-WRRightHashmarkOff-FeatherRight-normal-16686378232057195.png", "imageData": "", "imageHeight": 1200, "imageWidth": 2500 }

I'm trying to convert to a YOLO dataset using this command:
labelme2yolo --json_dir json --val_size 0.15 --test_size 0.15

It generates the YOLODataset folder with images and labels subfolders, each containing test, train, and val, but the folders are empty.

I'm using:

Python 3.10.12
YoloV5
labelme2yolo 0.1.2

Please help with this issue.
Thanks
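
One thing that stands out in the JSON above: the shape_type values are capitalized ("Rectangle", "Circle") and each shape has four points, whereas LabelMe itself writes lowercase types such as "rectangle" and "circle" and stores those two shapes with two points each. Whether that is why the output folders are empty is not confirmed. A hedged pre-flight check, with the hypothetical helper find_suspect_shapes and an assumed list of shape types, could look like this:

```python
# Assumption: shape types as LabelMe writes them; the converter may accept
# only a subset of these.
KNOWN_SHAPE_TYPES = {"polygon", "rectangle", "circle", "line", "point", "linestrip"}
TWO_POINT_TYPES = {"rectangle", "circle", "line"}


def find_suspect_shapes(annotation):
    """Return (label, reason) pairs for shapes a converter might skip."""
    problems = []
    for shape in annotation.get("shapes", []):
        label = shape.get("label", "?")
        shape_type = shape.get("shape_type", "")
        if shape_type not in KNOWN_SHAPE_TYPES:
            problems.append((label, f"unexpected shape_type {shape_type!r}"))
        elif shape_type in TWO_POINT_TYPES and len(shape.get("points", [])) != 2:
            problems.append((label, f"{shape_type} should have 2 points"))
    return problems


annotation = {"shapes": [
    {"label": "Center", "shape_type": "Rectangle",
     "points": [[1400, 900], [1500, 900], [1500, 1000], [1400, 1000]]},
    {"label": "LeftGaurd", "shape_type": "Circle",
     "points": [[1320, 920], [1420, 920], [1420, 1020], [1320, 1020]]},
]}
print(find_suspect_shapes(annotation))
```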

file name changed

When I use labelme2yolo to convert my labels, the converted file names become strings of hexadecimal digits. How can I keep the original file names?

Read ImageData from JSON

The new version seems to have stopped reading imageData from the JSON and now uses imagePath instead.

Double fields

Hello, while converting LabelMe .json files to YOLO format, instead of getting 5 numbers on each line, I am getting 9 (the bbox coordinates are doubled).

Cannot convert to yolov8n segmentation format

I prepared my dataset in LabelMe, and whenever I try to convert it to YOLO format, the operation completes successfully but I always seem to get bounding boxes and not segments. I even set the output_format argument to "polygon", but still got bboxes.