labelme2yolo's Introduction

Labelme2YOLO

Labelme2YOLO efficiently converts LabelMe's JSON format to the YOLOv5 dataset format. It also supports YOLOv5/YOLOv8 segmentation datasets, making it simple to convert existing LabelMe segmentation datasets to YOLO format.

New Features

  • Export data as YOLO polygon annotations (for YOLOv5 and YOLOv8 segmentation).
  • Choose the output format of the label text: polygon or bounding box (bbox).

Performance

Labelme2YOLO is implemented in Rust, which makes it significantly faster than equivalent Python implementations. In fact, it can be up to 100 times faster, allowing you to process large datasets more efficiently.

Installation

pip install labelme2yolo

Arguments

[LABEL_LIST]... Comma-separated list of labels in the dataset.

Options

-d, --json_dir <JSON_DIR> Directory containing LabelMe JSON files.

--val_size <VAL_SIZE> Proportion of the dataset to use for validation (between 0.0 and 1.0) [default: 0.2].

--test_size <TEST_SIZE> Proportion of the dataset to use for testing (between 0.0 and 1.0) [default: 0].

--output_format <OUTPUT_FORMAT> Output format for YOLO annotations: 'bbox' or 'polygon' [default: bbox] [aliases: format] [possible values: polygon, bbox].

--seed <SEED> Seed for random shuffling [default: 42].

-h, --help Print help.

-V, --version Print version.
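
As a rough illustration of the two values accepted by --output_format, the sketch below formats one shape either as a bbox line (class, x_center, y_center, width, height) or as a polygon line (class followed by x y pairs), with every coordinate normalized by the image size. The function name to_yolo_line and its parameters are hypothetical; this is not the tool's actual Rust implementation.

```python
def to_yolo_line(points, class_id, image_width, image_height, output_format="bbox"):
    """Format one shape as a YOLO label line (illustrative sketch only)."""
    # Normalize pixel coordinates to the 0..1 range.
    norm = [(x / image_width, y / image_height) for x, y in points]
    if output_format == "polygon":
        # Segmentation style: class id followed by x y pairs.
        values = [v for point in norm for v in point]
    elif output_format == "bbox":
        # Detection style: center and size of the enclosing box.
        xs = [x for x, _ in norm]
        ys = [y for _, y in norm]
        width, height = max(xs) - min(xs), max(ys) - min(ys)
        values = [min(xs) + width / 2, min(ys) + height / 2, width, height]
    else:
        raise ValueError(f"unknown output format: {output_format}")
    return " ".join([str(class_id)] + [f"{v:.6f}" for v in values])


# Hypothetical triangle on a 1000x800 image.
triangle = [(100, 200), (300, 200), (200, 400)]
print(to_yolo_line(triangle, 0, 1000, 800, "bbox"))
print(to_yolo_line(triangle, 0, 1000, 800, "polygon"))
```

A bbox line always has five columns, while a polygon line has one column for the class plus two per point.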

How to Use

1. Converting JSON files and splitting them into training and validation datasets

Place all LabelMe JSON files under labelme_json_dir, then run the following command:

labelme2yolo --json_dir /path/to/labelme_json_dir/

The tool writes the labels and images in YOLO format to separate folders, such as:

/path/to/labelme_json_dir/YOLODataset/labels/train/
/path/to/labelme_json_dir/YOLODataset/labels/val/
/path/to/labelme_json_dir/YOLODataset/images/train/
/path/to/labelme_json_dir/YOLODataset/images/val/
/path/to/labelme_json_dir/YOLODataset/dataset.yaml

2. Converting JSON files and splitting them into training, validation, and test datasets with --val_size and --test_size

Place all LabelMe JSON files under labelme_json_dir, then run the following command:

labelme2yolo --json_dir /path/to/labelme_json_dir/ --val_size 0.15 --test_size 0.15

The tool writes the labels and images in YOLO format to separate folders, such as:

/path/to/labelme_json_dir/YOLODataset/labels/train/
/path/to/labelme_json_dir/YOLODataset/labels/test/
/path/to/labelme_json_dir/YOLODataset/labels/val/
/path/to/labelme_json_dir/YOLODataset/images/train/
/path/to/labelme_json_dir/YOLODataset/images/test/
/path/to/labelme_json_dir/YOLODataset/images/val/
/path/to/labelme_json_dir/YOLODataset/dataset.yaml

How to build package/wheel

pip install maturin
maturin develop

labelme2yolo's People

Contributors

arshadoid, dependabot[bot], greatv, rooneysh

labelme2yolo's Issues

Multiprocessing error on high core count machines

Due to the nature of Python's own multiprocessing module and its Pool implementation, pool creation crashes on machines that report more than 64 logical cores via os.cpu_count().
Our server has 128 cores, which results in the following error:
ValueError: need at most 63 handles, got a sequence of length 129

A possible solution would be to change line 250 in l2y.py from:
with Pool(os.cpu_count()-1) as pool...
to
with Pool(min(os.cpu_count()-1, 63)) as pool....

Best Regards.
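
A minimal sketch of the capping idea from the issue above. Note that in the reported error, 127 workers produced a sequence of 129 handles, which suggests the pool needs two handles beyond the workers themselves; under that reading a cap of 63 workers would still exceed the limit, and 61 is the safer bound (the same limit Python documents for ProcessPoolExecutor on Windows). The helper name safe_pool_size is hypothetical and is not part of l2y.py.

```python
import os
import sys

# Assumption: on Windows at most 63 handles can be waited on at once, and the
# pool uses two of them for itself, which leaves room for 61 workers.
MAX_WINDOWS_WORKERS = 61


def safe_pool_size(cpu_count=None, is_windows=None):
    """Return a worker count that leaves one core free and respects the cap."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    if is_windows is None:
        is_windows = sys.platform == "win32"
    workers = max(1, cpu_count - 1)
    if is_windows:
        workers = min(workers, MAX_WINDOWS_WORKERS)
    return workers


# Possible usage in place of Pool(os.cpu_count()-1):
#     with Pool(safe_pool_size()) as pool: ...
print(safe_pool_size(128, is_windows=True))
```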

Polygon Conversion issue for YOLOv8 Semantic Segmentation

When converting the annotation file provided below, I get some points that are greater than 1. What could be the reason for this?

{
	"version": "3.16.7",
	"flags": {},
	"shapes": [
		{
			"label": "cable",
			"line_color": null,
			"fill_color": null,
			"points": [
				[
					0.0,
					1185.46875
				],
				[
					12.261904761904772,
					1152.5223214285716
				],
				[
					964.6428571428571,
					831.09375
				],
				[
					2260.2140077821015,
					394.789640077821
				],
				[
					2870.136186770428,
					187.13521400778215
				],
				[
					3420.0,
					0.0
				],
				[
					3436.2840466926074,
					3.2830739299610895
				],
				[
					2952.7380952380956,
					174.84375
				],
				[
					2232.5,
					425.9598214285714
				],
				[
					1340.9533073929964,
					728.842412451362
				],
				[
					501.9455252918288,
					1016.1113813229573
				]
			],
			"shape_type": "polygon",
			"flags": {}
		}
	],
	"lineColor": [
		0,
		255,
		0,
		128
	],
	"fillColor": [
		255,
		0,
		0,
		128
	],
	"imagePath": "14_00354.jpg",
	"imageData": REMOVED,
	"imageHeight": 2160,
	"imageWidth": 3840
} 

OUTPUT:

0 0.0 1.157684326171875 0.011974516369047629 1.1255100795200894 0.9420340401785714 0.811614990234375 2.2072402419747084 0.38553675788849706 2.8028673698929962 0.18274923242947474 3.33984375 0.0 3.3557461393482493 0.0032061268847276263 2.8835332961309526 0.170745849609375 2.18017578125 0.41597638811383925 1.309524714250973 0.7117601684095332 0.49018117704280156 0.9922962708232005

I'm using:

  • labelme2yolo Version: 0.0.9
  • Python 3.11.0
  • Windows 11

Also, 99 of the 922 annotation files have this issue; the rest seem fine. I double-checked all the affected files, and all of their points are within the image size range. I attached the file as .txt (GitHub doesn't accept .json) so the issue can be reproduced if needed.
14_00354.txt
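
One way to narrow this down is to divide the input points by the reported output values and see which divisor was actually applied. The sketch below does that for three points copied from the JSON and OUTPUT above. The helper name implied_divisors is hypothetical, and the result is only an observation from these numbers, not a confirmed diagnosis of the converter.

```python
# Three (x, y) points copied from the JSON above and the matching OUTPUT values.
points = [(0.0, 1185.46875), (12.261904761904772, 1152.5223214285716), (3420.0, 0.0)]
reported = [(0.0, 1.157684326171875), (0.011974516369047629, 1.1255100795200894), (3.33984375, 0.0)]
image_width, image_height = 3840, 2160  # imageWidth / imageHeight from the JSON


def implied_divisors(points, reported):
    """Recover the divisor that was applied to each non-zero coordinate."""
    divisors = []
    for (x, y), (rx, ry) in zip(points, reported):
        if rx:
            divisors.append(round(x / rx, 3))
        if ry:
            divisors.append(round(y / ry, 3))
    return divisors


# What normalizing by the JSON's own dimensions would give instead.
expected = [(x / image_width, y / image_height) for x, y in points]

print(implied_divisors(points, reported))
print(expected)
```

For these points, both the x and y values appear to have been divided by 1024 rather than by 3840 and 2160. That would be consistent with the converter taking the image size from somewhere other than the imageWidth and imageHeight fields, but that is a guess.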

labelme2yolo some error

Here is my JSON:
{ "version": "0.3.3", "flags": {}, "shapes": [], "imagePath": "KunChuansafebelt_dinglinhe20240119a_000119.jpg", "imageData": null, "imageHeight": 1080, "imageWidth": 1920 }
I'm trying to convert it to a YOLO dataset using this command:
labelme2yolo --json_dir json --val_size 0.15 --test_size 0.15

I'm using:
[two screenshots, not preserved]

The error:
[screenshot, not preserved]

I know the reason: the imageData field of the JSON is empty, so the image cannot be saved. I know how to work around it, but I would like the author to fix it in the tool as well.
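
A possible workaround, sketched in Python: use the embedded imageData when it is present, and otherwise read the file named by imagePath relative to the JSON file. The function name load_image_bytes is hypothetical, and this is not how the tool itself is implemented.

```python
import base64
import json
from pathlib import Path


def load_image_bytes(json_path):
    """Return the image bytes for one LabelMe JSON file."""
    json_path = Path(json_path)
    annotation = json.loads(json_path.read_text(encoding="utf-8"))
    image_data = annotation.get("imageData")
    if image_data:
        # LabelMe stores the embedded image as a base64 string.
        return base64.b64decode(image_data)
    # imageData is null or empty: fall back to the file next to the JSON.
    image_path = json_path.parent / annotation["imagePath"]
    return image_path.read_bytes()
```

If neither source is available, read_bytes raises FileNotFoundError, which at least names the missing image instead of failing later.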

Doubt about the result format

YOLOv[X] uses the following format:

[two images, not preserved]

That is:

  • One row per object.
  • The columns are class, x_center, y_center, width, and height.
  • These box coordinates must be normalized to the dimensions of the image (i.e. have values between 0 and 1).
  • Class numbers are zero-indexed (start from 0).

That is five columns in total.
Why does labelme2yolo produce nine columns?

Ex.:
6 0.30028237951807224 0.6704261490406068 0.4816327811244979 0.6704261490406068 0.4816327811244979 0.8132195448460507 0.30028237951807224 0.8132195448460507
0 -0.000571034136546178 0.47830209727800094 0.12583458835341366 0.47830209727800094 0.12583458835341366 0.7038933511825078 -0.000571034136546178 0.7038933511825078
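
Each nine-column line above reads as a class id followed by four (x, y) corner points, i.e. the rectangle written out as a polygon. Assuming that reading is right, the sketch below collapses such a line into the five-column form. The function name polygon_line_to_bbox is hypothetical and is not part of labelme2yolo.

```python
def polygon_line_to_bbox(line):
    """Convert 'class x1 y1 x2 y2 ...' into 'class x_center y_center width height'."""
    parts = line.split()
    class_id = parts[0]
    coords = [float(v) for v in parts[1:]]
    if len(coords) < 2 or len(coords) % 2:
        raise ValueError("expected an even number of coordinates")
    xs, ys = coords[0::2], coords[1::2]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    x_center = min(xs) + width / 2
    y_center = min(ys) + height / 2
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# First example line from the issue above.
line = (
    "6 0.30028237951807224 0.6704261490406068 0.4816327811244979 0.6704261490406068 "
    "0.4816327811244979 0.8132195448460507 0.30028237951807224 0.8132195448460507"
)
print(polygon_line_to_bbox(line))
```

The sketch does not clamp values, so a slightly negative coordinate such as the one in the second example line carries through to the box.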

the pip package output a different file name

The labelme2yolo pip package produces different output from @rooneysh's labelme2yolo Python program.
The output text file names are different (randomly generated), whereas the original writes each text file with the same name as the file it was created from.
Also, when running it with output_format set to polygon, the values differ from the original's output with the --seg flag, which I'm inclined to believe is the correct one.

Retrain - Classes need to keep the old index and not be discovered

Great code, guy!

I need to retrain on my dataset. The problem is that I need to keep the old class indices for retraining. Your code discovers classes and adds them to the list in the order they are encountered.
Can you add a parameter to specify all classes before starting?
Thank you.
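
The request amounts to a fixed label-to-index mapping built from a user-supplied, ordered list instead of from discovery order. A minimal sketch of that idea follows; the function names and the example labels are hypothetical, and it does not describe how the tool's LABEL_LIST argument is implemented.

```python
def build_label_map(label_list):
    """Map each label to its position in an ordered, user-supplied list."""
    label_map = {}
    for label in label_list:
        if label in label_map:
            raise ValueError(f"duplicate label: {label}")
        label_map[label] = len(label_map)
    return label_map


def class_id_for(label, label_map):
    """Look up a label, failing loudly instead of silently adding a new class."""
    if label not in label_map:
        raise ValueError(f"label not in the predefined list: {label}")
    return label_map[label]


# Hypothetical labels, given in the order used by the earlier training run.
label_map = build_label_map(["cable", "connector", "clamp"])
print(class_id_for("connector", label_map))
```

Because the indices depend only on the list, they stay the same no matter which files are processed first.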

Train, Validation and Test split overlap

I was reviewing the files that were split by labelme2yolo and noticed that the function creates splits with overlap.

In my case I have 1231 files, and I passed val_size=0.15 and test_size=0.1. As a result, the files were split as follows:

train=922
validation=185
test=124

Issues:

The number of files per split seems fine, but:

  1. When I move all split files (images or annotations) into a single folder, the file manager reports that some files already exist, and I skipped those. This resulted in a total of 1077 files, which means 154 files were not utilized.
  2. I verified the duplicates with a Python script:
  • Overlap between Train and Val: 141 files
  • Overlap between Train and Test: 0 files
  • Overlap between Val and Test: 13 files
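
A check like the one described can be sketched as follows. This is not the reporter's script; the function names are hypothetical, and it assumes files in different splits count as duplicates when they share a base name.

```python
from itertools import combinations
from pathlib import Path


def split_stems(split_dir):
    """Collect file names without extension from one split folder."""
    return {p.stem for p in Path(split_dir).iterdir() if p.is_file()}


def find_overlaps(splits):
    """Return the shared names for every pair of splits."""
    return {
        (a, b): splits[a] & splits[b]
        for a, b in combinations(sorted(splits), 2)
    }


# Tiny in-memory example; real use would call split_stems() on each
# labels/ or images/ subfolder of YOLODataset.
example = {"train": {"a", "b"}, "val": {"b", "c"}, "test": {"d"}}
for pair, shared in find_overlaps(example).items():
    print(pair, len(shared))
```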

I'm currently trying to find out why this happens, but could you please check it as well?

This issue was also reported in the original repository: rooneysh/Labelme2YOLO#5
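
For comparison, a split that cannot overlap can be sketched by shuffling the file list once and slicing it into three consecutive parts. The function name split_disjoint is hypothetical. Rounding the counts up is an assumption chosen because it reproduces the 922/185/124 counts reported above; it is not taken from the tool's source.

```python
import math
import random


def split_disjoint(files, val_size=0.2, test_size=0.0, seed=42):
    """Shuffle once, then slice into train/val/test parts that cannot overlap."""
    files = sorted(files)  # fixed starting order so the seed is reproducible
    random.Random(seed).shuffle(files)
    # round() removes floating-point noise before rounding up.
    n_val = math.ceil(round(len(files) * val_size, 9))
    n_test = math.ceil(round(len(files) * test_size, 9))
    val = files[:n_val]
    test = files[n_val:n_val + n_test]
    train = files[n_val + n_test:]
    return train, val, test


# Hypothetical file names matching the dataset size from the issue.
names = [f"img_{i:04d}.json" for i in range(1231)]
train, val, test = split_disjoint(names, val_size=0.15, test_size=0.1)
print(len(train), len(val), len(test))
```

Since every file appears in exactly one slice, the three parts always add up to the original file count.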

Issue converting labelme JSON to YOLODataset

Here is my labelme JSON:

{ "version": "5.3.1", "flags": {}, "shapes": [ { "label": "Center", "points": [ [ 1400, 900 ], [ 1500, 900 ], [ 1500, 1000 ], [ 1400, 1000 ] ], "group_id": null, "description": "", "shape_type": "Rectangle", "flags": {} }, { "label": "LeftGaurd", "points": [ [ 1320, 920 ], [ 1420, 920 ], [ 1420, 1020 ], [ 1320, 1020 ] ], "group_id": null, "description": "", "shape_type": "Circle", "flags": {} }, { "label": "LeftTackle", "points": [ [ 1240, 920 ], [ 1340, 920 ], [ 1340, 1020 ], [ 1240, 1020 ] ], "group_id": null, "description": "", "shape_type": "Circle", "flags": {} } ], "imagePath": "img/latest-00-QBShotGun-WRLeftHashmarkOn-WRRightHashmarkOn-FeatherLeft-WRRightHashmarkOff-FeatherRight-normal-16686378232057195.png", "imageData": "", "imageHeight": 1200, "imageWidth": 2500 }

I'm trying to convert it to a YOLO dataset using this command:
labelme2yolo --json_dir json --val_size 0.15 --test_size 0.15

It generates the folders YOLODataset -> images, labels -> test, train, val, but the folders are empty.

I'm using:

Python 3.10.12
YoloV5
labelme2yolo 0.1.2 

Please help with this issue.
Thanks
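
One thing worth checking in the JSON above is the shape_type values. LabelMe itself writes them in lowercase (for example "rectangle" and "circle"), while this file uses "Rectangle" and "Circle". Whether the converter skips shapes with unrecognized types is an assumption, not something confirmed here. The sketch below simply lists shapes whose type is not one of the usual lowercase values; the function name and the set of known types are illustrative.

```python
# Shape types as LabelMe commonly writes them (lowercase); illustrative list.
KNOWN_SHAPE_TYPES = {"polygon", "rectangle", "circle", "line", "point", "linestrip"}


def unexpected_shape_types(annotation):
    """Return (index, label, shape_type) for shapes with an unrecognized type."""
    problems = []
    for index, shape in enumerate(annotation.get("shapes", [])):
        shape_type = shape.get("shape_type")
        if shape_type not in KNOWN_SHAPE_TYPES:
            problems.append((index, shape.get("label"), shape_type))
    return problems


# Reduced version of the annotation from the issue above.
annotation = {
    "shapes": [
        {"label": "Center", "shape_type": "Rectangle"},
        {"label": "LeftGaurd", "shape_type": "Circle"},
    ]
}
print(unexpected_shape_types(annotation))
```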

file name changed

When I use labelme2yolo to convert my labels, the converted file names become strings of hexadecimal digits. How can I fix this?

Read ImageData from JSON

The new version seems to have stopped reading imageData from the JSON and now uses imagePath instead.

Double fields

Hello, while converting LabelMe .json files to YOLO format, instead of getting 5 numbers on each line, I am getting 9 (the bbox coordinates are doubled).

Cannot convert to yolov8n segmentation format

I prepared my dataset in LabelMe, and whenever I try to convert it to YOLO format, the operation completes successfully, but I always seem to get bounding boxes rather than segments. I even set the output_format argument to "polygon", but still got bboxes.
