Recent research on LAION-style datasets has found that they carry insufficient information: the alt-texts were disastrously lacking in information about the actual images.
It is almost a miracle that models trained on these open datasets work despite such discrepancies.
Google and OpenAI have found that synthetic captions are far more beneficial for downstream tasks, and the CapsFusion project attempts to annotate large-scale datasets synthetically.
Meta is likewise trying to build a high-quality refined dataset in a hierarchical way.
Unfortunately, we do not have the local resources to process such large databases in full, but we can cover the gap in several ways:
- Tag retrieval
- Focal crop → tag → grouping
- Tag-relevance-based reordering
These should help clarify what the tags actually refer to in the image.
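As a sketch of the last idea, tag-relevance-based reordering can be as simple as sorting the tag list by a per-tag relevance score. The `reorder_by_relevance` helper and the scores below are hypothetical; in practice the scores could come from a CLIP-style image-text similarity:

```python
def reorder_by_relevance(tags, scores):
    """Sort tags so the most image-relevant ones come first.

    `scores` maps tag -> relevance (e.g. CLIP image-text similarity);
    unknown tags sink to the end with a default score of 0.0.
    """
    return sorted(tags, key=lambda t: scores.get(t, 0.0), reverse=True)

# Hypothetical relevance scores, for illustration only.
scores = {"1girl": 0.92, "blonde_hair": 0.81, "outdoors": 0.44, "smile": 0.63}
tags = ["outdoors", "smile", "1girl", "blonde_hair"]
print(reorder_by_relevance(tags, scores))
# ['1girl', 'blonde_hair', 'smile', 'outdoors']
```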
This file provides a Gradio demo to extract Stealth-PNGInfo-style image metadata.
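Stealth-PNGInfo hides metadata in the least-significant bits of the image's alpha channel. The sketch below shows only that bit-packing idea on a plain list of RGBA pixel tuples; the real format additionally uses a magic signature and gzip compression, and in practice the pixels would be read with Pillow. All function names here are illustrative, not the demo's actual API:

```python
def embed_alpha_lsb(pixels, payload):
    """Write each bit of `payload` into the alpha-channel LSB of
    successive RGBA pixels (MSB-first within each byte)."""
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    out = list(pixels)
    for idx, bit in enumerate(bits):
        r, g, b, a = out[idx]
        out[idx] = (r, g, b, (a & ~1) | bit)
    return out

def extract_alpha_lsb(pixels, n_bytes):
    """Reassemble `n_bytes` bytes from the alpha-channel LSBs."""
    out = bytearray()
    acc = 0
    for idx in range(n_bytes * 8):
        acc = (acc << 1) | (pixels[idx][3] & 1)
        if idx % 8 == 7:
            out.append(acc)
            acc = 0
    return bytes(out)

# Round-trip on a tiny fully-opaque "image" of 64 RGBA pixels.
pixels = [(255, 255, 255, 255)] * 64
stego = embed_alpha_lsb(pixels, b"tags")
print(extract_alpha_lsb(stego, 4))  # b'tags'
```

Because only the alpha LSB changes, the embedded payload is visually undetectable but survives lossless PNG round-trips.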
This file is an example template for querying the GPT-4V API to obtain annotations based on an image and its tags.
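A minimal sketch of the request body such a template might build, assuming the OpenAI chat-completions message format with a base64 data-URL image. The function name and prompt wording are placeholders, and the actual API call is deliberately left out; check the current API reference before relying on the exact field layout:

```python
import base64

def build_vision_messages(image_bytes, tags,
                          prompt="Describe the image. Known tags: {tags}"):
    """Construct a chat-completions `messages` payload pairing one image
    with its booru tags (illustrative helper, not the repo's actual code)."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt.format(tags=" ".join(tags))},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }]
```

The returned list can then be passed as the `messages` argument of a chat-completions request against a vision-capable model.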
In the directory, each image should have a tag .txt file with the same base name.
The txt file format:
```
copyright:
character: erica_blandelli
general tags: 1girl arm_up blonde_hair blue_eyes breasts choker cleavage closed_mouth day full_body high_heels long_hair looking_at_viewer miniskirt outdoors pleated_skirt red_skirt sitting skirt smile solo
```
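Such a tag file can be parsed into a dictionary with a few lines; the function name is illustrative, and the sample below uses a shortened tag list:

```python
def parse_tag_file(text):
    """Split each 'key: value' line of a tag .txt file into a dict of
    tag lists; the tags themselves are space-separated."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.split()
    return fields

sample = (
    "copyright:\n"
    "character: erica_blandelli\n"
    "general tags: 1girl blonde_hair smile solo\n"
)
print(parse_tag_file(sample)["character"])  # ['erica_blandelli']
```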
A Gradio demo of the annotation is provided. A human should refine the GPT-4V annotations. A sanitize cell, which shows the 'unused tags', will be added soon.
For such work, one might need a company contact or a fair-use agreement for these annotations.
Unfortunately, most open-source / crawled datasets carry the risk of containing unclassified data.
And since data poisoning is becoming a severe issue, this problem will matter soon ™️.
If the booru database recorded 'where' in the image each specific tag applies, it could have served as a novel dataset that also carries semantic location information.