Comments (5)
If you are using a CSV for token classification, you need to use stringified lists. An example is provided here: https://huggingface.co/docs/autotrain/token_classification
Your example doesn't look like the correct data format.
Also, it's better to use the JSONL format for a token classification task, like this:
{"tokens": ["I", "love", "Paris"], "tags": ["O", "O", "B-LOC"]}
{"tokens": ["I", "live", "in", "New", "York"], "tags": ["O", "O", "O", "B-LOC", "I-LOC"]}
.
.
.
from autotrain-advanced.
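For anyone generating this file from code, here is a minimal sketch of how the JSONL layout above could be produced with Python's standard library. The `examples` list, the `to_jsonl` helper, and the `train.jsonl` file name are illustrative assumptions, not part of AutoTrain.

```python
import json

# Hypothetical example rows; replace with your own tokenized sentences.
examples = [
    (["I", "love", "Paris"], ["O", "O", "B-LOC"]),
    (["I", "live", "in", "New", "York"], ["O", "O", "O", "B-LOC", "I-LOC"]),
]


def to_jsonl(rows):
    """Serialize (tokens, tags) pairs as one JSON object per line."""
    lines = []
    for tokens, tags in rows:
        # Each token needs exactly one tag, so fail early on a mismatch.
        if len(tokens) != len(tags):
            raise ValueError(f"{len(tokens)} tokens but {len(tags)} tags: {tokens}")
        lines.append(json.dumps({"tokens": tokens, "tags": tags}, ensure_ascii=False))
    return "\n".join(lines) + "\n"


jsonl_text = to_jsonl(examples)

# "train.jsonl" is a placeholder file name.
with open("train.jsonl", "w", encoding="utf-8") as f:
    f.write(jsonl_text)
```

The length check is there because a row with more tokens than tags (or the reverse) is an easy mistake to make when building the file by hand.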
I've stringified the data but still get the same error. I also ran a test with your example data above and got the same error. Here is the CSV data I used:
![Screenshot 2024-04-24 at 09 33 24](https://private-user-images.githubusercontent.com/50191908/325022289-56fee37d-34c0-40ef-8221-b71eed9a7f07.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTg4MDU0ODIsIm5iZiI6MTcxODgwNTE4MiwicGF0aCI6Ii81MDE5MTkwOC8zMjUwMjIyODktNTZmZWUzN2QtMzRjMC00MGVmLTgyMjEtYjcxZWVkOWE3ZjA3LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MTklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjE5VDEzNTMwMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTZkOTFjNzU5ZTU1YzFlMDNlMzUzYzczZmY0YTYyZDUwOGZiMGMzZTgwYjhiOTdkMjIyZjg3NTIyZTkzYzYyYzMmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.D7Dh0uQy64-p2CBssJk3JoGghPsjnBpU6wFdsqGNG2Y)
from autotrain-advanced.
Thanks for reporting this issue. It turned out to be bigger than I had expected. The issue is now resolved in version 0.7.62+.
Sample JSONL:
{"tokens": ["I", "love", "Paris"], "tags": ["O", "O", "B-LOC"]}
{"tokens": ["I", "live", "in", "New", "York"], "tags": ["O", "O", "O", "B-LOC", "I-LOC"]}
{"tokens": ["I", "love", "Paris"], "tags": ["O", "O", "B-LOC"]}
{"tokens": ["I", "live", "in", "New", "York"], "tags": ["O", "O", "O", "B-LOC", "I-LOC"]}
{"tokens": ["I", "love", "Paris"], "tags": ["O", "O", "B-LOC"]}
{"tokens": ["I", "live", "in", "New", "York"], "tags": ["O", "O", "O", "B-LOC", "I-LOC"]}
{"tokens": ["I", "love", "Paris"], "tags": ["O", "O", "B-LOC"]}
{"tokens": ["I", "live", "in", "New", "York"], "tags": ["O", "O", "O", "B-LOC", "I-LOC"]}
{"tokens": ["I", "love", "Paris"], "tags": ["O", "O", "B-LOC"]}
{"tokens": ["I", "live", "in", "New", "York"], "tags": ["O", "O", "O", "B-LOC", "I-LOC"]}
{"tokens": ["I", "love", "Paris"], "tags": ["O", "O", "B-LOC"]}
{"tokens": ["I", "live", "in", "New", "York"], "tags": ["O", "O", "O", "B-LOC", "I-LOC"]}
{"tokens": ["I", "love", "Paris"], "tags": ["O", "O", "B-LOC"]}
{"tokens": ["I", "live", "in", "New", "York"], "tags": ["O", "O", "O", "B-LOC", "I-LOC"]}
{"tokens": ["I", "love", "Paris"], "tags": ["O", "O", "B-LOC"]}
{"tokens": ["I", "live", "in", "New", "York"], "tags": ["O", "O", "O", "B-LOC", "I-LOC"]}
{"tokens": ["I", "love", "Paris"], "tags": ["O", "O", "B-LOC"]}
{"tokens": ["I", "live", "in", "New", "York"], "tags": ["O", "O", "O", "B-LOC", "I-LOC"]}
Sample CSV:
tokens,tags
"['I', 'love', 'Paris']","['O', 'O', 'B-LOC']"
"['I', 'live', 'in', 'New', 'York']","['O', 'O', 'O', 'B-LOC', 'I-LOC']"
"['I', 'love', 'Paris']","['O', 'O', 'B-LOC']"
"['I', 'live', 'in', 'New', 'York']","['O', 'O', 'O', 'B-LOC', 'I-LOC']"
"['I', 'love', 'Paris']","['O', 'O', 'B-LOC']"
"['I', 'live', 'in', 'New', 'York']","['O', 'O', 'O', 'B-LOC', 'I-LOC']"
"['I', 'love', 'Paris']","['O', 'O', 'B-LOC']"
"['I', 'live', 'in', 'New', 'York']","['O', 'O', 'O', 'B-LOC', 'I-LOC']"
"['I', 'love', 'Paris']","['O', 'O', 'B-LOC']"
"['I', 'live', 'in', 'New', 'York']","['O', 'O', 'O', 'B-LOC', 'I-LOC']"
"['I', 'love', 'Paris']","['O', 'O', 'B-LOC']"
"['I', 'live', 'in', 'New', 'York']","['O', 'O', 'O', 'B-LOC', 'I-LOC']"
"['I', 'love', 'Paris']","['O', 'O', 'B-LOC']"
"['I', 'live', 'in', 'New', 'York']","['O', 'O', 'O', 'B-LOC', 'I-LOC']"
"['I', 'love', 'Paris']","['O', 'O', 'B-LOC']"
from autotrain-advanced.
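As a rough sketch of what "stringified list" means for the CSV sample above: each cell holds the Python string form of a list, and the CSV writer wraps the cell in double quotes because it contains commas. The `rows` data and the `to_csv` and `parse_csv` helpers below are hypothetical names for illustration. Reading the cells back with `ast.literal_eval` is only an assumption for checking a file locally; the thread does not say how AutoTrain parses these columns internally.

```python
import ast
import csv
import io

# Hypothetical example rows; replace with your own tokenized sentences.
rows = [
    (["I", "love", "Paris"], ["O", "O", "B-LOC"]),
    (["I", "live", "in", "New", "York"], ["O", "O", "O", "B-LOC", "I-LOC"]),
]


def to_csv(rows):
    """Write (tokens, tags) pairs as CSV with stringified list cells."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["tokens", "tags"])
    for tokens, tags in rows:
        # str(list) yields "['I', 'love', 'Paris']"; csv adds the outer quotes.
        writer.writerow([str(tokens), str(tags)])
    return buffer.getvalue()


def parse_csv(text):
    """Read the CSV back and turn each stringified cell into a real list."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (ast.literal_eval(row["tokens"]), ast.literal_eval(row["tags"]))
        for row in reader
    ]


csv_text = to_csv(rows)
round_trip = parse_csv(csv_text)
```

Round-tripping the file this way is a quick check that every cell is a well-formed list and that tokens and tags still line up before uploading.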
Please factory rebuild your AutoTrain Space and make sure it's on version 0.7.62 or above.
from autotrain-advanced.
It worked. Thanks for your fast support on this :)
from autotrain-advanced.