huggingface / autotrain-advanced

🤗 AutoTrain Advanced

Home Page: https://huggingface.co/autotrain

License: Apache License 2.0

Makefile 0.18% Python 88.07% Dockerfile 0.35% Jupyter Notebook 2.67% HTML 8.73%

autotrain deep-learning huggingface machine-learning natural-language-processing natural-language-understanding python

autotrain-advanced's Issues

Doc pages are missing --max_models

Hello,

--max_models is a required parameter. However, it doesn't appear in the task doc pages, e.g. https://huggingface.co/docs/autonlp/multi_class_classification.html

Thank you :)

How to use BERT trained model from Jupyter Notebook to another Ubuntu 20.04 server

We have fine-tuned our BERT model for text2text generation. It works fine in the Jupyter notebook, but when I use the same trained model on another Ubuntu server, it fails. This is my first post, so please bear with me. Generating output for short sentences works fine, but for long sentences I get the following error:

At most 4 tokens in tensor([ 2, 2, 2, 2, 44763, 44763, 2, 44763]) can be equal to eos_token_id: 2. Make sure tensor([ 2, 2, 2, 2, 44763, 44763, 2, 44763]) are corrected.

My output generation code is:

from simpletransformers.seq2seq import Seq2SeqModel
#logging.basicConfig(level=logging.INFO)
#transformers_logger = logging.getLogger("transformers")
#transformers_logger.setLevel(logging.ERROR)
model = Seq2SeqModel(
    encoder_decoder_type="bart", encoder_decoder_name="PATHOFMODEL",use_cuda=False,)
while True:
    original = input("Enter text to paraphrase: ")
    to_predict = [original]
    preds = model.predict(to_predict)
    for pred in preds[0]:
        print(pred)

This code works fine on the notebook server where I trained the model. However, when I install all the dependencies on a plain Ubuntu server and run the same code with the trained model files, it works for some sentences but not for others.

Here's the complete issue on StackOverflow.
https://stackoverflow.com/q/67195582
Someone told me to change the TensorFlow version. I tried it. It worked for one day. After that, I faced the same problem again.

Can we handle a 4:96 imbalanced dataset?

The minority class has only around 300 samples.

Text generation task?

Any plans to implement this?

Training failed

Hello,

Tried to train a model using AutoNLP and all 5 models failed. I verified the input CSVs are valid by loading them using datasets.

Any idea?

The project is called "appliances" and the owner is "mostrovsky"

Brazilian Portuguese (pt-BR) language support

Really anxious for this feature!

specifying hub_model in create_project not working

I tried to create a project with the given configuration:

project = client.create_project(name="project", task="single_column_regression", language="en", max_models=1, hub_model="EleutherAI/gpt-neo-2.7B")

and keep getting

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-23-f15ed7bdca2f> in <module>
----> 1 project = client.create_project(name="project", task="single_column_regression", language="en", max_models=1, hub_model="EleutherAI/gpt-neo-2.7B")
      2

/usr/local/lib/python3.7/site-packages/autonlp/autonlp.py in create_project(self, name, task, language, max_models, hub_model)
     95             },
     96         }
---> 97         json_resp = http_post(path="/projects/create", payload=payload, token=self.token).json()
     98         proj_name = json_resp["proj_name"]
     99         created = json_resp["created"]

/usr/local/lib/python3.7/site-packages/autonlp/utils.py in http_post(path, token, payload, domain, suppress_logs, **kwargs)
     64     except requests.exceptions.ConnectionError:
     65         raise UnreachableAPIError("❌ Failed to reach AutoNLP API, check your internet connection")
---> 66     response.raise_for_status()
     67     return response
     68 

/usr/local/lib/python3.7/site-packages/requests/models.py in raise_for_status(self)
    941 
    942         if http_error_msg:
--> 943             raise HTTPError(http_error_msg, response=self)
    944 
    945     def close(self):

HTTPError: 400 Client Error: Bad Request for url: https://api.autonlp.huggingface.co/projects/create

The same happens if I specify other models, such as gpt2. It works when I set hub_model="", though. Is the problem that the models I am trying to specify do not support the single_column_regression task?

Cannot upload training set from CLI

When I try to upload a training set from the CLI, following the instructions in the README, I get the following error:

If not specifying `clone_from`, you need to pass Repository a valid git clone.
Traceback (most recent call last):
  File "/Users/****/opt/anaconda3/bin/autonlp", line 8, in <module>
    sys.exit(main())
  File "/Users/****/opt/anaconda3/lib/python3.8/site-packages/autonlp/cli/autonlp.py", line 40, in main
    command.run()
  File "/Users/****/opt/anaconda3/lib/python3.8/site-packages/autonlp/cli/upload.py", line 109, in run
    project.upload(filepaths=files, split=self._split, col_mapping=col_maps)
  File "/Users/****/opt/anaconda3/lib/python3.8/site-packages/autonlp/project.py", line 171, in upload
    dataset_repo = Repository(
  File "/Users/****/opt/anaconda3/lib/python3.8/site-packages/huggingface_hub/repository.py", line 69, in __init__
    raise ValueError(
ValueError: If not specifying `clone_from`, you need to pass Repository a valid git clone.

I get the same error from the Python API too.

Arabic language support

Is it possible for the Arabic language?

How to eliminate AutoNLP projects

Projects can obviously be created, but is there a command to delete them? Or where are those definitions stored (what is the location of the config file)?

add Russian language

Regression available?

On the website it says that regression is one of the available tasks. In the documentation it's not mentioned. Is it (already) available?

JSONDecodeError when uploading a valid CSV

When attempting to upload a CSV training set for my model, I receive a JSONDecodeError. I tried uploading my smaller validation set too, but it also failed. I'm not entirely sure why JSON decoders are even being run against a CSV file.

At first I thought maybe the CSV was invalid, but it checks out. I am not sure how to debug this problem.

Any help is greatly appreciated! Thank you.

Valid CSV

$ csvclean ~/training_set.csv
No errors.

Example CSV data

col_one,col_two
TRUE,"Lorem ipsum dolor sit amet, consectetur adipiscing elit"
FALSE,"Ut id ex luctus ""with quoted text inside"" vitae tincidunt nibh"
TRUE,"Nam ligula nibh, dapibus eget justo vitae"
FALSE,"Cras sed molestie enim. Etiam facilisis erat id bibendum"

Upload attempt

$ autonlp upload --project my_project \
	--split train \
	--col_mapping col_one:target,col_two:text \
	--files ~/training_set.csv

> INFO    Uploading files for project: my_project
> INFO    🗝 Retrieving credentials from config...
> INFO    ☁ Retrieving project 'my_project' from AutoNLP...
> INFO    🔄 Refreshing project status...
> INFO    🔄 Refreshing uploaded files information...
> INFO    🔄 Refreshing models information...
> INFO    🔄 Refreshing cost information...
> INFO    ✅ Successfully loaded project: 'my_project'!
> INFO    Mapping: {'col_one': 'target', 'col_two': 'text'}
Traceback (most recent call last):
  File "/usr/local/bin/autonlp", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/autonlp/cli/autonlp.py", line 57, in main
    details = err.response.json().get("detail")
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 900, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
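
The traceback above shows the failure happening inside the CLI's own error handler: it calls `.json()` on an error response, and the decoder fails at the first character, which suggests the response body was not JSON at all (for example empty or HTML). A minimal sketch of a more defensive way to read such a body follows. The helper name `extract_detail` is hypothetical and is not part of autonlp; the example bodies are made up for illustration.

```python
import json


def extract_detail(body: str) -> str:
    """Return the "detail" field of a JSON error body, or the raw text if it is not JSON.

    Hypothetical helper for illustration; not part of the autonlp package.
    """
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        # The body is empty, HTML, or plain text: fall back to the raw text.
        return body.strip() or "<empty response body>"
    if isinstance(parsed, dict):
        return str(parsed.get("detail", parsed))
    return str(parsed)


# Illustrative bodies only; these are not real AutoNLP responses.
print(extract_detail('{"detail": "Invalid column mapping"}'))
print(extract_detail("<html>502 Bad Gateway</html>"))
print(extract_detail(""))
```

With a fallback like this, the user would see the underlying server message instead of a secondary JSONDecodeError that hides it.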

Environment

$ autonlp --version
0.3.1

$ python -V
Python 3.9.5

$ pip -V
pip 21.1.3

Failed to upload processed data files to huggingface hub

INFO Fetching info for project: Summarization
INFO 🗝 Retrieving credentials from config...
INFO ☁ Retrieving project 'Summarization' from AutoNLP...
INFO 🔄 Refreshing project status...
INFO 🔄 Refreshing uploaded files information...
INFO 🔄 Refreshing models information...
INFO 🔄 Refreshing cost information...
INFO ✅ Successfully loaded project: 'Summarization'!
AutoNLP Project (id # 2068)

 • Name:        Summarization
 • Owner:       hiiamsid
 • Status:      ❌ Failed to upload processed data files to the huggingface hub
 • Task:        Summarization
 • Created at:  2021-10-18 06:33 Z
 • Last update: 2021-10-18 10:55 Z

💰 Project current cost: USD 481.79

~~~~~~~~~~~~~~ Files ~~~~~~~~~~~~~~

Dataset ID:
hiiamsid/autonlp-data-Summarization

📁 spanish_paraphrase_train.csv (id # 1775)
   • Split:             train
   • Processing status: ✅ Success!
   • Last update:       2021-10-18 10:44 Z
📁 es_paraphrase_test.csv (id # 1776)
   • Split:             valid
   • Processing status: ✅ Success!
   • Last update:       2021-10-18 10:43 Z

Add a list_project command to the CLI and Python package

Users should be able to list their projects

Swedish language support

Docs - Entity Extraction page is empty

Hi all, thanks for this amazing product!

I am currently testing various features on the trial version, but the Entity Extraction documentation page at https://huggingface.co/docs/autonlp/entity_extraction.html is currently empty. Is this feature not offered yet? Or can I find its documentation elsewhere? Thanks!

TypeError: get_project() missing 1 required positional argument: 'is_eval' when uploading

When uploading a dataset, upload.py raises a TypeError stating that get_project() is missing the is_eval argument.

$ autonlp upload --project sentiment_detection --split train \
               --col_mapping review:text,sentiment:target \
               --files ~/datasets/train.csv
> INFO    Uploading files for project: sentiment_detection
Traceback (most recent call last):
  File "/Users/sb/.pyenv/versions/test/bin/autonlp", line 8, in <module>
    sys.exit(main())
  File "/Users/sb/.pyenv/versions/test/lib/python3.9/site-packages/autonlp/cli/autonlp.py", line 54, in main
    command.run()
  File "/Users/sb/.pyenv/versions/test/lib/python3.9/site-packages/autonlp/cli/upload.py", line 82, in run
    project = client.get_project(name=self._name)
TypeError: get_project() missing 1 required positional argument: 'is_eval'

If I manually revise upload.py as follows

project = client.get_project(name=self._name, is_eval=False)

things work as they should. Likewise, this happens when trying to run the train command.

$ autonlp train --project sentiment_detection
> INFO    Starting Training For Project: sentiment_detection
Traceback (most recent call last):
  File "/Users/sb/.pyenv/versions/test/bin/autonlp", line 8, in <module>
    sys.exit(main())
  File "/Users/sb/.pyenv/versions/test/lib/python3.9/site-packages/autonlp/cli/autonlp.py", line 54, in main
    command.run()
  File "/Users/sb/.pyenv/versions/test/lib/python3.9/site-packages/autonlp/cli/train.py", line 34, in run
    project = client.get_project(name=self._name)
TypeError: get_project() missing 1 required positional argument: 'is_eval'

I'm using autonlp version 0.3.0.

$ autonlp --version
0.3.0
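
The workaround above passes `is_eval=False` at each call site. Another way to get the same effect, sketched here under the assumption that `False` is the intended value for upload and train, is to give the parameter a default so that callers passing only `name` keep working. `FakeClient` is a hypothetical stand-in, not the library's real class.

```python
class FakeClient:
    """Hypothetical stand-in for the AutoNLP client; not the library's real code."""

    def get_project(self, name: str, is_eval: bool = False) -> dict:
        # With a default value, existing callers that pass only `name` no longer
        # raise "missing 1 required positional argument: 'is_eval'".
        return {"name": name, "is_eval": is_eval}


client = FakeClient()
print(client.get_project(name="sentiment_detection"))
print(client.get_project(name="sentiment_detection", is_eval=True))
```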

Malay Language Support

Is it possible to add the Malay Language into the service?

Is GPT-neo also considered

Will autonlp also train GPT-neo models for tasks like regression and classification?

Hindi language pipeline isn't available for coref

autonlp metrics --project is broken

How to reproduce:

autonlp metrics --project <project_name>

Expected output:

2021-02-25 14:02:18.178 | INFO     | autonlp.autonlp:_login_from_conf:67 - 🗝 Retrieving credentials from config...
Traceback (most recent call last):
  File "C:\Users\sbran\miniconda3\envs\autonlp-front\Scripts\autonlp-script.py", line 33, in <module>
    sys.exit(load_entry_point('autonlp', 'console_scripts', 'autonlp')())
  File "c:\users\sbran\documents\dev\huggingface\autonlp\src\autonlp\cli\autonlp.py", line 34, in main
    command.run()
  File "c:\users\sbran\documents\dev\huggingface\autonlp\src\autonlp\cli\metrics.py", line 33, in run
    _ = client.get_metrics(model_id=self._model_id, project=self._project)
  File "c:\users\sbran\documents\dev\huggingface\autonlp\src\autonlp\autonlp.py", line 146, in get_metrics
    _metrics = Metrics.from_json_resp(
  File "c:\users\sbran\documents\dev\huggingface\autonlp\src\autonlp\metrics.py", line 21, in from_json_resp
    language=json_resp["config"]["language"],
KeyError: 'config'

This is because we recently stripped config from the project API response in the backend, which in effect removed the language information.
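
The KeyError comes from indexing `json_resp["config"]["language"]` directly. A minimal sketch of a tolerant lookup is shown below; the function name `language_from_response` and the `"unknown"` default are assumptions for illustration, not the fix that was actually applied.

```python
def language_from_response(json_resp: dict, default: str = "unknown") -> str:
    """Read config.language if present, otherwise return a default.

    Hypothetical helper; the real response schema is only known from the traceback.
    """
    config = json_resp.get("config") or {}
    return config.get("language", default)


print(language_from_response({"config": {"language": "en"}}))
print(language_from_response({"id": 1}))
```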

Domain specific pre-training and Question Answering

Hello,
This is more an inquiry about future features than an "issue" with currently implemented functionality.

(1) Are you planning to add the QA task too?
(2) Are you also planning to offer the possibility to continue pre-training on a specific domain? (I include a chart to show what I mean.)

Readme: Python API

Parameter max_models not included in the create_project function call in the Readme.

Suggested:
project = client.create_project(name="sentiment_detection", task="binary_classification", language="en", max_models=5)

Choosing a separator

Hi! This issue is more of a recommendation.
When I tried to upload a .csv file to Hugging Face, I was not given the option to choose a separator.
I understand that .csv stands for "comma-separated values", but in some cases the file uses another separator, as in this case.

I think the problem will be resolved if I change the "|" to ",", but hopefully you can implement something like what I described above in the future.
Best regards!
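
Until a separator option exists, the conversion mentioned above can be done locally before uploading. The sketch below uses only Python's standard `csv` module; the function name `convert_delimiter` and the sample rows are made up for illustration. Going through `csv` rather than a plain text replace matters because fields that contain commas need to be quoted in the output.

```python
import csv
import io


def convert_delimiter(src, dst, old: str = "|", new: str = ",") -> None:
    """Copy delimited rows from src to dst, switching the delimiter.

    Hypothetical helper. For real files, open both with newline="" as the
    csv module documentation recommends.
    """
    reader = csv.reader(src, delimiter=old)
    writer = csv.writer(dst, delimiter=new, lineterminator="\n")
    for row in reader:
        # The writer quotes any field that contains the new delimiter.
        writer.writerow(row)


source = io.StringIO("label|text\npositive|Great product, would buy again\n")
target = io.StringIO()
convert_delimiter(source, target)
print(target.getvalue())
```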

Project is not in active state

My project status is still "created" and has not moved to the active state, so I am unable to train the model.

AutoNLP error with Git clone during data upload (python)

When calling the "project.upload" function in python I get the following error:
I've substituted the AutoNLP api token with "API_TOKEN" and the folder/model name with MODEL_PATH. How can I fix it?

CalledProcessError Traceback (most recent call last)
~/opt/anaconda3/envs/aitoci_py37/lib/python3.7/site-packages/huggingface_hub/repository.py in clone_from(self, repo_url, use_auth_token)
147 encoding="utf-8",
--> 148 cwd=self.local_dir,
149 )

~/opt/anaconda3/envs/aitoci_py37/lib/python3.7/subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
511 raise CalledProcessError(retcode, process.args,
--> 512 output=stdout, stderr=stderr)
513 return CompletedProcess(process.args, retcode, stdout, stderr)

CalledProcessError: Command '['git', 'clone', 'https://user:API_TOKEN@huggingface.co/datasets/MODEL_PATH', '.']' returned non-zero exit status 128.

During handling of the above exception, another exception occurred:

OSError Traceback (most recent call last)
in
5 col_mapping={
6 "document":"text",
----> 7 "summary":"target"})
8
9 # Upload the validation set

~/opt/anaconda3/envs/aitoci_py37/lib/python3.7/site-packages/autonlp/project.py in upload(self, filepaths, split, col_mapping, path_to_audio)
224 raise ValueError("'path_to_audio' must be provided when task is 'speech_recognition'")
225
--> 226 dataset_repo = self._clone_dataset_repo()
227 local_dataset_dir = dataset_repo.local_dir
228

~/opt/anaconda3/envs/aitoci_py37/lib/python3.7/site-packages/autonlp/project.py in _clone_dataset_repo(self)
364 local_dir=local_dataset_dir,
365 clone_from=clone_from,
--> 366 use_auth_token=self._token,
367 )
368 try:

~/opt/anaconda3/envs/aitoci_py37/lib/python3.7/site-packages/huggingface_hub/repository.py in __init__(self, local_dir, clone_from, use_auth_token, git_user, git_email)
59
60 if clone_from is not None:
---> 61 self.clone_from(repo_url=clone_from, use_auth_token=use_auth_token)
62 else:
63 if os.path.isdir(os.path.join(self.local_dir, ".git")):

~/opt/anaconda3/envs/aitoci_py37/lib/python3.7/site-packages/huggingface_hub/repository.py in clone_from(self, repo_url, use_auth_token)
218
219 except subprocess.CalledProcessError as exc:
--> 220 raise EnvironmentError(exc.stderr)
221
222 def git_config_username_and_email(

OSError: Cloning into '.'...
remote: Repository not found.
fatal: repository 'https://huggingface.co/datasets/MODEL_PATH/' not found

Japanese and Vietnamese Language support

Is it possible for the Japanese and Vietnamese languages?

Korean language support

Is it possible?

Swedish language support

... would be much appreciated!

Italian language support

Please, make it possible.

Is there any way I can help?
What is the procedure to support a new language?

Multi Class Classification | Base Config Issue

While setting up a project for multi-class classification, the very first step below fails:

(base)C:\WINDOWS\system32>autonlp create_project --name custom_classifier --language en --task multi_class_classification

INFO Creating project: custom_classifier with task: multi_class_classification
INFO 🗝 Retrieving credentials from config...
ERROR ❌ Oops! Something failed in AutoNLP backend..
ERROR Error code: 400; Details: 'Invalid config: 1 validation error for BaseConfig
max_models
field required (type=value_error.missing)'

Server 500 while trying to create a project

When I followed the instructions on GitHub, I got an error while creating a project with this line:
project = client.create_project(name="test-sentiment", task="multi_class_classification", language="en")

It gave the following errors:
autonlp.utils:http_post:75 - ❌ Operation failed! Details: Internal Server Error
...
...
HTTPError: 500 Server Error: Internal Server Error for url: https://api.autonlp.huggingface.co/projects/create

NOTE: I was able to log in successfully.
from autonlp import AutoNLP
client = AutoNLP()
client.login(token="MY HUGGINGFACE TOKEN")
2021-03-22 14:59:57.056 | INFO | autonlp.autonlp:login:51 - 🗝 Successfully logged in as gurkandy
2021-03-22 14:59:57.057 | INFO | autonlp.autonlp:login:58 - 🗝 Storing credentials in: MY HOME FOLDER

Status: Failed to process data files

Hi,
I have created a project "intent" under the user hepbc. I am able to upload data files, but I get the following when I try to get the project status:
AutoNLP Project (id # 163)

 • Name:        intent
 • Owner:       hepbc
 • Status:      ❌ Failed to process data files
 • Task:        Multi Class Classification
 • Created at:  2021-05-09 16:40 Z
 • Last update: 2021-05-18 14:39 Z

💰 Project current cost: USD 7.50

~~~~~~~~~~~~~~ Files ~~~~~~~~~~~~~~

Dataset ID:
hepbc/autonlp-data-intent

📁 train.csv (id # 186)
   • Split:             train
   • Processing status: ✅ Success!
   • Last update:       2021-05-10 15:00 Z
📁 train.csv (id # 244)
   • Split:             train
   • Processing status: ✅ Success!
   • Last update:       2021-05-18 14:58 Z
📁 valid.csv (id # 187)
   • Split:             valid
   • Processing status: ✅ Success!
   • Last update:       2021-05-10 15:01 Z
📁 valid.csv (id # 245)
   • Split:             valid
   • Processing status: ✅ Success!
   • Last update:       2021-05-18 14:59 Z

~~~~~~~~~~~~ Models ~~~~~~~~~~~

+----+--------+--------+--------------------+--------------------+
|    |   ID   | Status |   Creation date    |    Last update     |
+----+--------+--------+--------------------+--------------------+
| 🚀 | 163655 | start  | 2021-05-18 14:59 Z | 2021-05-18 14:59 Z |
| ⌚ | 163656 | queued | 2021-05-18 14:59 Z | 2021-05-18 14:59 Z |
| ⌚ | 163657 | queued | 2021-05-18 14:59 Z | 2021-05-18 14:59 Z |
| ⌚ | 163658 | queued | 2021-05-18 14:59 Z | 2021-05-18 14:59 Z |
| ⌚ | 163659 | queued | 2021-05-18 14:59 Z | 2021-05-18 14:59 Z |
+----+--------+--------+--------------------+--------------------+

I would appreciate some help in identifying the issue. Many thanks!

-BC

docs: metrics doesn't accept --model

Hi,
The README in the repo says I can use autonlp metrics --model xxx, but that doesn't work.

Also, the CLI doesn't have a --version flag, but Python says I'm on 0.0.1.

To support Bangla in autonlp

Please add Bangla to the supported languages so that it can cover Bangla use cases.

Login autonlp behind a proxy server

Hello,

My account is enabled for AutoNLP.
I'm following the page to install autonlp on Windows 10 with Python 3.8.5.

I tried the autonlp login via a terminal.

Command :
autonlp login --api-key MY_HUGGING_FACE_API_TOKEN

Results:

Traceback (most recent call last):
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\autonlp\utils.py", line 41, in http_get
    response = requests.get(
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\requests\api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\requests\api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\requests\sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\requests\sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\requests\adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/whoami-v2 (Caused by SSLError(SSLCertVerificationError(1, 'SSL: CERTIFICATE_VERIFY_FAILED certificate verify failed: self signed certificate in certificate chain (_ssl.c:1123)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\<my-username>\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\<my-username>\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\<my-username>\Dev\huggingface\venv\Scripts\autonlp.exe\__main__.py", line 7, in <module>
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\autonlp\cli\autonlp.py", line 52, in main
    command.run()
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\autonlp\cli\login.py", line 31, in run
    client.login(token=self._api_key)
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\autonlp\autonlp.py", line 41, in login
    auth_resp = http_get(path="/whoami-v2", domain=config.HF_API, token=token, token_prefix="Bearer")
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\autonlp\utils.py", line 45, in http_get
    raise UnreachableAPIError("❌ Failed to reach AutoNLP API, check your internet connection")
autonlp.utils.UnreachableAPIError: ❌ Failed to reach AutoNLP API, check your internet connection

Could you please tell me how to specify a proxy in the login command?

Many thanks for your help on this,
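
The first traceback shows the call going through `requests.get`, and the root error is a self-signed certificate in the chain, which is typical of a TLS-inspecting corporate proxy. By default, the requests library reads proxy settings from the `HTTP_PROXY` / `HTTPS_PROXY` environment variables and a custom CA bundle from `REQUESTS_CA_BUNDLE`. The sketch below sets them from Python before any request is made. The helper name, proxy URL, and certificate path are placeholders, and whether this resolves the autonlp login in this particular setup is an untested assumption.

```python
import os


def configure_proxy_env(proxy_url: str, ca_bundle_path: str) -> None:
    """Set the environment variables that requests consults by default.

    Hypothetical helper; the values passed below are placeholders.
    """
    os.environ["HTTPS_PROXY"] = proxy_url
    os.environ["HTTP_PROXY"] = proxy_url
    # Path to a PEM file containing the proxy's root certificate.
    os.environ["REQUESTS_CA_BUNDLE"] = ca_bundle_path


configure_proxy_env("http://proxy.example.com:8080", "/path/to/corporate-ca.pem")
print(os.environ["HTTPS_PROXY"])
```

The same variables can be set in the terminal session before running the login command, which avoids touching any Python code.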

Turkish language support

Is it possible?

UX: strip models from project str

Model list can be very long, resulting in a clogged output
We need ways for the user to filter / shorten this list

Inconsistent Cost Estimates

I get that cost estimates are uncertain, but I would expect different parts of the command line experience to agree on the estimate. I have a train set with 20K examples.

When I ask for an estimate directly, I get:

> autonlp estimate --num_train_samples 20000 --project_name intent_detection

> INFO    🗝 Retrieving credentials from config...
> INFO    ☁ Retrieving project 'intent_detection' from AutoNLP...
> INFO    🔄 Refreshing project status...
> INFO    🔄 Refreshing uploaded files information...
> INFO    🔄 Refreshing models information...
> INFO    🔄 Refreshing cost information...
> INFO    ✅ Successfully loaded project: 'intent_detection'!
Cost range: 7.5 - 12.5 USD

But the training command suggests a different price.

> autonlp train --project intent_detection             
                        
> INFO    Starting Training For Project: intent_detection
> INFO    🗝 Retrieving credentials from config...
> INFO    ☁ Retrieving project 'intent_detection' from AutoNLP...
> INFO    🔄 Refreshing project status...
> INFO    🔄 Refreshing uploaded files information...
> INFO    🔄 Refreshing models information...
> INFO    🔄 Refreshing cost information...
> INFO    ✅ Successfully loaded project: 'intent_detection'!
> INFO    🔄 Refreshing project status...
> INFO    🔄 Refreshing uploaded files information...
> INFO    🔄 Refreshing models information...
> INFO    🔄 Refreshing cost information...
> INFO    🔎 Calculating a cost estimate for the training...

💰 The training cost for this project will be in this range:
 USD 18.75 to USD 31.25

 Once training is complete, we will send you an email invoice for the actual training cost within that range.
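
A quick arithmetic check on the two ranges quoted above shows that both bounds differ by the same factor. This only describes the numbers in the report; what causes the factor is not stated anywhere in the output, so any explanation would be a guess.

```python
# Figures copied from the two command outputs above.
estimate_low, estimate_high = 7.5, 12.5   # autonlp estimate
train_low, train_high = 18.75, 31.25      # autonlp train

ratio_low = train_low / estimate_low
ratio_high = train_high / estimate_high
print(ratio_low, ratio_high)
```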

Download eval/test predictions? Predict a file at a time instead of a sentence?

I'd like to review the predictions made by the model on the eval data but don't see that kept anywhere. Maybe you could let autonlp predict accept a file of sentences instead of one sentence on the command line?

Billing status not validated

Hi,
I'm on the free plan and don't have payment details on file, but can still launch a training job.
If that's on purpose, thank you. If not ....

My username is talolard

How to know evaluation task is finished

Currently, for finetuning, once the training is launched, we can know when the finetuned models are ready:

project.train(noprompt=True)
project.refresh()
all_jobs_finished = all(job.status == "success" for job in project.training_jobs)

However, there is no analogous way to do this for evaluation. Currently I keep trying to clone the evaluation model repo until the clone succeeds, which happens once the evaluation is finished. In fact, I have to wait a little longer after the repo is cloned, because the README file is generated in a commit some time after the repo is created.

I wonder if it would be possible to implement something similar to the finetuning case, like:

evaluation_job = client.create_evaluation(...)
evaluation_job.refresh()
job_finished = (evaluation_job.status == "success")
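
Whichever status check ends up being available, the waiting itself can be wrapped in a small polling loop with a timeout. The sketch below is generic: `wait_until` is a hypothetical helper, and the demo uses a fake check instead of the AutoNLP API. For training, the check passed in could wrap the `project.refresh()` and `job.status` lines quoted earlier in this issue.

```python
import time
from typing import Callable


def wait_until(is_done: Callable[[], bool], interval_s: float = 30.0, timeout_s: float = 3600.0) -> bool:
    """Call is_done() repeatedly until it returns True or the timeout expires.

    Returns True on success and False on timeout. Hypothetical helper.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_done():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Demo with a fake check that reports success on its third call.
calls = {"n": 0}


def fake_check() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


print(wait_until(fake_check, interval_s=0.01, timeout_s=1.0))
```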

'sentence' could not be found

> INFO    Uploading files for project: intercom_sentiment_model
> INFO    🗝 Retrieving credentials from config...
> INFO    ☁ Retrieving project 'intercom_sentiment_model' from AutoNLP...
> INFO    🔄 Refreshing project status...
> INFO    🔄 Refreshing uploaded files information...
> INFO    🔄 Refreshing models information...
> INFO    🔄 Refreshing cost information...
> INFO    ✅ Successfully loaded project: 'intercom_sentiment_model'!
> INFO    Mapping: {'sentence': 'text', 'label': 'target'}
> INFO    [1/1] 🔎 Validating /Users/robzeydelis/Downloads/train.csv and column mapping...
> ERROR   ❌ Something went wrong!
> ERROR   Details:
> ERROR   Columns 'sentence' could not be found in the provided file (which has columns: 'sentence','label')

It is not finding the column sentence, yet it lists the column sentence in the parentheses. Does anyone know why? I tried removing all formatting and even creating a new CSV file, but nothing works. Any help would be much appreciated!
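
One possible cause, offered as an assumption and not confirmed by the log, is an invisible UTF-8 byte order mark (BOM) at the start of the file. Some spreadsheet tools add one when exporting CSV, and it makes the first header differ from `sentence` while printing identically. The sketch below builds a small in-memory example and compares the header read with the `utf-8` and `utf-8-sig` codecs; the sample data and the helper name `header` are made up.

```python
import csv
import io

# In-memory stand-in for a CSV file that starts with a UTF-8 BOM.
raw = "\ufeffsentence,label\nI love it,positive\n".encode("utf-8")


def header(data: bytes, encoding: str) -> list:
    """Decode the bytes with the given codec and return the first CSV row."""
    text = io.StringIO(data.decode(encoding), newline="")
    return next(csv.reader(text))


# repr() makes the hidden character visible in the first column name.
print([repr(c) for c in header(raw, "utf-8")])
print([repr(c) for c in header(raw, "utf-8-sig")])
```

If the first column name shows a leading `\ufeff` when the real file is read this way, re-saving the file without a BOM would be worth trying.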

Training failed how to restart training?

Training failed, apparently due to a server error or something similar. How do I restart it?
autonlp is the latest version; Python is 3.9.

gorkemgoknar@Gorkem-MacBook-Pro:~/Desktop/autonlptest$ autonlp project_info --name sentiment_turkish
2021-03-11 16:35:20.687 | INFO | autonlp.cli.project_info:run:28 - Fetching info for project: sentiment_turkish
2021-03-11 16:35:20.688 | INFO | autonlp.autonlp:_login_from_conf:66 - 🗝 Retrieving credentials from config...
2021-03-11 16:35:20.688 | INFO | autonlp.autonlp:get_project:109 - ☁ Retrieving project 'sentiment_turkish' from AutoNLP...
2021-03-11 16:35:21.205 | INFO | autonlp.project:refresh:195 - 🔄 Refreshing uploaded files information...
2021-03-11 16:35:21.700 | INFO | autonlp.project:refresh:200 - 🔄 Refreshing models information...
2021-03-11 16:35:22.206 | INFO | autonlp.autonlp:get_project:121 - ✅ Successfully loaded project: 'sentiment_turkish'!
AutoNLP Project (id # 29)

 • Name:        sentiment_turkish
 • Owner:       gorkemgoknar
 • Status:      ❌ Failed to download data files from the huggingface hub
 • Task:        Binary Classification
 • Created at:  2021-03-11 12:51 Z
 • Last update: 2021-03-11 13:34 Z

~~~~~~~~~~~~~~ Files ~~~~~~~~~~~~~~

Dataset ID:
gorkemgoknar/autonlp-data-sentiment_turkish

📁 turkish_movie_train.csv (id # 27)
   • Split:             train
   • Processing status: ❌ Failed: server error
   • Last update:       2021-03-11 13:20 Z
📁 turkish_movie_valid.csv (id # 28)
   • Split:             valid
   • Processing status: ❓ Unhandled status! Please update autonlp
   • Last update:       2021-03-11 13:32 Z

~~~~~~~~~~~~ Models ~~~~~~~~~~~

🤷 No train jobs started yet!

Training in status: Failed

Hello,

I tried to train 5 models for my binary_classification problem, but I got this error. How can I find out what went wrong so that I can fix it and try again?

Thank you.
📁 train.csv (id # 138)
   • Split:             train
   • Processing status: ✅ Success!
   • Last update:       2021-04-24 23:44 Z
📁 valid.csv (id # 139)
   • Split:             valid
   • Processing status: ✅ Success!
   • Last update:       2021-04-24 23:45 Z

+----+--------+--------+--------------------+--------------------+
|    |   ID   | Status |   Creation date    |    Last update     |
+----+--------+--------+--------------------+--------------------+
| ❌ | 128413 | failed | 2021-04-24 23:49 Z | 2021-04-24 23:57 Z |
| ❌ | 128414 | failed | 2021-04-24 23:49 Z | 2021-04-24 23:57 Z |
| ❌ | 128415 | failed | 2021-04-24 23:49 Z | 2021-04-24 23:57 Z |
| ❌ | 128416 | failed | 2021-04-24 23:49 Z | 2021-04-24 23:57 Z |
| ❌ | 128417 | failed | 2021-04-24 23:49 Z | 2021-04-24 23:57 Z |
+----+--------+--------+--------------------+--------------------+

Gujarati language support

Is it possible to add Gujarati language into the service?

Auto NLP authentication issue

I tried to authenticate with the command below, as per the documentation, but I get the following error:
autonlp login --api-key YOUR_HUGGING_FACE_API_TOKEN

Error Trace:

> autonlp login --api-key YOUR_HUGGING_FACE_API_TOKEN
Traceback (most recent call last):
  File "/home/sudharsan/anaconda3/envs/auto/bin/autonlp", line 8, in <module>
    sys.exit(main())
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/autonlp/cli/autonlp.py", line 56, in main
    command.run()
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/autonlp/cli/login.py", line 31, in run
    client.login(token=self._api_key)
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/autonlp/autonlp.py", line 43, in login
    auth_resp = http_get(path="/whoami-v2", domain=config.HF_API, token=token, token_prefix="Bearer")
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/autonlp/utils.py", line 43, in http_get
    url=domain + path, headers=get_auth_headers(token=token, prefix=token_prefix), **kwargs
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/requests/sessions.py", line 528, in request
    prep = self.prepare_request(req)
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/requests/sessions.py", line 466, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/requests/models.py", line 316, in prepare
    self.prepare_url(url, params)
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/requests/models.py", line 390, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL 'YOUR_HUGGING_FACE_API_TOKEN/whoami-v2': No schema supplied. Perhaps you meant http://YOUR_HUGGING_FACE_API_TOKEN/whoami-v2?

Setup:

OS: Ubuntu 18.04 LTS
Version of the Library used: autonlp==0.3.4
Python version: 3.7

Can anyone help to resolve the issue?

a question for using autonlp for very little data

Hello, I was planning to use autonlp for a personal project of mine for multi class classification. I have generated sentence embeddings for a lot of sentences and they have labels, which I am planning to use as data. I have very little data for training the model, around 200 data points. Shall I try using autonlp or shall I use something else? Thanks.
