huggingface / autotrain-advanced

🤗 AutoTrain Advanced

Home Page: https://huggingface.co/autotrain

License: Apache License 2.0

Makefile 0.18% Python 88.07% Dockerfile 0.35% Jupyter Notebook 2.67% HTML 8.73%
autotrain deep-learning huggingface machine-learning natural-language-processing natural-language-understanding python

autotrain-advanced's People

Contributors

abhishekkrthakur, albertvillanova, anshumank399, apolinario, davidberenstein1957, julien-c, knoodrake, lewtun, liveaverage, marcd123, mishig25, rishiraj, rtrompier, sbrandeis, standardai, stefan-it, tcmaps, thomwolf

autotrain-advanced's Issues

How to use BERT trained model from Jupyter Notebook to another Ubuntu 20.04 server

We fine-tuned our BERT model for text2text generation. It works fine in the Jupyter notebook, but when I use the same trained model on another Ubuntu server, it fails. This is my first post, so please bear with me. Generating output for short sentences works, but long sentences produce the following error:

At most 4 tokens in tensor([ 2, 2, 2, 2, 44763, 44763, 2, 44763]) can be equal to eos_token_id: 2. Make sure tensor([ 2, 2, 2, 2, 44763, 44763, 2, 44763]) are corrected.

My output generation code is:

from simpletransformers.seq2seq import Seq2SeqModel
#logging.basicConfig(level=logging.INFO)
#transformers_logger = logging.getLogger("transformers")
#transformers_logger.setLevel(logging.ERROR)
model = Seq2SeqModel(
    encoder_decoder_type="bart",
    encoder_decoder_name="PATHOFMODEL",
    use_cuda=False,
)
while True:
    original = input("Enter text to paraphrase: ")
    to_predict = [original]
    preds = model.predict(to_predict)
    for pred in preds[0]:
        print(pred)

This code works fine on the notebook server where I trained the model. I then installed all the dependencies on a plain Ubuntu server and ran this code with the trained model files. It works for some sentences but not for others.

Here's the complete issue on StackOverflow.
https://stackoverflow.com/q/67195582
Someone told me to change the TensorFlow version. I tried it. It worked for one day. After that, I faced the same problem again.
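
Since the same model behaves differently on the two machines, one thing worth comparing is the installed package versions on each. The following is a minimal sketch using only the standard library; the package names in the list are an assumption about what this setup depends on, not something confirmed in the report.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of relevant packages; adjust to your environment.
    names = ["simpletransformers", "transformers", "tokenizers", "torch"]
    for name, version in installed_versions(names).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

Running this on both the notebook server and the Ubuntu server and diffing the output shows whether the two environments actually match.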

Training failed

Hello,

Tried to train a model using AutoNLP and all 5 models failed. I verified the input CSVs are valid by loading them using datasets.

Any idea?

The project is called "appliances" and the owner is "mostrovsky"

specifying hub_model in create_project not working

I tried to create a project with the given configuration:

project = client.create_project(name="project", task="single_column_regression", language="en", max_models=1, hub_model="EleutherAI/gpt-neo-2.7B")

and keep getting

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-23-f15ed7bdca2f> in <module>
----> 1 project = client.create_project(name="project", task="single_column_regression", language="en", max_models=1, hub_model="EleutherAI/gpt-neo-2.7B")
      2

/usr/local/lib/python3.7/site-packages/autonlp/autonlp.py in create_project(self, name, task, language, max_models, hub_model)
     95             },
     96         }
---> 97         json_resp = http_post(path="/projects/create", payload=payload, token=self.token).json()
     98         proj_name = json_resp["proj_name"]
     99         created = json_resp["created"]

/usr/local/lib/python3.7/site-packages/autonlp/utils.py in http_post(path, token, payload, domain, suppress_logs, **kwargs)
     64     except requests.exceptions.ConnectionError:
     65         raise UnreachableAPIError("❌ Failed to reach AutoNLP API, check your internet connection")
---> 66     response.raise_for_status()
     67     return response
     68 

/usr/local/lib/python3.7/site-packages/requests/models.py in raise_for_status(self)
    941 
    942         if http_error_msg:
--> 943             raise HTTPError(http_error_msg, response=self)
    944 
    945     def close(self):

HTTPError: 400 Client Error: Bad Request for url: https://api.autonlp.huggingface.co/projects/create

The same happens if I specify other models like gpt2. It works when I set hub_model="", though. Is the problem that the models I try to specify do not support the single_column_regression task?

Cannot upload training set from CLI

When I try to upload a training set from the CLI as per the instructions given in the README, I get the following error:

If not specifying `clone_from`, you need to pass Repository a valid git clone.
Traceback (most recent call last):
  File "/Users/****/opt/anaconda3/bin/autonlp", line 8, in <module>
    sys.exit(main())
  File "/Users/****/opt/anaconda3/lib/python3.8/site-packages/autonlp/cli/autonlp.py", line 40, in main
    command.run()
  File "/Users/****/opt/anaconda3/lib/python3.8/site-packages/autonlp/cli/upload.py", line 109, in run
    project.upload(filepaths=files, split=self._split, col_mapping=col_maps)
  File "/Users/****/opt/anaconda3/lib/python3.8/site-packages/autonlp/project.py", line 171, in upload
    dataset_repo = Repository(
  File "/Users/****/opt/anaconda3/lib/python3.8/site-packages/huggingface_hub/repository.py", line 69, in __init__
    raise ValueError(
ValueError: If not specifying `clone_from`, you need to pass Repository a valid git clone.

Getting the same error from the python API too.

How to eliminate AutoNLP projects

There is obviously the possibility of creating projects, but is there a command to delete projects? Or where are those definitions stored (what is the location of the config file)?

Regression available?

On the website it says that regression is one of the available tasks. In the documentation it's not mentioned. Is it (already) available?

JSONDecodeError when uploading a valid CSV

When attempting to upload a CSV training set for my model, I receive a JSONDecodeError. I tried uploading my smaller validation set too, but it also failed. I'm not entirely sure why JSON decoders are even being run against a CSV file.

At first I thought maybe the CSV was invalid, but it checks out. I am not sure how to debug this problem.

Any help is greatly appreciated! Thank you.

Valid CSV

$ csvclean ~/training_set.csv
No errors.

Example CSV data

col_one,col_two
TRUE,"Lorem ipsum dolor sit amet, consectetur adipiscing elit"
FALSE,"Ut id ex luctus ""with quoted text inside"" vitae tincidunt nibh"
TRUE,"Nam ligula nibh, dapibus eget justo vitae"
FALSE,"Cras sed molestie enim. Etiam facilisis erat id bibendum"

Upload attempt

$ autonlp upload --project my_project \
	--split train \
	--col_mapping col_one:target,col_two:text \
	--files ~/training_set.csv

> INFO    Uploading files for project: my_project
> INFO    🗝 Retrieving credentials from config...
> INFO    ☁ Retrieving project 'my_project' from AutoNLP...
> INFO    🔄 Refreshing project status...
> INFO    🔄 Refreshing uploaded files information...
> INFO    🔄 Refreshing models information...
> INFO    🔄 Refreshing cost information...
> INFO    ✅ Successfully loaded project: 'my_project'!
> INFO    Mapping: {'col_one': 'target', 'col_two': 'text'}
Traceback (most recent call last):
  File "/usr/local/bin/autonlp", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/autonlp/cli/autonlp.py", line 57, in main
    details = err.response.json().get("detail")
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 900, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Environment

$ autonlp --version
0.3.1

$ python -V
Python 3.9.5

$ pip -V
pip 21.1.3
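
The traceback above shows the failure happening in `err.response.json()`, that is, while the CLI was trying to read an error message from the server, not while parsing the CSV. A rough sketch of a more defensive way to read such an error body follows; `error_detail` and `FakeResponse` are hypothetical names used only for illustration and are not part of autonlp.

```python
import json


def error_detail(response):
    """Return the 'detail' field of a JSON error body, or the raw text otherwise."""
    try:
        body = response.json()
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return response.text
    if isinstance(body, dict):
        return body.get("detail", response.text)
    return response.text


class FakeResponse:
    """Stand-in for an HTTP response object, used only in this illustration."""

    def __init__(self, text):
        self.text = text

    def json(self):
        return json.loads(self.text)


print(error_detail(FakeResponse('{"detail": "Invalid file"}')))
print(error_detail(FakeResponse("<html>Bad Gateway</html>")))
```

With a fallback like this, a non-JSON error page from the server would be shown as-is instead of being masked by a JSONDecodeError.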

Failed to upload processed data files to huggingface hub

INFO Fetching info for project: Summarization
INFO 🗝 Retrieving credentials from config...
INFO ☁ Retrieving project 'Summarization' from AutoNLP...
INFO 🔄 Refreshing project status...
INFO 🔄 Refreshing uploaded files information...
INFO 🔄 Refreshing models information...
INFO 🔄 Refreshing cost information...
INFO ✅ Successfully loaded project: 'Summarization'!
AutoNLP Project (id # 2068)

 • Name:        Summarization
 • Owner:       hiiamsid
 • Status:      ❌ Failed to upload processed data files to the huggingface hub
 • Task:        Summarization
 • Created at:  2021-10-18 06:33 Z
 • Last update: 2021-10-18 10:55 Z

💰 Project current cost: USD 481.79

~~~~~~~~~~~~~~ Files ~~~~~~~~~~~~~~

Dataset ID:
hiiamsid/autonlp-data-Summarization

📁 spanish_paraphrase_train.csv (id # 1775)
   • Split:             train
   • Processing status: ✅ Success!
   • Last update:       2021-10-18 10:44 Z
📁 es_paraphrase_test.csv (id # 1776)
   • Split:             valid
   • Processing status: ✅ Success!
   • Last update:       2021-10-18 10:43 Z

TypeError: get_project() missing 1 required positional argument: 'is_eval' when uploading

When uploading a dataset, upload.py raises a TypeError stating that get_project() is missing is_eval argument.

$ autonlp upload --project sentiment_detection --split train \
               --col_mapping review:text,sentiment:target \
               --files ~/datasets/train.csv
> INFO    Uploading files for project: sentiment_detection
Traceback (most recent call last):
  File "/Users/sb/.pyenv/versions/test/bin/autonlp", line 8, in <module>
    sys.exit(main())
  File "/Users/sb/.pyenv/versions/test/lib/python3.9/site-packages/autonlp/cli/autonlp.py", line 54, in main
    command.run()
  File "/Users/sb/.pyenv/versions/test/lib/python3.9/site-packages/autonlp/cli/upload.py", line 82, in run
    project = client.get_project(name=self._name)
TypeError: get_project() missing 1 required positional argument: 'is_eval'

If I manually revise upload.py as follows

project = client.get_project(name=self._name, is_eval=False)

things work as they should. Likewise, this happens when trying to run the train command.

$ autonlp train --project sentiment_detection
> INFO    Starting Training For Project: sentiment_detection
Traceback (most recent call last):
  File "/Users/sb/.pyenv/versions/test/bin/autonlp", line 8, in <module>
    sys.exit(main())
  File "/Users/sb/.pyenv/versions/test/lib/python3.9/site-packages/autonlp/cli/autonlp.py", line 54, in main
    command.run()
  File "/Users/sb/.pyenv/versions/test/lib/python3.9/site-packages/autonlp/cli/train.py", line 34, in run
    project = client.get_project(name=self._name)
TypeError: get_project() missing 1 required positional argument: 'is_eval'

I'm using autonlp version 0.3.0.

$ autonlp --version
0.3.0
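
The workaround above passes `is_eval=False` at each call site. An alternative, sketched here on a hypothetical stand-in class (not the actual autonlp client code), is to give the parameter a default so that existing callers keep working:

```python
class Client:
    """Hypothetical stand-in for the AutoNLP client, for illustration only."""

    def get_project(self, name, is_eval=False):
        # is_eval now has a default, so get_project(name=...) no longer raises TypeError.
        return {"name": name, "is_eval": is_eval}


client = Client()
project = client.get_project(name="sentiment_detection")
print(project)
```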

autonlp metrics --project is broken

How to reproduce:

autonlp metrics --project <project_name>

Actual output:

2021-02-25 14:02:18.178 | INFO     | autonlp.autonlp:_login_from_conf:67 - 🗝 Retrieving credentials from config...
Traceback (most recent call last):
  File "C:\Users\sbran\miniconda3\envs\autonlp-front\Scripts\autonlp-script.py", line 33, in <module>
    sys.exit(load_entry_point('autonlp', 'console_scripts', 'autonlp')())
  File "c:\users\sbran\documents\dev\huggingface\autonlp\src\autonlp\cli\autonlp.py", line 34, in main
    command.run()
  File "c:\users\sbran\documents\dev\huggingface\autonlp\src\autonlp\cli\metrics.py", line 33, in run
    _ = client.get_metrics(model_id=self._model_id, project=self._project)
  File "c:\users\sbran\documents\dev\huggingface\autonlp\src\autonlp\autonlp.py", line 146, in get_metrics
    _metrics = Metrics.from_json_resp(
  File "c:\users\sbran\documents\dev\huggingface\autonlp\src\autonlp\metrics.py", line 21, in from_json_resp
    language=json_resp["config"]["language"],
KeyError: 'config'

This is due to the fact that we recently stripped config from the project API response in the backend, which also removed the language information.
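
As an illustration of how the client side could tolerate the missing key, here is a small sketch; `language_from_response` is a hypothetical helper and not the fix that was actually applied in autonlp.

```python
def language_from_response(json_resp, default=None):
    """Read config.language from an API response dict without raising KeyError."""
    config = json_resp.get("config") or {}
    return config.get("language", default)


# Old-style response that still carries the config block.
print(language_from_response({"config": {"language": "en"}}))
# New-style response without config: falls back to the default.
print(language_from_response({"id": 1}, default="unknown"))
```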

Domain specific pre-training and Question Answering

Hello,
this is more an inquiry about future features than an "issue" with currently implemented functionality.

(1) Are you planning to add the QA task too?
(2) Are you also planning to offer the possibility of continuing pre-training on a specific domain? (I include a chart to show what I mean.)

image

Readme: Python API

Parameter max_models not included in the create_project function call in the Readme.

Suggested:
project = client.create_project(name="sentiment_detection", task="binary_classification", language="en", max_models=5)

Choosing a separator

Hi! This issue is more of a recommendation.
When I tried to upload a .csv file to Hugging Face, I didn't have the option to choose a separator.
I understand that .csv stands for "comma-separated values", but in some cases a .csv file uses another separator, as in this case.

image

I think the problem will be resolved if I change the "|" to ",", but hopefully you can implement a separator option in the future.
Best regards!
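
Until a separator option exists, the file can be converted locally. The sketch below uses Python's standard `csv` module rather than a plain find-and-replace, so that fields which themselves contain commas get quoted correctly; the sample data is made up for the example.

```python
import csv
import io


def convert_delimiter(src, dst, delimiter="|"):
    """Copy rows from a delimiter-separated stream into a comma-separated one."""
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow(row)


# In-memory demo; for real files, open both with newline="" as the csv docs advise.
source = io.StringIO("text|target\nhello, world|1\n")
out = io.StringIO()
convert_delimiter(source, out)
print(out.getvalue())
```

Note that the field `hello, world` comes out quoted, which a naive replacement of "|" with "," would not do.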

AutoNLP error with Git clone during data upload (python)

When calling the "project.upload" function in python I get the following error:
I've substituted the AutoNLP api token with "API_TOKEN" and the folder/model name with MODEL_PATH. How can I fix it?


CalledProcessError Traceback (most recent call last)
~/opt/anaconda3/envs/aitoci_py37/lib/python3.7/site-packages/huggingface_hub/repository.py in clone_from(self, repo_url, use_auth_token)
147 encoding="utf-8",
--> 148 cwd=self.local_dir,
149 )

~/opt/anaconda3/envs/aitoci_py37/lib/python3.7/subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
511 raise CalledProcessError(retcode, process.args,
--> 512 output=stdout, stderr=stderr)
513 return CompletedProcess(process.args, retcode, stdout, stderr)

CalledProcessError: Command '['git', 'clone', 'https://user:API_TOKEN@huggingface.co/datasets/MODEL_PATH', '.']' returned non-zero exit status 128.

During handling of the above exception, another exception occurred:

OSError Traceback (most recent call last)
in
5 col_mapping={
6 "document":"text",
----> 7 "summary":"target"})
8
9 # Upload the validation set

~/opt/anaconda3/envs/aitoci_py37/lib/python3.7/site-packages/autonlp/project.py in upload(self, filepaths, split, col_mapping, path_to_audio)
224 raise ValueError("'path_to_audio' must be provided when task is 'speech_recognition'")
225
--> 226 dataset_repo = self._clone_dataset_repo()
227 local_dataset_dir = dataset_repo.local_dir
228

~/opt/anaconda3/envs/aitoci_py37/lib/python3.7/site-packages/autonlp/project.py in _clone_dataset_repo(self)
364 local_dir=local_dataset_dir,
365 clone_from=clone_from,
--> 366 use_auth_token=self._token,
367 )
368 try:

~/opt/anaconda3/envs/aitoci_py37/lib/python3.7/site-packages/huggingface_hub/repository.py in __init__(self, local_dir, clone_from, use_auth_token, git_user, git_email)
59
60 if clone_from is not None:
---> 61 self.clone_from(repo_url=clone_from, use_auth_token=use_auth_token)
62 else:
63 if os.path.isdir(os.path.join(self.local_dir, ".git")):

~/opt/anaconda3/envs/aitoci_py37/lib/python3.7/site-packages/huggingface_hub/repository.py in clone_from(self, repo_url, use_auth_token)
218
219 except subprocess.CalledProcessError as exc:
--> 220 raise EnvironmentError(exc.stderr)
221
222 def git_config_username_and_email(

OSError: Cloning into '.'...
remote: Repository not found.
fatal: repository 'https://huggingface.co/datasets/MODEL_PATH/' not found

Italian language support

Please, make it possible.

Is there any way I can help?
What is the procedure to support a new language?

Multi Class Classification | Base Config Issue

While setting up a project for multi-class classification, the very first step below fails:

(base)C:\WINDOWS\system32>autonlp create_project --name custom_classifier --language en --task multi_class_classification

INFO Creating project: custom_classifier with task: multi_class_classification
INFO 🗝 Retrieving credentials from config...
ERROR ❌ Oops! Something failed in AutoNLP backend..
ERROR Error code: 400; Details: 'Invalid config: 1 validation error for BaseConfig
max_models
field required (type=value_error.missing)'

Server 500 while trying to create a project

When I followed the instructions on GitHub, I got an error while creating a project with this line:
project = client.create_project(name="test-sentiment", task="multi_class_classification", language="en")

It gave the following errors:
autonlp.utils:http_post:75 - ❌ Operation failed! Details: Internal Server Error
...
...
HTTPError: 500 Server Error: Internal Server Error for url: https://api.autonlp.huggingface.co/projects/create

NOTE: I was able to log in successfully.
from autonlp import AutoNLP
client = AutoNLP()
client.login(token="MY HUGGINGFACE TOKEN")
2021-03-22 14:59:57.056 | INFO | autonlp.autonlp:login:51 - 🗝 Successfully logged in as gurkandy
2021-03-22 14:59:57.057 | INFO | autonlp.autonlp:login:58 - 🗝 Storing credentials in: MY HOME FOLDER

Status: Failed to process data files

Hi,
I have created a project "intent" under user hepbc. I am able to upload data files, but I get the following when I try to get the project status:
AutoNLP Project (id # 163)

 • Name:        intent
 • Owner:       hepbc
 • Status:      ❌ Failed to process data files
 • Task:        Multi Class Classification
 • Created at:  2021-05-09 16:40 Z
 • Last update: 2021-05-18 14:39 Z

💰 Project current cost: USD 7.50

~~~~~~~~~~~~~~ Files ~~~~~~~~~~~~~~

Dataset ID:
hepbc/autonlp-data-intent

📁 train.csv (id # 186)
   • Split:             train
   • Processing status: ✅ Success!
   • Last update:       2021-05-10 15:00 Z
📁 train.csv (id # 244)
   • Split:             train
   • Processing status: ✅ Success!
   • Last update:       2021-05-18 14:58 Z
📁 valid.csv (id # 187)
   • Split:             valid
   • Processing status: ✅ Success!
   • Last update:       2021-05-10 15:01 Z
📁 valid.csv (id # 245)
   • Split:             valid
   • Processing status: ✅ Success!
   • Last update:       2021-05-18 14:59 Z

~~~~~~~~~~~~ Models ~~~~~~~~~~~

+----+--------+--------+--------------------+--------------------+
|    |   ID   | Status |   Creation date    |    Last update     |
+----+--------+--------+--------------------+--------------------+
| 🚀 | 163655 | start  | 2021-05-18 14:59 Z | 2021-05-18 14:59 Z |
| ⌚ | 163656 | queued | 2021-05-18 14:59 Z | 2021-05-18 14:59 Z |
| ⌚ | 163657 | queued | 2021-05-18 14:59 Z | 2021-05-18 14:59 Z |
| ⌚ | 163658 | queued | 2021-05-18 14:59 Z | 2021-05-18 14:59 Z |
| ⌚ | 163659 | queued | 2021-05-18 14:59 Z | 2021-05-18 14:59 Z |
+----+--------+--------+--------------------+--------------------+

Request some help in identifying the issue. Many thanks!

-BC

docs: metrics doesn't accept --model

Hi,
The README in the repo says I can use autonlp metrics --model xxx, but that doesn't work.
image

Also the CLI doesn't have a --version flag. But python says I'm on 0.0.1
image

Login autonlp behind a proxy server

Hello,

My account is enabled for AutoNLP.
I'm following the page to install autonlp on Windows 10 with Python 3.8.5.

I tried the autonlp login via a terminal.

Command :
autonlp login --api-key MY_HUGGING_FACE_API_TOKEN

Results:


Traceback (most recent call last):
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\autonlp\utils.py", line 41, in http_get
    response = requests.get(
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\requests\api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\requests\api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\requests\sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\requests\sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\requests\adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/whoami-v2 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1123)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\<my-username>\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\<my-username>\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\<my-username>\Dev\huggingface\venv\Scripts\autonlp.exe\__main__.py", line 7, in <module>
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\autonlp\cli\autonlp.py", line 52, in main
    command.run()
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\autonlp\cli\login.py", line 31, in run
    client.login(token=self._api_key)
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\autonlp\autonlp.py", line 41, in login
    auth_resp = http_get(path="/whoami-v2", domain=config.HF_API, token=token, token_prefix="Bearer")
  File "c:\users\<my-username>\dev\huggingface\venv\lib\site-packages\autonlp\utils.py", line 45, in http_get
    raise UnreachableAPIError("❌ Failed to reach AutoNLP API, check your internet connection")
autonlp.utils.UnreachableAPIError: ❌ Failed to reach AutoNLP API, check your internet connection


Could you please tell me how to specify a proxy in the login command?

Many thanks for your help on this,

++
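
The traceback shows the request going through the `requests` library, which by default reads proxy and certificate settings from environment variables. A possible approach, sketched below, is to set `HTTPS_PROXY` and `REQUESTS_CA_BUNDLE` before logging in. The proxy address and certificate path are placeholders, and whether this resolves the error depends on the proxy setup, which is not confirmed here.

```python
import os

# Placeholder values: replace with your organisation's proxy and CA bundle.
os.environ["HTTPS_PROXY"] = "http://proxy.example.com:8080"
# A "self signed certificate in certificate chain" error often means the proxy
# re-signs TLS traffic; pointing requests at the corporate CA file may help.
os.environ["REQUESTS_CA_BUNDLE"] = r"C:\certs\corporate-ca.pem"

# From here on, code in this process that uses requests with default settings
# (for example a client.login(token=...) call) would pick these values up.
print(os.environ["HTTPS_PROXY"], os.environ["REQUESTS_CA_BUNDLE"])
```

The same two variables can also be set in the terminal before running the login command, since child processes inherit the environment.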

Inconsistent Cost Estimates

I get that cost estimates are uncertain, but I would expect different parts of the command line experience to agree on the estimate. I have a train set with 20K examples.

When I ask for an estimate directly, I get:

> autonlp estimate --num_train_samples 20000 --project_name intent_detection

> INFO    🗝 Retrieving credentials from config...
> INFO    ☁ Retrieving project 'intent_detection' from AutoNLP...
> INFO    🔄 Refreshing project status...
> INFO    🔄 Refreshing uploaded files information...
> INFO    🔄 Refreshing models information...
> INFO    🔄 Refreshing cost information...
> INFO    ✅ Successfully loaded project: 'intent_detection'!
Cost range: 7.5 - 12.5 USD

But the training command suggests a different price.

> autonlp train --project intent_detection             
                        
> INFO    Starting Training For Project: intent_detection
> INFO    🗝 Retrieving credentials from config...
> INFO    ☁ Retrieving project 'intent_detection' from AutoNLP...
> INFO    🔄 Refreshing project status...
> INFO    🔄 Refreshing uploaded files information...
> INFO    🔄 Refreshing models information...
> INFO    🔄 Refreshing cost information...
> INFO    ✅ Successfully loaded project: 'intent_detection'!
> INFO    🔄 Refreshing project status...
> INFO    🔄 Refreshing uploaded files information...
> INFO    🔄 Refreshing models information...
> INFO    🔄 Refreshing cost information...
> INFO    🔎 Calculating a cost estimate for the training...

💰 The training cost for this project will be in this range:
 USD 18.75 to USD 31.25

 Once training is complete, we will send you an email invoice for the actual training cost within that range.
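
A quick check of the two ranges shows that both bounds differ by the same factor. The sketch below only does the arithmetic on the numbers reported above; what causes the factor (for instance a different sample or model count being used) is not something the output reveals.

```python
from fractions import Fraction

# Ranges copied from the two command outputs above (USD).
estimate_range = (Fraction("7.5"), Fraction("12.5"))
train_range = (Fraction("18.75"), Fraction("31.25"))

# Ratio of the train-command bound to the estimate-command bound, low and high.
ratios = [t / e for t, e in zip(train_range, estimate_range)]
print([float(r) for r in ratios])
```

Both ratios come out as 2.5, so the two commands appear to scale the same base estimate differently rather than compute unrelated prices.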

Billing status not validated

Hi,
I'm on the free plan and don't have payment details on file, but can still launch a training job.
If that's on purpose, thank you. If not ....
image
image

My username is talolard

How to know evaluation task is finished

Currently, for finetuning, once the training is launched, we can know when the finetuned models are ready:

project.train(noprompt=True)
project.refresh()
all_jobs_finished = all(job.status == "success" for job in project.training_jobs)

However, we do not have an analogous way for evaluation. Currently, I keep trying to clone the evaluation model repo until the clone succeeds, which happens once the evaluation is finished. In fact, I have to wait a little longer after the repo is cloned, because the README file is generated in a commit some time after the repo is created.

I wonder if it would be possible to implement something similar to the finetuning case, like:

evaluation_job = client.create_evaluation(...)
evaluation_job.refresh()
job_finished = (evaluation_job.status == "success")
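
If an object with `refresh()` and `status` were available, as proposed above, waiting for completion could be wrapped in a small polling helper. This is only a sketch: `wait_for_status` and `FakeJob` are hypothetical names, and the set of terminal statuses is an assumption.

```python
import time


def wait_for_status(job, done=("success", "failed"), interval=30.0,
                    timeout=3600.0, sleep=time.sleep):
    """Call job.refresh() repeatedly until job.status is in `done` or timeout passes."""
    waited = 0.0
    while True:
        job.refresh()
        if job.status in done:
            return job.status
        if waited >= timeout:
            raise TimeoutError(f"job still '{job.status}' after {waited:.0f}s")
        sleep(interval)
        waited += interval


class FakeJob:
    """Stand-in job that walks through a fixed list of statuses (illustration only)."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self.status = None

    def refresh(self):
        self.status = next(self._statuses)


# The demo passes a no-op sleep so it finishes immediately.
final = wait_for_status(FakeJob(["queued", "start", "success"]), sleep=lambda s: None)
print(final)
```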

'sentence' could not be found

> INFO    Uploading files for project: intercom_sentiment_model
> INFO    🗝 Retrieving credentials from config...
> INFO    ☁ Retrieving project 'intercom_sentiment_model' from AutoNLP...
> INFO    🔄 Refreshing project status...
> INFO    🔄 Refreshing uploaded files information...
> INFO    🔄 Refreshing models information...
> INFO    🔄 Refreshing cost information...
> INFO    ✅ Successfully loaded project: 'intercom_sentiment_model'!
> INFO    Mapping: {'sentence': 'text', 'label': 'target'}
> INFO    [1/1] 🔎 Validating /Users/robzeydelis/Downloads/train.csv and column mapping...
> ERROR   ❌ Something went wrong!
> ERROR   Details:
> ERROR   Columns 'sentence' could not be found in the provided file (which has columns: 'sentence','label')

It is not finding the column sentence, yet it lists the column sentence in the parentheses. Does anyone know why? I tried removing all formatting and even creating a new CSV file, but nothing works. Any help would be much appreciated!
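
One possible cause, which is a guess and not confirmed for this file, is an invisible character such as a UTF-8 byte order mark at the start of the header, which makes the first column name differ from `sentence` even though it prints the same. The sketch below shows how to inspect the header with `repr` on made-up sample bytes.

```python
import csv
import io


def header_columns(raw_bytes, encoding="utf-8"):
    """Decode CSV bytes and return the list of column names from the first row."""
    text = raw_bytes.decode(encoding)
    return next(csv.reader(io.StringIO(text)))


# Sample data with a UTF-8 BOM, as some spreadsheet exports produce.
raw = b"\xef\xbb\xbfsentence,label\ngreat product,1\n"

print([repr(name) for name in header_columns(raw)])   # BOM is visible in repr
print(header_columns(raw, encoding="utf-8-sig"))      # utf-8-sig strips the BOM
```

Reading the real file's bytes with `open(path, "rb").read()` and passing them to `header_columns` would show whether the header contains anything unexpected.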

Training failed how to restart training?

Training failed, apparently due to a server error or something similar. How can I restart it?
autonlp is the latest version; Python is 3.9.

gorkemgoknar@Gorkem-MacBook-Pro:~/Desktop/autonlptest$ autonlp project_info --name sentiment_turkish
2021-03-11 16:35:20.687 | INFO | autonlp.cli.project_info:run:28 - Fetching info for project: sentiment_turkish
2021-03-11 16:35:20.688 | INFO | autonlp.autonlp:_login_from_conf:66 - 🗝 Retrieving credentials from config...
2021-03-11 16:35:20.688 | INFO | autonlp.autonlp:get_project:109 - ☁ Retrieving project 'sentiment_turkish' from AutoNLP...
2021-03-11 16:35:21.205 | INFO | autonlp.project:refresh:195 - 🔄 Refreshing uploaded files information...
2021-03-11 16:35:21.700 | INFO | autonlp.project:refresh:200 - 🔄 Refreshing models information...
2021-03-11 16:35:22.206 | INFO | autonlp.autonlp:get_project:121 - ✅ Successfully loaded project: 'sentiment_turkish'!
AutoNLP Project (id # 29)

 • Name:        sentiment_turkish
 • Owner:       gorkemgoknar
 • Status:      ❌ Failed to download data files from the huggingface hub
 • Task:        Binary Classification
 • Created at:  2021-03-11 12:51 Z
 • Last update: 2021-03-11 13:34 Z

~~~~~~~~~~~~~~ Files ~~~~~~~~~~~~~~

Dataset ID:
gorkemgoknar/autonlp-data-sentiment_turkish

📁 turkish_movie_train.csv (id # 27)
   • Split:             train
   • Processing status: ❌ Failed: server error
   • Last update:       2021-03-11 13:20 Z
📁 turkish_movie_valid.csv (id # 28)
   • Split:             valid
   • Processing status: ❓ Unhandled status! Please update autonlp
   • Last update:       2021-03-11 13:32 Z

~~~~~~~~~~~~ Models ~~~~~~~~~~~

🤷 No train jobs started yet!

Training in status: Failed

Hello,

I tried to train 5 models for my binary_classification problem, but I got this error. How can I find out what went wrong so I can fix it and try again?

Thank you.
📁 train.csv (id # 138)
   • Split:             train
   • Processing status: ✅ Success!
   • Last update:       2021-04-24 23:44 Z
📁 valid.csv (id # 139)
   • Split:             valid
   • Processing status: ✅ Success!
   • Last update:       2021-04-24 23:45 Z

+----+--------+--------+--------------------+--------------------+
|    |   ID   | Status |   Creation date    |    Last update     |
+----+--------+--------+--------------------+--------------------+
| ❌ | 128413 | failed | 2021-04-24 23:49 Z | 2021-04-24 23:57 Z |
| ❌ | 128414 | failed | 2021-04-24 23:49 Z | 2021-04-24 23:57 Z |
| ❌ | 128415 | failed | 2021-04-24 23:49 Z | 2021-04-24 23:57 Z |
| ❌ | 128416 | failed | 2021-04-24 23:49 Z | 2021-04-24 23:57 Z |
| ❌ | 128417 | failed | 2021-04-24 23:49 Z | 2021-04-24 23:57 Z |
+----+--------+--------+--------------------+--------------------+

Auto NLP authentication issue

Login authentication fails while using the Hugging Face API token.

I tried authenticating with the command below, as per the documentation, but I get the following error:
autonlp login --api-key YOUR_HUGGING_FACE_API_TOKEN

Error Trace:

> autonlp login --api-key YOUR_HUGGING_FACE_API_TOKEN
Traceback (most recent call last):
  File "/home/sudharsan/anaconda3/envs/auto/bin/autonlp", line 8, in <module>
    sys.exit(main())
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/autonlp/cli/autonlp.py", line 56, in main
    command.run()
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/autonlp/cli/login.py", line 31, in run
    client.login(token=self._api_key)
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/autonlp/autonlp.py", line 43, in login
    auth_resp = http_get(path="/whoami-v2", domain=config.HF_API, token=token, token_prefix="Bearer")
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/autonlp/utils.py", line 43, in http_get
    url=domain + path, headers=get_auth_headers(token=token, prefix=token_prefix), **kwargs
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/requests/sessions.py", line 528, in request
    prep = self.prepare_request(req)
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/requests/sessions.py", line 466, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/requests/models.py", line 316, in prepare
    self.prepare_url(url, params)
  File "/home/sudharsan/anaconda3/envs/auto/lib/python3.7/site-packages/requests/models.py", line 390, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL 'YOUR_HUGGING_FACE_API_TOKEN/whoami-v2': No schema supplied. Perhaps you meant http://YOUR_HUGGING_FACE_API_TOKEN/whoami-v2?

Setup:

  • OS: Ubuntu 18.04 LTS
  • Version of the Library used: autonlp==0.3.4
  • Python version: 3.7

Can anyone help to resolve the issue?

a question for using autonlp for very little data

Hello, I was planning to use autonlp for a personal project of mine for multi class classification. I have generated sentence embeddings for a lot of sentences and they have labels, which I am planning to use as data. I have very little data for training the model, around 200 data points. Shall I try using autonlp or shall I use something else? Thanks.
