Comments (21)
I fixed the problem in ACCESS and EASSE. If you reinstall both from the latest GitHub versions, it should work fine.
Thanks for bringing that up.
from access.
/usr/bin/python3.6 /home/qwh/桌面/access/scripts/evaluate.py
[nltk_data] Downloading package stopwords to /home/qwh/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
[nltk_data] Downloading package perluniprops to /home/qwh/nltk_data...
[nltk_data] Package perluniprops is already up-to-date!
[nltk_data] Downloading package punkt to /home/qwh/nltk_data...
[nltk_data] Unzipping tokenizers/punkt.zip.
Evaluating pretrained model
BLEU: 76.08
SARI: 41.87
FKGL: 7.22
Quality estimation: {'Compression ratio': 0.94, 'Sentence splits': 1.2, 'Levenshtein similarity': 0.87, 'Exact matches': 0.04, 'Additions proportion': 0.16, 'Deletions proportion': 0.17, 'Lexical complexity score': 7.93}
Process finished with exit code 0
SUCCESS!! THANK YOU VERY MUCH!
from access.
Ok thanks, I think that's a different problem.
It seems that the sacrebleu package was reorganized recently.
Can you try again after running this command?
pip install sacrebleu==1.4.5
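A quick way to tell whether the installed sacrebleu still has the function EASSE calls is to probe for it before running the evaluation. This is only a sketch: `has_attribute` is a helper name made up for illustration, and the attribute name `tokenize_13a` is taken from the AttributeError reported elsewhere in this thread.

```python
import importlib


def has_attribute(module_name: str, attribute: str) -> bool:
    """Return True if the module imports and exposes the given attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attribute)


if __name__ == "__main__":
    # The traceback in this thread shows EASSE calling sacrebleu.tokenize_13a,
    # so that is the name worth checking for.
    if has_attribute("sacrebleu", "tokenize_13a"):
        print("sacrebleu exposes tokenize_13a")
    else:
        print("sacrebleu is missing or lacks tokenize_13a; try pinning 1.4.5")
```

If the check fails, pinning the version as suggested above is the fix proposed in this thread.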
from access.
Please add a more detailed description of your issue, and describe the steps that you already tried to fix the problem.
"Please" or "Thank you" won't hurt either.
from access.
Your version of easse is most likely not up to date.
from access.
Sorry. Sorry. Thank you for your help. Thank you for your patience.
1: git clone git@github.com:facebookresearch/access.git
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights and the repository exists.
2: Download zip -> unzip -> cd access
3: pip install -e .
Collecting easse@ git+git://github.com/feralvam/easse.git@5dce4474a72baa5a16e3764f1ed4225a1751dbf2 (from access==0.1)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://pypi.org/simple/easse/
4: https://github.com/feralvam/easse
Download zip -> unzip -> cd easse -> copy directory "easse" to "access"
5: https://github.com/facebookresearch/text-simplification-evaluation
Download zip -> unzip -> cd text-simplification -> copy directory "tseval" to "access"
6: python scripts/evaluate.py
Evaluating pretrained model
Downloading...
... 100% - 622 MB - 1.33 MB/s - 468s
Extracting...
Downloading...
... 100% - 623 MB - 1.36 MB/s - 459s
Extracting...
Traceback (most recent call last):
  File "scripts/evaluate.py", line 28, in <module>
    evaluate_simplifier_on_turkcorpus(simplifier, phase='test')
  File "/home/qwh/桌面/access/access/evaluation/general.py", line 32, in evaluate_simplifier_on_turkcorpus
    quality_estimation=True)
  File "/home/qwh/桌面/access/easse/cli.py", line 115, in evaluate_system_output
    orig_sents, refs_sents = get_orig_and_refs_sents(test_set, orig_sents_path, refs_sents_paths)
  File "/home/qwh/桌面/access/easse/cli.py", line 39, in get_orig_and_refs_sents
    orig_sents = get_orig_sents(test_set)
  File "/home/qwh/桌面/access/easse/utils/resources.py", line 76, in get_orig_sents
    return read_lines(TEST_SETS_PATHS[(test_set, 'orig')])
KeyError: ('turk', 'orig')
from access.
Ok, actually the problem is that the version of EASSE you are using is too recent and introduces breaking changes (we'll fix this soon).
In the meantime, please install the version of EASSE that was used when ACCESS was released. You can do so by running:
pip install git+git://github.com/feralvam/easse.git@5dce4474a72baa5a16e3764f1ed4225a1751dbf2
Make sure you have the latest pip installed:
pip install pip --upgrade
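To illustrate what the KeyError amounts to: the caller asks a lookup table for a test set name that the installed library version does not define. The sketch below uses a made-up table; the name `TEST_SETS_PATHS` comes from the traceback, but the keys, paths, and the `lookup_path` helper are invented for illustration and are not EASSE's actual contents.

```python
# Hypothetical stand-in for a lookup table keyed by (test_set, kind).
# The keys and paths here are invented for illustration only.
TEST_SETS_PATHS = {
    ("turkcorpus_test", "orig"): "path/to/turkcorpus.test.orig",
    ("turkcorpus_valid", "orig"): "path/to/turkcorpus.valid.orig",
}


def lookup_path(table, test_set, kind="orig"):
    """Look up a path and, on failure, list the test set names that do exist."""
    try:
        return table[(test_set, kind)]
    except KeyError:
        known = sorted({name for name, _ in table})
        raise KeyError(
            f"unknown test set {test_set!r}; this version knows {known}. "
            "The caller and the library may be from incompatible versions."
        ) from None


if __name__ == "__main__":
    print(lookup_path(TEST_SETS_PATHS, "turkcorpus_test"))
```

A message that lists the known names makes a version mismatch between two packages easier to spot than a bare KeyError.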
from access.
pip install git+git://github.com/feralvam/easse.git@5dce4474a72baa5a16e3764f1ed4225a1751dbf2
........
Successfully built easse
ERROR: access 0.1 has requirement nltk==3.4.5, but you'll have nltk 3.4.3 which is incompatible.
Installing collected packages: preshed, blis, plac, thinc
Attempting uninstall: preshed
Found existing installation: preshed 3.0.2
Uninstalling preshed-3.0.2:
Successfully uninstalled preshed-3.0.2
Rolling back uninstall of preshed
Moving to /home/qwh/.local/lib/python3.6/site-packages/preshed-3.0.2.dist-info/
from /home/qwh/.local/lib/python3.6/site-packages/~reshed-3.0.2.dist-info
Moving to /home/qwh/.local/lib/python3.6/site-packages/preshed/
from /home/qwh/.local/lib/python3.6/site-packages/~reshed
ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/dist-packages/preshed/__init__.pxd'
Consider using the --user option or check the permissions.
Before running it, I had uninstalled nltk.
Thank you!
from access.
I think the problem is not NLTK but a permission issue (see the last part of the output):
ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/dist-packages/preshed/__init__.pxd'
Consider using the --user option or check the permissions.
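One way to see ahead of time whether pip is likely to hit this error is to check whether the install directories are writable by the current user. This is a minimal sketch using only the standard library; `install_targets` and `is_writable` are illustrative helper names, and the directories Python reports may differ from the one pip actually uses (for example, a dist-packages path on Debian or Ubuntu, as in the log above).

```python
import os
import site
import sysconfig


def install_targets():
    """Return candidate install directories: the system one and the per-user one."""
    return {
        "system": sysconfig.get_paths()["purelib"],
        "user": site.getusersitepackages(),
    }


def is_writable(path: str) -> bool:
    """True if the current user may write to path, or to its nearest existing parent."""
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:
            return False
        path = parent
    return os.access(path, os.W_OK)


if __name__ == "__main__":
    for label, path in install_targets().items():
        status = "writable" if is_writable(path) else "NOT writable"
        print(f"{label}: {path} ({status})")
```

If the system directory is not writable but the user directory is, installing with the --user option (or inside a virtual environment) avoids the permission error.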
from access.
Great, closing the issue :)
from access.
Hi @louismartin, I tried exploring the project on Google Colab and I'm facing the same problem.
Am I missing something?
Traceback (most recent call last):
  File "/content/access/scripts/evaluate.py", line 27, in <module>
    print(evaluate_simplifier_on_turkcorpus(simplifier, phase='test'))
  File "/content/access/access/evaluation/general.py", line 31, in evaluate_simplifier_on_turkcorpus
    quality_estimation=True)
  File "/usr/local/lib/python3.6/dist-packages/easse/cli.py", line 105, in evaluate_system_output
    orig_sents, sys_sents, refs_sents = get_sents(test_set, orig_sents_path, sys_sents_path, refs_sents_paths)
  File "/usr/local/lib/python3.6/dist-packages/easse/cli.py", line 28, in get_sents
    orig_sents_path = TEST_SETS_PATHS[(test_set, 'orig')]
KeyError: ('turkcorpus_test_legacy', 'orig')
from access.
Hi, can you please run the following?
!pip freeze | grep easse
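The same check can be done from Python, which is handy when several packages are involved at once. This is a sketch under one assumption worth stating: `importlib.metadata` requires Python 3.8 or later, while the environment in this thread runs Python 3.6, where `pkg_resources` would be the alternative. The helper name `installed_versions` is made up for illustration.

```python
import importlib.metadata  # Python 3.8+; on 3.6 pkg_resources is the alternative


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    packages = ["easse", "sacrebleu", "fairseq", "nltk"]
    for name, version in installed_versions(packages).items():
        print(f"{name}=={version}" if version else f"{name} is not installed")
```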
from access.
fairseq==0.6.2
from access.
I'm sorry, I would like to check the easse version; fairseq does not seem to be the problem here.
from access.
1. !pip install -e /content/access/
2. !pip install --force-reinstall easse@git+git://github.com/feralvam/easse.git@5dce4474a72baa5a16e3764f1ed4225a1751dbf2
3. !pip install --force-reinstall fairseq@git+https://github.com/louismartin/fairseq.git@controllable-sentence-simplification
4. import nltk
   nltk.download('all')
5. !pip install nevergrad==0.2.3
   from nevergrad.instrumentation import var
6. !pip install git+git://github.com/facebookresearch/text-simplification-evaluation.git
7. !python /content/access/scripts/evaluate.py
8. !pip freeze | grep fairseq
This is the sequence of commands I ran, in case it helps with debugging.
from access.
> I'm sorry, I would like to check the easse version, fairseq does not seem to be the problem here
easse==0.1
from access.
Ok, thanks a lot.
I think I put the wrong version of easse in the README, my mistake.
Can you please run the following and try again?
pip install --force-reinstall easse@git+git://github.com/feralvam/easse.git@580ec953e4742c3ae806cc85d867c16e9f584505
from access.
Hi @louismartin
I used the command above and still got an error:
Traceback (most recent call last):
  File "/content/access/scripts/evaluate.py", line 27, in <module>
    print(evaluate_simplifier_on_turkcorpus(simplifier, phase='test'))
  File "/content/access/access/evaluation/general.py", line 31, in evaluate_simplifier_on_turkcorpus
    quality_estimation=True)
  File "/usr/local/lib/python3.6/dist-packages/easse/cli.py", line 130, in evaluate_system_output
    lowercase=lowercase)
  File "/usr/local/lib/python3.6/dist-packages/easse/bleu.py", line 22, in corpus_bleu
    sys_sents = [utils_prep.normalize(sent, lowercase, tokenizer) for sent in sys_sents]
  File "/usr/local/lib/python3.6/dist-packages/easse/bleu.py", line 22, in <listcomp>
    sys_sents = [utils_prep.normalize(sent, lowercase, tokenizer) for sent in sys_sents]
  File "/usr/local/lib/python3.6/dist-packages/easse/utils/preprocessing.py", line 12, in normalize
    normalized_sent = sacrebleu.tokenize_13a(sentence)
AttributeError: module 'sacrebleu' has no attribute 'tokenize_13a'
Also,
easse==0.2.1
fairseq==0.6.2
from access.
I also checked the sacrebleu version:
sacrebleu==1.4.7
from access.
This worked well!!! Thank you so much for your help <3
Got the result:
Evaluating pretrained model
{'bleu': 76.07533495738832, 'sari': 41.24344083480672, 'sari_legacy': 41.866226081519535, 'fkgl': 7.224963716884172, 'quality_estimation': {'Compression ratio': 0.9402640450938302, 'Sentence splits': 1.2000928505106778, 'Levenshtein similarity': 0.86603988608316, 'Exact copies': 0.03899721448467967, 'Additions proportion': 0.15796500942038566, 'Deletions proportion': 0.16669992308336373, 'Lexical complexity score': 7.925871582450909}}
Also I wanted to ask whether we can try the model on custom data? If yes, then is there a guide to do so? This may sound a bit silly but I'm new to NLP and exploring text simplification projects so it would mean a lot if you can share some guidelines for the same.
Thanks a ton!
from access.
You're welcome :)
Yes, you can do so with the generate.py script:
python generate.py < my_data.txt
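The same redirection can be done from Python, for instance inside a notebook. This is a sketch only: `run_with_stdin` is a helper name invented here, and it assumes the script reads its input from standard input and writes its result to standard output, which the thread does not spell out.

```python
import subprocess
import sys


def run_with_stdin(script_path: str, input_path: str) -> str:
    """Run a Python script with a file as its standard input; return its stdout."""
    with open(input_path, "rb") as source:
        result = subprocess.run(
            [sys.executable, script_path],
            stdin=source,
            stdout=subprocess.PIPE,
            check=True,  # raise CalledProcessError if the script exits non-zero
        )
    return result.stdout.decode("utf-8")


if __name__ == "__main__":
    # Equivalent in spirit to: python generate.py < my_data.txt
    print(run_with_stdin("generate.py", "my_data.txt"))
```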
from access.