Comments (4)
Hi,
Thanks for checking out the tool and opening this issue; I also saw the other issue. Sorry for the slow reply, I was out of the office earlier this week.
I suspect that issue #17 and this issue are related. I have seen this "PLEASE REQUEST AN API_KEY FROM NCBI" error when too many queries are submitted to NCBI via EUtils efetch in a short time. Are you running a lot of Cenote-Taker 2 jobs in parallel?
If so, I can add an argument to Cenote-Taker 2 that accepts your API key.
Let me know,
Mike
from cenote-taker2.
Thanks for your prompt reply. The failures were indeed caused by running jobs in parallel; I reran the failed jobs in single mode and they finally worked. So I agree that an argument for supplying a user API key would be the best way to resolve the problem.
Looking forward to your update.
Dear CT2 team, I ran into a "PLEASE REQUEST AN API_KEY FROM NCBI" error when running CT2; it seems to be caused by eutils. Does CT2 fetch the taxonomic rank information this way? How can I fix the problem? Looking forward to your reply. Thanks a lot.
(cenote-taker2) [yut@amms-sugon8401@006]$ cat 350.log
00000000000000000000000000
00000000000000000000000000
000000000^^^^^^^^000000000
000000^^^^^^^^^^^^^^000000
00000^^^^^CENOTE^^^^^00000
00000^^^^^TAKER!^^^^^00000
00000^^^^^^^^^^^^^^^^00000
000000^^^^^^^^^^^^^^000000
000000000^^^^^^^^000000000
00000000000000000000000000
00000000000000000000000000
Version 2.1.3
@@@@@@@@@@@@@@@@@@@@@@@@@
Your specified arguments:
original contigs: /mnt/data/share/database/Human_gut_virome_database/Public_Merged_Human_Virome_Database/SPLITS_8_PART/006/PMHVD_DREP_rep_seq.part_006.part_350.fasta
forward reads: /mnt/data/share/database/Human_gut_virome_database/Public_Merged_Human_Virome_Database/PMHVD_DREP_REP_SEQ_CT2_OUT/006/no_reads
reverse reads: /mnt/data/share/database/Human_gut_virome_database/Public_Merged_Human_Virome_Database/PMHVD_DREP_REP_SEQ_CT2_OUT/006/no_reads
title of this run: 350_out
Isolate source: unknown
collection date: unknown
metagenome_type: unknown
SRA run number: unknown
SRA experiment number: unknown
SRA sample number: unknown
Bioproject number: unknown
template file: /mnt/data/share/software/Cenote-Taker2/dummy_template.sbt
minimum circular contig length: 1000
minimum linear contig length: 1
virus domain database: standard
min. viral hallmarks for linear: 0
min. viral hallmarks for circular: 0
handle known seqs: do_not_check_knowns
contig assembler: unknown_assembler
DNA or RNA: DNA
HHsuite tool: hhblits
original or TPA: original
Do BLASTP?: no_blastp
Do Prophage Pruning?: True
Filter out plasmids?: True
Run BLASTN against nt? none
Location of Cenote scripts: /mnt/data/share/software/Cenote-Taker2
Location of scratch directory: none
GB of memory: 50
number of CPUs available for run: 10
Annotation mode? True
@@@@@@@@@@@@@@@@@@@@@@@@@
scratch space will not be used in this run
HHsuite database locations:
/mnt/data/share/software/Cenote-Taker2/NCBI_CD/NCBI_CD
/mnt/data/share/software/Cenote-Taker2/pfam_32_db/pfam
/mnt/data/share/software/Cenote-Taker2/pdb70/pdb70
no CRISPR file given
Prophage pruning requires --lin_minimum_hallmark_genes >= 1. changing to: --lin_minimum_hallmark_genes 1
time update: locating inputs: 11-18-21---21:44:16
/mnt/data/share/database/Human_gut_virome_database/Public_Merged_Human_Virome_Database/PMHVD_DREP_REP_SEQ_CT2_OUT/006/PMHVD_DREP_rep_seq.part_006.part_350.fasta
File with .fasta extension detected, attempting to keep contigs over 1 nt and find circular sequences with apc.pl
No circular contigs detected.
no reads provided or reads not found
No circular fasta files detected.
time update: running IRF for ITRs in non-circular contigs 11-18-21---21:44:18
time update: running prodigal on linear contigs 11-18-21---21:44:18
time update: running linear contigs with hmmscan against virus hallmark gene database: standard 11-18-21---21:46:01
Starting pruning of non-DTR/circular contigs with viral domains
pruning script opened
fna files found
mv: cannot stat './350_out7.AA.sorted.fasta': No such file or directory
cut: ./350_out7.AA.hmmscan.sort.out: No such file or directory
mv: cannot stat './350_out8.AA.sorted.fasta': No such file or directory
cut: ./350_out8.AA.hmmscan.sort.out: No such file or directory
mv: cannot stat './350_out18.AA.sorted.fasta': No such file or directory
cut: ./350_out18.AA.hmmscan.sort.out: No such file or directory
mv: cannot stat './350_out20.AA.sorted.fasta': No such file or directory
cut: ./350_out20.AA.hmmscan.sort.out: No such file or directory
mv: cannot stat './350_out23.AA.sorted.fasta': No such file or directory
cut: ./350_out23.AA.hmmscan.sort.out: No such file or directory
time update: HMMSCAN of common viral domains beginning 11-18-21---21:46:56
time update: making tables for hmmscan and rpsblast outputs 11-18-21---21:49:57
time update: running RPSBLAST on each sequence 11-18-21---21:50:05
/mnt/data/share/database/Human_gut_virome_database/Public_Merged_Human_Virome_Database/PMHVD_DREP_REP_SEQ_CT2_OUT/006/350_out/no_end_contigs_with_viral_domain/COMBINED_RESULTS_PRUNE.AA.rpsblast.out
time update: parsing tables into virus_signal.seq files for hmmscan and rpsblast outputs 11-18-21---21:50:27
time update: Identifying virus chunks, chromosomal junctions, and pruning contigs as necessary 11-18-21---21:50:50
Running file: 350_out10.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out11.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out12.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out13.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out14.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out15.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out16.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out17.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out19.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out1.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out21.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out22.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out24.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out25.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out26.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out2.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out3.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out4.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out5.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out6.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
Running file: 350_out9.virus_signal.seq
  Window +/- to the right ... Chunk_end Window midpoint
0      1                + ...      none            2500
0      1                + ...      none            2500
[2 rows x 6 columns]
time update: Making prophage table 11-18-21---21:53:10
FINISHED PRUNING CONTIGS WITH AT LEAST 1 VIRAL DOMAIN(S)
Grabbing ORFs wihout RPS-BLAST hits and separating them into individual files for HHsearch
time update: running HHsearch or HHblits 11-18-21---21:53:11
Combining tbl files from all search results AND fix overlapping ORF module
No ITR contigs with minimum hallmark genes found.
Annotating linear contigs
time update: running BLASTX, annotate linear contigs 11-18-21---21:53:11
429 Too Many Requests
PLEASE REQUEST AN API_KEY FROM NCBI
No do_post output returned from 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy&id=35344&rettype=full&retmode=xml&edirect_os=linux&edirect=13.3&tool=edirect&email=yut@amms-sugon8401'
Result of do_post http request is
$VAR1 = bless( {
  '_headers' => bless( {
    'referrer-policy' => 'origin-when-cross-origin',
    'client-response-num' => 1,
    'content-length' => '87',
    'client-peer' => '130.14.29.110:443',
    'x-xss-protection' => '1; mode=block',
    'access-control-expose-headers' => 'X-RateLimit-Limit,X-RateLimit-Remaining,Retry-After',
    'x-ua-compatible' => 'IE=Edge',
    'x-ratelimit-remaining' => '0',
    'content-type' => 'application/json',
    'x-test-test' => 'test42',
    'client-ssl-cert-subject' => '/C=US/ST=Maryland/L=Bethesda/O=National Library of Medicine/CN=*.ncbi.nlm.nih.gov',
    'client-ssl-cipher' => 'ECDHE-RSA-AES256-GCM-SHA384',
    'vary' => 'Accept-Encoding',
    'client-ssl-cert-issuer' => '/C=US/O=DigiCert Inc/CN=DigiCert TLS RSA SHA256 2020 CA1',
    'date' => 'Thu, 18 Nov 2021 13:54:47 GMT',
    'client-date' => 'Thu, 18 Nov 2021 13:54:48 GMT',
    'server' => 'Finatra',
    'strict-transport-security' => 'max-age=31536000; includeSubDomains; preload',
    'x-ratelimit-limit' => '3',
    '::std_case' => { 'x-test-test' => 'X-Test-Test', 'x-ratelimit-remaining' => 'X-RateLimit-Remaining', 'client-ssl-cert-subject' => 'Client-SSL-Cert-Subject', 'x-ua-compatible' => 'X-UA-Compatible', 'access-control-expose-headers' => 'Access-Control-Expose-Headers', 'client-peer' => 'Client-Peer', 'referrer-policy' => 'Referrer-Policy', 'client-response-num' => 'Client-Response-Num', 'x-xss-protection' => 'X-XSS-Protection', 'content-security-policy' => 'Content-Security-Policy', 'x-ratelimit-limit' => 'X-RateLimit-Limit', 'strict-transport-security' => 'Strict-Transport-Security', 'client-ssl-socket-class' => 'Client-SSL-Socket-Class', 'client-date' => 'Client-Date', 'client-ssl-cert-issuer' => 'Client-SSL-Cert-Issuer', 'client-ssl-cipher' => 'Client-SSL-Cipher' },
    'client-ssl-socket-class' => 'IO::Socket::SSL',
    'retry-after' => '2',
    'content-security-policy' => 'upgrade-insecure-requests',
    'connection' => 'close'
  }, 'HTTP::Headers' ),
  '_request' => bless( {
    '_method' => 'POST',
    '_headers' => bless( { 'content-type' => 'application/x-www-form-urlencoded', 'user-agent' => 'libwww-perl/6.39', '::std_case' => { 'if-ssl-cert-subject' => 'If-SSL-Cert-Subject' } }, 'HTTP::Headers' ),
    '_uri' => bless( do{\(my $o = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi')}, 'URI::https' ),
    '_content' => 'db=taxonomy&id=35344&rettype=full&retmode=xml&edirect_os=linux&edirect=13.3&tool=edirect&email=yut@amms-sugon8401',
    '_uri_canonical' => $VAR1->{'_request'}{'_uri'}
  }, 'HTTP::Request' ),
  '_msg' => 'Too Many Requests',
  '_protocol' => 'HTTP/1.1',
  '_content' => '{"error":"API rate limit exceeded","api-key":"218.241.250.70","count":"4","limit":"3"} ',
  '_rc' => 429
}, 'HTTP::Response' );
time update: running PHANOTATE, annotate linear contigs 11-18-21---21:56:39
Hi,
A simple fix for your problem can be achieved without making changes to Cenote-Taker 2. Please let me know if this works for you, or if you'd like me to push a quick patch. (I will include the API key option in the next update regardless.)
First, you need to get an API key from NCBI. I found the instructions pretty easy to follow: here
Then you can just export the API key variable before your Cenote-Taker 2 run, using the key they give you, e.g.:
export NCBI_API_KEY="my_api_key_12345" ; python /path/to/Cenote-Taker2/run_cenote-taker2.py -c MY_CONTIGS.fasta -r my_contigs1_ct -m 32 -t 32 -p true -db virion
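For reference, NCBI's E-utilities also accept the key directly as an `api_key` query parameter on any request, which is what EDirect sends under the hood when `NCBI_API_KEY` is set. Here is a minimal Python sketch of building such an efetch URL; `efetch_url` is a hypothetical helper for illustration, not part of Cenote-Taker 2 or EDirect:

```python
# Sketch: constructing an E-utilities efetch URL that carries an NCBI API key.
# `efetch_url` is a hypothetical helper, not part of Cenote-Taker 2 or EDirect.
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def efetch_url(taxid, api_key=None):
    """Return an efetch URL for a taxonomy record, adding api_key if given."""
    params = {"db": "taxonomy", "id": str(taxid), "rettype": "full", "retmode": "xml"}
    if api_key:
        # With a key, NCBI allows 10 requests/sec instead of 3.
        params["api_key"] = api_key
    return EFETCH + "?" + urlencode(params)
```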
Now, this only raises your limit from 3 requests per second to 10 requests per second. So, if you are running a lot of jobs in parallel, it could still fail. If you have several people in your lab with NCBI accounts, you could perhaps get an API key for each of them. NCBI also suggests reaching out to them for access to more requests per second, though I haven't tried this.
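If parallel jobs do still hit the limit, another option is a client-side retry that honors the server's Retry-After header (the 429 response in the log above carries 'retry-after' => '2'). A minimal sketch of picking the retry delay, assuming a generic wrapper around the efetch call; `retry_delay` is a hypothetical helper, not part of Cenote-Taker 2:

```python
# Sketch: choosing a wait before retrying after an HTTP 429 from E-utilities.
# Uses the Retry-After header when the server sends one (the 429 in the log
# above did), otherwise capped exponential backoff. Hypothetical helper,
# not part of Cenote-Taker 2.
def retry_delay(headers, attempt, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)           # server-specified wait
    return min(cap, base * (2 ** attempt))  # 1s, 2s, 4s, ... up to `cap`
```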
Good luck, and please let me know if you think you still need a quick patch pushed for you to do this.
Mike