takaram / kofam_scan
Home Page: https://www.genome.jp/tools/kofamkoala/
License: MIT License
CLI tool to annotate genes with KOfam
Hey @takaram
I am running kofam_scan with the paths to parallel and hmmsearch set in the config.yml file, but I am getting the following error.
#my config.yml file-
profile: /work/student/jigyasa-arora/kofamscan/db/profiles
ko_list: /work/student/jigyasa-arora/kofamscan/db/ko_list
hmmsearch: /apps/free72/hmmer/3.1b2/bin
parallel: /home/j/jigyasa-arora/local/parallel-20191022/src
#the code I am running-
$ /work/student/jigyasa-arora/kofamscan/bin/kofamscan-1.1.0/exec_annotation -o 230-01-kopfam 230-01-prokka.faa
#error-
Traceback (most recent call last):
8: from /work/student/jigyasa-arora/kofamscan/bin/kofamscan-1.1.0/exec_annotation:7:in `<main>'
7: from /work/student/jigyasa-arora/kofamscan/bin/kofamscan-1.1.0/lib/kofam_scan/cli.rb:21:in `run'
6: from /work/student/jigyasa-arora/kofamscan/bin/kofamscan-1.1.0/lib/kofam_scan/executor.rb:8:in `execute'
5: from /work/student/jigyasa-arora/kofamscan/bin/kofamscan-1.1.0/lib/kofam_scan/executor.rb:35:in `execute'
4: from /work/student/jigyasa-arora/kofamscan/bin/kofamscan-1.1.0/lib/kofam_scan/executor.rb:74:in `run_hmmsearch'
3: from /work/student/jigyasa-arora/kofamscan/bin/kofamscan-1.1.0/lib/kofam_scan/parallel.rb:27:in `exec'
2: from /home/j/jigyasa-arora/lib/ruby/2.6.0/open3.rb:101:in `popen3'
1: from /home/j/jigyasa-arora/lib/ruby/2.6.0/open3.rb:213:in `popen_run'
/home/j/jigyasa-arora/lib/ruby/2.6.0/open3.rb:213:in `spawn': Permission denied - /home/j/jigyasa-arora/local/parallel-20191022 (Errno::EACCES)
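A note on the error above: `spawn` generally reports `Permission denied` when the thing it is asked to execute is a directory rather than an executable file, and both `hmmsearch` and `parallel` in the config shown point at directories. A plausible fix, not confirmed in this thread, is to give the full path to each binary. The check below is a minimal sketch of that idea; the helper name `non_executable_entries` is hypothetical and not part of kofam_scan.

```ruby
require 'rbconfig'

# Hypothetical pre-flight helper (not part of kofam_scan): return the names
# of config entries that do not point at an executable regular file.
def non_executable_entries(tools)
  tools.reject { |_name, path| File.file?(path) && File.executable?(path) }.keys
end

# Demo with stand-in paths: the current directory plays the role of a
# directory given by mistake, the running Ruby binary the role of a valid tool.
tools = { 'hmmsearch' => Dir.pwd, 'ruby' => RbConfig.ruby }
puts non_executable_entries(tools).inspect
```

With the real config, the hash would hold the `hmmsearch` and `parallel` values from config.yml.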
Hello,
I've tried the command the tutorial suggests
$ ./exec_annotation -o output.tsv input.fasta
but when I run it in my terminal, I get the message "Error: KO list not given".
I've also tried this command:
$ ./exec_annotation -p /path/to/profiles/directory -k /path/to/ko_list/directory -o output.tsv input.fasta --cpu=8
but I get this message
/home/hanna/kofamscan/bin/kofam_scan-1.3.0/lib/kofam_scan/ko.rb:11:in `gets': Is a directory @ io_fillbuf - fd:5 /home/hanna/kofamscan/db/ (Errno::EISDIR)
from /home/hanna/kofamscan/bin/kofam_scan-1.3.0/lib/kofam_scan/ko.rb:11:in `parse'
from /home/hanna/kofamscan/bin/kofam_scan-1.3.0/lib/kofam_scan/executor.rb:80:in `block in parse_ko'
from /home/hanna/kofamscan/bin/kofam_scan-1.3.0/lib/kofam_scan/executor.rb:80:in `open'
from /home/hanna/kofamscan/bin/kofam_scan-1.3.0/lib/kofam_scan/executor.rb:80:in `parse_ko'
from /home/hanna/kofamscan/bin/kofam_scan-1.3.0/lib/kofam_scan/executor.rb:22:in `execute'
from /home/hanna/kofamscan/bin/kofam_scan-1.3.0/lib/kofam_scan/executor.rb:8:in `execute'
from /home/hanna/kofamscan/bin/kofam_scan-1.3.0/lib/kofam_scan/cli.rb:21:in `run'
from ./exec_annotation:7:in
I've searched for a way to solve this issue but none of the solutions worked. 😣
Hi,
I'm on a HPC system. That means users don't have write permissions to the kofam folder.
When I execute the following:
./exec_annotation -o result -p $profile -k $ko $FA
I get
Ignoring etc-1.2.0 because its extensions are not built. Try: gem pristine etc --version 1.2.0
Traceback (most recent call last):
10: from ./exec_annotation:7:in `<main>'
9: from /scratch/students/apptest/2022-09-23-kofam/kofambin/lib/kofam_scan/cli.rb:21:in `run'
8: from /scratch/students/apptest/2022-09-23-kofam/kofambin/lib/kofam_scan/executor.rb:8:in `execute'
7: from /scratch/students/apptest/2022-09-23-kofam/kofambin/lib/kofam_scan/executor.rb:35:in `execute'
6: from /scratch/students/apptest/2022-09-23-kofam/kofambin/lib/kofam_scan/executor.rb:107:in `run_hmmsearch'
5: from /scratch/students/apptest/2022-09-23-kofam/kofambin/lib/kofam_scan/parallel.rb:27:in `exec'
4: from /home/apps/ruby/2.7.2/lib/ruby/gems/2.7.0/gems/open3-0.1.1/lib/open3.rb:102:in `popen3'
3: from /home/apps/ruby/2.7.2/lib/ruby/gems/2.7.0/gems/open3-0.1.1/lib/open3.rb:227:in `popen_run'
2: from /scratch/students/apptest/2022-09-23-kofam/kofambin/lib/kofam_scan/parallel.rb:28:in `block in exec'
1: from /scratch/students/apptest/2022-09-23-kofam/kofambin/lib/kofam_scan/parallel.rb:28:in `puts'
/scratch/students/apptest/2022-09-23-kofam/kofambin/lib/kofam_scan/parallel.rb:28:in `write': Broken pipe (Errno::EPIPE)
Any idea or hint in the right direction would be appreciated.
Best regards
Hello!
It's not clear to me what a good E-value cut-off would be.
Is there guidance?
Thanks
Mick
Hi,
I downloaded and set up kofam_scan-1.3.0 with the following steps:
conda create -n kofamscan
conda activate kofamscan
conda install -c conda-forge ruby
conda install -c bioconda hmmer
conda install -c conda-forge parallel
mkdir KofamKOALA && cd KofamKOALA
wget https://www.genome.jp/ftp/db/kofam/ko_list.gz
wget https://www.genome.jp/ftp/db/kofam/profiles.tar.gz
gunzip ko_list.gz
tar -xzvf profiles.tar.gz
But when I run "exec_annotation -o Dlongan.querryko --cpu 2 --format mapper -E 1e-5 d.pep.fasta", it reports the following error:
config.rb:26:in `initialize': undefined method `map' for "/path/profiles":String (NoMethodError)
}.merge(initial_values.map { |k, v| [k.intern, v] }.to_h)
                      ^^^^
Did you mean? tap
I have no idea how to solve this problem. Can you give me some help? Thanks.
wen
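A note on the error above: the message shows that the configuration was loaded as the bare string "/path/profiles" instead of a key/value mapping. One way this can happen, assuming config.yml is read with Ruby's YAML loader (as a Psych traceback elsewhere in this thread suggests), is a file that contains only the path without a `profile:` key. The sketch below shows the difference; the path is the placeholder from the error message.

```ruby
require 'yaml'

# A file holding only a path is parsed as a single String ...
bare = YAML.safe_load('/path/profiles')
# ... while "key: value" (note the space after the colon) gives a Hash.
keyed = YAML.safe_load('profile: /path/profiles')

puts bare.class   # a String has no #map, hence the NoMethodError
puts keyed.class
puts keyed.inspect
```

If this is the cause, writing the entry as `profile: /path/profiles` in config.yml should let the file load as a mapping.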
Hello! I've come across a strange issue that happens when running multiple instances of kofamscan at once: "Error: Unknown KO: /dev/null" is thrown for one of the running instances while the other completes. This makes it difficult to run in batches. Is there a way around this problem?
Command:
./exec_annotation --cpu 20 -p ../../../2023/annotation_comparison/analysis/profiles/ -k ../../../2023/annotation_comparison/analysis/ko_list --format mapper -o ../../kegg_annotations/147893.tsv ../../translated_files/GTDB/faa/147893.faa
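Another report in this thread describes concurrent runs overwriting each other's files in the default ./tmp directory. Assuming the same thing is behind this error (not confirmed here), giving every run its own `--tmp-dir` may avoid the clash. The sketch below only builds the command lines; the helper name and the directory layout are illustrative.

```ruby
require 'tmpdir'

# Build one exec_annotation command per input file, each with its own
# temporary directory. The helper name and paths are hypothetical; the
# options (--tmp-dir, -f, -o) are the ones used elsewhere in this thread.
def annotation_command(faa, out_dir, tmp_root)
  sample = File.basename(faa, '.faa')
  [
    './exec_annotation',
    '--tmp-dir', File.join(tmp_root, "kofam_tmp_#{sample}"),
    '-f', 'mapper',
    '-o', File.join(out_dir, "#{sample}.tsv"),
    faa
  ]
end

cmd = annotation_command('faa/147893.faa', 'kegg_annotations', Dir.tmpdir)
puts cmd.join(' ')
# To actually launch the run: system(*cmd)
```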
Dear @takaram, do you think it would be possible to make an MMseqs2 implementation of kofam_scan? MMseqs2 profiles work analogously to HMMER profiles but should be 300 times faster.
I can help with the implementation.
What do you think?
Hello,
I have a question regarding the adaptive score threshold of the KOfam database and the possibility to adjust it in KofamScan with the option -T, --threshold-scale.
-T, --threshold-scale=VALUE
The score thresholds are multiplied by VALUE. For example, with -T2 option, the thresholds become twice as strict.
Do you have some guidance or experience on whether to adjust the score threshold, and how to choose it?
In the paper by Takuya Aramaki et al., "KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold" (DOI https://doi.org/10.1093/bioinformatics/btz859), I read that the KOfam database of profile hidden Markov models (pHMMs) contains an adaptive score threshold that is pre-computed for every KO family. If a new sequence has a hmmsearch score above this threshold, it is considered a reliable match and KofamScan highlights it with an asterisk (*) in the output file.
It is described that the adaptive score threshold is determined by maximizing the F-measure over a positive and negative datasets' sequence similarity scores (bit scores) and computed pHMMs.
Thus, the adaptive score threshold is a criterion to assign a KO to new sequences.
I am using KofamScan for my own project and I wonder whether it makes sense, or is advisable, to relax the score threshold or, vice versa, to make it stricter with KofamScan's -T, --threshold-scale option.
Can you share some guidance or your experience about this option please?
Thanks and BR,
Bernhard
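To make the effect of `-T` concrete, here is a small worked sketch based only on the help text quoted above (thresholds are multiplied by VALUE). The comparison with `>=` and the score of 400.0 are assumptions for illustration; 357.90 is the K00001 threshold from the ko_list excerpt quoted in this thread.

```ruby
# Scale a KO's precomputed threshold as the help text describes:
# the threshold is multiplied by the -T value.
def scaled_threshold(threshold, scale = 1.0)
  threshold * scale
end

# Assumption: a hit passes when its score reaches the scaled threshold.
def passes?(score, threshold, scale = 1.0)
  score >= scaled_threshold(threshold, scale)
end

k00001 = 357.90 # threshold listed for K00001 in the ko_list excerpt
score = 400.0   # made-up bit score for illustration

puts passes?(score, k00001)      # no scaling
puts passes?(score, k00001, 2.0) # twice as strict
puts passes?(score, k00001, 0.5) # relaxed
```

Values above 1 make every family stricter by the same factor, and values below 1 relax every family by the same factor.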
Hello, can this tool be applied to prokaryotic, eukaryotic, and viral genomes?
Thanks for your reply!
Best wishes!
I'm looking at the code here but not very familiar w/ Ruby:
module KofamScan
  class Result
    extend Autoload
    autoload :WithEvalueThreshold
    autoload :WithThresholdScale
    autoload :WithThresholdScaleAndEvalueThreshold

    # Pick the Result subclass that matches the thresholds the user supplied.
    def self.create(query_list, threshold_scale: nil, e_value_threshold: nil)
      if threshold_scale && e_value_threshold
        WithThresholdScaleAndEvalueThreshold.new(query_list, threshold_scale, e_value_threshold)
      elsif e_value_threshold
        WithEvalueThreshold.new(query_list, e_value_threshold)
      elsif threshold_scale
        WithThresholdScale.new(query_list, threshold_scale)
      else
        Result.new(query_list)
      end
    end
    # (excerpt; the rest of the class is omitted)
  end
end
Here's what ko_list looks like:
knum threshold score_type profile_type F-measure nseq nseq_used alen mlen eff_nseq re/pos definition
K00001 357.90 domain all 0.256163 2367 1915 1975 464 14.05 0.590 alcohol dehydrogenase [EC:1.1.1.1]
K00002 443.17 full all 0.430391 2376 2273 5878 503 7.51 0.590 alcohol dehydrogenase (NADP+) [EC:1.1.1.2]
K00003 286.37 domain all 0.945500 6268 5369 3257 782 7.03 0.590 homoserine dehydrogenase [EC:1.1.1.3]
K00004 369.60 domain trim 0.809403 1572 1320 1364 436 5.41 0.590 (R,R)-butanediol dehydrogenase / meso-butanediol dehydrogenase / diacetyl reductase [EC:1.1.1.4 1.1.1.- 1.1.1.303]
K00005 320.93 full all 0.984022 1449 1051 682 366 1.99 0.590 glycerol dehydrogenase [EC:1.1.1.6]
K00006 316.47 full all 0.899202 2118 1971 3274 549 4.64 0.590 glycerol-3-phosphate dehydrogenase (NAD+) [EC:1.1.1.8]
K00007 520.80 full all 0.998236 358 283 834 469 1.01 0.589 D-arabinitol 4-dehydrogenase [EC:1.1.1.11]
K00008 420.17 full all 0.500025 3737 3281 3597 524 8.29 0.590 L-iditol 2-dehydrogenase [EC:1.1.1.14]
K00009 147.27 full trim 0.997067 1556 1192 1443 459 3.28 0.590 mannitol-1-phosphate 5-dehydrogenase [EC:1.1.1.17]
I'm seeing a score threshold but not e-value threshold. Can you elaborate on the scheme used for determining significant hits?
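For anyone inspecting the file programmatically, here is a minimal sketch for reading ko_list. It assumes the file is tab-separated with the twelve columns shown in the header above, and that a KO without a precomputed threshold carries "-" in that column; both are assumptions, and `KoEntry` and `parse_ko_list` are hypothetical names, not kofam_scan's own classes.

```ruby
KoEntry = Struct.new(:knum, :threshold, :score_type, :definition)

# Parse ko_list text into entries keyed by K number.
# Assumptions: tab-separated, one header line, "-" marks a missing threshold.
def parse_ko_list(text)
  rows = text.lines.map(&:chomp).reject(&:empty?)
  rows.drop(1).to_h do |row|
    cols = row.split("\t")
    threshold = cols[1] == '-' ? nil : Float(cols[1])
    [cols[0], KoEntry.new(cols[0], threshold, cols[2], cols[-1])]
  end
end

# Two rows copied from the excerpt above, with tabs between columns.
sample = <<~TSV
  knum\tthreshold\tscore_type\tprofile_type\tF-measure\tnseq\tnseq_used\talen\tmlen\teff_nseq\tre/pos\tdefinition
  K00001\t357.90\tdomain\tall\t0.256163\t2367\t1915\t1975\t464\t14.05\t0.590\talcohol dehydrogenase [EC:1.1.1.1]
  K00002\t443.17\tfull\tall\t0.430391\t2376\t2273\t5878\t503\t7.51\t0.590\talcohol dehydrogenase (NADP+) [EC:1.1.1.2]
TSV

parse_ko_list(sample).each_value do |ko|
  puts "#{ko.knum}: score >= #{ko.threshold} (#{ko.score_type} score)"
end
```

The per-KO `threshold` is a bit score, not an E-value, which matches the observation that the file holds no E-value column.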
I'm using the eukaryotes profile (profiles/eukaryote.hal) and all the programs are in my path.
Thanks for your help!
[bobbieshaban@spartan-bm067 kofam]$ ./exec_annotation -o result.txt R5300_S1.prodigal.proteins.fa --cpu=10
Traceback (most recent call last):
6: from .../exec_annotation:7:in `<main>'
5: from ../kofam/lib/kofam_scan/cli.rb:16:in `run'
4: from .../kofam/lib/kofam_scan/config.rb:11:in `load'
3: from .../psych.rb:349:in `safe_load'
2: from .../psych.rb:390:in `parse'
1: from ..../psych.rb:456:in `parse': (<unknown>): did not find expected key while parsing a block mapping at line 4 column 1 (Psych::SyntaxError)
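The traceback ends in `Psych::SyntaxError`, which means config.yml itself is not valid YAML around the reported line. A quick way to locate the problem outside of kofam_scan is to parse the file directly. The sketch below is one way to do that; `yaml_problem` is a hypothetical helper and the sample texts are made up.

```ruby
require 'yaml'

# Return nil when the text parses, otherwise "line:column: problem".
def yaml_problem(text)
  YAML.safe_load(text)
  nil
rescue Psych::SyntaxError => e
  "#{e.line}:#{e.column}: #{e.problem}"
end

# Made-up examples: an unclosed list is invalid, plain key/value lines are fine.
puts yaml_problem("profile: /db/profiles\ncpu: [1, 2\n").inspect
puts yaml_problem("profile: /db/profiles\nko_list: /db/ko_list\n").inspect
# For the real file: puts yaml_problem(File.read('config.yml')).inspect
```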
Hiya!
First, thanks for giving us all an avenue beyond the web interface for KO annotations :)
I wanted to suggest adding a check to see if the default ./tmp directory exists before launching a run. This was very much me not paying enough attention and making a silly mistake, but I've been running kofamscan simultaneously for several assemblies called from the same directory for a couple of days, and just realized that they were all constantly overwriting the same files in the ./tmp directory, making it all meaningless 😬
It might save someone else from being as silly as I was if when the command is executed the program checked to see if the temporary directory that will be used (whether default or specified by the user) already exists, and if it does, exit and tell the user.
Just a suggestion, thanks again!
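The suggested guard could look roughly like the sketch below. This is a hypothetical illustration of the idea, not code from kofam_scan, and the function name is made up.

```ruby
require 'tmpdir'

# Sketch of the suggested guard (hypothetical, not kofam_scan code):
# refuse to start when the temporary directory is already present.
def ensure_fresh_tmp_dir!(dir)
  raise ArgumentError, "temporary directory already exists: #{dir}" if File.exist?(dir)

  Dir.mkdir(dir)
  dir
end

# Demo inside a throwaway directory so nothing is left behind.
Dir.mktmpdir do |root|
  tmp = File.join(root, 'tmp')
  ensure_fresh_tmp_dir!(tmp) # the first run creates the directory
  begin
    ensure_fresh_tmp_dir!(tmp) # a second run on the same path is refused
  rescue ArgumentError => e
    puts e.message
  end
end
```

Exiting instead of silently reusing the directory trades convenience for safety: a stale directory from a crashed run would have to be removed by hand.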
Hi,
I have noticed that you provide separate eukaryote and prokaryote profiles. Do you have plans to split prokaryote into bacteria and archaea?
Thanks,
Jie
Hi,
I am using kofam_scan 1.2.0. If --threshold-scale is not set, does that mean the precomputed score is used as the cutoff, i.e. --threshold-scale is effectively 1? Am I understanding this correctly?
Thank you
Hello,
I was wondering why, even if I select Prokaryote as a profile, I still have annotations relative to eukaryotes.
For example, I get genes related to the "Longevity regulating pathway - worm [PATH:ko04212]".
Is it because the same gene can be involved in different pathways in eukaryotes and prokaryotes, or am I setting the profiles wrong?
I have done it in this way:
--PROFILE=/my_path...../Kofam/profiles/prokaryote.hal
Thanks for the awesome tool!
Gabri
What is the default E-value? I could not find it in the README. Is it 0.01, as with the online version (https://www.genome.jp/tools/kofamkoala/)?
Hi @takaram
I keep running into a 'core dumped' error when running kofamscan. I recently downloaded the HMM profiles, the ko_list, and the kofamscan-1.3.0 tool, and I am following this tutorial:
https://taylorreiter.github.io/2019-05-11-kofamscan/
I am running the following command:
./exec_annotation -p /path/to/profiles/ -k /path_to/ko_list -o out.tsv final_proteins.faa --cpu 20
I have attached the error file (.txt) for your reference, in case it helps. I would appreciate any help, and do let me know if further details are required.
Hi there,
I have just updated kofam_scan from 1.2 to 1.3, running on an HPC. Usually when I run the program, the tmp directory that is created disappears after the job has finished. In the new version 1.3, the job finishes but the tmp directory is still there. Is this normal? My Slurm file has no debugging info, so I can't check whether everything completed correctly.
I installed the program on our HPC as follows:
virtualenv ~/kofamscan
module load gcc/7.3.0 nixpkgs/16.09 hmmer/3.2.1 ruby/2.6.1 parallel/20160722
source kofamscan/bin/activate
cd kofamscan/bin
wget ftp://ftp.genome.jp/pub/tools/kofam_scan/kofam_scan-1.3.0.tar.gz
tar xvzf kofam_scan-1.3.0.tar.gz
mkdir db
cd db
wget ftp://ftp.genome.jp/pub/db/kofam/ko_list.gz
wget ftp://ftp.genome.jp/pub/db/kofam/profiles.tar.gz
gunzip ko_list.gz
tar xvzf profiles.tar.gz
Edited config.yml to add the paths to ko_list and profiles.
I am running the program as follows:
#!/bin/bash
#SBATCH --mem=xxxx
#SBATCH --nodes=1
#SBATCH --account=xxx
#SBATCH --cpus-per-task=32
#SBATCH --time=0-23:59
#SBATCH --job-name=xxx
module load gcc/7.3.0 nixpkgs/16.09 hmmer/3.2.1 ruby/2.6.1 parallel/20160722
source /home/user/kofamscan/bin/activate
/home/user/kofamscan/bin/kofam_scan-1.3.0/./exec_annotation -o /path_to_output_dir/kofamscan.txt /path_to_orf_file/xxx.faa --tmp-dir=DIR
Hi,
How does kofam_scan deal with an HMM that lacks the four columns "threshold score_type profile_type F-measure" in the ko_list file? It seems to me that there will never be a valid hit for those HMMs.
Thank you,
Jie
The current organization of the code makes it a bit hard to install because the main script (exec_annotation) uses require_relative to find the lib/ directory (therefore the lib/ directory and exec_annotation must be in the same directory).
I don't know the conventions in the Ruby community, but if I trust this book, the main script (exec_annotation):
- should be in a bin/ directory
- should modify the load path ($LOAD_PATH) to find the lib/ directory (I'm not really fond of this idea; I would rather rely on the RUBYLIB variable and modify LOAD_PATH if this fails)
- should use require 'lib/kofam_scan' instead of require_relative 'lib/kofam_scan'
This would simplify installation in our environment for example and probably in most environments too.
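As a rough illustration of the load-path approach mentioned above, the sketch below shows how a script in bin/ could locate a sibling lib/ directory. It is a sketch under that layout assumption, not the project's code; `sibling_lib_dir` is a hypothetical helper.

```ruby
# Sketch: a script in bin/ locating a sibling lib/ directory via $LOAD_PATH.
# The bin/ + lib/ layout is an assumption taken from the proposal above.
def sibling_lib_dir(script_path)
  File.expand_path('../lib', File.dirname(script_path))
end

lib_dir = sibling_lib_dir(__FILE__)
$LOAD_PATH.unshift(lib_dir) unless $LOAD_PATH.include?(lib_dir)
# After this, `require 'kofam_scan'` would resolve to lib/kofam_scan.rb
# regardless of the current working directory, provided that file exists.

puts sibling_lib_dir('/opt/kofam_scan/bin/exec_annotation')
```

The RUBYLIB alternative needs no code change at all: setting `RUBYLIB` to the lib/ directory before calling the script adds it to the load path.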
I have manually assigned hmmscan in the config.yml file. I recently discovered that when it points to a broken path, the call to exec_annotation hangs forever at lib/kofam_scan/parallel.rb:28 rather than throwing a file-not-found error.
~/software/KOALA/kofam_scan-1.3.0/bin/exec_annotation -o ./kofam.out -p ~/software/KOALA/kofam_scan-1.3.0/db/profiles -k ~/software/KOALA/kofam_scan-1.3.0/db/profiles/eukaryote.hal --cpu 5 --tmp-dir ./tmp -E 1e-5 -f detail-tsv Pshi.short.pep
Traceback (most recent call last):
11: from /home/jingtian/software/KOALA/kofam_scan-1.3.0/bin/exec_annotation:7:in `<main>'
10: from /home/jingtian/software/KOALA/kofam_scan-1.3.0/bin/lib/kofam_scan/cli.rb:21:in `run'
9: from /home/jingtian/software/KOALA/kofam_scan-1.3.0/bin/lib/kofam_scan/executor.rb:8:in `execute'
8: from /home/jingtian/software/KOALA/kofam_scan-1.3.0/bin/lib/kofam_scan/executor.rb:22:in `execute'
7: from /home/jingtian/software/KOALA/kofam_scan-1.3.0/bin/lib/kofam_scan/executor.rb:80:in `parse_ko'
6: from /home/jingtian/software/KOALA/kofam_scan-1.3.0/bin/lib/kofam_scan/executor.rb:80:in `open'
5: from /home/jingtian/software/KOALA/kofam_scan-1.3.0/bin/lib/kofam_scan/executor.rb:80:in `block in parse_ko'
4: from /home/jingtian/software/KOALA/kofam_scan-1.3.0/bin/lib/kofam_scan/ko.rb:12:in `parse'
3: from /home/jingtian/software/KOALA/kofam_scan-1.3.0/bin/lib/kofam_scan/ko.rb:12:in `each_line'
2: from /home/jingtian/software/KOALA/kofam_scan-1.3.0/bin/lib/kofam_scan/ko.rb:15:in `block in parse'
1: from /home/jingtian/software/KOALA/kofam_scan-1.3.0/bin/lib/kofam_scan/ko.rb:15:in `new'
/home/jingtian/software/KOALA/kofam_scan-1.3.0/bin/lib/kofam_scan/ko.rb:35:in `initialize': wrong number of arguments (given 1, expected 12) (ArgumentError)
I'm trying to run kofam_scan with ruby 2.6.6, and I got the error as below:
Traceback (most recent call last):
5: from ./exec_annotation:7:in `<main>'
4: from /work/LAS/mash-lab/jing/bin/kofam_scan/lib/kofam_scan/cli.rb:21:in `run'
3: from /work/LAS/mash-lab/jing/bin/kofam_scan/lib/kofam_scan/executor.rb:8:in `execute'
2: from /work/LAS/mash-lab/jing/bin/kofam_scan/lib/kofam_scan/executor.rb:35:in `execute'
1: from /work/LAS/mash-lab/jing/bin/kofam_scan/lib/kofam_scan/executor.rb:104:in `run_hmmsearch'
/work/LAS/mash-lab/jing/bin/kofam_scan/lib/kofam_scan/executor.rb:152:in `lookup_profiles': undefined method `end_with?' for nil:NilClass (NoMethodError)
Has anybody else had issues when trying to read the output files?
They are not always fixed-width. Specifically, a high value in the "thrshld" column will push the next columns ("score", "E-Value" etc.) down a space. This makes it impossible to read correctly (see line 6 in the example).
Trying to use at least x2 spaces as a delimiter doesn't work either as some columns have only one space between them.
I would be so grateful if anybody had a solution! :)
output_file.txt
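One workaround is to split each line on runs of whitespace rather than on fixed column positions. The sketch below assumes that header lines start with `#`, that the first column is either `*` or blank, that gene names contain no spaces, and that every field before the definition is non-empty. The sample lines are made up to mimic a wide "thrshld" value; `parse_detail_line` is a hypothetical helper.

```ruby
DetailHit = Struct.new(:significant, :gene, :ko, :threshold, :score, :evalue, :definition)

# Parse one data line of the detail output by splitting on whitespace runs.
# Values are kept as strings so nothing is lost if a field is not numeric.
def parse_detail_line(line)
  return nil if line.start_with?('#') || line.strip.empty?

  significant = line.start_with?('*')
  gene, ko, threshold, score, evalue, definition =
    line.sub(/\A\*/, '').strip.split(' ', 6)
  DetailHit.new(significant, gene, ko, threshold, score, evalue, definition)
end

# Made-up lines for illustration only.
lines = [
  '# gene name  KO  thrshld  score  E-value  KO definition',
  '* gene_1  K00001  1357.90  1400.1  1.2e-120  alcohol dehydrogenase [EC:1.1.1.1]',
  '  gene_2  K00002  443.17  100.0  1e-20  alcohol dehydrogenase (NADP+) [EC:1.1.1.2]'
]
lines.each do |line|
  hit = parse_detail_line(line)
  puts hit.to_a.inspect if hit
end
```

The `detail-tsv` format used in another command in this thread (`-f detail-tsv`) may be easier to parse, since a tab-separated file does not depend on column widths.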
Hi There,
I am running a protein annotation with kofamscan. An error message keeps showing up: "hmmsearch was not run successfully". After a successful run with another .fasta file, I realized that something was wrong with my original file. I eventually found that ONE sequence caused the problem, and the run succeeded after removing it, but I still don't know why.
Could anyone help explain it?
Thanks~
The sequence is attached below:
VIRSorter_k141_1676536_flag=0_multi=16_9953_len=13692-cat_2_4 # 6084 # 13691 # -1 # ID=10186_4;partial=01;start_type=Edge;rbs_motif=None;rbs_spacer=None;gc_cont=0.465
NGEAGTSGTSGISGINGTNGLNGTGGSSGTSGLSGVDGTSGTAGTSGTSGYSGTDGTSGT
SGISGADGMPGTSGTSGISGVDGTSGTSGINGTSGSSGTTGTSGSSGTSGISGVDGTSGT
SGLSGVDGTSGSSGTSGSSGTSGISGVDGTSGTAGTSGSSGTSGTSGISGIDGTSGSSGT
NGTSGSSGTSGISGVDGTSGTAGTSGTSGIDGTSGTSGISGVDGTSGTSGTSGISGVDGV
DGTNGTSGTSGISGVDGTSGTAGSSGTSGTTGTSGSSGTSGISGVDGTSGSSGTSGTSGI
DGTSGTSGISGVDGTSGTSGTSGSSGTSGTSGISGVDGTSGTNGSSGTSGSSGTAGTSGT
SGISGVDGTSGTSGTGTSGTSGTSGTVGTSGSSGSSGTSGISGANGEAGTSGTSGISGLN
GTNGLNGTGGSSGTSGISGVDGTSGTAGTSGTSGYSGTDGTSGTSGISGADGMPGTSGTS
GISGVDGTSGTSGTTGTSGTSGTTGTSGSSGTSGISGVDGTSGSSGTSGTSGISGVDGTS
GTSGSSGTSGTSGTSGTSGTSGISGVDGTSGSSGTSGSSGTSGSSGTSGISGINGTNGSS
GTSGISGVDGTSGTSGIDGTSGTSGIDGTSGTSGISGINGTSGTNGSSGSSGTSGLSGVD
GTSGTSGIDGTSGTSGIDGTSGTSGISGINGTSGTNGSSGSSGTSGISGVDGTSGTSGSS
GSSGTSGISGVDGTSGTSGISGIDGTSGTAGTSGTSGVDGTSGTSGISGINGTNGSSGTS
GVSGVDGTSGTSGLDGTHGTSGTTGTSGSSGTSGISGANGEAGTSGTSGISGINGTNGIA
GTGGSSGTSGISGVDGTSGTAGTSGTSGYSGTDGTSGTSGISGADGMPGTSGSSGTSGLS
GVDGTSGTAGTSGSSGTSGTTGTSGSSGTSGISGVDGTSGTAGTSGTSGISGVDGTSGSS
GTSGSSGTSGSSGTSGTSGISGVDGTSGSSGTSGSSGTSGSSGTSGISGINGTNGSSGTS
GISGVDGTSGTSGIDGTSGTSGINGTSGTSGISGVDGTSGTNGSSGSSGTSGLSGVDGTS
GTSGIDGTSGTSGIDGTSGTSGISGINGTSGTNGSSGSSGTSGLSGVDGTSGTAGTSGSS
GTSGISGVDGTSGTSGISGVDGTSGTAGTSGTSGVNGTSGTSGISGINGTNGSSGTSGIS
GVDGTSGTSGLDGTHGTSGSSGTSGTSGSSGTSGISGANGEAGTSGTSGISGVAGTNGIA
GTGGSSGTSGLSGVDGTSGTAGTSGTSGYSGTDGTSGTSGISGADGMPGTSGTSGTNGSS
GTSGLSGVDGTSGTSGTNGTSGSSGTNGSSGTSGTSGTSGISGVDGTSGTAGSSGTSGSS
GTSGLSGVDGTSGSSGTSGSSGTSGSSGTSGTSGISGVDGTSGTSGSSGTSGSSGTSGIS
GVDGTSGTSGSSGTSGIDGTSGTTGTSGISGISGTSGTNGTSGSSGTSGISGVDGTSGSS
GTSGDAGTSGTSGITGTSGISGISGTSGTNGSSGSSGTSGLSGVDGTSGTSGSSGTSGTT
GTSGTSGISGVDGTSGTSGSAGTSGTSGVDGTSGVSGVSGINGTNGSSGTSGISGVDGTS
GTSGTVGTSGTSGTNGTSGSSGTSGISGANGEAGTSGTSGISGINGTAGRQGTGGSSGTS
GVSGVDGTSGTAGTSGTSGISGTTGTSGTSGISGADGMPGTSGTSGINGTSGSSGTSGSS
GTSGSSGTSGISGINGTNGTSGISGVDGTSGSSGTSGTSGSSGTSGSSGTSGISGINGTN
GSSGTSGISGVDGTSGTSGSSGTSGSSGTSGSSGTSGSSGTSGISGVDGTSGSSGTSGIS
GVDGTSGTSGTSGSSGTSGSSGTSGSSGTSGTSGISGVDGTNGTSGTSGTSGSSGTSGSS
GTSGSSGSSGTSGISGVDGTSGSSGTSGSSGTSGISGVDGTSGTSGTSGSSGTSGSSGTS
GSSGTSGISGVNGTSGSSGTSGISGVDGTSGTAGTSGSSGTSGSAGTSGSSGTSGISGIN
GTSGTNGSSGSSGTSGVDGTSGTSGSNGTSGSSGTSGISGANGAPGTSGTSGLSGVDGTS
GTAGTSGSSGTSGSSGTSGISGVDGTSGTAGSSGTSGSSGTSGSSGTSGSSGTSGISGIN
GTSGSSGTSGSSGTSGTSGTSGSSGTSGTSGISGVDGTSGSSGTNGTSGTSGTKGTSGTS
GSSGTSGSSGSSGTSGISGINGTSGSSGTSGISGVDGTSGTAGSSGTSGTSGTSGIDGTN
GTSGSSGTSGISGINGTNGSSGTSGISGVDGIDGTSGSSGTNGTSGSSGTSGISGANGAP
GTSGTSGLSGVSGISGTNGTSGTSGTSGTTGTSGISGLNGTTGTSGTSGTGFSAILNATN
NRLITSDGTQTNAVAEANLTFDGEILNLAGVFKSKTGEGSSITANTLLYAADTALGNGWI
IDYVVKATTGVAMRTGTILAVTDGIDVTFTETSSPDLGASTAAVTFGLTINSTDLEIAAN
ISFGTWDVKVAVRVI*
Hello, I annotated the gene set with the kofamscan software.
What E-value and score cutoffs do you recommend for filtering the results?
Thanks
shenglong
Hi,
I've seen a few Errors recently and wondered if you knew what might be causing them, or better still, how to fix them!
Error message examples:
kofamscan/bin/lib/kofam_scan/executor.rb:54:in `delete': No such file or directory @ apply2files - ./tmp/tabular/K20351 (Errno::ENOENT)
...
kofamscan/bin/lib/kofam_scan/executor.rb:62:in `initialize': No such file or directory @ rb_sysopen - ./tmp/tabular/K06344 (Errno::ENOENT)
Otherwise, it outputs a file just fine with plenty of hits, but I'm seeing more than a few of these.
Best wishes and thanks,
Tim
Thanks for providing such a useful tool!
When I used it, I encountered the following error:
/mdata/xxx/software/miniconda3/bin/lib/kofam_scan/parallel.rb:28:in `write': Broken pipe (Errno::EPIPE)
from /mdata/xxx/software/miniconda3/bin/lib/kofam_scan/parallel.rb:28:in `puts'
from /mdata/xxx/software/miniconda3/bin/lib/kofam_scan/parallel.rb:28:in `block in exec'
from /mdata/xxx/software/miniconda3/lib/ruby/2.5.0/open3.rb:205:in `popen_run'
from /mdata/xxx/software/miniconda3/lib/ruby/2.5.0/open3.rb:95:in `popen3'
from /mdata/xxx/software/miniconda3/bin/lib/kofam_scan/parallel.rb:27:in `exec'
from /mdata/xxx/software/miniconda3/bin/lib/kofam_scan/executor.rb:107:in `run_hmmsearch'
from /mdata/xxx/software/miniconda3/bin/lib/kofam_scan/executor.rb:35:in `execute'
from /mdata/xxx/software/miniconda3/bin/lib/kofam_scan/executor.rb:8:in `execute'
from /mdata/xxx/software/miniconda3/bin/lib/kofam_scan/cli.rb:21:in `run'
from /mdata/xxx/software/miniconda3/bin/exec_annotation:7:in `
Do you have any idea about how to solve this problem?
Thanks so much!
Pandeng
Hi takaram,
I found a K02487 hit in my results, but it doesn't have a match on the KEGG website.
Could you explain a little bit about this kind of hit?
Thanks.
Hi,
Am I understanding correctly that the query.fasta file is the sequence file to be analyzed? Or is it some kind of reference file?
Hi takaram,
short question.
Can you give an example of the profile.yml file, per chance?
Thanks,
Thierry
Hi, kofam_scan_colleagues:
Recently, I ran kofam_scan using the following command line:
exec_annotation -o 5k+_assembled_hits_to_vHMMs_ko.txt 5k+_all_assemblied_contigs.part_329_prot.fasta -k ~/database/kofam/ko_list -p ~/database/kofam/profiles/ -E 0.00001 --cpu 5
But it seems that "-E 0.00001" did not work: the result file contains many hits with an E-value higher than 0.00001.
I tried both "-E 0.00001" and "-E 1.0e-05", but neither worked.
My input file is attached.
Thank you all and look forward to your reply.
5k+_all_assemblied_contigs.part_329_prot.txt
Is it possible to obtain more than one output at the same time? I mean the mapper and the detail-tsv outputs.
Hi,
Thank you for developing this great tool for KEGG annotation.
I am confused about the exec_annotation script, because it can only be run from the folder it belongs to. If I execute it from another directory using its absolute path, it complains with the error below:
<internal:gem_prelude>:1:in `require': cannot load such file -- rubygems.rb (LoadError)
from <internal:gem_prelude>:1:in `<internal:gem_prelude>'
I tried modifying exec_annotation as shown below, but the same error occurred:
#!/usr/bin/env /Data/software/ruby-2.7.1/bin/ruby
# frozen_string_literal: true
#require_relative 'lib/kofam_scan'
require_relative '/Data/software/kofam_scan-1.3.0/lib/kofam_scan'
#require 'kofam_scan/cli'
require '/Data/software/kofam_scan-1.3.0/lib/kofam_scan/cli'
KofamScan::CLI.run(ARGV)
What should I do about this issue?
In addition, I want to run it from anywhere using the absolute path of exec_annotation. How can I configure that?
Best,
Hanhuihong
Hi takaram,
Once a hit's score is over the threshold, the hit is marked as valid, which results in multiple hits for some sequences.
But each HMM has its own threshold, which confuses me a little, as these thresholds do not seem comparable to each other.
How would you suggest we deal with multiple HMM hits for one protein sequence?
I am getting an output text file with inconsistent formatting, here is a snippet of some of the output I am getting.
Take the following snippet of text output,
NODE_2_length_14983_cov_10.199960_8 K01424
NODE_2_length_14983_cov_10.199960_9
NODE_2_length_14983_cov_10.199960_10 K02055
Running expand test.txt -t 1 > test.txt cleans up all my output files.
For reference, I am using version kofamscan=1.3.0 and my script is:
exec_annotation \
--cpu $task.cpus \
-k $ko_list \
-p $profiles \
-f mapper \
-o ${sample}.kofam.txt \
${faa}
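The ragged look of the snippet above is consistent with a tab-separated file in which genes without an assignment have only one field. Assuming that is the case for the mapper format (an assumption, not confirmed in this thread), splitting on tabs reads the file without any reformatting. `parse_mapper` is a hypothetical helper.

```ruby
# Collect KO assignments per gene from mapper-format text.
# Assumption: fields are tab-separated and a gene with no KO has one field.
def parse_mapper(text)
  text.each_line.with_object({}) do |line, genes|
    gene, ko = line.chomp.split("\t")
    next if gene.nil? || gene.empty?

    genes[gene] ||= []
    genes[gene] << ko if ko
  end
end

# The three lines from the snippet above, with tabs as separators.
sample = "NODE_2_length_14983_cov_10.199960_8\tK01424\n" \
         "NODE_2_length_14983_cov_10.199960_9\n" \
         "NODE_2_length_14983_cov_10.199960_10\tK02055\n"

parse_mapper(sample).each do |gene, kos|
  puts "#{gene}: #{kos.empty? ? 'no KO' : kos.join(',')}"
end
```

Keeping a list per gene also covers the case where one gene appears on several lines with different KOs.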