tseemann / barrnap
Bacterial ribosomal RNA predictor
License: GNU General Public License v3.0
Hi Team,
Could you please provide us with an update on the email below?
Could you also provide the ECCN of the product named in the subject line?
ECCN – Export Control Classification Number
If you do not have your software classified with an ECCN, please kindly answer the following questions so that we may self-assess:
NO YES
Does the Software perform any encryption or utilize any encryption processes?
If the answer is YES to the above, please indicate if the encryption is coded into the application or separately called (such as using SSL)
If the answer is YES to the above, please indicate what function(s) the cryptography/encryption serves
A, Copyright protection purposes (Includes using a license key/code)
B, User authentication purposes
C, A core part of the functionality such as to encrypt databases
D, To encrypt communications between the software and a host system
Background information
An Export Control Classification Number (ECCN) is a specific alpha-numeric code that identifies the level of export control for items e.g. software that are exported from member states of the Wassenaar Arrangement, including the United States. After obtaining the ECCN, the exporter must determine whether an export license is required to export items to certain countries.
We look forward to your reply.
Regards,
Kriti Bhatnagar
Software analyst
New Products & Complex Team
EMIT | IT OPS | CES | WDS | SAM
HCL Technologies Limited
(CIN: L74140DL1991PLC046369)
10th Floor, ODC-IV, Software Tower 6, Sector 126
Noida SEZ, Uttar Pradesh – 201301, India
Phone: +1-4088093746 (ext.4144395)
Email:- [email protected]
for
ExxonMobil Global Services Company
22777 Springwoods Village Parkway
Spring, TX 77389
United States of America
@satta has been trying to package Barrnap into Debian-Med but has reported that the SILVA alignments (23S) have a licence which is incompatible with Debian.
(It's only free for academic/non-commercial use: http://www.arb-silva.de/silva-license-information/ )
Goal would be to construct new 23S alignments from Refseq and build our own models.
Hi, developer,
Thanks for creating such efficient software. I have used it to find 16S rRNA hits in my de novo assembled genome bins. Because I am searching for both archaea and bacteria, I ran barrnap separately with -k bac and -k arc.
However, the result is confusing. For example, in one bin barrnap found two 16S hits with the archaea models and two with the bacteria models. The headers of the hits in the bacteria output are
>16S_rRNA::NODE_2_length_100533_cov_5.789665:250-1687(-)
>16S_rRNA::NODE_8_length_10807_cov_5.393508:10362-10807(-)
The headers of the hits in the archaea output are
>16S_rRNA::NODE_2_length_100533_cov_5.789665:251-1678(-)
>16S_rRNA::NODE_8_length_10807_cov_5.393508:10363-10803(-)
I then classified both sets of hits with the RDP Classifier. The results for the archaea hits are
16S_rRNA::NODE_2_length_100533_cov_5.789665:251-1678(-);+;Bacteria;100%;"Bacteroidetes";98%;"Bacteroidia";96%;"Bacteroidales";96%;"Rikenellaceae";38%;Mucinivorans;33%
16S_rRNA::NODE_8_length_10807_cov_5.393508:10363-10803(-);+;Bacteria;99%;Firmicutes;70%;Clostridia;61%;Clostridiales;61%;Ruminococcaceae;43%;Hydrogenoanaerobacterium;14%
The results for the bacteria hits are
16S_rRNA::NODE_2_length_100533_cov_5.789665:250-1687(-);+;Bacteria;100%;"Bacteroidetes";98%;"Bacteroidia";94%;"Bacteroidales";94%;"Rikenellaceae";34%;Mucinivorans;24%
16S_rRNA::NODE_8_length_10807_cov_5.393508:10362-10807(-);+;Bacteria;99%;Firmicutes;78%;Clostridia;53%;Clostridiales;53%;Ruminococcaceae;40%;Hydrogenoanaerobacterium;14%
So my questions are:
(1) The bacteria and archaea searches return essentially the same hits, and all of them are classified as Bacteria. Why are they reported under both kingdoms?
(2) Both hits come from one genome bin. How can one bin contain two 16S genes with different taxonomic classifications?
Thanks so much for your patience!
Best.
Hello!
I'm trying to use the --outseq parameter and to get the 16S sequences; however, all output generated were empty. What am I doing wrong?
Best,
I'm getting the following error:
[barrnap] This is barrnap 0.9
[barrnap] Written by Torsten Seemann
[barrnap] Obtained from https://github.com/tseemann/barrnap
[barrnap] Detected operating system: linux
[barrnap] Adding /ebio/abt3_projects/Georg_animal_feces/bin/llg/.snakemake/conda/6db1e2f9/lib/barrnap/bin/../binaries/linux to end of PATH
[barrnap] Checking for dependencies:
[barrnap] Found nhmmer - /ebio/abt3_projects/Georg_animal_feces/bin/llg/.snakemake/conda/6db1e2f9/bin/nhmmer
[barrnap] Found bedtools - /ebio/abt3_projects/Georg_animal_feces/bin/llg/.snakemake/conda/6db1e2f9/bin/bedtools
[barrnap] Will use 4 threads
[barrnap] Setting evalue cutoff to 1e-06
[barrnap] Will tag genes < 0.8 of expected length.
[barrnap] Will reject genes < 0.25 of expected length.
[barrnap] Using database: /ebio/abt3_projects/Georg_animal_feces/bin/llg/.snakemake/conda/6db1e2f9/lib/barrnap/bin/../db/bac.hmm
[barrnap] Scanning /ebio/abt3_scratch/nyoungblut/LLG_8797531528/genomes/X361_fail_Common_Opossum__maxbin2__High.001.fna for bac rRNA genes... please wait
[barrnap] Command: nhmmer --cpu 4 -E 1e-06 --w_length 3878 -o /dev/null --tblout /dev/stdout '/ebio/abt3_projects/Georg_animal_feces/bin/llg/.snakemake/conda/6db1e2f9/lib/barrnap/bin/../db/bac.hmm' '/ebio/abt3_scratch/nyoungblut/LLG_8797531528/genomes/X361_fail_Common_Opossum__maxbin2__High.001.fna'
[barrnap] ERROR: nhmmer failed to run - # Target file: /ebio/abt3_scratch/nyoungblut/LLG_8797531528/genomes/X361_fail_Common_Opossum__maxbin2__High.001.fna
However, when I activate that conda env and run nhmmer myself, the run completes successfully:
$ nhmmer --cpu 4 -E 1e-06 --w_length 3878 -o /dev/null --tblout /dev/stdout '/ebio/abt3_projects/Georg_animal_feces/bin/llg/.snakemake/conda/6db1e2f9/lib/barrnap/bin/../db/bac.hmm' '/ebio/abt3_scratch/nyoungblut/LLG_8797531528/genomes/X361_fail_Common_Opossum__maxbin2__High.001.fna' || echo "ERROR!"
# target name accession query name accession hmmfrom hmm to alifrom ali to envfrom env to sq len strand E-value score bias description of target
#------------------- ---------- -------------------- ---------- ------- ------- ------- ------- ------- ------- ------- ------ --------- ------ ----- ---------------------
#
# Program: nhmmer
# Version: 3.1b2 (February 2015)
# Pipeline mode: SEARCH
# Query file: /ebio/abt3_projects/Georg_animal_feces/bin/llg/.snakemake/conda/6db1e2f9/lib/barrnap/bin/../db/bac.hmm
# Target file: /ebio/abt3_scratch/nyoungblut/LLG_8797531528/genomes/X361_fail_Common_Opossum__maxbin2__High.001.fna
# Option settings: nhmmer -o /dev/null --tblout /dev/stdout -E 1e-06 --w_length 3878 --cpu 4 /ebio/abt3_projects/Georg_animal_feces/bin/llg/.snakemake/conda/6db1e2f9/lib/barrnap/bin/../db/bac.hmm /ebio/abt3_scratch/nyoungblut/LLG_8797531528/genomes/X361_fail_Common_Opossum__maxbin2__High.001.fna
# Current dir: /ebio/abt3_projects/Georg_animal_feces/bin/llmga
# Date: Fri Jan 29 09:11:22 2021
# [ok]
The input genome fasta file contains 123 contigs, and it is a valid fasta file.
My conda env:
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
_r-mutex 1.0.1 anacondar_1 conda-forge
alsa-lib 1.1.5 h516909a_1002 conda-forge
arb-bio-tools 6.0.6 haa8b8d8_8 bioconda
attrs 19.3.0 py_0 conda-forge
backcall 0.1.0 py_0 conda-forge
barrnap 0.9 2 bioconda
bedtools 2.29.2 hc088bd4_0 bioconda
bibtexparser 1.1.0 py_0 conda-forge
binutils_impl_linux-64 2.33.1 h53a641e_8 conda-forge
binutils_linux-64 2.33.1 h9595d00_17 conda-forge
bioconductor-biobase 2.42.0 r351h14c3975_1 bioconda
bioconductor-biocgenerics 0.28.0 r351_1 bioconda
bioconductor-biocparallel 1.16.6 r351h1c2f66e_0 bioconda
bioconductor-biostrings 2.50.2 r351h14c3975_0 bioconda
bioconductor-dada2 1.10.0 r351hf484d3e_0 bioconda
bioconductor-delayedarray 0.8.0 r351h14c3975_0 bioconda
bioconductor-genomeinfodb 1.18.1 r351_0 bioconda
bioconductor-genomeinfodbdata 1.2.1 r351_0 bioconda
bioconductor-genomicalignments 1.18.1 r351h14c3975_0 bioconda
bioconductor-genomicranges 1.34.0 r351h14c3975_0 bioconda
bioconductor-iranges 2.16.0 r351h14c3975_0 bioconda
bioconductor-rsamtools 1.34.0 r351hf484d3e_0 bioconda
bioconductor-s4vectors 0.20.1 r351h14c3975_0 bioconda
bioconductor-shortread 1.40.0 r351hf484d3e_0 bioconda
bioconductor-summarizedexperiment 1.12.0 r351_0 bioconda
bioconductor-xvector 0.22.0 r351h14c3975_0 bioconda
bioconductor-zlibbioc 1.28.0 r351h14c3975_0 bioconda
biom-format 2.1.8 py36ha112f06_1 conda-forge
blas 2.14 openblas conda-forge
blast 2.9.0 pl526h3066fca_4 bioconda
bleach 3.1.1 py_0 conda-forge
bokeh 1.4.0 py36h9f0ad1d_1 conda-forge
boost 1.68.0 py36h8619c78_1001 conda-forge
boost-cpp 1.68.0 h11c811c_1000 conda-forge
bwidget 1.9.14 0 conda-forge
bzip2 1.0.8 h516909a_3 conda-forge
ca-certificates 2019.11.28 hecc5488_0 conda-forge
cachecontrol 0.12.5 py_0 conda-forge
cairo 1.16.0 h18b612c_1001 conda-forge
certifi 2019.11.28 py36h9f0ad1d_1 conda-forge
cffi 1.13.2 py36h8022711_0 conda-forge
chardet 3.0.4 py36h9880bd3_1008 conda-forge
click 7.0 py_0 conda-forge
cryptography 2.8 py36h45558ae_2 conda-forge
curl 7.68.0 hf8cf82a_0 conda-forge
cutadapt 2.8 py36h516909a_0 bioconda
cycler 0.10.0 py36_0 conda-forge
cython 0.29.15 py36h831f99a_1 conda-forge
dbus 1.13.6 he372182_0 conda-forge
deblur 1.1.0 py36_0 bioconda
decorator 4.4.1 py_0 conda-forge
defusedxml 0.6.0 py_0 conda-forge
dendropy 4.4.0 pyh864c0ab_2 bioconda
dnaio 0.4.1 py36h516909a_0 bioconda
emperor 1.0.0 py36_0 conda-forge
entrez-direct 13.9 pl526h375a9b1_0 bioconda
entrypoints 0.3 py36h9f0ad1d_1002 conda-forge
expat 2.2.9 he1b5a44_2 conda-forge
fastcluster 1.1.26 py36h7c3b610_2 conda-forge
fasttree 2.1.10 0 bioconda
fontconfig 2.13.1 he4413a7_1000 conda-forge
freetype 2.10.0 he06d7ca_2 conda-forge
future 0.18.2 py36h5fab9bb_3 conda-forge
gcc_impl_linux-64 7.3.0 hd420e75_5 conda-forge
gcc_linux-64 7.3.0 h553295d_17 conda-forge
gettext 0.19.8.1 hf34092f_1004 conda-forge
gfortran_impl_linux-64 7.3.0 hdf63c60_5 conda-forge
gfortran_linux-64 7.3.0 h553295d_17 conda-forge
giflib 5.2.1 h516909a_2 conda-forge
glib 2.58.3 py36hd3ed26a_1004 conda-forge
gmp 6.2.0 h58526e2_4 conda-forge
gneiss 0.4.6 py_0 bioconda
gnutls 3.6.5 hd3a4fd2_1002 conda-forge
graphite2 1.3.13 he1b5a44_1001 conda-forge
gsl 2.5 h294904e_1 conda-forge
gst-plugins-base 1.14.5 h0935bb2_2 conda-forge
gstreamer 1.14.5 h36ae1b5_2 conda-forge
gxx_impl_linux-64 7.3.0 hdf63c60_5 conda-forge
gxx_linux-64 7.3.0 h553295d_17 conda-forge
h5py 2.10.0 nompi_py36h513d04c_102 conda-forge
harfbuzz 2.4.0 h37c48d4_1 conda-forge
hdf5 1.10.5 nompi_h3c11f04_1104 conda-forge
hdmedians 0.13 py36h785e9b2_1002 conda-forge
hmmer 3.1b2 3 bioconda
icu 58.2 hf484d3e_1000 conda-forge
idna 2.9 py36_0 conda-forge
ijson 2.6.1 py_0 conda-forge
importlib_metadata 1.5.0 py36_0 conda-forge
ipykernel 5.1.4 py36h5ca1d4c_0 conda-forge
ipython 7.12.0 py36h5ca1d4c_0 conda-forge
ipython_genutils 0.2.0 py36_0 conda-forge
ipywidgets 7.5.1 pyh9f0ad1d_1 conda-forge
iqtree 1.6.12 he513fc3_1 bioconda
jedi 0.16.0 py36h9f0ad1d_1 conda-forge
jinja2 2.11.1 py_0 conda-forge
joblib 0.14.1 pyh9f0ad1d_0 conda-forge
jpeg 9c h14c3975_1001 conda-forge
jsonschema 3.2.0 py36h9f0ad1d_1 conda-forge
jupyter_client 6.0.0 py_0 conda-forge
jupyter_core 4.6.3 py36h9f0ad1d_2 conda-forge
kiwisolver 1.1.0 py36hdb11119_1 conda-forge
krb5 1.16.4 h2fd8d38_0 conda-forge
lcms2 2.9 hbd6801e_2 conda-forge
ld_impl_linux-64 2.33.1 h53a641e_8 conda-forge
libarbdb 6.0.6 haa8b8d8_8 bioconda
libblas 3.8.0 14_openblas conda-forge
libcblas 3.8.0 14_openblas conda-forge
libcurl 7.68.0 hda55be3_0 conda-forge
libedit 3.1.20170329 hf8c457e_1001 conda-forge
libffi 3.2.1 he1b5a44_1007 conda-forge
libgcc 7.2.0 h69d50b8_2 conda-forge
libgcc-ng 9.2.0 h24d8f2e_2 conda-forge
libgfortran-ng 7.3.0 hdf63c60_5 conda-forge
libgomp 9.2.0 h24d8f2e_2 conda-forge
libiconv 1.15 h516909a_1006 conda-forge
liblapack 3.8.0 14_openblas conda-forge
liblapacke 3.8.0 14_openblas conda-forge
libopenblas 0.3.7 h5ec1e0e_6 conda-forge
libpng 1.6.37 hed695b0_2 conda-forge
libsodium 1.0.17 h516909a_0 conda-forge
libssh2 1.8.2 h22169c7_2 conda-forge
libstdcxx-ng 9.2.0 hdf63c60_2 conda-forge
libtiff 4.1.0 hc7e4089_6 conda-forge
libuuid 2.32.1 h14c3975_1000 conda-forge
libwebp-base 1.1.0 h516909a_3 conda-forge
libxcb 1.13 h14c3975_1002 conda-forge
libxml2 2.9.9 h13577e0_2 conda-forge
lockfile 0.12.2 py36_0 conda-forge
lz4-c 1.8.3 hf484d3e_1001 conda-forge
mafft 7.310 he1b5a44_3 bioconda
make 4.3 hd18ef5c_1 conda-forge
markupsafe 1.1.1 py36he6145b8_2 conda-forge
matplotlib 3.1.1 py36_0 conda-forge
matplotlib-base 3.1.1 py36hfd891ef_0 conda-forge
mistune 0.8.4 py36h8c4c3a4_1002 conda-forge
more-itertools 8.2.0 py_1 conda-forge
msgpack-python 1.0.0 py36hdb11119_2 conda-forge
natsort 7.0.1 py_0 conda-forge
nbconvert 5.6.1 py36h9f0ad1d_1 conda-forge
nbformat 5.0.4 py_0 conda-forge
ncurses 6.1 hf484d3e_1002 conda-forge
nettle 3.4.1 h14c3975_1002 conda-forge
networkx 2.4 py_1 conda-forge
nose 1.3.7 py36h9f0ad1d_1004 conda-forge
notebook 6.0.3 py36h9f0ad1d_1 conda-forge
numpy 1.18.1 py36h7314795_1 conda-forge
olefile 0.46 pyh9f0ad1d_1 conda-forge
openjdk 11.0.1 h600c080_1018 conda-forge
openssl 1.1.1d h516909a_0 conda-forge
packaging 20.1 py_0 conda-forge
pandas 0.25.3 py36hb3f55d8_0 conda-forge
pandoc 2.9.2.1 0 conda-forge
pandocfilters 1.4.2 py36_0 conda-forge
pango 1.40.14 he7ab937_1005 conda-forge
parso 0.6.1 py_0 conda-forge
patsy 0.5.1 py_0 conda-forge
pcre 8.44 he1b5a44_0 conda-forge
perl 5.26.2 h36c2ea0_1008 conda-forge
perl-app-cpanminus 1.7044 pl526_1 bioconda
perl-archive-tar 2.32 pl526_0 bioconda
perl-base 2.23 pl526_1 bioconda
perl-business-isbn 3.004 pl526_0 bioconda
perl-business-isbn-data 20140910.003 pl526_0 bioconda
perl-carp 1.38 pl526_3 bioconda
perl-common-sense 3.74 pl526_2 bioconda
perl-compress-raw-bzip2 2.087 pl526he1b5a44_0 bioconda
perl-compress-raw-zlib 2.087 pl526hc9558a2_0 bioconda
perl-constant 1.33 pl526_1 bioconda
perl-data-dumper 2.173 pl526_0 bioconda
perl-digest-hmac 1.03 pl526_3 bioconda
perl-digest-md5 2.55 pl526_0 bioconda
perl-encode 2.88 pl526_1 bioconda
perl-encode-locale 1.05 pl526_6 bioconda
perl-exporter 5.72 pl526_1 bioconda
perl-exporter-tiny 1.002001 pl526_0 bioconda
perl-extutils-makemaker 7.36 pl526_1 bioconda
perl-file-listing 6.04 pl526_1 bioconda
perl-file-path 2.16 pl526_0 bioconda
perl-file-temp 0.2304 pl526_2 bioconda
perl-html-parser 3.72 pl526h6bb024c_5 bioconda
perl-html-tagset 3.20 pl526_3 bioconda
perl-html-tree 5.07 pl526_1 bioconda
perl-http-cookies 6.04 pl526_0 bioconda
perl-http-daemon 6.01 pl526_1 bioconda
perl-http-date 6.02 pl526_3 bioconda
perl-http-message 6.18 pl526_0 bioconda
perl-http-negotiate 6.01 pl526_3 bioconda
perl-io-compress 2.087 pl526he1b5a44_0 bioconda
perl-io-html 1.001 pl526_2 bioconda
perl-io-socket-ssl 2.066 pl526_0 bioconda
perl-io-zlib 1.10 pl526_2 bioconda
perl-json 4.02 pl526_0 bioconda
perl-json-xs 2.34 pl526h6bb024c_3 bioconda
perl-libwww-perl 6.39 pl526_0 bioconda
perl-list-moreutils 0.428 pl526_1 bioconda
perl-list-moreutils-xs 0.428 pl526_0 bioconda
perl-lwp-mediatypes 6.04 pl526_0 bioconda
perl-lwp-protocol-https 6.07 pl526_4 bioconda
perl-mime-base64 3.15 pl526_1 bioconda
perl-mozilla-ca 20180117 pl526_1 bioconda
perl-net-http 6.19 pl526_0 bioconda
perl-net-ssleay 1.88 pl526h90d6eec_0 bioconda
perl-ntlm 1.09 pl526_4 bioconda
perl-parent 0.236 pl526_1 bioconda
perl-pathtools 3.75 pl526h14c3975_1 bioconda
perl-scalar-list-utils 1.52 pl526h516909a_0 bioconda
perl-socket 2.027 pl526_1 bioconda
perl-storable 3.15 pl526h14c3975_0 bioconda
perl-test-requiresinternet 0.05 pl526_0 bioconda
perl-time-local 1.28 pl526_1 bioconda
perl-try-tiny 0.30 pl526_1 bioconda
perl-types-serialiser 1.0 pl526_2 bioconda
perl-uri 1.76 pl526_0 bioconda
perl-www-robotrules 6.02 pl526_3 bioconda
perl-xml-namespacesupport 1.12 pl526_0 bioconda
perl-xml-parser 2.44_01 pl526ha1d75be_1002 conda-forge
perl-xml-sax 1.02 pl526_0 bioconda
perl-xml-sax-base 1.09 pl526_0 bioconda
perl-xml-sax-expat 0.51 pl526_3 bioconda
perl-xml-simple 2.25 pl526_1 bioconda
perl-xsloader 0.24 pl526_0 bioconda
pexpect 4.8.0 py36h9f0ad1d_1 conda-forge
pickleshare 0.7.5 py36h9f0ad1d_1002 conda-forge
pigz 2.3.4 hed695b0_1 conda-forge
pillow 7.0.0 py36h8328e55_1 conda-forge
pip 20.0.2 py36_1 conda-forge
pixman 0.38.0 h516909a_1003 conda-forge
pluggy 0.12.0 py_0 conda-forge
prometheus_client 0.7.1 py_0 conda-forge
prompt_toolkit 3.0.3 py_0 conda-forge
psutil 5.7.0 py36h8c4c3a4_1 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
ptyprocess 0.6.0 py36_1000 conda-forge
py 1.8.1 py_0 conda-forge
pycparser 2.19 py36_1 conda-forge
pygments 2.5.2 py_0 conda-forge
pyopenssl 19.1.0 py36_0 conda-forge
pyparsing 2.4.6 py_0 conda-forge
pyqt 5.9.2 py36hcca6a23_4 conda-forge
pyrsistent 0.15.7 py36h8c4c3a4_1 conda-forge
pysocks 1.7.1 py36h5fab9bb_3 conda-forge
pytest 5.3.5 py36h9f0ad1d_2 conda-forge
python 3.6.7 h357f687_1008_cpython conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python_abi 3.6 1_cp36m conda-forge
pytz 2019.3 py_0 conda-forge
pyyaml 5.3.1 py36h8c4c3a4_0 conda-forge
pyzmq 19.0.0 py36h9947dbf_1 conda-forge
q2-alignment 2020.2.0 py36_0 qiime2/label/r2020.2
q2-composition 2020.2.0 py36_0 qiime2/label/r2020.2
q2-cutadapt 2020.2.0 py36_0 qiime2/label/r2020.2
q2-dada2 2020.2.0 py36_0 qiime2/label/r2020.2
q2-deblur 2020.2.0 py36_0 qiime2/label/r2020.2
q2-demux 2020.2.0 py36_0 qiime2/label/r2020.2
q2-diversity 2020.2.0 py36_0 qiime2/label/r2020.2
q2-emperor 2020.2.0 py36_0 qiime2/label/r2020.2
q2-feature-classifier 2020.2.0 py36_0 qiime2/label/r2020.2
q2-feature-table 2020.2.0 py36_0 qiime2/label/r2020.2
q2-fragment-insertion 2020.2.0 py36_0 qiime2/label/r2020.2
q2-gneiss 2020.2.0 py36_0 qiime2/label/r2020.2
q2-longitudinal 2020.2.0 py36_0 qiime2/label/r2020.2
q2-metadata 2020.2.0 py36_0 qiime2/label/r2020.2
q2-phylogeny 2020.2.0 py36_0 qiime2/label/r2020.2
q2-quality-control 2020.2.0 py36_0 qiime2/label/r2020.2
q2-quality-filter 2020.2.0 py36_0 qiime2/label/r2020.2
q2-sample-classifier 2020.2.0 py36_0 qiime2/label/r2020.2
q2-taxa 2020.2.0 py36_0 qiime2/label/r2020.2
q2-types 2020.2.0 py36_0 qiime2/label/r2020.2
q2-vsearch 2020.2.0 py36_0 qiime2/label/r2020.2
q2cli 2020.2.0 py36_0 qiime2/label/r2020.2
q2templates 2020.2.0 py36_0 qiime2/label/r2020.2
qiime2 2020.2.0 py36_0 qiime2/label/r2020.2
qt 5.9.7 h52cfd70_2 conda-forge
r-assertthat 0.2.1 r35h6115d3f_1 conda-forge
r-backports 1.1.5 r35hcdcec82_0 conda-forge
r-base 3.5.1 h08e1455_1008 conda-forge
r-bh 1.72.0_3 r35h6115d3f_0 conda-forge
r-bitops 1.0_6 r35hcdcec82_1003 conda-forge
r-cli 2.0.2 r35h6115d3f_0 conda-forge
r-cluster 2.1.0 r35h9bbef5b_2 conda-forge
r-colorspace 1.4_1 r35hcdcec82_1 conda-forge
r-crayon 1.3.4 r351h6115d3f_1 conda-forge
r-data.table 1.12.6 r35hcdcec82_0 conda-forge
r-digest 0.6.25 r35h0357c0b_1 conda-forge
r-ellipsis 0.3.0 r35hcdcec82_0 conda-forge
r-fansi 0.4.1 r35hcdcec82_0 conda-forge
r-farver 2.0.3 r35h0357c0b_0 conda-forge
r-formatr 1.7 r35h6115d3f_1 conda-forge
r-futile.logger 1.4.3 r351h6115d3f_1 conda-forge
r-futile.options 1.0.1 r351h6115d3f_0 conda-forge
r-ggplot2 3.2.1 r35h6115d3f_0 conda-forge
r-glue 1.3.1 r35hcdcec82_1 conda-forge
r-gtable 0.3.0 r35h6115d3f_2 conda-forge
r-hwriter 1.3.2 r351h6115d3f_1 conda-forge
r-labeling 0.3 r351h6115d3f_1 conda-forge
r-lambda.r 1.2.4 r35h6115d3f_0 conda-forge
r-lattice 0.20_40 r35hcdcec82_0 conda-forge
r-latticeextra 0.6_28 r351h6115d3f_1 conda-forge
r-lazyeval 0.2.2 r35hcdcec82_1 conda-forge
r-lifecycle 0.1.0 r35h6115d3f_0 conda-forge
r-magrittr 1.5 r351h6115d3f_1 conda-forge
r-mass 7.3_51.5 r35hcdcec82_0 conda-forge
r-matrix 1.2_18 r35h7fa42b6_2 conda-forge
r-matrixstats 0.55.0 r35hcdcec82_0 conda-forge
r-mgcv 1.8_31 r35hcdcec82_0 conda-forge
r-munsell 0.5.0 r351h6115d3f_1 conda-forge
r-nlme 3.1_144 r35h9bbef5b_0 conda-forge
r-permute 0.9_5 r35h6115d3f_2 conda-forge
r-pillar 1.4.3 r35h6115d3f_0 conda-forge
r-pkgconfig 2.0.3 r35h6115d3f_0 conda-forge
r-plyr 1.8.5 r35h0357c0b_0 conda-forge
r-r6 2.4.1 r35h6115d3f_0 conda-forge
r-rcolorbrewer 1.1_2 r351h6115d3f_1 conda-forge
r-rcpp 1.0.3 r35h0357c0b_0 conda-forge
r-rcppparallel 4.4.4 r35h0357c0b_0 conda-forge
r-rcurl 1.98_1.1 r35hcdcec82_0 conda-forge
r-reshape2 1.4.3 r35h0357c0b_1004 conda-forge
r-rlang 0.4.4 r35hcdcec82_0 conda-forge
r-scales 1.1.0 r35h6115d3f_0 conda-forge
r-snow 0.4_3 r351h6115d3f_0 conda-forge
r-stringi 1.4.3 r35h0357c0b_2 conda-forge
r-stringr 1.4.0 r35h6115d3f_1 conda-forge
r-tibble 2.1.3 r35hcdcec82_1 conda-forge
r-utf8 1.1.4 r35hcdcec82_1002 conda-forge
r-vctrs 0.2.3 r35hcdcec82_0 conda-forge
r-vegan 2.5_6 r35hbf399a0_1 conda-forge
r-viridislite 0.3.0 r351h6115d3f_1 conda-forge
r-withr 2.1.2 r351h6115d3f_0 conda-forge
r-zeallot 0.1.0 r35h6115d3f_1001 conda-forge
raxml 8.2.12 h516909a_2 bioconda
readline 8.0 h46ee950_1 conda-forge
requests 2.23.0 py36h9f0ad1d_1 conda-forge
scikit-bio 0.5.5 py36h3010b51_1000 conda-forge
scikit-learn 0.22.1 py36hcdab131_1 conda-forge
scipy 1.4.1 py36h2d22cac_3 conda-forge
seaborn 0.10.0 py_1 conda-forge
send2trash 1.5.0 py_0 conda-forge
sepp 4.3.10 py36heb1dbbb_2 bioconda
setuptools 45.2.0 py36_0 conda-forge
sina 1.6.0 hc7f9b0f_0 bioconda
sip 4.19.8 py36hf484d3e_1000 conda-forge
six 1.14.0 py36_0 conda-forge
sortmerna 2.0 he860b03_4 bioconda
sqlite 3.30.1 hcee41ef_0 conda-forge
statsmodels 0.11.1 py36h8c4c3a4_2 conda-forge
tbb 2019.9 hc9558a2_1 conda-forge
terminado 0.8.3 py36h9f0ad1d_1 conda-forge
testpath 0.4.4 py_0 conda-forge
tk 8.6.10 hed695b0_1 conda-forge
tktable 2.10 hb7b940f_3 conda-forge
tornado 6.0.3 py36h516909a_4 conda-forge
traitlets 4.3.3 py36h9f0ad1d_1 conda-forge
tzlocal 2.0.0 py_0 conda-forge
unifrac 0.10.0 py36h6bb024c_1 bioconda
urllib3 1.25.7 py36h9f0ad1d_1 conda-forge
vsearch 2.7.0 1 bioconda
wcwidth 0.1.8 pyh9f0ad1d_1 conda-forge
webencodings 0.5.1 py_1 conda-forge
wheel 0.34.2 py36_0 conda-forge
widgetsnbextension 3.5.1 py36h9f0ad1d_4 conda-forge
xopen 0.8.4 py36h9f0ad1d_1 conda-forge
xorg-fixesproto 5.0 h14c3975_1002 conda-forge
xorg-inputproto 2.3.2 h14c3975_1002 conda-forge
xorg-kbproto 1.0.7 h14c3975_1002 conda-forge
xorg-libice 1.0.10 h516909a_0 conda-forge
xorg-libsm 1.2.3 h84519dc_1000 conda-forge
xorg-libx11 1.6.9 h516909a_0 conda-forge
xorg-libxau 1.0.9 h14c3975_0 conda-forge
xorg-libxdmcp 1.1.3 h516909a_0 conda-forge
xorg-libxext 1.3.4 h516909a_0 conda-forge
xorg-libxfixes 5.0.3 h516909a_1004 conda-forge
xorg-libxi 1.7.10 h516909a_0 conda-forge
xorg-libxrender 0.9.10 h516909a_1002 conda-forge
xorg-libxtst 1.2.3 h516909a_1002 conda-forge
xorg-recordproto 1.14.2 h516909a_1002 conda-forge
xorg-renderproto 0.11.1 h14c3975_1002 conda-forge
xorg-xextproto 7.3.0 h14c3975_1002 conda-forge
xorg-xproto 7.0.31 h14c3975_1007 conda-forge
xz 5.2.4 h516909a_1002 conda-forge
yaml 0.2.2 h516909a_1 conda-forge
zeromq 4.3.2 he1b5a44_2 conda-forge
zipp 3.0.0 py_0 conda-forge
zlib 1.2.11 h516909a_1010 conda-forge
zstd 1.4.4 h3b9ef0a_2 conda-forge
Hello
I am trying to run barrnap to identify rRNA in a eukaryotic genome, with the following command:
barrnap --kingdom euk --threads 20 --outseq rRNA.fasta < chr1.fasta
After running it, we got the following error. Can you suggest how to solve this problem? Thanks!
[barrnap] This is barrnap 0.9
[barrnap] Written by Torsten Seemann
[barrnap] Obtained from https://github.com/tseemann/barrnap
[barrnap] Detected operating system: linux
[barrnap] Adding /miniconda3/lib/barrnap/bin/../binaries/linux to end of PATH
[barrnap] Checking for dependencies:
[barrnap] Found nhmmer - /miniconda3/bin/nhmmer
[barrnap] Found bedtools -/miniconda3/bin/bedtools
[barrnap] Will use 20 threads
[barrnap] Setting evalue cutoff to 1e-06
[barrnap] Will tag genes < 0.8 of expected length.
[barrnap] Will reject genes < 0.25 of expected length.
[barrnap] Using database: /miniconda3/lib/barrnap/bin/../db/euk.hmm
[barrnap] Scanning chr1.fasta for euk rRNA genes... please wait
[barrnap] Command: nhmmer --cpu 20 -E 1e-06 --w_length 3878 -o /dev/null --tblout /dev/stdout '/miniconda3/lib/barrnap/bin/../db/euk.hmm' 'chr1.fasta'
[barrnap] ERROR: nhmmer failed to run - Error: Invalid alphabet type in target for nhmmer. Expect DNA or RNA.
I am sure there are no other alphabets in the fasta sequence except A/T/C/G.
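One way to double-check the claim above is to count every character in the sequence lines that is not an IUPAC nucleotide code. The sketch below is only a diagnostic aid and does not explain the nhmmer error by itself; the function name `unexpected_characters` and the choice of accepted alphabet are assumptions made for illustration.

```python
from collections import Counter

# Upper-case IUPAC nucleotide codes. Treating exactly this set as "valid"
# is an assumption of this sketch, not something taken from nhmmer.
IUPAC_NUCLEOTIDES = set("ACGTUNRYSWKMBDHV")


def unexpected_characters(fasta_lines):
    """Count characters in sequence lines that are not IUPAC nucleotide codes."""
    counts = Counter()
    for line in fasta_lines:
        line = line.rstrip("\n")
        # Skip empty lines and header lines.
        if not line or line.startswith(">"):
            continue
        for ch in line.upper():
            if ch not in IUPAC_NUCLEOTIDES:
                counts[ch] += 1
    return counts
```

Passing an open file handle for the input FASTA (for example chr1.fasta) returns a counter that is empty when every sequence character is a nucleotide code; gap characters, protein letters, or stray whitespace would show up as keys.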
Torsten-
An idea for enhancement: would be great to have an option for barrnap to write fasta file(s) containing just the detected rRNA sequences, i.e. to slice the input contigs file at coordinates reported in the .gff output - would facilitate much faster BLAST...just a thought.
Otherwise very handy tool, thanks for maintaining! :)
Jon
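The slicing described in this request can be sketched outside barrnap in a few lines. This is a hypothetical helper, not barrnap's own implementation: the function names are invented, the GFF is assumed to be tab-separated with a Name attribute in column 9, and the output IDs use the 1-based GFF coordinates as they appear in the GFF file.

```python
def reverse_complement(seq):
    """Reverse-complement A/C/G/T; other symbols (e.g. N) are left unchanged."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def read_fasta(lines):
    """Read FASTA lines into a dict mapping sequence ID to sequence."""
    chunks = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # ID is the first word of the header
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def slice_features(contigs, gff_lines):
    """Yield (id, sequence) for each GFF feature, cut from the contigs."""
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        # GFF is 1-based and inclusive; Python slices are 0-based, end-exclusive.
        sub = contigs[seqid][start - 1:end]
        if strand == "-":
            sub = reverse_complement(sub)
        yield f"{attrs.get('Name', 'feature')}::{seqid}:{start}-{end}({strand})", sub
```

Writing each pair back out as `>id` followed by the sequence gives a small FASTA file that can be passed to BLAST.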
The "Source" instructions for cloning the repo omit the 'bin' directory before running './barrnap --help'. If you follow the instructions literally, you end up in the directory above 'bin', so the command will not run (unless the barrnap bin directory is already on your PATH).
Needed for the -name+ option.
Hi Torsten,
Would it make sense to concatenate all db/*.hmm files to allow for a "meta" search?
Rationale: I have some eukaryotic RNA-Seq data, which contain rRNA from the cell as well as from the mitochondrion. I would very much like not to run barrnap twice (or more times for metagenomics) and then weed out the most appropriate matches by hand (i.e., by script).
Best,
B.
Thanks!
I'm interested in building a plant mitochondrial database. Its rRNAs are 5S, 18S and 26S.
See #2
Barrnap creates an index file of the query sequence before searching for rRNAs.
If this index already exists, it skips creating it.
This can lead to problems if the query file has been edited since the last barrnap run.
This is especially a problem when integrating the tool into custom pipelines, where one would avoid deleting problematic files based on their names alone, since such files may exist for other reasons. Therefore it would be best if each tool could just "clean up behind itself"...
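Until the tool cleans up after itself, a pipeline can guard against a stale index by comparing modification times before each run. This is a minimal sketch that assumes the index sits next to the input as `<input>.fai`; the function name `index_is_stale` is hypothetical, and comparing timestamps is only a heuristic.

```python
import os


def index_is_stale(fasta_path, index_path=None):
    """Return True if the index is missing or older than the FASTA file."""
    if index_path is None:
        # Assumed naming convention: the index is the FASTA path plus ".fai".
        index_path = fasta_path + ".fai"
    if not os.path.exists(index_path):
        return True
    return os.path.getmtime(index_path) < os.path.getmtime(fasta_path)
```

A pipeline step could delete the index only when `index_is_stale(path)` is true and the file is known to have been created by an earlier barrnap run, which avoids removing files by name alone.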
Dear Torsten Seemann,
I am an enthusiastic user of the barrnap tool. I work with both plant-parasitic prokaryotes and eukaryotes. I noticed that when annotating eukaryotic rDNA sequences, the 18S and 5.8S predictions are accurate, but the 28S gene is consistently predicted to start just before the 5.8S gene (see image attached). As a result, the ITS2 and the correct start of the 28S gene have to be determined manually. I can imagine the prediction is difficult, as the starts of 5.8S and 28S have high sequence similarity and share some key conserved sequences.
Is there some way to overcome this inconvenience?
Should be 0
Harald Gruber-Vodicka:
You forgot to add the EUK 18S to the euk profile collection.
Hi Torsten,
the manual now says: "echo "PATH=$PATH:$HOME/barnapp-0.x/bin" >> .bashrc"
The folder name is misspelled, it should be barrnap :)
This would make it easier to pipe gzipped data to barrnap
I was always wondering (also for rnammer and similar tools) why there always seems to be a strict rule to only search for archaeal OR bacterial rRNAs, but never for both at the same time?
I would guess that in most settings, whether it is to help identify an isolate, a SAG or a MAG, there are few actual use cases for searching only for bacterial OR archaeal rRNA sequences. This is especially true since the HMM models of both kingdoms do seem to overlap quite a bit, and most people would align the obtained sequence to a reference database anyway for an exact classification.
Wouldn't it perhaps be more practical to perform a combined search based on both bacterial AND archaeal models in one go, e.g. by simply concatenating the bacterial and archaeal models? Then each detected sequence could either be assigned to the model (bacterial or archaeal) which yielded the highest score, or the exact classification could be left entirely to downstream BLAST analyses, which the user will most probably perform anyway.
Could this be relatively easy to implement (e.g. just combining the HMM files for bacteria and archaea), or am I missing something fundamental here?
Hi Torsten. We met a few times during my time at La Trobe a few years back. I'm now at Plant and Food Research. We are interested in an open-source alternative to rnammer and would be prepared to do some work to extend barrnap so that it could be used for fungal genomes. Is barrnap still under development, and can we contribute?
Much appreciated
Dan Jones
Barrnap v0.9 produces GFF output and (optionally) FASTA output. The FASTA output has the coordinates of each rRNA prediction in the header, but not the e-value of that prediction. The GFF output also has the e-value.
I now noticed that the start positions given in the fasta headers differ from the start positions given in the gff-output (usually by a value of 1).
For me this is a bit of a problem, because in order to catch any possible variation of rRNA genes in metagenomic bins, I am running barrnap runs for all three kingdoms (bac, arc & euk) consecutively and then try to identify overlapping hits and keep only the highest scoring (i.e. lowest evalue) hit for each overlapping possibility.
This means I have to compare the gff output (in order to get the evalues) with the fasta-headers.
Is this difference perhaps a bug or is it due to some special gff-specifications?
Can I safely assume that it is off by exactly 1 in ALL cases in order to correct for this difference, or could it be a bit more problematic?
Alternatively, it would be most helpful to either add the corresponding fasta seqid to the gff-output, or the evalue to the fasta-header.
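The "keep only the best hit per overlapping locus" step described above can be done on the GFF files alone, which sidesteps the FASTA headers. The following is a sketch under stated assumptions: column 6 of the GFF is taken to hold the e-value, hits are only compared on the same contig and strand, and the function names and the greedy strategy are illustrative choices rather than anything barrnap provides.

```python
def parse_gff(lines, kingdom):
    """Parse tab-separated GFF lines into dicts, tagging each with its run label."""
    hits = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        hits.append({
            "seqid": cols[0],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "evalue": float(cols[5]),  # assumption: column 6 is the e-value
            "strand": cols[6],
            "attributes": cols[8],
            "kingdom": kingdom,
        })
    return hits


def best_non_overlapping(hits):
    """Greedily keep the lowest e-value hit among overlapping candidates."""
    kept = []
    for hit in sorted(hits, key=lambda h: h["evalue"]):
        clashes = any(
            k["seqid"] == hit["seqid"]
            and k["strand"] == hit["strand"]
            and hit["start"] <= k["end"]
            and k["start"] <= hit["end"]
            for k in kept
        )
        if not clashes:
            kept.append(hit)
    return sorted(kept, key=lambda h: (h["seqid"], h["start"]))
```

Calling `parse_gff` once per run (with labels such as "bac", "arc" and "euk"), concatenating the lists, and passing the result to `best_non_overlapping` leaves one record per locus. When e-values tie, the hit that appears first in the input is kept.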
I’m running barrnap on a fasta of bins with default settings. The resulting GFF contains some instances where sequential sequences (or sequences with a few bases overlapping) are being called as two partial sequences rather than one more complete sequence. Is this something you have seen before or is there a way to tell if this is an error or maybe two adjacent 16S copies?
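Candidate split calls of this kind can at least be listed automatically for manual inspection. The sketch below flags neighbouring hits of the same rRNA type on the same contig and strand that overlap or lie within a small gap; the function name and the `max_gap` threshold of 10 bases are arbitrary assumptions, and the check cannot tell a fragmented gene from two genuinely adjacent copies.

```python
def adjacent_pairs(hits, max_gap=10):
    """Return neighbouring hit pairs that overlap or are at most max_gap apart.

    Each hit is a tuple (seqid, start, end, strand, name) with 1-based,
    inclusive coordinates as in the GFF output.
    """
    ordered = sorted(hits)  # sorts by seqid, then start, then end
    pairs = []
    for first, second in zip(ordered, ordered[1:]):
        same_locus_type = (
            first[0] == second[0]       # same contig
            and first[3] == second[3]   # same strand
            and first[4] == second[4]   # same rRNA type
        )
        gap = second[1] - first[2] - 1  # negative when the hits overlap
        if same_locus_type and gap <= max_gap:
            pairs.append((first, second))
    return pairs
```

Only directly neighbouring records are compared, so a hit of a different type lying between two partial hits would hide the pair.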
Hello
barrnap/0.9 comes with a bundled nhmmer from hmmer/3.3.1b1.
Can we use an external nhmmer from version 3.3 instead?
regards
Eric
Hi Torsten,
Small thing - from the doco
Usage:
barrnap [options] chr.fa
But
(barrnap-0.9) $ barrnap a.fna
[barrnap] This is barrnap 0.9
[barrnap] Written by Torsten Seemann
[barrnap] Obtained from https://github.com/tseemann/barrnap
[barrnap] Detected operating system: linux
[barrnap] Adding /home/ben/e/barrnap-0.9/lib/barrnap/bin/../binaries/linux to end of PATH
[barrnap] Checking for dependencies:
[barrnap] Found nhmmer - /home/ben/e/barrnap-0.9/bin/nhmmer
[barrnap] Found bedtools - /home/ben/e/barrnap-0.9/bin/bedtools
[barrnap] Will use 1 threads
[barrnap] Setting evalue cutoff to 1e-06
[barrnap] Will tag genes < 0.8 of expected length.
[barrnap] Will reject genes < 0.25 of expected length.
[barrnap] Using database: /home/ben/e/barrnap-0.9/lib/barrnap/bin/../db/bac.hmm
[barrnap] ERROR: No input file on command line or stdin
Easy to work around (just specify the input via stdin, e.g. barrnap < a.fna), but figured I'd report.
ben
When using --outseq, if file.fasta is a path into someone else's folder, barrnap tries to write the .fai file there and fails.
Hi,
When I use the --outseq argument (barrnap --outseq bacteria.fna examples/bacteria.fna), the following command line is invoked within barrnap:
bedtools getfasta -s -name+ -fo 'bacteria.fna' -fi 'examples/bacteria.fna' -bed '/tmp/ka24159v9M
Apparently the -name+ option is not implemented in bedtools, so it exits with the following message:
*****ERROR: Unrecognized parameter: -name+ *****
Deleting the '+' would surely fix the issue. Until then, I have to use a bash workaround to extract the sequences.
Kindly
Hello!
I'm using barrnap 0.9
Interesting observation that GFF file start positions are shifted by +1 compared to FASTA header start:
NC_009053 barrnap:0.9 rRNA 69357 70894 0 + . Name=16S_rRNA;product=16S ribosomal RNA
>16S_rRNA::NC_009053:69356-70894(+)
Is this a bug or feature?
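A likely explanation, assuming the FASTA header is written by bedtools getfasta (as the barrnap log output quoted in other reports here indicates): GFF coordinates are 1-based and inclusive, while BED-style coordinates are 0-based and half-open, so the same interval is printed with a start that differs by one. The sketch below shows the conversion; the function names are illustrative and are not part of barrnap or bedtools.

```python
def gff_to_bed_interval(gff_start, gff_end):
    """Convert a 1-based inclusive GFF interval to a 0-based half-open BED interval."""
    return gff_start - 1, gff_end


def bed_style_header(name, seqid, gff_start, gff_end, strand):
    """Build a header in the same shape as the one quoted above."""
    bed_start, bed_end = gff_to_bed_interval(gff_start, gff_end)
    return ">{}::{}:{}-{}({})".format(name, seqid, bed_start, bed_end, strand)


# Values taken from the GFF line quoted above.
header = bed_style_header("16S_rRNA", "NC_009053", 69357, 70894, "+")
print(header)
```

Both notations describe the same 1538 bases (70894 - 69357 + 1 in GFF terms, 70894 - 69356 in BED terms), so under this assumption the extracted sequence is not shifted.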
cmscan -g is slow but will improve results due to glocal alignment and higher sensitivity
Great,
Works well !!!!!
I have just updated the database in the file build_HMMs.sh to work with the latest SILVA 123, as follows. Replace the existing chunk with the following:
SILVA="SILVA_123_LSURef_tax_silva_full_align_trunc.fasta"
# Download and unpack the alignment only if it is not already present
if [ ! -r "$SILVA" ]; then
  echo "Downloading: $SILVA"
  wget --quiet http://www.arb-silva.de/fileadmin/silva_databases/current/Exports/SILVA_123_LSURef_tax_silva_full_align_trunc.fasta.gz
  gunzip "$SILVA.gz"
  # gunzip normally removes the .gz itself; this is only a safety net
  rm -f "$SILVA.gz"
else
  echo "Using existing file: $SILVA"
fi
thanks
The other columns are pretty straightforward (assuming column 6 is e-value...)
Plant mitochondria have three rRNA genes:
rrn5
rrn18
rrn26
[barrnap] Running: bedtools getfasta -s -name+ -fo 'rrna.fasta' -fi '/tmp/9tmz9ZJXup' -bed '/tmp/RCzdJ9pUop'
*****ERROR: Unrecognized parameter: -name+ *****
Hi Torsten,
I recently installed the latest version of Barrnap (0.8) and tested it by running it on a fasta file containing 18S sequences that I downloaded from SILVA. It detected partial 16S sequences when --kingdom was set to bac or arc, but detected zero ribosomal RNA features when --kingdom was set to euk.
However, Barrnap 0.7 was able to detect the 18S sequences in my fasta file (with an e-value of 0, as was expected). I didn’t investigate the matter further (I’m mainly a wet lab biologist with some bioinformatics skills), but I wanted to bring the matter to your attention.
Best Regards,
Mahwash Jamy (Institute of Organismal Biology Uppsala University Sweden)
I use your tool "Prokka" (version 1.10) with its default rRNA predictor "Barrnap", mainly for metagenome annotation (and am very happy with it).
As my metagenomes consist of bacterial as well as archaeal components, I already created a custom kingdom BLAST DB for Prokka (based on concatenated bacterial + archaeal Swiss-Prot databases) for the protein annotations.
Now I would like to do something similar for the RNA prediction step with Barrnap (to increase the sensitivity). Is it possible to create custom HMM DBs for Barrnap?
Otherwise, it seems that the best option for me would be to replace the "bacterial" library with a concatenated archaea+bacteria library (as it is the default and will therefore probably be used when I run Prokka with my custom-kingdom setting).
Thank you and with friendly greetings,
John Vollmers
Here are the commands I used:
barrnap $IN_PATH --threads 1 --kingdom euk
barrnap $IN_PATH --threads 1 --kingdom bac
barrnap $IN_PATH --threads 1 --kingdom arc
In all cases the following output is produced:
##gff-version 3
k85_159010 barrnap:0.8 rRNA 4717 4828 6.8e-10 - . Name=5S_rRNA;product=5S ribosomal RNA
It seems weird that the exact same evalue would be produced for searches against the 3 different hmm databases. How do I know which domain the 5S gene corresponds to?
Hello,
I just started using PROKKA and want to predict rRNA with barrnap-0.7. However, I'm getting the following error while running it (on the test example):
"/nhmmer: Syntax error: ")" unexpected"
What could be the issue here? Could it be the architecture of my system (I have a 32-bit machine)?
It also gave me the same error while running barrnap-0.6.
Any solutions?
Thank you very much,
MR
Hi,
I am using barrnap for 16S rRNA extraction and it shows me this result without creating any output.
Tseemann sir, I am a big fan of yours. First of all, it's a great tool. Will you please tell me how to use barrnap on >400 complete genomes?
This is barrnap 0.9
[barrnap] Written by Torsten Seemann
[barrnap] Obtained from https://github.com/tseemann/barrnap
[barrnap] Detected operating system: linux
[barrnap] Adding /home/bvs/neelam/barrnap/bin/../binaries/linux to end of PATH
[barrnap] Checking for dependencies:
[barrnap] Found nhmmer - /home/bvs/neelam/barrnap/bin/../binaries/linux/nhmmer
[barrnap] Found bedtools - /home/bvs/neelam/bedtools2/bin/bedtools
[barrnap] Will use 96 threads
[barrnap] Setting evalue cutoff to 1e-06
[barrnap] Will tag genes < 0.8 of expected length.
[barrnap] Will reject genes < 0.25 of expected length.
[barrnap] Using database: /home/bvs/neelam/barrnap/bin/../db/bac.hmm
[barrnap] Copying STDIN to a temporary file: /tmp/ZMQNSyGJ92
[barrnap] Scanning /tmp/ZMQNSyGJ92 for bac rRNA genes... please wait
[barrnap] Command: nhmmer --cpu 96 -E 1e-06 --w_length 3878 -o /dev/null --tblout /dev/stdout '/home/bvs/neelam/barrnap/bin/../db/bac.hmm' '/tmp/ZMQNSyGJ92'
[barrnap] Found: 16S_rRNA CP080286.1 L=1531/1585 1978676..1980206 + 16S ribosomal RNA
[barrnap] Found: 16S_rRNA CP080286.1 L=1531/1585 248408..249938 - 16S ribosomal RNA
[barrnap] Found: 16S_rRNA CP080286.1 L=1531/1585 1020190..1021720 - 16S ribosomal RNA
[barrnap] Found: 16S_rRNA CP080286.1 L=1530/1585 6068907..6070436 - 16S ribosomal RNA
[barrnap] Found: 23S_rRNA CP080286.1 L=2888/3232 1016830..1019717 - 23S ribosomal RNA
[barrnap] Found: 23S_rRNA CP080286.1 L=2888/3232 1980680..1983567 + 23S ribosomal RNA
[barrnap] Found: 23S_rRNA CP080286.1 L=2888/3232 245047..247934 - 23S ribosomal RNA
[barrnap] Found: 23S_rRNA CP080286.1 L=2888/3232 6065546..6068433 - 23S ribosomal RNA
[barrnap] Found: 5S_rRNA CP080286.1 L=110/119 1983714..1983823 + 5S ribosomal RNA
[barrnap] Found: 5S_rRNA CP080286.1 L=110/119 244791..244900 - 5S ribosomal RNA
[barrnap] Found: 5S_rRNA CP080286.1 L=110/119 1016574..1016683 - 5S ribosomal RNA
[barrnap] Found: 5S_rRNA CP080286.1 L=110/119 6065290..6065399 - 5S ribosomal RNA
[barrnap] Found 12 ribosomal RNA features.
[barrnap] Sorting features and outputting GFF3...
[barrnap] Writing hit sequences to: output_rrna.fna
[barrnap] Running: bedtools getfasta -s -name+ -fo 'output_rrna.fna' -fi '/tmp/ZMQNSyGJ92' -bed '/tmp/PM4zayBzOc'
index file /tmp/ZMQNSyGJ92.fai not found, generating...
[barrnap] Done.
Thank you!
The current release fails on the bedtools command. It looks like #27 fixes the breaking typo.
First off, thanks for writing this useful tool!
Today, I started noticing that my barrnap jobs were dying with the following message:
[12:58:10] bad line in nhmmer output - Killed
I then ran nhmmer directly, and noticed that the barrnap script was likely not happy with the 'Killed' text at the bottom of the nhmmer output.
Looking at top, I noticed that the nhmmer process was spawning threads which triggered the OOM-Killer on my Linux machine. I didn't notice until now that barrnap defaults to 8 threads. That might be reasonable on a beefy server, but on my Linux instance, I only had 2 CPUs and 2 GB of RAM.
I propose that barrnap should default to using a single CPU unless the user explicitly overrides it. Otherwise, barrnap is violating the Law of Least Surprise. Or, barrnap should be sure not to exceed the number of CPUs on the system, as that is surely undesirable behavior. Some bioinformatic tools that I've worked with will default to using (N-K) CPUs or one CPU, whichever is larger (where N is the total number of CPUs, and K in {1,2}, as a buffer to prevent the machine from locking up).
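The (N-K)-or-one rule proposed above can be sketched in a few lines. This is only an illustration of the proposal, not barrnap's behaviour; the function name and the `reserve` parameter (the K above) are hypothetical.

```python
import os


def default_threads(total_cpus=None, reserve=1):
    """Return max(1, N - K): leave `reserve` CPUs free but always use at least one."""
    if total_cpus is None:
        # os.cpu_count() may return None when the count cannot be determined
        total_cpus = os.cpu_count() or 1
    return max(1, total_cpus - reserve)


# The 2-CPU machine described above would get a single thread.
print(default_threads(total_cpus=2, reserve=1))
# A larger server keeps a small buffer free.
print(default_threads(total_cpus=8, reserve=2))
```

The result can never exceed the number of CPUs and never drops below one, which covers both suggestions made above.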
In the meantime, I am now making sure to set the threads argument explicitly. Thanks for your consideration!
"Several of my users have had difficulties installing barrnap on computing clusters where they don't have admin rights due to the Time::Piece issue that seems to affect a lot of the Perl distros installed by default."
I believe --kingdom mito works only with metazoan mitochondria, not plants. If that's true, it would be helpful to update the README.md documentation.
Dear Torsten,
not sure if this has anything to do with you directly, but I just noticed that the barrnap-v0.9 version packaged for Debian/Ubuntu (installed via apt) is shipped with what appears to be a corrupted database. It lacks, for example, the 23S and 28S models (see below). Any idea why that is, how to fix it, or where better to report it?
Cheers
Thomas
#cd /usr/share/barrnap/db
grep NAME *.hmm
arc.hmm:NAME 16S_rRNA
arc.hmm:NAME 5S_rRNA
arc.hmm:NAME 5_8S_rRNA
bac.hmm:NAME 16S_rRNA
bac.hmm:NAME 5S_rRNA
euk.hmm:NAME 18S_rRNA
euk.hmm:NAME 5S_rRNA
euk.hmm:NAME 5_8S_rRNA
mito.hmm:NAME 12S_rRNA
mito.hmm:NAME 16S_rRNA
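The same check can be scripted so it is easy to repeat after a reinstall. The sketch below only assumes that each model in the HMM file has a line starting with `NAME`, as the grep output above shows; the function names are hypothetical. The expected set for bac.hmm is taken from the 5S, 16S and 23S hits that barrnap logs report for bacterial genomes, and is an assumption rather than an authoritative list.

```python
def model_names(hmm_text):
    """Collect the model names from the NAME lines of an HMM file's text."""
    names = []
    for line in hmm_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "NAME":
            names.append(parts[1].strip())
    return names


def missing_models(hmm_text, expected):
    """Return the expected model names that are absent from the HMM text."""
    return sorted(set(expected) - set(model_names(hmm_text)))


# NAME lines as reported above for the packaged bac.hmm (other lines omitted).
packaged_bac = "NAME  16S_rRNA\nNAME  5S_rRNA\n"
# Assumed expected set, based on the hit types seen in barrnap logs.
expected_bac = ["5S_rRNA", "16S_rRNA", "23S_rRNA"]
print(missing_models(packaged_bac, expected_bac))
```

To run it against a real file, read the file's contents into a string first and pass that as `hmm_text`.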
I've quality-trimmed my HiSeq reads and converted them to fasta. I want to run these through barrnap but I'm not getting any hits with default settings. I was wondering how I could adjust these parameters to properly utilize barrnap while casting a wide net.
--evalue is the cut-off for nhmmer reporting, before further scrutiny
--lencutoff is the proportion of the full length that qualifies as partial match
--reject will not include hits below this proportion of the expected length
Is lencutoff the proportion of the target rRNA gene that is covered by the query sequence? If so, should I drop this down to something like 0.01?
I'm confused about how reject is different from lencutoff. How would you adjust this to properly incorporate reads?
For evalue, I was going to relax it to 0.1 to cast a wide net. Do you think this is too permissive?
My sequences are around 200 bp long.
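For what it's worth, here is how the two length options appear to interact, going only by barrnap's own log messages quoted in other reports here ("Will tag genes < 0.8 of expected length" and "Will reject genes < 0.25 of expected length") and by the `L=1531/1585` and `L=76/119` hit lengths shown there. This is a sketch of that reading, not barrnap's actual code; the function name is hypothetical and the defaults are the values from those log lines.

```python
def classify_hit(hit_len, expected_len, lencutoff=0.8, reject=0.25):
    """Classify a hit by the fraction of the expected gene length it covers."""
    fraction = hit_len / expected_len
    if fraction < reject:
        return "rejected"  # too short: not reported at all
    if fraction < lencutoff:
        return "partial"   # reported, but tagged as partial
    return "full"


# A 200 bp read against a 16S model of expected length 1585 covers about 13%.
print(classify_hit(200, 1585))
# The 5S hit with L=76/119 reported elsewhere covers about 64%.
print(classify_hit(76, 119))
# Lowering reject (here to a hypothetical 0.1) lets the 200 bp read through as partial.
print(classify_hit(200, 1585, reject=0.1))
```

Under this reading, 200 bp reads can never pass the default reject threshold for 16S or 23S, regardless of the e-value setting, which would explain getting no hits.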
I believe this file is current:
Hi Torsten,
Great job on a wonderful tool. ;-)
Have just noticed that it seems to miss some of the TB 5S rRNA sequence:
[barrnap] Found: 5S_rRNA AL123456; L=76/119 5..80 + 5S ribosomal RNA (partial)
[barrnap] Found 1 ribosomal RNA features.
[barrnap] Sorting features and outputting GFF3...
##gff-version 3
AL123456; barrnap:0.9 rRNA 5 80 3.5e-14 + . Name=5S_rRNA;product=5S ribosomal RNA (partial);note=aligned only 63 percent of the 5S ribosomal RNA
[barrnap] Done.
RUN ON:
>TB-5S-rRNA AL123456; Mycobacterium tuberculosis H37Rv complete genome.
UUACGGCGGCCACAGCGGCAGGGAAACGCCCGGUCCCAUUCCGAACCCGG
AAGCUAAGCCUGCCAGCGCCGAUGAUACUGCCCCUCCGGGUGGAAAAGUA
GGACACCGCCGAACA
Can you reproduce this?
I built a HMMer model on an older Rfam RF00001 SEED alignment and that seems to work pretty well. There are definitely Mycobacteria sequences in the seed, so it should work well.
E.g.
>> AL123456; Mycobacterium tuberculosis H37Rv complete genome.
score bias Evalue hmmfrom hmm to alifrom ali to envfrom env to sq len acc
------ ----- --------- ------- ------- --------- --------- --------- --------- --------- ----
! 53.5 10.9 1.4e-18 4 116 .. 5 110 .. 2 114 .. 115 0.92
Alignment:
score: 53.5 bits
<<<<<<....<<.<<<<<...<<..<<<<<<.......>>..>>>>..>>....>>>>>..>><<<.<<....<.<<.....<<....>>.....>>.>. CS
SEED 4 ggcggccauagcgggggggaaacacccgauccCaUcccGaacucggaaguuAAgccccuuagcgccgauguagUAcugcggugggugaccacgugggAau 103
ggcggcca agcgg gggaaac cccg uccCaU+ccGaac cggaag uAAgcc+ +agcgccgaug UAcugc +cc gugg Aa
AL123456; 5 GGCGGCCACAGCGGCAGGGAAACGCCCGGUCCCAUUCCGAACCCGGAAGCUAAGCCUGCCAGCGCCGAUG--AUACUGCCC-----CUCCGGGUGGAAA- 96
789*******************************************************************..6****9874.....3578889999999. PP
.>>.>>>.>>>>>> CS
SEED 104 aguaggu.gcugcc 116
aguagg+ c+gcc
AL123456; 97 AGUAGGAcACCGCC 110
***99988888876 PP
Thanks for your time!
Paul & Helena.
Hello,
I would like to install barrnap in a Singularity container but I have the following issue:
./barrnap --quiet examples/small.fna
[barrnap] ERROR: Can not find required 'nhmmer' in PATH
But nhmmer is in the path:
Singularity centos8.img:/opt/barrnap/bin> nhmmer -h
nhmmer :: search a DNA model, alignment, or sequence against a DNA database
HMMER 3.3.1 (Jul 2020); http://hmmer.org/
Could you help me understand the problem, please?
Singularity centos8.img:/opt/barrnap/bin> ./barrnap
[barrnap] This is barrnap 0.9
[barrnap] Written by Torsten Seemann
[barrnap] Obtained from https://github.com/tseemann/barrnap
[barrnap] Detected operating system: linux
[barrnap] Adding /opt/barrnap/bin/../binaries/linux to end of PATH
[barrnap] Checking for dependencies:
[barrnap] ERROR: Can not find required 'nhmmer' in PATH
Singularity centos8.img:/opt/barrnap/bin> ls ../binaries/linux/
nhmmer
RNAmmer is able to annotate rRNA genes containing gaps of Ns. Is this a feature that you would want to implement in Barrnap? See below for an example:
>rRNA_205522_53445-53560_DIR- /molecule=5s_rRNA /score=53.7
TGGTGTCCCAGGCGTAGAGGAACCACACCAACCCATCCCGAACTTGGTGGTTAAACTCTA
CTGCGGTGACGATACTATAGGGGAAGCCCTGCGGGAAAATAGCTCGGTGCCAGGAT
>rRNA_205522_53810-56760_DIR- /molecule=23s_rRNA /score=1839.4
TCAAACGAGGAAGGGCTTACGGTGGATACCTAGGCACCCAGAGACGAGGAAGGGCGTGGT
AAGCGACGAAATGCTTCGGGGAGTTGAAAATGAGCATAGATCCGGAGATTCCCGAATAGG
TTAACCTTTTTAACTGCTGCTGAATCCATGGGCAGGCAAGAGACAACCTGGCGAACTGAA
ACATCTTAGTAGCCAGAGGAATAGAAAGCAAAAGCGATTCCCGTAGTAGCGGCGAGCGAA
ATGGGAGCAGCCTAAACCGTGAAAACGGGGTTGTGGGAGAGCACAATATAAGCTCTGTGC
TGCTAGGCGAAGCGGTTGAGTCCTGCACCCTAGATGGTGAGAGTCCAGTAGCCAAAAGCA
TCATTGGGTTACGCTCTAACCCGAGTAGCATGGGGCACGTGGAATCCCGTGTGAATCAGC
AAGGACCACCTTGCAAGGCTAAATACTCCTGGGTGACCGATAGCGAAGTAGTACCGTGAG
GGAAAGGTGAAAAGAACCCCCATCGGGGAGTGAAATAGAACATGAAACCGTAAGCTCCCA
AGCAGTGGGAGGAAAATTATATCTCTGACCGCGTGCCTGTTGAAGAATGAGCCGGCGACT
TATAGGCAGTGGCTTGGTTAAGGGAACCCACCGGAGCCGTAGCGAGAGCGAGTCTTCATG
GGGCAATTGTCACTGCTTATGGACCCGAACCTGGGTGATCTATCCATGACCAGGATGAAG
CTTGGGTGAAACTAAGTGGAGGTCCGAACCGACTGATGTTGAAAAATCAGCGGATGAGTC
GTGGTTAGGGGTGAAATGCCACTCGAACCCAGAGCTAGCTGGTTCTCCCCGAAATGCGTT
GAGGCGCAGCAGTTGACTGGACCATCTAGGGGTAAAGCACTGTTTCGGTACGGGCCGCGA
GAGCGGTACCAAATCGAGGCAAACTCTGAATACTAGATTGCCCCAATAAAAGGGGTAAAG
GTCAGCCAGTGAGACGATGGGGGATAAGCTTCATCGTCGAGAGGGAAACAGCCCAGATCA
TCAGCTAAGGCCCCTAAATGACCGCTCAGTGATGAAGGAAGTACGAGTGCAAAGACAGCC
AGGAGGTTTGCCTAGAAGCAGCCAACCTTGAAAGAGTGCGTAATAGCTCACTGATCGAGC
GCTCTTGCGCCGAAGATGAACGGGACTAAGCGATCTGCCGAAGCTGTGGGATGTAAAAAT
ACATCGGTAGGGGAGCGTTCCGCCTCAGAGGGAAGCACCGGCGCGAGCAGGTGTGGACGA
AGCGGAAGCGAGAATGTCGGCTTGAGTAACGCAAACATTGGTGAGAATCCAATGCCCCGA
AAACCTAAGGGTTCCTCCGCAAGGTTCGTCCACGGAGGGTGAGTCAGGGCCTAAGATCAG
GCCGAAAGGCGTAGTCGATGGACAACAGGTTAATATTCCTGTACTACCCCTTGTTGGTCC
CGAGGGACGGAGGAGGCTAGGTTAGCCGAAAGATGGTTATCGGTTCAAGGACGCAAGGTG
ACCCTGCTTTTTTCAGGGTAAGAAGGGGTAGAGAAAATGCCTCGAGCCAATGTCCGAGTA
CCAAGCGCTACAGCGCTGAAGTAACCCATGCCATACTCCCAGGAAAAGCTCGAACGACCT
TTAACAAACGGGTACCTGTACCCGAAACCGACACAGGTAGGTAGGTAGAGAATACCTAGG
GGCGCGAGACAACTCTCTCTAAGGAACTCGGCAAAATAGCCCCGTAACTTCGGGAGAAGG
GGTGCCTCCTCAGGAGGTCGCAGTGACCAGGCCCGGGCGACTGTTTACCAAAAACACAGG
TCTCCGCAAAGTCGTAAGACCATGTATGGGGGCTGACGCCTGCCCAGTGCCGGAAGGTTA
AGGAAGTTGGTGACCTGATGACGGGGAAGCCAGCGACCGAAGCCCCGGTGAACGGCGGCC
GTAACTATAACGGTCCTAAGGTAGCGAAATTCCTTGTCGGGTAAGTTCCGACCCGCACGA
AAGGCGTAACGATCTGGGCACTGTCTCGGAGAGAGACTCGGTGAAATAGACATGTCTGTG
AAGATGCGGACTACCTGCACCTGGACAGAAAGACCCTATGAAGCTTTACTGTTCCCTGGG
ATTGTCTTTGGGTTCTTCTTGCGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGGACGAAAGTCGGCCTTAG
TGATCCGACGGTGCCGAGTGGAAGGGCCGTCGCTCAACGGATAAAAGTTACTCTAGGGAT
AACAGGCTGATCTTCCCCAAGAGTTCACATCGACGGGAAGGTTTGGCACCTCGATGTCGG
CTCTTCGCCACCTGGGGCGGTAGTACGTTCCAAGGGTTGGGCTGTTCGCCCATTAAAGCG
GTACGTGAGCTGGGTTCAGAACGTCGTGAGACAGTTCGGTCCATATCCGGTGCGGGCGTT
AGAGCATTGAGAGGACCTTTCCCTAGTACGAGAGGACCGGGAAGGACGCACCTCTGGTGT
ACCAGTTATCGTGCCCACGGTAGACGCTGGGTAGCCAAGTGCAGAGCGGATAACTACTGA
AAGCATATAAGTAGGAAGCCCACCCCAAGATGAGTGCTCTCCTATTCTTACTTCCCTGAG
AGCCCTAGTCGCGAACACGGCTGGGACAACGACGGGTTCTCTGTCCTTGCAGGGGATGGA
GCGACAAAAGTATTGAGAATCCAAGATAAGGTCACGGCGAGACGAGCCGTTTATCATTAC
GATAGGTGTCAAGTGGAAGTGCAGTGATGTATGCAGCTGAGGCATCCTAACAGACCGAGA
GATTTGAACCT
>rRNA_205522_59270-60682_DIR- /molecule=16s_rRNA /score=1311.2
AGAGTTTGATCCTGGCTCAGGATGAACGCTGGCGGCATGCTTAACACATGCAAGTCGGAC
GGGAAGTGGTGTTTCCAGTGGCGGACGGGTGAGTAACGCGTAAGAACCTGCCCTTGGGAG
GGGAACAACAGCTGGAAACGGCTGCTAATACCCCATAGGCTGAGGAGCAAAAGGAGGAAT
CCGCCCAAGGAGGGGCTCGCGTCTGATTAGTTAGTTGGTGAGGCAATGGCTTACCAAGGC
GACGATCAGTAGCTGGTCCGAGAGGATGATCAGCCACACTGGGACTGAGACACGGCCCAG
ACTCCTACGGGAGGCAGCAGTGGGGAATTTTCCGCAATGGGCGAAAGCCTGACGGAGCAA
TGCCGCGTGAAGGCAGAAGGCCCACGGGTCATGAACTTCTTTTCTCGGAGAAGAAACAAT
GACGGTATCTGAGGAATAAGCATCGGCTAACTCTGTGCCAGCAGCCGCGGTAAGACAGAG
GATGCAAGCGTTATCCGGAATGATTGGGCGTAAAGCGTCTGTAGGTGGCTTTTCAAGTCC
GCCGTCAAATCCCAGGGCTCAACCCTGGACAGGCGGTGGAAACTACCAAGCTGGAGTACG
GTAGGGGCAGAGGGAATTTCCGGTGGAGCGGTGAAATGCGTTGAGATCGGAAAGAACACC
AACGGCGAAAGCACTCTGCTGGGCCGACACTGACACTGAGAGACGAAAGCTAGGGGAGCA
AATGGGATTAGATACCCCAGTAGTCCTAGCCGTAAACGATGGNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGTCGTCAGCTCGTGCCGTAAGGTG
TTGGGTTAAGTCCCGCAACGAGCGCAACCCTCGTGTTTAGTTGCCAGCATTGAGTTTGGA
ACCCTGAACAGACTGCCGGTGATAAGCCGGAGGAAGGTGAGGATGACGTCAAGTCATCAT
GCCCCTTACGCCCTGGGCGACACACGTGCTACAATGACCGGGACAAAGGGTCGCGACCCC
GCGAGGGCAAGCTAACCTCAAAAACCCGGCCTCAGTTCGGATTGCAGGCTGCAACTCGCC
TGCATGAAGCCGGAATCGCTAGTAATCGCCGGTCAGCCATACGGCGGTGAATCCGTTCCC
GGGCCTTGTACACACCGCCCGTCACACTATGGGAGCTGGCCATGCCCCAAGTCGTTACCT
TAACCGCAAGGAGGGGGATGCCGAAGGCTGGGCTAGTGACTGGAGTGAAGTCGTAACAAG
GTAGCCGTACTGGAAGGTGCGGCTGGATCACCT
Hi,
I've been digging into why --outseq has been generating empty sequence files despite there being results in the Barrnap generated GFF files. It looks like Bedtools is rejecting some of my assemblies due to different line lengths for some parts of the sequence. It would be nice if Barrnap failed if this happened, as the exit code for Barrnap is currently 0 even if outseq fails.
Let me know if my description isn't clear, happy to add more detail as needed.
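Until barrnap reports this itself, the input can be checked up front. FASTA indexing of the kind bedtools relies on (a .fai index) requires every sequence line within a record to have the same length, except the last line, which may be shorter. The sketch below flags records that break that rule; the function names and the example records are hypothetical, and this is not how barrnap or bedtools performs the check.

```python
def _is_ragged(lengths):
    """True if the line lengths of one record would break .fai-style indexing."""
    body = lengths[:-1]
    if not body:
        return False  # zero or one sequence line is always fine
    # All lines but the last must match, and the last must not be longer.
    return len(set(body)) > 1 or lengths[-1] > body[0]


def find_ragged_records(fasta_text):
    """Return the IDs of records whose sequence lines have inconsistent lengths."""
    bad = []
    name, lengths = None, []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            if name is not None and _is_ragged(lengths):
                bad.append(name)
            fields = line[1:].split()
            name = fields[0] if fields else ""
            lengths = []
        elif name is not None:
            lengths.append(len(line))
    if name is not None and _is_ragged(lengths):
        bad.append(name)
    return bad


# Hypothetical two-record example: the second record has a short line in the middle.
example = ">ok\nACGT\nACGT\nAC\n>bad\nACGT\nAC\nACGT\n"
ragged = find_ragged_records(example)
print(ragged)
```

Re-wrapping the flagged records to a fixed line width before running barrnap should avoid the empty --outseq file; a wrapper script could also exit non-zero whenever the list is not empty.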