RoseTTAFold2 protein/nucleic acid complex prediction

License: MIT License

Dockerfile 0.42% Shell 2.70% Python 96.71% Perl 0.16%

rosettafold2na's Introduction

RF2NA

GitHub repo for RoseTTAFold2 with nucleic acids

New: April 13, 2023 v0.2

- Updated weights (https://files.ipd.uw.edu/dimaio/RF2NA_apr23.tgz) for better prediction of homodimer:DNA interactions and better DNA-specific sequence recognition
- Bugfixes in MSA generation pipeline
- Support for paired protein/RNA MSAs

Installation

Clone the package

git clone https://github.com/uw-ipd/RoseTTAFold2NA.git
cd RoseTTAFold2NA

Create conda environment

All external dependencies are contained in RF2na-linux.yml

# create conda environment for RoseTTAFold2NA
conda env create -f RF2na-linux.yml

You also need to install NVIDIA's SE(3)-Transformer (please use SE3Transformer in this repo to install).

conda activate RF2NA
cd SE3Transformer
pip install --no-cache-dir -r requirements.txt
python setup.py install
cd ..

Download pre-trained weights under network directory

cd network
wget https://files.ipd.uw.edu/dimaio/RF2NA_apr23.tgz
tar xvfz RF2NA_apr23.tgz
ls weights/ # it should contain a 1.1GB weights file
cd ..

Download sequence and structure databases

# uniref30 [46G]
wget http://wwwuser.gwdg.de/~compbiol/uniclust/2020_06/UniRef30_2020_06_hhsuite.tar.gz
mkdir -p UniRef30_2020_06
tar xfz UniRef30_2020_06_hhsuite.tar.gz -C ./UniRef30_2020_06

# BFD [272G]
wget https://bfd.mmseqs.com/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt.tar.gz
mkdir -p bfd
tar xfz bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt.tar.gz -C ./bfd

# structure templates (including *_a3m.ffdata, *_a3m.ffindex)
wget https://files.ipd.uw.edu/pub/RoseTTAFold/pdb100_2021Mar03.tar.gz
tar xfz pdb100_2021Mar03.tar.gz

# RNA databases
mkdir -p RNA
cd RNA

# Rfam [300M]
wget ftp://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/Rfam.full_region.gz
wget ftp://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/Rfam.cm.gz
gunzip Rfam.cm.gz
cmpress Rfam.cm

# RNAcentral [12G]
wget ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/rfam/rfam_annotations.tsv.gz
wget ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/id_mapping/id_mapping.tsv.gz
wget ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/sequences/rnacentral_species_specific_ids.fasta.gz
../input_prep/reprocess_rnac.pl id_mapping.tsv.gz rfam_annotations.tsv.gz   # ~8 minutes
gunzip -c rnacentral_species_specific_ids.fasta.gz | makeblastdb -in - -dbtype nucl  -parse_seqids -out rnacentral.fasta -title "RNACentral"

# nt [151G]
update_blastdb.pl --decompress nt
cd ..

Usage

conda activate RF2NA
cd example
# run Protein/RNA prediction
../run_RF2NA.sh rna_pred rna_binding_protein.fa R:RNA.fa
# run Protein/DNA prediction
../run_RF2NA.sh dna_pred dna_binding_protein.fa D:DNA.fa

Inputs

The first argument to the script is the output folder
The remaining arguments are fasta files for individual chains in the structure. Use the tags P:xxx.fa R:xxx.fa D:xxx.fa S:xxx.fa to specify protein, RNA, double-stranded DNA, and single-stranded DNA, respectively. Use the tag PR:xxx.fa to specify paired protein/RNA. Each chain is a separate file; 'D' will automatically generate a complementary DNA strand to the input strand.
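The tag convention above can be illustrated with a small parser. This is a sketch, not the logic used by run_RF2NA.sh: the names `parse_chain_arg` and `reverse_complement` are hypothetical, an untagged argument is assumed to mean protein (as in the usage examples, where the protein file carries no tag), and the "complementary strand" is assumed to be the reverse complement.

```python
# Hypothetical helpers for illustration; this is not code from the repository.
TAGS = {
    "P": "protein",
    "R": "RNA",
    "D": "double-stranded DNA",
    "S": "single-stranded DNA",
    "PR": "paired protein/RNA",
}


def parse_chain_arg(arg):
    """Split 'TAG:path' into (tag, path); untagged arguments are assumed to be protein."""
    tag, sep, path = arg.partition(":")
    if sep and tag in TAGS:
        return tag, path
    return "P", arg


def reverse_complement(dna):
    """Return the reverse complement of a DNA strand (assumed meaning of 'complementary')."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


if __name__ == "__main__":
    for arg in ["dna_binding_protein.fa", "D:DNA.fa", "PR:paired.fa"]:
        tag, path = parse_chain_arg(arg)
        print(f"{path}: {TAGS[tag]}")
    print(reverse_complement("GATTACA"))
```

Keeping the tag table in one dictionary means an unknown prefix falls through to the protein default instead of raising an error, which mirrors how the untagged protein file is passed in the usage examples.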

Expected outputs

Outputs are written to the folder provided as the first argument (dna_pred and rna_pred).
Model outputs are placed in a subfolder, models (e.g., dna_pred/models)
You will get a predicted structure with estimated per-residue LDDT in the B-factor column (models/model_00.pdb)
You will get a numpy .npz file (models/model_00.npz). This can be read with numpy.load and contains three tables (L=complex length):
- dist (L x L x 37) - the predicted distogram
- lddt (L) - the per-residue predicted lddt
- pae (L x L) - the per-residue pair predicted error

rosettafold2na's Issues

run RNA prediction error

cat test.fa

RNA
GCGGGGGUUGCCGAGCCUGGUCAAAGGCGGGGGACUCAAGAUCCCCUCCCGUAGGGGUUCCGGGGUUCGAAUCCCCGCCCCCGCACCAUCCCCGCCCCCGCACCA

bash -x ../run_RF2NA.sh test R:test.fa
output:

python /home/ubuntu/RoseTTAFold2NA/network/predict.py -inputs R:/home/ubuntu/RoseTTAFold2NA/example/Rtest/Rtest.afa -prefix /home/ubuntu/RoseTTAFold2NA/example/Rtest/models/model -model /home/ubuntu/RoseTTAFold2NA/network/weights/RF2NA_sep22.pt -db /home/ubuntu/RoseTTAFold2NA/pdb100_2021Mar03/pdb100_2021Mar03
Running on GPU
Traceback (most recent call last):
File "/home/ubuntu/RoseTTAFold2NA/network/predict.py", line 345, in <module>
pred.predict(inputs=args.inputs, out_prefix=args.prefix, ffdb=ffdb)
File "/home/ubuntu/RoseTTAFold2NA/network/predict.py", line 225, in predict
self._run_model(Ls, msa_orig, ins_orig, t1d, t2d, xyz_t, xyz_t[:,0], alpha_t, "%s_%02d"%(out_prefix, i_trial))
File "/home/ubuntu/RoseTTAFold2NA/network/predict.py", line 233, in _run_model
seq, msa_seed_orig, msa_seed, msa_extra, mask_msa = MSAFeaturize(
File "/home/ubuntu/RoseTTAFold2NA/network/data_loader.py", line 135, in MSAFeaturize
sample_mono = torch.randperm((N-1)//nmer, device=msa.device)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Password

Hi,
I think your repo is locked.
Can you make it accessible for cloning?

Error "NVTX functions not installed. Are you sure you have a CUDA build?" when running RF2NA on CPU

Hi,

I get the following error when running RoseTTAFold2NA on CPU. I had already replaced "torch.cuda.amp.autocast" with "torch.amp.autocast" in predict.py to get past another "NVTX functions not installed." error raised when running run_RF2NA.sh.

It seems that some part of the code is still calling CUDA or searching for GPUs?
Thank you!

Traceback (most recent call last):
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/predict.py", line 376, in <module>
    pred.predict(inputs=args.inputs, out_prefix=args.prefix, ffdb=ffdb)
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/predict.py", line 239, in predict
    self._run_model(Ls, msa_orig, ins_orig, t1d, t2d, xyz_t, xyz_t[:,0], alpha_t, "%s_%02d"%(out_prefix, i_trial))
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/predict.py", line 299, in _run_model
    logit_s, logit_aa_s, logit_pae, init_crds, alpha_prev, _, pred_lddt_binned, msa_prev, pair_prev, state_prev = self.model(
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/RoseTTAFoldModel.py", line 104, in forward
    msa, pair, xyz, alpha_s, xyzallatom, state = self.simulator(
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/Track_module.py", line 441, in forward
    msa_full, pair, xyz, state, alpha = self.extra_block[i_m](msa_full, pair,
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/Track_module.py", line 367, in forward
    xyz, state, alpha = self.str2str(msa.float(), pair.float(), xyz.detach().float(), state.float(), idx, top_k=top_k)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/Track_module.py", line 234, in forward
    shift = self.se3(G, node.reshape(B*L, -1, 1), l1_feats, edge_feats)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/SE3_network.py", line 84, in forward
    return self.se3(G, node_features, edge_features) #, clamp_d=clamp_d)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/se3_transformer-1.0.0-py3.8.egg/se3_transformer/model/transformer.py", line 163, in forward
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/se3_transformer-1.0.0-py3.8.egg/se3_transformer/model/basis.py", line 166, in get_basis
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/contextlib.py", line 114, in __enter__
    return next(self.gen)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/cuda/nvtx.py", line 86, in range
    range_push(msg.format(*args, **kwargs))
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/cuda/nvtx.py", line 28, in range_push
    return _nvtx.rangePushA(msg)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/cuda/nvtx.py", line 9, in _fail
    raise RuntimeError("NVTX functions not installed. Are you sure you have a CUDA build?")
RuntimeError: NVTX functions not installed. Are you sure you have a CUDA build?

My package versions:

brotlipy==0.7.0
certifi @ file:///croot/certifi_1665076670883/work/certifi
cffi @ file:///tmp/abs_98z5h56wf8/croots/recipe/cffi_1659598650955/work
charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
click==8.1.3
colorama @ file:///opt/conda/conda-bld/colorama_1657009087971/work
configparser==5.3.0
cryptography @ file:///croot/cryptography_1665612644927/work
dgl==0.9.1.post1
DLLogger @ git+https://github.com/NVIDIA/dllogger@0540a43971f4a8a16693a9de9de73c1072020769
docker-pycreds==0.4.0
e3nn==0.3.3
gitdb==4.0.9
GitPython==3.1.29
idna @ file:///croot/idna_1666125576474/work
mkl-fft==1.3.1
mkl-random @ file:///tmp/build/80754af9/mkl_random_1626186064646/work
mkl-service==2.4.0
mpmath==1.2.1
networkx @ file:///opt/conda/conda-bld/networkx_1657784097507/work
numpy @ file:///croot/numpy_and_numpy_base_1667233465264/work
opt-einsum==3.3.0
opt-einsum-fx==0.1.4
packaging==21.3
pathtools==0.1.2
promise==2.3
protobuf==4.21.9
psutil @ file:///home/conda/feedstock_root/build_artifacts/psutil_1667885878918/work
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pynvml==11.0.0
pyOpenSSL @ file:///opt/conda/conda-bld/pyopenssl_1643788558760/work
pyparsing==3.0.9
PySocks @ file:///tmp/build/80754af9/pysocks_1605305779399/work
python-dateutil==2.8.2
PyYAML==6.0
requests @ file:///opt/conda/conda-bld/requests_1657734628632/work
scipy==1.9.3
se3-transformer==1.0.0
sentry-sdk==1.11.0
shortuuid==1.0.11
six @ file:///tmp/build/80754af9/six_1644875935023/work
smmap==5.0.0
subprocess32==3.5.4
sympy==1.11.1
torch==1.13.0
tqdm @ file:///home/conda/feedstock_root/build_artifacts/tqdm_1662214488106/work
typing_extensions @ file:///tmp/abs_ben9emwtky/croots/recipe/typing_extensions_1659638822008/work
urllib3 @ file:///croot/urllib3_1666298941550/work
wandb==0.12.0

Strange output when running example

Thank you for the open-source code!

I was running the example provided in the README with:
../run_RF2NA.sh t000_ protein.fa R:RNA.fa

The output PDB looks mostly reasonable, except the last 11 bases of the RNA seem to be missing and there are 8 UNK protein residues added at the end. I have attached a copy of the output folder.

I am not familiar with the RNA MSA processing, but it does appear that the RNA.afa file has a single sequence that is also missing the same 11 bases as the final output. Would you be able to confirm whether these intermediate output files look reasonable? I don't see anything out of the ordinary in the log files.
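One way to check the intermediate files is to compare the first record of the alignment with the input sequence. The sketch below assumes that RNA.afa is an aligned FASTA file whose first record is the query and that gaps are written as `-`. Both are assumptions about the pipeline's intermediate format, and `read_fasta` and `missing_tail` are hypothetical helper names.

```python
def read_fasta(lines):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def missing_tail(query, aligned_first):
    """Return the end of `query` that the ungapped first alignment row does not cover.

    Returns None if that row is not a prefix of the query.
    """
    ungapped = aligned_first.replace("-", "")
    if not query.startswith(ungapped):
        return None
    return query[len(ungapped):]


if __name__ == "__main__":
    # Made-up sequences for demonstration, not the data from this report.
    query = read_fasta([">query", "GGCUAGC", "UAGCC"])[0][1]
    first_row = read_fasta([">query", "GGCU-AGC"])[0][1]
    print(len(query), repr(missing_tail(query, first_row)))
```

A non-empty result would show that the truncation is already present in the MSA step, before the network runs.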

t000_.tar.gz

Solving environment hangs on Ubuntu 22.04

Hello! I'm trying to install the code on a standard t2.2xlarge AWS instance with Ubuntu 22.04 LTS installed. These are the steps I've taken:

Install Miniconda

https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Clone and create the RF2na environment:

git clone https://github.com/uw-ipd/RoseTTAFold2NA.git
cd RoseTTAFold2NA
conda env create -f RF2na-linux.yml

The following then happens:

Collecting package metadata (repodata.json): done
Solving environment: /
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed

and then

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

Package libgcc conflicts for:
bioconda::blast -> libgcc
bioconda::cd-hit -> libgcc
dglteam::dgl-cuda11.3 -> scipy -> libgcc
bioconda::infernal -> libgcc

Package libgcc-ng conflicts for:
bioconda::cd-hit -> libgcc-ng[version='>=10.3.0|>=12|>=9.3.0|>=7.3.0|>=4.9']
bioconda::mafft -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0|>=9.3.0|>=7.5.0|>=7.3.0']
bioconda::hhsuite -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0|>=9.3.0|>=7.5.0|>=7.3.0']
conda-forge::tqdm -> python[version='>=2.7'] -> libgcc-ng[version='>=10.3.0|>=11.2.0|>=7.5.0|>=7.3.0|>=7.2.0|>=12|>=9.4.0|>=9.3.0|>=4.9']
bioconda::blast -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0|>=9.3.0|>=7.5.0|>=7.3.0|>=4.9']
bioconda::cd-hit -> zlib[version='>=1.2.12,<1.3.0a0'] -> libgcc-ng[version='>=11.2.0|>=7.5.0|>=7.2.0']
python=3.8 -> zlib[version='>=1.2.11,<1.3.0a0'] -> libgcc-ng[version='>=4.9|>=7.2.0']
conda-forge::cudatoolkit=11.3 -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0']
bioconda::hmmer[version='>=3.3'] -> libgcc-ng[version='>=10.3.0|>=9.3.0|>=7.5.0|>=7.3.0']
requests -> python[version='>=3.8,<3.9.0a0'] -> libgcc-ng[version='>=10.3.0|>=11.2.0|>=7.5.0|>=7.3.0|>=12|>=9.4.0|>=9.3.0|>=7.2.0|>=4.9']
conda-forge::psutil -> python[version='>=3.9,<3.10.0a0'] -> libgcc-ng[version='>=11.2.0|>=7.2.0']
bioconda::hhsuite -> perl[version='>=5.32.1,<5.33.0a0',build=*_perl5] -> libgcc-ng[version='>=11.2.0|>=7.2.0|>=4.9']
bioconda::blast -> curl[version='>=7.83.1,<8.0a0'] -> libgcc-ng[version='>=11.2.0|>=7.2.0']
bioconda::infernal -> perl[version='>=5.32.1,<5.33.0a0',build=*_perl5] -> libgcc-ng[version='>=11.2.0|>=9.4.0|>=7.2.0']
conda-forge::psutil -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0|>=9.3.0|>=7.5.0|>=7.3.0|>=4.9']
bioconda::csblast -> libgcc-ng[version='>=10.3.0|>=9.3.0']
python=3.8 -> libgcc-ng[version='>=10.3.0|>=11.2.0|>=7.5.0|>=7.3.0|>=12|>=9.4.0|>=9.3.0']
dglteam::dgl-cuda11.3 -> numpy -> libgcc-ng[version='>=10.3.0|>=11.2.0|>=7.5.0|>=7.3.0|>=7.2.0|>=12|>=9.4.0|>=9.3.0|>=4.9']
bioconda::infernal -> libgcc-ng[version='>=10.3.0|>=9.3.0|>=7.5.0|>=7.3.0|>=4.9']
pytorch::pytorch -> blas=[build=mkl] -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0|>=9.3.0|>=7.5.0|>=7.3.0|>=11.2.0|>=7.2.0|>=4.9']

Package numpy conflicts for:
bioconda::blast -> boost[version='>=1.68.0,<1.68.1.0a0'] -> numpy[version='1.11.*|1.12.*|1.13.*|>=1.11|>=1.8|>=1.9.3,<2.0a0|>=1.9|>=1.7|>=1.23.4,<2.0a0|>=1.20.3,<2.0a0|>=1.21.6,<2.0a0|>=1.19.5,<2.0a0|>=1.21.5,<2.0a0|>=1.18.5,<2.0a0|>=1.21.4,<2.0a0|>=1.17.5,<2.0a0|>=1.16.6,<2.0a0|>=1.16.5,<2.0a0|>=1.19.4,<2.0a0|>=1.19.2,<2.0a0|>=1.14.6,<2.0a0']
pytorch::pytorch -> numpy[version='>=1.11|>=1.19']
dglteam::dgl-cuda11.3 -> networkx -> numpy[version='1.10.*|1.11.*|1.12.*|1.13.*|>=1.11|>=1.11.3,<2.0a0|>=1.14.6,<2.0a0|>=1.16,<1.23|>=1.19,<1.25.0|>=1.19|>=1.19,<1.26.0|>=1.21,<1.26.0|>=1.21,<1.25.0|>=1.21,<1.23|>=1.16.6,<1.23.0|>=1.21.2,<1.23.0|>=1.16.6,<2.0a0|>=1.15.1,<2.0a0|>=1.9.3,<2.0a0|>=1.21.6,<1.26|>=1.21.6,<2.0a0|>=1.23.4,<1.26|>=1.23.4,<2.0a0|>=1.20.3,<1.26|>=1.20.3,<2.0a0|>=1.19.5,<2.0a0|>=1.20.3,<1.25|>=1.21.6,<1.25|>=1.18.5,<2.0a0|>=1.21.5,<2.0a0|>=1.20.3,<1.23|>=1.21.6,<1.23|>=1.21.4,<2.0a0|>=1.17.5,<2.0a0|>=1.19.4,<2.0a0|>=1.16.5,<2.0a0|>=1.19.2,<2.0a0|>=1.18.1,<2.0a0|>=1.9']
dglteam::dgl-cuda11.3 -> numpy

Package tzdata conflicts for:
conda-forge::psutil -> python[version='>=3.11,<3.12.0a0'] -> tzdata
bioconda::hhsuite -> python[version='>=3.10,<3.11.0a0'] -> tzdata
conda-forge::tqdm -> python[version='>=2.7'] -> tzdata
dglteam::dgl-cuda11.3 -> python[version='>=3.9,<3.10.0a0'] -> tzdata
pytorch::pytorch -> python[version='>=3.10,<3.11.0a0'] -> tzdata
requests -> python[version='>=3.10,<3.11.0a0'] -> tzdata

Package ncurses conflicts for:
python=3.8 -> ncurses[version='>=6.1,<7.0.0a0|>=6.1,<7.0a0|>=6.2,<7.0a0|>=6.3,<7.0a0|>=6.2,<7.0.0a0']
python=3.8 -> readline[version='>=7.0,<8.0a0'] -> ncurses[version='5.9.*|6.0.*|>=6.0,<7.0a0']

Package _libgcc_mutex conflicts for:
bioconda::csblast -> libgcc-ng[version='>=10.3.0'] -> _libgcc_mutex[version='*|0.1|0.1',build='conda_forge|main']
bioconda::cd-hit -> _openmp_mutex[version='>=4.5'] -> _libgcc_mutex[version='*|0.1',build='conda_forge|main|main']
bioconda::hmmer[version='>=3.3'] -> libgcc-ng[version='>=10.3.0'] -> _libgcc_mutex[version='*|0.1|0.1',build='conda_forge|main']
bioconda::infernal -> libgcc-ng[version='>=10.3.0'] -> _libgcc_mutex[version='*|0.1|0.1',build='conda_forge|main']
python=3.8 -> libgcc-ng[version='>=11.2.0'] -> _libgcc_mutex[version='*|0.1|0.1',build='conda_forge|main']
conda-forge::psutil -> libgcc-ng[version='>=12'] -> _libgcc_mutex[version='*|0.1',build='conda_forge|main|main']
bioconda::mafft -> libgcc-ng[version='>=12'] -> _libgcc_mutex[version='*|0.1',build='conda_forge|main|main']
bioconda::blast -> libgcc-ng[version='>=12'] -> _libgcc_mutex[version='*|0.1',build='conda_forge|main|main']
conda-forge::cudatoolkit=11.3 -> libgcc-ng[version='>=12'] -> _libgcc_mutex[version='*|0.1',build='conda_forge|main|main']
bioconda::hhsuite -> _openmp_mutex[version='>=4.5'] -> _libgcc_mutex[version='*|0.1',build='conda_forge|main|main']

Package libffi conflicts for:
pytorch::pytorch -> python[version='>=3.10,<3.11.0a0'] -> libffi[version='3.2.*|>=3.2.1,<3.3.0a0|>=3.2.1,<3.3a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4,<4.0a0|>=3.4.2,<3.5.0a0|>=3.3|<3.3.0.a0']
dglteam::dgl-cuda11.3 -> python[version='>=3.8,<3.9.0a0'] -> libffi[version='3.2.*|>=3.2.1,<3.3.0a0|>=3.2.1,<3.3a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4,<4.0a0|>=3.4.2,<3.5.0a0']
python=3.8 -> libffi[version='>=3.2.1,<3.3.0a0|>=3.2.1,<3.3a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4,<4.0a0|>=3.4.2,<3.5.0a0']
requests -> python[version='>=3.8,<3.9.0a0'] -> libffi[version='3.2.*|>=3.2.1,<3.3.0a0|>=3.2.1,<3.3a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4,<4.0a0|>=3.4.2,<3.5.0a0']
conda-forge::psutil -> python[version='>=3.11,<3.12.0a0'] -> libffi[version='3.2.*|>=3.2.1,<3.3.0a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4.2,<3.5.0a0|>=3.4,<4.0a0|>=3.2.1,<3.3a0']
bioconda::hhsuite -> python[version='>=3.10,<3.11.0a0'] -> libffi[version='3.2.*|>=3.2.1,<3.3.0a0|>=3.2.1,<3.3a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4,<4.0a0|>=3.4.2,<3.5.0a0']
conda-forge::tqdm -> python[version='>=2.7'] -> libffi[version='3.2.*|>=3.2.1,<3.3a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4,<4.0a0|>=3.4.2,<3.5.0a0|>=3.2.1,<3.3.0a0']

Package libblas conflicts for:
pytorch::pytorch -> blas=[build=mkl] -> libblas[version='3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.9.0|>=3.9.0,<4.0a0|>=3.8.0,<4.0a0',build='4_mkl|6_mkl|7_mkl|8_mkl|9_mkl|12_mkl|13_mkl|18_mkl|19_mkl|6_mkl|7_mkl|8_mkl|10_mkl|14_linux64_mkl|15_linux64_mkl|16_linux64_mkl|13_linux64_mkl|12_linux64_mkl|11_linux64_mkl|9_mkl|5_mkl|21_mkl|20_mkl|16_mkl|15_mkl|14_mkl|11_mkl|10_mkl|5_mkl']
dglteam::dgl-cuda11.3 -> numpy -> libblas[version='>=3.8.0,<4.0a0|>=3.9.0,<4.0a0']

Package gmp conflicts for:
bioconda::blast -> gnutls[version='>=3.6.5,<3.7.0a0'] -> gmp[version='6.1.*|>=6.1.2|>=4.2']
bioconda::cd-hit -> libgcc -> gmp[version='>=4.2']
bioconda::blast -> gmp[version='>=6.1.2,<7.0a0']
bioconda::infernal -> libgcc -> gmp[version='>=4.2']

Package libsqlite conflicts for:
python=3.8 -> libsqlite[version='>=3.40.0,<4.0a0']
python=3.8 -> sqlite[version='>=3.40.0,<4.0a0'] -> libsqlite[version='3.39.2|3.39.3|3.39.4|3.40.0|>=3.39.4,<4.0a0|>=3.39.2,<4.0a0',build='h753d276_0|h753d276_1']

Package _openmp_mutex conflicts for:
bioconda::csblast -> libgcc-ng[version='>=10.3.0'] -> _openmp_mutex[version='>=4.5']
bioconda::blast -> libgcc-ng[version='>=12'] -> _openmp_mutex[version='>=4.5']
conda-forge::cudatoolkit=11.3 -> libgcc-ng[version='>=12'] -> _openmp_mutex[version='>=4.5']
python=3.8 -> libgcc-ng[version='>=11.2.0'] -> _openmp_mutex[version='>=4.5']
bioconda::infernal -> libgcc-ng[version='>=10.3.0'] -> _openmp_mutex[version='>=4.5']
bioconda::cd-hit -> libgcc-ng[version='>=10.3.0'] -> _openmp_mutex
pytorch::pytorch -> blas=[build=mkl] -> _openmp_mutex[version='*|>=4.5',build=*_llvm]
bioconda::hhsuite -> _openmp_mutex[version='>=4.5']
bioconda::mafft -> libgcc-ng[version='>=12'] -> _openmp_mutex[version='>=4.5']
bioconda::hmmer[version='>=3.3'] -> libgcc-ng[version='>=10.3.0'] -> _openmp_mutex[version='>=4.5']
conda-forge::psutil -> libgcc-ng[version='>=12'] -> _openmp_mutex[version='>=4.5']
bioconda::hhsuite -> libgcc-ng[version='>=10.3.0'] -> _openmp_mutex
bioconda::cd-hit -> _openmp_mutex[version='>=4.5']

Package ca-certificates conflicts for:
conda-forge::psutil -> python[version='>=2.7,<2.8.0a0'] -> ca-certificates
requests -> python -> ca-certificates
pytorch::pytorch -> python[version='>=2.7,<2.8.0a0'] -> ca-certificates
conda-forge::tqdm -> python[version='>=2.7'] -> ca-certificates
bioconda::blast -> gnutls[version='>=3.6.5,<3.7.0a0'] -> ca-certificates
bioconda::hhsuite -> python[version='>=2.7,<2.8.0a0'] -> ca-certificates
python=3.8 -> openssl[version='>=1.1.1s,<1.1.2a'] -> ca-certificates

Package python conflicts for:
dglteam::dgl-cuda11.3 -> networkx -> python[version='2.7.*|3.5.*|3.6.*|>=2.7,<2.8.0a0|>=3.5|>=3.6|>=3.7|>=3.8|>=3.5,<3.6.0a0|3.4.*|>=3.11,<3.12.0a0|>=3.7,<4.0|>=3.6,<4.0|>=2.7']
dglteam::dgl-cuda11.3 -> python[version='>=3.10,<3.11.0a0|>=3.6,<3.7.0a0|>=3.7,<3.8.0a0|>=3.8,<3.9.0a0|>=3.9,<3.10.0a0']
python=3.8
conda-forge::psutil -> python_abi=3.11[build=*_cp311] -> python[version='3.10.*|3.11.*|3.9.*|3.8.*|3.7.*']
conda-forge::tqdm -> python[version='2.7.*|3.5.*|3.6.*|>=2.7,<2.8.0a0|>=2.7|>=3.6,<3.7.0a0|>=3.8,<3.9.0a0|>=3.7,<3.8.0a0|3.4.*']
conda-forge::tqdm -> colorama -> python[version='>=3.10,<3.11.0a0|>=3.9,<3.10.0a0|>=3.5,<3.6.0a0|>=3.7|>=3.6']
pytorch::pytorch -> python[version='>=2.7,<2.8.0a0|>=3.10,<3.11.0a0|>=3.8,<3.9.0a0|>=3.7,<3.8.0a0|>=3.9,<3.10.0a0|>=3.6,<3.7.0a0|>=3.5,<3.6.0a0']
requests -> certifi[version='>=2017.4.17'] -> python[version='3.7.*|3.8.*|<4.0|>=3.5|>=3.7']
bioconda::hhsuite -> python[version='>=2.7,<2.8.0a0|>=3.10,<3.11.0a0|>=3.8,<3.9.0a0|>=3.7,<3.8.0a0|>=3.9,<3.10.0a0|>=3.6,<3.7.0a0']
conda-forge::psutil -> python[version='2.7.*|3.5.*|3.6.*|>=2.7,<2.8.0a0|>=3.10,<3.11.0a0|>=3.11,<3.12.0a0|>=3.9,<3.10.0a0|>=3.8,<3.9.0a0|>=3.7,<3.8.0a0|>=3.6,<3.7.0a0|>=3.5,<3.6.0a0|3.4.*']
pytorch::pytorch -> typing_extensions -> python[version='2.7.*|3.5.*|3.6.*|3.9.*|>=3.11,<3.12.0a0|>=3.5|>=3.6|>=3.7|>=3.6,<3.7|3.4.*|3.9.10|3.8.12|3.7.12|3.7.10|3.7.10|3.6.12|3.7.9|3.6.12|3.6.9|3.6.9|3.6.9|3.6.9|>=3.8|>=3',build='0_73_pypy|2_73_pypy|4_73_pypy|0_73_pypy|0_73_pypy|1_73_pypy|5_73_pypy|5_73_pypy|3_73_pypy|1_73_pypy']
bioconda::hhsuite -> python_abi=3.10[build=*_cp310] -> python[version='2.7.*|3.10.*|3.8.*|3.7.*|3.9.*|3.6.*']
requests -> python[version='2.7.*|3.5.*|3.6.*|>=2.7,<2.8.0a0|>=3.10,<3.11.0a0|>=3.7,<3.8.0a0|>=3.8,<3.9.0a0|>=3.9,<3.10.0a0|>=3.6|>=3.6,<3.7.0a0|>=3.5,<3.6.0a0|>=3.7,<4.0|>=3.6,<4.0|3.4.*']
bioconda::blast -> boost[version='>=1.68.0,<1.68.1.0a0'] -> python[version='2.7.*|3.5.*|3.6.*|>=2.7,<2.8.0a0|>=3.6,<3.7.0a0|>=3.7,<3.8.0a0|>=3.5,<3.6.0a0|3.4.*|>=3.10,<3.11.0a0|>=3.9,<3.10.0a0|>=3.8,<3.9.0a0|>=3.11,<3.12.0a0']

Package mkl conflicts for:
pytorch::pytorch -> mkl[version='>=2018']
pytorch::pytorch -> blas=[build=mkl] -> mkl[version='>=2018.0.0,<2019.0a0|>=2018.0.1,<2019.0a0|>=2018.0.2,<2019.0a0|>=2018.0.3,<2019.0a0|>=2019.1,<2021.0a0|>=2019.3,<2021.0a0|>=2019.4,<2021.0a0|>=2021.2.0,<2022.0a0|>=2021.3.0,<2022.0a0|>=2021.4.0,<2022.0a0|>=2019.4,<2020.0a0']

Package libzlib conflicts for:
bioconda::cd-hit -> zlib[version='>=1.2.12,<1.3.0a0'] -> libzlib[version='1.2.11|1.2.11|1.2.11|1.2.12|1.2.12|1.2.12|1.2.12|1.2.12|1.2.13',build='h36c2ea0_1013|h166bdaf_0|h166bdaf_2|h166bdaf_3|h166bdaf_4|h166bdaf_1|h166bdaf_1014|h36c2ea0_1012']
python=3.8 -> zlib[version='>=1.2.13,<1.3.0a0'] -> libzlib[version='1.2.11|1.2.11|1.2.11|1.2.12|1.2.12|1.2.12|1.2.12|1.2.12|1.2.13|>=1.2.12,<1.3.0a0',build='h36c2ea0_1013|h166bdaf_0|h166bdaf_2|h166bdaf_3|h166bdaf_4|h166bdaf_1|h166bdaf_1014|h36c2ea0_1012']
python=3.8 -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.13,<1.3.0a0']
conda-forge::psutil -> python[version='>=3.11,<3.12.0a0'] -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.13,<1.3.0a0|>=1.2.12,<1.3.0a0']
bioconda::blast -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0']
bioconda::cd-hit -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0']
bioconda::hhsuite -> python[version='>=3.10,<3.11.0a0'] -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0|>=1.2.13,<1.3.0a0']
bioconda::blast -> curl[version='>=7.83.1,<8.0a0'] -> libzlib[version='1.2.11|1.2.11|1.2.11|1.2.12|1.2.12|1.2.12|1.2.12|1.2.12|1.2.13|>=1.2.13,<1.3.0a0',build='h36c2ea0_1013|h166bdaf_0|h166bdaf_2|h166bdaf_3|h166bdaf_4|h166bdaf_1|h166bdaf_1014|h36c2ea0_1012']
requests -> python[version='>=3.8,<3.9.0a0'] -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.13,<1.3.0a0|>=1.2.12,<1.3.0a0']
conda-forge::tqdm -> python[version='>=2.7'] -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0|>=1.2.13,<1.3.0a0']
pytorch::pytorch -> python[version='>=3.10,<3.11.0a0'] -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0|>=1.2.13,<1.3.0a0']
dglteam::dgl-cuda11.3 -> python[version='>=3.8,<3.9.0a0'] -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.13,<1.3.0a0|>=1.2.12,<1.3.0a0']

Package sqlite conflicts for:
python=3.8 -> sqlite[version='>=3.30.0,<4.0a0|>=3.30.1,<4.0a0|>=3.31.1,<4.0a0|>=3.32.3,<4.0a0|>=3.33.0,<4.0a0|>=3.35.4,<4.0a0|>=3.36.0,<4.0a0|>=3.38.0,<4.0a0|>=3.39.3,<4.0a0|>=3.40.0,<4.0a0|>=3.37.1,<4.0a0|>=3.37.0,<4.0a0|>=3.35.5,<4.0a0|>=3.34.0,<4.0a0']
python=3.8 -> pypy3.8=7.3.9 -> sqlite[version='>=3.38.2,<4.0a0|>=3.39.1,<4.0a0|>=3.39.2,<4.0a0']

Package perl conflicts for:
bioconda::blast -> entrez-direct -> perl[version='>=5.22.0,<5.23.0|>=5.26.2,<5.27.0a0|>=5.32.1,<6.0a0',build=*_perl5]
bioconda::blast -> perl[version='5.22.0.*|>=5.26.2,<5.26.3.0a0']
bioconda::hhsuite -> perl[version='>=5.26.2,<5.26.3.0a0|>=5.32.1,<5.33.0a0',build=*_perl5]
bioconda::infernal -> perl[version='>=5.32.1,<5.33.0a0',build=*_perl5]

Package pypy3.6 conflicts for:
requests -> certifi[version='>=2017.4.17'] -> pypy3.6[version='7.3.*|7.3.0.*|7.3.1.*|7.3.2.*|7.3.3.*|>=7.3.1|>=7.3.2|>=7.3.3']
pytorch::pytorch -> python[version='>=3.6,<3.7.0a0'] -> pypy3.6[version='7.3.0.*|7.3.1.*|7.3.2.*|7.3.3.*|>=7.3.1|>=7.3.3|>=7.3.2']
bioconda::blast -> boost -> pypy3.6[version='>=7.3.3']
bioconda::hhsuite -> python[version='>=3.6,<3.7.0a0'] -> pypy3.6[version='7.3.0.*|7.3.1.*|7.3.2.*|7.3.3.*']
dglteam::dgl-cuda11.3 -> numpy -> pypy3.6[version='7.3.0.*|7.3.1.*|7.3.2.*|7.3.3.*|>=7.3.1|>=7.3.2|>=7.3.3']
conda-forge::psutil -> python[version='>=3.6,<3.7.0a0'] -> pypy3.6[version='7.3.*|7.3.0.*|7.3.1.*|7.3.2.*|7.3.3.*']
conda-forge::tqdm -> python[version='>=2.7'] -> pypy3.6[version='7.3.0.*|7.3.1.*|7.3.2.*|7.3.3.*']
conda-forge::psutil -> pypy3.6[version='>=7.3.1|>=7.3.2|>=7.3.3']

Package xz conflicts for:
python=3.8 -> xz[version='>=5.2.4,<5.3.0a0|>=5.2.4,<6.0a0|>=5.2.5,<6.0a0|>=5.2.6,<6.0a0|>=5.2.5,<5.3.0a0']
python=3.8 -> pypy3.8=7.3.9 -> xz[version='>=5.2.6,<5.3.0a0']

Package blas conflicts for:
pytorch::pytorch -> numpy[version='>=1.19'] -> blas[version='*|1.0|1.0|1.1',build='openblas|mkl|openblas|openblas']
pytorch::pytorch -> blas=[build=mkl]

Package requests conflicts for:
dglteam::dgl-cuda11.3 -> requests
requests

Package expat conflicts for:
python=3.8 -> pypy3.8=7.3.9 -> expat[version='>=2.4.7,<3.0a0|>=2.4.8,<3.0a0|>=2.4.9,<3.0a0|>=2.5.0,<3.0a0']
conda-forge::psutil -> pypy3.9[version='>=7.3.9'] -> expat[version='>=2.2.9,<3.0.0a0|>=2.3.0,<3.0a0|>=2.4.1,<3.0a0|>=2.4.7,<3.0a0|>=2.4.8,<3.0a0|>=2.4.9,<3.0a0|>=2.5.0,<3.0a0']

Package psutil conflicts for:
dglteam::dgl-cuda11.3 -> psutil
conda-forge|conda-forge::psutil
conda-forge::psutil

Package libnsl conflicts for:
conda-forge::tqdm -> python[version='>=2.7'] -> libnsl[version='>=2.0.0,<2.1.0a0']
bioconda::hhsuite -> perl[version='>=5.32.1,<5.33.0a0',build=*_perl5] -> libnsl[version='>=2.0.0,<2.1.0a0']
conda-forge::psutil -> python[version='>=3.11,<3.12.0a0'] -> libnsl[version='>=2.0.0,<2.1.0a0']
bioconda::infernal -> perl[version='>=5.32.1,<5.33.0a0',build=*_perl5] -> libnsl[version='>=2.0.0,<2.1.0a0']
pytorch::pytorch -> python[version='>=3.10,<3.11.0a0'] -> libnsl[version='>=2.0.0,<2.1.0a0']
requests -> python[version='>=3.8,<3.9.0a0'] -> libnsl[version='>=2.0.0,<2.1.0a0']
bioconda::blast -> perl -> libnsl[version='>=2.0.0,<2.1.0a0']
python=3.8 -> libnsl[version='>=2.0.0,<2.1.0a0']
dglteam::dgl-cuda11.3 -> python[version='>=3.8,<3.9.0a0'] -> libnsl[version='>=2.0.0,<2.1.0a0']

Package cudatoolkit conflicts for:
conda-forge|conda-forge::cudatoolkit=11.3
pytorch::pytorch -> cudatoolkit[version='8.*|>=10.0,<10.1|>=10.1,<10.2|>=10.2,<10.3|>=11.3,<11.4|>=11.6,<11.7|>=11.5,<11.6|>=11.1,<11.2|>=11.0,<11.1|>=9.2,<9.3|>=9.0,<9.1|>=8.0,<8.1|9.*']
conda-forge::cudatoolkit=11.3

Package setuptools conflicts for:
python=3.8 -> pip -> setuptools
dglteam::dgl-cuda11.3 -> networkx -> setuptools

Package gdbm conflicts for:
bioconda::blast -> perl -> gdbm[version='>=1.18|>=1.18,<1.19.0a0']
python=3.8 -> pypy3.8=7.3.9 -> gdbm[version='>=1.18,<1.19.0a0']
conda-forge::psutil -> pypy3.9[version='>=7.3.9'] -> gdbm[version='>=1.18,<1.19.0a0']

Package clangdev conflicts for:
bioconda::cd-hit -> openmp -> clangdev[version='4.0.0|4.0.0|4.0.0.*|5.0.0|5.0.0.*']
bioconda::hhsuite -> openmp -> clangdev[version='4.0.0|4.0.0|4.0.0.*|5.0.0|5.0.0.*']

Package llvm-openmp conflicts for:
pytorch::pytorch -> blas=[build=mkl] -> llvm-openmp[version='>=10.0.0|>=11.0.0|>=11.0.1|>=11.1.0|>=12.0.1|>=13.0.1|>=14.0.4|>=15.0.6|>=14.0.3|>=9.0.1']
bioconda::cd-hit -> _openmp_mutex[version='>=4.5'] -> llvm-openmp[version='8.0.0|8.0.0|8.0.1|>=9.0.1',build='hc9558a2_0|hc9558a2_0|hc9558a2_1']
bioconda::hhsuite -> _openmp_mutex[version='>=4.5'] -> llvm-openmp[version='8.0.0|8.0.0|8.0.1|>=9.0.1',build='hc9558a2_0|hc9558a2_0|hc9558a2_1']

Package tqdm conflicts for:
dglteam::dgl-cuda11.3 -> tqdm
conda-forge::tqdm
conda-forge|conda-forge::tqdm

Package libstdcxx-ng conflicts for:
python=3.8 -> libstdcxx-ng[version='>=11.2.0|>=7.5.0|>=7.3.0|>=9.4.0|>=9.3.0']
python=3.8 -> libffi[version='>=3.2.1,<3.3a0'] -> libstdcxx-ng[version='>=4.9|>=7.2.0']

Package zlib conflicts for:
python=3.8 -> zlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0|>=1.2.13,<1.3.0a0']
python=3.8 -> pypy3.8=7.3.9 -> zlib

Package blast-legacy conflicts for:
biocore::blast-legacy=2.2.26
biocore|biocore::blast-legacy=2.2.26
biocore::psipred=4.01 -> blast-legacy==2.2.26

The following specifications were found to be incompatible with your system:

  - feature:/linux-64::__glibc==2.35=0
  - feature:|@/linux-64::__glibc==2.35=0
  - bioconda::blast -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - bioconda::cd-hit -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - bioconda::csblast -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - bioconda::hhsuite -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - bioconda::hmmer[version='>=3.3'] -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - bioconda::infernal -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - bioconda::mafft -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - conda-forge::cudatoolkit=11.3 -> __glibc[version='>=2.17,<3.0.a0']
  - conda-forge::cudatoolkit=11.3 -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - conda-forge::psutil -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - pytorch::pytorch -> cudatoolkit[version='>=11.3,<11.4'] -> __glibc[version='>=2.17,<3.0.a0']

Your installed version is: 2.35

Note that strict channel priority may have removed packages required for satisfiability.

I've also tried the same thing on the standard PyTorch 13.1 Deep Learning AMI on AWS and ran into the same issue. What am I doing wrong here?

possible mistake in make_rna_msa.sh

First of all thanks for the great work!

I think we might have found a mistake in input_prep/make_rna_msa.sh (line 126):

for e_val in 1e-8, 1e-7, 1e-6, 1e-3, 1e-2, 1e-1
do
    nhmmer --noali -A nhmmer.a2m --incE $e_val --cpu $CPU --watson $in_fasta db | grep 'no alignment saved'
    ...
done

The loop here is wrong: it passes 1e-8, etc. (including the trailing comma) to nhmmer, so it only works for 1e-1, which has no trailing comma. For the other values nhmmer throws an error message, which is hidden by the grep command.
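The word-splitting behavior described in this report can be reproduced without nhmmer. The following is a minimal bash sketch; the function names `buggy_items` and `fixed_items` are hypothetical and exist only for illustration, and the whitespace-separated list is one possible fix rather than the maintainers' confirmed change.

```shell
#!/usr/bin/env bash
# Illustration only: how the shell splits a `for` word list.
# Function names are hypothetical; nhmmer is not invoked.

# Buggy form: commas are not separators in a shell word list,
# so every item except the last keeps its trailing comma.
buggy_items() {
    for e_val in 1e-8, 1e-7, 1e-6, 1e-3, 1e-2, 1e-1
    do
        printf '[%s]\n' "$e_val"
    done
}

# Fixed form: items are separated by whitespace only.
fixed_items() {
    for e_val in 1e-8 1e-7 1e-6 1e-3 1e-2 1e-1
    do
        printf '[%s]\n' "$e_val"
    done
}
```

Calling `buggy_items` prints `[1e-8,]` first and `[1e-1]` last, while `fixed_items` prints six clean values, which matches the observation that only the final threshold was accepted.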

Issue with using BLAST

Hello! Thanks for this great work which helps me a lot!

I am using RoseTTAFold2NA for the prediction of protein and RNA complexes. However, I am facing an error in constructing the MSA of the RNA. The issue arises when the program attempts to execute this command. The following error message is returned:

BLAST Database error: No alias or index file found for nucleotide database [/home/huangtin/RoseTTAFold2NA/RNA/] in search path [/home/huangtin/RoseTTAFold2NA/aptamer_data/test/rna_pred::]

It appears that BLAST is unable to locate the index file of the 'nt' database, even though it has been downloaded. The RNA sequence I am utilizing is GCUUCUGGACUGCGAUGGGAGCACGAAACGUCGUGGCGCAAUUGGGUGGGGAAAGUCCUUAAAAGAGGGCCACCACAGAAGC.

I would appreciate any assistance in resolving this issue. Thanks!
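One possible reading of the error, offered as a guess rather than a diagnosis: the database name in brackets is a directory ending in a slash, not a database prefix such as a path ending in `nt`. The sketch below checks whether a given prefix has the alias (`.nal`) or index (`.nin`) files that BLAST looks for in a nucleotide database. The helper name `nucl_blastdb_exists` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: check that a nucleotide BLAST database prefix has
# alias or index files. `nucl_blastdb_exists` is a hypothetical helper.
nucl_blastdb_exists() {
    local prefix="$1" f
    # A nucleotide database is addressed by its prefix (for example
    # /data/RNA/nt), with an alias file <prefix>.nal or index files
    # <prefix>.nin (single volume) or <prefix>.NN.nin (multi-volume).
    for f in "$prefix.nal" "$prefix.nin" "$prefix".[0-9]*.nin; do
        [ -f "$f" ] && return 0
    done
    return 1
}

# Example call (path is illustrative):
# nucl_blastdb_exists /path/to/RNA/nt || echo "no BLAST database at that prefix" >&2
```

A prefix that is only a directory, such as one ending in `/`, fails this check even when the database files sit inside that directory.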

RuntimeError: CUDA out of memory.

Which parameter represents how tightly the protein binds to the nucleic acid?

RF2NA is a great tool for nucleic acid docking with protein.

I have a problem.

Which parameter represents how tightly the protein binds to the nucleic acid?

Is it the dist array in the .npz file?

Inconsistent outputs

Hello!

I've been testing the software by running the same input protein and DNA sequences 10 times, and the outputs seem quite inconsistent. The predicted accuracy is similar (0.66-0.67) in most cases, but the DNA binds different regions of the protein. The files are attached below. I wonder if this is an issue on my side or something associated with the weights.
RoseTTAFold2NA_output.zip

I also noticed that the DNA sometimes overlaps with the protein.

In other cases some residues appear as "- - - -" when the structure is visualized.

Thanks for your help!

perl oom program killed

When I run the ../input_prep/reprocess_rnac.pl id_mapping.tsv.gz rfam_annotations.tsv.gz command I get this error:
oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=reprocess_rnac.,pid=334,uid=1000
[ 142.115180] Out of memory: Killed process 334 (reprocess_rnac.) total-vm:9579944kB, anon-rss:7421640kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:18624kB oom_score_adj:0

Does anyone know what this means?

Typo: cmpress -> compress

In section 4 of the README, "Download sequence and structure databases", is cmpress Rfam.cm a typo? Should it be compress Rfam.cm?

numpy 1.24 breaks the code

Noticed the following error when running the default env yaml - "AttributeError: module 'numpy' has no attribute 'long'"
After installing numpy 1.23 with Python 3.10 it seems to work fine. I suggest updating the yaml.

Training data

Hi, thanks for your amazing work

I wanted to ask whether you could share the training code and data.
For the RNA/DNA data, all I can see in the repo are sequences. Could you please share the PDB files?

Thanks

Downloading the RNA database according to README instructions

In the README section on downloading the RNA databases, there may be a small typo.
When I run wget with the -C flag (as shown below), I get an "invalid option" error. Should this be a lowercase "c" in all of these commands?
wget ftp://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/Rfam.cm.gz -C ./RNA
wget: invalid option -- 'C'
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.
(RF2NA) [osu10269@pitzer-login04 RoseTTAFold2NA]$ wget ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/sequences/rnacentral_species_specific_ids.fasta.gz -C ./RNA
wget: invalid option -- 'C'
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.
(RF2NA) [osu10269@pitzer-login04 RoseTTAFold2NA]$ wget ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/rfam/rfam_annotations.tsv.gz -C ./RNA
wget: invalid option -- 'C'
Usage: wget [OPTION]... [URL]...
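For context on the flags involved: in wget, `-P` (`--directory-prefix`) selects the directory where files are saved, while lowercase `-c` (`--continue`) resumes a partial download and does not take a directory. A minimal sketch of downloading into `./RNA` with `-P` follows. The helper `download_to_dir` and the `WGET` override variable are hypothetical, and this is not necessarily what the README intended.

```shell
#!/usr/bin/env bash
# Sketch only: download a URL into a target directory with wget -P.
# `download_to_dir` and the WGET override are hypothetical helpers.
download_to_dir() {
    local dir="$1" url="$2"
    mkdir -p "$dir"
    # -P (--directory-prefix) sets the directory where wget saves the file.
    "${WGET:-wget}" -P "$dir" "$url"
}

# Example call (not executed here):
# download_to_dir ./RNA ftp://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/Rfam.cm.gz
```

Setting `WGET=echo` before the call prints the arguments instead of downloading, which is a cheap way to check the command line first.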

File Sizes

In order to actually install RF2NA how much storage needs to be allocated to the system? The compressed file sizes are listed, but not how much storage the databases they're extracted to require.

How to handle case where no RNA families are found

Hi, @fdimaio. Thanks again for making this project open-source. I have a question about RNA MSA generation. For some small RNA sequence inputs I've tried running through RF2NA, the RNA MSA generation script (i.e., make_rna_msa.sh) issues a series of errors stemming from the line linked below when cmscan finds no RNA families. It seems that in this case all subsequent lines fail because they assume at least one family was found. Is there a simple adjustment we can make to the MSA generation script to handle the case where no RNA families are found? For example, should the script simply exit early instead of proceeding with the subsequent steps?

Another way of phrasing this question: does RF2NA have a way of performing single-sequence predictions for RNA FASTA inputs? If not, what changes would likely be necessary?

https://github.com/uw-ipd/RoseTTAFold2NA/blob/03f12bd421db618455d9c0726f79f72433a8638e/input_prep/make_rna_msa.sh#L62C1-L62C1
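A guard of the kind asked about could look like the sketch below. It assumes cmscan was run with `--tblout`, whose output marks comment lines with a leading `#`. The helper name `cmscan_has_hits` and the file name in the usage comment are hypothetical, and whether exiting early is the right behavior for RF2NA is a question for the maintainers.

```shell
#!/usr/bin/env bash
# Sketch only: detect an empty cmscan result before later steps run.
# `cmscan_has_hits` and the file name used below are hypothetical.
cmscan_has_hits() {
    local tblout="$1"
    # Infernal's --tblout files mark comment lines with a leading '#';
    # any other non-blank line is treated as a hit.
    [ -f "$tblout" ] && grep -v '^#' "$tblout" | grep -q '[^[:space:]]'
}

# Possible use inside an MSA script (assumed file name):
# if ! cmscan_has_hits cmscan.tblout; then
#     echo "No Rfam families found; skipping family-based search" >&2
#     exit 0
# fi
```

Checking the hit table explicitly keeps the later steps from running on empty input, so the first visible message is the real cause rather than a chain of follow-on errors.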

Error when predicting a protein and RNA structure

When I run this code, I get the error shown below. Do you know how to resolve it? Thank you very much.

Traceback (most recent call last):
  File "/home/software/RoseTTAFold2NA/network/predict.py", line 345, in <module>
    pred.predict(inputs=args.inputs, out_prefix=args.prefix, ffdb=ffdb)
  File "/home/software/RoseTTAFold2NA/network/predict.py", line 146, in predict
    msa_i, ins_i = parse_a3m(a3m_i, unzip=False)
  File "/home/software/RoseTTAFold2NA/network/parsers.py", line 221, in parse_a3m
    msa = np.array([list(s) for s in msa], dtype='|S1').view(np.uint8)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (722,) + inhomogeneous part.
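The ValueError says that the 722 rows handed to np.array do not all have the same length. A quick way to inspect an a3m file for this is sketched below. It assumes each non-header line is one aligned sequence and that lowercase letters (insertions in the a3m format) are dropped before the array is built. The helper `a3m_row_lengths` is hypothetical and not part of RoseTTAFold2NA.

```shell
#!/usr/bin/env bash
# Sketch only: list the distinct row lengths found in an a3m file.
# `a3m_row_lengths` is a hypothetical helper, not part of RoseTTAFold2NA.
a3m_row_lengths() {
    # Skip header lines (starting with '>') and blank lines, drop lowercase
    # letters (insertions in a3m), then print each distinct remaining length.
    awk '!/^>/ && NF { s = $0; gsub(/[a-z]/, "", s); print length(s) }' "$1" | sort -n -u
}

# Example call (file name is illustrative):
# a3m_row_lengths input.a3m
```

A consistent alignment prints a single number. More than one number means some rows differ in length, which would be consistent with the inhomogeneous shape reported above.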

Run RNA prediction error

When I try to predict the structure of this RNA file, I get an error because no data is returned. Could you help me resolve this problem?

RNA
AAUUUCUACUAAGUGUAGAUC

Thanks.

ERROR: failed to load model

Hello. I was trying to fold a ssDNA and I get the above error.

Running HHblits
Running PSIPRED
Running hhsearch
Running RoseTTAFold2NA to predict structures
Running on GPU
ERROR: failed to load model
Done

Any advice is appreciated.

makeblastdb should add the '-parse_seqids' option

BLAST version: ncbi-blast-2.13.0+

gunzip -c rnacentral_species_specific_ids.fasta.gz | makeblastdb -in - -dbtype nucl  -out rnacentral.fasta -title "RNACentral"

should be changed to:

gunzip -c rnacentral_species_specific_ids.fasta.gz | makeblastdb -in - -parse_seqids -dbtype nucl  -out rnacentral.fasta -title "RNACentral"

Otherwise, blastdbcmd raises errors:

blastdbcmd -db $db -entry_batch $tag.list.split.$suffix -out tmp.$tag.db.$suffix -outfmt ">Accession:%a_TaxID:%T @%s" &> /dev/null
Error: [blastdbcmd] Skipped URS0002332FB3_39947                                     
Error: [blastdbcmd] Skipped URS0002332FB3_40148  
Error: [blastdbcmd] Skipped URS0002332FB3_4529

typo `../run_RF2NA.sh t000_ protein.fa R:rna.fa`

The example should probably read ../run_RF2NA.sh t000_ protein.fa R:RNA.fa (uppercase FASTA filename) and not R:rna.fa (lowercase) ;P

Best regards

Tru

Issues with RNA multiple sequence alignment

Hi, thanks for the amazing work!!

I am having some issues when trying to run the model for a protein-RNA docking. I first tried with the protein and RNA sequences you provided for the example and everything worked very well. But when trying with other sequences I get some errors.

The stderr files for hhsearch and the protein MSA do not contain errors.
This is the content of the stderr file for the RNA MSA:

/aplic/noarch/software/RoseTTAFold2NA/0.2-Miniconda3-4.9.2/input_prep/make_rna_msa.sh: line 9: [: missing `]'
rm: cannot remove ‘rfam1.list.split.’: No such file or directory
rm: cannot remove ‘rfam2.list.split.’: No such file or directory
rm: cannot remove ‘blastn1.list.split.*’: No such file or directory
/aplic/noarch/software/RoseTTAFold2NA/0.2-Miniconda3-4.9.2/input_prep/make_rna_msa.sh: line 101: 221255 Aborted (core dumped) cd-hit-est-2d -T $CPU -i $in_fasta -i2 trim.db -c $cut -o cdhitest2d.db -l $throw_away_sequences -M 0 &>/dev/null
grep: db: No such file or directory
rm: cannot remove ‘cdhitest2d.db’: No such file or directory
rm: cannot remove ‘cdhitest2d.db.clstr’: No such file or directory
rm: cannot remove ‘db.clstr’: No such file or directory