uw-ipd / rosettafold2na

RoseTTAFold2 protein/nucleic acid complex prediction

License: MIT License

Dockerfile 0.42% Shell 2.70% Python 96.71% Perl 0.16%

rosettafold2na's Introduction

RF2NA

GitHub repo for RoseTTAFold2 with nucleic acids

New: April 13, 2023 v0.2

  • Updated weights (https://files.ipd.uw.edu/dimaio/RF2NA_apr23.tgz) for better prediction of homodimer:DNA interactions and better DNA-specific sequence recognition
  • Bugfixes in MSA generation pipeline
  • Support for paired protein/RNA MSAs

Installation

  1. Clone the package
git clone https://github.com/uw-ipd/RoseTTAFold2NA.git
cd RoseTTAFold2NA
  2. Create conda environment. All external dependencies are contained in RF2na-linux.yml
# create conda environment for RoseTTAFold2NA
conda env create -f RF2na-linux.yml

You also need to install NVIDIA's SE(3)-Transformer (please use SE3Transformer in this repo to install).

conda activate RF2NA
cd SE3Transformer
pip install --no-cache-dir -r requirements.txt
python setup.py install
cd ..
  3. Download pre-trained weights under network directory
cd network
wget https://files.ipd.uw.edu/dimaio/RF2NA_apr23.tgz
tar xvfz RF2NA_apr23.tgz
ls weights/ # it should contain a 1.1GB weights file
cd ..
  4. Download sequence and structure databases
# uniref30 [46G]
wget http://wwwuser.gwdg.de/~compbiol/uniclust/2020_06/UniRef30_2020_06_hhsuite.tar.gz
mkdir -p UniRef30_2020_06
tar xfz UniRef30_2020_06_hhsuite.tar.gz -C ./UniRef30_2020_06

# BFD [272G]
wget https://bfd.mmseqs.com/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt.tar.gz
mkdir -p bfd
tar xfz bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt.tar.gz -C ./bfd

# structure templates (including *_a3m.ffdata, *_a3m.ffindex)
wget https://files.ipd.uw.edu/pub/RoseTTAFold/pdb100_2021Mar03.tar.gz
tar xfz pdb100_2021Mar03.tar.gz

# RNA databases
mkdir -p RNA
cd RNA

# Rfam [300M]
wget ftp://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/Rfam.full_region.gz
wget ftp://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/Rfam.cm.gz
gunzip Rfam.cm.gz
cmpress Rfam.cm

# RNAcentral [12G]
wget ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/rfam/rfam_annotations.tsv.gz
wget ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/id_mapping/id_mapping.tsv.gz
wget ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/sequences/rnacentral_species_specific_ids.fasta.gz
../input_prep/reprocess_rnac.pl id_mapping.tsv.gz rfam_annotations.tsv.gz   # ~8 minutes
gunzip -c rnacentral_species_specific_ids.fasta.gz | makeblastdb -in - -dbtype nucl  -parse_seqids -out rnacentral.fasta -title "RNACentral"

# nt [151G]
update_blastdb.pl --decompress nt
cd ..

Usage

conda activate RF2NA
cd example
# run Protein/RNA prediction
../run_RF2NA.sh rna_pred rna_binding_protein.fa R:RNA.fa
# run Protein/DNA prediction
../run_RF2NA.sh dna_pred dna_binding_protein.fa D:DNA.fa

Inputs

  • The first argument to the script is the output folder
  • The remaining arguments are fasta files for individual chains in the structure. Use the tags P:xxx.fa R:xxx.fa D:xxx.fa S:xxx.fa to specify protein, RNA, double-stranded DNA, and single-stranded DNA, respectively. Use the tag PR:xxx.fa to specify paired protein/RNA. Each chain is a separate file; 'D' will automatically generate a complementary DNA strand to the input strand.
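
To make the tag convention above concrete, here is a minimal Python sketch of how a chain argument could be split into a chain type and a FASTA path, and of what a complementary strand looks like for a D: input. The helper names parse_chain_arg and reverse_complement are hypothetical and are not taken from the repository. Treating an untagged argument as protein is an assumption based on the usage example, and writing the complementary strand as the reverse complement is an assumption about the convention.

```python
from typing import Tuple

# Tags and descriptions as listed in the Inputs section.
CHAIN_TAGS = {
    "P": "protein",
    "R": "RNA",
    "D": "double-stranded DNA",
    "S": "single-stranded DNA",
    "PR": "paired protein/RNA",
}


def parse_chain_arg(arg: str) -> Tuple[str, str]:
    """Split 'TAG:path.fa' into (tag, path). Hypothetical helper."""
    tag, sep, path = arg.partition(":")
    if sep and tag in CHAIN_TAGS:
        return tag, path
    # Assumption: an argument without a known tag is a protein chain.
    return "P", arg


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complement each base, then reverse the strand (assumed convention)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for arg in ["dna_binding_protein.fa", "D:DNA.fa", "PR:xxx.fa"]:
        tag, path = parse_chain_arg(arg)
        print(f"{arg}: {CHAIN_TAGS[tag]} read from {path}")
    print(reverse_complement("ATGC"))  # GCAT
```

This only illustrates the command-line convention; the actual parsing is done by run_RF2NA.sh and the prediction script.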

Expected outputs

  • Outputs are written to the folder provided as the first argument (dna_pred and rna_pred).
  • Model outputs are placed in a subfolder, models (e.g., dna_pred.models)
  • You will get a predicted structure with estimated per-residue LDDT in the B-factor column (models/model_00.pdb)
  • You will get a numpy .npz file (models/model_00.npz). This can be read with numpy.load and contains three tables (L=complex length):
    • dist (L x L x 37) - the predicted distogram
    • lddt (L) - the per-residue predicted lddt
    • pae (L x L) - the per-residue pair predicted error
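
As a rough illustration of reading the .npz output described above, the following Python sketch loads the three tables with numpy.load and prints a short summary. The key names dist, lddt, and pae are taken from the list above. The function name summarize_prediction and the choice of summary statistics are assumptions made for this example only.

```python
import numpy as np


def summarize_prediction(npz_path: str) -> dict:
    """Load a model .npz file and return a few summary values (sketch)."""
    with np.load(npz_path) as data:
        dist = data["dist"]  # predicted distogram, documented as L x L x 37
        lddt = data["lddt"]  # per-residue predicted lddt, documented as L
        pae = data["pae"]    # pairwise predicted error, documented as L x L
    return {
        "length": int(lddt.shape[-1]),
        "dist_shape": tuple(dist.shape),
        "pae_shape": tuple(pae.shape),
        "mean_lddt": float(np.mean(lddt)),
        "mean_pae": float(np.mean(pae)),
    }


if __name__ == "__main__":
    import os
    import tempfile

    # Synthetic stand-in for models/model_00.npz so the sketch runs anywhere.
    L = 5
    rng = np.random.default_rng(0)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model_00.npz")
        np.savez(
            path,
            dist=rng.random((L, L, 37)),
            lddt=rng.random(L),
            pae=rng.random((L, L)),
        )
        print(summarize_prediction(path))
```

For a real run, pass the path to the file written under the models subfolder instead of the synthetic one.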

rosettafold2na's People

Contributors

amorehead, blake-riley, fdimaio, ita-infiniplex, suhassrinivasan

rosettafold2na's Issues

run RNA prediction error

cat test.fa

>RNA
GCGGGGGUUGCCGAGCCUGGUCAAAGGCGGGGGACUCAAGAUCCCCUCCCGUAGGGGUUCCGGGGUUCGAAUCCCCGCCCCCGCACCAUCCCCGCCCCCGCACCA

bash -x ../run_RF2NA.sh test R:test.fa
output:

+ python /home/ubuntu/RoseTTAFold2NA/network/predict.py -inputs R:/home/ubuntu/RoseTTAFold2NA/example/Rtest/Rtest.afa -prefix /home/ubuntu/RoseTTAFold2NA/example/Rtest/models/model -model /home/ubuntu/RoseTTAFold2NA/network/weights/RF2NA_sep22.pt -db /home/ubuntu/RoseTTAFold2NA/pdb100_2021Mar03/pdb100_2021Mar03
Running on GPU
Traceback (most recent call last):
  File "/home/ubuntu/RoseTTAFold2NA/network/predict.py", line 345, in <module>
    pred.predict(inputs=args.inputs, out_prefix=args.prefix, ffdb=ffdb)
  File "/home/ubuntu/RoseTTAFold2NA/network/predict.py", line 225, in predict
    self._run_model(Ls, msa_orig, ins_orig, t1d, t2d, xyz_t, xyz_t[:,0], alpha_t, "%s_%02d"%(out_prefix, i_trial))
  File "/home/ubuntu/RoseTTAFold2NA/network/predict.py", line 233, in _run_model
    seq, msa_seed_orig, msa_seed, msa_extra, mask_msa = MSAFeaturize(
  File "/home/ubuntu/RoseTTAFold2NA/network/data_loader.py", line 135, in MSAFeaturize
    sample_mono = torch.randperm((N-1)//nmer, device=msa.device)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Password

Hi,
I think your repo is locked. Can you make it accessible for cloning?

Getting error "NVTX functions not installed. Are you sure you have a CUDA build?" when running RF2NA on CPU.

Hi,

I get the following error when running RoseTTAFold2NA on CPU. I have already replaced "torch.cuda.amp.autocast" with "torch.amp.autocast" in predict.py to get past another "NVTX functions not installed." error that occurred when running run_RF2NA.sh.

It seems some part of the code is still calling CUDA or searching for GPUs?
Thank you!

Traceback (most recent call last):
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/predict.py", line 376, in <module>
    pred.predict(inputs=args.inputs, out_prefix=args.prefix, ffdb=ffdb)
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/predict.py", line 239, in predict
    self._run_model(Ls, msa_orig, ins_orig, t1d, t2d, xyz_t, xyz_t[:,0], alpha_t, "%s_%02d"%(out_prefix, i_trial))
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/predict.py", line 299, in _run_model
    logit_s, logit_aa_s, logit_pae, init_crds, alpha_prev, _, pred_lddt_binned, msa_prev, pair_prev, state_prev = self.model(
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/RoseTTAFoldModel.py", line 104, in forward
    msa, pair, xyz, alpha_s, xyzallatom, state = self.simulator(
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/Track_module.py", line 441, in forward
    msa_full, pair, xyz, state, alpha = self.extra_block[i_m](msa_full, pair,
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/Track_module.py", line 367, in forward
    xyz, state, alpha = self.str2str(msa.float(), pair.float(), xyz.detach().float(), state.float(), idx, top_k=top_k)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/Track_module.py", line 234, in forward
    shift = self.se3(G, node.reshape(B*L, -1, 1), l1_feats, edge_feats)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/expanse/lustre/projects/ddp398/wjin/software/RoseTTAFold2NA/network/SE3_network.py", line 84, in forward
    return self.se3(G, node_features, edge_features) #, clamp_d=clamp_d)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/se3_transformer-1.0.0-py3.8.egg/se3_transformer/model/transformer.py", line 163, in forward
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/se3_transformer-1.0.0-py3.8.egg/se3_transformer/model/basis.py", line 166, in get_basis
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/contextlib.py", line 114, in __enter__
    return next(self.gen)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/cuda/nvtx.py", line 86, in range
    range_push(msg.format(*args, **kwargs))
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/cuda/nvtx.py", line 28, in range_push
    return _nvtx.rangePushA(msg)
  File "/home/wjin/data/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/cuda/nvtx.py", line 9, in _fail
    raise RuntimeError("NVTX functions not installed. Are you sure you have a CUDA build?")
RuntimeError: NVTX functions not installed. Are you sure you have a CUDA build?

My package versions:

brotlipy==0.7.0
certifi @ file:///croot/certifi_1665076670883/work/certifi
cffi @ file:///tmp/abs_98z5h56wf8/croots/recipe/cffi_1659598650955/work
charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
click==8.1.3
colorama @ file:///opt/conda/conda-bld/colorama_1657009087971/work
configparser==5.3.0
cryptography @ file:///croot/cryptography_1665612644927/work
dgl==0.9.1.post1
DLLogger @ git+https://github.com/NVIDIA/dllogger@0540a43971f4a8a16693a9de9de73c1072020769
docker-pycreds==0.4.0
e3nn==0.3.3
gitdb==4.0.9
GitPython==3.1.29
idna @ file:///croot/idna_1666125576474/work
mkl-fft==1.3.1
mkl-random @ file:///tmp/build/80754af9/mkl_random_1626186064646/work
mkl-service==2.4.0
mpmath==1.2.1
networkx @ file:///opt/conda/conda-bld/networkx_1657784097507/work
numpy @ file:///croot/numpy_and_numpy_base_1667233465264/work
opt-einsum==3.3.0
opt-einsum-fx==0.1.4
packaging==21.3
pathtools==0.1.2
promise==2.3
protobuf==4.21.9
psutil @ file:///home/conda/feedstock_root/build_artifacts/psutil_1667885878918/work
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pynvml==11.0.0
pyOpenSSL @ file:///opt/conda/conda-bld/pyopenssl_1643788558760/work
pyparsing==3.0.9
PySocks @ file:///tmp/build/80754af9/pysocks_1605305779399/work
python-dateutil==2.8.2
PyYAML==6.0
requests @ file:///opt/conda/conda-bld/requests_1657734628632/work
scipy==1.9.3
se3-transformer==1.0.0
sentry-sdk==1.11.0
shortuuid==1.0.11
six @ file:///tmp/build/80754af9/six_1644875935023/work
smmap==5.0.0
subprocess32==3.5.4
sympy==1.11.1
torch==1.13.0
tqdm @ file:///home/conda/feedstock_root/build_artifacts/tqdm_1662214488106/work
typing_extensions @ file:///tmp/abs_ben9emwtky/croots/recipe/typing_extensions_1659638822008/work
urllib3 @ file:///croot/urllib3_1666298941550/work
wandb==0.12.0
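
The traceback above ends inside torch.cuda.nvtx.range, which is entered as a context manager from se3_transformer/model/basis.py. One possible workaround, sketched here under assumptions, is to guard NVTX ranges so that they become no-ops when no CUDA device is available. The helper name nvtx_range is hypothetical, the sketch has not been tested against this repository, and other CUDA-only calls may remain elsewhere in the code.

```python
import contextlib


def nvtx_range(msg: str):
    """Return an NVTX range on CUDA machines, otherwise a no-op context.

    Hypothetical helper: it wraps torch.cuda.nvtx.range, the call that
    raises in the traceback above, and falls back to nullcontext on CPU.
    """
    try:
        import torch
    except ImportError:
        # No PyTorch at all: nothing to profile, so do nothing.
        return contextlib.nullcontext()
    if torch.cuda.is_available():
        return torch.cuda.nvtx.range(msg)
    return contextlib.nullcontext()


if __name__ == "__main__":
    # Works the same way with or without a GPU.
    with nvtx_range("get_basis"):
        result = sum(range(10))
    print(result)
```

In this sketch, call sites that currently use the NVTX range directly would use nvtx_range instead; whether that is sufficient for a full CPU run is not confirmed here.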

Strange output when running example

Thank you for the open-source code!

I was running the example provided in the README with:
../run_RF2NA.sh t000_ protein.fa R:RNA.fa

The output PDB looks mostly reasonable, except the last 11 bases of the RNA seem to be missing and there are 8 UNK protein residues added at the end. I have attached a copy of the output folder.

I am not familiar with the RNA MSA processing, but it does appear that the RNA.afa file has a single sequence that is also missing the same 11 bases as the final output. Would you be able to confirm whether these intermediate output files look reasonable? I don't see anything out of the ordinary in the log files.

t000_.tar.gz
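
A quick way to check for the kind of truncation described in this report is to compare the length of the input sequence with the first sequence in the generated .afa file. The Python sketch below does this for FASTA-formatted text. It assumes the .afa file is an aligned FASTA whose first record is the query, which is not confirmed by the report, and the function names read_fasta and query_length_mismatch are hypothetical.

```python
from typing import List, Tuple


def read_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA-formatted text into (header, sequence) records."""
    records: List[Tuple[str, str]] = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def query_length_mismatch(input_fasta: str, msa_afa: str) -> int:
    """Return len(input query) minus len(first MSA row without gaps)."""
    query = read_fasta(input_fasta)[0][1]
    # Assumption: the first record of the alignment is the query sequence.
    first_row = read_fasta(msa_afa)[0][1].replace("-", "").replace(".", "")
    return len(query) - len(first_row)


if __name__ == "__main__":
    # Toy sequences, not the ones from the attached archive.
    fasta_text = ">RNA\nGGGAAACCCUUU\n"
    afa_text = ">RNA\nGGGAAACC\n"
    print(query_length_mismatch(fasta_text, afa_text))  # 4 bases missing
```

To apply it to real files, read RNA.fa and RNA.afa with open(path).read() and pass the contents to query_length_mismatch; a nonzero result would point at the MSA step rather than the network.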

Solving environment hangs on Ubuntu 22.04

Hello! I'm trying to install the code on a standard t2.2xlarge AWS instance with Ubuntu 22.04 LTS installed. These are the steps I've taken:

Install Miniconda

https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Clone and create the RF2na environment:

git clone https://github.com/uw-ipd/RoseTTAFold2NA.git
cd RoseTTAFold2NA
conda env create -f RF2na-linux.yml

The following then happens:

Collecting package metadata (repodata.json): done
Solving environment: /
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed

and then

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

Package libgcc conflicts for:
bioconda::blast -> libgcc
bioconda::cd-hit -> libgcc
dglteam::dgl-cuda11.3 -> scipy -> libgcc
bioconda::infernal -> libgcc

Package libgcc-ng conflicts for:
bioconda::cd-hit -> libgcc-ng[version='>=10.3.0|>=12|>=9.3.0|>=7.3.0|>=4.9']
bioconda::mafft -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0|>=9.3.0|>=7.5.0|>=7.3.0']
bioconda::hhsuite -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0|>=9.3.0|>=7.5.0|>=7.3.0']
conda-forge::tqdm -> python[version='>=2.7'] -> libgcc-ng[version='>=10.3.0|>=11.2.0|>=7.5.0|>=7.3.0|>=7.2.0|>=12|>=9.4.0|>=9.3.0|>=4.9']
bioconda::blast -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0|>=9.3.0|>=7.5.0|>=7.3.0|>=4.9']
bioconda::cd-hit -> zlib[version='>=1.2.12,<1.3.0a0'] -> libgcc-ng[version='>=11.2.0|>=7.5.0|>=7.2.0']
python=3.8 -> zlib[version='>=1.2.11,<1.3.0a0'] -> libgcc-ng[version='>=4.9|>=7.2.0']
conda-forge::cudatoolkit=11.3 -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0']
bioconda::hmmer[version='>=3.3'] -> libgcc-ng[version='>=10.3.0|>=9.3.0|>=7.5.0|>=7.3.0']
requests -> python[version='>=3.8,<3.9.0a0'] -> libgcc-ng[version='>=10.3.0|>=11.2.0|>=7.5.0|>=7.3.0|>=12|>=9.4.0|>=9.3.0|>=7.2.0|>=4.9']
conda-forge::psutil -> python[version='>=3.9,<3.10.0a0'] -> libgcc-ng[version='>=11.2.0|>=7.2.0']
bioconda::hhsuite -> perl[version='>=5.32.1,<5.33.0a0',build=*_perl5] -> libgcc-ng[version='>=11.2.0|>=7.2.0|>=4.9']
bioconda::blast -> curl[version='>=7.83.1,<8.0a0'] -> libgcc-ng[version='>=11.2.0|>=7.2.0']
bioconda::infernal -> perl[version='>=5.32.1,<5.33.0a0',build=*_perl5] -> libgcc-ng[version='>=11.2.0|>=9.4.0|>=7.2.0']
conda-forge::psutil -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0|>=9.3.0|>=7.5.0|>=7.3.0|>=4.9']
bioconda::csblast -> libgcc-ng[version='>=10.3.0|>=9.3.0']
python=3.8 -> libgcc-ng[version='>=10.3.0|>=11.2.0|>=7.5.0|>=7.3.0|>=12|>=9.4.0|>=9.3.0']
dglteam::dgl-cuda11.3 -> numpy -> libgcc-ng[version='>=10.3.0|>=11.2.0|>=7.5.0|>=7.3.0|>=7.2.0|>=12|>=9.4.0|>=9.3.0|>=4.9']
bioconda::infernal -> libgcc-ng[version='>=10.3.0|>=9.3.0|>=7.5.0|>=7.3.0|>=4.9']
pytorch::pytorch -> blas=[build=mkl] -> libgcc-ng[version='>=10.3.0|>=12|>=9.4.0|>=9.3.0|>=7.5.0|>=7.3.0|>=11.2.0|>=7.2.0|>=4.9']

Package numpy conflicts for:
bioconda::blast -> boost[version='>=1.68.0,<1.68.1.0a0'] -> numpy[version='1.11.*|1.12.*|1.13.*|>=1.11|>=1.8|>=1.9.3,<2.0a0|>=1.9|>=1.7|>=1.23.4,<2.0a0|>=1.20.3,<2.0a0|>=1.21.6,<2.0a0|>=1.19.5,<2.0a0|>=1.21.5,<2.0a0|>=1.18.5,<2.0a0|>=1.21.4,<2.0a0|>=1.17.5,<2.0a0|>=1.16.6,<2.0a0|>=1.16.5,<2.0a0|>=1.19.4,<2.0a0|>=1.19.2,<2.0a0|>=1.14.6,<2.0a0']
pytorch::pytorch -> numpy[version='>=1.11|>=1.19']
dglteam::dgl-cuda11.3 -> networkx -> numpy[version='1.10.*|1.11.*|1.12.*|1.13.*|>=1.11|>=1.11.3,<2.0a0|>=1.14.6,<2.0a0|>=1.16,<1.23|>=1.19,<1.25.0|>=1.19|>=1.19,<1.26.0|>=1.21,<1.26.0|>=1.21,<1.25.0|>=1.21,<1.23|>=1.16.6,<1.23.0|>=1.21.2,<1.23.0|>=1.16.6,<2.0a0|>=1.15.1,<2.0a0|>=1.9.3,<2.0a0|>=1.21.6,<1.26|>=1.21.6,<2.0a0|>=1.23.4,<1.26|>=1.23.4,<2.0a0|>=1.20.3,<1.26|>=1.20.3,<2.0a0|>=1.19.5,<2.0a0|>=1.20.3,<1.25|>=1.21.6,<1.25|>=1.18.5,<2.0a0|>=1.21.5,<2.0a0|>=1.20.3,<1.23|>=1.21.6,<1.23|>=1.21.4,<2.0a0|>=1.17.5,<2.0a0|>=1.19.4,<2.0a0|>=1.16.5,<2.0a0|>=1.19.2,<2.0a0|>=1.18.1,<2.0a0|>=1.9']
dglteam::dgl-cuda11.3 -> numpy

Package tzdata conflicts for:
conda-forge::psutil -> python[version='>=3.11,<3.12.0a0'] -> tzdata
bioconda::hhsuite -> python[version='>=3.10,<3.11.0a0'] -> tzdata
conda-forge::tqdm -> python[version='>=2.7'] -> tzdata
dglteam::dgl-cuda11.3 -> python[version='>=3.9,<3.10.0a0'] -> tzdata
pytorch::pytorch -> python[version='>=3.10,<3.11.0a0'] -> tzdata
requests -> python[version='>=3.10,<3.11.0a0'] -> tzdata

Package ncurses conflicts for:
python=3.8 -> ncurses[version='>=6.1,<7.0.0a0|>=6.1,<7.0a0|>=6.2,<7.0a0|>=6.3,<7.0a0|>=6.2,<7.0.0a0']
python=3.8 -> readline[version='>=7.0,<8.0a0'] -> ncurses[version='5.9.*|6.0.*|>=6.0,<7.0a0']

Package _libgcc_mutex conflicts for:
bioconda::csblast -> libgcc-ng[version='>=10.3.0'] -> _libgcc_mutex[version='*|0.1|0.1',build='conda_forge|main']
bioconda::cd-hit -> _openmp_mutex[version='>=4.5'] -> _libgcc_mutex[version='*|0.1',build='conda_forge|main|main']
bioconda::hmmer[version='>=3.3'] -> libgcc-ng[version='>=10.3.0'] -> _libgcc_mutex[version='*|0.1|0.1',build='conda_forge|main']
bioconda::infernal -> libgcc-ng[version='>=10.3.0'] -> _libgcc_mutex[version='*|0.1|0.1',build='conda_forge|main']
python=3.8 -> libgcc-ng[version='>=11.2.0'] -> _libgcc_mutex[version='*|0.1|0.1',build='conda_forge|main']
conda-forge::psutil -> libgcc-ng[version='>=12'] -> _libgcc_mutex[version='*|0.1',build='conda_forge|main|main']
bioconda::mafft -> libgcc-ng[version='>=12'] -> _libgcc_mutex[version='*|0.1',build='conda_forge|main|main']
bioconda::blast -> libgcc-ng[version='>=12'] -> _libgcc_mutex[version='*|0.1',build='conda_forge|main|main']
conda-forge::cudatoolkit=11.3 -> libgcc-ng[version='>=12'] -> _libgcc_mutex[version='*|0.1',build='conda_forge|main|main']
bioconda::hhsuite -> _openmp_mutex[version='>=4.5'] -> _libgcc_mutex[version='*|0.1',build='conda_forge|main|main']

Package libffi conflicts for:
pytorch::pytorch -> python[version='>=3.10,<3.11.0a0'] -> libffi[version='3.2.*|>=3.2.1,<3.3.0a0|>=3.2.1,<3.3a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4,<4.0a0|>=3.4.2,<3.5.0a0|>=3.3|<3.3.0.a0']
dglteam::dgl-cuda11.3 -> python[version='>=3.8,<3.9.0a0'] -> libffi[version='3.2.*|>=3.2.1,<3.3.0a0|>=3.2.1,<3.3a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4,<4.0a0|>=3.4.2,<3.5.0a0']
python=3.8 -> libffi[version='>=3.2.1,<3.3.0a0|>=3.2.1,<3.3a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4,<4.0a0|>=3.4.2,<3.5.0a0']
requests -> python[version='>=3.8,<3.9.0a0'] -> libffi[version='3.2.*|>=3.2.1,<3.3.0a0|>=3.2.1,<3.3a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4,<4.0a0|>=3.4.2,<3.5.0a0']
conda-forge::psutil -> python[version='>=3.11,<3.12.0a0'] -> libffi[version='3.2.*|>=3.2.1,<3.3.0a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4.2,<3.5.0a0|>=3.4,<4.0a0|>=3.2.1,<3.3a0']
bioconda::hhsuite -> python[version='>=3.10,<3.11.0a0'] -> libffi[version='3.2.*|>=3.2.1,<3.3.0a0|>=3.2.1,<3.3a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4,<4.0a0|>=3.4.2,<3.5.0a0']
conda-forge::tqdm -> python[version='>=2.7'] -> libffi[version='3.2.*|>=3.2.1,<3.3a0|>=3.3,<3.4.0a0|>=3.4,<3.5|>=3.4,<4.0a0|>=3.4.2,<3.5.0a0|>=3.2.1,<3.3.0a0']

Package libblas conflicts for:
pytorch::pytorch -> blas=[build=mkl] -> libblas[version='3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.8.0|3.9.0|>=3.9.0,<4.0a0|>=3.8.0,<4.0a0',build='4_mkl|6_mkl|7_mkl|8_mkl|9_mkl|12_mkl|13_mkl|18_mkl|19_mkl|6_mkl|7_mkl|8_mkl|10_mkl|14_linux64_mkl|15_linux64_mkl|16_linux64_mkl|13_linux64_mkl|12_linux64_mkl|11_linux64_mkl|9_mkl|5_mkl|21_mkl|20_mkl|16_mkl|15_mkl|14_mkl|11_mkl|10_mkl|5_mkl']
dglteam::dgl-cuda11.3 -> numpy -> libblas[version='>=3.8.0,<4.0a0|>=3.9.0,<4.0a0']

Package gmp conflicts for:
bioconda::blast -> gnutls[version='>=3.6.5,<3.7.0a0'] -> gmp[version='6.1.*|>=6.1.2|>=4.2']
bioconda::cd-hit -> libgcc -> gmp[version='>=4.2']
bioconda::blast -> gmp[version='>=6.1.2,<7.0a0']
bioconda::infernal -> libgcc -> gmp[version='>=4.2']

Package libsqlite conflicts for:
python=3.8 -> libsqlite[version='>=3.40.0,<4.0a0']
python=3.8 -> sqlite[version='>=3.40.0,<4.0a0'] -> libsqlite[version='3.39.2|3.39.3|3.39.4|3.40.0|>=3.39.4,<4.0a0|>=3.39.2,<4.0a0',build='h753d276_0|h753d276_1']

Package _openmp_mutex conflicts for:
bioconda::csblast -> libgcc-ng[version='>=10.3.0'] -> _openmp_mutex[version='>=4.5']
bioconda::blast -> libgcc-ng[version='>=12'] -> _openmp_mutex[version='>=4.5']
conda-forge::cudatoolkit=11.3 -> libgcc-ng[version='>=12'] -> _openmp_mutex[version='>=4.5']
python=3.8 -> libgcc-ng[version='>=11.2.0'] -> _openmp_mutex[version='>=4.5']
bioconda::infernal -> libgcc-ng[version='>=10.3.0'] -> _openmp_mutex[version='>=4.5']
bioconda::cd-hit -> libgcc-ng[version='>=10.3.0'] -> _openmp_mutex
pytorch::pytorch -> blas=[build=mkl] -> _openmp_mutex[version='*|>=4.5',build=*_llvm]
bioconda::hhsuite -> _openmp_mutex[version='>=4.5']
bioconda::mafft -> libgcc-ng[version='>=12'] -> _openmp_mutex[version='>=4.5']
bioconda::hmmer[version='>=3.3'] -> libgcc-ng[version='>=10.3.0'] -> _openmp_mutex[version='>=4.5']
conda-forge::psutil -> libgcc-ng[version='>=12'] -> _openmp_mutex[version='>=4.5']
bioconda::hhsuite -> libgcc-ng[version='>=10.3.0'] -> _openmp_mutex
bioconda::cd-hit -> _openmp_mutex[version='>=4.5']

Package ca-certificates conflicts for:
conda-forge::psutil -> python[version='>=2.7,<2.8.0a0'] -> ca-certificates
requests -> python -> ca-certificates
pytorch::pytorch -> python[version='>=2.7,<2.8.0a0'] -> ca-certificates
conda-forge::tqdm -> python[version='>=2.7'] -> ca-certificates
bioconda::blast -> gnutls[version='>=3.6.5,<3.7.0a0'] -> ca-certificates
bioconda::hhsuite -> python[version='>=2.7,<2.8.0a0'] -> ca-certificates
python=3.8 -> openssl[version='>=1.1.1s,<1.1.2a'] -> ca-certificates

Package python conflicts for:
dglteam::dgl-cuda11.3 -> networkx -> python[version='2.7.*|3.5.*|3.6.*|>=2.7,<2.8.0a0|>=3.5|>=3.6|>=3.7|>=3.8|>=3.5,<3.6.0a0|3.4.*|>=3.11,<3.12.0a0|>=3.7,<4.0|>=3.6,<4.0|>=2.7']
dglteam::dgl-cuda11.3 -> python[version='>=3.10,<3.11.0a0|>=3.6,<3.7.0a0|>=3.7,<3.8.0a0|>=3.8,<3.9.0a0|>=3.9,<3.10.0a0']
python=3.8
conda-forge::psutil -> python_abi=3.11[build=*_cp311] -> python[version='3.10.*|3.11.*|3.9.*|3.8.*|3.7.*']
conda-forge::tqdm -> python[version='2.7.*|3.5.*|3.6.*|>=2.7,<2.8.0a0|>=2.7|>=3.6,<3.7.0a0|>=3.8,<3.9.0a0|>=3.7,<3.8.0a0|3.4.*']
conda-forge::tqdm -> colorama -> python[version='>=3.10,<3.11.0a0|>=3.9,<3.10.0a0|>=3.5,<3.6.0a0|>=3.7|>=3.6']
pytorch::pytorch -> python[version='>=2.7,<2.8.0a0|>=3.10,<3.11.0a0|>=3.8,<3.9.0a0|>=3.7,<3.8.0a0|>=3.9,<3.10.0a0|>=3.6,<3.7.0a0|>=3.5,<3.6.0a0']
requests -> certifi[version='>=2017.4.17'] -> python[version='3.7.*|3.8.*|<4.0|>=3.5|>=3.7']
bioconda::hhsuite -> python[version='>=2.7,<2.8.0a0|>=3.10,<3.11.0a0|>=3.8,<3.9.0a0|>=3.7,<3.8.0a0|>=3.9,<3.10.0a0|>=3.6,<3.7.0a0']
conda-forge::psutil -> python[version='2.7.*|3.5.*|3.6.*|>=2.7,<2.8.0a0|>=3.10,<3.11.0a0|>=3.11,<3.12.0a0|>=3.9,<3.10.0a0|>=3.8,<3.9.0a0|>=3.7,<3.8.0a0|>=3.6,<3.7.0a0|>=3.5,<3.6.0a0|3.4.*']
pytorch::pytorch -> typing_extensions -> python[version='2.7.*|3.5.*|3.6.*|3.9.*|>=3.11,<3.12.0a0|>=3.5|>=3.6|>=3.7|>=3.6,<3.7|3.4.*|3.9.10|3.8.12|3.7.12|3.7.10|3.7.10|3.6.12|3.7.9|3.6.12|3.6.9|3.6.9|3.6.9|3.6.9|>=3.8|>=3',build='0_73_pypy|2_73_pypy|4_73_pypy|0_73_pypy|0_73_pypy|1_73_pypy|5_73_pypy|5_73_pypy|3_73_pypy|1_73_pypy']
bioconda::hhsuite -> python_abi=3.10[build=*_cp310] -> python[version='2.7.*|3.10.*|3.8.*|3.7.*|3.9.*|3.6.*']
requests -> python[version='2.7.*|3.5.*|3.6.*|>=2.7,<2.8.0a0|>=3.10,<3.11.0a0|>=3.7,<3.8.0a0|>=3.8,<3.9.0a0|>=3.9,<3.10.0a0|>=3.6|>=3.6,<3.7.0a0|>=3.5,<3.6.0a0|>=3.7,<4.0|>=3.6,<4.0|3.4.*']
bioconda::blast -> boost[version='>=1.68.0,<1.68.1.0a0'] -> python[version='2.7.*|3.5.*|3.6.*|>=2.7,<2.8.0a0|>=3.6,<3.7.0a0|>=3.7,<3.8.0a0|>=3.5,<3.6.0a0|3.4.*|>=3.10,<3.11.0a0|>=3.9,<3.10.0a0|>=3.8,<3.9.0a0|>=3.11,<3.12.0a0']

Package mkl conflicts for:
pytorch::pytorch -> mkl[version='>=2018']
pytorch::pytorch -> blas=[build=mkl] -> mkl[version='>=2018.0.0,<2019.0a0|>=2018.0.1,<2019.0a0|>=2018.0.2,<2019.0a0|>=2018.0.3,<2019.0a0|>=2019.1,<2021.0a0|>=2019.3,<2021.0a0|>=2019.4,<2021.0a0|>=2021.2.0,<2022.0a0|>=2021.3.0,<2022.0a0|>=2021.4.0,<2022.0a0|>=2019.4,<2020.0a0']

Package libzlib conflicts for:
bioconda::cd-hit -> zlib[version='>=1.2.12,<1.3.0a0'] -> libzlib[version='1.2.11|1.2.11|1.2.11|1.2.12|1.2.12|1.2.12|1.2.12|1.2.12|1.2.13',build='h36c2ea0_1013|h166bdaf_0|h166bdaf_2|h166bdaf_3|h166bdaf_4|h166bdaf_1|h166bdaf_1014|h36c2ea0_1012']
python=3.8 -> zlib[version='>=1.2.13,<1.3.0a0'] -> libzlib[version='1.2.11|1.2.11|1.2.11|1.2.12|1.2.12|1.2.12|1.2.12|1.2.12|1.2.13|>=1.2.12,<1.3.0a0',build='h36c2ea0_1013|h166bdaf_0|h166bdaf_2|h166bdaf_3|h166bdaf_4|h166bdaf_1|h166bdaf_1014|h36c2ea0_1012']
python=3.8 -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.13,<1.3.0a0']
conda-forge::psutil -> python[version='>=3.11,<3.12.0a0'] -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.13,<1.3.0a0|>=1.2.12,<1.3.0a0']
bioconda::blast -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0']
bioconda::cd-hit -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0']
bioconda::hhsuite -> python[version='>=3.10,<3.11.0a0'] -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0|>=1.2.13,<1.3.0a0']
bioconda::blast -> curl[version='>=7.83.1,<8.0a0'] -> libzlib[version='1.2.11|1.2.11|1.2.11|1.2.12|1.2.12|1.2.12|1.2.12|1.2.12|1.2.13|>=1.2.13,<1.3.0a0',build='h36c2ea0_1013|h166bdaf_0|h166bdaf_2|h166bdaf_3|h166bdaf_4|h166bdaf_1|h166bdaf_1014|h36c2ea0_1012']
requests -> python[version='>=3.8,<3.9.0a0'] -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.13,<1.3.0a0|>=1.2.12,<1.3.0a0']
conda-forge::tqdm -> python[version='>=2.7'] -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0|>=1.2.13,<1.3.0a0']
pytorch::pytorch -> python[version='>=3.10,<3.11.0a0'] -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0|>=1.2.13,<1.3.0a0']
dglteam::dgl-cuda11.3 -> python[version='>=3.8,<3.9.0a0'] -> libzlib[version='>=1.2.11,<1.3.0a0|>=1.2.13,<1.3.0a0|>=1.2.12,<1.3.0a0']

Package sqlite conflicts for:
python=3.8 -> sqlite[version='>=3.30.0,<4.0a0|>=3.30.1,<4.0a0|>=3.31.1,<4.0a0|>=3.32.3,<4.0a0|>=3.33.0,<4.0a0|>=3.35.4,<4.0a0|>=3.36.0,<4.0a0|>=3.38.0,<4.0a0|>=3.39.3,<4.0a0|>=3.40.0,<4.0a0|>=3.37.1,<4.0a0|>=3.37.0,<4.0a0|>=3.35.5,<4.0a0|>=3.34.0,<4.0a0']
python=3.8 -> pypy3.8=7.3.9 -> sqlite[version='>=3.38.2,<4.0a0|>=3.39.1,<4.0a0|>=3.39.2,<4.0a0']

Package perl conflicts for:
bioconda::blast -> entrez-direct -> perl[version='>=5.22.0,<5.23.0|>=5.26.2,<5.27.0a0|>=5.32.1,<6.0a0',build=*_perl5]
bioconda::blast -> perl[version='5.22.0.*|>=5.26.2,<5.26.3.0a0']
bioconda::hhsuite -> perl[version='>=5.26.2,<5.26.3.0a0|>=5.32.1,<5.33.0a0',build=*_perl5]
bioconda::infernal -> perl[version='>=5.32.1,<5.33.0a0',build=*_perl5]

Package pypy3.6 conflicts for:
requests -> certifi[version='>=2017.4.17'] -> pypy3.6[version='7.3.*|7.3.0.*|7.3.1.*|7.3.2.*|7.3.3.*|>=7.3.1|>=7.3.2|>=7.3.3']
pytorch::pytorch -> python[version='>=3.6,<3.7.0a0'] -> pypy3.6[version='7.3.0.*|7.3.1.*|7.3.2.*|7.3.3.*|>=7.3.1|>=7.3.3|>=7.3.2']
bioconda::blast -> boost -> pypy3.6[version='>=7.3.3']
bioconda::hhsuite -> python[version='>=3.6,<3.7.0a0'] -> pypy3.6[version='7.3.0.*|7.3.1.*|7.3.2.*|7.3.3.*']
dglteam::dgl-cuda11.3 -> numpy -> pypy3.6[version='7.3.0.*|7.3.1.*|7.3.2.*|7.3.3.*|>=7.3.1|>=7.3.2|>=7.3.3']
conda-forge::psutil -> python[version='>=3.6,<3.7.0a0'] -> pypy3.6[version='7.3.*|7.3.0.*|7.3.1.*|7.3.2.*|7.3.3.*']
conda-forge::tqdm -> python[version='>=2.7'] -> pypy3.6[version='7.3.0.*|7.3.1.*|7.3.2.*|7.3.3.*']
conda-forge::psutil -> pypy3.6[version='>=7.3.1|>=7.3.2|>=7.3.3']

Package xz conflicts for:
python=3.8 -> xz[version='>=5.2.4,<5.3.0a0|>=5.2.4,<6.0a0|>=5.2.5,<6.0a0|>=5.2.6,<6.0a0|>=5.2.5,<5.3.0a0']
python=3.8 -> pypy3.8=7.3.9 -> xz[version='>=5.2.6,<5.3.0a0']

Package blas conflicts for:
pytorch::pytorch -> numpy[version='>=1.19'] -> blas[version='*|1.0|1.0|1.1',build='openblas|mkl|openblas|openblas']
pytorch::pytorch -> blas=[build=mkl]

Package requests conflicts for:
dglteam::dgl-cuda11.3 -> requests
requests

Package expat conflicts for:
python=3.8 -> pypy3.8=7.3.9 -> expat[version='>=2.4.7,<3.0a0|>=2.4.8,<3.0a0|>=2.4.9,<3.0a0|>=2.5.0,<3.0a0']
conda-forge::psutil -> pypy3.9[version='>=7.3.9'] -> expat[version='>=2.2.9,<3.0.0a0|>=2.3.0,<3.0a0|>=2.4.1,<3.0a0|>=2.4.7,<3.0a0|>=2.4.8,<3.0a0|>=2.4.9,<3.0a0|>=2.5.0,<3.0a0']

Package psutil conflicts for:
dglteam::dgl-cuda11.3 -> psutil
conda-forge|conda-forge::psutil
conda-forge::psutil

Package libnsl conflicts for:
conda-forge::tqdm -> python[version='>=2.7'] -> libnsl[version='>=2.0.0,<2.1.0a0']
bioconda::hhsuite -> perl[version='>=5.32.1,<5.33.0a0',build=*_perl5] -> libnsl[version='>=2.0.0,<2.1.0a0']
conda-forge::psutil -> python[version='>=3.11,<3.12.0a0'] -> libnsl[version='>=2.0.0,<2.1.0a0']
bioconda::infernal -> perl[version='>=5.32.1,<5.33.0a0',build=*_perl5] -> libnsl[version='>=2.0.0,<2.1.0a0']
pytorch::pytorch -> python[version='>=3.10,<3.11.0a0'] -> libnsl[version='>=2.0.0,<2.1.0a0']
requests -> python[version='>=3.8,<3.9.0a0'] -> libnsl[version='>=2.0.0,<2.1.0a0']
bioconda::blast -> perl -> libnsl[version='>=2.0.0,<2.1.0a0']
python=3.8 -> libnsl[version='>=2.0.0,<2.1.0a0']
dglteam::dgl-cuda11.3 -> python[version='>=3.8,<3.9.0a0'] -> libnsl[version='>=2.0.0,<2.1.0a0']

Package cudatoolkit conflicts for:
conda-forge|conda-forge::cudatoolkit=11.3
pytorch::pytorch -> cudatoolkit[version='8.*|>=10.0,<10.1|>=10.1,<10.2|>=10.2,<10.3|>=11.3,<11.4|>=11.6,<11.7|>=11.5,<11.6|>=11.1,<11.2|>=11.0,<11.1|>=9.2,<9.3|>=9.0,<9.1|>=8.0,<8.1|9.*']
conda-forge::cudatoolkit=11.3

Package setuptools conflicts for:
python=3.8 -> pip -> setuptools
dglteam::dgl-cuda11.3 -> networkx -> setuptools

Package gdbm conflicts for:
bioconda::blast -> perl -> gdbm[version='>=1.18|>=1.18,<1.19.0a0']
python=3.8 -> pypy3.8=7.3.9 -> gdbm[version='>=1.18,<1.19.0a0']
conda-forge::psutil -> pypy3.9[version='>=7.3.9'] -> gdbm[version='>=1.18,<1.19.0a0']

Package clangdev conflicts for:
bioconda::cd-hit -> openmp -> clangdev[version='4.0.0|4.0.0|4.0.0.*|5.0.0|5.0.0.*']
bioconda::hhsuite -> openmp -> clangdev[version='4.0.0|4.0.0|4.0.0.*|5.0.0|5.0.0.*']

Package llvm-openmp conflicts for:
pytorch::pytorch -> blas=[build=mkl] -> llvm-openmp[version='>=10.0.0|>=11.0.0|>=11.0.1|>=11.1.0|>=12.0.1|>=13.0.1|>=14.0.4|>=15.0.6|>=14.0.3|>=9.0.1']
bioconda::cd-hit -> _openmp_mutex[version='>=4.5'] -> llvm-openmp[version='8.0.0|8.0.0|8.0.1|>=9.0.1',build='hc9558a2_0|hc9558a2_0|hc9558a2_1']
bioconda::hhsuite -> _openmp_mutex[version='>=4.5'] -> llvm-openmp[version='8.0.0|8.0.0|8.0.1|>=9.0.1',build='hc9558a2_0|hc9558a2_0|hc9558a2_1']

Package tqdm conflicts for:
dglteam::dgl-cuda11.3 -> tqdm
conda-forge::tqdm
conda-forge|conda-forge::tqdm

Package libstdcxx-ng conflicts for:
python=3.8 -> libstdcxx-ng[version='>=11.2.0|>=7.5.0|>=7.3.0|>=9.4.0|>=9.3.0']
python=3.8 -> libffi[version='>=3.2.1,<3.3a0'] -> libstdcxx-ng[version='>=4.9|>=7.2.0']

Package zlib conflicts for:
python=3.8 -> zlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0|>=1.2.13,<1.3.0a0']
python=3.8 -> pypy3.8=7.3.9 -> zlib

Package blast-legacy conflicts for:
biocore::blast-legacy=2.2.26
biocore|biocore::blast-legacy=2.2.26
biocore::psipred=4.01 -> blast-legacy==2.2.26

The following specifications were found to be incompatible with your system:

  - feature:/linux-64::__glibc==2.35=0
  - feature:|@/linux-64::__glibc==2.35=0
  - bioconda::blast -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - bioconda::cd-hit -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - bioconda::csblast -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - bioconda::hhsuite -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - bioconda::hmmer[version='>=3.3'] -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - bioconda::infernal -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - bioconda::mafft -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - conda-forge::cudatoolkit=11.3 -> __glibc[version='>=2.17,<3.0.a0']
  - conda-forge::cudatoolkit=11.3 -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - conda-forge::psutil -> libgcc-ng[version='>=10.3.0'] -> __glibc[version='>=2.17']
  - pytorch::pytorch -> cudatoolkit[version='>=11.3,<11.4'] -> __glibc[version='>=2.17,<3.0.a0']

Your installed version is: 2.35

Note that strict channel priority may have removed packages required for satisfiability.

I've also tried doing the same on the standard Pytorch 13.1 deep learning AMI for AWS and have run into the same issue. What am I doing wrong here?

possible mistake in make_rna_msa.sh

First of all thanks for the great work!

I think we might have found a mistake on line 126 of input_prep/make_rna_msa.sh:

for e_val in 1e-8, 1e-7, 1e-6, 1e-3, 1e-2, 1e-1
do
    nhmmer --noali -A nhmmer.a2m --incE $e_val --cpu $CPU --watson $in_fasta db | grep 'no alignment saved'
    ...
done

The loop here is wrong: it passes 1e-8, etc. (including the trailing comma) to nhmmer, so it works only for 1e-1, which has no trailing comma. For the other values nhmmer throws an error message, which is hidden by the grep command.
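The failure mode described above can be reproduced outside the shell. This is a minimal sketch, assuming only that the shell splits the `for` word list on whitespace (which `shlex.split` mimics here). The helper names `loop_words` and `is_valid_evalue` are hypothetical and not part of make_rna_msa.sh.

```python
import shlex
from typing import List


def loop_words(word_list: str) -> List[str]:
    """Split a shell `for ... in` word list on whitespace only, as the shell does."""
    return shlex.split(word_list)


def is_valid_evalue(token: str) -> bool:
    """Return True if the token parses as a number, as an E-value option expects."""
    try:
        float(token)
        return True
    except ValueError:
        return False


# Word list as written in the reported loop: the commas stay attached to the tokens.
buggy = loop_words("1e-8, 1e-7, 1e-6, 1e-3, 1e-2, 1e-1")
# Same values separated by spaces only.
fixed = loop_words("1e-8 1e-7 1e-6 1e-3 1e-2 1e-1")

for token in buggy:
    print(f"{token!r:10} valid={is_valid_evalue(token)}")
```

Dropping the commas from the word list, as in `fixed`, is the change the report implies.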

Issue with using BLAST

Hello! Thanks for this great work which helps me a lot!

I am using RoseTTAFold2NA for the prediction of protein and RNA complexes. However, I am facing an error in constructing the MSA of the RNA. The issue arises when the program attempts to execute this command. The following error message is returned:

BLAST Database error: No alias or index file found for nucleotide database [/home/huangtin/RoseTTAFold2NA/RNA/] in search path [/home/huangtin/RoseTTAFold2NA/aptamer_data/test/rna_pred::]

It appears that BLAST is unable to locate the index file of the 'nt' database, even though it has been downloaded. The RNA sequence I am utilizing is GCUUCUGGACUGCGAUGGGAGCACGAAACGUCGUGGCGCAAUUGGGUGGGGAAAGUCCUUAAAAGAGGGCCACCACAGAAGC.

I would appreciate any assistance in resolving this issue. Thanks!

Inconsistent outputs

Hello!

I've been testing the software by running the same input protein and DNA sequences 10 times, and the outputs seem quite inconsistent. The prediction accuracy is similar (0.66-0.67) in most cases, but the DNA binds different regions of the protein. The files are attached below. I wonder whether this is an issue on my side or is associated with the weights.
RoseTTAFold2NA_output.zip

I also noticed that sometimes DNAs overlap with proteins
image

Or some residues appear with "- - - -" when the structure is visualized.
image

Thanks for your help!

perl OOM: program killed

When I run the ../input_prep/reprocess_rnac.pl id_mapping.tsv.gz rfam_annotations.tsv.gz command I get this error:
oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=reprocess_rnac.,pid=334,uid=1000
[ 142.115180] Out of memory: Killed process 334 (reprocess_rnac.) total-vm:9579944kB, anon-rss:7421640kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:18624kB oom_score_adj:0

Does anyone know what this means?

Typo:cmpress->compress

In section 4 of the README (Download sequence and structure databases), is cmpress Rfam.cm a typo? Should it be compress Rfam.cm?

numpy 1.24 breaks the code

I noticed the following error when running with the default env yaml: "AttributeError: module 'numpy' has no attribute 'long'"
After installing numpy 1.23 with Python 3.10 it seems to be fine. I suggest updating the yaml.
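For context, the `np.long` alias was removed in NumPy 1.24, which is why pinning to 1.23 works. A version-independent alternative is to name the integer width explicitly. This is a minimal sketch with made-up array contents; where RF2NA itself uses `np.long` is not shown in this report.

```python
import numpy as np

# np.int64 is available regardless of whether the np.long alias exists
# in the installed NumPy release.
indices = np.array([0, 5, 12], dtype=np.int64)
print(indices.dtype)
```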

Training data

Hi, thanks for your amazing work

I wanted to ask whether you could share the training code and data.
For the RNA/DNA data, everything I see in the repo is sequences. Could you please share the PDB files?

Thanks

Downloading the RNA database according to README instructions

In the README instructions for downloading the RNA databases, there may be a small typo:
When I run wget with the flag -C (as shown below) I get "invalid option". Should this be lowercase "c" in all cases?
wget ftp://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/Rfam.cm.gz -C ./RNA
wget: invalid option -- 'C'
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.
(RF2NA) [osu10269@pitzer-login04 RoseTTAFold2NA]$ wget ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/sequences/rnacentral_species_specific_ids.fasta.gz -C ./RNA
wget: invalid option -- 'C'
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.
(RF2NA) [osu10269@pitzer-login04 RoseTTAFold2NA]$ wget ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/rfam/rfam_annotations.tsv.gz -C ./RNA
wget: invalid option -- 'C'
Usage: wget [OPTION]... [URL]...

File Sizes

In order to actually install RF2NA how much storage needs to be allocated to the system? The compressed file sizes are listed, but not how much storage the databases they're extracted to require.

How to handle case where no RNA families are found

Hi, @fdimaio. Thanks again for making this project open-source. I had a question about RNA MSA generation. For some small RNA sequence inputs I've tried running through RF2NA, the RNA MSA generation script (i.e., make_rna_msa.sh) issues a series of errors stemming from the following line when no RNA families are found by cmscan. It seems that in this case all subsequent lines fail because they assume at least one family was found. Is there a simple adjustment we can make to this MSA generation script to handle the case where no RNA families are found? For example, should the script simply exit early instead of proceeding with the subsequent steps?

Another way of phrasing this question would be: does RF2NA have a way of performing single-sequence predictions for RNA FASTA inputs? If not, what changes would likely be necessary?

https://github.com/uw-ipd/RoseTTAFold2NA/blob/03f12bd421db618455d9c0726f79f72433a8638e/input_prep/make_rna_msa.sh#L62C1-L62C1

The error to predict the protein and RNA structure

When I run this code, it reports the error shown below. Do you know how to resolve this? Thank you very much.

Traceback (most recent call last):
File "/home/software/RoseTTAFold2NA/network/predict.py", line 345, in <module>
pred.predict(inputs=args.inputs, out_prefix=args.prefix, ffdb=ffdb)
File "/home/software/RoseTTAFold2NA/network/predict.py", line 146, in predict
msa_i, ins_i = parse_a3m(a3m_i, unzip=False)
File "/home/software/RoseTTAFold2NA/network/parsers.py", line 221, in parse_a3m
msa = np.array([list(s) for s in msa], dtype='|S1').view(np.uint8)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (722,) + inhomogeneous part.
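NumPy raises this ValueError when the rows passed to `np.array` have different lengths, so one thing worth checking is whether every aligned sequence in the a3m has the same number of columns once lowercase insertion characters are dropped (the a3m convention). The following is a minimal diagnostic sketch. `a3m_row_lengths` and `ragged_rows` are hypothetical helpers, not functions from parsers.py, and the sketch assumes a plain-text (uncompressed) a3m with unique headers.

```python
from collections import Counter
from typing import Dict, Iterable


def a3m_row_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each a3m header to its aligned length (lowercase insertions removed)."""
    lengths: Dict[str, int] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            lengths[header] = 0
        elif header is not None:
            # Lowercase letters mark insertions relative to the query and
            # do not occupy an alignment column.
            lengths[header] += sum(1 for ch in line if not ch.islower())
    return lengths


def ragged_rows(lengths: Dict[str, int]) -> Dict[str, int]:
    """Return the rows whose length differs from the most common length."""
    if not lengths:
        return {}
    expected, _ = Counter(lengths.values()).most_common(1)[0]
    return {name: n for name, n in lengths.items() if n != expected}


# Toy alignment: hit2 has one lowercase insertion, hit3 is one column short.
example = [">query", "ACGU", ">hit1", "A-GU", ">hit2", "ACgGU", ">hit3", "ACG"]
print(ragged_rows(a3m_row_lengths(example)))
```

With a real file, the same check could be run as `ragged_rows(a3m_row_lengths(open(path)))`, where `path` points at the a3m being parsed.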

Run RNA prediction error

When I predict the structure of this RNA, I get an error because no data comes back. Could you help me resolve this problem?

RNA
AAUUUCUACUAAGUGUAGAUC

Thanks.

ERROR: failed to load model

Hello. I was trying to fold a ssDNA and I get the above error.

Running HHblits
Running PSIPRED
Running hhsearch
Running RoseTTAFold2NA to predict structures
Running on GPU
ERROR: failed to load model
Done

Any advice is appreciated.

makeblastdb should add the '-parse_seqids' option.

BLAST version: ncbi-blast-2.13.0+

gunzip -c rnacentral_species_specific_ids.fasta.gz | makeblastdb -in - -dbtype nucl  -out rnacentral.fasta -title "RNACentral"

should be changed to:

gunzip -c rnacentral_species_specific_ids.fasta.gz | makeblastdb -in - -parse_seqids -dbtype nucl  -out rnacentral.fasta -title "RNACentral"

otherwise, blastdbcmd will raise error:

blastdbcmd -db $db -entry_batch $tag.list.split.$suffix -out tmp.$tag.db.$suffix -outfmt ">Accession:%a_TaxID:%T @%s" &> /dev/null
Error: [blastdbcmd] Skipped URS0002332FB3_39947                                     
Error: [blastdbcmd] Skipped URS0002332FB3_40148  
Error: [blastdbcmd] Skipped URS0002332FB3_4529 

Issues with RNA multiple sequence alignment

Hi, thanks for the amazing work!!

I am having some issues when trying to run the model for a protein-RNA docking. I first tried with the protein and RNA sequences you provided for the example and everything worked very well. But when trying with other sequences I get some errors.

The stderr files relative to the hhsearch and the protein msa do not contain errors.
This is the content of the stderr file for the rna msa.

/aplic/noarch/software/RoseTTAFold2NA/0.2-Miniconda3-4.9.2/input_prep/make_rna_msa.sh: line 9: [: missing `]'
rm: cannot remove ‘rfam1.list.split.*’: No such file or directory
rm: cannot remove ‘rfam2.list.split.*’: No such file or directory
rm: cannot remove ‘blastn1.list.split.*’: No such file or directory
/aplic/noarch/software/RoseTTAFold2NA/0.2-Miniconda3-4.9.2/input_prep/make_rna_msa.sh: line 101: 221255 Aborted (core dumped) cd-hit-est-2d -T $CPU -i $in_fasta -i2 trim.db -c $cut -o cdhitest2d.db -l $throw_away_sequences -M 0 &>/dev/null
grep: db: No such file or directory
rm: cannot remove ‘cdhitest2d.db’: No such file or directory
rm: cannot remove ‘cdhitest2d.db.clstr’: No such file or directory
rm: cannot remove ‘db.clstr’: No such file or directory

Error: Failed to open target sequence database db for reading

Alignment input open failed.
couldn't open nhmmer.a2m for reading

  • 18:08:17.021 INFO: Input file = guide_RNA.wquery.unfilt.afa

  • 18:08:17.022 INFO: Output file = guide_RNA.afa

  • 18:08:17.022 ERROR: In /opt/conda/conda-bld/hhsuite_1659427602200/work/src/hhalignment.cpp:502: Read:

  • 18:08:17.022 ERROR: No sequences found in file guide_RNA.wquery.unfilt.afa

grep: guide_RNA.afa: No such file or directory

Error: Failed to open target sequence database db for reading

Alignment input open failed.
couldn't open nhmmer.a2m for reading

  • 18:08:17.230 INFO: Input file = guide_RNA.wquery.unfilt.afa

  • 18:08:17.230 INFO: Output file = guide_RNA.afa

  • 18:08:17.230 ERROR: In /opt/conda/conda-bld/hhsuite_1659427602200/work/src/hhalignment.cpp:502: Read:

  • 18:08:17.230 ERROR: No sequences found in file guide_RNA.wquery.unfilt.afa

Error: Failed to open target sequence database db for reading

Alignment input open failed.
couldn't open nhmmer.a2m for reading

  • 18:08:17.433 INFO: Input file = guide_RNA.wquery.unfilt.afa

  • 18:08:17.433 INFO: Output file = guide_RNA.afa

  • 18:08:17.434 ERROR: In /opt/conda/conda-bld/hhsuite_1659427602200/work/src/hhalignment.cpp:502: Read:

  • 18:08:17.434 ERROR: No sequences found in file guide_RNA.wquery.unfilt.afa

Error: Failed to open target sequence database db for reading

Alignment input open failed.
couldn't open nhmmer.a2m for reading

  • 18:08:17.637 INFO: Input file = guide_RNA.wquery.unfilt.afa

  • 18:08:17.637 INFO: Output file = guide_RNA.afa

  • 18:08:17.638 ERROR: In /opt/conda/conda-bld/hhsuite_1659427602200/work/src/hhalignment.cpp:502: Read:

  • 18:08:17.638 ERROR: No sequences found in file guide_RNA.wquery.unfilt.afa

Error: Failed to open target sequence database db for reading

Alignment input open failed.
couldn't open nhmmer.a2m for reading

  • 18:08:17.840 INFO: Input file = guide_RNA.wquery.unfilt.afa

  • 18:08:17.840 INFO: Output file = guide_RNA.afa

  • 18:08:17.841 ERROR: In /opt/conda/conda-bld/hhsuite_1659427602200/work/src/hhalignment.cpp:502: Read:

  • 18:08:17.841 ERROR: No sequences found in file guide_RNA.wquery.unfilt.afa

Error: Failed to open target sequence database db for reading

Alignment input open failed.
couldn't open nhmmer.a2m for reading

  • 18:08:18.044 INFO: Input file = guide_RNA.wquery.unfilt.afa

  • 18:08:18.044 INFO: Output file = guide_RNA.afa

  • 18:08:18.045 ERROR: In /opt/conda/conda-bld/hhsuite_1659427602200/work/src/hhalignment.cpp:502: Read:

  • 18:08:18.045 ERROR: No sequences found in file guide_RNA.wquery.unfilt.afa

rm: cannot remove ‘nhmmer.a2m’: No such file or directory

protein
MIRNKAFVVRLYPNAAQTELINRTLGSARFVYNHFLARRIAAYKESGKGLTYGQTSSELTLLKQAEETSWLSEVDKFALQNSLKNLETAYKNFFRTVKQSGKKVGFPRFRKKRTGESYRTQFTNNNIQIGEGRLKLPKLGWVKTKGQQDIQGKILNVTVRRIHEGHYEASVLCEVEIPYLPAAPKFAAGVAVGIKDFAIVTDGVRFKHEQNPKYYRSTLKRLRKAQQTLSRRKKGSARYGKAKTKLARIHKRIVNKRQDFLHKLTTSLVREYEIIGTEHLKPDNMRKNRRLALSISDAGWGEFIRQLEYKAAWYGRLVSKVSPYFPSSQLCHDCGFKNPEVKNLAVRTWTCPNCGETHDRDENAALNIRREALVAAGISDTLNAHGGYVRPASAGNGLRSENHATLVVSRADPKKK

RNA
TCGGCGTGAAGCGTTGGTGGCTGCGGGAATCTCAGACACCTTAAACGCTCATGGAGGCTATGTCAGACCTGCTTCGGGGGCAATGGTCTGCGAAGTGAGAATCACGCGACTTTAGTCGTGTGAGGTTCAAGAGTCCCTTGGCGCCC

P.S.: I get the same issue also when trying to use this RNA sequence with the protein you provided in the example, but I really do not understand what is wrong with the RNA sequence I am using.

Thanks!!

Double stranded DNA does not fold properly

Hi,
I tried to create a double stranded DNA-protein complex with the following command:
../run_RF2NA.sh 00_7tqb-H-B_1 P:00_7tqb-H-B_1_prot.fa D:00_7tqb-H-B_1_DNA0.fa D:00_7tqb-H-B_1_DNA1.fa

Where the 3 fasta files contained these sequences:
P:00_7tqb-H-B_1_prot.fa:

7tqb-H-B_1
EVQLQQSGPELVKPGASVKMSCKASGYTFTSYVMHWVKQKPGQGLEWIGFINLYNDGTKYNEKFKGKATLTSDKSSSTAYMELSSLTSKDSAVYYCARDYYGSRWFDYWGQGTTLTVSSAKTTAPSVYPLAPVSVTLGCLVKGYFPEPVTLTWNSGSLSSGVHTFPAVLQSDLYTLSSSVTVTSSTWPSQSITCNVAHPASSTKVDKK

00_7tqb-H-B_1_DNA0.fa: <-- forward DNA strand

7tqb-H-B_1
ATTGTGATAAGATAACATGATGAAAATATTGAATCTCGTGTCAGACAAGACCACGAGCCC

00_7tqb-H-B_1_DNA1.fa: <-- reverse DNA strand

7tqb-H-B_1
TAACACTATTCTATTGTACTACTTTTATAACTTAGAGCACAGTCTGTTCTGGTGCTCGGG

However, the DNA-protein complex predicted by RoseTTAFold2NA contained DNA with incorrect folding and a long 'link' between the two DNA strands:
image

Is this incorrect folding of double-stranded DNA observed often, i.e. a mistake the model commonly makes? Or is this an implementation error on my side?
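One thing worth checking in the inputs above: the second DNA file is the base-by-base complement of the first, written in the same left-to-right order, i.e. 3' to 5'. FASTA sequences are conventionally written 5' to 3', so the partner strand would normally be given as the reverse complement. Whether this explains the predicted geometry is an assumption, not something confirmed in this thread. Below is a minimal sketch for generating the reverse complement; the helper names are illustrative.

```python
# Translation table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, keeping the input order."""
    return seq.upper().translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complement read in the opposite direction (5' to 3' of the partner strand)."""
    return complement(seq)[::-1]


forward = "ATTGTGATAAGATAACATGATGAAAATATTGAATCTCGTGTCAGACAAGACCACGAGCCC"
second = "TAACACTATTCTATTGTACTACTTTTATAACTTAGAGCACAGTCTGTTCTGGTGCTCGGG"

# True means the second file holds the plain complement, not the reverse complement.
print(complement(forward) == second)
print(reverse_complement(forward))
```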

Any help is much appreciated!

Usage of Uniref30 version

I would like to share the UniRef30 database with AlphaFold.
RTF and RF2NA use the UniRef30 2020_06 version, whereas AF uses the 2021_03 version.
Does the database version affect the structure prediction?

RNA alphabet error

Incredible work! Thanks so much for sharing! I've been encountering a PyTorch error when using RNA sequences that contain "U". I think I've traced it to network/parsers.py line 104: shouldn't the rna_alphabet contain U instead of T?

C3' used as end axis for delta torsion

RTs_by_torsion[i,NPROTTORS+5,:3,3] = xyzs_in_base_frame[i,7,:3] # C3'

Hi, I think I may be misunderstanding something. In

[" O3'",12, ( 0.4966, 1.3432, 0.000)],

it seems clear that the delta torsion is supposed to define the position of the O3' atom. However, in the code referenced above, the rotation is defined on the C3' atom. Is this an issue?

RNA connected to end of protein

I ran this program successfully, but the output PDB file does not look right. The last amino acid of the protein is connected to the first RNA base. Is there a way to fix this?

Trying to model homodimer results in two monomers in same space.

I am trying to model a protein homodimer complexed with dsDNA. I entered the two identical protein monomer chains as separate files and the DNA strands as two separate files, in a command similar to this:
../run_RF2NA.sh t000_ Protein1.fa Protein2.fa D:DNA-box_F.fa D:DNA-box_R.fa
where protein1 and protein2 are identical sequences in two separate files, also named protein1 and protein2 in the FASTA headers.
However, the output model has the two copies of the protein superimposed on top of each other as near-identical models, rather than as separate chains in a homodimer complex.
Am I doing something wrong here, or is this a limitation of RoseTTAFold?

RuntimeError: Class values must be smaller than num_classes.

Hello,

I am getting the following error message when running on GPU:

Traceback (most recent call last):
  File "/home/.../RoseTTAFold2NA/network/predict.py", line 345, in <module>
    pred.predict(inputs=args.inputs, out_prefix=args.prefix, ffdb=ffdb)
  File "/home/.../RoseTTAFold2NA/network/predict.py", line 225, in predict
    self._run_model(Ls, msa_orig, ins_orig, t1d, t2d, xyz_t, xyz_t[:,0], alpha_t, "%s_%02d"%(out_prefix, i_trial))
  File "/home/.../RoseTTAFold2NA/network/predict.py", line 233, in _run_model
    seq, msa_seed_orig, msa_seed, msa_extra, mask_msa = MSAFeaturize(
  File "/home/.../RoseTTAFold2NA/network/data_loader.py", line 116, in MSAFeaturize
    raw_profile = raw_profile.float().mean(dim=0)
RuntimeError: CUDA error: device-side assert triggered CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Or when running on CPU:

Traceback (most recent call last):
  File "/home/.../RoseTTAFold2NA/network/predict.py", line 345, in <module>
    pred.predict(inputs=args.inputs, out_prefix=args.prefix, ffdb=ffdb)
  File "/home/.../RoseTTAFold2NA/network/predict.py", line 225, in predict
    self._run_model(Ls, msa_orig, ins_orig, t1d, t2d, xyz_t, xyz_t[:,0], alpha_t, "%s_%02d"%(out_prefix, i_trial))
  File "/home/.../RoseTTAFold2NA/network/predict.py", line 233, in _run_model
    seq, msa_seed_orig, msa_seed, msa_extra, mask_msa = MSAFeaturize(
  File "/home/.../RoseTTAFold2NA/network/data_loader.py", line 115, in MSAFeaturize
    raw_profile = torch.nn.functional.one_hot(msa, num_classes=NAATOKENS)
RuntimeError: Class values must be smaller than num_classes.

Do you have any idea what might cause this and how to fix?
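`torch.nn.functional.one_hot` raises this error when some index in its input is greater than or equal to `num_classes`, so a cheap first check is whether the input sequences contain characters the tokenizer may not expect. The sketch below is only a diagnostic under stated assumptions: the alphabets in `ASSUMED_ALPHABETS` are illustrative guesses, not the token tables RF2NA actually uses, and `unexpected_chars` is a hypothetical helper.

```python
from typing import Dict, Set

# Assumed alphabets for illustration only; the token tables RF2NA uses
# live in its own source and may differ.
ASSUMED_ALPHABETS: Dict[str, Set[str]] = {
    "protein": set("ACDEFGHIKLMNPQRSTVWYX-"),
    "dna": set("ACGTN-"),
    "rna": set("ACGUN-"),
}


def unexpected_chars(seq: str, kind: str) -> Set[str]:
    """Return the characters in seq that fall outside the assumed alphabet."""
    return set(seq.upper()) - ASSUMED_ALPHABETS[kind]


print(unexpected_chars("ACGUACGU", "dna"))   # characters a DNA table may not cover
print(unexpected_chars("acgtacgt", "dna"))   # empty set: nothing unexpected
```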

Thanks!

RuntimeError: CUDA out of memory.

Hi,
Thanks for creating this tool and providing the code. I am able to run the prediction on the example sequences, but when I try to make a prediction for my use case, I run into 'CUDA out of memory' errors.

Here's the entire error message:

Traceback (most recent call last):
File "/home/akash.bahai/RoseTTAFold2NA/network/predict.py", line 346, in <module>
pred.predict(inputs=args.inputs, out_prefix=args.prefix, ffdb=ffdb)
File "/home/akash.bahai/RoseTTAFold2NA/network/predict.py", line 226, in predict
self._run_model(Ls, msa_orig, ins_orig, t1d, t2d, xyz_t, xyz_t[:,0], alpha_t, "%s_%02d"%(out_prefix, i_trial))
File "/home/akash.bahai/RoseTTAFold2NA/network/predict.py", line 270, in _run_model
logit_s, logit_aa_s, logit_pae, init_crds, alpha_prev, _, pred_lddt_binned, msa_prev, pair_prev, state_prev = self.model(
File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/akash.bahai/RoseTTAFold2NA/network/RoseTTAFoldModel.py", line 93, in forward
pair, state = self.templ_emb(t1d, t2d, alpha_t, xyz_t, pair, state, use_checkpoint=use_checkpoint)
File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/akash.bahai/RoseTTAFold2NA/network/Embeddings.py", line 190, in forward
templ = self.templ_stack(templ, xyz_t, use_checkpoint=use_checkpoint) # (B, T, L,L, d_templ)
File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/akash.bahai/RoseTTAFold2NA/network/Embeddings.py", line 132, in forward
templ = self.block[i_block](templ, rbf_feat)
File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/akash.bahai/RoseTTAFold2NA/network/Track_module.py", line 95, in forward
pair = pair + self.drop_row(self.row_attn(pair, rbf_feat))
File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/akash.bahai/RoseTTAFold2NA/network/Attention_module.py", line 453, in forward
pair = self.norm_pair(pair)
File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/normalization.py", line 189, in forward
return F.layer_norm(
File "/home/akash.bahai/.conda/envs/RF2NA/lib/python3.8/site-packages/torch/nn/functional.py", line 2503, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: CUDA out of memory. Tried to allocate 4.01 GiB (GPU 0; 31.75 GiB total capacity; 22.47 GiB already allocated; 3.67 GiB free; 27.22 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

This error happens in the last step of the pipeline, i.e. end-to-end prediction. I am using a V100 with 32 GB of memory.
The RNA is ~1300 nt long and the protein is ~700 amino acids. Can you please estimate the GPU memory required for such a use case?
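As a rough guide rather than an answer from the authors: pair-style features have shape (L, L, d), so memory grows with the square of the total length L (here about 1300 + 700 = 2000). The following is a back-of-the-envelope sketch. The channel width `channels=128` and the 4-byte float32 element size are assumptions, not values read from the RF2NA code, and `pair_tensor_gib` is a hypothetical helper.

```python
def pair_tensor_gib(total_length: int, channels: int, bytes_per_element: int = 4) -> float:
    """Size in GiB of one dense (L, L, channels) tensor."""
    return total_length * total_length * channels * bytes_per_element / 2**30


# Compare a few total lengths at an assumed channel width of 128.
for length in (500, 1000, 2000):
    print(length, round(pair_tensor_gib(length, channels=128), 2))
```

Doubling L quadruples each such tensor, and several of them are alive at once during a forward pass, so the total requirement is a multiple of the single-tensor figure.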

A bug in make_rna_msa.sh

Entries passed to blastdbcmd -entry_batch should begin with the sequence ID; using an accession as input returns the wrong sequence.
For example:

$ printf "%s %s %s \n" 1M5O_B -4-98 minus 30 | blastdbcmd -db $db -entry_batch -
$ 6B6H_1 Chain 1, SYNTHETIC NONTEMPLATE STRAND DNA (88-MER)
CGCCGCGTCAGACTGCACACATTATAGCATACGTGAGGTGGGATGTCAAGGCCTTTTTTGCCTAAAATGTGATCTAGATC
ACATTTTN
Error: [blastdbcmd] Skipped 30

The input was 1M5O_B but 6B6H_1 was returned. It is recommended to modify blastn -outfmt to '6 sgi smart send saver evalue bitscore nident staxids'

B-factor column all 1.0 with model, no pLDDT in PDB file?

Hi, thanks for open-sourcing this model!

When I run the example in the README:
../run_RF2NA.sh t000_ protein.fa R:RNA.fa
it runs fine.
But rather than a "per-residue LDDT in the B-factor column" of the PDB, I see 1.00 for all residues.

Note, the pLDDT seems to be calculated OK: it is accessible via the provided .npz file, though this isn't particularly convenient.
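Until the PDB output carries the values, they can be copied into the B-factor column by hand. This is a minimal sketch, assuming the per-residue values have already been read from the .npz file (for example with `numpy.load`; the array name inside the file is not stated in this thread, so inspect the file's keys first) and that there is one value per residue in the same order as the PDB. `set_bfactors` is a hypothetical helper, not part of RF2NA.

```python
from typing import Iterable, List, Sequence


def set_bfactors(pdb_lines: Iterable[str], per_residue: Sequence[float]) -> List[str]:
    """Overwrite B-factor columns 61-66 of ATOM/HETATM records, one value per residue."""
    out: List[str] = []
    residue_index = -1
    previous_key = None
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # Chain ID + residue number + insertion code identify a residue.
            key = line[21:27]
            if key != previous_key:
                residue_index += 1
                previous_key = key
            padded = line.rstrip("\n").ljust(66)
            value = f"{per_residue[residue_index]:6.2f}"
            line = padded[:60] + value + padded[66:] + "\n"
        out.append(line)
    return out
```

Usage would look like `new_lines = set_bfactors(open("model_00.pdb"), values)`, with `values` being the per-residue array taken from the .npz file. An `IndexError` here means the array is shorter than the number of residues in the file.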

Output model in this zip: model_00.zip.
(As an aside, there's also a bit of a clash in the region 147 UAUA 150, but I don't think this is related).

open source data?

Hi. Great work! Are you interested in open-sourcing the training and test data (RNA, DNA, protein-RNA complex data)?

weights size mismatch

Hi,

Appreciate such awesome work!

I just ran into a mismatch between the size of the weights I downloaded and the size they should be.
According to the README, the file should be 1.6G, but I only see about 800M.
I used the same path to download and i saw that wget reached to 100%.
Whenever you have a chance, can you please take a quick look at it?

Thank you.

best,
heejong

minimization of rna/protein outputs in rosetta

Hi,

This may be slightly off-topic and if so please feel free to close it. I was wondering if you had a protocol within rosetta to parameterize and minimize the resulting RNA/protein complexes that come out of the predictions.

I was hoping to get further refinement of the model.

Thanks!

Tom

Unresolved references, network/loss.py not used

I want to finetune the model on my own data and some references are missing. Some examples:

  • network/loss.py: the functions are not called anywhere
  • network/ffindex.py: Unresolved reference 'read_lines'
  • network/parsers.py: Unresolved reference 'L_s', Unresolved reference 'FFindexDB'

Where is the loss.py code actually used?

example possible failure `conda activate RF2NA && ../run_RF2NA.sh t000_ protein.fa R:RNA.fa`

Hello

running the above commands yields:
stdout:

Running HHblits
Running PSIPRED
Running hhsearch
Running rMSA (lite)
Running RoseTTAFold2NA to predict structures
Running on GPU
           plddt    best
RECYCLE  0   0.875  -1.000
RECYCLE  1   0.893   0.875
RECYCLE  2   0.897   0.893
RECYCLE  3   0.899   0.897
RECYCLE  4   0.900   0.899
RECYCLE  5   0.900   0.900
RECYCLE  6   0.902   0.900
RECYCLE  7   0.902   0.902
RECYCLE  8   0.901   0.902
RECYCLE  9   0.901   0.902
Done

stderr:

/pasteur/appa/scratch/tru/RoseTTAFold2NA/network/parsers.py:116: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  msa[msa == "U"] = 30
/pasteur/appa/scratch/tru/miniconda3/envs/RF2NA/lib/python3.8/site-packages/e3nn/o3/_spherical_harmonics.py:82: UserWarning: FALLBACK path has been taken inside: compileCudaFusionGroup. This is an indication
 that codegen Failed for some reason.
To debug try disable codegen fallback path via setting the env variable `export PYTORCH_NVFUSER_DISABLE=fallback`
To report the issue, try enable logging via setting the envvariable ` export PYTORCH_JIT_LOG_LEVEL=manager.cpp`
 (Triggered internally at  /opt/conda/conda-bld/pytorch_1659484810403/work/torch/csrc/jit/codegen/cuda/manager.cpp:237.)
  sh = _spherical_harmonics(self._lmax, x[..., 0], x[..., 1], x[..., 2])

installation problem, what is wrong ?

ld2NA/RoseTTAFold2NA/example$ ../run_RF2NA.sh t000_ protein.fa R:RNA.fa
Running RoseTTAFold2NA to predict structures
Running on CPU
plddt best
/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/amp/autocast_mode.py:202: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
warnings.warn('User provided device_type of 'cuda', but CUDA is not available. Disabling')
Traceback (most recent call last):
File "/media/katja/BIGDATA/RoseTTAFold2NA/RoseTTAFold2NA/network/predict.py", line 345, in <module>
pred.predict(inputs=args.inputs, out_prefix=args.prefix, ffdb=ffdb)
File "/media/katja/BIGDATA/RoseTTAFold2NA/RoseTTAFold2NA/network/predict.py", line 225, in predict
self._run_model(Ls, msa_orig, ins_orig, t1d, t2d, xyz_t, xyz_t[:,0], alpha_t, "%s_%02d"%(out_prefix, i_trial))
File "/media/katja/BIGDATA/RoseTTAFold2NA/RoseTTAFold2NA/network/predict.py", line 269, in _run_model
logit_s, logit_aa_s, logit_pae, init_crds, alpha_prev, _, pred_lddt_binned, msa_prev, pair_prev, state_prev = self.model(
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/media/katja/BIGDATA/RoseTTAFold2NA/RoseTTAFold2NA/network/RoseTTAFoldModel.py", line 96, in forward
msa, pair, xyz, alpha_s, xyzallatom, state = self.simulator(
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/media/katja/BIGDATA/RoseTTAFold2NA/RoseTTAFold2NA/network/Track_module.py", line 430, in forward
msa_full, pair, xyz, state, alpha = self.extra_block[i_m](msa_full, pair,
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/media/katja/BIGDATA/RoseTTAFold2NA/RoseTTAFold2NA/network/Track_module.py", line 356, in forward
xyz, state, alpha = self.str2str(msa.float(), pair.float(), xyz.detach().float(), state.float(), idx, top_k=top_k)
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
return func(*args, **kwargs)
File "/media/katja/BIGDATA/RoseTTAFold2NA/RoseTTAFold2NA/network/Track_module.py", line 223, in forward
shift = self.se3(G, node.reshape(B*L, -1, 1), l1_feats, edge_feats)
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/media/katja/BIGDATA/RoseTTAFold2NA/RoseTTAFold2NA/network/SE3_network.py", line 83, in forward
return self.se3(G, node_features, edge_features) #, clamp_d=clamp_d)
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/se3_transformer-1.0.0-py3.8.egg/se3_transformer/model/transformer.py", line 163, in forward
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/se3_transformer-1.0.0-py3.8.egg/se3_transformer/model/basis.py", line 166, in get_basis
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/cuda/nvtx.py", line 86, in range
range_push(msg.format(*args, **kwargs))
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/cuda/nvtx.py", line 28, in range_push
return _nvtx.rangePushA(msg)
File "/home/katja/anaconda3/envs/RF2NA/lib/python3.8/site-packages/torch/cuda/nvtx.py", line 9, in _fail
raise RuntimeError("NVTX functions not installed. Are you sure you have a CUDA build?")
RuntimeError: NVTX functions not installed. Are you sure you have a CUDA build?

run_RF2NA.sh :: option to set CPU and MEM requirements

Hello,

run_RF2NA.sh has hardcoded values for its CPU and MEM requirements.

Would it be possible to set those values through either command-line options or environment variables?
That way one could change the requirements to fit one's needs without editing the script.

Let me explain.

I'm trying to install RF2NA on a cluster, on a shared volume mounted read-only, so:

  1. Obviously, users can't change the CPU/MEM requirements.
  2. RF2NA will be run through a Slurm allocation, so with the hardcoded values we may get oversubscription of the CPU and memory requirements.
    The default Slurm allocation is 1 CPU / 4G MEM in our setup.
    Slurm allows CPU and memory requirements to be set with e.g. -c, --cpus-per-task=ncpus, --mem=MB.
    I am currently able to set the CPU requirement by tweaking the script:
    CPU=8 -> CPU=${SLURM_CPUS_PER_TASK:-1}, to ensure that RF2NA won't use more CPUs than were allocated.
    For MEM I have a problem: depending on how Slurm jobs are run (srun vs. sbatch), Slurm may or may not set environment variables for the requested memory.
    sbatch sets SLURM_MEM_PER_NODE and SLURM_MEM_PER_CPU, but srun won't ;-(

Considering the above, it would be useful to be able to set the CPU/MEM requirements of run_RF2NA.sh through some mechanism.
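The fallback logic described above can be sketched as a small helper. This is an illustration only: `slurm_resources` is a hypothetical function, the Slurm memory variables are assumed to be expressed in megabytes, and the defaults (1 CPU, 4 GB) are the ones quoted for this cluster. run_RF2NA.sh does not currently read any such values, so the script would still need a change to consume them.

```python
import os
from typing import Mapping, Tuple


def slurm_resources(env: Mapping[str, str],
                    default_cpus: int = 1,
                    default_mem_gb: int = 4) -> Tuple[int, int]:
    """Return (cpus, mem_gb) from Slurm variables, falling back to defaults."""
    cpus = int(env.get("SLURM_CPUS_PER_TASK", default_cpus))
    if "SLURM_MEM_PER_NODE" in env:
        mem_mb = int(env["SLURM_MEM_PER_NODE"])
    elif "SLURM_MEM_PER_CPU" in env:
        mem_mb = int(env["SLURM_MEM_PER_CPU"]) * cpus
    else:
        # Neither variable is set (the srun case described above).
        mem_mb = default_mem_gb * 1024
    return cpus, max(1, mem_mb // 1024)


cpus, mem_gb = slurm_resources(os.environ)
print(f"CPU={cpus} MEM={mem_gb}")
```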

regards

Eric

_spherical_harmonics.py error when running example

After completing the installation according to the instructions and testing the example, I get the following error:
Running on GPU
plddt best
/home/xxxx/miniconda3_4.6.14/envs/RF2NA/lib/python3.8/site-packages/e3nn/o3/_spherical_harmonics.py:82: UserWarning: FALLBACK path has been taken inside: compileCudaFusionGroup. This is an indication that codegen Failed for some reason.
To debug try disable codegen fallback path via setting the env variable export PYTORCH_NVFUSER_DISABLE=fallback
To report the issue, try enable logging via setting the envvariable export PYTORCH_JIT_LOG_LEVEL=manager.cpp
(Triggered internally at /opt/conda/conda-bld/pytorch_1659484810403/work/torch/csrc/jit/codegen/cuda/manager.cpp:237.)
sh = _spherical_harmonics(self._lmax, x[..., 0], x[..., 1], x[..., 2])

However, the program continues and generates models.
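
The warning itself names two environment variables for investigating the fallback. A minimal sketch of setting them before rerunning the example, using only the values quoted in the message above; whether they reveal anything useful for this particular setup is not something the thread confirms:

```shell
# Values copied from the UserWarning text; set them in the shell that launches the run.
export PYTORCH_NVFUSER_DISABLE=fallback      # fail loudly instead of silently taking the fallback path
export PYTORCH_JIT_LOG_LEVEL=manager.cpp     # enable logging for the file named in the warning
echo "PYTORCH_NVFUSER_DISABLE=$PYTORCH_NVFUSER_DISABLE"
echo "PYTORCH_JIT_LOG_LEVEL=$PYTORCH_JIT_LOG_LEVEL"
# Then rerun the example from this same shell.
```

With the fallback disabled, the run may stop with an error where it previously continued, so this is meant for debugging rather than regular use.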

How to choose GPU for execution?

Hi,

How do I tell RF2NA which GPU to use? The system assigns number 0 to my second GPU while the more powerful GPU is number 1. The more powerful GPU is the closest to the CPU, so moving them around is futile.

So, how do I tell run_RF2NA.sh to use GPU 1, but not GPU 0?

Thanks,

Petr
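
A common way to steer a CUDA program to a particular GPU, without any support from the program itself, is the CUDA_VISIBLE_DEVICES environment variable. The sketch below assumes that run_RF2NA.sh does not override this variable and that the desired card is index 1 in nvidia-smi; neither is confirmed in this thread, so check the index with nvidia-smi first.

```shell
# Order devices by PCI bus ID so the indices match what nvidia-smi lists.
export CUDA_DEVICE_ORDER=PCI_BUS_ID
# Expose only physical GPU 1; inside the process it will appear as device 0.
export CUDA_VISIBLE_DEVICES=1
echo "Visible GPU(s): $CUDA_VISIBLE_DEVICES"
# Then launch run_RF2NA.sh from this same shell with your usual arguments.
```

Exported variables are inherited by child processes, so the Python process started by the script should see only the selected card.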

Usage of PR tag is unclear

The README says: Use the tag PR:xxx.fa to specify paired protein/RNA.

The code says "Merge MSAs based on taxonomy ID", which I believe makes sense in my case (binding tRNA to synthetase dimer, all from same organism source) but I cannot figure out how to specify the PR flag in the input.

run_RF2NA.sh does not test if [ $type = 'PR' ]

What goes in that 'PR:xxx.fa' fasta file? Concatenation of protein and RNA sequence? What's the sequence separator?

"No hits satisfy inclusion thresholds; no alignment saved" for example files.

I am unable to get past the nhmmer step of make_msa.RNA: no hits pass the thresholds, so no sequence data is given to the subsequent steps. The stderr log file currently says "No sequences found in file RNA.wquery.unfilt.afa".

I am fairly confident that all databases used in this step are available and correctly downloaded. Is there potentially another reason this may be failing with the example files? Everything was initialized and downloaded as described in the most recent README, and in order.
