make nt database index about centrifuge HOT 8 OPEN

jrherr commented on September 18, 2024

make nt database index

from centrifuge.

Comments (8)

fconstancias commented on September 18, 2024

I am also having an issue building the nt index. I followed the manual and got this error:

centrifuge-build -p 16 --bmax 1342177280 --conversion-table gi_taxid_nucl.map --taxonomy-tree taxonomy/nodes.dmp --name-table taxonomy/names.dmp nt.fa nt

_
...
Warning: taxomony id doesn't exists for NM_001348188.1!
Warning: taxomony id doesn't exists for NM_001348191.1!
Warning: taxomony id doesn't exists for NM_001348189.1!
Warning: taxomony id doesn't exists for NM_001348193.1!
Warning: taxomony id doesn't exists for NM_001348190.1!
Warning: taxomony id doesn't exists for NM_001348194.1!
Warning: taxomony id doesn't exists for NM_001348192.1!
Warning: taxomony id doesn't exists for NM_003831.4!
Warning: taxomony id doesn't exists for NG_001488.3!
Warning: taxomony id doesn't exists for NG_052639.1!
Warning: taxomony id doesn't exists for NG_052638.1!
Warning: taxomony id doesn't exists for NR_145473.1!
Warning: taxomony id doesn't exists for NR_145472.1!
Warning: taxomony id doesn't exists for NR_145477.1!
Warning: taxomony id doesn't exists for NR_145465.1!
Warning: taxomony id doesn't exists for NR_145474.1!
Warning: taxomony id doesn't exists for NR_145475.1!
Warning: taxomony id doesn't exists for NR_103441.2!
Warning: taxomony id doesn't exists for NR_145476.1!
Warning: taxomony id doesn't exists for NR_145471.1!
Error: taxonomy/nodes.dmp doesn't exist!
Total time for call to driver() for forward index: 02:09:20
Error: Encountered internal Centrifuge exception (#1)
Command: centrifuge-build-bin --wrapper basic-0 -p 16 --bmax 1342177280 --conversion-table gi_taxid_nucl.map --taxonomy-tree taxonomy/nodes.dmp --name-table taxonomy/names.dmp nt.fa nt
Deleting "nt.1.cf" file written during aborted indexing attempt.
Deleting "nt.2.cf" file written during aborted indexing attempt.
Deleting "nt.3.cf" file written during aborted indexing attempt.
_

Do you have any idea what is wrong with my command?
Thanks a lot.

Flo
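
In the log above, the build ran for more than two hours before reporting that taxonomy/nodes.dmp doesn't exist. A minimal sketch of a check that could be run before starting the build is shown here. The function name `check_inputs` is hypothetical and not part of centrifuge; it only tests that each path exists and is non-empty.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of centrifuge): verify that every input
# file exists and is non-empty before starting a long build.
check_inputs() {
    local missing=0
    local f
    for f in "$@"; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

With the paths from the command above, it could be called as `check_inputs gi_taxid_nucl.map taxonomy/nodes.dmp taxonomy/names.dmp nt.fa` from the directory where centrifuge-build will run, and the build started only if it returns success.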

xgnusr commented on September 18, 2024

Same issue here. Any news?

feltzmc commented on September 18, 2024

@xgnusr Building the nt index takes a lot of memory. I managed to build it successfully on an AWS server with 488 GB of memory, but it failed on a server with 196 GB.

Command used:
centrifuge-build -p 16 --conversion-table acc2tax.map --taxonomy-tree taxonomy/nodes.dmp --name-table taxonomy/names.dmp nt.fa nt
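
The warnings in the earlier log name accession-style IDs (NM_..., NR_...) while that command passed gi_taxid_nucl.map, whereas the command above passes acc2tax.map. One possible explanation, which is an assumption and not confirmed in this thread, is that the IDs in the FASTA headers did not match the keys in the conversion table. The sketch below counts header IDs that have no entry in the table. It assumes the table holds the sequence ID in its first whitespace-separated column and that the FASTA ID is the first token after ">". `count_unmapped` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many FASTA sequence IDs have no entry
# in the first column of the conversion table.
# Usage: count_unmapped MAP FASTA
count_unmapped() {
    local map=$1 fasta=$2
    awk '
        # Lines from the first file (the map): remember each sequence ID.
        FILENAME == ARGV[1] { known[$1] = 1; next }
        # Header lines from the FASTA: ID is the first token without ">".
        /^>/ {
            id = substr($1, 2)
            if (!(id in known)) n++
        }
        END { print n + 0 }
    ' "$map" "$fasta"
}
```

The whole table is held in memory, so for a file as large as nt it is probably more practical to try this on a sample of headers first, for example the output of `grep '^>' nt.fa | head -n 100000`.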

GastonViarengo commented on September 18, 2024

Is this also the memory requirement for the RefSeq database? I'm trying to make it work but have been unsuccessful in creating a bacterial index (#199 (comment)). How can I get access to an AWS machine with that amount of RAM? Thanks for any help!

feltzmc commented on September 18, 2024

@GastonViarengo The memory required for RefSeq could be even higher. I believe RefSeq complete is around 120GB at the moment, whereas nt is around 94GB in BLAST db form. As for accessing an AWS machine with a large amount of RAM, you would first need an account on Amazon Web Services (https://aws.amazon.com/). From there, I would recommend building a machine image on a free micro instance with centrifuge already installed, then creating an AWS storage drive with the concatenated RefSeq files in FASTA format. Once you are set up and ready to build, reserve an AWS VM with a large amount of memory, load the image, mount the drive, and begin the build process. It is a somewhat complicated process, so you may want to have someone familiar with AWS walk you through it. Please note that running AWS machines and reserving storage is not free, and you will need to pay for any resources you use. Good luck!

GastonViarengo commented on September 18, 2024

Thanks Matthew for the quick response! I have downloaded the bacterial RefSeq database (through centrifuge-download) and it's around 70GB in FASTA form, which is why I was wondering why it takes so much memory to build the index. I'll check out AWS, but it's difficult for us to pay for such services. Do you (or anyone) know of a FREE server with that amount of memory? Another approach I am considering is to split the bacterial DB and create multiple indexes, but that's annoying for further analysis! Thanks so much for your help!
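
The splitting idea mentioned above could look roughly like the sketch below, which distributes FASTA records round-robin into a fixed number of files. `split_fasta` and the output naming scheme are hypothetical, and whether each part then fits in memory during the build is not something this thread confirms. Round-robin placement also ignores taxonomy, so grouping records by organism may be preferable in practice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a FASTA file into N parts, record by record.
# Output files are named PREFIX.0.fa, PREFIX.1.fa, ... PREFIX.(N-1).fa
# Usage: split_fasta FASTA N PREFIX
split_fasta() {
    local fasta=$1 parts=$2 prefix=$3
    awk -v n="$parts" -v p="$prefix" '
        # Each header line starts a new record; pick its output file.
        /^>/ { out = p "." (rec++ % n) ".fa" }
        # Write the header and its sequence lines to the chosen file.
        out != "" { print > out }
    ' "$fasta"
}
```

Each part would then need its own centrifuge-build run with the same conversion table and taxonomy files, and reads would have to be classified against every resulting index separately.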

feltzmc commented on September 18, 2024

@GastonViarengo Ah, 70GB should be much more manageable than 120GB. I don't know of any free resources with that amount of memory, but if you are in academia you might be able to reach out to nearby universities and find one that has a high-performance compute cluster with a "fat node". Alternatively, it looks like this site (https://genexa.ch/sars2-bioinformatics-resources/) has some free centrifuge databases built with recent (March 2020) RefSeq data. Best of luck!
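
When evaluating a candidate machine such as a cluster node, its total RAM can be compared with the figures reported earlier in this thread (the nt build succeeded with 488 GB and failed with 196 GB). The sketch below is Linux-only and assumes the standard /proc/meminfo format with MemTotal given in kB. `mem_total_gb` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print total RAM in whole GiB (rounded down),
# read from a /proc/meminfo-style file. Linux only.
# Usage: mem_total_gb [FILE]   (FILE defaults to /proc/meminfo)
mem_total_gb() {
    awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' "${1:-/proc/meminfo}"
}
```

On a Linux host, `mem_total_gb` with no argument reports that machine's total memory. The amount a RefSeq bacterial build actually needs is not established in this thread, so any threshold chosen is a guess.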

GastonViarengo commented on September 18, 2024

OK Matthew, thanks again! Yes, I downloaded those DBs, but I wanted to work with a bacteria-only RefSeq index (or at least human+bacteria). Anyway, I'll keep trying and looking for options. Best!
