TaxonomyWelder

Warning: Readme requires updating!

A tool for querying, parsing, and merging NCBI, GTDB, and SILVA taxonomies.

It produces two files:

  1. df_gtdb_to_ncbi_to_gg_silva_archaea_plus_bacteria_trim.tsv
GTDB_accession  RefSeq_or_GenBank ncbi_genbank_assembly_accession ncbi_taxid  gtdb_taxonomy ncbi_taxonomy ncbi_taxonomy_unfiltered  ssu_gg_taxonomy ssu_silva_taxonomy
GB_GCA_000007185.1  GB  GCA_000007185.1 190192  d__Archaea;p__Methanobacteriota;c__Methanopyri;o__Methanopyrales;f__Methanopyraceae;g__Methanopyrus;s__Methanopyrus kandleri  d__Archaea;p__Euryarchaeota;c__Methanopyri;o__Methanopyrales;f__Methanopyraceae;g__Methanopyrus;s__Methanopyrus kandleri  d__Archaea;p__Euryarchaeota;c__Methanopyri;o__Methanopyrales;f__Methanopyraceae;g__Methanopyrus;s__Methanopyrus kandleri;x__Methanopyrus kandleri AV19  k__Archaea;p__Euryarchaeota;c__Methanopyri;o__Methanopyrales;f__Methanopyraceae;g__Methanopyrus;s__kandleri Archaea;Euryarchaeota;Methanopyri;Methanopyrales;Methanopyraceae;Methanopyrus;Methanopyrus kandleri AV19
GB_GCA_000007345.1  GB  GCA_000007345.1 188937  d__Archaea;p__Halobacteriota;c__Methanosarcinia;o__Methanosarcinales;f__Methanosarcinaceae;g__Methanosarcina;s__Methanosarcina acetivorans  d__Archaea;p__Euryarchaeota;c__Methanomicrobia;o__Methanosarcinales;f__Methanosarcinaceae;g__Methanosarcina;s__Methanosarcina acetivorans d__Archaea;p__Euryarchaeota;x__Stenosarchaea group;c__Methanomicrobia;o__Methanosarcinales;f__Methanosarcinaceae;g__Methanosarcina;s__Methanosarcina acetivorans;x__Methanosarcina acetivorans C2A  none  Archaea;Euryarchaeota;Methanomicrobia;Methanosarcinales;Methanosarcinaceae;Methanosarcina;Methanosarcina acetivorans C2A
GB_GCA_000008085.1  GB  GCA_000008085.1 228908  d__Archaea;p__Nanoarchaeota;c__Nanoarchaeia;o__Nanoarchaeales;f__Nanoarchaeaceae;g__Nanoarchaeum;s__Nanoarchaeum equitans d__Archaea;p__Nanoarchaeota;c__;o__Nanoarchaeales;f__Nanoarchaeaceae;g__Nanoarchaeum;s__Nanoarchaeum equitans d__Archaea;x__DPANN group;p__Nanoarchaeota;o__Nanoarchaeales;f__Nanoarchaeaceae;g__Nanoarchaeum;s__Nanoarchaeum equitans;x__Nanoarchaeum equitans Kin4-M  k__Archaea;p__Nanoarchaeota;c__[Nanoarchaeoti];o__[Nanoarchaeotales];f__[Nanoarchaeotaceae];g__Nanoarchaeum;s__equitans Kin4-M  Archaea;Nanoarchaeaeota;Nanoarchaeia;Nanoarchaeales;Nanoarchaeaceae;Nanoarchaeum;Nanoarchaeum equitans
GB_GCA_000011125.1  GB  GCA_000011125.1 272557  d__Archaea;p__Thermoproteota;c__Thermoproteia;o__Desulfurococcales;f__Acidilobaceae;g__Aeropyrum;s__Aeropyrum pernix  d__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Desulfurococcales;f__Desulfurococcaceae;g__Aeropyrum;s__Aeropyrum pernix d__Archaea;x__TACK group;p__Crenarchaeota;c__Thermoprotei;o__Desulfurococcales;f__Desulfurococcaceae;g__Aeropyrum;s__Aeropyrum pernix;x__Aeropyrum pernix K1  k__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Desulfurococcales;f__Desulfurococcaceae;g__Aeropyrum;s__ Archaea;Crenarchaeota;Thermoprotei;Desulfurococcales;Acidilobaceae;Aeropyrum;Aeropyrum camini SY1 = JCM 12091
GB_GCA_000014945.1  GB  GCA_000014945.1 349307  d__Archaea;p__Halobacteriota;c__Methanosarcinia;o__Methanotrichales;f__Methanotrichaceae;g__Methanothrix_B;s__Methanothrix_B thermoacetophila d__Archaea;p__Euryarchaeota;c__Methanomicrobia;o__Methanosarcinales;f__Methanotrichaceae;g__Methanothrix;s__Methanothrix thermoacetophila d__Archaea;p__Euryarchaeota;x__Stenosarchaea group;c__Methanomicrobia;o__Methanosarcinales;f__Methanotrichaceae;g__Methanothrix;s__Methanothrix thermoacetophila;x__Methanothrix thermoacetophila PT  k__Archaea;p__Euryarchaeota;c__Methanomicrobia;o__Methanosarcinales;f__Methanosaetaceae;g__Methanosaeta;s__ Archaea;Euryarchaeota;Methanomicrobia;Methanosarcinales;Methanosaetaceae;Methanosaeta;Methanosaeta thermophila PT
GB_GCA_000015805.1  GB  GCA_000015805.1 410359  d__Archaea;p__Thermoproteota;c__Thermoproteia;o__Thermoproteales;f__Thermoproteaceae;g__Pyrobaculum;s__Pyrobaculum calidifontis d__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Thermoproteales;f__Thermoproteaceae;g__Pyrobaculum;s__Pyrobaculum calidifontis d__Archaea;x__TACK group;p__Crenarchaeota;c__Thermoprotei;o__Thermoproteales;f__Thermoproteaceae;g__Pyrobaculum;s__Pyrobaculum calidifontis;x__Pyrobaculum calidifontis JCM 11548 k__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Thermoproteales;f__Thermoproteaceae;g__Pyrobaculum;s__ Archaea;Crenarchaeota;Thermoprotei;Thermoproteales;Thermoproteaceae;Pyrobaculum;Pyrobaculum calidifontis
GB_GCA_000018465.1  GB  GCA_000018465.1 436308  d__Archaea;p__Thermoproteota;c__Nitrososphaeria;o__Nitrososphaerales;f__Nitrosopumilaceae;g__Nitrosopumilus;s__Nitrosopumilus maritimus d__Archaea;p__Thaumarchaeota;c__;o__Nitrosopumilales;f__Nitrosopumilaceae;g__Nitrosopumilus;s__Nitrosopumilus maritimus d__Archaea;x__TACK group;p__Thaumarchaeota;o__Nitrosopumilales;f__Nitrosopumilaceae;g__Nitrosopumilus;s__Nitrosopumilus maritimus;x__Nitrosopumilus maritimus SCM1  k__Archaea;p__Crenarchaeota;c__Thaumarchaeota;o__Cenarchaeales;f__Cenarchaeaceae;g__Nitrosopumilus;s__  Archaea;Thaumarchaeota;Nitrososphaeria;Nitrosopumilales;Nitrosopumilaceae;Candidatus Nitrosopumilus;Nitrosopumilus maritimus SCM1
GB_GCA_000019805.1  GB  GCA_000019805.1 444157  d__Archaea;p__Thermoproteota;c__Thermoproteia;o__Thermoproteales;f__Thermoproteaceae;g__Pyrobaculum;s__Pyrobaculum neutrophilum d__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Thermoproteales;f__Thermoproteaceae;g__Pyrobaculum;s__Pyrobaculum neutrophilum d__Archaea;x__TACK group;p__Crenarchaeota;c__Thermoprotei;o__Thermoproteales;f__Thermoproteaceae;g__Pyrobaculum;s__Pyrobaculum neutrophilum;x__Pyrobaculum neutrophilum V24Sta  k__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Thermoproteales;f__Thermoproteaceae;g__Pyrobaculum;s__ Archaea;Crenarchaeota;Thermoprotei;Thermoproteales;Thermoproteaceae;Pyrobaculum;Pyrobaculum neutrophilum V24Sta
GB_GCA_000022445.1  GB  GCA_000022445.1 426118  d__Archaea;p__Thermoproteota;c__Thermoproteia;o__Sulfolobales;f__Sulfolobaceae;g__Saccharolobus;s__Saccharolobus islandicus d__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Sulfolobales;f__Sulfolobaceae;g__Sulfolobus;s__Sulfolobus islandicus d__Archaea;x__TACK group;p__Crenarchaeota;c__Thermoprotei;o__Sulfolobales;f__Sulfolobaceae;g__Sulfolobus;s__Sulfolobus islandicus;x__Sulfolobus islandicus M.16.4 k__Archaea;p__Crenarchaeota;c__Thermoprotei;o__Sulfolobales;f__Sulfolobaceae;g__Sulfolobus;s__islandicus  Archaea;Crenarchaeota;Thermoprotei;Sulfolobales;Sulfolobaceae;Sulfolobus;Sulfolobus islandicus M.16.4
  2. df_silva_to_ncbi.tsv
SLV_accession NCBI_taxid  NCBI_accession  SLV_taxonomy  SLV_organisms_name  SLV_taxid
AX664486  7227  AX664486  Eukaryota;Amorphea;Obazoa;Opisthokonta;Holozoa;Choanozoa;Metazoa;Animalia;BCP clade;Bilateria;Protostomia;Ecdysozoa;Arthropoda;Hexapoda;Insecta;Pterygota;Neoptera;Diptera; Drosophila melanogaster (fruit fly) 47180
BD307583  37089 BD307583  Eukaryota;SAR;Alveolata;Apicomplexa;Conoidasida;Coccidia;Eimeriorina;Toxoplasma;  Neospora sp.  8805
BD359735  5858  BD359735  Eukaryota;SAR;Alveolata;Apicomplexa;Aconoidasida;Haemosporoidia;Plasmodium;  Plasmodium malariae  8775
BD359736  5858  BD359736  Eukaryota;SAR;Alveolata;Apicomplexa;Aconoidasida;Haemosporoidia;Plasmodium;  Plasmodium malariae  8775
CS214259  10116 CS214259  Eukaryota;Amorphea;Obazoa;Opisthokonta;Holozoa;Choanozoa;Metazoa;Animalia;BCP clade;Bilateria;Deuterostomia;Chordata;Vertebrata;Gnathostomata;Euteleostomi;Tetrapoda;Mammalia;  Rattus norvegicus (Norway rat)  47087
CS214261  10116 CS214261  Eukaryota;Amorphea;Obazoa;Opisthokonta;Holozoa;Choanozoa;Metazoa;Animalia;BCP clade;Bilateria;Deuterostomia;Chordata;Vertebrata;Gnathostomata;Euteleostomi;Tetrapoda;Mammalia;  Rattus norvegicus (Norway rat)  47087
CS214263  10116 CS214263  Eukaryota;Amorphea;Obazoa;Opisthokonta;Holozoa;Choanozoa;Metazoa;Animalia;BCP clade;Bilateria;Deuterostomia;Chordata;Vertebrata;Gnathostomata;Euteleostomi;Tetrapoda;Mammalia;  Rattus norvegicus (Norway rat)  47087
AB000106  28214 AB000106  Bacteria;Proteobacteria;Alphaproteobacteria;Sphingomonadales;Sphingomonadaceae;Sphingobium; Sphingomonas sp.  2850
AB000278  56192 AB000278  Bacteria;Proteobacteria;Gammaproteobacteria;Vibrionales;Vibrionaceae;Photobacterium;  Photobacterium iliopiscarium  3781
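
The rank-prefixed lineage columns in the first file (gtdb_taxonomy, ncbi_taxonomy) can be compared rank by rank. The following is a minimal sketch, not part of TaxonomyWelder itself: the function names (parse_lineage, differing_ranks, report) are hypothetical, the file is assumed to be tab-separated with the header shown above, and the sample row is a three-column subset of the first data row. Entries with other prefixes, such as the x__ groups in ncbi_taxonomy_unfiltered, are skipped.

```python
import csv
import io

# Rank prefixes used by the gtdb_taxonomy and ncbi_taxonomy columns.
RANK_PREFIXES = ("d", "p", "c", "o", "f", "g", "s")


def parse_lineage(lineage):
    """Turn 'd__Archaea;p__...' into {'d': 'Archaea', 'p': ...}.

    Entries without a known rank prefix (for example 'x__TACK group')
    are ignored.
    """
    ranks = {}
    for part in lineage.split(";"):
        prefix, sep, name = part.strip().partition("__")
        if sep and prefix in RANK_PREFIXES:
            ranks[prefix] = name
    return ranks


def differing_ranks(gtdb_lineage, ncbi_lineage):
    """Return the rank prefixes at which the two lineages disagree."""
    gtdb = parse_lineage(gtdb_lineage)
    ncbi = parse_lineage(ncbi_lineage)
    return [r for r in RANK_PREFIXES if gtdb.get(r) != ncbi.get(r)]


def report(handle):
    """Map each GTDB accession to the ranks where GTDB and NCBI differ."""
    reader = csv.DictReader(handle, delimiter="\t")  # assumption: tab-separated
    return {
        row["GTDB_accession"]: differing_ranks(
            row["gtdb_taxonomy"], row["ncbi_taxonomy"]
        )
        for row in reader
    }


# Three-column subset of the first data row shown above.
SAMPLE_TSV = (
    "GTDB_accession\tgtdb_taxonomy\tncbi_taxonomy\n"
    "GB_GCA_000007185.1\t"
    "d__Archaea;p__Methanobacteriota;c__Methanopyri;o__Methanopyrales;"
    "f__Methanopyraceae;g__Methanopyrus;s__Methanopyrus kandleri\t"
    "d__Archaea;p__Euryarchaeota;c__Methanopyri;o__Methanopyrales;"
    "f__Methanopyraceae;g__Methanopyrus;s__Methanopyrus kandleri\n"
)

if __name__ == "__main__":
    print(report(io.StringIO(SAMPLE_TSV)))
```

For the sample row, only the phylum differs (Methanobacteriota in GTDB, Euryarchaeota in NCBI). To run the same comparison on the real output, pass an open file handle for df_gtdb_to_ncbi_to_gg_silva_archaea_plus_bacteria_trim.tsv to report() instead of the in-memory sample.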

Contributors

jungbluth
