ensembldb3's Introduction

ensembldb3

WARNING: This tool is being replaced by EnsemblLite. EnsemblLite is not yet ready for use, but should be in early 2024. Thanks for your patience!

Documentation

ensembldb3 documentation is available at Read the Docs.

ensembldb3's People

Contributors

gavinhuttley, huaemilyying

ensembldb3's Issues

download should use CHECKSUMS

The ensembldb3.download.Download class should check that every downloaded file matches its entry in the CHECKSUMS file that Ensembl provides.

Add a function based on the following code that is called by Download.

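import subprocess


def get_invalid_files(path):
    """return files whose checksums differ from those listed in CHECKSUMS"""
    checkpath = path / "CHECKSUMS"
    failed = []
    with open(checkpath) as infile:
        for record in infile:
            record = record.split()
            if len(record) < 3:
                # skip blank or malformed lines
                continue
            filepath = path / record[-1]
            if not filepath.exists():
                continue

            # `sum` prints "<checksum> <blocks>"; some implementations also
            # append the file name, so compare only the first two fields
            result = subprocess.run(
                ["sum", str(filepath)], capture_output=True, text=True, check=True
            )
            if result.stdout.split()[:2] != record[:2]:
                failed.append(filepath)
    return failed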
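
The function above shells out to `sum`, whose output format differs between platforms. As a possible alternative, here is a minimal sketch that computes the same pair of values in pure Python. It assumes the CHECKSUMS entries were produced by the BSD algorithm with 1024-byte blocks (the default for GNU `sum`), which should be confirmed against a real CHECKSUMS file. The name `bsd_sum` is hypothetical and not part of ensembldb3.

```python
def bsd_sum(path, block_size=1024):
    """return (checksum, blocks) using the BSD 16-bit rotating checksum"""
    checksum = 0
    num_bytes = 0
    with open(path, "rb") as infile:
        # read in chunks so large downloads are not loaded into memory
        while chunk := infile.read(65536):
            num_bytes += len(chunk)
            for byte in chunk:
                # rotate right by one bit within 16 bits, then add the byte
                checksum = (checksum >> 1) + ((checksum & 1) << 15)
                checksum = (checksum + byte) & 0xFFFF
    # number of blocks, rounded up
    blocks = -(-num_bytes // block_size)
    return checksum, blocks
```

Comparing these integers against `int(record[0])` and `int(record[1])` would also sidestep differences in zero padding between `sum` implementations.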

Where do gene symbols come from for human Ensembl genes?

I am trying to figure out where Ensembl stores the symbol for a gene in its MySQL database (schema). Forgive me for opening an issue here. My excuse is that:

From the ensembldb3 docs:

>>> gene = human.get_gene_by_stableid(stableid='ENSG00000139618')
>>> print(gene)
Gene(species='Homo sapiens'; biotype='protein_coding'; description='BRCA2, DNA repair...'; stableid='ENSG00000139618'; status='KNOWN'; symbol='BRCA2')

Note symbol='BRCA2'. Now I'd like to figure out where in the database this is coming from.

improve installation performance

At present, a db is identified as installed by the existence of an empty ENSEMBLDB_INSTALLED file.

If installation is interrupted for any reason, the entire db is dropped and every table is reinstalled. Instead, this file should be used to record the successful installation of each table.

This means the status check needs to read the contents of this file.
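
The proposal can be sketched as follows. This is an illustration only, assuming the ENSEMBLDB_INSTALLED file holds one table name per line. The function names and the `db_dir` argument are hypothetical and not part of ensembldb3.

```python
from pathlib import Path

INSTALLED_FILE = "ENSEMBLDB_INSTALLED"  # file name taken from the issue text


def get_installed_tables(db_dir):
    """return the set of table names recorded as installed"""
    path = Path(db_dir) / INSTALLED_FILE
    if not path.exists():
        return set()
    lines = path.read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def record_installed_table(db_dir, table_name):
    """append table_name once its installation has succeeded"""
    path = Path(db_dir) / INSTALLED_FILE
    with open(path, "a") as outfile:
        outfile.write(f"{table_name}\n")


def get_pending_tables(db_dir, all_tables):
    """return tables still to install, preserving the order of all_tables"""
    installed = get_installed_tables(db_dir)
    return [t for t in all_tables if t not in installed]
```

After an interruption, only the tables returned by `get_pending_tables` would need to be installed. Note that an empty file left by the current scheme reads as "no tables recorded", so such a db would be treated as not yet installed unless that case is handled separately.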
