GenBench: A Comprehensive Benchmark of Genomic Foundation Models

Introduction

GenBench is a comprehensive benchmark for evaluating genomic foundation models. It covers a broad spectrum of methods and diverse tasks, including predicting gene location and function, identifying regulatory elements, and studying species evolution. GenBench offers a modular and extensible framework designed to be user-friendly, well organized, and comprehensive. The codebase is organized into three abstraction layers, listed from bottom to top: the core layer, the algorithm layer, and the user interface layer.

Overview

Code Structures
  • GenBench/configs contains configuration for benchmark evaluation.
  • GenBench/data contains datasets.
  • GenBench/notebook contains analysis and visualization notebooks.
  • GenBench/src contains source code for evaluation pipelines.
  • GenBench/weight contains pretrained weights for benchmark evaluation.
  • GenBench/experiment contains scripts for experiment management.
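
As a quick sanity check after cloning, the layout above can be verified with a few lines of Python. This is a minimal sketch, not part of GenBench: the helper name `missing_dirs` is hypothetical, and it assumes the six directories sit directly under the repository root as listed.

```python
from pathlib import Path

# Directory names are taken from the list above; the helper itself is illustrative.
EXPECTED_DIRS = ["configs", "data", "notebook", "src", "weight", "experiment"]


def missing_dirs(repo_root):
    """Return the expected subdirectories that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # "GenBench" is assumed to be the checkout directory; adjust the path as needed.
    absent = missing_dirs("GenBench")
    print("missing:", absent if absent else "none")
```

An empty result means every directory in the list is present; `data` and `weight` may stay empty until datasets and pretrained weights are downloaded.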

Installation

This project provides a conda environment file. To reproduce the environment, run the following commands:

cd GenBench
conda env create -f environment.yml
conda activate OpenGenome
python setup.py develop

Getting Started

Here is an example of single-GPU, non-distributed training of HyenaDNA on the demo_human_or_worm dataset.

bash tools/prepare_data/download_mmnist.sh
python train.py -m train experiment=hg38/genomic_benchmark_mamba \
        dataset.dataset_name=demo_human_or_worm \
        wandb.id=demo_human_or_worm_hyenadna \
        train.pretrained_model_path=path/to/pretrained_model \
        trainer.devices=1
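
The dotted `key=value` arguments in the command above read like hierarchical configuration overrides. The sketch below illustrates that reading in plain Python. It is an assumption about how the arguments are grouped, not the actual parsing code in train.py, and `parse_overrides` is a hypothetical helper that keeps every value as a string.

```python
def parse_overrides(args):
    """Turn ['a.b=1', 'c=x'] into {'a': {'b': '1'}, 'c': 'x'} (values stay strings)."""
    config = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {arg!r}")
        # Walk (or create) one nested dict per dotted prefix, then set the leaf.
        *parents, leaf = key.split(".")
        node = config
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


# The override strings are copied from the command above.
overrides = parse_overrides([
    "experiment=hg38/genomic_benchmark_mamba",
    "dataset.dataset_name=demo_human_or_worm",
    "wandb.id=demo_human_or_worm_hyenadna",
    "train.pretrained_model_path=path/to/pretrained_model",
    "trainer.devices=1",
])
print(overrides["dataset"]["dataset_name"])  # demo_human_or_worm
```

Read this way, `dataset.dataset_name` selects the dataset, `train.pretrained_model_path` points at the weights to load, and `trainer.devices=1` requests a single device.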

Repeat the experiment

Please see experiment.MD for details on experiment management. The scripts are in the 'experiment' directory.

Overview of Model Zoo and Datasets

We support various genomic foundation models. We are working on adding new methods and collecting experiment results.

Visualization

We present visualization examples of HyenaDNA below. For more detailed information, please refer to the notebook.

  • For Drosophila enhancer activity prediction, visualizations of predicted and ground-truth enhancers are shown in notebook/drosophila_pearsonr.ipynb after running the experiment.
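
The notebook name suggests that predictions are compared with ground truth using the Pearson correlation coefficient. Assuming that is the metric, the following is a minimal standard-library sketch of the computation. The function name `pearson_r` and the sample values are illustrative and are not taken from the notebook or from any experiment.

```python
import math


def pearson_r(predicted, observed):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sum of co-deviations and of squared deviations around each mean.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_o)


if __name__ == "__main__":
    # Made-up activity values, for illustration only.
    predicted_activity = [0.2, 1.1, 2.9, 4.2]
    observed_activity = [0.0, 1.0, 3.0, 4.0]
    print(f"pearson r = {pearson_r(predicted_activity, observed_activity):.3f}")
```

Values close to 1 indicate that predicted and measured activity rise and fall together; values near 0 indicate no linear relationship.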

License

This project is released under the Apache 2.0 license. See LICENSE for more information.

Acknowledgement

The framework of GenBench is inspired by HyenaDNA.

Contact

genbench's Issues

Some advice

Thanks for the wonderful benchmark. It might be easier to use if you could provide download links for the pre-trained models, such as NT or Hyena.
