edinburgh-genome-foundry / galaxy_synbiocad_dnaweaver

Archived Galaxy node for the manufacturing step of the SynBioCAD project. See latest version here: https://github.com/brsynth/DNAWeaver_SynBioCAD

Home Page: https://github.com/brsynth/DNAWeaver_SynBioCAD

galaxy_synbioCAD_dnaweaver

This project finds assembly plans to build the genetic designs generated by the SynBioCAD project, using either Golden Gate or Gibson assembly.

Installing/Dockerizing

This project is not yet dockerized, but according to the .travis.yml file, a Docker build (from an ubuntu/python image) should run:

sudo apt-get -qq update
sudo apt-get install ncbi-blast+
pip install -r requirements.txt

Running the script

See script.py or run ./script.py -h for the description of the parameters. A typical command is:

script.py input_sbol.xml output.xlsx any_method

Here, input_sbol.xml is the path to an SBOL .xml file containing the construct designs and part sequences, and output.xlsx is the spreadsheet report of the assembly plan. The last argument selects the assembly method: any_method considers both methods, while gibson or golden_gate restricts the plan to that one method.
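The three positional arguments above could be parsed along the following lines. This is only a sketch: the argument names input_sbol, output_xlsx and method, and the helper build_parser, are hypothetical and are not taken from script.py, whose real parameters are listed by ./script.py -h.

```python
import argparse

# The three method values named in this README.
METHODS = ("any_method", "gibson", "golden_gate")


def build_parser():
    """Hypothetical parser mirroring the command line shown above."""
    parser = argparse.ArgumentParser(
        prog="script.py",
        description="Find assembly plans for the designs in an SBOL file.",
    )
    parser.add_argument("input_sbol", help="path to the input SBOL .xml file")
    parser.add_argument("output_xlsx", help="path of the spreadsheet report")
    parser.add_argument("method", choices=METHODS, help="assembly method(s) to consider")
    return parser


# Parse the typical command from the text above.
args = build_parser().parse_args(["input_sbol.xml", "output.xlsx", "any_method"])
print(args.method)  # any_method
```

Restricting method with choices makes an unsupported value fail at parse time rather than partway through planning.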

Testing

The .travis.yml file describes the testing procedure. To run the tests, first install pytest:

pip install pytest

Then run the tests with:

python3 -m pytest tests.py

What this program does

Problem:

Given a set of designs (each design is a construct name and the ordered list of its parts), find a valid and efficient assembly plan to build all the designs. The designs and part sequences are provided as an SBOL file (see test_input.xml for an example).

Method:

  • We assume that the different standard parts are available or will be ordered, with the exact sequence provided in the input file (in the future it would be easy to automatically break long parts into smaller fragments).
  • The desired construct sequence for a design is simply the concatenation of that design's part sequences in the right order (no assembly overhang is included).
  • Buy primers with overhangs to extend the part fragments via PCR and create homologies between them so they can be assembled together.
  • Assemble each construct in a single step with Golden Gate assembly if possible (that is, if at least one site out of BsaI, BbsI and BsmBI is totally absent from the construct sequence), else with Gibson assembly. Only one of the two methods is considered if the option gibson or golden_gate is selected instead of any_method.
  • Process all designs one after another. Make an assembly plan for the first design, then reuse primers and fragments to make an assembly plan for the second design, etc.
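The concatenation and method-selection rules in the list above can be illustrated as follows. This is a minimal sketch, not the project's code: the actual planning is delegated to the DnaWeaver supply network, and the function names here are hypothetical. The recognition sites are the standard ones for the three enzymes (BsaI: GGTCTC, BbsI: GAAGAC, BsmBI: CGTCTC). The sketch checks both strands but, as a simplification, ignores sites that would span the junction of a circular construct.

```python
# Recognition sites of the three Type IIS enzymes named above.
ENZYME_SITES = {"BsaI": "GGTCTC", "BbsI": "GAAGAC", "BsmBI": "CGTCTC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def construct_sequence(part_names, part_sequences):
    """Concatenate the part sequences of a design in order (no overhangs)."""
    return "".join(part_sequences[name] for name in part_names)


def usable_enzymes(sequence):
    """List the enzymes whose site is absent from both strands."""
    sequence = sequence.upper()
    return [
        name
        for name, site in ENZYME_SITES.items()
        if site not in sequence and reverse_complement(site) not in sequence
    ]


def choose_method(sequence, method="any_method"):
    """Return (method, enzyme) following the rule described above."""
    enzymes = usable_enzymes(sequence)
    if method in ("any_method", "golden_gate") and enzymes:
        return "golden_gate", enzymes[0]
    if method in ("any_method", "gibson"):
        return "gibson", None
    # Golden Gate was requested but every enzyme has a site in the construct.
    return None, None
```

With any_method, a construct falls back to Gibson only when all three enzymes have at least one site in it. With golden_gate alone, such a construct has no plan, which is the kind of case the errors sheet of the output is meant to report.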

Here is a schema of the DNA Weaver supply network used to make an assembly plan for a single design. The primers and fragments from previously-planned assemblies are provided "for free" in the network as "parts libraries":

Output:

See example_output.xlsx for an example. The output is an Excel spreadsheet with the following sub-sheets:

  • construct_parts: the ID and list of part names (in the right order) for each design.
  • construct_sequences: the final sequence of the constructs to build.
  • part_sequences: the list of each standard part and its sequence (same information as in the input SBOL file).
  • fragment_extensions: for each PCR fragment, the standard part and the primers to use.
  • assembly_plan: for each design, the list of PCR fragments to use.
  • errors: list of errors to help troubleshooting assemblies for which no valid assembly plan was found.

Description of the example/testing sample

The example input SBOL is from an example file provided by @pablocarb, with a random sequence used for the terminator (Ter) part. The example has 48 designs, which form a representative sample. Some contain Type IIS sites that prevent Golden Gate assembly, and some use the same part more than once, which makes them a challenging scenario for scarless Golden Gate assembly.

The output example_output.xlsx shows the plan generated to build all the designs. The plan has:

  • 19 different base parts (provided by the SBOL input)
  • 125 primers to be ordered to extend the parts in various ways (fewer than 3 primers per design to build, thanks to primer reuse).
  • 116 fragments to be PCRed (fewer than 3 PCRs per design to build, thanks to fragment reuse).
  • 15 designs seem to need Gibson assembly as they contain BsaI, BsmBI, and BbsI sites; the rest use Golden Gate assembly.
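For the 48 designs, the figures above work out to 125 / 48 ≈ 2.6 primers and 116 / 48 ≈ 2.4 PCR fragments per design. The effect of reuse can be shown with a toy model. The sketch below assumes that a fragment is fully identified by a part and its two neighbours in a circular construct, because the overhangs added by PCR depend on those neighbours. That identification rule, and the function name plan_fragments, are illustrative assumptions, not the behaviour of the DnaWeaver supply network.

```python
def plan_fragments(designs):
    """Plan designs one after another, reusing fragments from earlier designs.

    `designs` maps a construct name to its ordered list of part names.
    Assumption: a fragment is identified by (previous part, part, next part)
    and the construct is circular, so the first and last parts are neighbours.
    """
    library = set()  # fragments already planned for earlier designs
    plan = {}
    for name, parts in designs.items():
        n = len(parts)
        needed = [(parts[i - 1], parts[i], parts[(i + 1) % n]) for i in range(n)]
        new = []
        for fragment in needed:
            if fragment not in library:
                library.add(fragment)
                new.append(fragment)
        plan[name] = {"fragments": needed, "new": new}
    return plan, library


designs = {
    "design_1": ["P", "G1", "G2", "T"],
    "design_2": ["P", "G1", "G3", "T"],
}
plan, library = plan_fragments(designs)
print(len(plan["design_2"]["new"]))  # 3
```

In this toy input the second design needs four fragments but only three new PCRs, since the fragment for P flanked by T and G1 is already available from the first design.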

Limitations

For constructs with repeated parts and other homologies (such as, in the example, the designs with several "Ter" parts in a row), Gibson assembly (and probably LCR assembly too) may create misannealed constructs, so more clones will need to be picked. The script does not currently take this into account. For the extreme cases, this could be fixed by buying custom fragments from a commercial vendor (i.e. by amending the current implementation to forbid Gibson cuts in regions with homologies elsewhere, and adding a DNA vendor to the supply network).
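One simple way to locate the regions where a Gibson cut would be risky is to index every k-mer of the construct and flag those that occur more than once. The sketch below is not part of the script. The function names and the default k=20 are illustrative assumptions, and only the forward strand is scanned, so reverse-complement homologies are not detected.

```python
from collections import defaultdict


def repeated_regions(sequence, k=20):
    """Return {kmer: [start positions]} for k-mers occurring more than once."""
    positions = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        positions[sequence[start:start + k]].append(start)
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}


def risky_positions(sequence, k=20):
    """Return the set of positions covered by at least one repeated k-mer."""
    covered = set()
    for starts in repeated_regions(sequence, k).values():
        for start in starts:
            covered.update(range(start, start + k))
    return covered


# Toy example with a short k so the repeat is easy to see.
sequence = "AAAACCCCGGGGAAAATTTT"
print(repeated_regions(sequence, k=4))  # {'AAAA': [0, 12]}
```

A planner could then refuse to place Gibson junctions inside risky_positions, and fall back to an ordered custom fragment when no safe junction remains.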

Code organisation

  • script.py -- main script, can be run from the command line
  • methods/generate_supply_network.py -- implements the DnaWeaver supply network from the figure above.
  • methods/compute_all_constructs.py -- main loop to iterate over all constructs and get assembly plans using the supply network.
  • methods/write_output_spreadsheet.py -- method to write the data collected into the output spreadsheet.

Written by Zulko at the Edinburgh Genome Foundry.
