edinburgh-genome-foundry / dnacauldron

Simple cloning simulator (Golden Gate etc.) for single and combinatorial assemblies

Home Page: https://edinburgh-genome-foundry.github.io/DnaCauldron/

License: MIT License

Languages: Python 99.50%, Pug 0.38%, CSS 0.12%
Topics: cloning-simulator, dna-assembly, golden-gate, molecular-biology, synthetic-biology

dnacauldron's People

Contributors

veghp, zulko

dnacauldron's Issues

Assembling fragments digested by different enzymes

Hi, thank you for the great package and documentation. I would like to simulate a Type2s-like assembly where the acceptor and the inserts are digested by different enzymes. In other words, the acceptor is digested with enzyme A, and the insert (or inserts) are digested with enzyme B. Is it possible to simulate that using dnacauldron? Thanks

Check parts assembling into enzyme sites

In this hypothetical scenario, the ends of two parts assemble into an enzyme site. If the overhang is in the middle 4 basepairs, the enzyme site will not be detected in the individual parts:

enzyme_site = "CGTCTC"
overhang = "GTCT"
part1 = "NNNNN" + "CGTCT"
part2 = "GTCTC" + "NNNNN"

Admittedly, this is an extreme edge case, but it may be useful to implement a check on the final construct, possibly using Biopython's Bio.Restriction module. Example of a final construct containing an enzyme site: assembly_simulation.zip
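
The kind of check suggested here can be sketched in plain Python, without dnacauldron or Biopython. This is only an illustration under stated assumptions: the helper names (reverse_complement, join_at_overhang, contains_site) are hypothetical and not part of any library, and the join simply merges two strings on a shared overhang rather than simulating a digestion.

```python
# Hypothetical helpers for illustration; not part of dnacauldron or Biopython.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def join_at_overhang(left, right, overhang):
    """Merge two parts that share the given overhang at their junction."""
    if not (left.endswith(overhang) and right.startswith(overhang)):
        raise ValueError("parts do not share the overhang")
    return left + right[len(overhang):]


def contains_site(seq, site):
    """Check both strands of seq for the recognition site."""
    seq = seq.upper()
    return site in seq or reverse_complement(site) in seq


enzyme_site = "CGTCTC"
overhang = "GTCT"
part1 = "NNNNN" + "CGTCT"
part2 = "GTCTC" + "NNNNN"

construct = join_at_overhang(part1, part2, overhang)
site_in_parts = contains_site(part1, enzyme_site) or contains_site(part2, enzyme_site)
site_in_construct = contains_site(construct, enzyme_site)
```

With the values from the issue, neither part contains the site on its own, while the joined construct does, which is why the check has to run on the final construct.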

TypeError: StickyEndSeq.reverse_complement() got an unexpected keyword argument 'inplace'

I sometimes get the following error:

TypeError: StickyEndSeq.reverse_complement() got an unexpected keyword argument 'inplace'

I am using the latest version of DnaCauldron, Biopython (1.8.1) and Python 3.10. The stack trace is:

TypeError                                 Traceback (most recent call last)
Input In [8], in <cell line: 5>()
      1 #repository.import_records(folder=str(twist / "A_round"), use_file_names_as_ids=True, topology="circular")
      2 B_assembly_plan = dc.AssemblyPlan.from_spreadsheet(
      3     assembly_class=dc.Type2sRestrictionAssembly, path= str(locations.algae / "B_round_2.csv")
      4 )
----> 5 B_simulation = B_assembly_plan.simulate(sequence_repository=repository)
      6 print("Assembly stats:", B_simulation.compute_stats())
      7 B_simulation.compute_summary_dataframe()

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/AssemblyPlan/AssemblyPlan.py:257, in AssemblyPlan.simulate(self, sequence_repository)
    255 if assembly.name in cancelled_assemblies:
    256     continue
--> 257 simulation_result = assembly.simulate(sequence_repository)
    258 simulation_results.append(simulation_result)
    259 for record in simulation_result.construct_records:

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/Assembly/builtin_assembly_classes/Type2sRestrictionAssembly.py:180, in Type2sRestrictionAssembly.simulate(self, sequence_repository, annotate_parts_homologies)
    177 warnings = []
    179 records = sequence_repository.get_records(self.parts)
--> 180 mix = generate_type2s_restriction_mix(
    181     records, enzyme=self.enzyme, name="type2s_mix"
    182 )
    183 if self.enzyme == "auto":
    184     self.enzyme = str(mix.enzymes[0])  # it has been autoselected!

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/AssemblyMix/RestrictionLigationMix.py:101, in generate_type2s_restriction_mix(parts, enzyme, name)
     99 if enzyme == "auto":
    100     enzyme = autoselect_enzyme(parts)
--> 101 return RestrictionLigationMix(
    102     parts=parts,
    103     enzymes=[enzyme],
    104     fragment_filters=[NoRestrictionSiteFilter(str(enzyme))],
    105     name=name,
    106 )

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/AssemblyMix/RestrictionLigationMix.py:69, in RestrictionLigationMix.__init__(self, parts, enzymes, fragments, fragment_filters, name, annotate_fragments_with_parts)
     67 self.name = name
     68 self.annotate_fragments_with_parts = annotate_fragments_with_parts
---> 69 self.initialize()

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/AssemblyMix/AssemblyMix.py:30, in AssemblyMix.initialize(self)
     28 if not hasattr(self, "fragments") or self.fragments is None:
     29     self.compute_fragments()
---> 30 self.compute_reverse_fragments()
     31 self.compute_connections_graph()

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/AssemblyMix/mixins/FragmentsMixin.py:33, in FragmentsMixin.compute_reverse_fragments(self)
     31 for fragment in self.fragments:
     32     fragment.is_reversed = False
---> 33     new_fragment = fragment.reverse_complement()
     34     new_fragment.is_reversed = True
     35     new_fragment.reverse_fragment = fragment

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/Fragment/Fragment.py:34, in Fragment.reverse_complement(self)
     31 """Reverse-complement the fragment while keeping its class"""
     32 # Note: if the fragment has a StickyEndSeq, it will be properly
     33 # reverse-complemented with its ends
---> 34 new_record = SeqRecord.reverse_complement(self)
     35 new_record.__class__ = self.__class__
     36 return new_record

File ~/.local/lib/python3.10/site-packages/Bio/SeqRecord.py:1238, in SeqRecord.reverse_complement(self, id, name, description, features, annotations, letter_annotations, dbxrefs)
   1233     seq = self.seq.reverse_complement_rna(
   1234         inplace=False
   1235     )  # TODO: remove inplace=False
   1236 else:
   1237     # Default to DNA)
-> 1238     seq = self.seq.reverse_complement(
   1239         inplace=False
   1240     )  # TODO: remove inplace=False
   1241 if isinstance(self.seq, MutableSeq):
   1242     seq = Seq(seq)

TypeError: StickyEndSeq.reverse_complement() got an unexpected keyword argument 'inplace'
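
The traceback shows Biopython calling reverse_complement(inplace=False) on the sequence, while the StickyEndSeq override does not accept that keyword. The mismatch can be reproduced with simplified stand-in classes. Everything below is hypothetical (BaseSeq, OldStickySeq, PatchedStickySeq are not the real classes), and accepting the keyword in the override is just one possible way to resolve it, not necessarily the fix the maintainers chose.

```python
class BaseSeq:
    """Simplified stand-in for a sequence base class (hypothetical)."""

    def __init__(self, data):
        self.data = data

    def reverse_complement(self, inplace=False):
        table = str.maketrans("ACGT", "TGCA")
        return type(self)(self.data.translate(table)[::-1])


class OldStickySeq(BaseSeq):
    """Override written before the caller started passing 'inplace'."""

    def reverse_complement(self):
        return BaseSeq.reverse_complement(self)


class PatchedStickySeq(BaseSeq):
    """Override that accepts and forwards the keyword."""

    def reverse_complement(self, inplace=False):
        return BaseSeq.reverse_complement(self, inplace=inplace)


def record_reverse_complement(seq):
    """Mimics the caller in the traceback, which passes inplace=False."""
    return seq.reverse_complement(inplace=False)


try:
    record_reverse_complement(OldStickySeq("AACG"))
    old_error = None
except TypeError as error:
    old_error = str(error)

patched_result = record_reverse_complement(PatchedStickySeq("AACG")).data
```

The old override fails with the same kind of "unexpected keyword argument 'inplace'" message, while the patched one returns the reverse complement.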

Unwanted case-sensitivity when indexing fragment against source record

In StickyEndFragment.list_from_record_digestion (line 142), add an upper() conversion, changing the existing
index = record.seq.find(fragment)
to
index = record.seq.upper().find(fragment)

The current implementation is broken for source records with a lowercase Seq.
If there is a requirement somewhere that these should be uppercase, then a useful error message should be added here instead.
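
The effect can be illustrated with plain Python strings standing in for the record sequence and the fragment. This is a sketch under that assumption, not the library code, and find_fragment is a hypothetical helper; it uppercases both sides, which also covers the case where the fragment itself is lowercase.

```python
def find_fragment(source, fragment):
    """Case-insensitive search for a fragment in a source sequence."""
    return source.upper().find(fragment.upper())


source = "aaccggtt"   # lowercase source record
fragment = "CCGG"     # uppercase fragment, e.g. after digestion

case_sensitive_index = source.find(fragment)
case_insensitive_index = find_fragment(source, fragment)
```

The case-sensitive search returns -1 (not found), while the normalized search finds the fragment at index 2.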

problem with snapgene_reader

Attempting to install with pip3 fails on the snapgene_reader dependency with the following error:

Using cached https://files.pythonhosted.org/packages/f6/90/bfd3cc66c6e5045f4156b695708dcec715e9ff18d4aef9493fd7f966f5aa/snapgene_reader-0.1.15.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-build-kl5wwig5/snapgene-reader/setup.py", line 13, in <module>
    long_description=open('README.rst').read(),
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 1031: ordinal not in range(128)

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-kl5wwig5/snapgene-reader/
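
The traceback suggests that setup.py reads README.rst with open() and no explicit encoding, so under an ASCII locale the non-ASCII byte 0xc2 cannot be decoded. A common way to avoid this class of error is to pass the encoding explicitly. The sketch below is an assumption about the cause, not the actual snapgene_reader fix; read_text_utf8 and demo are hypothetical names, and the file content is made up (the real character in the README is not known).

```python
import os
import tempfile


def read_text_utf8(path):
    """Read a text file as UTF-8 regardless of the system locale."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def demo():
    # Write a small file containing a non-ASCII character (the micro sign,
    # bytes 0xC2 0xB5 in UTF-8), then read it back with an explicit encoding.
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "README.rst")
        with open(path, "wb") as handle:
            handle.write(b"volume in \xc2\xb5l")
        return read_text_utf8(path)


text = demo()
```

On the user side, setting a UTF-8 locale in the shell before running pip3 is another commonly used workaround for this kind of error.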

Annotating features with qualifiers

I could be doing something wrong, but adding qualifiers in dc.biotools.record_operations.annotate_record doesn't seem to work for me. Instead, the qualifiers are just added to /note. Annotating with the same qualifiers dict in SeqFeature works fine for me, though.

annotations crossing overhangs lost

Just started playing around with DnaCauldron. Super cool module! One thing that isn't a showstopper but would be nice to resolve is that annotations crossing overhangs seem to be lost. It'd be nice if all of those were kept. Is this a possibility? I looked for an option in the source but didn't see anything.

Test fails with Biopython v1.79

Test test_hierarchical_biobrick fails with Biopython v1.79.

In v1.79 "The Seq and MutableSeq classes in Bio.Seq now store their sequence contents as bytes and bytearray objects, respectively." ( https://github.com/biopython/biopython/blob/master/NEWS.rst#3-june-2021-biopython-179 )

Following the call chain BioBrickStandardAssembly -> StickyEndFragment.list_from_record_digestion() -> StickyEndSeq.list_from_sequence_digestion -> overhang_bit = fragments[0][:overhang], we see that this causes slicing to return incorrect sequences, with the bytestring flanked by b' and '. This problem occurs only when 2 enzymes are used (also relevant for #15).

Reproducible example:

from dnacauldron.Fragment.StickyEndFragment.StickyEndSeq import StickyEndSeq
f = StickyEndSeq('CTAGTAAAAAAAAAAA')
f[:4]
# (None-b'CTAG'-None)
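
The b' and ' flanking is what Python produces when str() is applied to a bytes object instead of decoding it. The sketch below shows this with plain Python only; it does not touch dnacauldron or Biopython internals, the helper names are hypothetical, and it is an assumption that this is the mechanism behind the output above.

```python
def naive_to_text(data):
    """Convert with str(), which keeps the bytes literal markers."""
    return str(data)


def safe_to_text(data):
    """Decode bytes-like data to text; fall back to str() otherwise."""
    if isinstance(data, (bytes, bytearray)):
        return data.decode("ascii")
    return str(data)


overhang_bytes = b"CTAGTAAAAAAAAAAA"[:4]

naive = naive_to_text(overhang_bytes)
safe = safe_to_text(overhang_bytes)
```

Here naive is the string b'CTAG' including the literal markers, while safe is the four-letter sequence CTAG.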

The best solution seems to be to implement a StickyEndSeq.slice(from, to) method that slices on the to_standard_sequence(discard_sticky_ends=True) export and creates a new sticky end seq.

example with ladders

It would be nice to have a report in which restriction analysis ladders are also generated. I know that people use PyDNA for this, but it is not clear to me how to combine a DnaCauldron assembly with ladders from PyDNA. An example would be useful.

Linear assemblies?

I am trying to simulate a three-fragment Golden Gate type assembly where the end product is a linear DNA rather than a circular one.

If I run the code below with three fragments suitable for circular assembly, I get the expected output:

  • circular connection graph pdf
  • final assembly genbank file
  • summary.csv with one correct assembly consisting of frag1, frag2 and frag3, and no error.csv file

If I run the code with three fragments that are suitable for linear assembly but cannot be circularized, I get:

  • linear connection graph pdf
  • no final assembly genbank file
  • no summary.csv file, and error.csv contains "unused_parts: FRAG1 & FRAG2 & FRAG3"

Is there a way to specify that I am expecting a linear assembly and not a circular one?

import dnacauldron as dc

# Load the parts from a FASTA file into a sequence repository
repository = dc.SequenceRepository()
repository.import_records(files=["seqs.fa"])

parts_list = list(repository.collections["parts"])

# Golden Gate (Type IIS) assembly accepting any number of constructs
assembly = dc.Type2sRestrictionAssembly(
    name="golden_gate",
    parts=parts_list,
    expected_constructs="any_number",
)

simulation = assembly.simulate(sequence_repository=repository)

# Write the report, including mix graphs and part plots
report_writer = dc.AssemblyReportWriter(
    include_mix_graphs=True, include_part_plots=True
)
simulation.write_report(target="output", report_writer=report_writer)

golden_gate_type2s_mix_parts_graph.pdf

Palindromes in Golden Gate assembly, only want one construct

Hello. I have a palindromic overhang in one of my parts. This is causing two constructs to be produced. I have to use this palindrome for my application.

This gives me two assemblies: a correct one with the vector and all the parts, and an incorrect one with the vector placed twice with several parts repeated. I tried creating a simple assembly plan, but I still get the same result.

I can set max_constructs=1, and at least for this example I get the shorter/simpler assembly that I want. But I'm not sure I can guarantee this is how it will behave all the time. Or does it?

Thanks for any help you can provide!
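
For context, an overhang is palindromic when it equals its own reverse complement, which means a fragment end carrying it can anneal to an identical end on another copy of the same fragment. The sketch below is a plain-Python check for this property; the function names are hypothetical and not part of dnacauldron.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def is_palindromic(overhang):
    """True if the overhang equals its own reverse complement."""
    return overhang.upper() == reverse_complement(overhang)


# Example overhangs; these are illustrative, not taken from the issue.
results = {oh: is_palindromic(oh) for oh in ["GATC", "AATT", "ACGT", "GTCT"]}
```

Running such a check on all overhangs of a design can flag the ones likely to produce additional constructs like the duplicated-vector assembly described above.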

Choosing enzyme in AssemblyPlan.from_spreadsheet()

When the backbone contains 2 sets of correctly oriented enzyme sites (BsaI and BsmBI), then autoselect may not correctly choose the one we want to use.
This can be addressed by specifying an enzyme for each assembly:

assembly_plan = dc.AssemblyPlan.from_spreadsheet(
    name="Assembly",
    path=assembly_plan_path,
    assembly_class=dc.Type2sRestrictionAssembly
)
for assembly in assembly_plan.assemblies:
    assembly.enzyme = "BsaI"

So this is not an issue in DNA Cauldron itself, but it causes a bug in CUBA, where the enzyme selected in the dropdown menu is ignored.

A possible solution is to make a note about this in the documentation and add the above code in CUBA.
