edinburgh-genome-foundry / dnacauldron

Simple cloning simulator (Golden Gate etc.) for single and combinatorial assemblies

Home Page: https://edinburgh-genome-foundry.github.io/DnaCauldron/

License: MIT License

Python 99.50% Pug 0.38% CSS 0.12%

cloning-simulator dna-assembly golden-gate molecular-biology synthetic-biology

dnacauldron's People

csawye01 sandyg05 hainesm6-learning e-tomato bijikyu antonkulaga kidaa1 guruace feldman4 jamesbagley

dnacauldron's Issues

Assembling fragments digested by different enzymes

Hi, thank you for the great package and documentation. I would like to simulate a Type2s-like assembly where the acceptor and the inserts are digested by different enzymes. In other words, the acceptor is digested with enzyme A, and the insert (or inserts) are digested with enzyme B. Is it possible to simulate that using dnacauldron? Thanks

Check parts assembling into enzyme sites

In this hypothetical scenario, the ends of two parts assemble into an enzyme site. If the overhang is in the middle 4 basepairs, the enzyme site will not be detected in the individual parts:

enzyme_site = "CGTCTC"
overhang = "GTCT"
part1 = "NNNNN" + "CGTCT"
part2 = "GTCTC" + "NNNNN"

Admittedly, this is an extreme edge case, but it may be useful to implement a check on the final construct, possibly using Biopython's Bio.Restriction. Example of a final construct containing an enzyme site: assembly_simulation.zip
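The check described above can be sketched in plain Python, without Biopython. This is only an illustration: `reverse_complement` and `find_sites` are hypothetical helpers, not part of dnacauldron, and the ligation is modelled by hand as joining the two parts on their shared 4-bp overhang.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_sites(sequence, site, circular=False):
    """List (position, pattern) hits of a site on either strand."""
    sequence = sequence.upper()
    site = site.upper()
    # For circular constructs, also search across the origin.
    searched = sequence + sequence[: len(site) - 1] if circular else sequence
    hits = []
    for pattern in {site, reverse_complement(site)}:
        start = searched.find(pattern)
        while start != -1:
            hits.append((start, pattern))
            start = searched.find(pattern, start + 1)
    return sorted(hits)


enzyme_site = "CGTCTC"
part1 = "NNNNN" + "CGTCT"
part2 = "GTCTC" + "NNNNN"

# Assumption: ligation merges the shared 4-bp overhang "GTCT".
construct = part1 + part2[4:]

print(find_sites(part1, enzyme_site))      # no hit in the individual part
print(find_sites(part2, enzyme_site))      # no hit in the individual part
print(find_sites(construct, enzyme_site))  # the site appears after assembly
```

Running the same search on every final construct, rather than on the parts only, would catch sites that are created at a junction.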

pdf_reports required but not installed by pip

TypeError: StickyEndSeq.reverse_complement() got an unexpected keyword argument 'inplace'

I sometimes get the following error:

TypeError: StickyEndSeq.reverse_complement() got an unexpected keyword argument 'inplace'

I use the latest version of DnaCauldron, Biopython (1.8.1), and Python 3.10. The stack trace is:

TypeError                                 Traceback (most recent call last)
Input In [8], in <cell line: 5>()
      1 #repository.import_records(folder=str(twist / "A_round"), use_file_names_as_ids=True, topology="circular")
      2 B_assembly_plan = dc.AssemblyPlan.from_spreadsheet(
      3     assembly_class=dc.Type2sRestrictionAssembly, path= str(locations.algae / "B_round_2.csv")
      4 )
----> 5 B_simulation = B_assembly_plan.simulate(sequence_repository=repository)
      6 print("Assembly stats:", B_simulation.compute_stats())
      7 B_simulation.compute_summary_dataframe()

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/AssemblyPlan/AssemblyPlan.py:257, in AssemblyPlan.simulate(self, sequence_repository)
    255 if assembly.name in cancelled_assemblies:
    256     continue
--> 257 simulation_result = assembly.simulate(sequence_repository)
    258 simulation_results.append(simulation_result)
    259 for record in simulation_result.construct_records:

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/Assembly/builtin_assembly_classes/Type2sRestrictionAssembly.py:180, in Type2sRestrictionAssembly.simulate(self, sequence_repository, annotate_parts_homologies)
    177 warnings = []
    179 records = sequence_repository.get_records(self.parts)
--> 180 mix = generate_type2s_restriction_mix(
    181     records, enzyme=self.enzyme, name="type2s_mix"
    182 )
    183 if self.enzyme == "auto":
    184     self.enzyme = str(mix.enzymes[0])  # it has been autoselected!

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/AssemblyMix/RestrictionLigationMix.py:101, in generate_type2s_restriction_mix(parts, enzyme, name)
     99 if enzyme == "auto":
    100     enzyme = autoselect_enzyme(parts)
--> 101 return RestrictionLigationMix(
    102     parts=parts,
    103     enzymes=[enzyme],
    104     fragment_filters=[NoRestrictionSiteFilter(str(enzyme))],
    105     name=name,
    106 )

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/AssemblyMix/RestrictionLigationMix.py:69, in RestrictionLigationMix.__init__(self, parts, enzymes, fragments, fragment_filters, name, annotate_fragments_with_parts)
     67 self.name = name
     68 self.annotate_fragments_with_parts = annotate_fragments_with_parts
---> 69 self.initialize()

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/AssemblyMix/AssemblyMix.py:30, in AssemblyMix.initialize(self)
     28 if not hasattr(self, "fragments") or self.fragments is None:
     29     self.compute_fragments()
---> 30 self.compute_reverse_fragments()
     31 self.compute_connections_graph()

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/AssemblyMix/mixins/FragmentsMixin.py:33, in FragmentsMixin.compute_reverse_fragments(self)
     31 for fragment in self.fragments:
     32     fragment.is_reversed = False
---> 33     new_fragment = fragment.reverse_complement()
     34     new_fragment.is_reversed = True
     35     new_fragment.reverse_fragment = fragment

File ~/micromamba/envs/cf-cloning/lib/python3.10/site-packages/dnacauldron/Fragment/Fragment.py:34, in Fragment.reverse_complement(self)
     31 """Reverse-complement the fragment while keeping its class"""
     32 # Note: if the fragment has a StickyEndSeq, it will be properly
     33 # reverse-complemented with its ends
---> 34 new_record = SeqRecord.reverse_complement(self)
     35 new_record.__class__ = self.__class__
     36 return new_record

File ~/.local/lib/python3.10/site-packages/Bio/SeqRecord.py:1238, in SeqRecord.reverse_complement(self, id, name, description, features, annotations, letter_annotations, dbxrefs)
   1233     seq = self.seq.reverse_complement_rna(
   1234         inplace=False
   1235     )  # TODO: remove inplace=False
   1236 else:
   1237     # Default to DNA)
-> 1238     seq = self.seq.reverse_complement(
   1239         inplace=False
   1240     )  # TODO: remove inplace=False
   1241 if isinstance(self.seq, MutableSeq):
   1242     seq = Seq(seq)

TypeError: StickyEndSeq.reverse_complement() got an unexpected keyword argument 'inplace'
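The traceback shows SeqRecord.reverse_complement calling self.seq.reverse_complement(inplace=False), while the StickyEndSeq override apparently does not accept that keyword. The toy classes below reproduce this kind of mismatch and show one way to make an override call-compatible. They are stand-ins invented for illustration, not the Biopython or dnacauldron classes.

```python
class ToySeq:
    """Stand-in for a base sequence class whose method accepts `inplace`."""

    _complement = str.maketrans("ACGT", "TGCA")

    def __init__(self, data):
        self.data = data

    def _revcomp(self):
        return type(self)(self.data.translate(self._complement)[::-1])

    def reverse_complement(self, inplace=False):
        return self._revcomp()


class OldStyleSubclass(ToySeq):
    # Override written without the `inplace` keyword.
    def reverse_complement(self):
        return self._revcomp()


class CompatibleSubclass(ToySeq):
    # Accepting (and ignoring) `inplace` keeps the override call-compatible.
    def reverse_complement(self, inplace=False):
        return self._revcomp()


def record_reverse_complement(seq):
    # Mirrors the call in the traceback: the caller passes inplace=False.
    return seq.reverse_complement(inplace=False)


try:
    record_reverse_complement(OldStyleSubclass("ATGC"))
    old_style_error = None
except TypeError as error:
    old_style_error = str(error)

fixed = record_reverse_complement(CompatibleSubclass("ATGC"))
print(old_style_error)
print(fixed.data)
```

Whether the right fix is in the override or in pinning a compatible Biopython version depends on the installed versions, which this sketch does not address.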

The tutorial uses parts as a keyword argument for SequenceRepository, but it is now collections, a dict of dicts

Unwanted case-sensitivity when indexing fragment against source record

In StickyEndFragment.list_from_record_digestion (line 142), add an upper() conversion, changing the existing
index = record.seq.find(fragment)
to the fixed
index = record.seq.upper().find(fragment)

The current implementation is broken for source records with a lowercase Seq.
If these sequences are required to be uppercase somewhere, then add a useful error message here instead.
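Both options above (case-insensitive lookup, or a clear error) can be sketched on plain strings. `find_fragment_index` is a hypothetical helper for illustration only; it is not the dnacauldron method and works on str rather than Seq objects.

```python
def find_fragment_index(record_sequence, fragment):
    """Locate a fragment in a record sequence, ignoring letter case.

    Raises ValueError with an explicit message when nothing is found,
    instead of silently returning -1.
    """
    haystack = str(record_sequence).upper()
    needle = str(fragment).upper()
    index = haystack.find(needle)
    if index == -1:
        raise ValueError(
            f"Fragment {needle[:20]!r} not found in the source record "
            "(comparison was case-insensitive)."
        )
    return index


# A lowercase source record no longer breaks the lookup.
print(find_fragment_index("aaacgtctcaaa", "CGTCTC"))
```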

problem with snapgene_reader

Installing with pip3 fails with the following error for snapgene_reader:

Using cached https://files.pythonhosted.org/packages/f6/90/bfd3cc66c6e5045f4156b695708dcec715e9ff18d4aef9493fd7f966f5aa/snapgene_reader-0.1.15.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "", line 1, in
File "/tmp/pip-build-kl5wwig5/snapgene-reader/setup.py", line 13, in
long_description=open('README.rst').read(),
File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 1031: ordinal not in range(128)

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-kl5wwig5/snapgene-reader/
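The traceback points at setup.py reading README.rst with the ASCII codec, which fails on byte 0xc2. The sketch below reproduces that class of error on a temporary file and shows that passing an explicit encoding to open() avoids it. The file content is made up for the demonstration, and whether this matches the actual README is an assumption.

```python
import os
import tempfile

# A non-breaking space is encoded as bytes 0xC2 0xA0 in UTF-8.
text = "snapgene\u00a0reader"

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "README.rst")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)

    # Reading with the ASCII codec fails on the first non-ASCII byte.
    try:
        with open(path, encoding="ascii") as handle:
            handle.read()
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True

    # Reading with an explicit UTF-8 encoding succeeds.
    with open(path, encoding="utf-8") as handle:
        decoded = handle.read()

print(ascii_failed, decoded == text)
```

On the user side, the default codec used by open() follows the locale, so installing under a UTF-8 locale may also avoid the error.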

Annotating features with qualifiers

I could be doing something wrong, but adding qualifiers in dc.biotools.record_operations.annotate_record doesn't seem to work for me. Instead, the qualifiers are just added to /note. Annotating with the same qualifiers dict in SeqFeature works fine for me, though.

annotations crossing overhangs lost

Just started playing around with DnaCauldron; super cool module! One thing that isn't a showstopper, but would be nice to resolve, is that annotations crossing overhangs seem to be lost. It would be nice if all of those were kept. Is this a possibility? I looked for an option in the source but didn't see anything.

Test fails with Biopython v1.79

Test test_hierarchical_biobrick fails with Biopython v1.79.

In v1.79 "The Seq and MutableSeq classes in Bio.Seq now store their sequence contents as bytes and bytearray objects, respectively." ( https://github.com/biopython/biopython/blob/master/NEWS.rst#3-june-2021-biopython-179 )

Following the call chain BioBrickStandardAssembly -> StickyEndFragment.list_from_record_digestion() -> StickyEndSeq.list_from_sequence_digestion -> overhang_bit = fragments[0][:overhang],
we see that this causes slicing to return incorrect sequences, with the bytestring wrapped in b' and '. This problem occurs only when 2 enzymes are used (also relevant for #15).

Reproducible example:

from dnacauldron.Fragment.StickyEndFragment.StickyEndSeq import StickyEndSeq
f = StickyEndSeq('CTAGTAAAAAAAAAAA')
f[:4]
# (None-b'CTAG'-None)

The best solution seems to be to implement a StickyEndSeq.slice(from, to) method that slices on the to_standard_sequence(discard_sticky_ends=True) export and creates a new sticky end seq.
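One plausible mechanism for the b'...' wrapper is a str() conversion of bytes data somewhere in the slicing path; this is an assumption, not something confirmed from the dnacauldron source. The sketch shows that effect and a toy version of the proposed slice method. `ToyStickyEndSeq` is hypothetical and only mimics the repr shown above.

```python
raw = b"CTAGTAAAAAAAAAAA"  # bytes, as Biopython 1.79 stores sequence data

# str() on a bytes slice keeps the b'...' wrapper in the text.
wrapped = str(raw[:4])
# Decoding yields the plain sequence text instead.
plain = raw[:4].decode("ascii")


class ToyStickyEndSeq:
    """Toy stand-in: a plain-text sequence with optional sticky ends."""

    def __init__(self, sequence, left_end=None, right_end=None):
        self.sequence = sequence
        self.left_end = left_end
        self.right_end = right_end

    def slice(self, start, stop):
        # Slice the plain text and build a new object without sticky ends.
        return ToyStickyEndSeq(self.sequence[start:stop])

    def __repr__(self):
        return f"({self.left_end}-{self.sequence}-{self.right_end})"


sliced = ToyStickyEndSeq("CTAGTAAAAAAAAAAA").slice(0, 4)
print(wrapped, plain, sliced)
```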

example with ladders

It would be nice to have a report in which restriction-analysis ladders are also generated. I know that people use PyDNA for this, but it is not clear to me how to combine a DnaCauldron assembly with ladders from PyDNA. An example would be useful.

Linear assemblies?

I am trying to simulate a three-fragment Golden Gate-type assembly where the end product is linear DNA rather than circular DNA.

If I run the code below with three fragments suitable for circular assembly, I get the expected output:

circular connection graph pdf
final assembly genbank file
summary.csv with one correct assembly consisting of frag1, frag2 and frag3, and no error.csv file

If I run the code with three fragments that are suitable for linear assembly but cannot be circularized, I get:

linear connection graph pdf
no final assembly genbank file
no summary.csv file, and error.csv contains "unused_parts: FRAG1 & FRAG2 & FRAG3"

Is there a way to specify that I am expecting a linear assembly and not a circular one?

import dnacauldron as dc

# Load the parts from a FASTA file into a repository
repository = dc.SequenceRepository()
repository.import_records(files=["seqs.fa"])

parts_list = list(repository.collections["parts"])

assembly = dc.Type2sRestrictionAssembly(
    name="golden_gate",
    parts=parts_list,
    expected_constructs="any_number",
)

simulation = assembly.simulate(sequence_repository=repository)

report_writer = dc.AssemblyReportWriter(
    include_mix_graphs=True, include_part_plots=True
)

simulation.write_report(target="output", report_writer=report_writer)

golden_gate_type2s_mix_parts_graph.pdf

Palindromes in Golden Gate assembly, only want one construct

Hello. I have a palindrome overhang in one of my parts. This is causing two constructs to get produced. I have to use this palindrome for my application.

This gives me two assemblies: a correct one with the vector and all the parts, and an incorrect one with the vector placed twice with several parts repeated. I tried creating a simple assembly plan, but I still get the same result.

I can set max_constructs=1, and at least for this example I get the shorter/simpler assembly that I want. But I'm not sure I can guarantee this is how it will behave all the time. Or does it?

Thanks for any help you can provide!
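A palindromic overhang is one that equals its own reverse complement, so a fragment end carrying it can ligate to another copy of the same end. The small check below can flag such overhangs before simulating. `is_palindromic` is a hypothetical helper, not a dnacauldron function.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def is_palindromic(overhang):
    """True when the overhang is identical to its reverse complement."""
    return overhang.upper() == reverse_complement(overhang)


for overhang in ["GATC", "AATT", "GTCT"]:
    print(overhang, is_palindromic(overhang))
```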

Choosing enzyme in AssemblyPlan.from_spreadsheet()

When the backbone contains 2 sets of correctly oriented enzyme sites (BsaI and BsmBI), then autoselect may not correctly choose the one we want to use.
This can be addressed by specifying an enzyme for each assembly:

assembly_plan = dc.AssemblyPlan.from_spreadsheet(
    name="Assembly",
    path=assembly_plan_path,
    assembly_class=dc.Type2sRestrictionAssembly
)
for assembly in assembly_plan.assemblies:
    assembly.enzyme = "BsaI"

So this is not an issue in DNA Cauldron, but it causes a bug in CUBA, where the enzyme selected in the dropdown menu is ignored.

A possible solution is to make a note about this in the documentation and add the above code in CUBA.

Purpose of OligoPairAnnealing class (oligo_annealing method in assembly plan)

Is this used only for BASIC assembly adapters?

What is the best way to simulate assembly of DNA Weaver's oligo output? Currently using Gibson (gibson_oligo) to simulate their assembly.
