GithubHelp home page GithubHelp logo

Comments (18)

Zulko avatar Zulko commented on June 26, 2024 1

Sorry, I went too fast. You are right, your example did produce the error you reported, and the fix you suggested also works, and is now merged in master, so if you upgrade your example should work.

pip3 install [email protected]:Edinburgh-Genome-Foundry/DnaWeaver.git

Apparently you were the first user to have a case with 0 orders, and wanting a report out of it!

from dnaweaver.

aaroncooper avatar aaroncooper commented on June 26, 2024 1

Awesome, thanks! Happy to oblige and break some code :)

I changed my docker build to pull directly from github (unfortunately I have to install git in the first place now, but that's no big deal). It's now working-- thanks for the quick fix.

From earlier in our conversation, sounds like you're thinking of how to work better with circular DNA. I think I can just set the origin on a junction for the time being to work around that, but it'd be fantastic for a future release!

from dnaweaver.

Zulko avatar Zulko commented on June 26, 2024

Hey, thanks for your interest in DNA Weaver! The library is still undertested for circular sequences afaik, but there should be ways to make it work as you intend.

I believe I understood the issue correctly, and I agree with your approach of providing the origin as a "part" (this I almost equivalent to saying "I have the backbone already"). I am not completely sure why this doesn't work in your example, i.e. why the fragment is not recognized. Do you have a minimal script (ideally with short-ish sequences) to reproduce the issue?

Another and perhaps simpler solution uses the cut_location_constraints parameter to forbid cutting inside a sequence of interest:

uncuttable_sequence = 'CGGGTGCTTGGTGAGACAGCGTATTTTGC'

def cut_location_constraint(sequence):
    """Prevent an assembly station from cutting inside the uncuttable_sequence"""
    index = sequence.find(uncuttable_sequence)
    if index == -1:
        return lambda index: True  # allow any cut
    start, end = index, index + len(uncuttable_sequence)
    return lambda index: not (start <= index <= end)  # refuse cuts on that segment


assembly_station = dw.DnaAssemblyStation(
    name="Gibson Assembly Station",
    assembly_method=dw.GibsonAssemblyMethod(
        overhang_selector=dw.TmSegmentSelector(min_tm=55, max_tm=70),
        min_segment_length=500,
        max_segment_length=4000,
        cut_location_constraints=[cut_location_constraint]  # <=========
    ),
    supplier=[...],
    ...
)

On another note, if you want to check the solutions returned by DnaWeaver, you can use DnaCauldron, which will take your fragments, your enzyme, and will check that the restriction-ligation will produce the expected construct.

import dnacauldron as dc

repository = dc.SequenceRepository(parts={"f1": fragment_1, "part_B":...})
assembly = dc.Type2sRestrictionAssembly(parts_names)
simulation = assembly.simulate(sequence_repository=repository)
assert len(simulation.construct_records) == 1
predicted_construct_record = simulation.construct_records[0]

from geneblocks import  sequences_are_circularly_equal
sequences_are_circularly_equal([predicted_construct_record, my_reference_sequence])

from dnaweaver.

aaroncooper avatar aaroncooper commented on June 26, 2024

Thanks for the incredibly quick response! I think my issue might be caused by GG fragments not being recognized properly-- I made a test case where the result can be made by GG of just two fragments in the parts library, but the script is suggesting I order synthetic DNA for the whole construct. Any idea what I'm doing wrong?
testing_GG.zip

from dnaweaver.

veghp avatar veghp commented on June 26, 2024

Thank you for bringing attention to this.
I can confirm that your fasta sequences do not assemble with https://cuba.genomefoundry.org/simulate_gg_assemblies .

If you convert them to genbank using https://cuba.genomefoundry.org/convert_sequence_files
then the assembly works (note that the names need to be shortened to 20 char to fit the genbank format):
simulated

I'll look into the DNA Weaver part of the problem. I suspect it may be related to the conversion of fasta to BioPython records.

from dnaweaver.

aaroncooper avatar aaroncooper commented on June 26, 2024

Hey Peter, thanks for checking this out. Are you saying I need to convert the parts files to GenBank? just wondering how I would import them if so-- looks like the GoldenGatePartsLibrary constructor only allows fasta files.

from dnaweaver.

Zulko avatar Zulko commented on June 26, 2024

@veghp what you are observing with DnaCauldron might be an issue with the mix of upper case and lower case nucleotides in the sequences. It is different from the issue with DNA Weaver.

@aaroncooper The main issue in your files is that the part sequences in your fasta should have a N for the "wildcard" nucleotide, like in this example. For instance the first part sequence should read cgtctcNATGG...TAANgagacg. I know, it's not super practical and could certainly be improved in the future.

Once you fix that, you will see that DNA Weaver is not always great at finding the best strategies for the edges of the sequence, and in particular it may not manage to use a DNA part if it is located at the very beginning of the sequence. This can be solved by asking Weaver to find strategies for different "rotated" versions of your record, which is slower but works. So the end of your script becomes:

best_quote = None
# try successive rotations of the record, every 200bp
for rotation in range(0, len(record), 200):
    rotated_record = record[rotation:] + record[:rotation]
    desired_sequence = dw.SequenceString.from_record(rotated_record, topology='circular')
    assembly_station.prepare_network_on_sequence(desired_sequence)
    quote = assembly_station.get_quote(desired_sequence)
    if quote.accepted and ((best_quote is None) or (quote.price < best_quote.price)):
        best_quote = quote
print (best_quote.assembly_step_summary())

(for memo @veghp : I did that recently for the synbiocad project and it worked fine)

from dnaweaver.

aaroncooper avatar aaroncooper commented on June 26, 2024

Got it, that makes sense. Representing circular DNA is super annoying, eh?

I actually just got done playing around with this more-- I had surmised that the N might be necessary, as the EMMA example was running fine for me and that was the only difference I could perceive. That does seem to fix the issue! I did run into another bug, very minor, which makes sense-- price of 0 is nonsensical:

Ordering plan (Golden Gate Assembly Station)::                                                                              
  0-722: From GG_parts - price 0.00 - lead_time 0.0 - Part: test_EGFP_GG_source
  722-2649: From GG_parts - price 0.00 - lead_time 0.0 - Part: test_pUC57mini_GG_vector
Price:0.00, total lead_time:0.0
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2646, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Price'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "dnaweaver_GG_test.py", line 89, in <module>
    report.write_full_report("report.zip")
  File "/usr/local/lib/python3.7/site-packages/dnaweaver/AssemblyPlanReport/mixins/FolderReportMixin.py", line 15, in write_full_report
    self.write_pdf_report(root._file("assembly_report.pdf").open("wb"))
  File "/usr/local/lib/python3.7/site-packages/dnaweaver/AssemblyPlanReport/mixins/PdfReportMixin.py", line 177, in write_pdf_report
    html_out = self.make_html_report()
  File "/usr/local/lib/python3.7/site-packages/dnaweaver/AssemblyPlanReport/mixins/PdfReportMixin.py", line 142, in make_html_report
    cost_orders = orders_dataframe["Price"].sum()
  File "/usr/local/lib/python3.7/site-packages/pandas/core/frame.py", line 2800, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/usr/local/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2648, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Price'
root@192ba9ba9de8:~/test/testing_GG# 

from dnaweaver.

aaroncooper avatar aaroncooper commented on June 26, 2024

Sorry that this has bled into another issue, and I'm happy to open a different one if you think appropriate. I have tried defining a non-zero price_per_part but it is still crashing for this KeyError. Defining a non-zero cost for the GoldenGateAssemblyMethod does not rescue it either.

from dnaweaver.

Zulko avatar Zulko commented on June 26, 2024

Hmm not sure what causes this error I have never seen it before, do you have a minimal example? And ok for a new issue (because closing issues is good for moral :D )

from dnaweaver.

veghp avatar veghp commented on June 26, 2024

I tried the solution proposed by @Zulko and it works:
assembly_report.pdf

One shortcoming is that DNA Weaver will cut the section where the "new" origin is into two orders. This also means it will not recognize parts from your library, for that section.

I did not come across the above KeyError problem. In the new issue can you please confirm which version of DNA Weaver you are using (PyPI or Github)?

from dnaweaver.

Zulko avatar Zulko commented on June 26, 2024

@veghp The assembly report you posted shows a solution that doesn't use the parts library, but with the rotation strategy I suggest above this is solved and the final best_quote uses the parts library.

from dnaweaver.

veghp avatar veghp commented on June 26, 2024

Yes, sorry. Here is the correct report:
assembly_report.pdf

from dnaweaver.

aaroncooper avatar aaroncooper commented on June 26, 2024

Here's some code and input with the price error.
testing_GG.zip

I'm using dnaweaver from PyPI, just installed it a few days ago. I looked back at my docker image build and it seems to be 0.3.4. (I tried looking at the __version__ attribute of dnaweaver in the python interpreter, but it doesn't seem to have that attribute.)

from dnaweaver.

aaroncooper avatar aaroncooper commented on June 26, 2024

I couldn't help looking into it a bit, and it seems like in PdfReportMixin, orders_dataframe, unlike pcr_dataframe and parts_dataframe, has no explicit column definition at time of dataframe creation. Perhaps adding that would fix this?

from dnaweaver.

Zulko avatar Zulko commented on June 26, 2024

It is strange that it works for both @veghp and I, but not for you. What is your pandas version? Maybe it is a python3.7 thing, or a pandas version problem (do you have pandas >1.0?). In last resort you can also try installing the Github master with pip3 install [email protected]:Edinburgh-Genome-Foundry/DnaWeaver.git.

from dnaweaver.

aaroncooper avatar aaroncooper commented on June 26, 2024

Oh, interesting-- the last example I sent is working for you without the Price key error?

I'm on pandas 1.0.3.

from dnaweaver.

Zulko avatar Zulko commented on June 26, 2024

👍 Closing this thread. Please report any other bugs/difficulties you find!

For circular DNA, the rotation trick above should work in most cases, but it is still weak in edge cases like "a whole plasmid made only of existing parts", for which a more direct method may be better adapted (e.g. identify the required parts directly using a blast search).

from dnaweaver.

Related Issues (12)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.