Python scripting library for generating designs readable by scadnano.

Home Page: https://scadnano.org

License: MIT License

Python 73.15% SuperCollider 0.03% Shell 0.02% Scala 26.80%

dna-origami dna-structures cadnano dna-sequences

scadnano-python-package's People

Contributors

Stargazers

Watchers

Forkers

cgevans scadnano-test-user n-sarraf unhumbleben anelisecho

scadnano-python-package's Issues

fix documentation formatting on PyPI

The PyPI page for scadnano shows the markdown in README.md but does not format it properly. See if there is a way to specify something that is formatted properly.

imported cadnano design fails to export

Importing this cadnano design in the web interface, and then exporting it to cadnano, fails with the error

Error exporting file: '>' not supported between instances of 'NoneType' and 'int'

I haven't tried it yet from the command line with the Python library directly.

cadnano file attached as zip.

Double Layer Origami 12 LQ edits.zip

publish package to PyPI

Instructions here: https://medium.com/@joel.barmettler/how-to-upload-your-python-package-to-pypi-65edc5fe9c56

allow non-consecutive helices

Allow helix indices to be non-consecutive.

This implies changing the type of DNADesign.helices from List<Helix> to Map<int, Helix>.

This is necessary to allow cadnano designs to be imported while preserving helix indices, since cadnano allows non-consecutive helix indices.

See UC-Davis-molecular-computing/scadnano#171.

circular Strands

Add a Boolean field Strand.circular and draw a crossover from last substrand to first. Any equivalent cyclic permutation of the substrands should display in the same way.

Even if one doesn't wish the final design to have circular strands, this will help to avoid the problem that sometimes an intermediate design temporarily creates a circular Strand, even though subsequent edits make it linear again. Currently such designs are simply not allowed.

See UC-Davis-molecular-computing/scadnano#5.
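
One way to make equivalent cyclic permutations indistinguishable is to normalize the substrand list before comparing or drawing it. The sketch below is only an illustration: it assumes substrands are represented as (helix, forward, start, end) tuples, and the name canonical_rotation is hypothetical, not part of scadnano.

```python
from typing import List, Tuple

# Assumed representation of a substrand: (helix, forward, start, end).
Substrand = Tuple[int, bool, int, int]


def canonical_rotation(substrands: List[Substrand]) -> List[Substrand]:
    """Return the lexicographically smallest rotation of the substrand list."""
    if not substrands:
        return []
    rotations = [substrands[i:] + substrands[:i] for i in range(len(substrands))]
    return min(rotations)


a = [(0, True, 0, 8), (1, False, 0, 8), (2, True, 0, 8)]
b = [(1, False, 0, 8), (2, True, 0, 8), (0, True, 0, 8)]  # same cycle, rotated
print(canonical_rotation(a) == canonical_rotation(b))
```

Two circular strands then describe the same cycle exactly when their canonical rotations are equal.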

add more options for assigning DNA

Currently the only option is to give a custom sequence, or use rotation 5588 of the standard "p7249" M13 sequence.

In addition to p7249, there are also p7560 and p8064 (https://www.tilibit.com/collections/scaffold-dna). Furthermore, the user should be able to specify the rotation (default 5588 for M13).
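
A minimal sketch of what a user-specified rotation could mean, assuming the scaffold is treated as a circular sequence and the rotation is the index at which the linear sequence starts. The function name rotate_circular is hypothetical and is not part of the scadnano API; the short sequence is a toy stand-in for M13.

```python
def rotate_circular(seq: str, rotation: int) -> str:
    """Return seq rotated so that position `rotation` becomes the first base."""
    if not seq:
        return seq
    rotation %= len(seq)  # tolerate rotations larger than the sequence length
    return seq[rotation:] + seq[:rotation]


print(rotate_circular("ACGTTGCA", 3))
```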

custom image for DNA modifications

Currently DNA modifications can only be displayed as a string. Support a custom image to be specified, e.g., a star for a fluorophore.

support domain labels, strand labels

As in codenano: https://docs.rs/codenano/0.5.1/codenano/

re-arrange order of API documentation, and list classes somewhere (e.g., on the side)

Figure out how to configure Sphinx to re-arrange order of documentation in API.

Currently the API documentation puts things in the same order as the source code, which itself is not easy to change because of some dependencies of later type declarations on previous type declarations.

Also, see if there's a way to list the classes (Strand, Domain, etc.) on the side.

add StrandBuilder.with_domain_sequence

Also with_sequence, but with_domain_sequence can do a partial assignment of a DNA sequence.

Helix created with major_ticks but no max_offset/min_offset crashes

Fix Helix.__post_init__() to only check major_ticks against max_offset and min_offset if they are all non-None.
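
The proposed guard could look roughly like the following. HelixSketch is a hypothetical stand-in with only the three relevant fields, not the real Helix class, and the error type is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HelixSketch:
    # Hypothetical stand-in for Helix; the real class has more fields.
    max_offset: Optional[int] = None
    min_offset: Optional[int] = None
    major_ticks: Optional[List[int]] = None

    def __post_init__(self) -> None:
        # Only validate when all three values are present, so that comparing
        # None with an int can never happen.
        if (self.major_ticks is None or self.min_offset is None
                or self.max_offset is None):
            return
        for tick in self.major_ticks:
            if not self.min_offset <= tick <= self.max_offset:
                raise ValueError(
                    f"major tick {tick} is outside "
                    f"[{self.min_offset}, {self.max_offset}]")


helix = HelixSketch(major_ticks=[0, 8, 16])  # no offsets given: no check, no crash
print(helix)
```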

need tutorial

The API documentation for the scadnano Python scripting package is fairly extensive, but we need a simple tutorial that walks someone through the concepts one by one in a reasonable order, showing them how to write scripts to generate anything from simple designs up to a full-sized origami or non-origami system (e.g., tile-based design).

color should accept an integer

Currently the Strand.color field can be a hex string (e.g., "#ffaa12") or a map (e.g., {"r": 255, "g": 170, "b": 18}).

codenano also uses a decimal integer, e.g., 123456 interpreted as a 24-bit number encoding the RGB values 8 bits at a time. Accept this sort of color specification as well.
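
As an illustration of the 24-bit interpretation, the following sketch converts a decimal integer into a hex color string. The function name color_int_to_hex is hypothetical and not part of scadnano.

```python
def color_int_to_hex(value: int) -> str:
    """Interpret value as a 24-bit RGB number, 8 bits per channel."""
    if not 0 <= value <= 0xFFFFFF:
        raise ValueError(f"{value} does not fit in 24 bits")
    red = (value >> 16) & 0xFF
    green = (value >> 8) & 0xFF
    blue = value & 0xFF
    return f"#{red:02x}{green:02x}{blue:02x}"


print(color_int_to_hex(123456))
```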

from_cadnano_v2 keyerror: 'vstrands'

I've got a simple scadnano script to load a .dna file and write the corresponding cadnano .json file in the same directory, which can be found here: https://github.com/jcalumba/oxdna_relax/blob/master/export.py .
My example .dna file can be found here:
https://github.com/jcalumba/oxdna_relax/blob/master/export.dna .

When I try to run this script with any given *.dna file, I get the following error:
C:\Users\jcalumba\scadnano\scadnano-python-package\scadnano\backup>python ../export.py export.dna
2
Traceback (most recent call last):
  File "../export.py", line 16, in <module>
    origami = design.from_cadnano_v2('.', name)
  File "C:\Users\jcalumba\scadnano\scadnano-python-package\scadnano\scadnano.py", line 2207, in from_cadnano_v2
    num_bases = len(cadnano_v2_design['vstrands'][0]['scaf'])
KeyError: 'vstrands'

To fix it, I tried to pass in an empty list of vstrands when I instantiated my DNADesign object, seen here:
design = sc.DNADesign(helices=[], strands=[], vstrands=[], grid=sc.square)

but got a keyword error:

  File "../export.py", line 15, in <module>
    design = main()
  File "../export.py", line 5, in main
    design = sc.DNADesign(helices=[], strands=[], vstrands=[], grid=sc.square)
TypeError: __init__() got an unexpected keyword argument 'vstrands'

_from_scadnano_json fails on the attached json

{
  "version": "0.3.0",
  "helices": [
    {"grid_position": [0, 0]},
    {"max_offset": 32, "grid_position": [0, 1]}
  ],
  "strands": [
    {
      "color": "#0066cc",
      "substrands": [
        {"helix": 0, "forward": true, "start": 0, "end": 32}
      ],
      "is_scaffold": true
    }
  ]
}

Exception raised: '>' not supported between instances of 'NoneType' and 'int'

allow Helix.position to specify x,y,z under "origin"

scadnano represents position like this:

{ "x": 0, "y": 0, "z": 0, "pitch": 0 , "roll": 0 , "yaw": 0}

but codenano represents them like this:

{ "origin": { "x": 0, "y": 0, "z": 0}, "pitch": 0 , "roll": 0 , "yaw": 0}

Although I prefer the former to keep the file format flat and readable (and scadnano will continue to write in that format), we should be able to read the latter.
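
Reading both layouts could be handled by flattening the nested form on input. This is a sketch under the assumption that the position arrives as a plain dict parsed from JSON; flatten_position is a hypothetical helper name.

```python
from typing import Any, Dict


def flatten_position(position: Dict[str, Any]) -> Dict[str, Any]:
    """Return the flat scadnano-style position, accepting a nested "origin"."""
    flat = {key: value for key, value in position.items() if key != "origin"}
    flat.update(position.get("origin", {}))  # lift x, y, z to the top level
    return flat


nested = {"origin": {"x": 0, "y": 0, "z": 0}, "pitch": 0, "roll": 0, "yaw": 0}
print(flatten_position(nested))
```

A position that is already flat passes through unchanged, so the same function can be applied to every file that is read.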

use Kelly's colors

These look "nice"

kelly_colors = ['F2F3F4', '222222', 'F3C300', '875692', 'F38400', 'A1CAF1', 'BE0032', 'C2B280', '848482', '008856', 'E68FAC', '0067A5', 'F99379', '604E97', 'F6A600', 'B3446C', 'DCD300', '882D17', '8DB600', '654522', 'E25822', '2B3D26']

See https://medium.com/@rjurney/kellys-22-colours-of-maximum-contrast-58edb70c90d1

swap position x z coordinate interpretation

codenano has the same interpretation of y, but x and z are swapped.

Let's swap them. Then the main view shows x-y coordinates, and the side view shows z-y coordinates.

add with_domain_label for method chaining

First we need to close issue #86.

add full type hints to all cadnano import/export functions

remove DNAOrigamiDesign

The web interface has no concept of a special origami design type. A DNADesign is implicitly an origami if at least one strand is a scaffold, and multiple strands can be scaffolds.

The Python package is inconsistent, because many Strand's can have the field is_scaffold set to true, but only one of them can be equal to DNAOrigamiDesign.scaffold.

It would be cleaner just to remove DNAOrigamiDesign. There could still be convenience methods for scaffold(s) such as assigning M13 to the first strand labeled as a scaffold.

BREAKING CHANGE: rename DNADesign class to Design

This isn't a big deal, but Design is a bit easier to type, and "DNA" is a bit redundant since this is all about DNA (and is not the same as the DNA sequence assigned to a strand.)

support import of .dna file to create DNADesign

We need to be able to load a .dna file into the objects of the library.
Example use case: this would make my life easier when debugging the .dna <-> cadnano formats.

rewrite origami_rectangle to use add_nick and add_*_crossover

The current version of origami_rectangle.create specifies each Strand by explicitly listing its Substrands. This is tedious and error-prone.

It is simpler to draw two long strands, one in each direction, on each Helix, and then use the methods add_nick, add_half_crossover, and add_full_crossover, just as one would do using cadnano to manually design the origami.

See examples/6-helix-bundle-honeycomb.py for an example.

add Geometry parameters field to DNADesign

It should have the same fields and interpretation as Parameters in codenano:

https://docs.rs/codenano/0.5.1/codenano/struct.Parameters.html

cadnano_v2 import shifts helix positions

I imported the "squarenut.json" origami from here: https://www.dropbox.com/s/zsm3xlnyurnffd9/Nature09.zip?file_subpath=%2FNature09

According to the included squarenut.svg (in the file squarenut.zip), the helices should be positioned this way:

But importing it in scadnano, they appear this way (as though each is shifted to the right by one):

Is this a bug in the import? Or is it a mistake in the way I am implementing the honeycomb lattice in scadnano? (Documented here.)

I intended for it to interpret honeycomb coordinates exactly the same as cadnano, so if I got that wrong, I'll just switch it. But first, I wanted to check to see if it is a bug in the import.

I don't have a working cadnano installation, so it's difficult for me to test this.

Put another way, scadnano assumes that the helix at the origin (helix 21 in the two designs shown) has neighbors above it, below and to the right, and below and to the left, with empty space below it, above and to the right, and above and to the left. The cadnano design seems to invert this.

The file squarenut.zip has the cadnano .json file, the cadnano exported SVG file, and the scadnano .dna file created after importing in the web interface (which calls the Python scripting interface, which is why I posted it in the Python repo).

support Extensions on the end of a Strand

This gives a nice way to specify toeholds and extensions common in DNA strand displacement designs.

This is not supported currently in scadnano, and it will take a lot of effort to support it, since much of the logic pervading the code assumes the first and last substrands are Domain's. There's nothing requiring this in principle, but it will be a headache to change it. Also, some design decisions will have to be made along the way.

For example, the default staple name for exporting sequences is the same as cadnano, where the staple is named after the (helix,offset) pairs of its 5' and 3' ends. We could do the same thing, where if the end of a strand is an Extension, we use the adjacent Domain's to name the staple. But little decisions like this will probably have to happen all over the place.

add support for Helix.major_tick_start and Helix.major_tick_periodic_distances

These were just added to the scadnano web interface to make it easier to edit tick marks manually. They also give a shorter way to store common periodic tick marks in the .dna file, so they should be implemented in the Python package as well.

add Strand.reverse() and DNADesign.reverse_all() methods

scadnano can create designs not describable in cadnano, for example using loopouts or parallel crossovers.

One incompatible feature is that scaffolds can go reverse on even-numbered helices; in cadnano they always go forward on even-numbered helices and reverse on odd-numbered helices. But if this is the only incompatibility, then it can easily be fixed by reversing the polarity of all strands: reverse the direction of each Substrand, and reverse the order of the list of all Substrands. A method to do this to a whole DNADesign would simply do this to all strands.
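
The reversal described above can be sketched on the JSON-like representation used in .dna files. The functions reverse_strand and reverse_all are hypothetical names, and the sketch assumes every substrand is bound to a helix (loopouts, which have no "forward" field, are not handled).

```python
from typing import Any, Dict


def reverse_strand(strand: Dict[str, Any]) -> Dict[str, Any]:
    """Flip the direction of each substrand and reverse their order."""
    reversed_substrands = [
        {**substrand, "forward": not substrand["forward"]}
        for substrand in reversed(strand["substrands"])
    ]
    return {**strand, "substrands": reversed_substrands}


def reverse_all(design: Dict[str, Any]) -> Dict[str, Any]:
    """Apply reverse_strand to every strand in the design."""
    return {**design, "strands": [reverse_strand(s) for s in design["strands"]]}


strand = {
    "color": "#0066cc",
    "substrands": [
        {"helix": 0, "forward": True, "start": 0, "end": 16},
        {"helix": 1, "forward": False, "start": 0, "end": 16},
    ],
}
print(reverse_strand(strand)["substrands"])
```

Both functions return new dicts rather than mutating their input, so reversing twice yields the original strand.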

alter helices_view_order to be permutation of helix indices, not of 0,1,...,h-1

Currently if the number of helices is h, then DNADesign.helices_view_order is a permutation of the list [0,1,...,h-1].

It would be more natural if it is a permutation of the set of helix indices.
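
Under the proposed definition, validity of a view order could be checked as in this sketch. The helper name is_valid_view_order is hypothetical, and helix indices are assumed to be plain integers.

```python
from typing import Iterable, List


def is_valid_view_order(view_order: List[int], helix_indices: Iterable[int]) -> bool:
    """True if view_order lists every helix index exactly once."""
    no_duplicates = len(view_order) == len(set(view_order))
    return no_duplicates and set(view_order) == set(helix_indices)


# Helices with non-consecutive indices 2, 5 and 9.
print(is_valid_view_order([9, 2, 5], [2, 5, 9]))
print(is_valid_view_order([0, 1, 2], [2, 5, 9]))
```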

custom extension in write_scadnano_file

Currently the filename can be customized. If extension is specified instead (it should be mutually exclusive with filename), then keep the same name as the script, but use the custom extension instead of the default. (This is mostly important for write_scadnano_file, but could also be used, e.g., when writing DNA sequences.)
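
The intended behavior might be sketched as follows. The helper output_filename and its parameters are hypothetical, and "dna" is assumed to be the default extension.

```python
import os
from typing import Optional


def output_filename(script_path: str,
                    filename: Optional[str] = None,
                    extension: Optional[str] = None,
                    default_extension: str = "dna") -> str:
    """Choose the output name from an explicit filename or a custom extension."""
    if filename is not None and extension is not None:
        raise ValueError("specify at most one of filename and extension")
    if filename is not None:
        return filename
    # Keep the script's base name and swap in the requested extension.
    stem = os.path.splitext(os.path.basename(script_path))[0]
    chosen = extension if extension is not None else default_extension
    return f"{stem}.{chosen.lstrip('.')}"


print(output_filename("my_origami.py", extension="json"))
```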

pip install raises FileNotFoundError for README.md

With scadnano already installed, attempting to upgrade raises an exception:

(base) PS C:\Users\pexat> pip install --upgrade scadnano
Collecting scadnano
  Downloading https://files.pythonhosted.org/packages/02/56/74995fd209b99b246b652b4065bda926ef2c01254a94e814364a4e6e0b09/scadnano-0.2.0.tar.gz (46kB)
     |████████████████████████████████| 51kB 234kB/s
    ERROR: Complete output from command python setup.py egg_info:
    ERROR: Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\pexat\AppData\Local\Temp\pip-install-9u2no0ri\scadnano\setup.py", line 8, in <module>
        with open(path.join(this_directory, 'README.md'), encoding='utf-8') as f:
    FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\pexat\\AppData\\Local\\Temp\\pip-install-9u2no0ri\\scadnano\\README.md'
    ----------------------------------------
ERROR: Command "python setup.py egg_info" failed with error code 1 in C:\Users\pexat\AppData\Local\Temp\pip-install-9u2no0ri\scadnano\

It appears to upgrade okay, however.

is_scaffold bug?

In Strand.to_json_serializable, currently at scadnano.py line 1249.

I strongly believe that you want to check hasattr(self, is_scaffold) and not hasattr(self, is_scaffold_key) as it is currently.

support DNA modifications such as biotins/fluorophores/etc.

The easiest way to do this is simply to ignore IDT-style modifications such as /5Biosg/ACGT written into the DNA sequence.
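
Ignoring such modifications could be as simple as the sketch below, which assumes every modification is written between a pair of slashes, as in the example above. The name strip_idt_modifications is hypothetical.

```python
import re

# Anything between two slashes is treated as a modification code.
_IDT_MODIFICATION = re.compile(r"/[^/]+/")


def strip_idt_modifications(sequence: str) -> str:
    """Remove slash-delimited modification codes, leaving only the bases."""
    return _IDT_MODIFICATION.sub("", sequence)


print(strip_idt_modifications("/5Biosg/ACGT"))
```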

For the long term, support having these modifications specified as part of the Strand object, which can be displayed in scadnano.

allow conversion of insertions/deletions to be placed inline

This is similar to UC-Davis-molecular-computing/scadnano#145 in the web interface, but for the scripting package.

organize examples into user-facing official examples and miscellaneous examples for internal testing

Currently, there are a large number of designs in the examples folder, many not well-commented. I've put those there whenever I made up a new design, typically to test a new feature in the Python package or the web interface.

These should be broken into a "miscellaneous" folder that's like the current one, and an official "user-friendly" examples folder. The latter should contain a small number of well-commented examples intended to showcase the features of scadnano.

switch honeycomb coordinate system to match cadnano

Currently, the scadnano honeycomb coordinate system is a subset of the hex coordinate system, so it simply omits certain (x,y) coordinates.

cadnano uses a bijection between pairs of integers and coordinates. Switching to this would involve more conversion between hex and honeycomb coordinates, but would perhaps make it easier to think about honeycomb coordinates as being uniquely identified by rows and columns.

See UC-Davis-molecular-computing/scadnano#169

allow chained commands for less verbose way to add Strands to DNADesign

codenano allows "chained commands" for a less verbose way to create strands. For example see here: https://docs.rs/codenano/0.5.1/codenano/

design.strand(0, 0).to(31)
    .cross(1).to(10)
    .cross(2).to(21);
// Now its reverse complement:
design.strand(2, 21).to(10)
   .cross(1).to(31)
   .cross(0).to(0);

This is equivalent to the more verbose Python code:

domain_11 = sc.Domain(0, True, 0, 31)
domain_12 = sc.Domain(1, False, 10, 31)
domain_13 = sc.Domain(2, True, 10, 21)
strand1 = sc.Strand([domain_11, domain_12, domain_13])
domain_21 = sc.Domain(2, False, 10, 21)
domain_22 = sc.Domain(1, True, 10, 31)
domain_23 = sc.Domain(0, False, 0, 31)
strand2 = sc.Strand([domain_21, domain_22, domain_23])

or the slightly less verbose

strand1 = sc.Strand([
    sc.Domain(0, True, 0, 31),
    sc.Domain(1, False, 10, 31),
    sc.Domain(2, True, 10, 21),
])
strand2 = sc.Strand([
    sc.Domain(2, False, 10, 21),
    sc.Domain(1, True, 10, 31),
    sc.Domain(0, False, 0, 31),
])

Note that this requires crossovers to be "vertical", i.e., they have the same offset on the from Helix and the to Helix. Perhaps that can be overridden with an optional second parameter to cross. But since the most common crossover is vertical, the less verbose method is superior, since it reduces the amount of redundant information that needs to be specified.
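
To make the idea concrete, here is a rough sketch of how such a chained builder might work in Python. StrandBuilderSketch is hypothetical and is neither the scadnano nor the codenano API. It records plain (helix, forward, start, end) tuples, reproducing the values passed to sc.Domain in the verbose code above, and glosses over whether end offsets are inclusive or exclusive.

```python
from typing import List, Tuple

# (helix, forward, start, end), mirroring the arguments of sc.Domain above.
DomainTuple = Tuple[int, bool, int, int]


class StrandBuilderSketch:
    """Hypothetical chained builder; not the scadnano or codenano API."""

    def __init__(self, helix: int, offset: int) -> None:
        self.helix = helix
        self.offset = offset
        self.domains: List[DomainTuple] = []

    def to(self, offset: int) -> "StrandBuilderSketch":
        """Add a domain on the current helix from the current offset to offset."""
        forward = offset > self.offset  # moving right means a forward domain
        start, end = sorted((self.offset, offset))
        self.domains.append((self.helix, forward, start, end))
        self.offset = offset
        return self

    def cross(self, helix: int) -> "StrandBuilderSketch":
        """Vertical crossover: change helix, keep the current offset."""
        self.helix = helix
        return self


strand1 = StrandBuilderSketch(0, 0).to(31).cross(1).to(10).cross(2).to(21)
strand2 = StrandBuilderSketch(2, 21).to(10).cross(1).to(31).cross(0).to(0)
print(strand1.domains)
print(strand2.domains)
```

Because each method returns the builder itself, the direction of every domain is inferred from the order of the offsets and never has to be written out.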

add chained command for adding new domain on current helix

Currently, calls to cross/loopout and to are expected to alternate. Allow two consecutive calls to to, and add a new function (like to) for which two consecutive calls do not create two domains, but merely change the current offset.

remove Helix.rotation and Helix.rotation_anchor from JSON and model; use Helix.position.roll instead

import/export cadnano file format

cadnano_v2 import of squarenut design adds extra strands

In this paper, SI Figure S5 shows a "squarenut design". It is available as a cadnano file here.

I put together a zip file squarenut.zip with three files: the cadnano squarenut.json file, the imported scadnano squarenut.dna file, and the SVG image squarenut.svg showing how the design should appear in cadnano.

This is what the first two helices look like in the SVG file (but there are similar problems in every helix)

This is how they appear in scadnano:

As you can see, there are extra staple strands on the left and right ends of the two helices (in red in scadnano). Each appears to be confined to a single helix, but has long-range horizontal crossovers from one side of the helix to the other.

In the file squarenut.dna, here are the two extra strands on helix 0:

{
  "color": "#cc0000",
  "substrands": [
    {"helix": 0, "forward": false, "start": 9, "end": 35},
    {"helix": 0, "forward": false, "start": 133, "end": 135}
  ]
}

and

{
  "color": "#cc0000",
  "substrands": [
    {"helix": 0, "forward": false, "start": 112, "end": 133}
  ]
},

make ColorCycler part of DNADesign

Ensure that it starts on the first color for every newly created DNADesign.

This will ensure a consistent cycling of colors, instead of being dependent on the global ColorCycler variable.
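
A sketch of the idea, assuming each design owns its own cycler. ColorCyclerSketch and DesignSketch are hypothetical stand-ins rather than the real classes, and the palette is three of Kelly's colors chosen only for illustration.

```python
import itertools
from typing import Iterator, List

# Three of Kelly's colors; any palette works for this sketch.
PALETTE: List[str] = ["#F3C300", "#875692", "#F38400"]


class ColorCyclerSketch:
    """Cycles through a palette, restarting from the first color when exhausted."""

    def __init__(self, colors: List[str]) -> None:
        self._colors: Iterator[str] = itertools.cycle(colors)

    def next_color(self) -> str:
        return next(self._colors)


class DesignSketch:
    """Each design gets a fresh cycler instead of sharing a global one."""

    def __init__(self) -> None:
        self.color_cycler = ColorCyclerSketch(PALETTE)


first = DesignSketch()
first.color_cycler.next_color()
first.color_cycler.next_color()
second = DesignSketch()
print(second.color_cycler.next_color() == PALETTE[0])
```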

document :param: and :return: for each library function/method

Currently these are missing on most. Many functions/methods do discuss their parameters and return values. For these it may be most appropriate to move that discussion into :param: and :return:.

add unit tests testing DNAOrigamiDesign with helices that aren't consecutive

set gridless 3D position of parallel helices from crossovers between pairs of helices

This is supported with a button in the web interface:

UC-Davis-molecular-computing/scadnano#289

It should also be supported with a method on DNADesign in the python interface, where it takes an optional iterable of crossovers and an optional iterable of helices (similar to how the user in the web interface can select some helices and some crossovers).

put version in only one place

I tried to make a file scadnano_version, that various other files such as scadnano.py, setup.py, conf.py, could import to see the version.

No matter how I did it, some code would fail to import it properly, whether in CI unit testing, building the docs, or packaging for distribution in PyPI.

Figure out a way to write the current version in exactly one file. Maybe this is as simple as making it a text file that is read, rather than imported, since the Python import rules are so Byzantine.
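
The text-file approach could look like this sketch. The file name scadnano_version.txt and the helper read_version are assumptions for illustration; the demo writes a temporary file so that it runs on its own.

```python
import tempfile
from pathlib import Path


def read_version(version_file: Path) -> str:
    """Read the version string from a plain text file instead of importing it."""
    return version_file.read_text(encoding="utf-8").strip()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "scadnano_version.txt"  # hypothetical file name
    path.write_text("0.3.0\n", encoding="utf-8")
    print(read_version(path))
```

In setup.py, conf.py, and the package itself, the path would presumably be computed relative to each file's own location, which avoids relying on the import system.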

remove grid_position third coordinate

Use Helix.min_offset instead to specify that a helix starts at a different offset than 0.

save parts of JSON not used by scadnano

When a part of the design is read from a JSON file, it should store all the fields that are not used by scadnano, and write them back out on serialization. This will allow scadnano to edit designs created by other programs that use fields scadnano doesn't use.

This is essentially the same as this feature in the web interface:

UC-Davis-molecular-computing/scadnano#6
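
A rough sketch of how unused fields might be carried through a read/write round trip. StrandSketch and the set of known keys are assumptions for illustration, not the real Strand class.

```python
from typing import Any, Dict, List, Optional

# Keys this sketch treats as understood; the real set is larger.
KNOWN_STRAND_KEYS = {"color", "substrands", "is_scaffold"}


class StrandSketch:
    def __init__(self, color: Optional[str], substrands: List[Dict[str, Any]],
                 is_scaffold: bool = False,
                 unused_fields: Optional[Dict[str, Any]] = None) -> None:
        self.color = color
        self.substrands = substrands
        self.is_scaffold = is_scaffold
        self.unused_fields = unused_fields or {}

    @classmethod
    def from_json(cls, data: Dict[str, Any]) -> "StrandSketch":
        # Remember every key we do not understand so it can be written back.
        unused = {k: v for k, v in data.items() if k not in KNOWN_STRAND_KEYS}
        return cls(data.get("color"), data.get("substrands", []),
                   data.get("is_scaffold", False), unused)

    def to_json(self) -> Dict[str, Any]:
        out: Dict[str, Any] = {"color": self.color, "substrands": self.substrands}
        if self.is_scaffold:
            out["is_scaffold"] = True
        out.update(self.unused_fields)  # restore fields from other programs
        return out


original = {"color": "#0066cc", "substrands": [], "other_tool_note": "keep me"}
print(StrandSketch.from_json(original).to_json() == original)
```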

allow "domains" on each BoundSubstrand and Loopout

scadnano (and cadnano) logically break up each Strand into Substrand's based on which Helix they are bound in. It is common to furthermore break up each of these into logical "domains" based on which other Strand is bound.

However, some schemes such as the Wang/Thachuk/Soloveichik "leakless circuits" have several consecutive domains between the same Strand's on the same Helix. Thus, we would not want to enforce that when a BoundSubstrand switches from one Domain to another, this necessarily occurs at a point where the other Strand switches identity.

If Set<Domain> domains is present as a top-level field in DNADesign, then Strand.dna_sequence should be null for any Strand that has Substrand's with domains. A domain has two fields, String name and int length. For each Substrand with a List<Domain> domains field, the sum of the lengths must equal Substrand.dna_length(). Any domain with name name is considered complementary to any domain with name name + '*'.
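
The two rules above (complementarity by the '*' suffix, and lengths summing to the substrand length) can be sketched as follows. DomainSketch and both helper functions are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DomainSketch:
    # Hypothetical stand-in for the proposed Domain with its two fields.
    name: str
    length: int


def are_complementary(a: DomainSketch, b: DomainSketch) -> bool:
    """A domain named name is complementary to one named name + '*'."""
    return a.name == b.name + "*" or b.name == a.name + "*"


def domains_cover_substrand(domains: List[DomainSketch], dna_length: int) -> bool:
    """The domain lengths must sum to the length of the substrand."""
    return sum(domain.length for domain in domains) == dna_length


toehold = DomainSketch("t", 5)
branch = DomainSketch("b", 15)
print(are_complementary(toehold, DomainSketch("t*", 5)))
print(domains_cover_substrand([toehold, branch], 20))
```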
