harmslab / gpmap
A Python API for managing genotype-phenotype map data
I have DST information for drug-resistant and drug-susceptible bacteria. How can I convert it into a genotype-phenotype map for use with your gpmap and epistasis modules?
Extra data that GPMap doesn't use needs to be handled properly. Right now, GPMap ignores extra data annotated on each genotype. Users, however, will likely want to keep this information together with the rest of the map.
The question is whether we should expose these attributes through the GPMap interface, or whether they should live only in the underlying GPMap DataFrame.
The dictionary and json formats put metadata and data at the same level, i.e.
{
    "name": "my_data",
    "description": "a sentence about my data",
    "genotypes": [...],
    "phenotypes": [...],
    ...
}
I suggest that we move the data underneath a "data" key. This aligns better with the GenotypePhenotypeMap anyway, which stores the data in a DataFrame under the data attribute. This cleanly separates the data from the metadata.
This idea came up when thinking about how someone stores data in an Excel or CSV file. Metadata can't easily go next to the data in those formats, so it usually lives in a separate file (such as a JSON or YAML file). If someone were to merge, say, an Excel data file and a JSON metadata file into a single JSON file, I think it is clearer for the data to become a single field with subfields.
{
    "name": "my_data",
    "description": "a sentence about my data",
    "data": {
        "genotypes": [...],
        "phenotypes": [...],
        ...
    }
}
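The restructuring proposed above can be sketched as a small conversion helper. This is only a minimal sketch and not part of gpmap: the function name nest_data, the choice of "name" and "description" as the only metadata keys, and the placeholder genotype and phenotype values are all assumptions made for illustration.

```python
import json

# Assumption: only these keys count as metadata; every other key is data.
METADATA_KEYS = ("name", "description")


def nest_data(flat, metadata_keys=METADATA_KEYS):
    """Return a copy of a flat dict with all non-metadata fields under "data"."""
    nested = {k: v for k, v in flat.items() if k in metadata_keys}
    nested["data"] = {k: v for k, v in flat.items() if k not in metadata_keys}
    return nested


# Placeholder values standing in for a real flat-format map.
flat = {
    "name": "my_data",
    "description": "a sentence about my data",
    "genotypes": ["AA", "AV"],
    "phenotypes": [1.0, 1.1],
}

nested = nest_data(flat)
print(json.dumps(nested, indent=2))
```

Keeping the metadata keys in one tuple means a merged Excel-plus-JSON file only needs that tuple extended when new metadata fields appear.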
Is gpmap suitable for full-length sequences?
When I run the test data after changing it to amino acid genotypes, it works well. However, when I lengthen the test sequence to 242 aa, it fails.
The error occurs at the SequenceSpace() step:
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 14.2 GiB for an array with shape (1912602624,) and data type float64
I hope to use it to analyze random mutations in a gene that is 239 amino acids long.
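The arithmetic behind this error can be checked directly. The sketch below confirms the memory needed for a float64 array of the reported shape, and shows how fast a fully enumerated sequence space grows with length. It assumes a space that contains every combination of 20 amino acids at every site; whether SequenceSpace() enumerates exactly that is an assumption here, and both helper names are hypothetical.

```python
def array_gib(n_elements, bytes_per_element=8):
    """Memory in GiB for an array of n_elements (float64 uses 8 bytes each)."""
    return n_elements * bytes_per_element / 2**30


def full_space_size(length, n_states=20):
    """Number of sequences if every site can take any of n_states states."""
    return n_states ** length


# Shape reported in the error message above.
print(array_gib(1912602624))  # 14.25

# Two sites with 20 states each already give 400 genotypes.
print(full_space_size(2))  # 400

# For 242 sites the count is far too large to enumerate in memory.
n_digits = len(str(full_space_size(242)))
print(n_digits)
```

Under that assumption, the space grows exponentially with sequence length, so restricting the map to the genotypes actually observed is the only approach that scales to a whole gene.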
Error:
AttributeError: GenotypePhenotypeMap instance has no attribute '_indices'
This occurs when running the code from the tutorial:
from gpmap import GenotypePhenotypeMap
wildtype = "AA"
genotypes = ["AA", "AV", "AM", "VA", "VV", "VM"]
phenotypes = [1.0, 1.1, 1.4, 1.5, 2.0, 3.0]
gpm = GenotypePhenotypeMap(wildtype, genotypes, phenotypes)
I tried this on two machines and got the same result.