lcsb-biocore / cobrexa.jl
Constraint-Based Reconstruction and EXascale Analysis
Home Page: https://lcsb-biocore.github.io/COBREXA.jl/
License: Apache License 2.0
as we need to stay compatible with certain external packages, triggering their testing pipelines needs to be integrated into our testing cycle
There's no haskey() for MAT files
using MAT
file=matopen("test/data/toyModel1.mat")
haskey(file, "model")
ERROR: MethodError: no method matching haskey(::MAT.MAT_v5.Matlabv5File, ::String)
From our discussion today via Slack (@exaexa @laurentheirendt):
LinearModel -> CoreModel
SBMLModel, MATModel, JSONModel, (maybe YAMLModel?) types to store models read in from those files. Use all fields supported by the various file types.
StandardModel or CoreModel depending on the input type with restrictions on which type of model depending on the purpose of the reconstruction functions
CoreModel and SBMLModel can at most output a CoreModel
StandardModel or CoreModel
accessors rather than deep copies when converting model types
@exaexa had a great idea of using callbacks as function arguments to modify the function applied to a model. E.g. using user defined bounds on some reactions in FBA etc. In fba(...), fva(...) and pfba(...) I have a bunch of keyword arguments that are largely repeats and should all be wrapped in some way. This is likely a feature the average user will end up using often I think.
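A minimal sketch of the callback idea, assuming a toy model that only holds bound vectors. Every name here (ToyModel, change_bounds, toy_analysis, the modifications keyword) is hypothetical and only illustrates the pattern; it is not the package's API.

```julia
# Toy stand-in for an optimization model: just reaction bounds.
# All names in this block are hypothetical, not COBREXA API.
struct ToyModel
    lbs::Vector{Float64}
    ubs::Vector{Float64}
end

# Returns a callback that sets the bounds of reaction `idx` on the model it receives.
change_bounds(idx, lb, ub) = model -> begin
    model.lbs[idx] = lb
    model.ubs[idx] = ub
    return model
end

# Applies every callback to a working copy, so the caller's model stays untouched.
function toy_analysis(model::ToyModel; modifications = [])
    work = ToyModel(copy(model.lbs), copy(model.ubs))
    for modify! in modifications
        modify!(work)
    end
    return work  # a real analysis function would now hand `work` to a solver
end

base = ToyModel([-1000.0, -1000.0], [1000.0, 1000.0])
result = toy_analysis(base; modifications = [change_bounds(2, -8.0, -8.0)])
```

With this shape, the repeated keyword arguments collapse into one list of callbacks that each analysis function applies in order.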
Type name change
It is a bit ugly that JSON and SBML models contain .m but MATModel contains .mat. Either clean up to all .m or use .json and .sbml.
Same for model names used in function parameters: there's m, model and a, with occasional excesses.
Lots of tests should be homogenized, e.g. test/io/io_test.jl tests read/write of StandardModel and test/io/writer.jl does the same thing but in a different way. I think we can get rid of test/testing_functions.jl by making the testing style more uniform. Will add more comments as I spot things. Also, #64 will change all the file names in src/io and this should be reflected in test/io.
Housekeeping
There's a pretty good package for prettyprinting reasonably without colors, instead with all these nice features like auto-ellipsis and indentation: https://github.com/MechanicalRabbit/PrettyPrinting.jl
We should really use that.
The tests and code are getting loaded with wildcards; we should have the same for docs building.
I found a few examples of outdated code in src/sampling/hit_and_run.jl that rely on having StandardModel .reactions as a vector. It would be great to have that cleaned out.
doc tests should be run on merge requests, but the documentation should not be deployed
original issue: https://git-r3lab.uni.lu/PerMedCoE/COBREXA.jl/-/issues/21
we need to speed up the testing pipeline
Originally posted by @stelmo in #79 (comment)
This is fine (good size for testing quickly) but not realistically sized; we should state in the docs that the user must make these constants much larger.
lots of doc tests are actually failing - https://git-r3lab.uni.lu/lcsb-biocore/COBREXA.jl/-/jobs/244834
original issue: https://git-r3lab.uni.lu/PerMedCoE/COBREXA.jl/-/issues/17
We need to add Codecov and Coveralls to the testing pipeline
master
I still don't like the automatic file loading thing because it introduces alphabetic ordering issues with package loading, e.g. I have to prepend "a" to reaction.jl to make sure it loads before cobraModel.jl... I think it solves the slight inconvenience of listing all the files but introduces a bigger inconvenience of needing creative (less descriptive) file names...
So far this only seems to affect me, but what do y'all think?
Let's make macros to really make the user interface clean. I have implemented fba macros in #53 (see the last few commits) and @marvinvanaalst has implemented a macro for reaction adding in #59 . @exaexa and I were also talking about this extensively on slack, it would be really cool to have something like this (from slack @exaexa):
vs = @variants begin
    knockout(123)
    knockout(4345)
    ...
end
@mod_variants! vs begin
    remove_reaction(123)
end
@combine_variants! vs begin
    no_modification()
    add_random_ATP()
    add_some_toxin()
end
Currently I have an fba mini version of this working as shown below:
using COBREXA
using Tulip
model = read_model(joinpath("e_coli_core.json"))
biomass = findfirst(model.reactions, "BIOMASS_Ecoli_core_w_GAM")
glucose = findfirst(model.reactions, "EX_glc__D_e")
vec = @flux_balance_analysis_vec model Tulip.Optimizer begin
modify_objective(biomass)
modify_constraint(glucose, -8.0, -8.0)
end
src/io/io.jl:
For example:
download("https://www.vmh.life/files/reconstructions/AGORA/1.03/reconstructions/sbml/Abiotrophia_defectiva_ATCC_49176.xml", "testModel.xml");
model = readSBML("testModel.xml");
findall(getOCs(model) .!= 0)
# all zeros
getLBs(model)
# -Inf when should be -1000
original issue: https://git-r3lab.uni.lu/PerMedCoE/COBREXA.jl/-/issues/19
I think that this is not needed anymore ...
State:
SBML.jl imports the dense version of the model on file, we should only have one struct that stores this information (avoid clutter). I think this should be CobraModel since model construction will likely happen in it... ?
We should put references to algorithms in the docs in the code, Cobrapy does it and I like it, it makes it easier to understand why/what exactly is implemented.
Originally posted by @stelmo in #60 (comment)
Transcript:
Mo 1:03 PM
we should change the file names of "modeling.jl" and "model_manipulations.jl" to something like "manipulate_linearmodel.jl" and "manipulate_fullmodel.jl"
mirek 1:04 PM
or modeling/fullmodel and modeling/linearmodel
would also make it easier to split into doc sections
Mo 1:05 PM
also I think find_exchange_metabolites should be near my exchange_reactions and we should just have one name and dispatch on model type
mirek 1:05 PM
yeah
Mo 1:05 PM
this is of course not for this PR but later
should I make an issue to remind us?
mirek 1:05 PM
same thing for the prettyprinting&misc functions probably, they are now mixed with the model docs
I'll make one
Originally posted by @exaexa in #78 (comment)
Also see the related issue in GigaSOM:
LCSB-BioCore/GigaSOM.jl#147
We should definitely have some way of doing knockouts on models. This makes the most sense for StandardModel. Currently the plan is to add a field to StandardModel:
mutable struct StandardModel <: MetabolicModel
id::String
reactions::Array{Reaction,1}
metabolites::Array{Metabolite,1}
genes::Array{Gene,1}
gene_reaction::Dict{Gene, Array{Reaction,1}}
end
This should make looking up reactions affected by deletions quicker than looping over all reactions for each gene.
Then a new function,
knockout_modification = knockout(model, gene1, gene2, ..., geneN)
needs to be made that will intelligently create a callback that can be passed to the modifications argument in the analysis functions to actually do the knockout. WIP
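A rough sketch of how such a knockout callback could look, assuming a toy model keyed by string IDs rather than the Gene and Reaction structs above. ToyKnockoutModel and this knockout signature are hypothetical. The sketch also assumes that knocking out any listed gene disables every reaction mapped to it, which ignores and/or logic in gene-reaction rules.

```julia
# Hypothetical toy model; string IDs stand in for Gene and Reaction structs.
struct ToyKnockoutModel
    reaction_ids::Vector{String}
    lbs::Vector{Float64}
    ubs::Vector{Float64}
    gene_reaction::Dict{String,Vector{String}}  # gene ID => IDs of affected reactions
end

# Builds a callback that zeroes the bounds of every reaction tied to the given genes.
# Simplifying assumption: one knocked-out gene is enough to disable a reaction.
function knockout(model::ToyKnockoutModel, genes::String...)
    affected = Set{String}()
    for g in genes
        # Dictionary lookup instead of scanning all reactions for each gene.
        union!(affected, get(model.gene_reaction, g, String[]))
    end
    idxs = findall(in(affected), model.reaction_ids)
    return (lbs, ubs) -> begin
        lbs[idxs] .= 0.0
        ubs[idxs] .= 0.0
        return nothing
    end
end

model = ToyKnockoutModel(
    ["r1", "r2", "r3"],
    fill(-1000.0, 3),
    fill(1000.0, 3),
    Dict("g1" => ["r1", "r3"], "g2" => ["r2"]),
)
knockout_modification = knockout(model, "g1")

# An analysis function would call the callback on its own working bounds.
lbs, ubs = copy(model.lbs), copy(model.ubs)
knockout_modification(lbs, ubs)
```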
original issue: https://git-r3lab.uni.lu/PerMedCoE/COBREXA.jl/-/issues/15
I commented the test for this function out and added a @test_broken true as a placeholder. I'm pretty sure this function got lost in translation somewhere :)
The docs are more like a collection of ideas now; we should give them a clear tutorial structure.
My idea for the structure is as follows:
Feel free to edit/suggest.
using the extraordinary powers of @cylon-x
Currently the samplers are not super robust and the testing leaves much to be desired.
...so that one can turn them on and off on demand with the new interface.
@info "Annoying message"
becomes
@_topic_log @info "Default-suppressed annoying message"
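One way such a switchable wrapper could be sketched, assuming a global flag per topic. The names DEMO_LOG_ENABLED and @_demo_log are made up for illustration; the actual interface may differ.

```julia
using Logging

# Hypothetical per-topic switch; logging for this topic is off by default.
const DEMO_LOG_ENABLED = Ref(false)

# Hypothetical macro: the wrapped statement is evaluated only when the topic is enabled.
macro _demo_log(ex)
    quote
        if DEMO_LOG_ENABLED[]
            $(esc(ex))
        end
    end
end

# A counter makes the gating observable without inspecting log output.
counter = Ref(0)

@_demo_log (counter[] += 1)   # skipped: topic is disabled
suppressed_count = counter[]

DEMO_LOG_ENABLED[] = true
@_demo_log (counter[] += 1)   # evaluated: topic is enabled
@_demo_log @info "Default-suppressed annoying message"
```

Because the check happens at run time, a user can flip the flag on demand without recompiling anything.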
original issue: https://git-r3lab.uni.lu/PerMedCoE/COBREXA.jl/-/issues/22
Showing incomplete information is often confusing, some parts do not "combine" well.
Use this as a guideline:
http://hackage.haskell.org/package/base-4.15.0.0/docs/Text-Show.html#t:Show
Bound vectors are typically not sparse in the usual sense. They are populated with max/min bounds and not very many zeros. It might make sense to make our own "sparse" vector format where the zeros are actually the max or min bounds. This might save significant storage at the exa-scale...
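A minimal sketch of that idea, assuming a vector type that stores only the entries differing from a default bound. DefaultSparseVector is a hypothetical name, and a dictionary is used for brevity; a sorted index/value layout would likely be more compact at scale.

```julia
# Hypothetical type: unstored entries equal `default` instead of zero.
struct DefaultSparseVector{T} <: AbstractVector{T}
    n::Int
    default::T
    stored::Dict{Int,T}   # only entries that differ from `default`
end

DefaultSparseVector(n::Int, default::T) where {T} =
    DefaultSparseVector{T}(n, default, Dict{Int,T}())

Base.size(v::DefaultSparseVector) = (v.n,)
Base.IndexStyle(::Type{<:DefaultSparseVector}) = IndexLinear()

function Base.getindex(v::DefaultSparseVector, i::Int)
    @boundscheck checkbounds(v, i)
    return get(v.stored, i, v.default)
end

function Base.setindex!(v::DefaultSparseVector, x, i::Int)
    @boundscheck checkbounds(v, i)
    if x == v.default
        delete!(v.stored, i)   # writing the default frees the slot
    else
        v.stored[i] = x
    end
    return v
end

# One million upper bounds at 1000.0, with a single blocked reaction.
ub = DefaultSparseVector(1_000_000, 1000.0)
ub[42] = 0.0
```

Only the one non-default entry occupies memory here, while indexing still behaves like an ordinary vector.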
Download.download in something that does all this automagically
Idea (for discussion):
JuMP uses snake_case, and Julia itself uses a lot of snake_case too. Should we go that way as well before the package is out?
(currently the badges point to Elmo's repo)