COBREXA.jl issue discussions from lcsb-biocore

trigger test pipelines in external repos

as we need to stay compatible with certain external packages, triggering their testing pipelines needs to be integrated into our testing cycle

`loadModel` is broken

There's no haskey() for MAT files

using MAT
file=matopen("test/data/toyModel1.mat")
haskey(file, "model")

ERROR: MethodError: no method matching haskey(::MAT.MAT_v5.Matlabv5File, ::String)
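
A possible workaround, sketched under the assumption that reading the whole file is acceptable: `matread` from MAT.jl returns a dictionary of the stored variables, and `haskey` works on that. The dictionary below is a made-up stand-in for the result of `matread("test/data/toyModel1.mat")`, so the snippet runs without MAT.jl or the file.

```julia
# Stand-in for `vars = matread("test/data/toyModel1.mat")` (needs `using MAT`).
# The contents are invented for illustration only.
vars = Dict{String,Any}("model" => Dict{String,Any}("rxns" => ["r1", "r2"]))

# `haskey` is defined for Dict, so the check works on the loaded variables.
has_model = haskey(vars, "model")
has_other = haskey(vars, "not_a_model")
```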

Ongoing design considerations

From our discussion today via Slack (@exaexa @laurentheirendt):

LinearModel -> CoreModel
Creation of SBMLModel, MATModel, JSONModel, (maybe YAMLModel?) types to store models read in from those files. Use all fields supported by the various file types.
Analysis functions should work on all model types. Input: model type. Output: numbers etc.
Reconstruction functions will output a StandardModel or a CoreModel depending on the input type; which output type is allowed may be restricted further by the purpose of the function.
Squashing models, e.g. CoreModel and SBMLModel can at most output a CoreModel.
Reconstruction functions will take a StandardModel or a CoreModel as input.
Use accessors rather than deep copies when converting between model types.
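
A minimal sketch of the accessor idea from the list above. The field names, the `reactions` and `n_reactions` accessors, and the JSON layout are assumptions made for illustration, not the package's actual definitions.

```julia
abstract type MetabolicModel end

# Minimal matrix-based model (field names are illustrative).
struct CoreModel <: MetabolicModel
    S::Matrix{Float64}
    rxns::Vector{String}
    mets::Vector{String}
end

# Wrapper that keeps everything the JSON file contained.
struct JSONModel <: MetabolicModel
    json::Dict{String,Any}
end

# Accessors: each model type answers the same question without conversion.
reactions(m::CoreModel) = m.rxns
reactions(m::JSONModel) = String[r["id"] for r in m.json["reactions"]]

# Generic analysis code is written once, against the accessors.
n_reactions(m::MetabolicModel) = length(reactions(m))

core = CoreModel([1.0 -1.0], ["r1", "r2"], ["m1"])
json = JSONModel(Dict{String,Any}("reactions" => [Dict("id" => "r1"), Dict("id" => "r2")]))
```

With this layout, adding a new file-backed model type only requires implementing the accessors; the analysis functions stay untouched.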

Implement function callbacks to make function modifications uniform and streamlined

@exaexa had a great idea of using callbacks as function arguments to modify the function applied to a model, e.g. applying user-defined bounds on some reactions in FBA. In fba(...), fva(...) and pfba(...) I have a bunch of keyword arguments that are largely repeats and should all be wrapped in some way. I think this is a feature the average user will end up using often.
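
Roughly what the callback idea could look like. `ToyProblem`, `change_bound`, `change_objective` and `toy_analysis` are hypothetical names for this sketch; a real version would act on the solver model instead of plain vectors.

```julia
# Toy stand-in for an optimization problem: just bounds and an objective.
struct ToyProblem
    lb::Vector{Float64}
    ub::Vector{Float64}
    objective::Vector{Float64}
end

# Each modification returns a closure that edits the problem in place.
change_bound(idx, lb, ub) =
    problem -> (problem.lb[idx] = lb; problem.ub[idx] = ub; problem)
change_objective(idx) =
    problem -> (fill!(problem.objective, 0.0); problem.objective[idx] = 1.0; problem)

# The analysis function takes a list of callbacks instead of many keyword arguments.
function toy_analysis(problem::ToyProblem; modifications = [])
    for modify in modifications
        modify(problem)
    end
    return problem  # a real fba(...) would call the solver here
end

p = toy_analysis(
    ToyProblem([-10.0, 0.0], [10.0, 10.0], [0.0, 0.0]);
    modifications = [change_bound(1, -8.0, -8.0), change_objective(2)],
)
```

The same `modifications` list could then be accepted by fba, fva and pfba alike, which removes the repeated keyword arguments.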

Change CobraModel to FullModel

Type name change

Consistent model variable naming

It is a bit ugly that JSONModel and SBMLModel store their contents in a field called .m while MATModel uses .mat. Either clean up to .m everywhere, or use .json and .sbml.

The same goes for model names used in function parameters: there's m, model and a, with occasional excesses.

generate and deploy docker & singularity containers

add container specification files
deploy

Regroup tests to files that match src/

Lots of tests should be homogenized, e.g. test/io/io_test.jl tests reading and writing of StandardModel, and test/io/writer.jl does the same thing in a different way. I think we can get rid of test/testing_functions.jl by making the testing style more uniform. Will add more comments as I spot things. Also, #64 will change all the file names in src/io, and this should be reflected in test/io.

Community model + tutorial

load a heap of whatever models
have the model structure re-ID them correctly and add exchange reactions

File cleanup + housekeeping

Housekeeping

Move analysis problem modifications out of reconstruction and into analysis directory

Make prettyprinting systematic

There's a pretty good package for reasonable prettyprinting: no colors, but nice features like auto-ellipsis and indentation: https://github.com/MechanicalRabbit/PrettyPrinting.jl

We should really use that.

Wildcard documentation building

The tests and code are getting loaded with wildcards; we should have the same for docs building.

Fix the code examples after StandardModel has changed

I found a few examples of outdated code in src/sampling/hit_and_run.jl that rely on having StandardModel .reactions as a vector. It would be great to have that cleaned out.

add doc test to pipeline

doc tests should be run on merge requests, but the documentation should not be deployed

Expand IO (MAT)

original issue: https://git-r3lab.uni.lu/PerMedCoE/COBREXA.jl/-/issues/21

speed up testing pipeline

we need to speed up the testing pipeline

Explain the need of modifying the sampling defaults in docs

Originally posted by @stelmo in #79 (comment)

This is fine (a good size for quick testing) but not realistically sized; we should state in the docs that the user must make these constants much larger.

add many doctests, convert all examples to doctests

lots of doc tests are actually failing - https://git-r3lab.uni.lu/lcsb-biocore/COBREXA.jl/-/jobs/244834

Migration to github.com

original issue: https://git-r3lab.uni.lu/PerMedCoE/COBREXA.jl/-/issues/17

add CodeCov

We need to add Codecov and Coveralls to the testing pipeline

add travis and github actions

this should only be tested on master
to be implemented when package has been released

Homogenize tests

After #92 the package structure will change somewhat. Let's fix the test layout to match the src directory structure better, and perhaps implement better tests (@stelmo I am looking at you)

improve readme for beginners

Automatic file loading

I still don't like the automatic file loading thing because it introduces alphabetical ordering issues with package loading, e.g. I have to prepend "a" to reaction.jl to make sure it loads before cobraModel.jl... I think it solves the slight inconvenience of listing all the files, but introduces a bigger inconvenience of needing creative (less descriptive) file names...

So far this only seems to affect me, but what do y'all think?
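
One possible middle ground, sketched with a hypothetical file list: files that define types needed elsewhere are listed explicitly, and the wildcard picks up the rest, so no renaming is needed.

```julia
# Hypothetical directory listing; `readdir` returns names sorted alphabetically.
files = sort(["reaction.jl", "cobraModel.jl", "gene.jl", "metabolite.jl"])

# Files that define types used elsewhere are listed explicitly, in order.
load_first = ["gene.jl", "metabolite.jl", "reaction.jl"]

# Everything else is picked up by the wildcard, in whatever order it sorts.
load_order = vcat(load_first, filter(f -> !(f in load_first), files))

# In the package this would be followed by something like:
#   foreach(f -> include(joinpath(dir, f)), load_order)
```

Plain alphabetical order puts cobraModel.jl first, which is the problem described above; the explicit prefix list moves it after reaction.jl.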

Implement macros for all functions (where appropriate)

Let's make macros to really make the user interface clean. I have implemented fba macros in #53 (see the last few commits) and @marvinvanaalst has implemented a macro for adding reactions in #59. @exaexa and I were also talking about this extensively on Slack; it would be really cool to have something like this (from Slack, @exaexa):

vs = @variants begin
  knockout(123)
  knockout(4345)
  ...
end

@mod_variants! vs begin
  remove_reaction(123)
end

@combine_variants! vs begin
  no_modification()
  add_random_ATP()
  add_some_toxin()
end

Currently I have an fba mini version of this working as shown below:

using COBREXA
using Tulip

model = read_model(joinpath("e_coli_core.json"))
biomass = findfirst(model.reactions, "BIOMASS_Ecoli_core_w_GAM")
glucose = findfirst(model.reactions, "EX_glc__D_e")

vec = @flux_balance_analysis_vec model Tulip.Optimizer begin
    modify_objective(biomass)
    modify_constraint(glucose, -8.0, -8.0)
end

clean up error handling and warnings

src/io/io.jl:

the warning is fishy, probably should be an error
the error only prints an error, should throw
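
For illustration, the difference between printing an error and throwing one. The function names and the file-type check are hypothetical and do not reflect the actual contents of src/io/io.jl.

```julia
# Logs a message but returns normally, so the caller cannot tell that loading failed.
function load_logging_only(path::String)
    endswith(path, ".json") || @error "Unsupported file type" path
    return nothing
end

# Throws, so the failure propagates to the caller.
function load_throwing(path::String)
    endswith(path, ".json") || throw(ArgumentError("unsupported file type: $path"))
    return nothing
end

# The throwing variant can be caught and handled (or tested with @test_throws).
threw = try
    load_throwing("model.xyz")
    false
catch e
    e isa ArgumentError
end
```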

Some SBML formats (level 3?) are not read correctly

For example:

download("https://www.vmh.life/files/reconstructions/AGORA/1.03/reconstructions/sbml/Abiotrophia_defectiva_ATCC_49176.xml", "testModel.xml");
model = readSBML("testModel.xml");

findall(getOCs(model) .!= 0)
# empty result, i.e. all objective coefficients are zero
getLBs(model)
# -Inf when it should be -1000

establish more elaborate PR and issue templates

i/o of SBML files

original issue: https://git-r3lab.uni.lu/PerMedCoE/COBREXA.jl/-/issues/19

remove PyCall Dependency

I think that this is not needed anymore ...

Make sure the accessors for MetabolicModel are more or less feature-complete

State:

SBMLModel should just be CobraModel

SBML.jl imports the dense version of the model on file, we should only have one struct that stores this information (avoid clutter). I think this should be CobraModel since model construction will likely happen in it... ?

Add references to methods where possible

We should put references to the algorithms into the docstrings in the code. Cobrapy does it and I like it; it makes it easier to understand why and what exactly is implemented.

Originally posted by @stelmo in #60 (comment)

reorganize file contents

Transcript:

Mo  1:03 PM
we should change the file names of "modeling.jl" and "model_manipulations.jl" to something like "manipulate_linearmodel.jl" and "manipulate_fullmodel.jl"
mirek  1:04 PM
or modeling/fullmodel and modeling/linearmodel
would also make it easier to split into doc sections
Mo  1:05 PM
also I think find_exchange_metabolites should be near my exchange_reactions and we should just have one name and dispatch on model type
mirek  1:05 PM
yeah
Mo  1:05 PM
this is of course not for this PR but later
should I make an issue to remind us?
mirek  1:05 PM
same thing for the prettyprinting&misc functions probably, they are now mixed with the model docs
I'll make one

We should eventually use StableRNGs for sampling and all other random number generation.

Originally posted by @exaexa in #78 (comment)

Also see the related issue in GigaSOM:
LCSB-BioCore/GigaSOM.jl#147
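
A small sketch of what this would mean for the sampler code: the generator becomes an explicit argument. `random_direction` is a hypothetical helper, and the stdlib `MersenneTwister` is used here only so the snippet runs without extra packages; `StableRNG(1234)` from StableRNGs.jl could be passed instead.

```julia
using Random

# Hypothetical sampler step: the generator is an explicit argument.
random_direction(rng::AbstractRNG, n::Int) = randn(rng, n)

# Same seed, same stream, so the two draws are identical.
a = random_direction(MersenneTwister(1234), 3)
b = random_direction(MersenneTwister(1234), 3)
```

The stdlib generators do not guarantee the same stream across Julia versions, which is the gap StableRNGs is meant to close for tests.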

Implement knockouts in an efficient way

We should definitely have some way of doing knockouts on models. This makes the most sense for StandardModel. Currently the plan is to add a field to StandardModel:

mutable struct StandardModel <: MetabolicModel
    id::String
    reactions::Array{Reaction,1}
    metabolites::Array{Metabolite,1}
    genes::Array{Gene,1}
    gene_reaction::Dict{Gene, Array{Reaction,1}}
end

This should make looking up the reactions affected by a deletion quicker than looping over all reactions for each gene.

Then a new function,

knockout_modification = knockout(model, gene1, gene2, ..., geneN)

needs to be made that will intelligently create a callback that can be passed to the modifications argument of the analysis functions to actually do the knockout. WIP
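
A rough sketch of how the lookup and the callback could fit together. Genes and reactions are plain strings here, bounds live in a dictionary, and gene-reaction rules (isozymes, complexes) are ignored entirely; all of that is a simplification for illustration, not the planned implementation.

```julia
# Simplified stand-in for the gene_reaction field: gene id => affected reaction ids.
gene_reaction = Dict(
    "g1" => ["r1", "r2"],
    "g2" => ["r2"],
)

# Returns a callback that zeroes the bounds of every reaction tied to the genes.
function knockout(gene_reaction::Dict{String,Vector{String}}, genes::String...)
    hits = [get(gene_reaction, g, String[]) for g in genes]
    affected = unique(reduce(vcat, hits; init = String[]))
    return bounds -> begin
        for r in affected
            bounds[r] = (0.0, 0.0)
        end
        bounds
    end
end

bounds = Dict("r1" => (-10.0, 10.0), "r2" => (0.0, 10.0), "r3" => (0.0, 5.0))
knockout_modification = knockout(gene_reaction, "g1")
knockout_modification(bounds)
```

The dictionary lookup touches only the affected reactions, instead of scanning every reaction for every gene.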

storage of large model

original issue: https://git-r3lab.uni.lu/PerMedCoE/COBREXA.jl/-/issues/15

`fluxBalanceAnalysisVec` seems to be missing

I commented the test for this function out and added a @test_broken true as a placeholder. I'm pretty sure this function got lost in translation somewhere :)

Overview issue: documentation structure

The docs are more like a collection of ideas right now; we should give them a clear tutorial structure.

My idea for the structure is as follows:

Feel free to edit/suggest.

replace format check with bot action to fix automagically

using the extraordinary powers of @cylon-x 🤖

generate html files from tutorials automatically

Fix samplers and create good tests for them

Currently the samplers are not super robust and the testing leaves much to be desired.

Fix ACHR
Add better tests
Add projections to ensure robust sampling in case the samplers go out of bounds
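
The simplest form of such a projection is clamping a sample back into the box given by the bounds, as sketched below. `project_to_bounds` is a hypothetical name, and this only handles the bound constraints; it does not restore the mass-balance equalities.

```julia
# Clamp each coordinate of a sample back into [lb, ub].
project_to_bounds(x, lb, ub) = clamp.(x, lb, ub)

lb = [-1.0, 0.0, 0.0]
ub = [1.0, 10.0, 5.0]
x = [1.5, -0.2, 3.0]  # a sample that drifted out of bounds in two coordinates
projected = project_to_bounds(x, lb, ub)
```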

Decorate miscellaneous info&warnings with log topics

...so that one can turn them on and off on demand with the new interface.

@info "Annoying message"

becomes

@_topic_log @info "Default-suppressed annoying message"
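
One way such a topic macro could work, as a sketch only: `LOG_TOPICS`, `log_topic!` and `@io_log` are hypothetical names, not the interface mentioned above. The wrapped expression runs only when its topic is switched on.

```julia
# Registry of log topics; everything is off by default.
const LOG_TOPICS = Dict{Symbol,Bool}(:io => false)

# Switch a topic on or off at runtime.
log_topic!(topic::Symbol, enabled::Bool = true) = (LOG_TOPICS[topic] = enabled)

# Run the wrapped expression only when the :io topic is enabled.
macro io_log(ex)
    return quote
        if get(LOG_TOPICS, :io, false)
            $(esc(ex))
        end
    end
end

hits = Ref(0)
@io_log hits[] += 1   # suppressed: the topic is off
log_topic!(:io)
@io_log hits[] += 1   # runs: the topic is on
```

In real use the wrapped expression would be a logging call such as `@io_log @info "Loading model"`; the counter is used here only to make the effect visible.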

fluxVariabilityAnalysis doesn't check termination status

original issue: https://git-r3lab.uni.lu/PerMedCoE/COBREXA.jl/-/issues/22

Prettyprinting should not hide any information

Showing incomplete information is often confusing, some parts do not "combine" well.

Use this as a guideline:
http://hackage.haskell.org/package/base-4.15.0.0/docs/Text-Show.html#t:Show

Bounds should not be sparse vectors, maybe a new data type...

Bound vectors are typically not sparse in the usual sense. They are populated with max/min bounds and not very many zeros. It might make sense to make our own "sparse" vector format where the zeros are actually the max or min bounds. This might save significant storage at the exa-scale...
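
A minimal sketch of such a format, assuming a hypothetical `DefaultSparseVector` type: only the entries that differ from the default bound are stored.

```julia
# Vector where every entry equals `default` unless listed in `exceptions`.
struct DefaultSparseVector{T} <: AbstractVector{T}
    n::Int
    default::T
    exceptions::Dict{Int,T}
end

Base.size(v::DefaultSparseVector) = (v.n,)

function Base.getindex(v::DefaultSparseVector, i::Int)
    checkbounds(v, i)
    return get(v.exceptions, i, v.default)
end

# Upper bounds of 1000 everywhere, except reaction 2 which is blocked.
ub = DefaultSparseVector(5, 1000.0, Dict(2 => 0.0))
dense = collect(ub)
```

Storage then scales with the number of non-default bounds rather than with the number of reactions.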

Clean-up downloading of test models

check if the file exists before downloading
always check against a hash and print an error if the hashes do not match, so that we can quickly spot that something fishy happened to the models
preferably wrap Downloads.download in something that does all this automagically
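
The three points above could be wrapped roughly like this. `download_checked` and `file_sha256` are hypothetical helpers; the `fetch` keyword exists only so the logic can be tried without network access, and the URL is a placeholder.

```julia
using Downloads, SHA

# Hex-encoded SHA-256 of a file.
file_sha256(path) = bytes2hex(open(sha256, path))

# Download only if the file is missing, then always verify the hash.
function download_checked(url, path, expected_hash; fetch = Downloads.download)
    isfile(path) || fetch(url, path)
    actual = file_sha256(path)
    actual == expected_hash ||
        @error "Hash mismatch for downloaded file" path expected_hash actual
    return path
end

# Offline demonstration with a fake fetcher and a temporary file.
fake_fetch(url, path) = write(path, "toy model")
target = joinpath(mktempdir(), "toy.json")
expected = bytes2hex(sha256("toy model"))
download_checked("https://example.invalid/toy.json", target, expected; fetch = fake_fetch)
```

Calling it a second time skips the download because the file already exists, but still re-checks the hash.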
