canmod / macpan2

Rebuilding https://github.com/mac-theobio/McMasterPandemic/

Home Page: https://canmod.github.io/macpan2/

License: GNU General Public License v3.0

R 60.20% C++ 38.71% Makefile 0.26% TeX 0.63% Rez 0.01% MATLAB 0.19%
compartmental-models epidemiology forecasting mixed-effects model-fitting optimization simulation-modeling simulation

macpan2's People

Contributors

bbolker, flynn-primrose, guanwg, jaganmn, jfree-man, kevinozoid, mayaearn, papsti, stevencarlislewalker

macpan2's Issues

Modify the square bracket function so that arbitrary sub-blocks of the matrix can be extracted

There are three arguments:

  • m -- a matrix being accessed
  • i -- the row index
  • j -- the column index

Currently i and j can only take 1-by-1 matrices. Please modify so that they can take arbitrary matrices of indices and apply the following algorithm (or a more efficient equivalent).

  • Flatten i and j by stacking columns on top of each other
  • Return the sub-matrix of m containing the rows identified by i and columns identified by j
  • The order of the rows and columns in this matrix should be consistent with the ordering provided by i and j
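
The algorithm above can be sketched as follows. This is a minimal illustration, not the engine's implementation: `Mat` is a hypothetical column-major stand-in for the engine's matrix type, `sub_block` is a made-up name, and indices are assumed to be zero-based.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the engine's matrix type: dense, column-major.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> data;  // element (r, c) lives at data[r + c * rows]
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& at(std::size_t r, std::size_t c) { return data[r + c * rows]; }
    double at(std::size_t r, std::size_t c) const { return data[r + c * rows]; }
};

// Return the sub-matrix of m with the rows listed in i and the columns
// listed in j. Because Mat is column-major, reading i.data and j.data front
// to back is exactly the "stack the columns" flattening.
Mat sub_block(const Mat& m, const Mat& i, const Mat& j) {
    Mat out(i.data.size(), j.data.size());
    for (std::size_t c = 0; c < j.data.size(); ++c) {
        for (std::size_t r = 0; r < i.data.size(); ++r) {
            const std::size_t src_row = static_cast<std::size_t>(i.data[r]);
            const std::size_t src_col = static_cast<std::size_t>(j.data[c]);
            assert(src_row < m.rows && src_col < m.cols);  // zero-based indices
            out.at(r, c) = m.at(src_row, src_col);
        }
    }
    return out;
}
```

With a 3-by-3 m, i = (2, 0) and j = (1, 2) select rows 2 and 0 and columns 1 and 2, in that order, so the output ordering follows i and j rather than the ordering in m.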

Debugging vignette

Write a vignette illustrating the options(error = recover) strategy and why it is the best approach with the object oriented architecture. It should illustrate how and why anonymous functions that immediately follow calls to lapply in the stack are often of particular interest.

Model updating functionality

Problem -- synchronization v speed trade-off

We always store constructor arguments as fields. We always try to store information that is derived from the arguments in methods, so that if the argument fields are updated the derivations stay in sync with the new arguments. Sometimes these derived methods are problematically slow (e.g. when computing the ad_fun). This really cannot be computed each time we need a simulation, or the TMB performance gains will be severely compromised. On the other hand, we do not want to store this type of derived information as fields, because then it gets out of sync if the arguments are updated.

Solution

  • Create a Synchronize class that is a direct parent for classes that contain a single derived information method.
  • Instances of these child classes can be composed with the focal class containing the problematic derived information method.
  • Classes inheriting from Synchronize also contain a field that caches the results of the derived information method.
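
The caching idea behind this proposal can be sketched in C++ as below. This only illustrates the pattern: the proposal itself concerns the R-side classes, and the template form, the `get` and `invalidate` method names, and the use of `std::function` are all assumptions made for the sketch.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical sketch: wraps one expensive derivation, caches its result,
// and recomputes only after invalidate() has been called.
template <typename T>
class Synchronize {
public:
    explicit Synchronize(std::function<T()> derive)
        : derive_(std::move(derive)), valid_(false), cache_() {
        assert(static_cast<bool>(derive_));  // derivation must be callable
    }

    // Return the cached value, computing it first if the cache is stale.
    const T& get() {
        if (!valid_) {
            cache_ = derive_();
            valid_ = true;
        }
        return cache_;
    }

    // Call whenever an argument field that feeds the derivation changes.
    void invalidate() { valid_ = false; }

private:
    std::function<T()> derive_;
    bool valid_;
    T cache_;
};
```

The focal class would hold one such object per slow derivation and call invalidate() from whatever code updates its argument fields, so repeated simulations reuse the cached result.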

document `oor` package some more?

It would be helpful to say a little bit more about oor in the README file; I tried to install locally and failed, then figured out that remotes::install_github() would work automatically.

Gamma Kernel

User Story

I want to generate a gamma convolution kernel on the c++ side so that I can optimize its shape parameters.

Signature

gamma_kernel(length, proportion, mean, cv)

Arguments

  • length -- Number of time-steps in the kernel
  • proportion -- Height parameter for the kernel
  • mean -- Location parameter for the kernel
  • cv -- Spread parameter for the kernel

Development Notes

This should be already done in macpan1 as the only kernel available in that engine. The arguments map as follows -- proportion = c_prop, mean = c_mean, cv = c_cv

See here for a description and here for the implementation.

User Notes

The following expression,

reports = convolution(I, gamma_kernel(14, c_prop, 0.1, 0.25))

would allow a user to fit the c_prop parameter used to control the under-reporting fraction.

Flows product

  • Create a flows_explicit method in Model that always returns the flows with the optional columns (e.g. from_partition)

Colon operator and sequence function

Sequence

  • R-side symbol -- seq
  • Three arguments
    • from -- first integer in the sequence
    • length -- length of the sequence
    • by -- difference between adjacent elements in the sequence
  • Example:
    • Input: seq(0, 4, 6)
    • Output: c(0, 6, 12, 18)

Colon

  • R-side symbol -- :
  • Two arguments
    • from -- first integer in the sequence
    • to -- last integer in the sequence
  • Example:
    • Input: 5:7
    • Output: c(5, 6, 7)
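
A minimal C++ sketch of both functions, matching the two examples above. The names `seq` and `colon` and the use of `std::vector<int>` are illustrative assumptions; the engine would return its own matrix type. The behaviour of the colon operator when from > to is not specified in this issue, so the sketch returns an empty sequence in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// seq(from, length, by): `length` integers starting at `from`, with
// adjacent elements differing by `by`.
std::vector<int> seq(int from, int length, int by) {
    assert(length >= 0);
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(length));
    for (int k = 0; k < length; ++k) {
        out.push_back(from + k * by);
    }
    return out;
}

// colon(from, to): from, from + 1, ..., to (inclusive).
// Assumption: empty result when from > to.
std::vector<int> colon(int from, int to) {
    std::vector<int> out;
    for (int v = from; v <= to; ++v) {
        out.push_back(v);
    }
    return out;
}
```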

clean up starter models directory to include only actual starter models

The SI_products directory for example should be somewhere else, because if you try to create a compartmental model from this directory it will fail. But any directory in starter_models should be a valid compartmental model.

Update: perhaps what we mean by an 'actual starter model' is one with a README.md file that contains a yaml header? This would mean that other starter models are OK to be in the directory, but they will not show up with show_models() or in https://canmod.github.io/macpan2/articles/example_models.html and so will be unlikely to be discovered.

possible inconsistency in terminology for flow spec

in the "Flow between Compartments" section of vignettes/model_definitions.Rmd, the columns of the flows data frame are initially described as

from | to | rate_component | component_type

but then in the bulleted list, the last two columns are named

 flow | flow_type

moreover, in the SIR example, the last two column names in flows.csv are

flow_component | component_type

i'm not sure whether this is a true inconsistency or whether it's just that i'm not yet familiar enough with macpan2, but i thought i'd flag it as it was unclear to me as a macpan2 novice.

Managing Sparsity

Background

We want to avoid sparse matrices for now, but we do have genuinely sparse matrices -- rate and flow matrices -- that will slow things down if expressed as a dense matrix.

Can we do a few things now that will soften this issue?

We are already planning to get the rates in triplet form -- from, to, rate. This raises the possibility of addressing any use cases directly from the triplet form without explicitly forming a matrix -- dense or sparse.

One use case is to just compute the inflow and outflow vectors, which could be solved by a tapply-like function on the triplet form.

However, if we want to take the dominant eigenvalue of the next-generation matrix, we need (I think) to take a matrix inverse and this will presumably benefit from genuinely sparse methods.

The main concerns with moving to sparse matrices are that it would increase our testing burden and possibly the complexity of some functions that require handling sparse and dense cases differently. But if we make special functions that take triplet form input and then do specialized targeted tasks that 'need' sparse methods (eg nextgen matrices), we could isolate this complexity.

Functions

groupSums

Arguments:

  • column vector, x, of values to sum
  • column vector, z, the same length as x containing indices into the return vector, y

y = Zeros(z.max() + 1, 1);  // one entry per index value in z
for (int i = 0; i < x.size(); i++) {
  y[z[i]] += x[i];  // add x[i] to the group identified by z[i]
}
return y;
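
As a runnable illustration of how a group-sum over the triplet form yields inflow and outflow vectors without forming a matrix, here is a minimal sketch. It uses `std::vector` in place of the engine's matrix type and, unlike the pseudo-code above, takes the number of groups as an explicit argument (an assumption, so that compartments with no flows still get a zero entry).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum the values in x by group; z[k] is the zero-based group of x[k].
std::vector<double> group_sums(const std::vector<double>& x,
                               const std::vector<std::size_t>& z,
                               std::size_t n_groups) {
    assert(x.size() == z.size());
    std::vector<double> y(n_groups, 0.0);
    for (std::size_t k = 0; k < x.size(); ++k) {
        assert(z[k] < n_groups);
        y[z[k]] += x[k];
    }
    return y;
}
```

Given flows in triplet form (from, to, flow), calling group_sums(flow, from, n) gives the total outflow per compartment and group_sums(flow, to, n) gives the total inflow.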

Script for SV-E-IH-R model

  • Write a script that directly calls TMBModel and TMBSimulator
  • Have clear comments about what the model inputs mean
  • Have the TMBModel object be as close as possible to what we believe now will be generated by the model files translator
  • Create another version of the script that includes a time-varying parameter

Assign function

User Story

I want to be able to break apart a matrix into smaller pieces (e.g. unpack a state vector into scalar states; state -> S,I,R), so that I can update the matrix/vector using linear algebra but still have convenient access to the components as variables in and of themselves.

Signature

assign(m, x1, x2, ...)

Arguments

  • m -- A matrix with values
  • x1, x2, ... -- Matrices that will have their values modified in-place by the values in m

Behaviour

In column-major order, loop over the elements in m and assign them to the elements in the x matrices in column-major order as well. Stop either when all elements in m have been assigned or when all elements in the x matrices have been filled, whichever comes first.

Return Value

A single one-by-one matrix with a zero in it -- this is a stand-in for NULL.

The null matrix is in the last position of the 'mats' list. Therefore, on the R side we need to pass one more matrix than the user provides. For sanity we should always pass this null matrix even if the assign function is not used in the model.

This is a difference between mats (on the c++ side) and valid_vars (on the R side). The former has one more element than the latter. This additional element is the null matrix. To compute its zero-based index into mats, we compute length(valid_vars) on the R side. Therefore, when the assign function is used, the expr_output_id = length(valid_vars)
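
The assignment loop described under Behaviour can be sketched as follows. This is an illustration under simplifying assumptions: each matrix is represented only by its column-major storage (the hypothetical `Flat` alias), and the function returns the number of elements copied so the result is easy to check, whereas the issue specifies a one-by-one zero matrix as the return value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in: a matrix reduced to its column-major storage,
// which is all the assignment loop needs.
using Flat = std::vector<double>;

// Copy the elements of m, in order, into the targets x1, x2, ... in order.
// Stops when m is exhausted or every target is full, whichever comes first.
std::size_t assign(const Flat& m, const std::vector<Flat*>& targets) {
    std::size_t next = 0;  // position of the next unread element of m
    for (Flat* x : targets) {
        assert(x != nullptr);
        for (double& slot : *x) {
            if (next == m.size()) return next;  // m has run out
            slot = m[next++];
        }
    }
    return next;  // every target is full
}
```

For the state -> S, I, R use case, the targets would be three one-element matrices and m the three-element state vector.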

Parse derivations.json and flows.csv

The derivations.json file can be used to generate what we will call 'user-defined expressions' and flows.csv can be used to generate state-updating expressions.

We need the following three methods that take a model definition and return all user-defined expressions that should get passed to the ...

  1. ... before argument in TMBModel
  2. ... during argument in TMBModel
  3. ... after argument in TMBModel

We also need a method to generate the state-updating expressions, which should be appended at the end of the during list generated in the parsing of the user-defined expressions.

Tasks

  • flows.csv
  • derivations.json

Concatenation of String objects not finished

Steps

library(testthat); library(macpan2)
x = macpan2:::StringUndottedVector("S", "E", "I", "R")
y = macpan2:::StringUndottedVector("D")
z = macpan2:::StringUndottedVector("S", "E", "I", "R", "D")
expect_identical(c(x, y), z) # failing now

The problem seems to be inconsistent and incomplete implementation of value_combiner methods.

Modify the specs and variable names related to time lags and list variables

  • Remove lists from the spec
  • Remove time lags from the spec, at least as an atomic concept
  • Remove expr_output_count because without lists this will always be 1
  • Describe how the engine makes both (1) the matrix-valued arguments themselves and (2) the indices to these arguments available to a developer of a new function
  • Modify the names of argument value list and argument index list in the C++ code so that they are more descriptive of this idea

New functions rbind_lag and rbind_time

Objectives

  • Remove extract_lag, extract_time, select_lag, select_time
  • Replace with rbind_lag and rbind_time

Specs rbind_lag

  • Arguments
    • m -- a matrix with saved history
    • i -- a column vector of integers
  • Behaviour
    • For each value of i, access the value of the matrix, m, that many time steps in the past
    • If each value of m has the same number of columns then create a single matrix by stacking the rows on top of each other in the order provided by i -- and return the result
    • If not all values of m have the same number of columns then return an error
    • If the simulation history is not saved for the first argument, then throw an error unless the user only asks for the current matrix
    • At each iteration the vector of lags will determine a set of time-steps -- but only time-steps between 1 and T inclusive are valid, and all others should be thrown away

Specs rbind_time

Same as for rbind_lag but now the indices in i refer to absolute time steps instead of numbers of time-steps in the past.
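
A minimal sketch of both functions under stated assumptions: the saved history is a `std::vector` whose entry s - 1 holds the matrix at time step s, a matrix is represented as a list of rows (the hypothetical `Rows` alias), and only time steps already present in the history are treated as valid. None of these representations come from the engine.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Row = std::vector<double>;
using Rows = std::vector<Row>;  // hypothetical stand-in: a matrix as a list of rows

// Stack the matrices saved at the given absolute time steps, in the order
// given. hist[s - 1] is the matrix at time step s.
Rows rbind_time(const std::vector<Rows>& hist, const std::vector<int>& steps) {
    const int t_now = static_cast<int>(hist.size());
    Rows out;
    for (int step : steps) {
        if (step < 1 || step > t_now) continue;  // drop invalid time steps
        for (const Row& row : hist[static_cast<std::size_t>(step - 1)]) {
            if (!out.empty() && row.size() != out.front().size()) {
                throw std::invalid_argument("rbind: column counts differ");
            }
            out.push_back(row);
        }
    }
    return out;
}

// Same, but each entry of lags counts time steps back from the current one.
Rows rbind_lag(const std::vector<Rows>& hist, const std::vector<int>& lags) {
    assert(!hist.empty());  // the current matrix always exists
    const int t_now = static_cast<int>(hist.size());
    std::vector<int> steps;
    for (int lag : lags) steps.push_back(t_now - lag);
    return rbind_time(hist, steps);
}
```

Expressing rbind_lag through rbind_time keeps the validity and column-count checks in one place.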

Example User

Notes to the future

We might want to extend this to cbind_lag, cbind_time, flatten_lag, and flatten_time, but for now rbind_lag and rbind_time are fine.

Convolution

Use-case example

kernel = gamma_kernel(9, 0.1, 0.25, 0.4)  ## pre-simulation loop
reports = convolution(foi * S, kernel)  ## every simulation loop

Specs for convolution

convolution(m, kernel):

  • m -- matrix with saved history
  • kernel -- column vector of length less than the number of iterations

At iteration, t, do the following.

  1. If t < length(kernel), return zero; otherwise continue.
  2. For each element of m, get a vector with the history of this element over each of the preceding length(kernel) time steps (including the current time)
  3. For each element of m, take the inner product between this history vector and the kernel.
  4. Return a new matrix the same shape as m but with the inner products.
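
The steps above can be sketched for a single element of m as follows; the full function would apply this to every element and return a matrix of the same shape. The spec does not say which end of the kernel lines up with the current time step, so the sketch assumes kernel[0] weights the current value. The function name and the `std::vector` history layout are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Convolution value at the current iteration t = hist.size() for one
// element of m. hist[s - 1] is the element's value at time step s.
// Assumption: kernel[0] weights the current value, kernel[1] the previous
// one, and so on.
double convolve_element(const std::vector<double>& hist,
                        const std::vector<double>& kernel) {
    assert(!kernel.empty());
    const std::size_t t = hist.size();
    const std::size_t k = kernel.size();
    if (t < k) return 0.0;  // step 1: not enough history yet
    double total = 0.0;
    for (std::size_t lag = 0; lag < k; ++lag) {
        total += kernel[lag] * hist[t - 1 - lag];  // steps 2 and 3
    }
    return total;
}
```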

Clean up validity messaging

This is a technical task to deal with a limitation of the current machinery for communicating to the user when there is a validity problem. TODO: add more detail

C++ developer utility function to repeat dimensions of 1 over an n by m matrix

As a C++ developer who gets a matrix with one row or one column,
I want a function to repeat the values in those rows and columns n and m times,
so that I can use it in for loops over rows and columns with n by m matrices.

For example, consider the following function.

case MP2_NORMAL_DENSITY:
  rows = r[0].rows();
  cols = r[0].cols();
  m = matrix<Type>::Zero(rows, cols);
  for (int i=0; i<rows; i++) {
      for (int j=0; j<cols; j++) {
          m.coeffRef(i,j) = -dnorm(r[0].coeff(i,j), r[1].coeff(i,j), r[2].coeff(i,j), 1);
      }
  }
  return m;

Here we have three arguments: observed, r[0]; expected, r[1]; and standard deviation, r[2]. It would be nice to be able to do this:

case MP2_NORMAL_DENSITY:
  rows = r[0].rows();
  cols = r[0].cols();
  r[1] = RecycleInPlace(r[1], rows, cols);
  r[2] = RecycleInPlace(r[2], rows, cols);
  m = matrix<Type>::Zero(rows, cols);
  for (int i=0; i<rows; i++) {
      for (int j=0; j<cols; j++) {
          m.coeffRef(i,j) = -dnorm(r[0].coeff(i,j), r[1].coeff(i,j), r[2].coeff(i,j), 1);
      }
  }
  return m;
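
One possible shape for such a utility is sketched below. It is not the requested RecycleInPlace itself: the hypothetical `recycle` returns a new matrix rather than modifying its argument, `Mat` is a stand-in for the engine's matrix type, and treating any shape that is neither the target shape nor a single row, column, or element as an error is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the engine's matrix type: dense, column-major.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> data;  // element (r, c) lives at data[r + c * rows]
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& at(std::size_t r, std::size_t c) { return data[r + c * rows]; }
    double at(std::size_t r, std::size_t c) const { return data[r + c * rows]; }
};

// Expand m to rows-by-cols by repeating any dimension of length 1.
// A dimension that already matches the target is copied as-is.
Mat recycle(const Mat& m, std::size_t rows, std::size_t cols) {
    assert(rows > 0 && cols > 0);
    const bool rows_ok = (m.rows == rows) || (m.rows == 1);
    const bool cols_ok = (m.cols == cols) || (m.cols == 1);
    if (!rows_ok || !cols_ok) {
        throw std::invalid_argument("recycle: incompatible shape");
    }
    Mat out(rows, cols);
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            // Reuse row 0 or column 0 when that dimension has length 1.
            out.at(r, c) = m.at(m.rows == 1 ? 0 : r, m.cols == 1 ? 0 : c);
        }
    }
    return out;
}
```

After recycling, the nested loops in the example above can index all three arguments with the same (i, j) pair.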

Simplify spec

We do not need

  • DATA_IVECTOR(mats_save_hist) -- because we have (r|c)bind_(lag|time) we have a more flexible way of specifying what history should be saved
  • DATA_IVECTOR(expr_output_count) -- because we do not have lists of matrices anymore, the output count is always 1

calibrate + forecast vignette

Right now I don't think it's possible for a non-developer to figure out how to use macpan2 for calibration and forecasting. I understand that's not the focus of Irena and Mike's efforts right now (or of the upcoming workshop), but it would be good to have a short document that shows how to do this - maybe for the SIR model, for some simulated data and then for some real SIR-ish data (e.g. take an example or two from the fitode package ?)

Move macpan.cpp over to src

Maybe through a make rule so that C++ developers can treat the current location as a development environment, just as we do now in macpan1.

New function: triplets_to_matrix

New function called triplets_to_matrix(x, v, m) with the following arguments.

  • x -- A matrix
  • v -- A column vector of values to add to elements of x
  • m -- A matrix of integers

Here is pseudo-code for the function.

# 1. Fill x with all zeros.
x[,] = 0

# 2. Add elements in v to elements of x, according to the indices in m.
#    Row k of m holds (row index, column index, index into v); n is the number of rows of m.
for (int k=0; k<n; k++)
  x[m[k,0], m[k,1]] += v[m[k,2], 0]

# 3. Return x.
x
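
The pseudo-code can be turned into a compilable sketch as follows. `Mat` and `Triplet` are hypothetical stand-ins: `Mat` replaces the engine's matrix type, and each `Triplet` represents one row of m, read as (row index, column index, index into v) with zero-based indices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the engine's matrix type: dense, column-major.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> data;  // element (r, c) lives at data[r + c * rows]
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& at(std::size_t r, std::size_t c) { return data[r + c * rows]; }
    double at(std::size_t r, std::size_t c) const { return data[r + c * rows]; }
};

// One row of the integer matrix m.
struct Triplet {
    std::size_t row, col, value_index;
};

// Zero x, then add v[value_index] into x(row, col) for every triplet.
// Repeated (row, col) pairs accumulate.
void triplets_to_matrix(Mat& x, const std::vector<double>& v,
                        const std::vector<Triplet>& m) {
    std::fill(x.data.begin(), x.data.end(), 0.0);
    for (const Triplet& t : m) {
        assert(t.row < x.rows && t.col < x.cols && t.value_index < v.size());
        x.at(t.row, t.col) += v[t.value_index];
    }
}
```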
