jonesor / rage Goto Github PK

View Code? Open in Web Editor NEW

10.0 10.0 15.0 6.24 MB

R tools for performing ageing focussed analyses using matrix population models.

Home Page: https://jonesor.github.io/Rage/

R 100.00%

rage's People

Contributors

Stargazers

Watchers

Forkers

iainmstott patrickbarks davan690 robsalgo polcap levisc8 jjackson-eco bartholdja cschuette817 tdjames1 jchecastaldo chelseacthomas wpetry hinrich62 ecosantos

rage's Issues

Replace matFmu arguments

Replace matFmu argument in functions standardizevitalrates and reprodStages with more generic argument to identify reproductive stages, because identifying reproductive stages using the mean fecundity matrix is just one possible approach to identifying reproductive stages.

Code duplication across functions

Related to #14 , there's a fair amount of code duplication across a few Rage functions:

makeLifeTable, dEntropy, and kEntropy all use the same loop to calculate survivorship. I believe the loop within longevity is mathematically equivalent (it's finding the age where lx = 0.01), though coded a bit differently.
makeLifeTable and dEntropy use the same loops to calculate age-specific fertility (mx) and clonality (cx).

To avoid some of this duplication and make for easier code maintainability, we could perhaps pull some of the duplicated sections into separate internal functions that get called by the user-facing functions when required.

Alternatively, we could keep the lx/mx/cx functions within makeLifeTable, and then re-work dEntropy and kEntropy so that they take a life table or life table components as arguments, instead of transition matrices.

Make longevity and R0 return NA if matU is singular

Functions longevity (life expectancy component) and R0 will fail if matU is singular (i.e. non-invertable). This usually indicates infinite life expectancy due to a 100% stasis loop in the final stage(s).

In jonesor/Rcompadre#30 I suggest adding a column-flag to cleanDB corresponding to singular matU, which makes it easier for a user to remove problem rows prior to an analysis.

I propose we also modify longevity and R0 to return NA if matU is singular (and update the documentation accordingly). This would allow the user to retain rows in the db subset they're working with for which life expectancy or R0 cannot be calculated, which may be convenient if the user is also conducting other analyses with other sets of variables (i.e. the user doesn't have to create a bunch of different subsets of db for different analyses).

Bugs in makeLifeTable, dEntropy, and kEntropy

I've noticed a few potential bugs in makeLifeTable, some of which also apply to dEntropy, and kEntropy.

Applies to `makeLifeTable`, `dEntropy`, and `kEntropy`

1. I think the loop to calculate age-specific survivorship

  matUtemp = matU
  # ...
  for (o in 1:nSteps) {
    survivorship[o,] = colSums(matUtemp %*% matU)
    matUtemp = matUtemp %*% matU
  }

should instead be

  matUtemp = matU
  # ...
  for (o in 1:nSteps) {
    survivorship[o,] = colSums(matUtemp)
    matUtemp = matUtemp %*% matU
  }

As written, the code effectively skips over the first age class, for which survivorship should be colSums(matU).

Caswell's Matlab code for lx is (Caswell 2001, pg 120):

  for x = 1:150
    surv(x,:) = sum(T^x);

I think the current R code is effectively doing this:

  for x = 1:150
    surv(x,:) = sum(T^(x+1));

Applies to `makeLifeTable` and `dEntropy`

2. In the loops to calculate age-specific fertility (fertMatrix) and clonality (clonMatrix), there are a few potential issues (these only seem to cause problems when startLife > 1):

diag(t(e) %*% matUtemp2) is supposed to create a square matrix with vector t(e) %*% matUtemp2 along the diagonal, but because t(e) %*% matUtemp2 returns a 1-row matrix, diag(...) instead pulls out the single 'diagonal' element of that matrix, [1,1]. This could be fixed using as.numeric(), or by using colSums(matUtemp2) instead of the matrix multiplication.
The single instance of scalar multiplication should be changed to matrix multiplication (Caswell 2001 pg. 120).
The matrix inverse fails anytime a column within matU contains all zeros (even if it's only the last column). That whole operation is really just dividing columns of matUtemp2 by their sum, which can be coded without the matrix inverse (Caswell 2001 pg. 123).

Applies only to `makeLifeTable`

3. In the calculation of Lx, I think the substract operation should instead be addition, to get the average survivorship between age x and x+1. L[x] = (l[x] + l[x+1]) / 2

4. I think the calculation for Tx should be sum(L[x]:L[maxAge]), instead of sum(L[1]:L[x]) (cumsum is doing the latter).

Clarify/fix vitalRatePerturbation

vitalRatePerturbation doesn't work the way I'd expect. If you pass it an age-structured model (i.e. matU is a Leslie matrix with survival probabilities along the subdiagonal), it will return non-zero sensitivities for growth, e.g.

matU <- rbind(c(0.0, 0.0, 0.0),
              c(0.3, 0.0, 0.0),
              c(0.0, 0.5, 0.0))
  
matF <- rbind(c(0.0, 0.0, 0.8),
              c(0.0, 0.0, 0.0),
              c(0.0, 0.0, 0.0))

vitalRatePerturbation(matU, matF)
> # SSurvival    SGrowth  SShrinkage  SReproduction  SClonality ...
> # 0.8760486  0.8760486           0      0.2054321           0 ...

I would've expected the above model to be decomposed into something with no vital rates for growth, like

matA <- rbind(c( 0,  0, f),   # edited... wrote this as a 4x4 initially
              c(s1,  0, 0),
              c( 0, s2, 0))

I've looked at some stage-structured models too, and compared results to popbio::vitalsens. In general the sensitivities to survival produced by vitalRatePerturbation correspond to what I'd expect, but sensitivities to the other vital rates do not.

Is there a ref for this method @robcito? I've looked through Caswell 2001, but haven't been able to figure out a method for what I think we want to do here (at least not a general one).

Pace & Shape

Hi all

I'm currently working on my fork adding Pace & Shape measurements to Rage that Annette Baudisch and I have been developing; if we want, these can eventually be incorporated into the package itself (alongside other more standard ones we'll either have or would make sense, e.g. life expectancy, Keyfitz Delta, Gini).

For the new measures (which [humble brag] if I say so myself are more generalisable and comparable than many other shape measures), I need to calculate area under a curve. The only function I know of (MESS::auc) has a lot of dependencies at 10 or more. I wanna either ask whether people are OK with this, or whether they know of a lighter option (or want to press me to find one), if indeed these measures would be useful in the package eventually.

Rearrange demogstat assignment in perturbation functions

In the initial assignment of the demogstat variable in matrixElementPerturbation and vitalRatePerturbation, I think the if (demogstat == "lambda") statement should come before the if (is.character(demogstat)) statement. Otherwise the function will fail if demogstat == 'lambda'.

Specify MASS::ginv, instead of simply ginv in dentropy and kentropy.

MASS::ginv, would be better than just ginv.

Add lxCrit argument to mpm_to_ functions

The mpm_to_ fns (and makeLifeTable) use arguments N/nSteps to specify the number of age classes to calculate over. If a user wants to apply these to many MPMs (e.g. with COMPADRE), they'll probably just set N to a large constant (e.g. 1e5). This approach can be computationally time-consuming, and is unnecessary for the many species whose lx trajectories will approach 0 long before N.

Therefore, suggest adding an argument lxCrit to these fns (default 1e-4, i.e. 1 in 10,000), and editing such that lx values are calculated until the age class reaches N OR lx drops below lxCrit — whichever occurs first.

On the 150-row Compadre subset, this change reduces the computational time on my computer from 45s to 1s (given N = 1e5).

If the user wants the fn to ALWAYS calculate to age class N, they can simply set lxCrit = 0.

Modify R0 to always calculate per-generation growth rate

In Matrix Population Models, Caswell discusses methods for calculating R0 in age- vs. stage-classified models (s. 5.3.4). The continuous age-classified version (integral over lx * mx) gives the per-generation population growth rate, which corresponds qualitatively with lambda in terms of being < 1, = 1, or > 1.

To calculate R0 with the same properties from a stage-classified model, Caswell (p. 128) recommends the calculation:

R <- matF %*% N  # where N is the fundamental matrix
R0 <- popbio::lambda(R)  # i.e. dominant eigenvalue of R

whereas our current calculation is:

R <- matF %*% N
R0 <- R[startLife, startLife]

The two are the same if offspring only arise in stage startLife, but otherwise differ (only the former is affected by the production of offspring in any stage other than startLife).

I suggest we update R0 to use the eigenvalue method.

Add hazard rate outputs

A recent paper in MEE by Torbjørn Ergon et al argues for a greater focus on hazard rates rather than survival probabilities in population analyses.

I think it would be straightforward to add hazard rate outputs to Rage fns vitalRates, vitalRatePerturbations, and makeLifeTable.

If S is a stage-specific survival probability, the corresponding time-averaged hazard is

h = -log(S)

and the sensitivity of S with respect to h is

dS_dh = -exp(-h)

Dealing with missing stages in collapseMatrix and standardizeMatrix

If one or more standardized stages (e.g. pre-reproductive) is not present in the MPM passed to standardizeMatrix, the collapse list returned by reprodStages will contain one or more NA. In this case, collapseMatrix will return rows/columns of 0 for the missing stage(s), and then the code at the end of standardizeMatrix will convert some of these 0s to NA.

But it only converts the stasis elements to NA, rather than the entire row/column of the missing stage.

Suggest moving this NA conversion to collapseMatrix to ensure consistent behaviour regardless of whether collapseMatrix is called directly or indirectly within standardizeMatrix, and expanding the NA to the entire missing row/column.

Problem with startLife > 1 in qsdConverge

I think there's an issue with using startLife > 1 in qsdConverge. Because the vast majority of matU are reducible (i.e. b/c there are no repro transitions to 'complete the cycle'), convergence to the quasi-stable distribution may depend on the initial population vector, which is a function of startLife. E.g. in the matrix below, because the stable distribution of the 1st stage is >> 0, and there is no retrogression to stage 1, we would never get convergence with any startLife > 1.

matU <- rbind(c(0.5, 0.0, 0.0, 0.0),
              c(0.1, 0.3, 0.1, 0.0),
              c(0.0, 0.1, 0.2, 0.1),
              c(0.0, 0.0, 0.1, 0.3))

popbio::stable.stage(matU)
## 0.5000 0.3125 0.1250 0.0625

popdemo::isErgodic(matU)
## FALSE

I'm not sure how to deal with this. Checking isErgodic(matU) could help, but there are non-ergodic matU that still do converge to the QSD (for some startLife), and a few ergodic but imprimitive matU that don't converge.

Use popbio to calculate lambda and ssd

Since we're already importing popbio, I suggest we consistently use popbio::lambda and popbio::stable.stage at relevant points within our functions, primarily for ease of code readability, and also for consistency. By my count, the relevant functions include:

collapseMatrix
qsdConverge
matrixElementPerturbation
vitalRatePerturbation
vitalRates

Break lifeTimeRepEvents into constituent functions

The function lifeTimeRepEvents currently produces a list object with up to 8 elements.
I think this should be simplified by breaking apart the function into its constituent functions.

i.e. there can still be a function called lifeTimeRepEvents that produces such a list, but it would be useful to have separate functions which are called by lifeTimeRepEvents .

This will solve some problems including (1) users only wanting a single measure, but being forced to calculate 8 (which could be time consuming with big datasets); (2) the function will fail if one of the 8 calculations produces an error; (3) it will become easier to debug where errors are produced with separate functions.

Ensure that matrix tests are consistent within each function.

Each function should begin by checking whether the matrices that are used pass required tests (e.g. isErgodic, isPrimitive etc. from popdemo).
The error/warning messages produced should be consistent accross all functions.
Perhaps also suggesting that the matrices be cleaned using Rcompadre::cleandb

Summary document for 'special cases'

I think it would be useful, both for us and for users, to have a document summarizing how each Rage function deals with a number of 'special cases' of MPMs (e.g. NAs, all zeros, SurvivalIssue, matU singular).

Not sure what form this should ultimately take / where it should live, but I've created a Google sheet to get us started. It should be editable by anyone with the link, so feel free to add/update/edit.

Specifying critical lx in longevity function

The maximum longevity component of longevity finds the age when survivorship (lx) falls to some critical proportion (call this lx_crit), which is 0.01 with the default setting of initPop = 100. Specifically, the population starts with initPop individuals in stage class startLife, and is projected through time until there is only 1 individual remaining (call this n_crit, which is fixed at 1).

I think it's potentially confusing that the user effectively adjusts lx_crit by changing initPop (i.e. lx_crit = n_crit / initPop = 1 / initPop).

As far as I can tell, the result depends only on the ratio, not the absolute values of initPop and n_crit. So I think it would be more intuitive if initPop was fixed at 1 and removed as an argument, and the user instead specified a new argument lx_crit (constrained to the interval [0,1], default = 0.01).

Alternatively, we could keep the function as is, and just improve the documentation.

(Also, related to #15 , I believe the loop within longevity is mathematically equivalent to the survivorship loop within makeLifeTable, dEntropy, and kEntropy, so some of this duplicated code could be pulled out into a common function.)

Remove standardizedVitalRates

standardizedVitalRates calls four other Rage functions in sequence: rearrangeMatrix, reprodStages, collapseMatrix, then vitalRates. The first 3 convert an MPM to a standardized format, and the 4th extracts averaged vital rates.

I see no reason to group these 4 components into a single function:

The main reason is that each component function has it's own set of optional arguments (e.g. reprodStages has different options for dealing with propagule and post-reproductive stages, vitalRates has different options for how to split and weight vital rates). standardizedVitalRates doesn't currently support any of these — it simply uses the defaults of the component functions. Adding this support would make standardizedVitalRates overly complicated, in my opinion.
Second, the four component functions may each fail for different reasons. Combining them in a single function makes debugging more difficult.
Third, a user that wants to extract vital rates from a standardized matrix is likely to want to calculate other things from that standardized matrix as well. But the standardized matrix is not returned by standardizedVitalRates.

Therefore, I suggest we remove standardizedVitalRates and encourage users to use the four component functions separately (e.g. with an example on the website or vignette). We could alternatively bundle the first 3 component functions into a new function (e.g. standardizeMatrix), which I think would be more useful to users, and not quite as complicated.

Remove 'maxRep' return from rearrangeMatrix

I think the maxRep element returned by rearrangeMatrix is unnecessary, as it's a simple function of another returned element, reproStages. Suggest removing.

entropy_k shouldn't be used with partial lx trajectories

entropy_k (formerly kEntropy) doesn't work as I'd expect with partial survivorship trajectories. For a given hazard function (declining, constant, or increasing) the value of entropy returned depends on how many age classes are used.

For example, the constant-hazard lx trajectory 0.9^(0:10) yields an entropy of 0.434 (indicating an increasing mortality hazard) whereas the lx trajectory 0.9^(0:100) yields an entropy of 0.998 (indicating a constant hazard). (Recall entropy < 1 indicates senescence, and entropy > 1 indicates negative senescence.)

x1 <- 0:10
x2 <- 0:100

# with 10 age classes, suggests positive senescence
lx1 <- 0.9^x1  # true lx trajectory has constant hazard
entropy_k(lx1, trapeze = TRUE)
> [1] 0.4341153

# with 100 age classes, suggests negligible senescence
lx2 <- 0.9^x2  # true lx trajectory has constant hazard
entropy_k(lx2, trapeze = TRUE)
> [1] 0.9978724

There are even cases where, for a given rate of change in the mortality hazard, using a different number of age classes qualitatively changes the interpretation regarding positive vs. negative senescence.

 # with 10 age classes, suggests positive senescence
hx3 <- exp(-1 - 0.2 * x1) # true hazard function is declining
lx3 <- hx_to_lx(hx3)      # survivorship
entropy_k(lx3, trapeze = TRUE)
> [1] 0.8645491

 # with 100 age classes, suggests negative senescence
hx4 <- exp(-1 - 0.2 * x2) # true hazard function is declining
lx4 <- hx_to_lx(hx4)      # survivorship
entropy_k(lx4)
> [1] 1.692954

At the least, I think we should include a strong warning in the documentation about this behaviour (and direct users to shape_surv, which does handle partial lx trajectories). On the other hand, users working with MPMs will usually be working with partial survivorship trajectories (e.g. truncated at the quasi-stable distribution), so perhaps we should just remove entropy_k from Rage altogether?

Clarify/simplify arguments in vitalRates

The splitStages argument can be either "all", "ontogeny", or a vector of standardized matrix classes — e.g. c("prop", "active", "active", "dormant").

I suggest we restructure so that this third option is achieved by setting splitStages = "matrixStages" and specifying the vector of matrixStages in a separate argument matrixStages.

I think this would make splitStage somewhat easier to document and validate, and better matches our use of matrixStages arguments in other functions.

The weighted argument can be either FALSE, "SSD", or a vector of stage-specific weights.

I suggest we change "weighted" to "weights", which corresponds slightly better with the available options, and use a default value of NULL instead of FALSE for consistency with most other Rage functions (i.e. NULL corresponds to equal weighting of all stage classes).

Finally, suggest moving the documentation of splitStages and weighted from the @return section to the respective @param sections.

Bugs in vitalRates

(1) The splitStages == "all" block takes a sum of weighted vital rates whereas the "ontogeny" and "matrixStages" blocks take a mean. Because the stage-specific vital rates have already been weighted by this point (with weights that sum to 1), I think all blocks should use sum to obtain a weighted average.

(2) Somewhat separately, I think the weighted sums need a proper divisor (this is related to the issue of structural zeros vs sampled zeros).

The current method sum(weights * vital_rates) / sum(weights) works if the vector of vital_rates is nonzero for all stage classes, but otherwise we should divide by the sum of the weights for the relevant (i.e. nonzero) stage classes rather than all stage classes.

E.g.

matF <- rbind(c(0, 0, 5),
              c(0, 0, 0),
              c(0, 0, 0))

weights <- c(1, 1, 1)
weights <- weights / sum(weights)

fecund <- colSums(matF)
fecund_weighted <- weights * fecund

# divide by sum of all weights (current method)
# divisor of sum(weights) is implicit
sum(fecund_weighted) # 1.67

# divide by sum of relevant weights
i <- fecund > 0
sum(fecund_weighted[i]) / sum(weights[i]) # 5.00

allow functions to accept CompadreM class

Functions that currently require matU, matF, ... arguments should also accept objects of class CompadreM, automatically parsing the matrix components or failing (with a helpful error) when a necessary component is missing.

Maybe best to write function generics with methods for the manual entry vs. com(p)adre database objects?

standardizedVitalRates causes build error.

Please could you check the example can be run without error.
Currently produces -> Error: Expecting matrices with equal dimensions

Remove 'dist == 0' check from qsdConverge

The qsdConverge fn checks for convergence by assessing whether min(dist) >= conv (where conv is the convergence threshold, e.g. 5%). If convergence is not reached within the allotted time, the fn returns NA and a warning.

The fn also returns NA and the same warning if sum(dist) == 0, which I guess indicates that the population starts out exactly at the stable distribution. This latter scenario doesn't seem like a problem to me... the fn would otherwise just return 1 if the population starts out at the stable distribution, which seems reasonable. Suggest removing this check.

camel that shit

dentropy -> dEntropy
kentropy -> kEntropy

plotLifeCycle order

I've copied this over from jonesor/compadreDB#21 (comment) where @RobSalGo first posted this issue.

I think that the stages in the function plotLifeCycle get automatically re-ordered to alphabetical, unless they are given a number before them (which is not very sensible). This makes the reading of the lifecycle from left-to-right a bit less straight forward. It'd be great if we could let the authors choose the order in which the stages appear

Conscious uncoupling / grouped naming-scheme

A few Rage functions (primarily vitalRates, matrixElementPerturbation, and vitalRatePerturbation) involve a lot of intermediate steps, and have complex outputs. I suggest breaking these down into simpler functions to allow users more flexibility (following #20).

Relatedly, I think we should consider transitioning to a naming scheme where groups of similar functions have the same prefix, so that they're easier for users to locate. E.g. Instead of matrixElementPerturbation and vitalRatePerturbation, we could use perturb_transitions and perturb_vitals. The vital rate functions I propose below could all begin with vitals_ or vr_, and @iainmstott's shape functions with shape_.

Break apart vitalRates

I suggest three layers of functions:

(1.) matrix-returning functions (element-specific vital rates of growth, shrinkage, stasis, and reproduction, all conditional on survival)

vr_mat_U and vr_mat_R

(2.) vector-returning functions (stage-specific vital rate sums)

vr_vec_survival, vr_vec_growth, vr_vec_shrinkage, vr_vec_stasis, vr_vec_reproduction, vr_vec_dormancy

(3.) scalar-returning functions (average vital rates across all stages, possibly weighted)

vr_survival, vr_growth, vr_shrinkage, vr_stasis, vr_reproduction, vr_dormancy

Each of these should allow the user to specify the set of possible transitions so that we can distinguish between structural zeros and sampled zeros, and to specify stages that should be excluded from calculation (e.g. dormant stages).

Break apart perturbation functions

I haven't thought as much about these, but we could start with pulling the nested loop that's common to both functions into a separate function (i.e. given matA return a sensitivity matrix based either on lambda or a user-supplied demogstat argument).

I think we should also avoid returning both sensitivities and elasticities in a single functional call. The type of output could be specified with an additional argument type, or perhaps we could use separate sensitivity_ and elasticity_ functions.

Change package name to something more googleable

The Rcompadre repo is the first hit if you google "Rcompadre".

The Rage repo doesn't make page 1 with any of the following (of course search results are personalized, so may vary):

"Rage"
"Rage R"
"Rage library"
"Rage R package"
"Rage GitHub" (!!!)

We'll eventually grow in popularity, but I don't think we'd ever be a top hit for just "Rage". I don't have any better ideas at the moment. Just throwing it out there.

Handling row/column names from input matrices

Outputs from some Rage functions maintain row/column names from the input MPM (matU/matF/matC), where applicable, but others do not. E.g. If matU has colnames, the output of life_expect will retain the colname for stage startLife, whereas the output from longevity will not.

Suggest standardizing such that:

scalar outputs are always unnamed
stage-specific vector/matrix outputs retain stage names from input matrices, where applicable, except for functions that collapse stage classes (e.g. mpm_collapse)

Should Rage have a narrower focus?

It's grown quite diverse. Many of the fns concern age-related analyses (the stated purpose), but others are not really related to ageing (e.g. perturb_ fns, vr_ fns, plotLifeCycle).

Should we move some of the non-ageing fns to a separate package? I don't have a strong opinion yet, but think it's worth exploring.

As a related side note, plotLifeCycle has an extensive dependency chain (due to DiagrammR), that has caused some minor CI issues.

Change methods::getFunction to base::match.fun in perturbation fns

In the perturbation functions, passing the name of a function I've created as a character string to the demogstat argument doesn't work (except 'lambda'), I think because methods::getFunction doesn't search the parent environment by default. I suggest switching to base::match.fun, which does the same thing as getFunction but searches the parent environment. Switching has the added benefit that we no longer need to import methods.

Remove mature_life_expect

The function mature_life_expect (taken from lifeTimeRepEvents) calculates the difference between life expectancy from stage 'start', and mean age at reproductive maturity (also from stage 'start').

It's described as 'remaining life expectancy at maturity', but I think this description is likely to be confused with 'life expectancy at reproductive maturity' (i.e. life expectancy conditional on achieving reproductive maturity). It's the latter trait that I think most users would be interested in. What mature_life_expect currently calculates would more accurately be described as 'life expectancy from birth that's remaining at reproductive maturity'.

Suggest deleting mature_life_expect from Rage, and adding 'from first reproduction' functionality to life_expect, as in #53.

Update documentation for collapseMatrix

(collapseMatrix is scheduled to be moved from Rcompadre to Rage, as per jonesor/Rcompadre#28)

The documentation for collapseMatrix contains the note:

Note: this function is only valid for models without clonality.

I think collapseMatrix is perfectly valid for models with clonality. The fn was originally written without a matC option but has since been updated, so I think the note can be removed (unless anyone objects?).

I think we should, however, add a note to the documentation to clarify that collapseMatrix is only gauranteed to preserve λ and the relative stable distribution, but is not expected to preserve relative reproductive values, sensitivities, net reproductive rates, life expectancy, etc.

Simplifying functions that can take matF, matC, or matF+matC

In dEntropy, R0, and lifeTimeRepEvents (and possibly makeLifeTable), I think the conditional structure of the output is overly complicated.

The user may pass:

matF only, or
matC only, or
matF and matC

and the number of items returned depends on which arguments get passed, as do the names of the returned list elements.

In my opinion this conditional behaviour is a pain to document, potentially difficult for new users to understand, and makes the functions harder to program with.

One option for simplification is to only have a single argument for the reproductive matrix (e.g. matR), to which the user can pass whatever they desire... i.e. one of matF, matC, or something like matFC (= matF + matC).

In this case, I think we would also need to make it easier for the user to derive matFC as a separate slot/colum within db (I think @levisc8's proposal in jonesor/Rcompadre#27 to merge the mat and metadata slots into a single data slot goes a long way here).

Vignette

We need a vignette that guides users through a workflow of using the package.
This can draw on materials used for COMPADRE workshops that are in our compadreDB repository.

More than one Vignette is OK, if necessary.

Remove 'post' argument from reprodStages

I think the post argument in reprodStages — an integer indicator for post-reproductive stages — is unnecessary.

There is a separate argument reproStages (a logical vector) from which we could just as easily identify post-reproductive stages.

(The use of reprodStages within standardizeVitalRates would have to be updated accordingly, but the functionality would remain the same).

Add author for reprodStages.R and standardizedvitalrates.R

Warning: @author [reprodStages.R#24]: requires a value
Warning: @author [standardizedvitalrates.R#9]: requires a value

Pre- vs post-breeding census issues

I still have trouble with this distinction (apparently I'm in good company).

I think the Rage functions that involve vital rates of fecundity (vr_ fns, perturb_vr, and possibly mpm_to_mx; using devel branch names here) effectively assume a post-breeding census design. I'm not sure whether it's possible to add pre-breeding census functionality for these, since in a pre-breeding model the newborn survival rate only appears in the fecundity transition — so the birth rate and newborn survival rate are not independently estimable.

I can't decide whether this problem also applies to mpm_to_mx (and therefore makeLifeTable), or whether mx trajectories from pre- and post-breeding MPMs are comparable.

Not sure how to resolve this. For now I guess we can just add some documentation to the relevant functions.

Standardize argument names

Arguments 'N' (qsd_converge), 'nSteps' (entropy_ fns), 'maxAge' (longevity), and 'xmax' (shape_ fns), all refer to the maximum age class to which some quantity should be calculated. Suggest standardizing all to 'xmax'.

Argument 'startLife' is a bit long. Suggest shortening to 'start'.

I think the 'surv' and 'rep' arguments of the shape_ functions could be changed to 'lx' and 'mx', respectively. Arg 'lx' is already used in the lx_to_ functions, and could also be used in the entropy_ functions per #58.

The remaining arguments include a mix of snake_case (e.g. 'exclude_col'), camelCase (e.g. 'reproStages'), and smushedform (e.g. 'demogstat'). Suggest standardizing, though I'm not sure how. I lean slightly toward snake_case (except for matU/matF/matC), which rOpenSci recommends for both function and argument names. But I don't feel too strongly.

Selection of current argument names:

vr_ fns: 'exclude_col', 'weights_col', 'surv_only_na', 'dorm_stages'
mpm_ fns: 'reproStages', 'matrixStages'
various fns: 'lxCrit'
perturb_ fns: 'demogstat'

Allow collapseMatrix to collapse non-adjacent stages & remove rearrangeMatrix

collapseMatrix is currently written in a way that prevents a user from collapsing non-adjacent stages. Specifically, the collapse argument is a character vector whose elements must reflect single stages or sequences of adjacent stages, e.g.

collapse <- c('1', '2-4', '5')

I suggest we change collapse to a list, to allow the user to collapse non-adjacent stages, e.g.

collapse <- list(1, 2:4, 5)           # same as above
collapse <- list(1, c(2, 4), c(3, 5)) # collapse non-adjacent stages (e.g. stage 3 inter-reprod.)

I think this would make rearrangeMatrix unnecessary, because it's only function is to move any inter-reproductive stages to the right side of a matrix, to segregate reproductive and post/non-reproductive stages ahead of collapsing.

If we remove rearrangeMatrix, reprodStages can be updated to identify inter-reproductive stages (it already takes the required arguments, i.e. reproStages).

Bug in mature_age and mature_distrib

In life cycles lacking any connection between stage 'start' and the reproductive stage(s), or life cycles lacking reproductive stages altogether, the mature_age function in the devel branch (taken from lifeTimeRepEvents) incorrectly returns a value of 1. E.g.

# life cycle with no connection from stage 1 to the repro stage (3)
matU <- rbind(c(0.0, 0.0, 0.0),
              c(0.1, 0.1, 0.0),
              c(0.0, 0.0, 0.0))

matF <- rbind(c(0.0, 0.0, 4.2),
              c(0.0, 0.0, 0.0),
              c(0.0, 0.0, 0.0))

Rage::mature_age(matU, matF, start = 1)
>> [1] 1

Suggest updating to return NA in these cases.

The function mature_distrib has a similar issue, though only when there is a single reproductive stage class (with no connection from stage 'start').

Split longevity function in two?

The longevity function returns two components:

life expectancy and variance therein, estimated based on a Markov chain approach (from the fundamental matrix)
maximum longevity, estimated via population projection (requires a initial population vector)

These two components seem sufficiently different to me that it might be worthwhile to split them into separate functions, or perhaps add an argument to the existing function to allow the user to select which component(s) to return.

Also, maximum longevity takes ~10x longer to calculate (on my computer: 5.5s vs 0.5s for the whole COMPADRE db), so it would be nice to exclude this extra computation time if the user only wants life expectancy.

Use human-readable matrices in documentation examples

I propose using a consistent, human-readable format when creating matrices in the @example section of Rage functions, e.g.

Format 1

matU <- rbind(c(  0,   0,    0,   0),
              c(0.1, 0.1,  0.1,   0),
              c(  0, 0.3,  0.3, 0.1),
              c(  0,   0,  0.2, 0.1))

Format 2

matU <- rbind(c(0.0, 0.0,  0.0, 0.0),
              c(0.1, 0.1,  0.1, 0.0),
              c(0.0, 0.3,  0.3, 0.1),
              c(0.0, 0.0,  0.2, 0.1))

I think Format 1 is slightly easier to read in the R documentation browser. Thoughts?

Change arguments in entropy fns from matU/matR to lx/mx

The entropy functions perform calculations on lx/mx trajectories, which the functions first derive from the relevant MPM using mpm_to_lx()/mpm_to_mx().

Suggest making entropy fns act directly on lx/mx trajectories instead of matU/matR, to correspond to the shape_ fns (which act directly on lx/mx trajectories). I think the 2-step workflow is more transparent (i.e. the age-from-stage methods used to derive lx/mx are conceptually independent from the entropy calculation), and allows for alternative age-from-stage methods for deriving lx/mx from an MPM (e.g. #53).

Make descriptions of matrices (matU, matF, matC and matA) consistent across documentation

Descriptions of matrices should be made consistent across all functions.

Uli's functions

Develop functions to include various life history traits from Uli's PNAS paper @jonesor

Change makeLifeTable so that it returns a list object that includes a life table.

makeLifeTable currently returns a list object with numerous vectors.
Most of these are the constituent columns of a life table, with the addition of R0Fec and something called TcFec.
I think that the object returned should be a list object with a data.frame (the life table) and the other two statistics (R0...).

In addition, the description (Rd file) should be updated - it is not correct as it stands.

Add 'from first reproduction' functionality

For analyses related to senescence, I generally want to estimate age-related traits starting from the age of first reproduction. Current Rage functions work great when there is a single obvious stage of first reproduction, but need modification when a life cycle is such that there is more than one possibility.

About 131/422 MPMs in my COMPADRE 'working subset' have multiple possibilities for the stage of first repro. E.g. In the following life cycle (assuming a startLife of either 1 or 2), individuals might first reproduce either in stage 3 or 4 — stage 4 is twice as likely to be the stage of first reproduction.

matU <- rbind(c(0.0, 0.0, 0.0, 0.0),
              c(0.1, 0.1, 0.0, 0.0),
              c(0.0, 0.1, 0.6, 0.2),
              c(0.0, 0.2, 0.1, 0.1))

matF <- rbind(c(0.0, 0.0, 4.2, 5.1),
              c(0.0, 0.0, 0.0, 0.0),
              c(0.0, 0.0, 0.0, 0.0),
              c(0.0, 0.0, 0.0, 0.0))

Because survivorship from first reproduction varies depending on which stage the first repro occurs in, we'd want to integrate over both possibilities relative to the probability that an individual first reproduces in the given stage.

Suggest working toward this functionality (possibly with new, separate functions) for calculating lx/hx/px, life expectancy, longevity, and time to qsd — all from first reproduction.

jonesor / rage Goto Github PK

rage's People

Contributors

Stargazers

Watchers

Forkers

rage's Issues

Applies to makeLifeTable, dEntropy, and kEntropy

Applies to makeLifeTable and dEntropy

Applies only to makeLifeTable

Recommend Projects

Recommend Topics

Recommend Org

Jobs

Applies to `makeLifeTable`, `dEntropy`, and `kEntropy`

Applies to `makeLifeTable` and `dEntropy`

Applies only to `makeLifeTable`