mages / chainladder

Claims reserving models in R

Home Page: https://mages.github.io/ChainLadder/

R 92.65% TeX 7.35%

chainladder's Issues

Interest for a method to extract the diagonal?

When working with the package, I'd find it useful to be able to easily extract the SW-NE diagonal of a triangle, the same way base::diag extracts the NW-SE diagonal of a matrix.

Would this be a welcome addition to the package? I'll gladly contribute the patch.
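
A minimal sketch of what such a helper could look like in base R. The name `antidiag` is hypothetical (it is not an existing base or ChainLadder function), and the sketch assumes a square matrix.

```r
# Hypothetical helper: extract the SW-NE (anti-)diagonal of a square matrix.
antidiag <- function(m) {
  stopifnot(is.matrix(m), nrow(m) == ncol(m))
  n <- nrow(m)
  # Pair row n with column 1, row n-1 with column 2, ..., row 1 with column n
  m[cbind(n:1, 1:n)]
}

m <- matrix(1:9, nrow = 3)
antidiag(m)  # 3 5 7
diag(m)      # 1 5 9, the NW-SE diagonal for comparison
```

For a square run-off triangle, the SW-NE diagonal read this way is the latest observed diagonal, starting from the youngest origin period.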

CRAN2016 Release

It's that time again, eh Markus? Thanks to github/ChainLadder/Compare, the current NEWS holds all the changes to the repository since CRAN2015 (August). I think this is an opportunity for contributors to brag a little bit. Let's put together a short vignette that demonstrates the reasons for the changes with use cases. I can volunteer to start the RMarkdown file, or someone else can if they get to it within the next couple of days. Then when we fork and add material, hopefully git will keep it all straight. (This wouldn't be the first time I misunderstood how git works, so if I'm wrong, correct me! :-) ) It would be nice to make this available with the next CL release. Markus, what do you think about targeting CRAN2016 for August if there's no broad interest, or September if folks are interested? To take part, simply comment on this issue.

Cashflow projection

Hi,

Is there a built-in method to calculate the cashflow projection from the completed triangle? Currently I am running the following code, which might be useful for others as well:

# mat: completed square triangle (observed and projected cells)
triang2cashflow <- function(mat) {
  cf <- rep(0, nrow(mat) - 1)

  for (i in seq_along(cf)) {
    # Cells on the i-th future diagonal: rows run upwards, columns run right
    idx_row <- nrow(mat):(1 + i)
    idx_col <- (1 + i):nrow(mat)
    tmp <- 0
    for (j in seq_along(idx_col)) {
      tmp <- tmp + mat[idx_row[j], idx_col[j]]
    }
    cf[i] <- tmp
  }

  cf
}
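
The same sums can be written without explicit loops. The sketch below is one possible vectorised variant, not a ChainLadder function; the name `cashflow_by_diagonal` is made up for illustration, and it assumes a square, fully completed matrix of incremental amounts.

```r
# Hypothetical vectorised variant: sum the cells of each future diagonal.
cashflow_by_diagonal <- function(mat) {
  stopifnot(is.matrix(mat), nrow(mat) == ncol(mat))
  n <- nrow(mat)
  diag_index <- row(mat) + col(mat)   # equals n + 1 on the latest observed diagonal
  future <- diag_index > n + 1        # cells below the latest diagonal
  as.vector(tapply(mat[future], diag_index[future], sum))
}

mat <- matrix(1:9, nrow = 3)
cashflow_by_diagonal(mat)  # 14 9
```

Each element of the result corresponds to one future calendar period, in the same order as the loop version above.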

Plot of ClarkLDF not showing p value

When plotting ClarkLDF the Normal Q-Q Plot is showing "Shapiro-Wilk p. value = 0."

Is there a way to show the rest of the p value?
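
Until the plot label changes, the test can be rerun by hand and the p-value formatted with more digits. This is only a sketch: `x` below is simulated stand-in data, and which element of the ClarkLDF object holds the standardized residuals is not assumed here.

```r
# Stand-in residuals; replace x with the standardized residuals of the fit
set.seed(1)
x <- rexp(200)

p <- shapiro.test(x)$p.value
format.pval(p, digits = 3)   # formatted string; very small values print as "<..."
signif(p, 3)                 # numeric value with three significant digits
```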

MackChainLadder error with multiple rows at same age

When two origin periods are at the same age, the statistics are calculated but the recursive generation at future ages fails due to incorrect looping logic. Here is a 3x3 example from GenIns:

G <- GenIns[8:10, 1:3]
summary(MackChainLadder(G, est.sigma = "Mack"))$ByOrigin
    Latest Dev.To.Date Ultimate    IBNR Mack.S.E  CV(IBNR)
 8 2864498   1.0000000  2864498       0      0.0       NaN
 9 1363294   0.4961176  2747925 1384631 234192.7 0.1691373
10  344014   0.1311672  2622713 2278699 305432.5 0.1340381

Now duplicate the last row:

G <- rbind(G, "11" = G["10", ])
summary(MackChainLadder(G, est.sigma = "Mack"))$ByOrigin
    Latest Dev.To.Date Ultimate    IBNR Mack.S.E   CV(IBNR)
 8 2864498   1.0000000  2864498       0      0.0        NaN
 9 1363294   0.4961176  2747925 1384631      0.0 0.00000000
10  344014   0.1311672  2622713 2278699 226228.3 0.09927961
11  344014   0.1311672  2622713 2278699 305432.5 0.13403811

Origin year 9 loses its standard error and origin years 10 and 11 should be the same.

The ability to handle rows at the same age is important when analyzing origins broken down into more detail. I will work on a solution that incorporates ChainLadder's getLatestCumulative function.
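
The idea of taking each row's latest observed value, whatever its age, can be sketched in base R as follows. This is an illustration only, not the package's implementation; the helper name `latest_observed` and the toy numbers are made up.

```r
# Hypothetical helper: last non-NA value in each row, regardless of age
latest_observed <- function(tri) {
  apply(tri, 1, function(r) {
    obs <- r[!is.na(r)]
    if (length(obs) > 0) obs[length(obs)] else NA_real_
  })
}

# Toy triangle in which origins "10" and "11" are at the same age
tri <- rbind("8"  = c(100, 150, 165),
             "9"  = c(110, 170,  NA),
             "10" = c(120,  NA,  NA),
             "11" = c(120,  NA,  NA))
latest_observed(tri)
```

Because each row is handled independently, two origins at the same age simply return the value in the same column.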

Mack model

When I plot the chain-ladder developments by origin period, I get a warning: "In expand.grid(origin = as.numeric(dimnames(.FullTriangle)$origin), : NAs introduced by coercion". Any suggestions for resolving it?

filemock3 <-file.choose()
mock3 <- read.csv(filemock3, header = FALSE)
mock3 <- as.triangle(as.matrix(mock3))
mock3
      dev
origin     V1     V2     V3     V4     V5     V6     V7     V8     V9    V10    V11
     1    573   1392   1736   2035   2259   2360   2349   2311   2324   2324   2324
     2    710   1600   1889   2026   2077   2093   2105   2105   2107   2110     NA
     3    919   1947   2237   2372   2458   2460   2464   2463   2463     NA     NA
     4   1459   3488   4325   4623   4717   4728   4734   4738     NA     NA     NA
     5   2145   6824   8875   9777  10093  10128  10131     NA     NA     NA     NA
     6   3976  11049  13727  14736  14960  15077     NA     NA     NA     NA     NA
     7   6747  15757  18463  19713  20110     NA     NA     NA     NA     NA     NA
     8   6019  13139  15679  16952     NA     NA     NA     NA     NA     NA     NA
     9   5360  12320  14707     NA     NA     NA     NA     NA     NA     NA     NA
    10   6012  12641     NA     NA     NA     NA     NA     NA     NA     NA     NA
    11   6603     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA
mock3mack <- MackChainLadder(mock3, est.sigma = "Mack")
mock3mack
MackChainLadder(Triangle = mock3, est.sigma = "Mack")

   Latest Dev.To.Date Ultimate      IBNR  Mack.S.E CV(IBNR)
 1  2,324       1.000    2,324      0.00     0.000      NaN
 2  2,110       1.000    2,110      0.00     0.977      Inf
 3  2,463       0.999    2,465      1.67     3.129    1.876
 4  4,738       0.997    4,752     13.55    13.965    1.031
 5 10,131       1.000   10,129     -1.55    62.252  -40.085
 6 15,077       1.000   15,084      7.38    89.183   12.078
 7 20,110       0.992   20,275    165.02   186.541    1.130
 8 16,952       0.967   17,521    569.46   306.574    0.538
 9 14,707       0.896   16,405  1,698.44   389.187    0.229
10 12,641       0.741   17,050  4,409.25   646.818    0.147
11  6,603       0.314   21,046 14,443.33 2,244.681    0.155

          Totals

Latest: 107,856.00
Dev: 0.84
Ultimate: 129,162.55
IBNR: 21,306.55
Mack.S.E 2,548.62
CV(IBNR): 0.12

plot(mock3mack)
plot(mock3mack, lattice=TRUE)
Warning message:
In expand.grid(origin = as.numeric(dimnames(.FullTriangle)$origin), :
NAs introduced by coercion
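
One thing worth checking, offered as a guess rather than a confirmed diagnosis: read.csv(header = FALSE) names the columns V1, V2, ..., and labels of that kind cannot be converted by as.numeric(). The sketch below uses a tiny inline CSV as a stand-in for the real file and assigns numeric-looking dimnames before the matrix would be passed to as.triangle().

```r
# Stand-in for the CSV file read in the issue (values are the top-left corner)
csv_text <- "573,1392,1736\n710,1600,NA\n919,NA,NA"
m <- as.matrix(read.csv(text = csv_text, header = FALSE))

colnames(m)                                # "V1" "V2" "V3"
suppressWarnings(as.numeric(colnames(m)))  # NA NA NA

# Replace the labels with ones that convert cleanly to numbers
dimnames(m) <- list(origin = seq_len(nrow(m)), dev = seq_len(ncol(m)))
as.numeric(colnames(m))                    # 1 2 3
```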

Odd bootstrap results

When I use the attached data and code to produce bootstrap results, the output seems odd. For example, the IBNR S.E. for origin year 5 (a mature year) is 10% for seed 12 and 17% for seed 99 (these seeds were not cherry-picked, just a couple chosen at random).
Both of these are much higher than the Mack S.E. of 0.08%.
I have been working on a lot of triangles. For many of them bootstrap and Mack give similar results, but for some there are very large differences, particularly in mature years. Sometimes the difference appears only for certain seeds. Sometimes, with the same seed, a slight tweak to the data (a different number of decimal places, or an extra stable year of development on a large triangle) causes the odd bootstrap S.E. to appear or disappear.
Any suggestions? Am I right in thinking that this is likely to be an issue with the package rather than a genuine difference between the Mack and bootstrap methods?

set.seed(12)
B <- BootChainLadder(data, R=1000, process.distr="gamma")
mack <- MackChainLadder(data, est.sigma="Mack")

data.zip

BootChainLadder(Triangle = data, R = 1000, process.distr = "gamma")

   Latest Mean Ultimate Mean IBNR IBNR.S.E  IBNR 75%  IBNR 95%
 1  0.216         0.216  0.00e+00   0.0000  0.00e+00  0.00e+00
 2  0.247         0.247  9.55e-05   0.0125  1.62e-76  1.04e-06
 3  0.251         0.250 -8.37e-04   0.0150  3.70e-21  3.54e-04
 4  0.968         0.966 -2.52e-03   0.0497  1.37e-04  3.59e-02
 5  1.725         1.723 -2.61e-03   0.1033  3.94e-03  6.75e-02
 6  1.952         1.949 -2.81e-03   0.0846  7.29e-03  9.43e-02
 7  1.952         1.944 -7.88e-03   0.1030  8.87e-03  8.66e-02
 8  1.825         1.819 -5.92e-03   0.0965  8.62e-03  8.30e-02
 9  0.894         0.884 -9.46e-03   0.0671  1.15e-03  6.18e-02
10  0.260         0.258 -2.34e-03   0.0295  4.08e-06  1.80e-02
11  0.520         0.517 -2.42e-03   0.0413  5.02e-04  5.20e-02
12  1.532         1.513 -1.98e-02   0.1081  5.44e-03  7.66e-02
13  1.166         1.144 -2.17e-02   0.0928  1.71e-03  7.20e-02
14  0.354         0.350 -3.96e-03   0.0416  5.46e-04  3.70e-02
15  0.623         0.618 -5.40e-03   0.0556  3.61e-03  6.41e-02
16  1.119         1.099 -2.06e-02   0.0914  5.97e-03  8.31e-02
17  0.803         0.775 -2.80e-02   0.0846 -1.42e-04  6.09e-02
18  0.761         0.741 -1.93e-02   0.0798  6.35e-03  8.32e-02
19  0.455         0.451 -4.34e-03   0.0638  1.12e-02  9.51e-02
20  0.426         0.446  1.96e-02   0.0724  4.52e-02  1.49e-01
21  0.449         0.505  5.54e-02   0.0909  9.35e-02  2.11e-01
22  0.691         0.944  2.53e-01   0.1860  3.57e-01  5.99e-01
23  0.112         0.426  3.14e-01   0.2955  4.56e-01  8.38e-01

MackChainLadder(Triangle = data, est.sigma = "Mack")

   Latest Dev.To.Date Ultimate      IBNR Mack.S.E CV(IBNR)
 1  0.216       1.000    0.216  0.00e+00 0.00e+00      NaN
 2  0.247       1.000    0.247 -2.28e-05 2.54e-05   -1.113
 3  0.251       1.000    0.251  3.90e-06 6.64e-05   17.006
 4  0.968       1.001    0.968 -6.36e-04 4.58e-04   -0.721
 5  1.725       1.001    1.724 -9.99e-04 7.62e-04   -0.763
 6  1.952       1.001    1.950 -1.98e-03 1.11e-03   -0.562
 7  1.952       1.004    1.945 -7.04e-03 8.75e-03   -1.244
 8  1.825       1.004    1.817 -7.69e-03 8.45e-03   -1.099
 9  0.894       1.010    0.885 -8.67e-03 8.57e-03   -0.988
10  0.260       1.007    0.258 -1.92e-03 5.33e-03   -2.774
11  0.520       1.009    0.515 -4.40e-03 7.91e-03   -1.799
12  1.532       1.012    1.515 -1.76e-02 1.71e-02   -0.973
13  1.166       1.015    1.149 -1.72e-02 2.04e-02   -1.189
14  0.354       1.010    0.350 -3.61e-03 1.51e-02   -4.189
15  0.623       1.012    0.616 -7.47e-03 2.18e-02   -2.917
16  1.119       1.019    1.099 -2.03e-02 3.85e-02   -1.894
17  0.803       1.035    0.776 -2.70e-02 3.99e-02   -1.475
18  0.761       1.023    0.743 -1.73e-02 4.69e-02   -2.719
19  0.455       1.002    0.454 -8.51e-04 4.52e-02  -53.153
20  0.426       0.950    0.449  2.26e-02 8.38e-02    3.711
21  0.449       0.883    0.509  5.94e-02 1.03e-01    1.732
22  0.691       0.724    0.954  2.63e-01 2.28e-01    0.865
23  0.112       0.258    0.435  3.23e-01 3.96e-01    1.227

MackChainLadder: `tail = FALSE` works fine, `tail = TRUE` errors.

Hello,

I have this incremental triangle:

incr_tri <- structure(c(1426070.24536192, 1736770.1007, 2639782.9874, 3587024.63956496, 
3865940.962, 5673143.20642889, 5714213.50358944, 4301115.41676496, 
6693050.8, 8217307.30397754, 11307235.24834, 9905168.2946652, 
12130, 1359660.9508, 1633281.9735, 3215907.0814, 6528376.49714343, 
7486409.03571738, 5076725.2128, 1004333.56011442, 6730057.650404, 
7331244.53689184, 5148580.23881475, 5202152.529135, 9842034.6463464, 
NA, 470925.445, 653460.5987, 1095930.8238, 2305059.52683733, 
2980199.38384495, 1223521.20321201, 1465304.4065, 2154081.68602803, 
2257224.018628, 2045705.0784234, 2503710.4930872, NA, NA, 51722.662, 
229922.0163, 778652.711590961, 841699.634600001, 2074852.01257497, 
692920.451400001, 892086.349364856, 862832.063100001, 1328214.8, 
1339655.7333272, NA, NA, NA, 66196.5024000001, -5951.36274700332, 
304899.7604, 484964.062129008, 1361228.6917, -670073.959978774, 
215230.3069, 862344.206280001, 0, NA, NA, NA, NA, 136136.177211975, 
189860.1108, 298799.72226939, 56610.0911999997, -259425.922708202, 
188747.8818, 191191.1053, 618330.214400001, NA, NA, NA, NA, NA, 
28622.3487999998, 121967.659968453, 96199.9843999995, 31803.0385980848, 
339990.021899998, 66964.3233000003, 599504.000800001, NA, NA, 
NA, NA, NA, NA, 154710.976175, -57885.6664000005, 115649.495765802, 
-17023.9271000009, 40990.3735000007, 16334.6632000003, NA, NA, 
NA, NA, NA, NA, NA, 4964.32774377428, -1413.74210000038, 2367.94040000066, 
-21930.2992000002, 128457.843200002, NA, NA, NA, NA, NA, NA, 
NA, NA, 2558.33874706039, -2120.56850000005, 19535.0869999994, 
0, NA, NA, NA, NA, NA, NA, NA, NA, NA, 320, -5278.58229999989, 
0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 217.206400000025, 
5382.70040000044, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), dim = c(13L, 
13L), dimnames = list(NULL, c("0", "1", "2", "3", "4", "5", "6", 
"7", "8", "9", "10", "11", "12")))

I convert it to cumulative:

# cumulative triangle:
cum_tri <- ChainLadder::incr2cum(incr_tri)

If I use MackChainLadder() when tail = FALSE it works fine:

# tail = FALSE:
tail_false <- ChainLadder::MackChainLadder(
  Triangle = cum_tri, 
  tail = FALSE
)

But when I set tail = TRUE it errors:

# tail = TRUE:
tail_true <- ChainLadder::MackChainLadder(
  Triangle = cum_tri, 
  tail = TRUE
)

Error:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  NA/NaN/Inf in 'y'
In addition: Warning message:
In log(.f - 1) : NaNs produced

Why does it error? Is there a workaround?
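
The warning points at log(.f - 1), which is NaN for any development factor at or below 1, and the incremental triangle above contains negative increments that can produce such factors. The sketch below reproduces that mechanism with made-up factors (they are not computed from the triangle in this issue) and shows one possible manual extrapolation that skips the offending factors. It is an illustration, not the package's tail routine.

```r
# Illustrative age-to-age factors; the fourth one is below 1
f   <- c(1.80, 1.20, 1.05, 0.998, 1.01)
dev <- seq_along(f)

y <- suppressWarnings(log(f - 1))   # NaN wherever f <= 1
y

# Fit the log-linear decay only on factors strictly greater than 1
keep <- is.finite(y)
fit  <- lm(y ~ dev, data = data.frame(y = y[keep], dev = dev[keep]))

# Extrapolate 100 further periods and multiply up to a tail factor
future      <- data.frame(dev = (length(f) + 1):(length(f) + 100))
tail_factor <- prod(1 + exp(predict(fit, newdata = future)))
tail_factor
```

As far as I know, MackChainLadder() also accepts a numeric value for `tail`, so a factor obtained this way (or chosen by judgement) could be passed in directly instead of `tail = TRUE`.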

Modify documentation to make the weights parameter of chainladder more clear

In the chainladder() method, it was unclear to me how the weights matrix would be applied. I mistakenly assumed that the weight W_{i,j+1} corresponded to the age-to-age factor given by C_{i,j+1}/C_{i,j}. Turns out the weight W_{i,j} corresponds to that factor.

Didn't see any mention of this in the documentation. Would be nice to include it so others don't make the same mistake I did.

Thanks
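
The indexing convention described above can be made concrete with a small base-R sketch. It computes a weighted volume average by hand and does not call chainladder(); the helper name `weighted_factor` and the toy numbers are made up.

```r
# Toy cumulative triangle C and weight matrix W of the same shape
C <- matrix(c(100, 150, 165,
              110, 170,  NA,
              120,  NA,  NA), nrow = 3, byrow = TRUE)
W <- matrix(1, nrow = 3, ncol = 3)
W[1, 1] <- 0   # W[i, j] switches off the factor C[i, j + 1] / C[i, j]

# Weighted volume-average factor from development period j to j + 1
weighted_factor <- function(C, W, j) {
  ok <- !is.na(C[, j]) & !is.na(C[, j + 1])
  sum(W[ok, j] * C[ok, j + 1]) / sum(W[ok, j] * C[ok, j])
}

weighted_factor(C, W, 1)                  # only row 2 contributes: 170 / 110
weighted_factor(C, matrix(1, 3, 3), 1)    # unweighted: 320 / 210
```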

BootChainLadder gamma process variance: incorrect parametrization

Correct me if I'm wrong, but I think the parametrization of the rgamma function is wrong within the BootChainLadder function.

England and Verrall (2002) state that, when the gamma distribution is used for the process variance, E[C_ij] = m_ij and Var[C_ij] = phi*m_ij^2.

The documentation of the rgamma function correctly states that, with shape parameter a and scale parameter s, E(X) = a*s and Var(X) = a*s^2.

By my calculations the shape and scale parameters should therefore be a = 1/phi and s = m_ij*phi, which gives mean m_ij and variance phi*m_ij^2. In BootChainLadder, however, they are set to a = m_ij/phi and s = phi. The mean is still m_ij, but the variance becomes m_ij*phi, which does not follow England and Verrall.

From the BootChainLadder function:
if (process.distr == "gamma")
  processTriangle[!is.na(simExp)] <- sign(simExp[!is.na(simExp)]) *
    rgamma(length(simExp[!is.na(simExp)]),
           shape = abs(simExp[!is.na(simExp)] / scale.phi),
           scale = scale.phi)
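
The two parametrizations can be compared by simulation. The values of `m` and `phi` below are arbitrary illustrations; the sketch only checks the moments stated above and takes no position on which variance assumption is the intended one.

```r
set.seed(42)
m   <- 50      # expected incremental claim (illustrative)
phi <- 2       # scale parameter (illustrative)
n   <- 1e5

# Parametrization quoted from the package: mean m, variance m * phi
x_pkg <- rgamma(n, shape = m / phi, scale = phi)

# Parametrization proposed in this issue: mean m, variance m^2 * phi
x_alt <- rgamma(n, shape = 1 / phi, scale = m * phi)

c(mean = mean(x_pkg), var = var(x_pkg), theory_var = m * phi)
c(mean = mean(x_alt), var = var(x_alt), theory_var = m^2 * phi)
```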

Print methods should return their argument invisibly

Issue

Some print methods do not return their argument unchanged. For example:

library(ChainLadder)
mcl <- MackChainLadder(RAA)

# Printing changes its argument
mcl_print <- print(MackChainLadder(RAA))
identical(mcl, mcl_print)              # Returns FALSE

# Brackets change the assigned object
(mcl_brackets <- MackChainLadder(RAA))
identical(mcl, mcl_brackets)           # Returns FALSE


# If object is assigned to a name before printing / brackets, results differ
mcl_print2 <- print(mcl)
identical(mcl_print, mcl_print2)       # Returns TRUE

(mcl_brackets2 <- mcl)
identical(mcl_brackets, mcl_brackets2) # Returns FALSE

This behaviour can lead to confusion and is not in line with the print generic's documentation (?print):

print prints its argument and returns it invisibly (via invisible(x))

Expected behaviour

The behaviour I was expecting, illustrated with summary():

mtc_smry <- summary(mtcars)

# Printing returns its argument
mtc_smry_print <- print(summary(mtcars))
identical(mtc_smry, mtc_smry_print)              # Returns TRUE

# Brackets have no impact
(mtc_smry_brackets <- summary(mtcars))
identical(mtc_smry, mtc_smry_brackets)           # Returns TRUE


# No impact if object is assigned to a name before printing / brackets
mtc_smry_print2 <- print(mtc_smry)
identical(mtc_smry_print, mtc_smry_print2)       # Returns TRUE

(mtc_smry_brackets2 <- mtc_smry)
identical(mtc_smry_brackets, mtc_smry_brackets2) # Returns TRUE
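
For reference, the convention from ?print looks like this for an S3 class. The class `myfit` and its `ultimate` field are hypothetical stand-ins, not ChainLadder objects.

```r
# Hypothetical S3 class with a print method that follows the ?print convention
print.myfit <- function(x, ...) {
  cat("<myfit> ultimate:", format(x$ultimate), "\n")
  invisible(x)   # return the argument unchanged, without auto-printing it again
}

fit <- structure(list(ultimate = 1234.5), class = "myfit")

fit_printed <- print(fit)
identical(fit, fit_printed)   # TRUE
```

A print method written this way can compute whatever summary it needs for display, as long as the object it returns is the untouched `x`.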

System info

I am using the current GitHub version of ChainLadder. Here's my sessionInfo()

R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.5 (Maipo)

Matrix products: default
BLAS: /opt/R/3.5.0/lib64/R/lib/libRblas.so
LAPACK: /opt/R/3.5.0/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ChainLadder_0.2.6

loaded via a namespace (and not attached):
 [1] biglm_0.9-1       statmod_1.4.30    zoo_1.8-2         tidyselect_0.2.4  purrr_0.2.5      
 [6] reshape2_1.4.3    splines_3.5.0     haven_1.1.1       lattice_0.20-35   carData_3.0-1    
[11] colorspace_1.3-2  stats4_3.5.0      yaml_2.1.19       rlang_0.2.1       pillar_1.2.3     
[16] foreign_0.8-70    glue_1.3.0        tweedie_2.3.2     readxl_1.1.0      bindrcpp_0.2.2   
[21] bindr_0.1.1       plyr_1.8.4        stringr_1.3.1     munsell_0.5.0     cplm_0.7-7       
[26] gtable_0.2.0      cellranger_1.1.0  zip_1.0.0         expint_0.1-4      coda_0.19-1      
[31] systemfit_1.1-22  rio_0.5.10        forcats_0.3.0     lmtest_0.9-36     curl_3.2         
[36] Rcpp_0.12.17      scales_0.5.0      abind_1.4-5       ggplot2_3.0.0     stringi_1.2.3    
[41] openxlsx_4.1.0    dplyr_0.7.6       grid_3.5.0        tools_3.5.0       sandwich_2.4-0   
[46] magrittr_1.5      lazyeval_0.2.1    tibble_1.4.2      car_3.0-0         pkgconfig_2.0.1  
[51] MASS_7.3-50       Matrix_1.2-14     data.table_1.11.4 actuar_2.3-1      assertthat_0.2.0 
[56] minqa_1.2.4       R6_2.2.2          nlme_3.1-137      compiler_3.5.0

as.triangle for data.frame has bug with column names

Line 85 in "Triangles.R" has a bug which leads to incorrect naming of columns:

In the following code, the order of "dev" and "origin" are swapped in the aggregate function and the line which names the columns.

aggTriangle <- stats::aggregate(Triangle[[value]],
                                list(Triangle[[dev]], Triangle[[origin]]),
                                sum)
names(aggTriangle) <- c(origin, dev, value)
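
The effect of the swapped order can be reproduced on a toy long-format data frame. The data below are made up; only the stats::aggregate() behaviour (grouping columns come out in the order of the `by` list) is being shown.

```r
df <- data.frame(origin = c(2001, 2001, 2002),
                 dev    = c(1, 2, 1),
                 value  = c(10, 5, 20))

# Grouping by (dev, origin) but naming the columns (origin, dev, value)
swapped <- stats::aggregate(df[["value"]], list(df[["dev"]], df[["origin"]]), sum)
names(swapped) <- c("origin", "dev", "value")
swapped$origin   # holds development periods, not origin years

# Grouping in the same order as the names
fixed <- stats::aggregate(df[["value"]], list(df[["origin"]], df[["dev"]]), sum)
names(fixed) <- c("origin", "dev", "value")
fixed$origin     # holds origin years
```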

ChainLadder doesn't like dates as origins

The following code will generate an error:

op <- as.Date(paste0(2001:2010, "-01-01"))
lags <- 1:10

triangle <- expand.grid(op, lags)
names(triangle) <- c("origin", "dev")
set.seed(1234)
triangle$value <- rnorm(100)

triCL <- ChainLadder::as.triangle(triangle)
plot(triCL, lattice=TRUE)

The error is:

Error in .as.LongTriangle(x, na.rm) : 
  The origin and dev. period columns have to be of type numeric
 or a character which can be converted into numeric.

The origin column in the triangle is numeric. A call to typeof will return "double" and class will return "Date".
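
If the type check relies on is.numeric() (an assumption, since the body of .as.LongTriangle is not shown here), the behaviour is explained by the fact that is.numeric() returns FALSE for Date objects even though they are stored as doubles. A possible workaround is to convert the dates to a numeric year before building the triangle:

```r
op <- as.Date(paste0(2001:2010, "-01-01"))

typeof(op)       # "double"
class(op)        # "Date"
is.numeric(op)   # FALSE

# Workaround sketch: use the calendar year as a plain numeric origin
origin_year <- as.numeric(format(op, "%Y"))
triangle <- expand.grid(origin = origin_year, dev = 1:10)
is.numeric(triangle$origin)   # TRUE
```

This loses the day and month, so it only suits origin periods that are whole years.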

Probable bug in `glmReserve()` source code

I think there's a typo when checking if user supplied arg weight:

if (!("triangle") %in% class(triangle))
    stop("triangle must be of class 'triangle'")
  if ("offset" %in% names(list(...)))
    stop("'offset' should be passed using the 
         'exposure' attribute of the triangle!")
  if ("weigth" %in% names(list(...)))
    stop("'weight' should not be used")

Shouldn't it be if ("weight" %in% names(list(...)))?

A misspelling in the vignette?

ChainLadder/vignettes/ChainLadder.Rnw

Line 286 in 3ef9731

example, for the 1988 origin year, the age of the 1351 value,

Hi,

Wonder if the example should be that "the age of the 13112 value, evaluated as of 1990, is three years". Looks like a typo to me.

Regards,
Ben

Total.ProcessRisk & Total.ParameterRisk

My issue is with your R code for the MackChainLadder formula, in particular with the “Total.ProcessRisk” and “Total.ParameterRisk” elements. Both seem to produce incorrect results. If you'd like a detailed explanation of the issue, send me an email at [email protected]

Simulate MackChainladder

This is a question more than an issue, but I'm not sure where to post questions.

How can I simulate scenarios from MackChainLadder for each origin year and development period such that the simulations have the correct distributions? In particular, if I run enough simulations, I want the results to be consistent with the Mack standard errors for each year and in aggregate.

dev.period order not correct when using plot(triang, lattice=TRUE)

I have 10 development periods, and when I use plot(triang, lattice=TRUE) the development periods are not displayed in order (they appear as 1, 10, 2, 3, 4, 5, 6, 7, 8, 9).
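
That ordering is what alphabetical sorting of character labels produces, which suggests (without confirming where in the plotting code it happens) that the development periods are being treated as text or as a factor with default levels. A base-R illustration:

```r
dev <- as.character(1:10)

sort(dev)                # "1" "10" "2" "3" ... : text order
levels(factor(dev))      # same text order by default

# Building the factor with numerically ordered levels restores 1, 2, ..., 10
dev_f <- factor(dev, levels = as.character(sort(as.numeric(unique(dev)))))
levels(dev_f)
```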

LDF Interpolation Based on Recent Boor Paper in Variance

Could you consider adding a function for the interpolation described in the title? Here is a version that we wrote:

interpolate_ldfs <- function(observed_ldf_df, interp_age){
  # observed_ldf_df <- sel_data
  # interp_age <- 9

  ## At some age ('ldf_2_one') all selected 'ldfs' = 1 for all 'ages' >= ldf_2_one.
  ## Hence 'pct_ibnr' = 0 and log(-log(pct_ibnr)) is infinite for all 'ages' >= 'ldf_2_one',
  ## and we receive an error when fitting the linear model.
  ## Test if 'interp_age' >= 'ldf_2_one' then return 1. Else proceed to interpolation 

  ldf_2_one <- min(observed_ldf_df$age[observed_ldf_df$ldf == 1]) 
  #the first age which the ldf is 1

  if (interp_age >= ldf_2_one) {
    return(1)
  } else {

    ## Exclude rows from 'observed_ldf_df' where ldf == 1  
    observed_ldf_df <- observed_ldf_df[observed_ldf_df$ldf != 1,]

    ## Percentage unreported implied by each ldf (base R, so no dplyr/magrittr needed)
    observed_ldf_df$pct_ibnr <- 1 - (1 / observed_ldf_df$ldf)

    ## Fit Weibull model
    weibul_model <- lm(log(-log(observed_ldf_df$pct_ibnr)) ~ 
        log(observed_ldf_df$age)) # Boor Eq (8)

    ## Define the age of the ldfs above and below the interpolated age  
    age_below <- interp_age - (interp_age %% 12) 
    age_above <- interp_age + (12 - (interp_age %% 12))


    fit_below <- exp(-exp(weibul_model$coefficients[1] + 
        weibul_model$coefficients[2] * log(age_below))) 
    fit_above <- exp(-exp(weibul_model$coefficients[1] + 
        weibul_model$coefficients[2] * log(age_above)))
    fit_at <- exp(-exp(weibul_model$coefficients[1] + 
        weibul_model$coefficients[2] * log(interp_age))) 

    ## Selected ldfs at age_below and age_above
    observed_below <- observed_ldf_df$pct_ibnr[observed_ldf_df$age == age_below]
    observed_above <- observed_ldf_df$pct_ibnr[observed_ldf_df$age == age_above]

    ## observed_below is na when age_below < 12. Set equal to 1
    if(interp_age < 12) observed_below = 1

    ## variables to make extrapolation easier
    max_obs_age <- max(observed_ldf_df$age)

    if(interp_age < max_obs_age){   # interpolate
      interp_along_curve <- observed_below + (((fit_at - fit_below) / 
          (fit_above - fit_below)) * 
          (observed_above - observed_below))
    }  else{                           # extrapolate

      fit_at_max_age <- exp(-exp(weibul_model$coefficients[1] + 
          weibul_model$coefficients[2] * 
          log(max_obs_age)))

      obs_at_max_age <- observed_ldf_df$pct_ibnr[observed_ldf_df$age == 
          max_obs_age]

      interp_along_curve <- fit_at * obs_at_max_age / fit_at_max_age
    }
    ## Calculate ldf
    implied_ldf <- 1 / (1 - interp_along_curve)
    ## Adjust for age < 12 months 
    implied_full_ay_ldf <- ifelse(interp_age >= 12, implied_ldf, 
      implied_ldf * 12 / interp_age) 

    return(implied_full_ay_ldf)

  }}

glmReserve

When I run glmReserve in RExcel with var.power = 1, cum = FALSE, mse.method = "bootstrap" and nsim = 1000, I get the error message "Microsoft Excel is waiting for another application to complete an OLE action".
When I run it in R, it takes absolutely ages and eventually I have to kill it without getting any results.

Origin is missing when no claims into the origin year

Hi all, great library.
Playing with the demo "DatabaseExamples", I have added data for a new LOB (Marine1) to the MS Access database. As you can see, some data are missing (origin year 2013 has no claims):

Marine1        0        1     2
2012       17850
2013
2014     1167.96    53010
2015                       4900
2016     9308.33  69289.5
2017         377

As a result, the origin years are replaced by row indices (1, 3, 4, 5, 6) instead of 2012, 2014, 2015, 2016, 2017, and 2013 is not reported at all.

BootChainLadderResults
$Marine1
FUN(Triangle = X[[i]], R = 999, process.distr = ..2)

	Latest	Mean Ultimate
1	17,850	17,850
3	54,178	54,178
4	4,900	4,900
5	78,598	78,598
6	377	377

             Totals

Latest: 155,903
Mean Ultimate: 155,903
Mean IBNR: 0
IBNR.S.E 0
Total IBNR 75%: 0
Total IBNR 95%: 0

Standardisation of models

Following the discussion that started at #44, I'd like to continue my investigation into norms for models.

For univariate (and non-bootstrap) models, do we agree that:

A model takes a single triangle as its data input, or something that can be coerced to a triangle
A model can take additional parameters such as :
A model object should contain at least:
- Some standard results: last diagonal, ultimates, reserves, standard error of reserves, etc.
- One matrix: the completed triangle
- Estimated parameters, if any
- Quality-assessment values (p-values for glm, ...)
The object returned by the model should respond correctly to some generic functions:
- Ultimates()
- CDR()
- Others?

Am I missing something that's important to one model or another?

My goal is to be able to fit models from the same function, in the caret-way, something like :

data(ABC)
mod1 <- fitTriangle(ABC,method="Mack",...)
mod2 <- fitTriangle(ABC,method="Merz",...)
mod3 <- fitTriangle(ABC,method="glm",family=quasipoisson(link="log"),...)
mod4 <- fitTriangle(ABC,method="glm.nb",...)
mod.list = list(mod1,mod2,mod3,mod4)

and then to get extraction in a standardized way :

purrr::map(mod.list,CDR)
purrr::map(mod.list,"ultimates")
purrr::map(mod.list,"ultimate.s.e")
purrr::map(mod.list,"total.s.e")

etc.

Thoughts ?
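
A rough sketch of how such a generic could dispatch over different model classes, using plain S3. Everything here is hypothetical: the generic `Ultimates()`, the classes `mack_like` and `glm_like`, and the toy objects are illustrations of the proposal, not existing ChainLadder code.

```r
# Hypothetical generic: each model class decides how to expose its ultimates
Ultimates <- function(x, ...) UseMethod("Ultimates")

# A model that stores a completed triangle: ultimates are the last column
Ultimates.mack_like <- function(x, ...) x$FullTriangle[, ncol(x$FullTriangle)]

# A model that stores ultimates directly
Ultimates.glm_like <- function(x, ...) x$ultimates

mod1 <- structure(list(FullTriangle = matrix(c(100, 110, 150, 165), nrow = 2)),
                  class = "mack_like")
mod2 <- structure(list(ultimates = c(150, 170)), class = "glm_like")

mod.list <- list(mod1, mod2)
lapply(mod.list, Ultimates)
```

With a shared generic, the extraction step no longer depends on how each model names or stores its results internally.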

Is it possible to summarize P/I ratio values according to Mack?

Mack and Bootstrap Reserve Risk Calculations

I'm quite new to R and have had some trouble pulling Mack and Bootstrap reserve risk information from R.

Using the pre-defined RAA incurred triangle, how do I get the "Total.ProcessRisk" and "Total.ParameterRisk" shown on page 35 of the package PDF https://cran.r-project.org/web/packages/ChainLadder/ChainLadder.pdf?
I must be missing a simple command somewhere, but I can only see the total Mack S.E. by year when calling "MackChainLadder(RAA)".

I understand that Total Mack S.E.^2 = (Process Risk^2) + (Parameter Risk^2), but I'd like to see the individual breakout.

Additionally, the Mack S.E's and Ultimates sometimes show up in abbreviated format (2.03e+06) rather than something like 2028950. Is there any way to re-format the Mack summary?

Similar question, but for the Bootstrap method included in the package. Rather than producing a mean simulated IBNR by accident year, is it possible to input reserves by AY and show Mean Reserve / Mean Std Dev on the summary screen?

Thanks in advance.
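
Two small base-R illustrations related to the questions above. The risk figures are made-up numbers, not RAA results. If the fitted object exposes the elements named in the manual, they would presumably be read with `$` (for example `fit$Total.ProcessRisk`), but that access pattern is an assumption here rather than something shown in this issue.

```r
# Combining the two components in quadrature, as in the identity quoted above
process_risk   <- 3
parameter_risk <- 4
total_se <- sqrt(process_risk^2 + parameter_risk^2)
total_se   # 5

# Avoiding scientific notation when displaying large amounts
format(2028950, big.mark = ",", scientific = FALSE)   # "2,028,950"
format(2.03e+06, scientific = FALSE)                  # "2030000"
```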

User defined development factor in BootChainLadder

Hi,

I think it would be very useful to add an option in the BootChainLadder function in order to allow the user to force development factors other than those stemming from a pure application of the chainladder method. Indeed the calibration of development factors encompasses a certain level of expert judgement which can give rise to user defined development factors. The bootstrap should then be based on those DF and not on the canonical ones.

Is this something possible ? Thanks.

dfCorTest and cyEffTest fail for mxn triangle

The functions are set up to only handle nxn triangles. The AY correlation function appears to just need the value of n adjusted for years with one LDF. However, the CY test function may need more nuance.

BootChainLadder method different than England/Verrall

Correct me if I'm wrong, but it looks like the getExpected function is calculating the fitted triangle based on the Chain Ladder ultimates. This is different than England/Verrall Appendix 3 in which you "Obtain cumulative fitted values for the past triangle by backwards recursion, starting with the observed cumulative paid to date in the latest diagonal."

getExpected <- function(ults, ultDFs){
  ults <- expandArray(ults, 2, dim(ultDFs)[2])
  ultDFs <- expandArray(ultDFs, 1, dim(ults)[1])
  return(ults * ultDFs)
}

While the manual states "The implementation of BootChainLadder follows closely the discussion of the bootstrap model in section 8 and appendix 3 of the paper by England and Verrall (2002)", I think it's appropriate to note the difference in methods.
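
For comparison, the backwards recursion quoted from England and Verrall can be sketched in a few lines of base R. The helper name `fitted_backwards`, the toy triangle and the factors are illustrative, and the sketch makes no claim about how its output relates numerically to getExpected().

```r
# Hypothetical helper: fitted cumulative values by backwards recursion,
# starting from the latest observed cumulative value in each row.
fitted_backwards <- function(tri, f) {
  fit <- tri
  for (i in seq_len(nrow(tri))) {
    last <- max(which(!is.na(tri[i, ])))
    if (last > 1) {
      for (j in (last - 1):1) fit[i, j] <- fit[i, j + 1] / f[j]
    }
  }
  fit
}

tri <- matrix(c(100, 150, 165,
                110, 170,  NA,
                120,  NA,  NA), nrow = 3, byrow = TRUE)
# Volume-weighted chain-ladder factors of the toy triangle
f <- c((150 + 170) / (100 + 110), 165 / 150)

fitted_backwards(tri, f)
```

The latest diagonal is left unchanged by construction, and every earlier cell is obtained by dividing by the corresponding development factor.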

Simple and vol-weighted averages for "non-square" triangles

If I want a volume all, or simple all set of link ratios a call to ata(paid.tri) works fine (where paid.tri = cumulative paid development triangle).

If I want to create a volume-weighted fit on the last five years worth of LDF factors I have found that I can utilise the weights argument as follows: chainladder(paid.tri,weights=wgt.vol.5), where I supply an input triangle of weights.

For "annual-annual" triangles, a five-year weighted average can be achieved using this method provided that the user supplies a weights triangle where the latest 6 diagonals are set = 1 and the balance are NAs. However, when I try the same on a annual/quarterly input triangle (i.e. non square) it fails and returns the following...

"Error in checkTriangle(Triangle) :
Number of origin periods, 10, is less than the number of development periods, 39."

By annual/quarterly I mean annual origin cohorts but tracking development progress quarter on quarter, which is common for Lloyd's and reinsurance companies. In my case I had data running out to 9.75 development years, and had 10 origin years. The error message therefore makes sense but essentially indicates that the chainladder algorithm requires a square-input matrix.

I'm wondering if there is a smart way to overcome this? Personally, I know that a lot of people would find it easier to adopt ChainLadder if they could easily derive simple and volume-weighted averages with a call to a pre-built function, e.g. something like chainladder(paid.tri, no.diag=5, TypeOfAverage="Vol").

I've found that an n-year simple average can be achieved using the following, and I think this would be a good addition to your help file even if you don't recast the "chainladder" function.

######################################
require(zoo)

# quick helper: drop leading/trailing NAs, then average the last no.diag ratios
link_ratio_simple_n_yrs <- function(x, no.diag) {
  mean(tail(na.trim(x, sides = "both"), no.diag))
}

# derive link ratios using the built-in function:
paid.link.ratios <- ata(paid.tri)

# get simple average as:
apply(paid.link.ratios, 2, link_ratio_simple_n_yrs, no.diag = 5)

######################################

However, I can't tell (and this is annoying) how to achieve the same kind of thing for volume-weighted averages. Any thoughts? Or is it a pain in the a$$?

I suspect my workaround for the simple average will fail if there are Inf values in the triangle. The following page may provide a better alternative to na.trim above: https://artax.karlin.mff.cuni.cz/r-help/library/IDPmisc/html/NaRV.omit.html
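
One possible way to get a volume-weighted average over the latest n link ratios directly from the cumulative triangle, without requiring a square input. This is a base-R sketch, not a ChainLadder function; the name `vol_wtd_last_n` and the toy triangle are made up.

```r
# Hypothetical helper: volume-weighted factors using the latest n_diag pairs
vol_wtd_last_n <- function(tri, n_diag) {
  sapply(seq_len(ncol(tri) - 1), function(j) {
    ok <- which(!is.na(tri[, j]) & !is.na(tri[, j + 1]))
    ok <- tail(ok, n_diag)                    # most recent origins with both values
    sum(tri[ok, j + 1]) / sum(tri[ok, j])
  })
}

tri <- matrix(c(100, 150, 165, 170,
                110, 170, 185,  NA,
                120, 175,  NA,  NA,
                130,  NA,  NA,  NA), nrow = 4, byrow = TRUE)

vol_wtd_last_n(tri, n_diag = 2)
```

Because each development column is treated separately, the number of origin periods and development periods need not match.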

Error: package or namespace load failed for ‘ChainLadder’

I tried to install the package from a Jupyter notebook with install.packages("ChainLadder", "/Users/mymac/anaconda/lib/R/library"), and I get this message when I load library(ChainLadder).
Any idea?
Thanks!

Error: package or namespace load failed for ‘ChainLadder’

Traceback:

library(ChainLadder)
stop(gettextf("package or namespace load failed for %s", sQuote(package)),
     call. = FALSE, domain = NA)

tweedieReserve : var.power=NULL not working

Hello,

The tweedieReserve function is not working when the var.power argument is NULL.

The documentation of the package says that "If NULL, it will be assumed to be in (1,2) and estimated using the cplm package.". I think it is a mistake in the documentation. Indeed, when analysing the code, I don't think the author has tried to implement any code that tackles a NULL value for the var.power argument.

This option is only available with the function glmReserve.

Is that correct ?

It could be nice to implement it within the tweedieReserve code since this function allows ODP model based on the calendar year which is not the case of the glmReserve.

Thank you in advance.

Regards,

CLFM Variability

Do you plan on implementing the variability calculations from Section 4.2 of the CLFM paper in this package?

package car is missing

Hi,

An error message:
"Error: package or namespace load failed for ‘ChainLadder’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
there is no package called ‘car’"
is found after I have installed the package and try as.triangle function.

Can you have a look into it and please advise? Thanks a lot
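
A quick way to see which packages are missing from the current library paths. The helper name `missing_packages` is made up, and the list of packages to check is up to the user; "car" is simply the one named in the error above.

```r
# Hypothetical helper: return the packages that cannot be loaded
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("car"))
to_install
# If anything is listed, install it, e.g. install.packages(to_install)
```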

Best regards

MackChainladder produces very large tail.sigma

Unit testing some code and came across this example:
MackChainLadder(auto$PersonalAutoIncurred,alpha=2,tail=TRUE)

I believe this set-up produces unrealistic MackS.E.s

MultiChainLadder fails

MultiChainLadder(list(GenIns, GenIns))
fails with lapack error
Error in solve(sigma, tol = solvetol) :
Lapack routine dsptrf returned error code 1

I am getting systemfit-generated errors with other triangles as well.
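
One plausible reading, offered as a guess and not verified against the MultiChainLadder code: passing the same triangle twice makes the two sets of residuals perfectly correlated, so their covariance matrix is singular and cannot be inverted. The base-R sketch below shows that effect with made-up residuals.

```r
# Two identical residual series, as would arise from two identical triangles
r <- c(0.3, -1.2, 0.8, 0.1)
sigma <- cov(cbind(r, r))

sigma
qr(sigma)$rank                             # 1, although the matrix is 2 x 2

# Inverting a rank-deficient matrix is expected to fail
inverse <- try(solve(sigma), silent = TRUE)
inherits(inverse, "try-error")
```

This would not explain the errors seen with other, non-identical triangles mentioned above.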

Creation of triangle objects

Another enhancement proposal I'm willing to work on.

Currently, one creates triangle objects using as.triangle on matrices or data frames. This works very well for imported data, but I think we could do better for triangles created from the command line (or in a script file, which amounts to the same).

Just like there are matrix and as.matrix, or data.frame and as.data.frame, we could have a function triangle to directly create triangle objects from vectors of data. The added benefit would be that one would not have to supply NA values for the lower triangle. Think of an interface similar to this to create a 4 x 4 triangle:

triangle(c(1:4), c(1:3), c(1:2), 1, byrow = TRUE)

Any interest?

Cheers
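
The proposed interface can be sketched in base R. This is only an illustration of the idea: the name make_triangle is hypothetical and not part of ChainLadder, and the sketch returns a plain matrix rather than a triangle object.

```r
# Hypothetical helper, not part of ChainLadder: build a run-off triangle
# from one vector per origin period, padding the unobserved cells with NA.
make_triangle <- function(...) {
  rows <- list(...)
  n_dev <- max(lengths(rows))
  # Pad every row on the right so that all rows have n_dev entries
  padded <- lapply(rows, function(r) c(r, rep(NA, n_dev - length(r))))
  m <- do.call(rbind, padded)
  dimnames(m) <- list(origin = seq_along(rows), dev = seq_len(n_dev))
  m
}

# The 4 x 4 example from the proposal, without typing any NA values
tri <- make_triangle(1:4, 1:3, 1:2, 1)
print(tri)
```

With ChainLadder attached, the resulting matrix could presumably be handed to as.triangle() to obtain a triangle object.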

CLRS Presentation

I am one of the presenters at the R workshop at CLRS (2016). I was assigned the ChainLadder package. Here is my presentation in case you have any thoughts (no obligation, of course)
http://rpubs.com/rajesh06_2016/chainladder_clrs

Thanks - Raj

MackChainLadder should return the same result if Triangle is passed through a pipe

Problem

The value returned by MackChainLadder() depends on whether Triangle is passed directly (i.e. as a function argument) or using magrittr's pipe operator (%>%):

library(ChainLadder)
library(magrittr)

# Pass Triangle directly
mcl <- MackChainLadder(RAA)

# Pipe Triangle
mcl_piped <- RAA %>% 
  MackChainLadder()

identical(mcl, mcl_piped)         # Returns FALSE

Further information

Differences are in elements "call" and "Model":

idx.diff <- which(vapply(
  seq_along(mcl),
  function(i) !identical(mcl[[i]], mcl_piped[[i]]),
  logical(1))
)

names(mcl)[idx.diff]

Arguably, the only difference is in the original name of the Triangle object. This difference may look minor and cosmetic. However, it will create confusion for anybody trying to verify that two pieces of code lead to the same outcome. Also, pipes are so prevalent these days that they shouldn't be ignored.
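
A minimal base R sketch of the effect, assuming the difference comes from the recorded call: the toy function fit below is a stand-in (not ChainLadder code) that stores match.call(), and the anonymous wrapper only roughly mimics a pipe with a `.` placeholder.

```r
# Toy stand-in for a model-fitting function that records its own call,
# as many R modelling functions do via match.call().
fit <- function(Triangle) {
  list(call = match.call(), total = sum(Triangle, na.rm = TRUE))
}

x <- c(1, 2, 3)
direct <- fit(x)
# Mimic a pipe: the function sees a placeholder name instead of `x`
wrapped <- (function(.) fit(.))(x)

# The objects differ only because the recorded call differs
identical(direct, wrapped)

# Compare everything except the recorded call
drop_call <- function(obj) obj[setdiff(names(obj), "call")]
identical(drop_call(direct), drop_call(wrapped))
```

For real MackChainLadder objects, the report above suggests that the "Model" element would have to be excluded as well, since it also differs.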

System info

I am using the current GitHub version of ChainLadder. Here's my sessionInfo():

R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.5 (Maipo)

Matrix products: default
BLAS: /opt/R/3.5.0/lib64/R/lib/libRblas.so
LAPACK: /opt/R/3.5.0/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] magrittr_1.5      ChainLadder_0.2.6

loaded via a namespace (and not attached):
 [1] biglm_0.9-1       statmod_1.4.30    zoo_1.8-2         tidyselect_0.2.4  purrr_0.2.5      
 [6] reshape2_1.4.3    splines_3.5.0     haven_1.1.1       lattice_0.20-35   carData_3.0-1    
[11] colorspace_1.3-2  stats4_3.5.0      yaml_2.1.19       rlang_0.2.1       pillar_1.2.3     
[16] foreign_0.8-70    glue_1.3.0        tweedie_2.3.2     readxl_1.1.0      bindrcpp_0.2.2   
[21] bindr_0.1.1       plyr_1.8.4        stringr_1.3.1     munsell_0.5.0     cplm_0.7-7       
[26] gtable_0.2.0      cellranger_1.1.0  zip_1.0.0         expint_0.1-4      coda_0.19-1      
[31] systemfit_1.1-22  rio_0.5.10        forcats_0.3.0     lmtest_0.9-36     curl_3.2         
[36] Rcpp_0.12.17      scales_0.5.0      abind_1.4-5       ggplot2_3.0.0     stringi_1.2.3    
[41] openxlsx_4.1.0    dplyr_0.7.6       grid_3.5.0        tools_3.5.0       sandwich_2.4-0   
[46] lazyeval_0.2.1    tibble_1.4.2      car_3.0-0         pkgconfig_2.0.1   MASS_7.3-50      
[51] Matrix_1.2-14     data.table_1.11.4 actuar_2.3-1      assertthat_0.2.0  minqa_1.2.4      
[56] R6_2.2.2          nlme_3.1-137      compiler_3.5.0

tweedieReserve : Error in summary.tweedie when rereserving = FALSE

Hello,

When rereserving = FALSE is specified, summary.tweedie does not work.

This is due to the fact that in the second part of the code of the summary.tweedie function (this part is applied if rereserving = FALSE) we have

else{
    out<- list(    
      Reserve=data.frame(
        IBNR=c(mean(res$distr.res_ult),
               sd(res$distr.res_ult),
               #sd(res$distr.res_ult)/mean(res$distr.res_ult),
               quantile(res$distr.res_ult,q)
        )
      ),
      Diagnostic=c(GLMReserve=res$GLMReserve,
                   "mean(IBNR)"=mean(res$distr.res_ult))
    )
  }
  
  rownames(out$Prediction) <- c("mean", "sd", paste0(q*100, "%"))          
  print(out)
}

Therefore there is an error because out$Prediction does not exist.
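
The failure can be reproduced in isolation with base R. The list below only imitates the shape of `out` in the snippet above; its values are made up.

```r
# A list with a Reserve element but no Prediction element,
# shaped like the `out` object in the snippet above (values are made up).
out <- list(Reserve = data.frame(IBNR = c(100, 10, 120)))

# The element targeted by the rownames<- call does not exist
is.null(out$Prediction)

# Assigning row names to the missing element fails ...
res <- tryCatch({
  rownames(out$Prediction) <- c("mean", "sd", "99.5%")
  "ok"
}, error = function(e) "error")
res

# ... while targeting the element that does exist works.
rownames(out$Reserve) <- c("mean", "sd", "99.5%")
out$Reserve
```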

It is easily fixed by changing, for example,

Reserve=data.frame(
        IBNR=c(mean(res$distr.res_ult),
               sd(res$distr.res_ult),
               #sd(res$distr.res_ult)/mean(res$distr.res_ult),
               quantile(res$distr.res_ult,q)
        )

to

Prediction=data.frame(
        IBNR=c(mean(res$distr.res_ult),
               sd(res$distr.res_ult),
               #sd(res$distr.res_ult)/mean(res$distr.res_ult),
               quantile(res$distr.res_ult,q)
        )

Thank you in advance for your opinion.

Scaled Pearson residuals formula

Edit:
Problem solved. Issue to close.

Mack method standard errors NaN issue

The Mack method returns NaN for the standard error if "sigma[i - 2]^2" is zero in the Mack.S.E function.
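
One way a zero variance estimate can turn into NaN is the extrapolation Mack (1993) proposes for the last sigma. The sketch below is written from that formula, not copied from the package, so whether it matches the exact code path in Mack.S.E is an assumption. The function names are hypothetical.

```r
# Sketch of Mack's (1993) extrapolation for the last variance parameter:
# sigma_last^2 = min(s1^4 / s2^2, min(s2^2, s1^2)),
# where s1 and s2 are the two preceding sigma estimates.
extrapolate_sigma <- function(s2, s1) {
  sqrt(abs(min(s1^4 / s2^2, min(s2^2, s1^2))))
}

extrapolate_sigma(s2 = 2, s1 = 1)   # 0.5
extrapolate_sigma(s2 = 0, s1 = 0)   # NaN, because 0 / 0 is NaN in R

# A guarded variant (an assumption, not the package's behaviour):
# treat a zero denominator as "no further variability".
safe_extrapolate_sigma <- function(s2, s1) {
  if (isTRUE(s2 == 0)) return(0)
  extrapolate_sigma(s2, s1)
}
safe_extrapolate_sigma(0, 0)
```

Whether returning zero is actuarially appropriate is a judgment call; the guard only shows where such a check could sit.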

Triangle with empty values in early development periods

Is it possible to perform as.triangle on a triangle that isn't complete? I have older data where I'm only looking at the most recent ~25 years of development, so the upper left corner of my triangle is empty. When I use as.triangle on the data it seems to start the triangle at the first non-null development period, which then messes up the recent accident years of the triangle. Is there an option to have the triangle start at the smallest (or first, if the data is ordered) origin value and development value?
Example triangle for what I'd like to make:

This is the triangle that as.triangle would generate from that data:
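
One possible workaround for the question above, sketched in base R under assumptions: the data are in long format with columns named origin, dev and value (made-up names and numbers), and the full grid of origin and development periods is fixed before any values are filled in.

```r
# Made-up long-format data: the earliest development periods are missing
long <- data.frame(
  origin = c(1995, 1995, 1996, 1996, 1997),
  dev    = c(3, 4, 2, 3, 1),
  value  = c(50, 55, 30, 40, 10)
)

# Fix the full grid up front instead of inferring it from the observed cells
origins <- seq(min(long$origin), max(long$origin))
devs    <- seq_len(max(long$dev))   # always start at development period 1

m <- matrix(NA_real_, nrow = length(origins), ncol = length(devs),
            dimnames = list(origin = origins, dev = devs))
# Matrix indexing: one (row, column) pair per observation
m[cbind(match(long$origin, origins), match(long$dev, devs))] <- long$value
print(m)
```

The empty upper left corner stays NA, and the matrix could then presumably be passed to as.triangle() without the periods being shifted.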

Mack chainladder provides different standardized residuals.

The standardized residuals which I am getting are different from the ones which R is throwing up.

I am unsure whether this is a bug since the fitted values are fine.

I am using {Average ATA (weighted or simple or ordinary regression) minus ATA factors from the data} divided by the standard deviation of ATAs from the data.

Please note that ATAs here refer to link development ratios. Also, in this example I have assumed that the future development is based on all years simple average ATAs.

This is also described on page 10 of the paper "Flexible Factor Chain Ladder Model: A Stochastic Framework for Reasonable Link Ratio Selections" by Emanuel Bardis, Ali Majidi, and Daniel Murphy. The paper is attached.
01_Murphy.pdf

I have created an excel file highlighting the differences from rows 60 to 94 in the tab "St. Residuals" of the attachment.

I am concerned about cells H44 and J44 (in red font).

Kindly let me know why I see these differences?
Std Residual analysis.xlsx

How to manually change "Adj S^2" value in MackChainLadder?

Hello,

When using the Mack Method in Excel, we get #DIV/0! errors if the tail portion of our triangle sees no development for several periods on all relevant AYs. As can be seen from the picture below, I could hard-code a 0 in the highlighted cell as a workaround.

However, after playing around with the tail.se and tail.sigma functions in ChainLadder, I've not been able to figure out how I would hard-code something in a similar fashion.

Is there such a workaround for triangles with this type of development when running a MackChainLadder process?

Thanks again in advance.

MultiChainLadder plot error when NA's in triangle

When upper left of triangle is NA, plot fails. Here is a simple example:

GenIns[1, 1] <- NA
plot(MultiChainLadder(list(GenIns)))
Error in xy.coords(x, y, xlabel, ylabel, log) :
'x' and 'y' lengths differ
In addition: Warning message:
In cbind(do.call("rbind", fitted.values), dev) :
number of rows of result is not a multiple of vector length (arg 2)

This appears to be caused in the fitted method. By removing NA's in the variable 'x' (in the "MCL" case) after line 1272 thus
1272: x <- sapply(Triangles, "[", 1:(m-i),i)
x <- x[!is.na(x)]
old 1273: fitted[[i]] <- x%*%diag(B[[i]],nrow=p)
the package compiled and the error went away for me.

I did not thoroughly test, nor test the "GMCL" case.

Perhaps a more elegant solution would utilize each model's fitted method as is currently done for the 'residuals' method, but I did not investigate that.

Thanks,
Dan

R.version
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 3
minor 5.3
year 2019
month 03
day 11
svn rev 76217
language R
version.string R version 3.5.3 (2019-03-11)
nickname Great Truth

Adjusted Pearson residual calculation in BootChainLadder

Within the BootChainLadder function, adjusted Pearson residuals are calculated as follows:

adj.resids <- unscaled.residuals * sqrt(nobs/scale.factor)

This is consistent with the formula given in the "Addendum to 'Analytic and Bootstrap Estimates of Prediction Errors in Claims Reserving'" paper (England, 2001, equations 2.4 and 3.1).

However, it is not consistent with the Stochastic Claims Reserving In General Insurance paper (England & Verrall, 2002, Appendix 3). Here the bootstrapping procedure is described with the instruction to adjust the Pearson residuals using the following slightly different formula:

adj.resids <- unscaled.residuals * sqrt(n/scale.factor)

i.e. with n rather than nobs in the numerator.

n is the number of rows/columns in the claims triangle, nobs is the total number of data points observed ( nobs <- 0.5 * n * (n + 1) ), and scale.factor <- (nobs - 2 * n + 1), or "n - p" in the original paper.
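
To give a sense of the size of the difference, the two scaling factors can be computed directly from the definitions above. The triangle size n = 10 is an arbitrary choice for illustration.

```r
# Triangle dimension, chosen only for illustration
n <- 10

# Definitions as given in the text above
nobs <- 0.5 * n * (n + 1)          # number of observed cells
scale.factor <- nobs - 2 * n + 1   # degrees of freedom, "n - p"

# Multiplier applied to the unscaled residuals under each version
factor_nobs <- sqrt(nobs / scale.factor)
factor_n    <- sqrt(n / scale.factor)

c(nobs = nobs, scale.factor = scale.factor,
  factor_nobs = factor_nobs, factor_n = factor_n)
```

With these inputs the first multiplier is above 1 and the second is below 1, so one version inflates the residuals and the other shrinks them.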

The latter source is the paper actually referenced in the R help file for BootChainLadder, although this in turn references the former.

Does anybody know how the Pearson residual adjustment formula is derived? Are there cases for using either version or is one incorrect? There can be a significant impact on the resulting calculation of standard error of IBNR depending on which is used.

glmReserve: Missing row in summary

Hi,

There is an inconsistency where Mack summary has all origin periods, but glmReserve drops the first origin period.

library(ChainLadder)

dev_glm <- glmReserve(GenIns)
dev_mack <- MackChainLadder(GenIns)

dev_glm$summary
summary(dev_mack)$ByOrigin

Adding Devineau et al.'s Bootstrap

Hi,
I'm currently trying to add some missing reserving models to the package. Are there any conventions that a model should follow to be considered a viable model for triangles?

I've heard about the TriangleModel class, but I can't find it. Furthermore, which S3 methods should be implemented for my new models? What are the required outputs of a TriangleModel? The issue is that with S3 classes everything is optional, so it's hard to understand how things are bound together.

I'm implementing both univariate and multivariate models (notably the recursive Mack/MW bootstraps). Is there anything special I need to know about the multivariate case (special classes? special formatting of outputs?)

Is there any technical documentation for the package that could help me bring my code up to the package's standards?
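
For readers unfamiliar with how S3 ties a model object to its methods, here is a minimal sketch in base R. The class name MyReserveModel, the constructor and the fields are all hypothetical; they are not a ChainLadder convention and say nothing about what the package actually requires.

```r
# Hypothetical constructor: wraps a fitted result in a classed list.
# The class name and fields are illustrative, not a ChainLadder convention.
new_reserve_model <- function(triangle, ibnr) {
  structure(list(Triangle = triangle, IBNR = ibnr), class = "MyReserveModel")
}

# S3 methods are ordinary functions named <generic>.<class>
summary.MyReserveModel <- function(object, ...) {
  list(Totals = c(IBNR = sum(object$IBNR)))
}

print.MyReserveModel <- function(x, ...) {
  cat("MyReserveModel with total IBNR", sum(x$IBNR), "\n")
  invisible(x)
}

# Toy 2 x 2 triangle and made-up IBNR values per origin period
tri <- matrix(c(100, 150, 110, NA), nrow = 2, byrow = TRUE)
fit <- new_reserve_model(tri, ibnr = c(0, 55))
print(fit)
summary(fit)$Totals
```

Dispatch depends only on the class attribute and the function names, which is why nothing enforces a common set of methods across models.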

CDR at 99.5%

I am trying to calculate the CDR for Solvency II purposes. How can I get the CDR from a BootChainLadder object at 99.5%? Thanks!

updated: CDR(BCL, probs=c(0.995))

I want to change presaved claims (from ChainLadder-package)

Hi 😄
I have a short question regarding the ChainLadder package. I found a website with instructions for the Munich Chain Ladder method that is included in the ChainLadder package.
Unfortunately, the data sets "MCLpaid" and "MCLincurred" are both pre-saved and I have no idea how to replace them, because I want to use the method with my own data: https://cran.rstudio.com/web/packages/ChainLadder/vignettes/ChainLadder.html#munich-chain-ladder
I would be very happy if somebody could help me, and I am sorry if this is a stupid question, but I am not very advanced in R.

Many Greetings
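
A rough sketch of how one's own data could take the place of the packaged data sets, assuming the claims are available as cumulative paid and incurred amounts. The numbers below are made up, and the 3 x 3 size only shows the layout; a real analysis would need more periods.

```r
# Made-up cumulative paid and incurred claims, one row per origin year.
# In practice these would come from read.csv() or a similar import.
paid <- matrix(c(100, 180, 220,
                 110, 200,  NA,
                 120,  NA,  NA),
               nrow = 3, byrow = TRUE,
               dimnames = list(origin = 2016:2018, dev = 1:3))
incurred <- matrix(c(150, 210, 230,
                     160, 225,  NA,
                     170,  NA,  NA),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(origin = 2016:2018, dev = 1:3))

# Both triangles must cover the same origin and development periods
stopifnot(identical(dim(paid), dim(incurred)))
stopifnot(identical(is.na(paid), is.na(incurred)))

# With ChainLadder attached, the two matrices would take the place of the
# packaged data sets in the vignette's call, for example:
#   MunichChainLadder(as.triangle(paid), as.triangle(incurred))
```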

Process and parameter error

Hello!
I have two questions and really hope someone can help me understand.

1. Is there any way to find the process and parameter error from the MackChainLadder output? Are process and parameter error the same as the process risk and parameter risk that the package provides? If they are, isn't the sum of the two supposed to be equal to Mack.S.E for a given accident year?

My second question is about glmReserve in the ChainLadder package. If I set var.power = NULL, it returns "var.power must be provided for this version", although the vignette states that the package can return a var.power that fits the data when you set var.power = NULL.

Am I missing something? I need help on these issues.

Thanks in advance
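
On the first question, a short numeric sketch may help. In Mack's model the mean squared error of the reserve splits into a process variance and an estimation (parameter) variance, so the two components add on the squared scale rather than directly. The numbers below are illustrative only, and the sketch assumes Mack.S.E is the square root of that total.

```r
# Illustrative numbers only, not output from any real triangle
process_risk   <- 30
parameter_risk <- 40

# The two error components combine on the variance scale
total_se <- sqrt(process_risk^2 + parameter_risk^2)
total_se                          # 50

# A plain sum overstates the total
process_risk + parameter_risk     # 70
```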
