[toc]
$ \DeclareMathOperator*{\E}{\mathbb{E}} \DeclareMathOperator*{\V}{\mathbb{Var}} $
Credibility Theory and Generalized Linear Models, Nelder and Verrall
- Risk parameter is Random Variable
- Best linear approximation to Bayesian estimate
- For exponential family of distributions, the credibility formula is exact
- Exponential families: basis of Generalized Linear Models
Evolutionary Credibility Theory: A Generalized Linear Mixed Modeling Approach, Tze Leung Lai & Kevin Haoyu Sun
##Credibility Problem
The credibility problem is the following. You observe loss data $Y_{ij}$ for risks $i = 1, 2, \ldots, n$ over years $j = 1, 2, \ldots, m$.
We look at the average loss for the i-th risk $$ \bar Y_i =\frac{1}{m}\sum_{j=1}^m Y_{ij}. $$
So each risk/individual has his own experience with his own historical mean $\bar Y_i$. The model rests on three assumptions:
- For every $i$ there exists a risk parameter $\Theta_i$, and the elements of the set $\{\Theta_i \mid i \in \{1,2,\ldots,n\}\}$ are i.i.d.
- The vectors $\{(Y_{i1},Y_{i2},\ldots,Y_{im}) \mid 1\le i\le n\}$ are i.i.d.
- For fixed $i$, given $\Theta_i$, the random variables $Y_{i1},Y_{i2},\ldots,Y_{im}$ are conditionally i.i.d.
Assumption 1 means that there is an unobservable random variable $\Theta_i$ attached to each risk, describing its (unknown) risk profile.
Assumption 3 says that, for a given individual, the loss variables are conditionally independent and identically distributed given that individual's risk parameter.
The conditional expected mean: $$\mu(\theta) = \E\left[Y_{ij} \mid \Theta_i = \theta\right].$$
The overall mean: $$\mu = \E\left[\mu(\Theta_i)\right].$$
The conditional variance: $$\sigma^2(\theta) = \V\left[Y_{ij} \mid \Theta_i = \theta\right].$$
The credibility premium will be a premium of the form $$ P=a_{i0}+ a_{i1}Y_{i1}+a_{i2}Y_{i2}+\cdots +a_{im}Y_{im}, $$ i.e. a linear combination of the observed losses.
##Buhlmann Model
Let $\sigma^2 = \E\left[\sigma^2(\Theta_i)\right]$ denote the expected process variance and $v^2 = \V\left[\mu(\Theta_i)\right]$ the variance of the hypothetical means.
Note that $\V\left[\bar Y_i \mid \Theta_i\right] = \sigma^2(\Theta_i)/m$, so the unconditional variance of $\bar Y_i$ is $v^2 + \sigma^2/m$.
The Buhlmann credibility premium is
$$
P_B(i) = z \cdot \bar Y_i + (1-z) \cdot \mu
$$
where
$$
z = \frac{m v^2}{m v^2 + \sigma^2} = \frac{m}{m + \sigma^2/v^2}.
$$
The credibility factor $z$ always lies in $[0,1]$.
The numerator, $m v^2$, grows with the volume of individual experience and with the heterogeneity of the portfolio.
The denominator adds the process variance $\sigma^2$: the noisier the individual experience, the less weight $\bar Y_i$ receives.
Note that the credibility factor $z \to 1$ as $m \to \infty$, and $z \to 0$ as $v^2 \to 0$ (a homogeneous portfolio).
So we need to estimate the structure parameters $\mu$, $\sigma^2$ and $v^2$ from the data.
The natural estimator of $\mu$ is the grand mean $$\hat\mu = \frac{1}{n}\sum_{i=1}^n \bar Y_i.$$
For the estimation of $\sigma^2$ and $v^2$ one uses the within-risk and between-risk sample variances: $$\hat\sigma^2 = \frac{1}{n(m-1)}\sum_{i=1}^n\sum_{j=1}^m \left(Y_{ij}-\bar Y_i\right)^2, \qquad \hat v^2 = \frac{1}{n-1}\sum_{i=1}^n\left(\bar Y_i - \hat\mu\right)^2 - \frac{\hat\sigma^2}{m}. $$
Note that $\hat v^2$ can turn out negative, in which case it is set to zero (and hence $z = 0$).
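The estimation procedure above can be sketched in a few lines of base R. This is a minimal illustration on simulated data; all names and parameter values are invented for the example:

```r
# Buhlmann credibility on a simulated n x m loss matrix (one row per risk).
set.seed(1)
n <- 5; m <- 12
theta <- rnorm(n, mean = 100, sd = 50)      # latent risk means mu(Theta_i)
Y <- matrix(rnorm(n * m, mean = rep(theta, m), sd = 20), nrow = n)

Ybar <- rowMeans(Y)                         # individual means
mu   <- mean(Ybar)                          # grand mean, estimator of mu
s2   <- sum((Y - Ybar)^2) / (n * (m - 1))   # within variance, hat sigma^2
v2   <- max(var(Ybar) - s2 / m, 0)          # between variance, hat v^2
z    <- m / (m + s2 / v2)                   # credibility factor
P    <- z * Ybar + (1 - z) * mu             # credibility premiums
```

Each premium $P_i$ lands between the individual mean $\bar Y_i$ and the collective mean, with weight $z$ set by the variance ratio.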
As in the Buhlmann credibility problem, we observe loss data $Y_{ij}$ for risks $i = 1, \ldots, n$ over years $j = 1, \ldots, m$.
We look at the average loss for the i-th risk $$ \bar Y_i =\frac{1}{m}\sum_{j=1}^m Y_{ij}. $$
For each risk $i$ and year $j$ there is a known risk volume (weight) $W_{ij}$.
We also assume that there are functions $\mu(\cdot)$ and $\sigma^2(\cdot)$ describing the conditional mean and variance of the normalized losses.
Just like in the Buhlmann model, we let $\mu = \E\left[\mu(\Theta_i)\right]$, $\sigma^2 = \E\left[\sigma^2(\Theta_i)\right]$ and $v^2 = \V\left[\mu(\Theta_i)\right]$.
Denote the risk volume of the i-th risk over all m years by $$ W_{i\cdot}=\sum_{j=1}^m W_{ij} $$ and the risk volume over all years and all insureds by $$ W_{\cdot\cdot}=\sum_{i=1}^n W_{i\cdot}. $$
First, we normalize the losses as $X_{ij} = Y_{ij}/W_{ij}$, so that $\E\left[X_{ij}\mid\Theta_i\right] = \mu(\Theta_i)$ and $\V\left[X_{ij}\mid\Theta_i\right] = \sigma^2(\Theta_i)/W_{ij}$.
Let $$\bar X_i = \frac{1}{W_{i\cdot}}\sum_{j=1}^m W_{ij}X_{ij}$$ denote the weighted average loss of the $i$-th risk.
The structure parameters are estimated as follows.
- $\mu$: $$\hat\mu=\frac{1}{W_{\cdot\cdot}}\sum_{i=1}^n\sum_{j=1}^mW_{ij}X_{ij}=\frac{1}{W_{\cdot\cdot}}\sum_{i=1}^nW_{i\cdot }\bar X_{i}.$$
- $\sigma^2$: $$\hat\sigma^2=\frac{1}{n(m-1)}\sum_{i=1}^n\sum_{j=1}^mW_{ij}(X_{ij}-\bar X_i)^2.$$
- $v^2$: Let $$ W^* = \frac{1}{nm-1} \sum_{i=1}^n W_{i\cdot}\left(1-\frac{W_{i\cdot}}{W_{\cdot\cdot}}\right). $$ Then $$\hat v^2 = \frac{1}{W^*} \left[ \left( \frac{1}{nm -1} \sum_{i=1}^n \sum_{j=1}^m W_{ij}(X_{ij}-\hat\mu)^2 \right) - \hat\sigma^2 \right]. $$
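A base-R sketch of these three estimators, on simulated volumes and loss ratios (all names and parameter values are invented for the illustration):

```r
# Buhlmann-Straub structure parameter estimators, balanced case (m years each).
set.seed(2)
n <- 4; m <- 7
W <- matrix(runif(n * m, 50, 150), nrow = n)   # risk volumes W_ij
theta <- rnorm(n, mean = 0.8, sd = 0.1)        # latent mean loss ratios
X <- matrix(rnorm(n * m, mean = rep(theta, m), sd = 1 / sqrt(W)), nrow = n)

Wi   <- rowSums(W)                             # W_{i.}
Wtot <- sum(Wi)                                # W_{..}
Xbar <- rowSums(W * X) / Wi                    # weighted individual means
mu_hat <- sum(Wi * Xbar) / Wtot                # estimator of mu
s2_hat <- sum(W * (X - Xbar)^2) / (n * (m - 1))     # hat sigma^2
Wstar  <- sum(Wi * (1 - Wi / Wtot)) / (n * m - 1)   # W*
v2_hat <- (sum(W * (X - mu_hat)^2) / (n * m - 1) - s2_hat) / Wstar
```

By construction $\hat\mu$ is the volume-weighted grand mean, so the weighted residuals $W_{i\cdot}(\bar X_i - \hat\mu)$ sum to zero.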
Now consider the unbalanced case, where the $i$-th risk has $m_i$ years of experience instead of a common $m$.
Considering the effect on the estimators above, $\hat\mu$ and $\hat\sigma^2$ carry over with the obvious changes to the ranges of summation.
Let's now look at the estimator of $v^2$. We have
$$
\E
\left[
\sum_{i=1}^n
\sum_{j=1}^{m_i}
W_{ij}
(X_{ij} - \hat\mu)^2
\right]
= (m_\cdot - 1) \cdot \sigma^2 + \left( W_{\cdot\cdot} - \sum_{i=1}^n \frac{W_{i\cdot}^2}{W_{\cdot\cdot}} \right) \cdot v^2 $$ where, of course, $$ m_\cdot = \sum_{i=1}^n m_{i}. $$ Therefore, one has to change $$ W^* = \frac{1}{m_\cdot-1} \sum_{i=1}^n W_{i\cdot} \left( 1-\frac{W_{i\cdot}}{W_{\cdot\cdot}} \right) $$ and $$ \hat v^2 = \frac{1}{W^*} \left[ \left( \frac{1}{m_\cdot -1} \sum_{i=1}^n \sum_{j=1}^{m_i} W_{ij}(X_{ij}-\hat\mu)^2 \right) - \hat\sigma^2 \right]. $$
Consider an insurance portfolio where contracts are classified into cohorts.
In our terminology, this is a two-level hierarchical classification structure. The
observations are claim amounts, where index $i = 1, \ldots, I$ identifies the cohort, $j = 1, \ldots, J_i$ the contract within the cohort, and a third index the period.
To each data point corresponds a weight — or volume — and from these one forms the weighted contract and cohort averages used below.
The three types of estimators for the variance parameters $a$ and $b$ are the unbiased estimators, the Ohlsson estimators and the iterative pseudo-estimators.
Also define
$$
B = \sum_{i=1}^{I}
z_{i\Sigma}
(X_{izw}- X_{zzw})^2
-
(I -1)\cdot a
$$
and
$$
d = z_{\Sigma\Sigma}
-
\sum_{i=1}^{I}
\frac{z_{i\Sigma}^2}{z_{\Sigma\Sigma}}
$$
with
$$
\bar X_{zzw} =
\sum_{i=1}^{I}
\frac{z_{i\Sigma}}{z_{\Sigma\Sigma}}
X_{izw}
$$
Then the Ohlsson estimators are $$ \hat a' = \frac{\sum_{i=1}^I A_i}{\sum_{i=1}^I c_i} $$ and $$ \hat b' = \frac{B}{d}, $$
and the iterative (pseudo-)estimators are
$$
\tilde a = \frac{1}{
\sum_{i=1}^I
(J_i -1)
}
\sum_{i=1}^I
\sum_{j=1}^{J_i}
z_{ij}
(X_{ijw} - X_{izw})^2
$$
and
$$
\tilde b = \frac{1}{I -1}
\sum_{i=1}^I
z_{i}
(X_{izw} - X_{zzw})^2
$$
where
$$
X_{zzw} = \sum_{i=1}^I
\frac{z_i}{z_\Sigma}
X_{izw}
$$
Note the difference between the two weighted averages (3) and (10). See Belhadj
et al. (2009) for further discussion on this topic.
Finally, the estimator of the collective mean $m$ is $\hat m = X_{zzw}$.
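The circularity of the pseudo-estimators is what forces the iteration: the credibility weights depend on the variance components, and the estimators of the variance components are weighted by those same credibility factors. Here is a one-level base-R sketch of that fixed-point idea on simulated data, with invented names (the real two-level computation is what `cm` performs internally):

```r
# Fixed-point iteration for a pseudo-estimator of the between variance 'a'
# in a one-level weighted model: z depends on a, and a is re-estimated via z.
set.seed(3)
I <- 6; n <- 10
w <- matrix(runif(I * n, 1, 5), nrow = I)     # volumes
theta <- rnorm(I, mean = 0, sd = 0.5)
X <- matrix(rnorm(I * n, mean = rep(theta, n), sd = 1 / sqrt(w)), nrow = I)

wi  <- rowSums(w)
Xiw <- rowSums(w * X) / wi                    # volume-weighted entity means
s2  <- sum(w * (X - Xiw)^2) / (I * (n - 1))   # within variance

a <- var(Xiw)                                 # starting value
for (k in 1:200) {
  z     <- wi * a / (wi * a + s2)             # credibility factors
  Xzw   <- sum(z * Xiw) / sum(z)              # z-weighted collective mean
  a_new <- sum(z * (Xiw - Xzw)^2) / (I - 1)   # re-estimate 'a'
  if (abs(a_new - a) < 1e-10) break
  a <- a_new
}
```

In `cm`, choosing `method = "iterative"` (as in the fitted example above) runs the analogous fixed point over both levels of the hierarchy.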
The credibility modeling function cm assumes that data is available in the
format most practical applications would use, namely a rectangular array (matrix
or data frame) with entity observations in the rows and with one or more
classification index columns (numeric or character). One will recognize the
output format of simul and its summary methods.
Then, function cm works much the same as lm. It takes in argument: a
formula of the form `~ terms` describing the hierarchical interactions in a data
set; the data set containing the variables referenced in the formula; the names
of the columns where the ratios and the weights are to be found in the data
set. The latter should contain at least two nodes in each level and more than
one period of experience for at least one entity. Missing values are represented
by NAs. There can be entities with no experience (complete lines of NAs).
In order to give an easily reproducible example, we group states 1 and 3 of the Hachemeister data set into one cohort and states 2, 4 and 5 into another. This shows that data does not have to be sorted by level. The fitted model using the iterative estimators is:
> X <- cbind(cohort = c(1, 2, 1, 2, 2), hachemeister)
> fit <- cm(~cohort + cohort:state, data = X, ratios = ratio.1:ratio.12,
+ weights = weight.1:weight.12, method = "iterative")
> fit
Call:
cm(formula = ~cohort + cohort:state, data = X, ratios = ratio.1:ratio.12,
weights = weight.1:weight.12, method = "iterative")
Structure Parameters Estimators
Collective premium: 1746
Between cohort variance: 88981
Within cohort/Between state variance: 10952
Within state variance: 139120026
The function returns a fitted model object of class "cm" containing the estimators of the structure parameters. To compute the credibility premiums, one calls a method of predict for this class:
> predict(fit)
$cohort
[1] 1949 1543
$state
[1] 2048 1524 1875 1497 1585
One can also obtain a nicely formatted view of the most important results with a call to summary:
> summary(fit)
Call:
cm(formula = ~cohort + cohort:state, data = X, ratios = ratio.1:ratio.12,
weights = weight.1:weight.12, method = "iterative")
Structure Parameters Estimators
Collective premium: 1746
Between cohort variance: 88981
Within cohort/Between state variance: 10952
Within state variance: 139120026
Detailed premiums
Level: cohort
cohort Indiv. mean Weight Cred. factor Cred. premium
1 1967 1.407 0.9196 1949
2 1528 1.596 0.9284 1543
Level: state
cohort state Indiv. mean Weight Cred. factor Cred. premium
1 1 2061 100155 0.8874 2048
2 2 1511 19895 0.6103 1524
1 3 1806 13735 0.5195 1875
2 4 1353 4152 0.2463 1497
2 5 1600 36110 0.7398 1585
The methods of predict and summary can both report for a subset of the levels by means of an argument levels. For example:
> summary(fit, levels = "cohort")
Call:
cm(formula = ~cohort + cohort:state, data = X, ratios = ratio.1:ratio.12,
weights = weight.1:weight.12, method = "iterative")
Structure Parameters Estimators
Collective premium: 1746
Between cohort variance: 88981
Within cohort variance: 10952
Detailed premiums
Level: cohort
cohort Indiv. mean Weight Cred. factor Cred. premium
1 1967 1.407 0.9196 1949
2 1528 1.596 0.9284 1543
> predict(fit, levels = "cohort")
$cohort
[1] 1949 1543
The results above differ from those of Goovaerts and Hoogstad (1987) for the same example because the formulas for the credibility premiums are different.
##Buhlmann Straub call
$\DeclareMathOperator*{\Var}{\mathrm{Var}} \DeclareMathOperator*{\Cov}{\mathrm{Cov}}$
cm(~state, hachemeister, ratios = ratio.1:ratio.12)
##Gisler's Book Hierarchical Implementation
Level | Interpretations | Variables | Indices | Set of Indices
---|---|---|---|---
4 | Line of Business | | |
3 | Classes | $\Psi_g$ | $g$ | $G$
2 | Risk Groups | $\Phi_h$ | $h$ | $H$
1 | Individual Risks | $\Theta_i$ | $i$ | $I$
0 | Data | $X_{ij}$ | $ij$ |
###Some Notation:
$\Phi(\Psi_g) = $ set of the risk-group parameters $\Phi_h$ belonging to class $g$; analogously, $\Theta(\Phi_h) = $ set of the $\Theta_i$ belonging to risk group $h$, and $\mathbf{D}(\Theta_i) = $ set of the observations of risk $i$.
###Assumptions
- The random variables $\Psi_g\ \ (g= 1, 2, \ldots, |G|)$ are i.i.d. with density $r_3(\psi)$.
- Given $\Psi_g$, the random variables $\Phi_h\in \Phi(\Psi_g)$ are i.i.d. with the conditional density $r_2(\phi\mid\Psi_g)$.
- Given $\Phi_h$, the random variables $\Theta_i\in \Theta(\Phi_h)$ are i.i.d. with the conditional density $r_1(\theta\mid\Phi_h)$.
- Given $\Theta_i$, the observations $X_{ij}\in \mathbf{D}(\Theta_i)$ are conditionally independent with densities $r_0(x\mid\Theta_i, w_{ij})$, for which $$ \E\left[X_{ij}\mid \Theta_i\right]= \mu(\Theta_i) $$ and $$ \V \left[X_{ij}\mid \Theta_i\right] = \frac{\sigma^2(\Theta_i)}{w_{ij}}, $$ where the $w_{ij}$ are known weights.
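To make the nesting concrete, here is a small simulation of one line of business under these assumptions, with normal densities standing in for $r_3$, $r_2$, $r_1$, $r_0$ (the model itself only constrains the first two moments; all sizes and parameter values are invented):

```r
# Level 3: classes; level 2: risk groups; level 1: risks; level 0: data.
set.seed(5)
G <- 3; H_per_g <- 2; I_per_h <- 4; n_obs <- 6
Psi   <- rnorm(G, mean = 100, sd = 15)                    # class parameters
Phi   <- rnorm(G * H_per_g,
               mean = rep(Psi, each = H_per_g), sd = 10)  # risk-group params
Theta <- rnorm(G * H_per_g * I_per_h,
               mean = rep(Phi, each = I_per_h), sd = 5)   # risk parameters
w <- matrix(runif(length(Theta) * n_obs, 1, 3),
            nrow = length(Theta))                         # known weights w_ij
X <- matrix(rnorm(length(Theta) * n_obs,
                  mean = rep(Theta, n_obs),
                  sd = 8 / sqrt(w)),                      # Var = sigma^2/w_ij
            nrow = length(Theta))
```

Each row of `X` is one individual risk; rows are nested in risk groups, which are nested in classes.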
For a line of business with collective mean $\mu_0$, let's list the sets of nodes in the hierarchical tree:
$$\begin{align}
\mathbb{G}&:=\{g : \Psi_g\in \Psi(\mu_0)\}\\
\mathbb{H}&:=\bigcup_{g\in \mathbb{G}}\mathbb{H}_g\\
\mathbb{I}&:=\bigcup_{h\in \mathbb{H}} \mathbb{I}_h
\end{align}$$
Further, we denote the number of elements in a set with $|\cdot|$:
- $|\mathbb{I}_h| =$ number of nodes at the $\Theta$-level which stem from $\Phi_h$.
- $|\mathbb{I}| =$ total number of nodes at the $\Theta$-level.
- $|\mathbb{G}| =$ total number of nodes at level 3.
We denote the number of observations for the $i$-th risk by $m_i$ and its total weight by $w_{i\bullet}=\sum_{j} w_{ij}$.
We let $B_i^{(1)} = \sum_{j}\frac{w_{ij}}{w_{i\bullet}}X_{ij}$ denote the weighted average of the observations of the $i$-th risk.
Consider the statistic
$$
\begin{align}
\widehat{T_h^{(1)}} &=
c_h\cdot
\left\{
\frac{|I_h|}{|I_h|-1}
\cdot
\sum_{i\in I_h}
\frac{w_{i\bullet}}{z_h^{(1)}}
\cdot
\left(
B_i^{(1)} -\overline{B}_h^{(1)}
\right)^2
-\frac{|I_h|\cdot\widehat{\sigma^2}}{z_h^{(1)}}
\right\}\\
\text{where}\quad
z_h^{(1)}&=\sum_{i\in I_h}w_{i\bullet},\\
\overline{B}_h^{(1)} &= \sum_{i\in I_h}\frac{w_{i\bullet}}{z_h^{(1)}}\cdot B_i^{(1)},\\
c_h&=
\frac{|I_h|-1}{|I_h|}
\left\{
\sum_{i\in I_h}\frac{w_{i\bullet}}{z_h^{(1)}}
\left(
1-\frac{w_{i\bullet}}{z_h^{(1)}}
\right)
\right\}^{-1}
\end{align}
$$
So we choose
$$
\widehat{\tau_1^2} =
\frac{1}{|H|}
\sum_{h\in H}
\max
\left\{\widehat{T_h^{(1)}},0\right\}
$$
as estimator of $\tau_1^2$.
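A direct base-R transcription of $\widehat{T_h^{(1)}}$ for a single risk group $h$; the inputs ($w_{i\bullet}$, $B_i^{(1)}$ and $\widehat{\sigma^2}$) are simulated stand-ins rather than real data:

```r
# Level-1 variance component statistic for one group h, as defined above.
set.seed(4)
I_h <- 8                                   # |I_h|, number of risks in group h
wi  <- runif(I_h, 10, 50)                  # aggregate weights w_{i.}
B   <- rnorm(I_h, mean = 1, sd = 0.3)      # weighted individual means B_i^(1)
sigma2_hat <- 4                            # assumed estimate of sigma^2

z_h  <- sum(wi)                            # z_h^(1)
Bbar <- sum(wi / z_h * B)                  # weighted group mean
c_h  <- ((I_h - 1) / I_h) / sum(wi / z_h * (1 - wi / z_h))
T_h  <- c_h * (I_h / (I_h - 1) * sum(wi / z_h * (B - Bbar)^2)
               - I_h * sigma2_hat / z_h)
T_h_plus <- max(T_h, 0)                    # group h's contribution to tau_1^2
```

Averaging `T_h_plus` over all groups $h\in H$ yields $\widehat{\tau_1^2}$.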
Analogously, at the next level up, we define
$$
\begin{align}
\widehat{T_g^{(2)}} &=
c_g\cdot
\left\{
\frac{|H_g|}{|H_g|-1}
\cdot
\sum_{h\in H_g}
\frac{w_{h}^{(2)}}{z_g^{(2)}}
\cdot
\left(
B_h^{(2)} -\overline{B}_g^{(2)}
\right)^2
-\frac{|H_g|\cdot\widehat{\tau_1^2}}{z_g^{(2)}}
\right\}\\
\text{where}\quad
z_g^{(2)}&=\sum_{h\in H_g}w_h^{(2)},\\
\overline{B}_g^{(2)} &= \sum_{h\in H_g}
\frac{w_h^{(2)}}{z_g^{(2)}}
\cdot B_h^{(2)},\\
c_g&=
\frac{|H_g|-1}{|H_g|}
\left\{
\sum_{h\in H_g}
\frac{w_h^{(2)}}{z_g^{(2)}}
\left(
1-\frac{w_h^{(2)}}{z_g^{(2)}}
\right)
\right\}^{-1}.
\end{align}
$$
So we choose
$$
\widehat{\tau_2^2} =
\frac{1}{|G|}
\sum_{g\in G}
\max
\left\{\widehat{T_g^{(2)}},0\right\}
$$
as estimator of $\tau_2^2$.
Following the same scheme as the estimation of $\tau_1^2$ and $\tau_2^2$, one proceeds level by level up the hierarchy.
#IRM Implementation
Using the actuar R package.
Create the hachemeister dataset:
library(actuar)
library(foreign)
vignette("credibility")
data.frame(hachemeister)
write.foreign(
data.frame(hachemeister),
"C:/Bits/cred/meister.txt",
"c:/Bits/cred/meister.sas",
package="SAS"
)
# Write out the transpose dataset
write.foreign(
data.frame(t(hachemeister)),
"C:/Bits/cred/tmeister.txt",
"c:/Bits/cred/tmeister.sas",
package="SAS"
)
https://drive.google.com/drive/folders/0B4foEG5BEjCqdk5oQTluV01pWnc
Tip: You can open any markdown URL within StackEdit Viewer using viewer#!url=.
##Password usage in Mercurial
http://www.swiftsoftwaregroup.com/how-to-configure-tortoisehg-to-remember-your-username-and-password/
From this question I learned this syntax:
https://stackedit.io/viewer#!url=http://path/to/markdown.md
However, I did not find how to open a local file (which is possible with the "Import from disk" dialog).
Is it possible to open a local document with a similar syntax, e.g. https://stackedit.io/viewer#!url=file:///C:/test.md ?
Unfortunately, the app does not handle any (download) protocol but http and https.
If you just wanted to access the files (not edit and save), you could run a simple static file server; you could then read them, just not save them. Likely useless, I know, but here for completeness.
Because the app is hosted in a browser, you won't have real access to your local file system (except through Dropbox/Google Docs, which use an API). You can see that the app also has a local version of an MD file included, but again, read only.
I am sure there will be someone who might host it in electron, which would give you complete File System access with a few minimal tweaks.
I for one would love to integrate this into an internal documentation server. Along with my thousand other projects I want to work on...
http://www.wildandscruffy.com/woodworking-projects/metal-lathe-project-plans
http://www.green-trust.org/junkyardprojects/FreeHomeWorkshopPlans.html
http://absolutelyfreeplans.com/metalworking/metalworking.htm
##The Buhlmann Straub call in R
cm(~state, hachemeister, ratios = ratio.1:ratio.12)
and the corresponding result
Call:
cm(formula = ~state, data = hachemeister, ratios = ratio.1:ratio.12)
Structure Parameters Estimators
Collective premium: 1671.017
Between state variance: 72310.02
Within state variance: 46040.47
Surfacing the code outside the package so that it can be modified and debugged as we go.
source( file = "C:/Bits/cred/jrUtil.R")
source(file = "C:/Bits/cred/actuar/cm.R")
source(file = "C:/Bits/cred/actuar/bstraub.R")
DEBUG = T
cm(~state, hachemeister, ratios = ratio.1:ratio.12)
Call:
cm(formula = ~state, data = hachemeister, ratios = ratio.1:ratio.12)
Structure Parameters Estimators
Collective premium: 1671.017
Between state variance: 72310.02
Within state variance: 46040.47
Good, the results are the same.
Now, on the way to a SAS version.