Comments (6)
[ LP comment 1 by: joep, on 2010-10-13 17:48:56.032290+00:00 ]
See also the new thread of Oct 13, "Logit predict".
>>> logit_res.mle_retvals['converged']
True
We could check the return value of the optimization at the end of fit() and do further inspection if converged is not true.
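A minimal sketch of that idea, using a generic scipy optimizer rather than the actual fit() internals (the helper name `fit_and_check` and the toy objective are illustrative, not statsmodels API):

```python
import numpy as np
from scipy import optimize

def fit_and_check(neg_loglike, start_params):
    # fmin's warnflag is 0 only when the optimizer terminated successfully;
    # a fit() method could inspect this flag and warn before returning results.
    xopt, fopt, niters, funcalls, warnflag = optimize.fmin(
        neg_loglike, start_params, full_output=True, disp=False)
    converged = (warnflag == 0)
    if not converged:
        print("Warning: optimization did not converge; inspect the results")
    return xopt, converged

# Toy objective: a smooth quadratic with its minimum at 2.0, so the
# convergence check passes cleanly.
params, ok = fit_and_check(lambda p: (p[0] - 2.0) ** 2, np.array([0.0]))
```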
[ LP comment 2 by: Skipper Seabold, on 2010-12-15 00:18:27.728435+00:00 ]
Code to replicate:
import scikits.statsmodels as sm
import scikits.statsmodels.discretemod as dm
import numpy as np
Endog = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0])
Exog = np.array([[ 10. , -2.7, 12.3, 1. ],
[ -2.7, 8.1, -5.7, 1. ],
[ 0.6, -5.8, -7.7, 1. ],
[ 5.5, 0.6, 2. , 1. ],
[ -2.3, 10.6, -3.7, 1. ],
[ 0.3, -2.3, 0.1, 1. ],
[ -0.8, 0.3, -1.3, 1. ],
[ -4. , 1.3, -1.4, 1. ],
[ 9.4, -4. , 6.9, 1. ]])
GLM_Model = sm.GLM(Endog, Exog, family = sm.families.Binomial())
GLM_results = GLM_Model.fit()
print GLM_results.params
Logit_Model = dm.Logit(Endog, Exog)
Logit_results = Logit_Model.fit()
print Logit_results.params
A possible solution is something like (not sure what the correct default tolerance should be):
from scipy import optimize

def callback(params):
    if np.allclose(Logit_Model.cdf(np.dot(Logit_Model.exog, params))
                   - Logit_Model.endog, 0, atol=1e-4):
        raise ValueError("Perfect or Quasi-Perfect separation detected")

func = lambda params: -Logit_Model.loglike(params)

In [93]: ret = optimize.fmin_bfgs(func, np.zeros(4) + 1, callback=callback)
ValueError Traceback (most recent call last)
/home/skipper/ in ()
/usr/local/lib/python2.6/dist-packages/scipy/optimize/optimize.pyc in fmin_bfgs(f, x0, fprime, args, gtol, norm, epsilon, maxiter, full_output, disp, retall, callback)
505 gfk = gfkp1
506 if callback is not None:
--> 507 callback(xk)
508 k += 1
509 gnorm = vecnorm(gfk,ord=norm)
/home/skipper/ in callback(params)
ValueError: Perfect or Quasi-Perfect separation detected
[ LP comment 3 by: Skipper Seabold, on 2010-12-15 15:26:45.168352+00:00 ]
It has been proposed to do something like:
def callback(params):
    if np.allclose(Logit_Model.cdf(np.dot(Logit_Model.exog, params))
                   - Logit_Model.endog, 0, atol=1e-4):
        print "_Perfect or Quasi-Perfect separation detected_"
        print "Results are most likely not useful"
        raise ValueError

func = lambda params: -Logit_Model.loglike(params)

try:
    ret = optimize.fmin_bfgs(func, np.zeros(4) + 1, callback=callback)
except ValueError:
    ret = optimize.fmin_bfgs(func, np.zeros(4) + 1, maxiter=1)
This is OK, but it does not give xopt values that actually demonstrate perfect separation. Perhaps if, in the callback, we attach params to the model and then use these as starting values for the optimization in the except case, this will work.
[ LP comment 4 by: joep, on 2010-12-15 15:43:06.050398+00:00 ]
The callback function needs to hold on to the current state of the optimizer, params. When fitting a model this will be relatively easy, because we can attach it to the model instance:

self.callback_params = params

and restart the second optimization, in the except branch, with start values self.callback_params.
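The proposal above can be sketched generically with scipy's BFGS; the perfectly separated toy data are made up for illustration, and the `state` dict stands in for attaching `callback_params` to the model instance:

```python
import numpy as np
from scipy import optimize
from scipy.special import expit  # numerically stable logistic cdf

# Toy perfectly separated data: y == 1 exactly when x > 0.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = (x > 0).astype(float)
exog = np.column_stack([x, np.ones_like(x)])

def neg_loglike(params):
    p = expit(exog @ params)
    eps = 1e-12  # guard log(0) as fitted probabilities approach 0 or 1
    return -np.sum(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps))

state = {}  # stands in for the model instance holding self.callback_params

def callback(params):
    state["callback_params"] = params  # hold on to the optimizer's state
    if np.allclose(expit(exog @ params) - y, 0, atol=1e-4):
        raise ValueError("Perfect or Quasi-Perfect separation detected")

try:
    ret = optimize.fmin_bfgs(neg_loglike, np.zeros(2),
                             callback=callback, disp=False)
except ValueError:
    # Restart from the params saved in the callback, limited to one
    # iteration, so the returned xopt reflects the separated fit.
    ret = optimize.fmin_bfgs(neg_loglike, state["callback_params"],
                             maxiter=1, disp=False)
```

Whether the ValueError fires depends on how far BFGS pushes the diverging coefficients before its gradient tolerance stops it, but in either case the callback has recorded usable start values.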
Discussion and an example are also in #66.
The summary method for Logit and Probit adds warning text about complete (quasi-)separation.
More work is in https://github.com/statsmodels/statsmodels/tree/perfect-pred
Committed raising an exception in PR #100.
Added an option to turn off the exception in PR #184.