Truncated power law - divide by zero (powerlaw, closed, 8 comments)


jeffalstott commented on June 24, 2024

This is the Truncated_Power_Law class trying to set some initial parameters
(alpha and lambda), from which it will numerically search for a better fit.
We get an initial estimate on alpha analytically from the data using this
formula:

alpha = 1 + len(data)/sum( log( data / (self.xmin) ))

The problem is when sum( log( data / (self.xmin) )) == 0. This could occur
if the only data points are equal to xmin (so data/xmin == 1, and log(1) ==
0). It could also occur if it just happens that the sum of the logs is 0
(vanishingly unlikely).

The result of such a situation is that alpha = inf. So that's the starting
guess for the numerical search, and it should very quickly get off that and
go find a real value for alpha. Do you still get a real number for the
alpha on the truncated power law fit?

fit = powerlaw.Fit(data)
fit.truncated_power_law.alpha
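
The degenerate case described above can be sketched in plain Python. This is only an illustration, not the library's code: `initial_alpha` is a hypothetical helper name, and it returns infinity explicitly to mirror the alpha = inf result mentioned here, because plain Python would otherwise raise ZeroDivisionError.

```python
import math

def initial_alpha(data, xmin):
    """Analytic starting guess: 1 + n / sum(log(x / xmin))."""
    log_sum = sum(math.log(x / xmin) for x in data)
    if log_sum == 0:
        # Plain Python would raise ZeroDivisionError on the division below;
        # return inf instead to mirror the alpha = inf starting guess.
        return math.inf
    return 1 + len(data) / log_sum

# Every point equals xmin, so each log term is log(1) == 0.
degenerate = initial_alpha([2.0, 2.0, 2.0], xmin=2.0)

# Points spread above xmin give a finite starting guess.
typical = initial_alpha([2.0, 3.0, 5.0], xmin=2.0)
```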

I am interested in how/why this situation could arise to begin with. As I
said, this seems only likely to occur if all data is exactly equal to xmin.
If that were the case all the fits should look terrible, as we effectively
have one data point. What do the data and fits look like?

I suppose another possibility is if data and xmin were both ints (instead
of floats). As ints, 3/2 == 1, and log(1) == 0, etc. So data and xmin could
just be within the same order of magnitude, and not exactly equal. However,
xmin should at least be a float. Is it in your case? Maybe we should
forcibly cast all the data as float at the initialization of the Fit object.
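
A small sketch of the integer hypothesis, assuming Python 2 semantics where int / int floors the result. The `//` operator is used to reproduce that behavior on Python 3, and the sample values are made up for illustration.

```python
import math

data = [2, 3, 3]  # hypothetical integer samples
xmin = 2

# Python 2 evaluated int / int as floor division; `//` reproduces that here.
floor_ratios = [x // xmin for x in data]
floor_log_sum = sum(math.log(r) for r in floor_ratios)

# Casting to float first keeps the fractional part of each ratio.
float_ratios = [float(x) / xmin for x in data]
float_log_sum = sum(math.log(r) for r in float_ratios)
alpha = 1 + len(data) / float_log_sum
```

With floored ratios every log term is zero, so the denominator of the alpha estimate vanishes even though the data are not all equal to xmin. With float ratios the sum is positive and the estimate is finite.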

On Thu, Feb 27, 2014 at 4:54 AM, Philipp Singer wrote:

Hi Jeff,

I just stumbled upon a runtime error while fitting the truncated power law
function. Here is the corresponding message:

Assuming nested distributions
/usr/local/lib/python2.7/dist-packages/powerlaw.py:1351: RuntimeWarning: divide by zero encountered in double_scalars
  alpha = 1 + len(data)/sum( log( data / (self.xmin) ))
Traceback (most recent call last):
  [...]
    R, p = fit.distribution_compare('power_law', 'truncated_power_law', normalized_ratio=True)
  File "/usr/local/lib/python2.7/dist-packages/powerlaw.py", line 315, in distribution_compare
  [...]

Thanks,
Philipp


psinger commented on June 24, 2024

Sorry for the late response. It seems it is indeed an int problem. My data consisted of ints instead of floats. After forcing floats, the error no longer occurs.


jeffalstott commented on June 24, 2024

Thanks, Philipp. Perhaps we should cast all data as floats, then, so this
doesn't happen again.

(In reply to Philipp Singer's message of Wed, Mar 5, 2014, 5:38 AM.)

jeffalstott commented on June 24, 2024

All data is now cast to floats.


robegs commented on June 24, 2024

Hi Jeff,

I'm getting a similar error. I cast the data to float, but the error is still there:

fit.distribution_compare('power_law', 'truncated_power_law')

[...]

/home/myuser/anaconda/lib/python2.7/site-packages/powerlaw.pyc in _pdf_discrete_normalizer(self)
   1383         from mpmath import exp # faster /here/ than numpy.exp
   1384         C = ( float(exp(self.xmin * self.Lambda) /
-> 1385             lerchphi(exp(-self.Lambda), self.alpha, self.xmin)) )
   1386         if self.xmax:
   1387             Cxmax = ( float(exp(self.xmax * self.Lambda) /

/home/myuser/anaconda/lib/python2.7/site-packages/mpmath/ctx_mp_python.pyc in div(self, other)

/home/myuser/anaconda/lib/python2.7/site-packages/mpmath/libmp/libmpf.pyc in mpf_div(s, t, prec, rnd)
    932         return fzero
    933     if t == fzero:
--> 934         raise ZeroDivisionError
    935     s_special = (not sman) and sexp
    936     t_special = (not tman) and texp

ZeroDivisionError:

Is there any possible solution?


jeffalstott commented on June 24, 2024

It looks like lerchphi is returning a 0. Can you:

1. Provide the data you used that led to this error,
2. Provide the minimal data needed to create this error,
OR
3. Return the values of self.Lambda, self.alpha, and self.xmin right before the error?

Thanks!


robegs commented on June 24, 2024

Hi Jeff,

The data is too big, about 150M points. I don't know how to share it with you (the CSV file is about 900 MB).

I need some hours to load the data and fit it again to give you the values of these variables.

Thanks for your prompt response!!


jeffalstott commented on June 24, 2024

How about trying on a smaller, random sample?
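
One way to draw such a sample, as a rough sketch using only the standard library. `random_subsample` is a hypothetical helper, the `range` object stands in for the real data set, and the sample size of 10,000 is an arbitrary choice.

```python
import random

def random_subsample(data, k, seed=0):
    """Return up to k points drawn without replacement.

    The seed is fixed so the same sample can be reproduced later.
    """
    rng = random.Random(seed)
    points = list(data)
    return rng.sample(points, min(k, len(points)))

full = range(1, 1_000_001)  # placeholder standing in for the real data
sample = random_subsample(full, 10_000)
```

The sample could then be passed to powerlaw.Fit in place of the full data to check whether the error still appears.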

(In reply to robegs's message of Friday, October 31, 2014.)
