
Comments (8)

jeffalstott commented on June 24, 2024

This is the Truncated_Power_Law class trying to set some initial parameters
(alpha and lambda), from which it will numerically search for a better fit.
We get an initial estimate of alpha analytically from the data using this
formula:

alpha = 1 + len(data)/sum( log( data / (self.xmin) ))

The problem arises when sum( log( data / (self.xmin) )) == 0. This can occur
if the only data points are equal to xmin (so data/xmin == 1, and log(1) ==
0). It could also occur if the sum of the logs just happens to be 0
(vanishingly unlikely).
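
As a rough illustration of the failure mode described above (a sketch, not the library's actual code), the formula can be reproduced with NumPy. The function name `initial_alpha` is made up for this example, and the warning is suppressed only to keep the output clean.

```python
import numpy as np

def initial_alpha(data, xmin):
    """Analytic starting guess for alpha, following the formula quoted above."""
    data = np.asarray(data, dtype=float)
    # Silence the "divide by zero" RuntimeWarning for this demonstration only.
    with np.errstate(divide="ignore"):
        return 1 + len(data) / np.sum(np.log(data / xmin))

# Data spread above xmin: the sum of logs is positive, alpha is finite.
print(initial_alpha([1.0, 2.0, 4.0], xmin=1.0))

# Every point equals xmin: each log is 0, so the estimate is inf.
print(initial_alpha([2.0, 2.0, 2.0], xmin=2.0))
```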

In that situation alpha = inf. That becomes the starting guess for the
numerical search, which should very quickly move off it and find a finite
value for alpha. Do you still get a real number for alpha on the truncated
power law fit?

import powerlaw
fit = powerlaw.Fit(data)
fit.truncated_power_law.alpha

I am interested in how/why this situation could arise to begin with. As I
said, this seems only likely to occur if all data is exactly equal to xmin.
If that were the case all the fits should look terrible, as we effectively
have one data point. What do the data and fits look like?

I suppose another possibility is if data and xmin were both ints (instead
of floats). In Python 2, integer division truncates, so 3/2 == 1, and
log(1) == 0, etc. So data and xmin could just be within a factor of two of
each other, and not exactly equal. However, xmin should at least be a float.
Is it in your case? Maybe we should forcibly cast all the data to float at
the initialization of the Fit object.
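
A small sketch of the integer case, under the assumption that both data and xmin are ints: Python 3's `/` is true division, so `//` is used here to imitate the truncating Python 2 behaviour. The variable names are illustrative only.

```python
import numpy as np

data = np.array([3, 3, 2])   # integer data, none far from xmin
xmin = 2

# `//` imitates Python 2's truncating `/` on ints: 3 // 2 == 1.
int_ratio = data // xmin
int_log_sum = np.sum(np.log(int_ratio))        # every ratio is 1, so the sum is 0

# Casting to float first keeps the fractional part: 3.0 / 2 == 1.5.
float_log_sum = np.sum(np.log(data.astype(float) / xmin))

print(int_log_sum, float_log_sum)
```

With the float cast, the sum of logs is positive and the initial alpha estimate stays finite.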

On Thu, Feb 27, 2014 at 4:54 AM, Philipp Singer wrote:

Hi Jeff,

Just stumbled upon a runtime error when fitting the truncated power law
function. Here is the corresponding message:

Assuming nested distributions
/usr/local/lib/python2.7/dist-packages/powerlaw.py:1351: RuntimeWarning: divide by zero encountered in double_scalars
  alpha = 1 + len(data)/sum( log( data / (self.xmin) ))
Traceback (most recent call last):
[...]
    R, p = fit.distribution_compare('power_law', 'truncated_power_law', normalized_ratio=True)
  File "/usr/local/lib/python2.7/dist-packages/powerlaw.py", line 315, in distribution_compare
[...]

Thanks,
Philipp


from powerlaw.

psinger commented on June 24, 2024

Sorry for the late response. It seems it is indeed an int problem. My data consisted of ints instead of floats. After forcing floats, the error no longer occurs.


jeffalstott commented on June 24, 2024

Thanks, Philipp. Perhaps we should cast all data as floats, then, so this
doesn't happen again.


jeffalstott commented on June 24, 2024

All data is now cast to floats.


robegs commented on June 24, 2024

Hi Jeff,

I'm getting a similar error. I cast the data to float, but the error is still there:

fit.distribution_compare('power_law', 'truncated_power_law'

[...]

/home/myuser/anaconda/lib/python2.7/site-packages/powerlaw.pyc in _pdf_discrete_normalizer(self)
1383 from mpmath import exp # faster /here/ than numpy.exp
1384 C = ( float(exp(self.xmin * self.Lambda) /
-> 1385 lerchphi(exp(-self.Lambda), self.alpha, self.xmin)) )
1386 if self.xmax:
1387 Cxmax = ( float(exp(self.xmax * self.Lambda) /

/home/myuser/anaconda/lib/python2.7/site-packages/mpmath/ctx_mp_python.pyc in div(self, other)

/home/myuser/anaconda/lib/python2.7/site-packages/mpmath/libmp/libmpf.pyc in mpf_div(s, t, prec, rnd)
932 return fzero
933 if t == fzero:
--> 934 raise ZeroDivisionError
935 s_special = (not sman) and sexp
936 t_special = (not tman) and texp

ZeroDivisionError:

Is there any possible solution?


jeffalstott commented on June 24, 2024

It looks like lerchphi is returning 0. Can you:

  1. Provide the data you used that led to this error.
  2. Provide the minimal data needed to reproduce this error.
    OR
  3. Report the values of self.Lambda, self.alpha, and self.xmin right before the error?
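
One way to capture those values is to wrap the quotient from the traceback so that a zero denominator fails with the parameters in the message. This is only a sketch: `checked_normalizer` and `zero_lerch` are hypothetical names, and the Lerch function is passed in as an argument (in real use it would be something like `mpmath.lerchphi`, which the traceback shows being called).

```python
import math

def checked_normalizer(alpha, Lambda, xmin, lerch):
    """Compute exp(xmin*Lambda) / lerch(exp(-Lambda), alpha, xmin),
    reporting the parameter values if the denominator is 0."""
    denominator = lerch(math.exp(-Lambda), alpha, xmin)
    if denominator == 0:
        raise ValueError(
            "lerch returned 0 for alpha=%r, Lambda=%r, xmin=%r"
            % (alpha, Lambda, xmin)
        )
    return float(math.exp(xmin * Lambda) / denominator)

# Hypothetical stand-in that always returns 0, used only to trigger the report.
def zero_lerch(z, s, a):
    return 0.0

try:
    checked_normalizer(alpha=float("inf"), Lambda=0.1, xmin=1.0, lerch=zero_lerch)
except ValueError as err:
    message = str(err)
    print(message)
```

The stand-in exists purely to exercise the error path; it says nothing about which parameter values actually make the real function return 0.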

Thanks!


robegs commented on June 24, 2024

Hi Jeff,

The data is too big, about 150M points. I don't know how to share it with you (the CSV file is about 900 MB).

I need a few hours to load the data and fit it again to give you the values of these variables.

Thanks for your prompt response!!


jeffalstott commented on June 24, 2024

How about trying on a smaller, random sample?
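
A minimal sketch of drawing such a sample with NumPy, assuming the data fits in memory as a 1-D array. The helper name `random_subsample`, the sample size, and the seed are arbitrary choices for illustration.

```python
import numpy as np

def random_subsample(data, size, seed=0):
    """Draw `size` points uniformly at random, without replacement."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    size = min(size, len(data))   # never ask for more points than exist
    return rng.choice(data, size=size, replace=False)

# Stand-in for the real data set.
full = np.arange(1, 1001)
sample = random_subsample(full, 100)
print(len(sample))

# The sample could then be fitted as usual, e.g. powerlaw.Fit(sample).
```

A fixed seed makes the subsample reproducible, which helps if the error only appears for some draws.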

