Comments (6)

jeffalstott commented on July 22, 2024

from powerlaw.

 avatar commented on July 22, 2024

When I run the example in a while loop, like this:

import powerlaw
xmax=100
exponent=1.1
dist=powerlaw.Power_Law(xmin=1,xmax=xmax,discrete=True,parameters=[exponent],discrete_approximation="xmax")
while True:
    print(dist.generate_random(n=1,estimate_discrete=False))

it crashes after a while. Requesting more samples in a single call also triggers the problem almost every time:
print(dist.generate_random(n=100,estimate_discrete=False))

jeffalstott commented on July 22, 2024

I don't have a computer with powerlaw installed (or installable) at the moment, but reading through the code, the docstring for .generate_random states "If xmax is present, it is currently ignored." So, in principle, this function shouldn't generate what you want anyway. I would just generate data from a Power_Law instance without an xmax, then discard all the data above your desired xmax.

It appears the bug is that .generate_random calls _double_search_discrete, which does a binary search across values of x to match a random number in [0,1] with the CCDF of the Power_Law instance (thus producing each x with a frequency proportional to its probability under the Power_Law). My guess is that the problem arises when there is an xmax and the binary search overshoots past it, yielding a candidate x value that the cdf function can't handle: that function trims out everything outside of xmin and xmax with trim_to_range, which yields an empty list and later produces the error you see.

So, the problem is that we don't actually know how to directly generate synthetic data from a discrete power law with an xmax. If there's an elegant solution, I'll happily implement it. In the meantime, please confirm that my suggestion above actually works for you. Thanks!
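
One possible direction, offered only as a sketch: with a finite xmax the support is finite, so the probabilities can be normalized explicitly and sampled by inverse-CDF lookup. The code below assumes p(x) is proportional to x^(-alpha) on the integers xmin..xmax. The function name and its parameters are hypothetical and are not part of powerlaw.

```python
import bisect
import itertools
import random


def sample_truncated_discrete_power_law(n, alpha, xmin, xmax, rng=random):
    """Draw n integers from p(x) ~ x**(-alpha) restricted to xmin..xmax.

    Hypothetical helper for illustration; not a powerlaw API.
    """
    support = range(xmin, xmax + 1)
    # Unnormalized probability mass of each integer in the support.
    weights = [x ** -alpha for x in support]
    # Running sums form an unnormalized CDF over the finite support.
    cdf = list(itertools.accumulate(weights))
    total = cdf[-1]

    samples = []
    for _ in range(n):
        u = rng.random() * total
        # First index whose cumulative weight exceeds u.
        i = bisect.bisect_right(cdf, u)
        # Guard against floating-point rounding pushing u up to total.
        samples.append(support[min(i, len(support) - 1)])
    return samples


if __name__ == "__main__":
    print(sample_truncated_discrete_power_law(10, alpha=1.1, xmin=1, xmax=100))
```

This costs memory proportional to xmax - xmin, so it only makes sense when the range is modest, as with xmax=100 in the report above.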

 avatar commented on July 22, 2024

If I understood correctly, it is not possible to use this library to sample the distribution with a given xmax?
Thank you for your help.

jeffalstott commented on July 22, 2024

I would just generate data from a Power_Law instance without an xmax, then cut out all the data above your desired xmax.
That'll get you what you want.
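
A minimal sketch of that workaround, written so it runs without powerlaw installed: draw in batches from an unbounded sampler and keep only values at or below xmax until enough have been collected. Both sample_with_xmax and standin_draw are hypothetical names. The stand-in sampler floors a continuous power law and is only there to make the example executable; in practice, draw would presumably wrap generate_random on a Power_Law instance created without xmax.

```python
import random


def sample_with_xmax(draw, n, xmax, batch=1000):
    """Collect n samples <= xmax by drawing batches and discarding larger values.

    `draw(k)` is a hypothetical callable returning k samples from the
    unbounded distribution.
    """
    kept = []
    while len(kept) < n:
        kept.extend(x for x in draw(batch) if x <= xmax)
    return kept[:n]


def standin_draw(k, alpha=1.1, xmin=1):
    """Stand-in unbounded sampler (an assumption, not the library's method).

    Uses the inverse CCDF of a continuous power law, floored to integers.
    """
    return [
        int(xmin * (1.0 - random.random()) ** (-1.0 / (alpha - 1.0)))
        for _ in range(k)
    ]


if __name__ == "__main__":
    print(sample_with_xmax(standin_draw, n=10, xmax=100))
```

Discarding values above xmax leaves the relative probabilities below xmax unchanged, which is why the kept samples follow the truncated distribution. With an exponent as small as 1.1 a large share of draws is thrown away, so a reasonably large batch size helps.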

 avatar commented on July 22, 2024

Okay, thank you very much.
