Comments (1)
Thank you very much for considering this use case and writing a solution
for it!
Can you submit this edit as a pull request? It may also be worth testing it
some more first; I can see in the code a reference to xmax2, which is not
defined. As a lower priority, I would also consider whether there's a way to
do this without making another copy of the data; some users have very large
datasets, and an extra copy can get burdensome.
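One possible way to avoid the extra copy, offered as a rough sketch rather than as the library's implementation: leave the data untouched and scale the bin edges instead. The function name `pdf_scaled_edges` and the `bins_per_decade` parameter are hypothetical, and the sketch assumes `xmin > 0` and `xmax > xmin`. The `unique(floor(logspace(...)))` step mirrors the binning quoted in this thread, applied in units of xmin.

```python
import numpy as np

def pdf_scaled_edges(data, xmin=None, xmax=None, bins_per_decade=10):
    """Log-binned density that rescales the bin edges, not the data.

    Hypothetical helper for illustration; assumes xmin > 0 and xmax > xmin.
    """
    data = np.asarray(data, dtype=float)  # no copy if already a float array
    if xmin is None:
        xmin = data.min()
    if xmax is None:
        xmax = data.max()
    # Log-spaced edges in units of xmin, so the smallest edge is exactly 1.
    log_max = np.log10(xmax / xmin)
    n = max(int(np.ceil(log_max * bins_per_decade)), 2)
    relative_edges = np.unique(np.floor(np.logspace(0.0, log_max, num=n)))
    # Scale the (small) edge array back to data units instead of the data.
    edges = xmin * relative_edges
    hist, edges = np.histogram(data, bins=edges, density=True)
    return edges, hist

data = np.array([0.02, 0.05, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0])
edges, hist = pdf_scaled_edges(data)
```

Because `density=True` normalizes over the edges actually passed to `histogram`, the returned values integrate to 1 in the original units of the data, and only the edge array (a few dozen values) is rescaled.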
On Sat, May 14, 2016 at 10:38 PM, Cils [email protected] wrote:
I noticed that values below 1 are not included in the PDF of the data
(powerlaw.pdf(), fit.plot_pdf()). For some reason unknown to me, the
logarithmically spaced histogram bin boundaries are converted to integers.

Line 1952 in powerlaw.py:
`bins=unique( floor(logspace( log_min_size, log_max_size, num=number_of_bins)))`

Because every boundary below 1 is floored to 0, a potentially infinite number
of bins is eliminated.

I modified the function "pdf" to solve the problem. The data are rescaled by
dividing them by xmin before the histogram is computed. At the end, the bin
boundaries "edges" are transformed back to the original scale and returned.

Below is the code:
```python
def pdf(data, xmin=None, xmax=None, linear_bins=False, **kwargs):
    """
    Returns the probability density function (normalized histogram) of the
    data. Version modified by A.Capelli, include values x<1 in the pdf
    distribution.

    Parameters
    ----------
    data : list or array
    xmin : float, optional
        Minimum value of the PDF. If None, uses the smallest value in the data.
    xmax : float, optional
        Maximum value of the PDF. If None, uses the largest value in the data.
    linear_bins : bool, optional
        Whether to use linearly spaced bins, as opposed to logarithmically
        spaced bins (recommended for log-log plots).

    Returns
    -------
    bin_edges : array
        The edges of the bins of the probability density function.
    probabilities : array
        The portion of the data that is within the bin. Length 1 less than
        bin_edges, as it corresponds to the spaces between them.
    """
    from numpy import asarray, logspace, histogram, floor, unique
    from math import ceil, log10
    if not xmax:
        xmax = max(data)
    if not xmin:
        xmin = min(data)
    # normalize data to xmin, allow to have pdf also from the data below x=1,
    # modification by A.Capelli
    data2 = asarray(data) / xmin  # asarray so that plain lists also work
    xmax = xmax / xmin
    xmin_old = xmin
    xmin = 1
    if linear_bins:
        # NOTE: xmax2 is not defined anywhere (flagged in the reply above);
        # xmax is presumably what was intended.
        bins = range(int(xmin), int(xmax2))
    else:
        log_min_size = log10(xmin)
        log_max_size = log10(xmax)
        number_of_bins = ceil((log_max_size-log_min_size)*10)
        bins = unique(
            floor(
                logspace(
                    log_min_size, log_max_size, num=number_of_bins)))
    hist, edges = histogram(data2, bins, density=True)
    # transform data back to original
    xmax = xmax * xmin_old
    xmin = xmin_old
    edges = edges * xmin
    print(xmin)  # debug output
    return edges, hist
```
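The bin collapse described in the report can be reproduced in a few lines. This is a minimal sketch with illustrative values (a range of 0.01 to 100, not taken from the report); the helper name `floored_log_edges` is hypothetical and only wraps the binning expression quoted above.

```python
import numpy as np

def floored_log_edges(xmin, xmax, bins_per_decade=10):
    """Log-spaced edges rounded down to integers, as in the quoted line."""
    log_min, log_max = np.log10(xmin), np.log10(xmax)
    n = int(np.ceil((log_max - log_min) * bins_per_decade))
    return np.unique(np.floor(np.logspace(log_min, log_max, num=n)))

edges = floored_log_edges(0.01, 100.0)
# Every log-spaced edge between 0.01 and 1 is floored to 0 and then
# merged by unique(), so a single edge remains below 1.
below_one = edges[edges < 1]
print(below_one)
```

With these values, the two decades between 0.01 and 1 end up in one bin starting at 0, which is why that part of the distribution does not show up on a log-log plot.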
#33
from powerlaw.