This project forked from erdogant/benfordslaw

benfordslaw is about the frequency distribution of leading digits.

License: Other

benfordslaw

  • benfordslaw is a Python package that tests whether an empirical (observed) distribution differs significantly from a theoretical (expected, Benford's) distribution. Benford's Law states that in many naturally occurring collections of numbers, the leading significant digit is likely to be small. Use this method to test whether a set of numbers may be artificial or manipulated. If a set of values follows Benford's Law, then a model's predicted values for that set should also follow Benford's Law. Unmanipulated data tends to follow Benford's Law, whereas manipulated or fraudulent data typically does not.

  • Assumptions of the data:

    1. The numbers need to be random and not assigned, with no imposed minimums or maximums.
    2. The numbers should cover several orders of magnitude.
    3. The dataset should preferably contain at least 1000 samples, although Benford's Law has been shown to hold for datasets containing as few as 50 numbers.
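To make the law and the assumptions concrete, here is a minimal standard-library sketch. It computes the expected first-digit probabilities from the usual statement of Benford's Law, P(d) = log10(1 + 1/d), extracts leading digits, and estimates how many orders of magnitude a dataset spans. The helper names (`benford_expected`, `leading_digit`, `orders_of_magnitude`) and the sample values are illustrative assumptions and are not part of the benfordslaw API.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities: P(d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(value):
    """Return the first significant digit of a number, or None for zero."""
    # Scientific notation puts the first significant digit at index 0.
    first = int(f"{abs(value):.15e}"[0])
    return first if first != 0 else None


def orders_of_magnitude(values):
    """Rough span of the non-zero data in powers of ten."""
    nonzero = [abs(v) for v in values if v != 0]
    return math.log10(max(nonzero) / min(nonzero))


# Illustrative sample (hypothetical values, far too few for a real test).
sample = [5387, 23618, 1710, 16, 21, 0]

expected = benford_expected()
observed = Counter(d for d in map(leading_digit, sample) if d is not None)
span = orders_of_magnitude(sample)

for d in range(1, 10):
    print(d, round(expected[d], 4), observed.get(d, 0))
print("orders of magnitude spanned:", round(span, 2))
```

A span well below one order of magnitude, or a sample of only a few dozen values, suggests the assumptions above are not met and the test result should be read with caution.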

Installation

  • Install benfordslaw from PyPI (recommended). benfordslaw is compatible with Python 3.6+ and runs on Linux, MacOS X and Windows.
  • It is distributed under the MIT license.

pip install benfordslaw
  • Alternatively, install benfordslaw from the GitHub source:
git clone https://github.com/erdogant/benfordslaw.git
cd benfordslaw
pip install -U .

Import benfordslaw package

from benfordslaw import benfordslaw

# Initialize
bl = benfordslaw(alpha=0.05)

# Load elections example
df = bl.import_example(data='USA')

# Extract election information.
X = df['votes'].loc[df['candidate']=='Donald Trump'].values

# Print
print(X)
# array([ 5387, 23618,  1710, ...,    16,    21,     0], dtype=int64)

# Make fit
results = bl.fit(X)

# Plot
bl.plot(title='Donald Trump')
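For intuition about what comparing an observed distribution with Benford's distribution at alpha=0.05 can look like, the following is a minimal sketch of a chi-square goodness-of-fit test on first digits using only the standard library. It is an assumption-laden illustration and not necessarily how benfordslaw computes its statistic internally. The function names and both datasets are hypothetical, and 15.507 is the standard chi-square critical value for 8 degrees of freedom at alpha = 0.05.

```python
import math
from collections import Counter

# Chi-square critical value for 8 degrees of freedom at alpha = 0.05.
CHI2_CRITICAL_DF8_ALPHA05 = 15.507


def first_digits(values):
    """First significant digit of each non-zero value."""
    return [int(f"{abs(v):.15e}"[0]) for v in values if v != 0]


def chi2_statistic(values):
    """Chi-square statistic of observed first digits against Benford's Law."""
    digits = first_digits(values)
    n = len(digits)
    counts = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat


# Hypothetical data: powers of two follow Benford's Law closely,
# while the integers 100..999 have uniformly distributed first digits.
benford_like = [2 ** k for k in range(1, 301)]
uniform_like = list(range(100, 1000))

for name, data in [("powers of two", benford_like), ("100..999", uniform_like)]:
    stat = chi2_statistic(data)
    verdict = "differs from Benford" if stat > CHI2_CRITICAL_DF8_ALPHA05 else "consistent with Benford"
    print(f"{name}: chi2 = {stat:.2f} ({verdict})")
```

A statistic above the critical value means the hypothesis that the digits follow Benford's distribution is rejected at the chosen significance level.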

Analyze the second-digit distribution

from benfordslaw import benfordslaw

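# Initialize and set to analyze the second digit position
bl = benfordslaw(pos=2)
# Load the USA elections example
df = bl.import_example(data='USA')
# Extract the votes for one candidate
X = df['votes'].loc[df['candidate']=='Donald Trump'].values
# Make fit
results = bl.fit(X)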
# Plot
bl.plot(title='Donald Trump', barcolor=[0.5, 0.5, 0.5], fontsize=12, barwidth=0.4)
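For reference, the generalized form of Benford's Law gives the expected second-digit probabilities by summing over all possible first digits: P(d2) = sum over d1 = 1..9 of log10(1 + 1/(10*d1 + d2)). The sketch below computes them with the standard library only. The function name is hypothetical, and this is an assumption about the reference distribution, not a statement of what benfordslaw uses internally for pos=2.

```python
import math


def second_digit_expected():
    """Expected second-digit probabilities under generalized Benford's Law."""
    return {
        d2: sum(math.log10(1 + 1 / (10 * d1 + d2)) for d1 in range(1, 10))
        for d2 in range(10)
    }


second = second_digit_expected()
for digit, prob in second.items():
    print(digit, round(prob, 4))
```

The second-digit distribution is much flatter than the first-digit one, ranging from about 12% for 0 down to about 8.5% for 9, so larger samples are generally needed to detect deviations.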

Analyze the last-digit distribution

from benfordslaw import benfordslaw

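# Initialize and set to analyze the last digit position
bl = benfordslaw(pos=-1)
# Load the USA elections example
df = bl.import_example(data='USA')
# Extract the votes for one candidate
X = df['votes'].loc[df['candidate']=='Donald Trump'].values
# Make fit
results = bl.fit(X)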
# Plot
bl.plot(title='Donald Trump', barcolor=[0.5, 0.5, 0.5], fontsize=12, barwidth=0.4)

Analyze the second-to-last-digit distribution

from benfordslaw import benfordslaw

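# Initialize and set to analyze the second-to-last digit position
bl = benfordslaw(pos=-2)
# Load the USA elections example
df = bl.import_example(data='USA')
# Extract the votes for one candidate
X = df['votes'].loc[df['candidate']=='Donald Trump'].values
# Make fit
results = bl.fit(X)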
# Plot
bl.plot(title='Donald Trump', barcolor=[0.5, 0.5, 0.5], fontsize=12, barwidth=0.4)

Citation

Please cite benfordslaw in your publications if it is useful for your research. Here is an example BibTeX entry:

@misc{erdogant2020benfordslaw,
  title={benfordslaw},
  author={Erdogan Taskesen},
  year={2019},
  howpublished={\url{https://github.com/erdogant/benfordslaw}},
}

Maintainer

  • Erdogan Taskesen, github: erdogant
  • This work is created and maintained in my free time. If you wish to buy me a coffee for this work, it is much appreciated.
  • Contributions are welcome.
  • Star it if you like it!
