
jeffswt / pydatagen


Create random data easier with pydatagen.

License: GNU Lesser General Public License v3.0

Python 100.00%
oi random-data-generation library package

pydatagen's Introduction

pydatagen

Create random data easier with pydatagen.

Now you can use Python to generate random data for OI problems, in a really Pythonified way!

Ports

You can use printf just as you would in C or C++ on Codeforces, USACO, BZOJ and pretty much every other online judge. It really is that simple; only the fancier format specifiers are ignored in this version.

If you want to print something, simply write:

printf("%d %d\n", n, m);

For simple cases, usage is no different from C++.
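
To make the idea concrete, here is a minimal sketch of how a printf-style helper can be written in plain Python. It is not pydatagen's implementation: the names format_line and printf_sketch and the file parameter are assumptions made for illustration, and the sketch simply relies on Python's built-in % formatting.

```python
import sys


def format_line(fmt, *args):
    """Apply C-style % formatting to the given arguments."""
    return fmt % args


def printf_sketch(fmt, *args, file=None):
    """Hypothetical stand-in for a printf-style helper (not pydatagen's code)."""
    out = sys.stdout if file is None else file
    out.write(format_line(fmt, *args))


n, m = 3, 5
printf_sketch("%d %d\n", n, m)  # writes "3 5" followed by a newline
```

Because the format string is passed straight to the % operator, the common specifiers such as %d, %s and %f behave as they do in C.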

We are currently working on porting scanf.

Random Generator

We provide two generators: rand and xrand. The only difference between them is speed: when the same call is repeated many times, xrand runs about 10x faster than rand.

For instance, the following two snippets behave exactly the same:

for i in range(0, 10000000):
    rand(int, 64, 128)

And the optimized version is as follows:

g = xrand(int, 64, 128)
for i in range(0, 10000000):
    next(g)
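
One plausible reason a generator can be faster for repeated calls is that the setup work runs once instead of on every call. The sketch below illustrates that pattern only. It is not how pydatagen implements rand or xrand; the names rand_sketch and xrand_sketch are hypothetical, and the inclusive bounds are an assumption.

```python
import random


def rand_sketch(lo, hi):
    """One-shot call: the bounds are validated on every call."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    return random.randint(lo, hi)  # inclusive on both ends


def xrand_sketch(lo, hi):
    """Generator: the bounds are validated once, on the first next()."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    randint = random.randint  # cache the lookup outside the loop
    while True:
        yield randint(lo, hi)


g = xrand_sketch(64, 128)
values = [next(g) for _ in range(5)]
single = rand_sketch(64, 128)
```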

For detailed usage of these functions, type help('pydatagen.rand') in the Python interactive console once you have installed the package.

Installation

Clone the repository into an empty folder, and execute the following commands:

pip install setuptools # If you have already installed this before, ignore it
python setup.py install # Install pydatagen

You can also build your own wheel package by executing the following commands:

pip install setuptools # If you have already installed this before, ignore it
pip install wheel # Install bdist_wheel provider
python setup.py build bdist_wheel # Build .whl package

Alternatively, you can download an official release from the releases panel and install the prebuilt wheel on your computer. After downloading the release, install it from the command line:

pip install ./pydatagen-version-py3-none-any.whl # Referring to the downloaded file

Installation is as fast as creating random data!

Pros and Cons

The pros of using pydatagen:

  • Easier to write than C++ random generators
  • Shortens your code by 80%
  • Allows more advanced ways of data generation
  • Extensible modules make adding data type support easier

However, there are a few drawbacks that cannot be avoided at present:

  • Slow generation speed, around 3x slower than general Python approaches
  • Way slower than NumPy and C++ implementations

If you don't care much about performance, then we believe that pydatagen is the best choice you have. It also works well with pyjudge and pyMatcher.

pydatagen's People

Contributors

jeffswt


pydatagen's Issues

UnionFindSet shouldn't use classmethod decorator

Attributes assigned inside a classmethod are stored on the class itself, so they are shared across all instances, which causes unexpected bugs.

for example,

>>> class Test:
...   @classmethod
...   def __init__(cls,x):
...     cls.x=x
...
>>> a=Test(1)
>>> a.x
1
>>> b=Test(2)
>>> a.x
2
>>>
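
For comparison, a minimal sketch of a union-find structure that uses ordinary instance methods, so each object keeps its own state. The class name UnionFindSketch and its methods are hypothetical and are not taken from pydatagen's UnionFindSet.

```python
class UnionFindSketch:
    """Hypothetical union-find with per-instance state (not pydatagen's class)."""

    def __init__(self, size):
        # Stored on self, so every instance gets its own parent list.
        self.parent = list(range(size))

    def find(self, x):
        # Path halving: point each visited node at its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return False  # already in the same set
        self.parent[root_b] = root_a
        return True


first = UnionFindSketch(4)
second = UnionFindSketch(4)
first.union(0, 1)  # affects only `first`
```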

stop using `queue`

queue is designed for multithreaded use and acquires a lock in every method call. In short, it's really slow.

Just use a list instead. It's much faster (about 44x in the measurement below).

>>> import timeit
>>> timeit.timeit('q.put(123)', 'import queue; q=queue.Queue()', number=10**6)
24.48412793993996
>>> timeit.timeit('q.append(123)', 'q=[]', number=10**6)
0.5576901608900329
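
One caveat with a plain list: append is fast, but removing from the front with pop(0) takes time proportional to the list length. If first-in, first-out order is needed, collections.deque from the standard library is a lock-free alternative with constant-time operations at both ends. A minimal sketch follows; the bfs_order helper and the sample adjacency are made up for illustration.

```python
from collections import deque


def bfs_order(adjacency, start):
    """Return nodes in breadth-first order, using deque as the FIFO queue."""
    seen = {start}
    pending = deque([start])
    order = []
    while pending:
        node = pending.popleft()  # O(1), unlike list.pop(0)
        order.append(node)
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                pending.append(neighbour)
    return order


adjacency = {0: [1, 2], 1: [3], 2: [3], 3: []}
order = bfs_order(adjacency, 0)
```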

Graph only succeeds with less than 20% probability

Using master

Python 3.4.3 (default, Jun 29 2015, 12:16:01) 
[GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pydatagen import *
>>> def test():
...   try:
...     Graph(6, 12, connected=True, weighed=True, weight_range=range(10, 99))
...     return True
...   except:
...     return False
... 
>>> results = [test() for _ in range(20000)]
>>> results.count(True) / len(results)
0.18835
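
The report does not say why construction fails, so the following is only a sketch of one common approach that cannot fail for valid inputs: build a random spanning tree first, then add the remaining edges from the unused pairs. The function connected_graph_sketch is hypothetical and unrelated to pydatagen's Graph class; it assumes a simple undirected graph and omits weights.

```python
import random
from itertools import combinations


def connected_graph_sketch(n, m, rng=random):
    """Return m distinct edges forming a connected simple graph on n nodes."""
    if not n - 1 <= m <= n * (n - 1) // 2:
        raise ValueError("need n - 1 <= m <= n * (n - 1) / 2")
    nodes = list(range(n))
    rng.shuffle(nodes)
    edges = set()
    # Spanning tree: attach each node to a random earlier node.
    for i in range(1, n):
        u, v = nodes[i], nodes[rng.randrange(i)]
        edges.add((min(u, v), max(u, v)))
    # Fill up to m edges from the pairs not used by the tree.
    remaining = [e for e in combinations(range(n), 2) if e not in edges]
    edges.update(rng.sample(remaining, m - len(edges)))
    return sorted(edges)


edges = connected_graph_sketch(6, 12)
```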

improve `printf`

try this:

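from contextlib import redirect_stdout

# Redirect everything printed inside the block to the file;
# the with statement also closes the file afterwards.
with open('1.txt', 'w') as f, redirect_stdout(f):
    print('hello world')

or

# Pass the file object directly to print.
with open('1.txt', 'w') as f:
    print('hello world', file=f)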
