joke2k / faker

Faker is a Python package that generates fake data for you.

Home Page: https://faker.readthedocs.io

License: MIT License

Python 99.99% Makefile 0.01% Shell 0.01%

dataset fake fake-data faker faker-generator python test-data test-data-generator testing

faker's Issues

Honor Environment LANG

I am currently using a wrapper for fake-factory to choose the output language, but it would be great if this became part of the fake-factory core.

This is the script I have in my path: https://gist.github.com/makefu/9101269

usage:
$ LANG=de_DE.utf-8 faker address
Davide-Kaul-Weg 175
94892 Königs Wusterhausen
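
A minimal sketch of how a locale code could be derived from the LANG environment variable. The function name `locale_from_env` and the `en_US` fallback are assumptions for illustration, not part of fake-factory:

```python
import os


def locale_from_env(default="en_US"):
    """Return a locale code such as 'de_DE' derived from $LANG."""
    lang = os.environ.get("LANG", "")
    # Drop the encoding ('.utf-8') and any modifier ('@euro').
    code = lang.split(".", 1)[0].split("@", 1)[0]
    # 'C' and 'POSIX' carry no language information, so fall back.
    if not code or code in ("C", "POSIX"):
        return default
    return code
```

The result could then be handed to whatever creates the generator, for example `Factory.create(locale_from_env())`, provided that locale is actually available.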

Prepare a release

faker has so many great new changes in git, I think you guys should release all of them onto pypi soon, perhaps after pulling in the pull request with the docs.

Extract provider logic from provider data?

The provider data and provider logic are pretty tightly intertwined.

It'd be nice if they were separated out--then it'd be a lot easier to port some of the other provider lists out there.

For example, look at how ForgeryPy structures the data separate from the logic--ForgeryPy dictionaries are the equivalent of Faker's Providers: https://github.com/tomekwojcik/ForgeryPy/tree/master/forgery_py/dictionaries

He's got a generic loader that kicks in when a custom function isn't defined for a provider.

That project seems relatively abandoned, so it'd be nice to pull that clean functionality into this project.

It'd also probably make it easier for people to localize their providers because they just change the data files without having to think about the attached python code.
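
To make the idea concrete, here is a rough sketch of a generic loader that serves values straight from a data file when no custom method exists. The class name `DataProvider`, the JSON layout, and the sample values are hypothetical and are not taken from Faker or ForgeryPy:

```python
import json
import random


class DataProvider:
    """Generic provider: every key in the data becomes a callable formatter."""

    def __init__(self, data, rng=random):
        self.data = data  # e.g. {"city": ["Berlin", "Hamburg"]}
        self.rng = rng

    @classmethod
    def from_json(cls, text, rng=random):
        # Build a provider from the text of a JSON data file.
        return cls(json.loads(text), rng)

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. when no custom
        # method with this name is defined on the class or a subclass.
        data = self.__dict__.get("data", {})
        if name not in data:
            raise AttributeError(name)
        return lambda: self.rng.choice(data[name])
```

A locale would then only need to ship a data file; a subclass can still define a real method for any key that needs custom logic, and that method takes precedence over the generic fallback.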

Is faking time series data in the scope of this project?

I wanted some fake time series data for a project and couldn't find anything suitable for my needs.
Is something like this in the scope of this project?

Feature request: add generation of random corporate bullshit

See http://cbsg.sourceforge.net/cgi-bin/live for an example.

faker.providers.miscelleneous is spelt wrong

"miscelleneous" is not a word. It should be "miscellaneous".

Transform CamelCase to lower_case_with_underscores

Minor problem, but inconvenient when integrating in apps that follow PEP8 more closely.
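
For illustration, a small helper that performs this conversion on a name. The function name `camel_to_snake` is an assumption; renaming the actual methods would still be a manual or scripted refactoring:

```python
import re


def camel_to_snake(name):
    """Convert 'firstName' to 'first_name'."""
    # Insert an underscore between a lowercase letter or digit
    # and the capital letter that follows it, then lowercase everything.
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name).lower()
```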

How about a release?

The last release was in March. Perhaps a new release would be in order, so that people using PyPI get the changes as well? It would be appreciated! Keeps the ecosystem going and all that.

Windows installation fails

Executing pip install fake-factory leads to:
http://pastebin.com/Vy9erGF0

Windows 7 x64, python 2.7.4, pip 1.2.1

Latitude range should be [-90, 90] degrees, assuming faker uses the WGS-84 datum (EPSG:4326)
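
A minimal sketch of coordinate generation that respects these bounds, assuming decimal degrees in WGS-84. The function names are hypothetical and do not mirror the provider's current implementation:

```python
import random


def random_latitude(rng=random):
    # Latitude runs from the south pole (-90) to the north pole (+90).
    return rng.uniform(-90.0, 90.0)


def random_longitude(rng=random):
    # Longitude runs from -180 to +180 degrees.
    return rng.uniform(-180.0, 180.0)
```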

Tests fail (on Xubuntu 14.04) due to timestamp out of range issue

Running the tests fails on my Xubuntu 14.04 virtual machine (32-bit with Python 2.7.6) due to a ValueError: timestamp out of range for platform time_t at line 246 of faker/providers/date_time.py; see the output below:

$ python setup.py test
running test
running egg_info
writing dependency_links to fake_factory.egg-info/dependency_links.txt
writing fake_factory.egg-info/PKG-INFO
writing top-level names to fake_factory.egg-info/top_level.txt
reading manifest file 'fake_factory.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'fake_factory.egg-info/SOURCES.txt'
running build_ext
test_add_provider_gives_priority_to_newly_added_provider (faker.tests.FactoryTestCase) ... ok
test_command (faker.tests.FactoryTestCase) ... 6588 Shasta Locks
South Tamikaville, CO 72509-4971


ok
test_documentor (faker.tests.FactoryTestCase) ... ERROR
test_format_calls_formatter_on_provider (faker.tests.FactoryTestCase) ... ok
test_format_transfers_arguments_to_formatter (faker.tests.FactoryTestCase) ... ok
test_get_formatter_returns_callable (faker.tests.FactoryTestCase) ... ok
test_get_formatter_returns_correct_formatter (faker.tests.FactoryTestCase) ... ok
test_get_formatter_throws_exception_on_incorrect_formatter (faker.tests.FactoryTestCase) ... ok
test_magic_call_calls_format (faker.tests.FactoryTestCase) ... ok
test_magic_call_calls_format_with_arguments (faker.tests.FactoryTestCase) ... ok
test_parse_returns_same_string_when_it_contains_no_curly_braces (faker.tests.FactoryTestCase) ... ok
test_parse_returns_string_with_tokens_replaced_by_formatters (faker.tests.FactoryTestCase) ... ok

======================================================================
ERROR: test_documentor (faker.tests.FactoryTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mdxs/dev/gh/faker/faker/tests.py", line 65, in test_documentor
    print_doc()
  File "/home/mdxs/dev/gh/faker/faker/cli.py", line 77, in print_doc
    formatters = doc.get_formatters(with_args=True, with_defaults=True)
  File "/home/mdxs/dev/gh/faker/faker/documentor.py", line 28, in get_formatters
    (provider, self.get_provider_formatters(provider, **kwargs))
  File "/home/mdxs/dev/gh/faker/faker/documentor.py", line 78, in get_provider_formatters
    example = self.generator.format(name)
  File "/home/mdxs/dev/gh/faker/faker/generator.py", line 56, in format
    return self.get_formatter(formatter)(*args, **kwargs)
  File "/home/mdxs/dev/gh/faker/faker/providers/date_time.py", line 246, in date_time_ad
    return datetime.fromtimestamp(random.randint(-62135600400, int(time())))
ValueError: timestamp out of range for platform time_t

----------------------------------------------------------------------
Ran 12 tests in 0.201s

FAILED (errors=1)
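
The failing call in the traceback passes a large negative timestamp to datetime.fromtimestamp, which a 32-bit time_t cannot represent. One possible workaround, sketched here under the assumption that a naive datetime between year 1 and now is acceptable, is to do the arithmetic with timedelta instead. This is not necessarily the fix the project adopted:

```python
import random
from datetime import datetime, timedelta


def date_time_ad(rng=random):
    """Random naive datetime between 0001-01-01 and now, without time_t."""
    start = datetime(1, 1, 1)
    # Number of whole seconds between the start of year 1 and now.
    span_seconds = int((datetime.now() - start).total_seconds())
    # timedelta arithmetic does not depend on the platform's time_t size.
    return start + timedelta(seconds=rng.randint(0, span_seconds))
```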

Storing data in files

Why not store all data in files?
See person.py in https://gist.github.com/kissarat/755a2d39546dc828ae37
You can use dump.py to make this easy.
If you want to store data this way, I would convert the project.

There is no need to implement child classes when there is nothing specific in the implementation. Data files alone may be enough.

Are images outside of the scope of this project?

Hi,

We're using this currently in our tests to generate test data. However we'd also like to use it to generate sample HTML pages (blog posts - for example). For this it would be great if faker could have an image provider (or maybe a file provider as a lower level).

Would you be averse to this idea? If not I'm more than happy to work on the provider and submit a pull request.

Cheers,
Ben

Put docs onto readthedocs

I think that the current way of documenting everything only on GitHub doesn't scale very well. I suggest you put the docs onto readthedocs.

`.prefix`/`.suffix` returns a tuple instead of a single value

.prefix (and .suffix) can occasionally return a tuple of values instead of a single value when prefixes_male and prefixes_female (or suffixes_*) are present in the provider.

See here for the code responsible.

I wasn't sure if this was intentional (it's documented to do so -- then again, the documentation is autogenerated, isn't it?), so I didn't make a PR yet, but it's certainly counterintuitive.
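
A sketch of one way to always return a single value: pool the gendered tuples and pick from the combined pool. The class name and sample prefixes are hypothetical, and the real provider code linked above may be structured differently:

```python
import random


class PersonSketch:
    # Illustrative data only.
    prefixes_male = ("Mr.", "Dr.")
    prefixes_female = ("Mrs.", "Ms.", "Dr.")

    @classmethod
    def prefix(cls, rng=random):
        # Concatenate both tuples so the choice yields one string,
        # never one of the tuples themselves.
        pool = cls.prefixes_male + cls.prefixes_female
        return rng.choice(pool)
```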

Submit a new provider?

How do I submit a new provider for inclusion into a future build?

Only getting APO addresses with system python 2.7.7

With system python:

☄ python --version
Python 2.7.7

☄ which faker
/usr/local/bin/faker

☄ python -c "import faker; print faker.VERSION"
0.4.2

☄ faker address -r 10
PSC 7159, Box 2889
APO AP 50457

PSC 5924, Box 3842
APO AA 79576-2701

PSC 4394, Box 0547
APO AA 13834-3973

PSC 1353, Box 2874
APO AE 17295

PSC 8492, Box 6715
APO AE 89299-8347

PSC 0676, Box 5745
APO AA 45384

PSC 7082, Box 0817
APO AE 39616

PSC 9015, Box 5179
APO AP 79298

PSC 3885, Box 3107
APO AA 97447

PSC 3078, Box 3599
APO AE 16713-0587

In virtualenv:

☄ python --version
Python 3.4.1

☄ which faker
/Users/kyl/Code/Playground/faker/.venv/bin/faker

☄ python -c "import faker; print(faker.VERSION)"
0.4.2

☄ faker address -r 10
94283 Jewell Shoal Suite 192
West Cade, TN 16897-7888

93143 Runolfsdottir Summit Suite 471
Lilliamouth, KS 80170-8892

PSC 5138, Box 8808
APO AE 12600-9380

787 Rohan Drive Apt. 652
Port Ebertport, FL 84541-9565

12609 Gulgowski Club
Waelchihaven, VT 93071

Unit 6204 Box 4740
DPO AA 61620-2499

0791 Daxton Avenue
Chaneltown, TN 87248-1822

6046 Emard Camp
Lennyborough, FM 79310

83026 Kane Shore
Lake Casie, SD 63881-1429

881 Davis Walks Suite 491
McKenziehaven, TX 35051-3973

Clarify using from the shell docs

In the "using from shell" section of the docs, I understand how to display the result of a fake. There is an example:

$ python -m faker address

However, it is not clear to me how to give a provider's name, for example 'Lorem' (should that be lowercase 'lorem'?), and display all of the provider's fakes. It would be good if there was an example provided.

differentiate between male and female first names

As far as I can see, fake.first_name() can return either a male or a female first name. Do you plan to distinguish between them? Something like fake.first_name(gender='male'), where the default value could be 'any'.

I ask because I want to add support for Hungarian names. I have an up-to-date list of all the Hungarian names, split into two files: males and females. I could put them in two sets, or I could add them in one set.
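
The proposed signature could look roughly like the sketch below. The two name tuples are tiny illustrative samples, and the whole function is an assumption about how the feature might work, not existing faker API:

```python
import random

# Illustrative samples; a real locale would load its full lists.
first_names_male = ("Bence", "Levente")
first_names_female = ("Anna", "Eszter")


def first_name(gender="any", rng=random):
    """Pick a first name, optionally restricted to one gender."""
    if gender == "male":
        pool = first_names_male
    elif gender == "female":
        pool = first_names_female
    elif gender == "any":
        pool = first_names_male + first_names_female
    else:
        raise ValueError("unknown gender: %r" % (gender,))
    return rng.choice(pool)
```

Keeping the two sets separate preserves the information needed for the gender argument, while the 'any' default keeps existing calls working.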

Random job provider

It would be useful to have a job provider alongside the company provider. If anyone could point me to a good list, I would work on it.

Improve color provider

see: https://github.com/davidmerfield/randomColor

Error loading faker library via pip

I got the pip install to finish, but Python won't recognize anything from the faker library when I try to use it.

Capital O missing an umlaut

Hello, I noticed in faker/Providers/De_de/internet.py in the _to_ascii method, the capital O is missing an umlaut.

It should be: ('Ö', 'Oe')

Currently:
replacements = (
    ('ä', 'ae'), ('Ä', 'Ae'),
    ('ö', 'oe'), ('O', 'Oe'),
    ('ü', 'ue'), ('Ü', 'Ue'),
    ('ß', 'ss'),
)

Default locale to language if no territory given.

It would be great if faker, when initialized with only a language and no territory, used a sensible default.

For example, I currently have to do the following when using something such as "en" instead of "en_US":

from faker import Factory
from faker import AVAILABLE_LOCALES

locale = 'en'
if locale not in AVAILABLE_LOCALES:
    locale = next(l for l in AVAILABLE_LOCALES if l.startswith(locale))

factory = Factory.create(locale)

This happens when using dynamic mock data in local development, where Django sets the locale to "en" because we do not define territories.

Providers autodiscovery

Currently, every time a provider is added, we need to update the lists in __init__.

This is error-prone and it would be more sustainable if we could discover providers automatically.
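
One way this could be approached is with the standard library's pkgutil, which lists the modules found inside a package. The helper name `discover_modules` is hypothetical, and applying it to `faker.providers` is an assumption about the package layout:

```python
import importlib
import pkgutil


def discover_modules(package_name):
    """Return the sorted names of the modules directly inside a package."""
    package = importlib.import_module(package_name)
    # iter_modules yields one entry per module or subpackage found
    # on the package's search path.
    return sorted(info.name for info in pkgutil.iter_modules(package.__path__))
```

Calling something like `discover_modules("faker.providers")` at import time would then replace the hand-maintained lists.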

cls.random_sample

I wanted to add a method to BaseProvider that allows for sampling n unique elements.
There are situations in which I want to grab several random things, but I want those results to be unique. I just forked and added this to my own fork, but I wanted to run it by you before making a pull request.

    # in faker/providers/__init__.py, class BaseProvider
    @classmethod
    def random_sample(cls, array=('a','b','c'), number=2):
        """ Returns $number unique elements from $array"""
        return random.sample(array, number)

Provide random gender

I may be missing something but I don't think faker spits out random genders in the person provider. While trivial to write, I think this should still be included in faker.

Parameter for disallowed characters?

It would be useful to have a parameter that would disallow a set of characters from a provider's output.

# Don't use outputs that have /, %, or &
fake.bs(disallowed_characters=['/', '%', '&'])

The use case I ran into was that we needed fake strings that could safely be put into URIs and therefore cannot contain /.

Thoughts on this?
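
Until such a parameter exists, a wrapper along these lines might serve as a workaround. The name `without_characters` and the retry limit are assumptions for illustration:

```python
def without_characters(generate, disallowed, max_tries=100):
    """Call generate() until the result contains no disallowed character."""
    banned = set(disallowed)
    for _ in range(max_tries):
        value = generate()
        # Accept the value only if it shares no character with the banned set.
        if not banned.intersection(value):
            return value
    raise ValueError("no acceptable value after %d tries" % max_tries)
```

For the example above this would be called as `without_characters(fake.bs, ['/', '%', '&'])`. Retrying keeps the output distribution otherwise unchanged, at the cost of extra calls when many outputs are rejected.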

Unformatted PyPI page

This is what a user sees on https://pypi.python.org/pypi/fake-factory:

This is caused by PyPI not understanding Markdown.

You can use pandoc to convert Markdown to ReStructuredText that PyPI understands.

timezone() randomly throws an exception

fake.timezone() sometimes throws an exception, possibly when a country doesn't have any timezones defined:

>>> from faker import Faker
>>> f = Faker()
>>> f.timezone()
'Africa/Mogadishu'
>>> f.timezone()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vagrant/.python/lib/python3.3/site-packages/faker/providers/date_time.py", line 378, in timezone
    return cls.random_element(cls.countries)['timezones'].pop(0)

This is with Python 3.3 using fake-factory 0.4.0 from pypi.
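
Judging only from the line shown in the traceback, one plausible cause is that pop(0) removes an entry from the shared country data on every call, so timezone lists eventually become empty. A sketch of a non-mutating alternative, using a hypothetical stand-in for the provider's `countries` data:

```python
import random

# Stand-in for the provider's data; the real list is much longer.
countries = [
    {"name": "Country A", "timezones": ["Africa/Mogadishu"]},
    {"name": "Country B", "timezones": []},
]


def timezone(rng=random):
    # Skip countries that have no timezone defined.
    candidates = [c for c in countries if c["timezones"]]
    # Choose by index instead of pop() so the shared data is never mutated.
    return rng.choice(rng.choice(candidates)["timezones"])
```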

Integrate with Factory Boy

Factory Boy provides an easy replacement for fixtures. It allows for easy definition of factories, various build factories, factory inheritance, etc.
It has a FuzzyAttribute mechanism which suits faker perfectly.

Improve image_url with format parameters

see #106 (diff)

Refactor Profile to be used with locale: how

I have an idea, but I'm not sure it is the simplest: the current profile.py becomes something like "internal_profile.py", its methods are renamed "internal_simple_profile()" and "internal_profile()", and it is removed from the list of standard providers. We would then have a standard profile.py that simply calls self.generator.internal_profile(). Each locale could then add more logic, for example to customize field names and possibly values.

Do you think there would be a simpler way to do it?

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 100: ordinal not in range(128)

I'm having problems installing Faker 0.4.2 on Python 3.4.2:

$ pip install fake-factory
Collecting fake-factory
  Using cached fake-factory-0.4.2.tar.gz
    Traceback (most recent call last):
      File "<string>", line 20, in <module>
      File "/private/var/folders/98/hxvgjtd93ql1s1c4695y6w2h0000gq/T/pip-build-e5pfmuys/fake-factory/setup.py", line 9, in <module>
        NEWS = open(os.path.join(here, 'CHANGELOG.rst')).read()
      File "/Users/pedro.teixeira/.virtualenvs/cave/bin/../lib/python3.4/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 100: ordinal not in range(128)
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):

      File "<string>", line 20, in <module>

      File "/private/var/folders/98/hxvgjtd93ql1s1c4695y6w2h0000gq/T/pip-build-e5pfmuys/fake-factory/setup.py", line 9, in <module>

        NEWS = open(os.path.join(here, 'CHANGELOG.rst')).read()

      File "/Users/pedro.teixeira/.virtualenvs/cave/bin/../lib/python3.4/encodings/ascii.py", line 26, in decode

        return codecs.ascii_decode(input, self.errors)[0]

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 100: ordinal not in range(128)

    ----------------------------------------
    Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/98/hxvgjtd93ql1s1c4695y6w2h0000gq/T/pip-build-e5pfmuys/fake-factory
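
The traceback shows setup.py reading CHANGELOG.rst with the platform default encoding, which is ASCII in this environment. A minimal sketch of reading the file with an explicit encoding instead, assuming the changelog is stored as UTF-8 (the helper name `read_text` is hypothetical):

```python
import io
import os


def read_text(path):
    """Read a text file as UTF-8 regardless of the locale's default encoding."""
    # io.open accepts an encoding argument on both Python 2 and Python 3.
    with io.open(path, encoding="utf-8") as handle:
        return handle.read()
```

In setup.py this would be used as `NEWS = read_text(os.path.join(here, 'CHANGELOG.rst'))`.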

Pip install fails in 0.4.1

Downloading/unpacking fake-factory from https://pypi.python.org/packages/source/f/fake-factory/fake-factory-0.4.1.tar.gz#md5=27ac002a6f3a4b46d8996b5ef6ad5a7c
  Downloading fake-factory-0.4.1.tar.gz (306kB): 306kB downloaded
  Running setup.py egg_info for package fake-factory
    Traceback (most recent call last):
      File "<string>", line 16, in <module>
      File "/Users/gkisel/.virtualenvs/faker/build/fake-factory/setup.py", line 9, in <module>
        NEWS = open(os.path.join(here, 'CHANGELOG.rst')).read()
    IOError: [Errno 2] No such file or directory: '/Users/gkisel/.virtualenvs/faker/build/fake-factory/CHANGELOG.rst'
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):

  File "<string>", line 16, in <module>

  File "/Users/gkisel/.virtualenvs/faker/build/fake-factory/setup.py", line 9, in <module>

    NEWS = open(os.path.join(here, 'CHANGELOG.rst')).read()

IOError: [Errno 2] No such file or directory: '/Users/gkisel/.virtualenvs/faker/build/fake-factory/CHANGELOG.rst'

Support Python 3

pip installation under Python 3 fails:

$ python --version
Python 3.3.5
$ pip install faker
Downloading/unpacking faker
  Downloading Faker-0.0.4.tar.gz
  Running setup.py (path:/home/abcde/temp/faker_test/env3/build/faker/setup.py) egg_info for package faker
    Traceback (most recent call last):
      File "<string>", line 17, in <module>
      File "/home/abcde/temp/faker_test/env3/build/faker/setup.py", line 5, in <module>
        import faker
      File "./faker/__init__.py", line 11, in <module>
        import data
    ImportError: No module named 'data'
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):

  File "<string>", line 17, in <module>

  File "/home/abcde/temp/faker_test/env3/build/faker/setup.py", line 5, in <module>

    import faker

  File "./faker/__init__.py", line 11, in <module>

    import data

ImportError: No module named 'data'

----------------------------------------
Cleaning up...
Command python setup.py egg_info failed with error code 1 in /home/abcde/temp/faker_test/env3/build/faker
Storing debug log for failure in /home/abcde/.pip/pip.log

Enhance random generator repeatability

The docs mention being able to call the seed() method so you can use a generated dataset as part of a unit test.

Due to the way Faker uses the random module, this use case is a bit fragile. Any change to the data requested, or any outside use of the random module during generation, will cause the dataset to diverge.

Here is a quick script demonstrating the problem along with a couple of potential solutions:

import random
from faker import Faker
fake = Faker()

# initial run
fake.seed(1234)
print fake.name()
print fake.name()
print fake.name()

# repeated run with same data
fake.seed(1234)
print fake.name()
print fake.name()
print fake.name()

# adding new fake calls prevents us from getting the same names we had originally
fake.seed(1234)
print fake.name(), fake.email()
print fake.name(), fake.email()
print fake.name(), fake.email()

# One way is to implement a preserve/restore mechanism so that the user can get back to the previous trail of data
fake.seed(1234)
print fake.name()
r = random.getstate()
print fake.email()
random.setstate(r)
print fake.name()
r = random.getstate()
print fake.email()
random.setstate(r)
print fake.name()
r = random.getstate()
print fake.email()
random.setstate(r)

# A similar problem arises if the program using faker happens to use a non-instance random call during generation.
# The best way to prevent this issue is to have faker use an instance of random rather than the module version.

# If faker used an instance version of random, you could also resolve the original problem by using different faker instances
fake.seed(1234)
fake2 = Faker()
fake2.seed(1234)
print fake.name(), fake2.email()
print fake.name(), fake2.email()
print fake.name(), fake2.email()
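
The instance-based approach suggested in the comments above can be sketched with random.Random, which keeps its own state separate from the module-level generator. The class `IsolatedFake` and its name list are hypothetical stand-ins, not Faker code:

```python
import random

# Illustrative data only.
NAMES = ("Alice", "Bob", "Carol", "Dave")


class IsolatedFake:
    """Toy generator that owns a private random.Random instance."""

    def __init__(self, seed=None):
        self.random = random.Random(seed)

    def seed(self, value):
        # Reseeds only this instance, not the global random module.
        self.random.seed(value)

    def name(self):
        return self.random.choice(NAMES)
```

Two instances seeded with the same value produce the same sequence, and calls to the module-level random functions in between do not disturb either of them.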

faker coding style guide/standard?

What do you think about adding a coding style guide/standard for this project?
I can see that the style differs a lot from file to file, so a lot of cleanup work is needed.

Make faker extensible

Provide a way for users to add their own custom providers at runtime.

Change project name to avoid confusion

I just received this feedback talking about the original Faker in PHP:

"I would recommend some sort of distinguishing name then. They both have the same name, that is going to be really confusing. Even something like FakerPy or something."

I think it makes sense and FakerPy is a good option.

generate username and password too

It would be nice if one could generate random usernames and passwords too. I have a tool for that ( https://github.com/jabbalaci/jabbapylib/blob/master/jabbapylib/apps/userpass.py ) that I use for online registrations.

If you like the idea, I can make a pull request to integrate it into faker.

clean up people names from en_US provider

Currently, the en_US.person provider contains a long list of names, many of which are actually pretty rare in the US (eg: 'Eusebio' or 'Filiberto').

We can populate the list using data from http://ssa.gov/oact/babynames/decades/names2000s.html (or any other decade).

Related: #69

Added fake-factory to Ohloh

I've added fake-factory to ohloh.net at https://www.ohloh.net/p/fake-factory to keep some statistics on the code base and to allow contributors to claim/track their commits.

At the moment, there is no "Manager". On the question of who should register as a project manager, the page says: "Someone who works on the project. Ideally the owner, founder, lead developer, or release manager."

So I guess either @joke2k or @fcurella should claim that role by clicking on the "Become the first manager for fake-factory" on the https://www.ohloh.net/p/fake-factory page.

get ideas from http://www.fakenamegenerator.com/

You can also check out http://www.fakenamegenerator.com/ for some new ideas. You can select a gender, name set and country, and it generates a complete fake identity. Maybe some parts of it could be integrated into faker too.

en_US phone number formats

The en_US phone_number() provider includes formats that can generate invalid phone numbers (i.e. numbers which can't be parsed as standard US numbers by phonenumbers.py):

import phonenumbers
from faker import Faker
faker = Faker()
number = faker.phone_number()
phonenumbers.parse(number, 'US')

The above code will raise a NumberParseException if the phone number is generated using the first format, '+##(#)##########', with an invalid country code (e.g. +08(1)111111111). One possibility is to force this format to always use a valid country code following the +. However, because other providers/localizations can already be used to generate specific international number formats, including leading country codes, I think it would be simpler to include only valid US numbers in the en_US provider. In that case, would it be easiest to simply remove the '+##(#)##########' formats from the provider?

Some formats also appear to be duplicated: e.g. '+##(#)##########' appears twice, as does '0##########'.

Will be happy to submit a PR for a fix myself. Wanted to confirm the duplicate is a bug.

Best,
Anthony
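
For reference, a sketch of how '#' placeholders in such formats can be expanded, together with a helper that drops duplicate formats while keeping their order. Both function names are hypothetical and only approximate what the provider does:

```python
import random


def numerify(pattern, rng=random):
    """Replace every '#' in the pattern with a random digit."""
    return "".join(
        str(rng.randint(0, 9)) if ch == "#" else ch for ch in pattern
    )


def unique_formats(formats):
    """Remove duplicate formats while preserving their original order."""
    seen = set()
    result = []
    for fmt in formats:
        if fmt not in seen:
            seen.add(fmt)
            result.append(fmt)
    return result
```

Because a format is chosen at random, a duplicated entry is picked more often than the others, so removing duplicates also changes how often each format appears.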

No module named faker (v 0.3)

$ pip install fake-factory==0.3
Downloading/unpacking fake-factory==0.3
  Downloading fake-factory-0.3.tar.gz (86kB): 86kB downloaded
  Running setup.py egg_info for package fake-factory

Installing collected packages: fake-factory
  Found existing installation: fake-factory 0.2
    Uninstalling fake-factory:
      Successfully uninstalled fake-factory
  Running setup.py install for fake-factory

Successfully installed fake-factory
Cleaning up...

$ python
Python 2.7.5+ (default, Jun  2 2013, 13:26:34) 
[GCC 4.7.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from faker import Factory
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named faker
>>>

Version 0.2 works great.

AttributeError: 'Generator' object has no attribute 'password'

This error occurs when attempting to use the password method on a Factory object.

Python 2.7.6 (default, Feb 26 2014, 12:07:17) 
[GCC 4.8.2 20140206 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from faker import Factory
>>> fake = Factory.create()
>>> fake.password()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Generator' object has no attribute 'password'
