lk-geimfari / mimesis

Mimesis is a powerful Python library that empowers developers to generate massive amounts of synthetic data efficiently.

Home Page: https://mimesis.name

License: MIT License


mimesis's Introduction

Mimesis: The Fake Data Generator

Documentation: https://mimesis.name/


Mimesis (/mɪˈmiːsɪs/) is a robust data generator for Python that can produce a wide range of fake data in various languages.

The key features are:

  • Multilingual: Supports 35 different locales.
  • Extensibility: Supports custom data providers and custom field handlers.
  • Ease of use: Features a simple design and clear documentation for straightforward data generation.
  • Performance: Widely recognized as the fastest data generator among Python solutions.
  • Data variety: Includes various data providers designed for different use cases.
  • Schema-based generators: Offers schema-based data generators to effortlessly produce data of any complexity.
  • Intuitive: Great editor support. Fully typed, thus autocompletion almost everywhere.

Installation

To install mimesis, use pip:

~ pip install mimesis

Mimesis 11.1.0 is the last release that supports Python 3.8 and 3.9. If you need either of those Python versions, install that specific release.

Documentation

You can find the complete documentation on Read the Docs.

You can improve it by sending pull requests to this repository.

Usage

The library is easy to use: you only need to import the data provider that corresponds to the kind of data you want.

For instance, the Person provider can be imported to access personal information, including name, surname, email, and other related fields:

from mimesis import Person
from mimesis.locales import Locale

person = Person(Locale.EN)

person.full_name()
# Output: 'Brande Sears'

person.email(domains=['example.com'])
# Output: a random address ending in '@example.com'

person.email(domains=['mimesis.name'], unique=True)
# Output: a unique random address ending in '@mimesis.name'

person.telephone(mask='1-4##-8##-5##3')
# Output: '1-436-896-5213'

License

Mimesis is licensed under the MIT License. See LICENSE for more information.

mimesis's People

Contributors

aprasanna, auyer, axce1, battleroid, dependabot-preview[bot], dependabot-support, dependabot[bot], destag, duckyou, eumiro, hoefling, jasonwaiting-dev, jlwt90, jorisdevrede, jwilk, lk-geimfari, marcosvafg, mipaaa, ngnpope, paulwaltersdev, pyup-bot, sinecode, sobolevn, uvegla, valerievich, vlangf, wikkiewikkie, willsthompson, wooza, yn-coder

mimesis's Issues

Add json-minifier

@caspian-seagull It would be great if you could write a gulp task that minifies all *.json files in data/*locale*/. All dist files should be saved in release/data in the root of elizabeth.

Add support for custom providers.

We need to implement something like this:

>>> from elizabeth import Generic

>>> generic = Generic('en')

>>> class SomeProvider():
        def hello(self):
            return "Hello!"

>>> class Another():
        def bye(self):
            return "Bye!"

>>> generic.add_provider(SomeProvider)
>>> generic.add_provider(Another)

>>> generic.someprovider.hello()
>>> generic.another.bye()
# Hello!
# Bye!
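
One possible way to get the behaviour shown above, as a minimal sketch only: the Generic class below is a toy stand-in written for this example (not the library's code), and exposing each provider under its lower-cased class name is an assumption taken from the session above.

```python
class Generic:
    """Toy stand-in for the Generic class discussed above (hypothetical)."""

    def __init__(self, locale="en"):
        self.locale = locale

    def add_provider(self, cls):
        """Instantiate `cls` and expose it under its lower-cased class name."""
        if not isinstance(cls, type):
            raise TypeError("add_provider() expects a class, not an instance")
        name = cls.__name__.lower()
        if hasattr(self, name):
            raise AttributeError(f"provider name already in use: {name}")
        setattr(self, name, cls())


class SomeProvider:
    def hello(self):
        return "Hello!"


class Another:
    def bye(self):
        return "Bye!"


generic = Generic("en")
generic.add_provider(SomeProvider)
generic.add_provider(Another)

print(generic.someprovider.hello())  # Hello!
print(generic.another.bye())         # Bye!
```

Rejecting a name that is already taken keeps a custom provider from silently shadowing an existing attribute such as locale.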

Add abbreviation to State field

Add option for abbreviation of state name i.e. something like:

address.state(abbrev=True)

to return 'WA' for state, vs. 'Washington'.

Probably want to do something similar for states/provinces in other countries?
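
A rough sketch of how such an option could work, assuming the locale data maps each state name to its abbreviation. The STATES table and the standalone state() function are made up for illustration and are not the library's data or API.

```python
import random

# Tiny illustrative sample; a real locale file would list every state.
STATES = {"Washington": "WA", "Oregon": "OR", "Texas": "TX"}


def state(abbrev=False, rng=random):
    """Return a random state name, or its abbreviation when abbrev=True."""
    name = rng.choice(sorted(STATES))
    return STATES[name] if abbrev else name


print(state())             # a full name, e.g. one of the keys above
print(state(abbrev=True))  # a two-letter code, e.g. one of the values above
```

Storing name and abbreviation side by side means the same table could serve provinces in other countries.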

Proposal - store generated object field values to generate dependent fields

This is a proposal.

Right now I type

from elizabeth import Personal
p = Personal('en')
print( p.age() )
print( p.age() )

And get the output

25
40

This happens because age is generated on each request and is not stored in the object p. What if I want to add a field such as child_count or work experience that depends on the previously generated age value?
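
To make the proposal concrete, here is a minimal sketch of one way to do it: generate each field once, cache it on the object, and derive dependent fields from the cached value. The Person class, the _once helper, and the work_experience rule are all hypothetical.

```python
import random


class Person:
    """Hypothetical profile object: each field is generated once, then reused."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._cache = {}

    def _once(self, key, factory):
        # Generate the value on first access and remember it afterwards.
        if key not in self._cache:
            self._cache[key] = factory()
        return self._cache[key]

    def age(self, minimum=16, maximum=66):
        return self._once("age", lambda: self._rng.randint(minimum, maximum))

    def work_experience(self, working_age=16):
        # Experience cannot exceed the years elapsed since `working_age`.
        limit = max(self.age() - working_age, 0)
        return self._once("work_experience", lambda: self._rng.randint(0, limit))


p = Person(seed=1)
print(p.age() == p.age())                     # True: the value is stored
print(p.work_experience() <= p.age() - 16)    # True: consistent with age
```

The trade-off is that one object now represents one person, so generating many independent ages requires many objects.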

Add Title/Prefix, Suffix options to Personal

As an example, have something like person.title() that would randomly pull from values such as 'Dr.', 'Sir', 'Honorable' (or ' '), or person.prefix(gender=) that returns 'Mr.', 'Mrs.', 'Ms.' as appropriate.

Also, similar for suffix to the surname... either person.surname(suffix=True) or person.suffix() to return values such as 'Sr', 'Jr', 'III', 'PhD', etc.

Add department to Business

Add department() to Business().

Example:

>>> from elizabeth import Business

>>> business = Business('en')
>>> business.department()
'Sports & Outdoors'

Add the ability to test all locales at once.

We need to test all locales at once, without manually editing tests.py. By default tests.py tests only the en locale; to check other locales we have to change the value of LANG in tests.py by hand, which is not good. One of the best solutions is a pytest.fixture.

So if anyone can help us with this problem, please let me know. Thanks!
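
As a starting point, a parametrized pytest fixture could look roughly like this. The LOCALES list is a hypothetical subset and Personal is a stand-in class so that the sketch runs without the library; only pytest.fixture(params=...) and request.param are real pytest features.

```python
import pytest

# Hypothetical subset; the real list would name every supported locale.
LOCALES = ["en", "de", "ru"]


class Personal:
    """Stand-in provider so the sketch runs without the library."""

    def __init__(self, locale):
        self.locale = locale

    def full_name(self):
        return f"Name ({self.locale})"


@pytest.fixture(params=LOCALES)
def personal(request):
    """pytest runs every test that uses this fixture once per locale."""
    return Personal(request.param)


def test_full_name(personal):
    assert isinstance(personal.full_name(), str)
```

With this layout, adding a locale to the list is enough to have every fixture-based test run against it.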

Python2 support

As we discussed earlier, adding support for the legacy Python version is not much of an effort.
The main difficulty is the manual labour of ensuring that every .decode() and .encode() call is in place.

However, there are several questions to answer:

  • Is python2 support even needed?
  • How important is it?
  • How should it be tested?

What do you think?

Add __str__ to providers.

How it will look:

>>> from elizabeth import Personal
>>> p = Personal('pt-br')
>>> p
'Personal:pt-br:Brazilian Portuguese'

>>> Personal('en-gb')
'Personal:en-gb:British English'

Example available here
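
A minimal sketch of the idea, assuming a small lookup table from locale code to language name (the LANGUAGES dict is invented for this example). Note that the interactive prompt displays repr(), so the sketch reuses the same text for __repr__; the output of print() carries no quotes.

```python
class Personal:
    # Hypothetical lookup table; only the two locales from the example above.
    LANGUAGES = {"pt-br": "Brazilian Portuguese", "en-gb": "British English"}

    def __init__(self, locale="en-gb"):
        self.locale = locale

    def __str__(self):
        language = self.LANGUAGES.get(self.locale, "unknown")
        return f"{type(self).__name__}:{self.locale}:{language}"

    # The interactive prompt uses repr(), so point it at the same text.
    __repr__ = __str__


print(Personal("pt-br"))  # Personal:pt-br:Brazilian Portuguese
print(Personal("en-gb"))  # Personal:en-gb:British English
```

Using type(self).__name__ lets subclasses and other providers inherit the method without hard-coding their names.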

Support for Declension

Support for declension. This is very useful for Russian and some other languages.

I can suggest some examples for Russian, if that would be helpful.

For example from address.json

"suffix": [
      "Аллея",
      "ул."
    ]

If I add бульвар (Boulevard) to the suffix list, then Авангардная from the streets list will be incorrect - it should be Авангардный.
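
One way this could be modelled, purely as a sketch: record the grammatical gender of each street type and store one form of the street name per gender. The data layout and the agree() helper are hypothetical and are not the format of address.json.

```python
# Grammatical gender of each street type (hypothetical data layout).
SUFFIX_GENDER = {"ул.": "f", "Аллея": "f", "бульвар": "m"}

# Each street name stores one form per gender instead of a single string.
STREETS = [{"f": "Авангардная", "m": "Авангардный"}]


def agree(street, suffix):
    """Return the form of the street name that agrees with the suffix."""
    return street[SUFFIX_GENDER[suffix]]


print(agree(STREETS[0], "ул."))      # Авангардная
print(agree(STREETS[0], "бульвар"))  # Авангардный
```

This only covers gender agreement for adjective-like names; other kinds of declension would need more forms per entry.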

Add range to Datetime

Please add a feature to Datetime so as to be able to generate dates within a given range e.g. for birthdates, where dates that are too old or too young aren't terribly useful.

Possibly merge some ideas from radar (https://github.com/barseghyanartur/radar) ...which then with some help from str() can generate dates suitable for using in a test DB such as sqlite:

>>> str(radar.random_date(start='1960-01-01', stop='2000-12-31'))
'1985-11-28'
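
For comparison, the same kind of range-limited date can be sketched with the standard library alone. The random_date() function below is a hypothetical helper that mirrors the radar call's start/stop arguments; it is not part of radar or of this library.

```python
import random
from datetime import date, timedelta


def random_date(start, stop, rng=random):
    """Return a random date with start <= result <= stop (ISO strings in)."""
    first = date.fromisoformat(start)
    last = date.fromisoformat(stop)
    if last < first:
        raise ValueError("stop must not be earlier than start")
    # Pick a whole-day offset inside the inclusive range.
    return first + timedelta(days=rng.randint(0, (last - first).days))


# str() gives the ISO form, which suits a test database such as SQLite.
print(str(random_date("1960-01-01", "2000-12-31")))
```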

Test Path in MS Windows.

Today a Path class was added that provides methods and properties for generating dummy paths. I have tested it only on Linux. It would be great if someone could run all the tests on MS Windows.

Usage

>>> from elizabeth import Path

>>> path = Path()

>>> path.root
/
>>> path.home
/home/

>>> path.user(gender='female')
/home/mariko

>>> path.users_folder(user_gender='male')
/home/john/Documents

>>> path.dev_dir()
/home/fidelia/Development/Erlang

# etc.
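
Until someone can test on Windows, the expected shape of the output on both platforms can be explored with pathlib's pure path classes, which run on any OS. This is only a sketch: the dev_dir() function, its platform argument, and the C:/Users home directory are assumptions made for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath


def dev_dir(user, language, platform="linux"):
    """Build <home>/<user>/Development/<language> for the given platform."""
    if platform == "win32":
        home = PureWindowsPath("C:/Users")  # assumed Windows home root
    else:
        home = PurePosixPath("/home")
    return str(home / user / "Development" / language)


print(dev_dir("fidelia", "Erlang"))           # /home/fidelia/Development/Erlang
print(dev_dir("fidelia", "Erlang", "win32"))  # C:\Users\fidelia\Development\Erlang
```

Because pure paths never touch the file system, a test suite can assert on both variants from a Linux machine.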

Update docs/guide.rst

Version 0.1.9 brought many changes, so we need to update the guide.

Compatibility issue with utils.download_image()

I am trying to write a test to cover situations where unverified_ctx is true and the build is failing on 3.3 and 3.4 with the following error:

E AttributeError: 'module' object has no attribute '_create_unverified_context'

I did some research and discovered that ssl._create_unverified_context was renamed from ssl._create_stdlib_context in 3.4, but does not exist in 3.3.

Changing _create_unverified_context to _create_stdlib_context will allow the build to pass on 3.4, but it will still fail on 3.3.

See PEP 476 for info on ssl._create_unverified_context and ssl._create_stdlib_context.
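
A possible workaround, sketched under the assumption that falling back to the interpreter's default behaviour is acceptable when neither helper exists: look the factory up by name instead of referencing it directly. Both names are private ssl helpers, and unverified_context() is a hypothetical function, not the code in utils.

```python
import ssl


def unverified_context():
    """Return an SSL context that skips certificate checks, or None.

    None means this interpreter has neither private helper, so the caller
    should open the URL without passing a context at all.
    """
    for name in ("_create_unverified_context", "_create_stdlib_context"):
        factory = getattr(ssl, name, None)
        if factory is not None:
            return factory()
    return None


context = unverified_context()
print(type(context).__name__)
```

Skipping certificate verification is only reasonable for throwaway test downloads, so the caller should keep it opt-in.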

Question: Subclassing a locale

I thought about creating a de-ch locale. Since de-ch is just a special version of de, it would be nice to subclass the de locale and replace only the fields that differ, instead of copying the whole data set and maintaining a separate copy. Is this possible?
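
If the locale data is held in nested dictionaries, one way to get this effect is to overlay a small set of overrides on the base locale. The sketch below is an assumption about how that might be done; the merge() helper and the sample keys are invented, not the library's data format.

```python
def merge(base, override):
    """Recursively overlay `override` on `base` without mutating either."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merged[key] = merge(base[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical excerpts of locale data, for illustration only.
DE = {
    "address": {"postal_code_fmt": "#####", "street": {"suffix": ["straße"]}},
    "currency": "EUR",
}
DE_CH_OVERRIDES = {"address": {"postal_code_fmt": "####"}, "currency": "CHF"}

DE_CH = merge(DE, DE_CH_OVERRIDES)
print(DE_CH["address"]["postal_code_fmt"])  # ####
print(DE_CH["address"]["street"])           # inherited from de
```

Only the overrides need to be maintained for de-ch; everything else follows the de data automatically.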

pip installation error

Collecting elizabeth
  Using cached elizabeth-0.3.15.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-5xf3_d/elizabeth/setup.py", line 3, in <module>
        from elizabeth import __version__, \
      File "elizabeth/__init__.py", line 25, in <module>
        from elizabeth.core import *
      File "elizabeth/core/__init__.py", line 1, in <module>
        from elizabeth.core.providers import (
      File "elizabeth/core/providers.py", line 34, in <module>
        from elizabeth.core import interdata as common
      File "elizabeth/core/interdata/__init__.py", line 16
    SyntaxError: Non-ASCII character '\xd0' in file elizabeth/core/interdata/__init__.py on line 16, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-5xf3_d/elizabeth/

I get this error when I try to install elizabeth, both in a venv and globally.

test_internet: dubious regexp

tests/test_data/test_internet.py includes the following regexp fragment:

[$-_@.&+]

But $-_ is a character range that includes digits, uppercase letters and a bunch of punctuation characters.
You probably wanted this instead:

[$_@.&+-]

The dubious regexp was found using pydiatra.
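
The difference is easy to check with a few sample characters. The snippet below compares the two character classes quoted above; the variable names are just for this example.

```python
import re

broad = re.compile(r"[$-_@.&+]")   # "$-_" is a range from "$" (0x24) to "_" (0x5F)
strict = re.compile(r"[$_@.&+-]")  # a trailing "-" is a literal hyphen

for char in "A7/-":
    print(repr(char), bool(broad.fullmatch(char)), bool(strict.fullmatch(char)))
# 'A' True False
# '7' True False
# '/' True False
# '-' True True
```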

Images

Could eliz generate random images? I did not find anything like that (only personal.avatar, which returns a link).

Do you have any such plans?

Address.street_address() not working

Trying to use church, ran into a snag with the following:

>>> from church import Address

>>> address = Address('en')

>>> address.street_address()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
AttributeError: 'Address' object has no attribute 'street_address'

...which seems pretty much identical to what is shown here (http://church.readthedocs.io/en/latest/guide.html#address):

address = Address('en')
...
# Get a random address.
#786 Clinton Lane
street_address = address.street_address()

Thanks!

Problem during installation

Error message:

$ pip install elizabeth
Collecting elizabeth
  Using cached elizabeth-0.3.11.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-lElGl2/elizabeth/setup.py", line 3, in <module>
        import elizabeth
      File "elizabeth/__init__.py", line 27, in <module>
        from elizabeth.core import *
      File "elizabeth/core/__init__.py", line 1, in <module>
        from .elizabeth import (
      File "elizabeth/core/elizabeth.py", line 38, in <module>
        from . import interdata as common
      File "elizabeth/core/interdata/__init__.py", line 5, in <module>
        from .code import *
      File "elizabeth/core/interdata/code.py", line 56
    SyntaxError: Non-ASCII character '\xc4' in file elizabeth/core/interdata/code.py on line 56, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

I suspect that the encoding: utf-8 comment was forgotten on the first line.
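
For reference, a PEP 263 declaration looks like the first line of the sketch below. Python 2 assumes ASCII source unless the declaration appears on the first or second line of the file; Python 3 defaults to UTF-8, so there the line is optional. The GREETING constant is just sample non-ASCII content.

```python
# -*- coding: utf-8 -*-
# The declaration above must be on the first or second line of the file.

GREETING = "Привет"  # non-ASCII literal that needs the declaration on Python 2

print(len(GREETING))
```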

Add image downloader

If we want to save avatars on our local machine, we should be able to do so.

It will look like this:

>>> from elizabeth import Personal
>>> from elizabeth.utils import download_image

>>> p = Personal('en')
>>> avatar_url = p.avatar()
>>> avatar = download_image(avatar_url, save_path)
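
A minimal sketch of what such a helper might look like, using only the standard library. The signature follows the call above, but the timeout parameter, the choice to return the saved path, and the file-name rule are assumptions.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def download_image(url, save_path="", timeout=10):
    """Save the resource at `url` into `save_path`; return the file path."""
    # Use the last component of the URL path as the local file name.
    name = os.path.basename(urlparse(url).path)
    if not name:
        raise ValueError("URL does not end in a file name: " + url)
    target = os.path.join(save_path, name)
    with urlopen(url, timeout=timeout) as response, open(target, "wb") as out:
        out.write(response.read())
    return target


# Usage (hypothetical URL and directory):
# download_image("https://example.com/avatars/1.png", "/tmp")
```

Reading the whole response into memory is fine for small avatars; larger files would be better copied in chunks.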

Problems with unicode on Windows.

JeStoneDev from Habr reported the following:

Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:01:18) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from elizabeth import Personal
>>> user = Personal('is')
>>> for _ in range(0, 9):
...     print(user.full_name(gender='male'))
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "C:\Users\mainj\AppData\Local\Programs\Python\Python35-32\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\xf0' in position 5: character maps to <undefined>

So, if your machine runs Windows 10, please try to fix this.
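
The traceback comes from print() encoding the name for a cp437 console, not from the generator itself. One possible stopgap on the caller's side, sketched here with a hypothetical console_safe() helper, is to replace characters that the console encoding cannot represent.

```python
import sys


def console_safe(text, encoding=None):
    """Replace characters the console encoding cannot represent with '?'."""
    encoding = encoding or sys.stdout.encoding or "utf-8"
    return text.encode(encoding, errors="replace").decode(encoding)


# '\xf0' (ð) is the character from the traceback; cp437 cannot encode it.
print(console_safe("Sigur\xf0ur", encoding="cp437"))  # Sigur?ur
```

This loses information, so it is a display workaround rather than a fix; switching the console to a Unicode code page avoids the loss.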

Data which must be added

We need to add all the missing data for version 0.4.0, namely the "Test" placeholders at these locations:

nl/text.json:88: "Test"
is/personal.json:4491: "Test"
cs/food.json:3: "Test"
cs/food.json:6: "Test"
cs/food.json:9: "Test"
cs/food.json:12: "Test"
cs/food.json:15: "Test"
cs/personal.json:3: "Test"
cs/personal.json:6: "Test"
cs/personal.json:1992: "Test"
cs/personal.json:1995: "Test"
cs/personal.json:1998: "Test"
cs/personal.json:2001: "Test"
cs/personal.json:2351: "Test"
cs/personal.json:2354: "Test"
cs/personal.json:2359: "Test"
cs/personal.json:2362: "Test"
cs/personal.json:2398: "Test"
cs/text.json:119: "Test"
cs/text.json:122: "Test"
cs/text.json:126: "Test"
cs/science.json:3: "Test"
da/personal.json:12: "Test"
da/text.json:93: "Test"
da/text.json:96: "Test"
da/text.json:100: "Test"
da/science.json:3: "Test"
pl/personal.json:8316: "Test"
pl/text.json:99: "Test"
es/personal.json:886: "Test"
es/personal.json:1171: "Test"
es/personal.json:1174: "Test"
es/personal.json:1179: "Test"
es/personal.json:1182: "Test"
es/text.json:98: "Test"
es/address.json:382: "Test"
es/science.json:3: "Test",
es/science.json:4: "Test"

If you see your locale in the list, please help us. It's really important to add all of this data.
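
A list like the one above can be regenerated at any time with a short script. This is a sketch that assumes the locale files live under one data directory and that unfinished entries contain the literal "Test"; find_placeholders() is a hypothetical helper.

```python
from pathlib import Path


def find_placeholders(data_dir, marker='"Test"'):
    """Yield (relative path, line number) for every line containing marker."""
    root = Path(data_dir)
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            for number, line in enumerate(handle, start=1):
                if marker in line:
                    yield path.relative_to(root).as_posix(), number


# Usage (hypothetical directory name):
# for name, number in find_placeholders("elizabeth/data"):
#     print(f'{name}:{number}: "Test"')
```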

Check correctness of all data for all locales.

We support 33 languages, and it would be great if native speakers of each one checked the correctness of the data for their own language.

For example, I'm Russian, so I'm sure of the correctness of the data for that language. But we also want to be sure about German (de, de-ch), Italian (it) and other languages.

Checked locales:

  • cs
  • el
  • es-mx
  • es
  • et
  • fa
  • fi
  • hu
  • is
  • ja
  • kk
  • ko
  • nl-be
  • nl
  • no
  • sv
  • zh

Move away from unstructured data storage.

I don't think plain text files are the best storage solution. We can use JSON to store the data in a structured way.

Example:

.
├── personal.json
├── business.json
├── datetime.json
├── food.json
├── address.json
├── science.json
├── text.json

We need to convert all data for all locales to JSON. @sobolevn what do you think about this idea?

Run tests on macOS

I think it should work without problems, but I need to make sure.

Attempting install, getting SyntaxError Non-ASCII character error

On Mac OSX 10.12.2 using zsh, I run:

$ pip install elizabeth

And get the following error message:

Collecting elizabeth
  Using cached elizabeth-0.3.4.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/var/folders/vg/c9ncn1fs5xzdqccf9mxf24vr0000gn/T/pip-build-lnl_QD/elizabeth/setup.py", line 3, in <module>
        import elizabeth
      File "elizabeth/__init__.py", line 10, in <module>
        from elizabeth.core import *
      File "elizabeth/core/__init__.py", line 1, in <module>
        from .elizabeth import (
      File "elizabeth/core/elizabeth.py", line 35, in <module>
        from . import interdata as common
      File "elizabeth/core/interdata.py", line 720
    SyntaxError: Non-ASCII character '\xe2' in file elizabeth/core/interdata.py on line 720, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/vg/c9ncn1fs5xzdqccf9mxf24vr0000gn/T/pip-build-lnl_QD/elizabeth/

Rewrite tests using pytest

Our current tests are too complicated, and we need to fix that. I suggest rewriting the tests using the pytest testing framework.

If someone has experience with pytest, I would like to hear where we can start.

Wrong display of a french address

>>> from church import Address
>>> addr = Address('fr')
>>> addr.address()
'371 Bezout Rue du'

The output should be:
'371 Rue du Bezout'
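
One way to handle this, sketched under the assumption that each locale can carry its own address template: keep the order of the parts in a per-locale format string. The ADDRESS_FORMATS table, the placeholder names, and the address() function are all invented for this example.

```python
# Hypothetical per-locale templates; the placeholder names are invented here.
ADDRESS_FORMATS = {
    "en": "{number} {name} {suffix}",
    "fr": "{number} {suffix} {name}",
}


def address(locale, number, name, suffix):
    """Assemble the address parts in the order the locale expects."""
    return ADDRESS_FORMATS[locale].format(number=number, name=name, suffix=suffix)


print(address("en", 786, "Clinton", "Lane"))   # 786 Clinton Lane
print(address("fr", 371, "Bezout", "Rue du"))  # 371 Rue du Bezout
```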

Add builtins specific data providers.

Each locale has data that only makes sense for that locale, for example SSN for en (USA) or CPF for pt-br. CPF is useful only for Brazilians.

If a user wants to use these providers, they must be imported explicitly.

Here's how it will look:

>>> from elizabeth import Generic
>>> from elizabeth.builtins import Brazil

>>> generic = Generic('pt-br')

>>> class BrazilProvider(Brazil):
        class Meta:
            name = "brazil_provider"
>>> generic.add_provider(BrazilProvider)
>>> generic.brazil_provider.cpf()
'001.137.297-40'

Remove useless methods from the providers.

So, we need to find and remove useless methods.

For example:
I once added the names of scientists to science.json, but now I'm not sure that these data can be useful.
So, what do you think about it?

The correctness of the data needs checking.

It would be nice to check the correctness of the data for all locales. So, if you are a native speaker of a language from the list, please help us.

List of locales whose data needs checking:

  • 🇪🇸 - Español (es)
  • 🇩🇪 - Deutsch (de)
  • 🇫🇷 - Français (fr)
  • 🇮🇹 - Italiano (it)
  • 🇧🇷 - Português (pt-br)
  • 🇳🇴 - Norsk (no)
  • 🇸🇪 - Svenska (sv)

P.S. Thanks!
