django-anonymizer's Introduction

Django Anonymizer

This was originally a fork of https://bitbucket.org/spookylukey/django-anonymizer/

Intro

This app helps you anonymize data in a database used for development of a Django project.

It is common practice in development to use a database that is very similar in content to the real data. The problem is that this can leave copies of sensitive customer data on development machines. This Django app helps by providing an easy and customizable way to anonymize data in your models.

The basic method is to go through all the models that you specify, and generate fake data for all the fields specified. Introspection of the models will produce an anonymizer that will attempt to provide sensible fake data for each field, leaving you to tweak for your needs.
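
The basic method can be pictured with a minimal, self-contained sketch. It is not the app's actual implementation: `anonymize`, `fake_phone`, and the dict-based records are hypothetical stand-ins for the Anonymizer class and Django model instances. The `(field, replacer)` list and the `"SKIP"` marker mirror the `attributes` format that appears in the issue reports further down this page.

```python
import random


def fake_phone(rng, record, field, value):
    """Hypothetical replacer: build a random 10-digit phone number."""
    return "".join(rng.choice("0123456789") for _ in range(10))


def anonymize(records, attributes, seed=None):
    """Overwrite the listed fields in place; "SKIP" fields are left untouched."""
    rng = random.Random(seed)
    for record in records:
        for field, replacer in attributes:
            if replacer == "SKIP":
                continue  # keep keys and relations exactly as they were
            record[field] = replacer(rng, record, field, record[field])
    return records


# Hypothetical data: two customers, the second referred by the first.
customers = [
    {"id": 1, "referred_by": None, "phone": "+44 20 7946 0000"},
    {"id": 2, "referred_by": 1, "phone": "+44 20 7946 0001"},
]
attributes = [
    ("id", "SKIP"),
    ("referred_by", "SKIP"),
    ("phone", fake_phone),
]
anonymize(customers, attributes, seed=0)
```

Only `phone` changes; `id` and `referred_by` keep their values, which is the sense in which relationships between records are preserved.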

Please note that the methods provided may not be able to give full anonymity. Even if you anonymize the names and other details of your customers, there may well be enough data to identify them. Relationships between records in the database are not altered, in order to preserve the characteristic structure of data in your application, but this may leave you open to information leaks which might not be acceptable for your data. This application should be good enough for simpler policies like 'remove all real telephone numbers from the database'.

An alternative approach to the problem of realistic amounts of test data for development/tests is to populate a database from scratch - see django-poseur, django-mockups and django-autofixture. The disadvantage of that method is that the structure of the data - in terms of related models - can be unrealistic.

Usage

Quick overview (see docs for more information, either in docs/ or on <http://packages.python.org/django-anonymizer>).

  • Install using setup.py or pip/easy_install.
  • Add 'anonymizer' to your INSTALLED_APPS setting.
  • Create some stub files for your anonymizers:

    ./manage.py create_anonymizers app_name1 [app_name2...]

    This will create a file anonymizers.py in each of the apps you specify. (It will not overwrite existing files).

  • Edit the generated anonymizers.py files, adjusting or deleting as necessary, using the functions in module anonymizer.replacers or custom functions.
  • Run the anonymizers:

    ./manage.py anonymize_data app_name1 [app_name2...]

    This will DESTRUCTIVELY UPDATE all your data. Make sure you only do this on a copy of your database, use at own risk, yada yada.

  • Note: your database may not actually delete the changed data from the disk when you update fields. For PostgreSQL you will need to run VACUUM FULL to delete that data.

    And even then, your operating system may not delete the data from the disk. Properly getting rid of these traces is left as an exercise for the reader :-)

Tests

To run the test suite, do the following inside the folder containing this README:

django-admin.py test --settings=anonymizer.test_settings

django-anonymizer's People

Contributors

bradyoo, diwu1989, gaffney, luflow, mgeist, quinox, terite

django-anonymizer's Issues

Latest version number isn't compatible with pip 20.3

pip 20.3 started enforcing the version scheme described in PEP 440.
The latest released version of django-anonymizer is 0.5.0.16-bw, which is not compatible with PEP 440.

There are workarounds, but they don't look good.

Can you please release a new version compatible with PEP 440?
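
For reference, PEP 440 includes a regular expression for checking whether a version string is in canonical form. The sketch below uses it to show why `0.5.0.16-bw` is rejected. The `is_canonical` name follows the PEP's appendix, and the pattern covers only public canonical versions, so local version labels such as `+bw` are outside its scope.

```python
import re

# Canonical-form pattern as given in the PEP 440 appendix.
CANONICAL = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?"
    r"(\.post(0|[1-9][0-9]*))?"
    r"(\.dev(0|[1-9][0-9]*))?$"
)


def is_canonical(version):
    """Return True if the string is a canonical PEP 440 public version."""
    return CANONICAL.match(version) is not None


for candidate in ("0.5.0.16-bw", "0.5.0.16", "0.5.1"):
    print(candidate, is_canonical(candidate))
```

The `-bw` suffix is neither a pre-release, post-release, nor dev segment, so the first string fails while the plain dotted versions pass.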

Dependency problem when trying to install

I wanted to test your management command, because I really like the method of keeping relations as they are and just faking a given set of values on the model instances.

Since it didn't work in the project I was working on, I tried installing and using it in a completely empty test-project.

I did:
pip install django-anonymizer

Then I tried importing anonymizer in the python interpreter:

>>> import anonymizer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mmuth/anonymizer-test/env/local/lib/python2.7/site-packages/anonymizer/__init__.py", line 1, in <module>
    from anonymizer.base import Anonymizer
  File "/home/mmuth/anonymizer-test/env/local/lib/python2.7/site-packages/anonymizer/base.py", line 7, in <module>
    from faker import data
ImportError: cannot import name data

To me it looks like django-anonymizer isn't properly working with Faker 0.7.3? Is this the case or am I doing something wrong here?

pip freeze:

Django==1.10.3
django-anonymizer==0.5.1
Faker==0.7.3
ipaddress==1.0.17
python-dateutil==2.5.3
six==1.10.0
wheel==0.24.0

Python 2.7.6

All installed packages are just (direct / indirect) dependencies of django-anonymizer

Please update or correct pypi version

I just installed the package and it shows:

  File "/usr/local/lib/python3.6/site-packages/anonymizer/base.py", line 7, in <module>
    from faker import data
ImportError: cannot import name 'data'

I've installed django-anonymizer-0.5.1 but your setup.py file shows a different version number.

Seems the package on pypi is outdated.

Lambda anonymizer

I'm trying to do:

    attributes = [
        ('id', "SKIP"),
        ('password', lambda *args: 'test'),
        ('last_login', "SKIP"),
    ]

But I get this error:

Running userprofile.anonymizers.UserAnonymizer... Traceback (most recent call last):
  File "manage.py", line 9, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 356, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.6/site-packages/django/core/management/base.py", line 283, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.6/site-packages/django/core/management/base.py", line 330, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python3.6/site-packages/django/core/management/base.py", line 476, in handle
    app_output = self.handle_app_config(app_config, **options)
  File "/usr/local/lib/python3.6/site-packages/anonymizer/management/commands/anonymize_data.py", line 38, in handle_app_config
    instance.run(chunksize=chunksize, parallel=parallel)
  File "/usr/local/lib/python3.6/site-packages/anonymizer/base.py", line 317, in run
    future.get()
  File "/usr/local/lib/python3.6/multiprocessing/pool.py", line 608, in get
    raise self._value
  File "/usr/local/lib/python3.6/multiprocessing/pool.py", line 385, in _handle_tasks
    put(task)
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/local/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function UserAnonymizer.<lambda> at 0x7f654d0bd7b8>: attribute lookup UserAnonymizer.<lambda> on userprofile.anonymizers failed

What is the correct way to define a static value?
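
The traceback shows the failure happening while `multiprocessing` pickles the task, and pickle cannot serialise a lambda stored as a class attribute. One possible workaround, sketched below, is a named module-level function, which pickle stores by reference. The four-argument signature is an assumption taken from the `replacer(self, obj, field, currentval)` call visible in another traceback on this page; `static_password`, `UserAnonymizerSketch`, and `is_picklable` are hypothetical names, and the class is a plain stand-in rather than a real `Anonymizer` subclass.

```python
import pickle


def static_password(anon, obj, field, val):
    """Module-level replacer (assumed signature): always returns 'test'."""
    return "test"


class UserAnonymizerSketch:
    # Plain stand-in class; a real anonymizer would subclass Anonymizer.
    attributes = [
        ("id", "SKIP"),
        ("password", static_password),
        ("last_login", "SKIP"),
    ]


def is_picklable(value):
    """Return True if pickle can serialise the value."""
    try:
        pickle.dumps(value)
    except (pickle.PicklingError, AttributeError):
        return False
    return True


print(is_picklable(static_password))       # named function
print(is_picklable(lambda *args: "test"))  # lambda, as in the report above
```

Whether this is enough in a given project depends on what else the anonymizer holds, since everything sent to the worker processes has to be picklable.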

Error when executing ./manage.py anonymize_data

I get this error when running the command:

Traceback (most recent call last):
  File "/usr/lib64/python3.5/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/mirelsol/pro/projets/virtualenvs/ll_fidecom/lib/python3.5/site-packages/django_anonymizer-0.5.0.14_bw-py3.5.egg/anonymizer/base.py", line 367, in _run
    retval = anonymizer.alter_object(obj)
  File "/home/mirelsol/pro/projets/virtualenvs/ll_fidecom/lib/python3.5/site-packages/django_anonymizer-0.5.0.14_bw-py3.5.egg/anonymizer/base.py", line 285, in alter_object
    self.alter_object_attribute(obj, attname, replacer)
  File "/home/mirelsol/pro/projets/virtualenvs/ll_fidecom/lib/python3.5/site-packages/django_anonymizer-0.5.0.14_bw-py3.5.egg/anonymizer/base.py", line 299, in alter_object_attribute
    replacement = replacer(self, obj, field, currentval)
  File "/home/mirelsol/pro/projets/virtualenvs/ll_fidecom/lib/python3.5/site-packages/django_anonymizer-0.5.0.14_bw-py3.5.egg/anonymizer/replacers.py", line 221, in choice
    return anon.faker.choice(field=field)
  File "/home/mirelsol/pro/projets/virtualenvs/ll_fidecom/lib/python3.5/site-packages/django_anonymizer-0.5.0.14_bw-py3.5.egg/anonymizer/base.py", line 200, in choice
    return self.get_allowed_value(lambda: random.choice(choices), field)
  File "/home/mirelsol/pro/projets/virtualenvs/ll_fidecom/lib/python3.5/site-packages/django_anonymizer-0.5.0.14_bw-py3.5.egg/anonymizer/base.py", line 72, in get_allowed_value
    retval = retval[:max_length]
TypeError: 'int' object is not subscriptable
"""

Python version: 3.5.2
Django version: 1.10.2
I installed django-anonymizer from the sources on GitHub
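
The last frame slices the replacement value with `retval[:max_length]`, which only works for sequences such as strings; here the chosen value is an `int`, presumably from a field whose choices are integers. A hedged sketch of a guard for that situation follows. `truncate_if_text` is a hypothetical helper, not part of django-anonymizer.

```python
def truncate_if_text(value, max_length):
    """Truncate strings to max_length; return any other type unchanged."""
    if max_length is not None and isinstance(value, str):
        return value[:max_length]
    return value


print(truncate_if_text("abcdef", 3))  # string is cut to three characters
print(truncate_if_text(7, 3))         # int passes through untouched
```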
