Comments (5)

uznog commented on June 16, 2024

The way I tackled this before opening #61 was to write a simple app that queries the MySQL database for its list of tables and then searches the base configuration directory for YAMLs that mention those tables. The YAMLs that were needed were then bundled into a separate 'final' directory, and Masquerade was run against that final configuration.

This resolved the errors that occurred when a table didn't exist, as I was only running Masquerade against tables I was sure existed. The downside: I had to split the custom config YAMLs into one YAML per table, because not every table in a group necessarily exists. For example, email_table1 and email_table2 are configured in one YAML (email.yaml), but only email_table1 exists in the database; Masquerade would still try to anonymize the non-existent email_table2.
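A minimal sketch of that filtering step, not the actual app: it assumes the table list has already been fetched (for example with `SHOW TABLES`), that table names appear as keys in the YAML files, and that one YAML per table is used. The function name `select_configs` and the directory arguments are hypothetical.

```python
import re
import shutil
from pathlib import Path


def select_configs(existing_tables, config_dir, final_dir):
    """Copy every YAML in config_dir that mentions an existing table into final_dir."""
    config_dir, final_dir = Path(config_dir), Path(final_dir)
    final_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for yaml_file in sorted(config_dir.glob("*.yaml")):
        text = yaml_file.read_text(encoding="utf-8")
        for table in existing_tables:
            # Assumption: a table is "mentioned" when it appears as a YAML key
            # on its own line. A real tool would parse the YAML instead.
            if re.search(rf"^\s*{re.escape(table)}:\s*$", text, re.MULTILINE):
                shutil.copy(yaml_file, final_dir / yaml_file.name)
                copied.append(yaml_file.name)
                break
    return copied
```

Masquerade would then be pointed at the final directory only. Note that a file covering several tables is copied as soon as one of them exists, which is exactly why the per-table split was needed.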

As for how to share these YAMLs, they should be easy to access and maintain. Storing them in a separate repo may introduce more management issues: how would users retrieve them easily, and how would the app itself obtain them?

It would be nice if the app could provide default configuration YAMLs on demand so users can choose which ones to use, or even offer only those configs that suit the user's database. The configs could then be stored and maintained in the app's repo and built into the application, while users could either run them as-is or customize how the anonymization process runs.

Those are just my thoughts though, and I'm not a PHP dev myself - let me know if they make any sense, or if some implementation issues would be a problem.

from masquerade.

johnorourke commented on June 16, 2024

I have some ideas! I have three criteria for this:

  • Optionally throw an exception - for example, we have made promises to be GDPR-compliant for corporate clients, so I want to know that missing configs or invalid table names will not fail silently.
  • Easy to add in multiple projects
  • Easy to add in a docker container - eg. a Bitbucket pipeline or Gitlab CI

I suggest one of these:

  • Composer modules, in a few ways:
    • composer module has an autoload.php which adds its own folder to the php include path - this is how "deployer/recipes" does it - see https://github.com/deployphp/recipes
    • composer module has an autoload.php which creates or updates a global object to 'register' masquerade configs
    • composer module with a post-install script which symlinks the files to .masquerade/config/ or similar
    • composer module which masquerade will look for in the current folder - eg. look for vendor/*/*/composer.json files which match certain criteria
    • composer module with type "masquerade-config" and we use the composer API to find those modules in the current folder
  • git submodules cloned into .masquerade/config/:
    • git submodule add [email protected]:INITECH/MASQUERADE_MODULE.git .masquerade/config/initech
  • give shell commands in a README for downloading a zip and unpacking into .masquerade/config/, eg
    • wget -O - https://github.com/xxxxx/xxxxx/releases/xxxx.zip |unzip - .masquerade/config/
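As a rough illustration of the "look for vendor/*/*/composer.json files" option above, here is a sketch that finds packages by their composer.json `type` field. It reads the manifests directly rather than going through the Composer API, the function name `find_config_packages` is hypothetical, and "masquerade-config" is only the type proposed in this thread.

```python
import json
from pathlib import Path


def find_config_packages(project_dir, package_type="masquerade-config"):
    """Return vendor package directories whose composer.json declares package_type."""
    found = []
    for manifest in sorted(Path(project_dir).glob("vendor/*/*/composer.json")):
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed manifests
        if isinstance(data, dict) and data.get("type") == package_type:
            found.append(manifest.parent)
    return found
```

Each returned directory could then be treated as an extra config location, without scanning any PHP classes.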

Maybe there's even another option: the masquerade phar file could 'require' the vendor/autoload.php from the current folder and scan it for classes. But believe me, scanning all possible classes causes various problems and requires composer dumpautoload --optimize, which isn't the default.

I like the first one - simple and can be used with "require-dev" to ensure unnecessary modules don't go into production environments.

peterjaap commented on June 16, 2024

@johnorourke

We could introduce a --strict-mode flag to throw an exception on missing configs / missing tables / missing columns. Seems easy enough.
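To make the strict-mode idea concrete, here is a small sketch of the check it implies, written in Python rather than Masquerade's PHP. It assumes both the config and the live schema have been reduced to a mapping of table name to column names; `check_config` and `MissingSchemaError` are hypothetical names, and missing config files are not covered.

```python
import warnings


class MissingSchemaError(Exception):
    """Raised in strict mode when configured tables or columns do not exist."""


def check_config(config, schema, strict=False):
    """config and schema both map a table name to an iterable of column names."""
    problems = []
    for table, columns in config.items():
        if table not in schema:
            problems.append(f"missing table: {table}")
            continue
        for column in columns:
            if column not in schema[table]:
                problems.append(f"missing column: {table}.{column}")
    if problems and strict:
        # Strict mode: fail loudly instead of silently skipping.
        raise MissingSchemaError("; ".join(problems))
    for problem in problems:
        warnings.warn(problem)
    return problems
```

Without the flag the run continues with warnings; with it, a single missing table or column aborts the run, which is the behaviour the GDPR requirement asks for.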

I'd be in favor of the composer repo as well. I'd suggest elgentos/masquerade-configs. Then we could add a console command to this repo that can be run with composer's post-install-cmd (when the config package is present) to ask which files should be copied from that repository. It could then create a .masquerade-installed file to make sure this isn't run automatically on each install (and assume it is when --no-interaction is passed).
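The install-once flow could be sketched as follows. This is one reading of the proposal, not an agreed design: it assumes the marker file lives in the project root, that a non-interactive run is treated as already installed and copies nothing, and that `choose` is a callback standing in for the interactive prompt. `install_configs` and its parameters are hypothetical.

```python
import shutil
from pathlib import Path

MARKER = ".masquerade-installed"


def install_configs(package_dir, target_dir, project_dir, choose, interactive=True):
    """Copy the YAMLs picked by choose() once, then record that in a marker file."""
    marker = Path(project_dir) / MARKER
    if marker.exists() or not interactive:
        return []  # already installed, or nobody available to ask
    available = sorted(p.name for p in Path(package_dir).glob("*.yaml"))
    selected = [name for name in choose(available) if name in available]
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for name in selected:
        shutil.copy(Path(package_dir) / name, target / name)
    # The marker stops the prompt from reappearing on every install.
    marker.write_text("\n".join(selected), encoding="utf-8")
    return selected
```

Storing the selected file names in the marker is an extra assumption; it would also give a later update check something to compare against.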

johnorourke commented on June 16, 2024

That's a great idea @peterjaap - the single repo would keep them all tidy, easy to fork, allow management of PRs and issues etc, and the post-install-cmd hook would make it really simple to use.

It would need to know the 'platform' config folder the user wants them in - in masquerade core there are several config file locations - perhaps auto-detect to see if any are in use, and/or let the user choose that too?

We'd need to consider updates too - eg. you install the configs, but later an update to one of the vendor-specific files is released in the composer module - perhaps just warn the user during the post-update-cmd hook if they might be running out-of-date files?
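One way such a warning could be driven is a content comparison between the installed copies and the files shipped in the package. The sketch below assumes both sets are plain directories of YAML files with matching names; `outdated_configs` is a hypothetical helper, not part of Masquerade.

```python
import hashlib
from pathlib import Path


def _digest(path):
    """SHA-256 of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def outdated_configs(package_dir, installed_dir):
    """Names of installed YAMLs whose content differs from the package copy."""
    stale = []
    for installed in sorted(Path(installed_dir).glob("*.yaml")):
        upstream = Path(package_dir) / installed.name
        if upstream.exists() and _digest(upstream) != _digest(installed):
            stale.append(installed.name)
    return stale
```

A plain hash comparison cannot tell an upstream update from a local customization, so a warning is about as far as it can safely go; overwriting automatically would risk discarding user edits.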

peterjaap commented on June 16, 2024

Trying to move this to Discussions but can't find the option? https://docs.github.com/en/discussions/managing-discussions-for-your-community/managing-discussions-in-your-repository#converting-issues-based-on-labels
