shgtkshruch / embulk-masking-sample


Ruby 31.85% Dockerfile 1.79% HCL 54.67% JavaScript 11.69%
embulk terraform rds lambda step-functions docker

embulk-masking-sample's Introduction

A study of loading masked (concealed) data with Embulk.

Diagram

Requirements

Setup

# Download test data
# ref: https://dev.mysql.com/doc/employee/en/
$ gh repo clone datacharmer/test_db

# Launch MySQL server on port 4306
$ dip provision

# Import test data to MySQL
$ docker-compose exec db /bin/bash -c 'mysql -u root -p"$MYSQL_ROOT_PASSWORD" < employees.sql'

# Install embulk gems
$ docker-compose exec -w /tmp/embulk/bundle embulk bash
$ embulk bundle install
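
The import step above assumes the MySQL server is already accepting connections, which may not be true right after the container starts. The helper below is a minimal sketch of a retry loop. The name `wait_for` is hypothetical, and the commented usage assumes that `mysqladmin` is available inside the `db` container.

```shell
# Hypothetical helper: run a command until it succeeds.
# $1 = maximum attempts, $2 = seconds to sleep between attempts,
# remaining arguments = the command to run.
wait_for() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage (assumption: the db image ships mysqladmin):
#   wait_for 30 2 docker-compose exec -T db mysqladmin ping --silent
```

`mysqladmin ping` exits with 0 as soon as the server is running, so no credentials are needed for the check itself.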

Example

Official example

$ embulk example ./try1
$ embulk guess ./try1/seed.yml -o ./try1/config.yml

$ embulk preview ./try1/config.yml
+---------+--------------+-------------------------+-------------------------+----------------------------+
| id:long | account:long |          time:timestamp |      purchase:timestamp |             comment:string |
+---------+--------------+-------------------------+-------------------------+----------------------------+
|       1 |       32,864 | 2015-01-27 19:23:49 UTC | 2015-01-27 00:00:00 UTC |                     embulk |
|       2 |       14,824 | 2015-01-27 19:01:23 UTC | 2015-01-27 00:00:00 UTC |               embulk jruby |
|       3 |       27,559 | 2015-01-28 02:20:02 UTC | 2015-01-28 00:00:00 UTC | Embulk "csv" parser plugin |
|       4 |       11,270 | 2015-01-29 11:54:36 UTC | 2015-01-29 00:00:00 UTC |                            |
+---------+--------------+-------------------------+-------------------------+----------------------------+

$ embulk run ./try1/config.yml
1,32864,2015-01-27 19:23:49,20150127,embulk
2,14824,2015-01-27 19:01:23,20150127,embulk jruby
3,27559,2015-01-28 02:20:02,20150128,Embulk "csv" parser plugin
4,11270,2015-01-29 11:54:36,20150129,

MySQL

$ docker-compose exec embulk bash
$ embulk guess -b bundle -o ./mysql/config.yml ./mysql/seed.yml
$ embulk preview -b bundle ./mysql/config.yml
+-------------+-------------------------+-------------------+------------------+---------------+-------------------------+
| emp_no:long |    birth_date:timestamp | first_name:string | last_name:string | gender:string |     hire_date:timestamp |
+-------------+-------------------------+-------------------+------------------+---------------+-------------------------+
|      10,001 | 1953-09-01 15:00:00 UTC |            Georgi |          Facello |             M | 1986-06-25 15:00:00 UTC |
|      10,002 | 1964-06-01 15:00:00 UTC |           Bezalel |           Simmel |             F | 1985-11-20 15:00:00 UTC |
|      10,003 | 1959-12-02 15:00:00 UTC |             Parto |          Bamford |             M | 1986-08-27 15:00:00 UTC |
|      10,004 | 1954-04-30 15:00:00 UTC |         Chirstian |          Koblick |             M | 1986-11-30 15:00:00 UTC |
|      10,005 | 1955-01-20 15:00:00 UTC |           Kyoichi |         Maliniak |             M | 1989-09-11 15:00:00 UTC |
...
...

$ embulk run -b bundle ./mysql/config.yml
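
The contents of mysql/seed.yml are not shown here. As a rough sketch only, the helper below writes a minimal seed file for a MySQL source using the embulk-input-mysql option names (`host`, `port`, `user`, `password`, `database`, `table`). The function name `write_seed`, the default host `db`, port 3306, the `root` user, and the `stdout` output are assumptions for illustration. Only the `employees` database and table come from the test data previewed above.

```shell
# Hypothetical helper: write a minimal Embulk seed file for a MySQL source.
# $1 = output path, $2 = MySQL host (assumed default: the "db" service).
write_seed() {
  local path="$1" host="${2:-db}"
  cat > "$path" <<EOF
in:
  type: mysql
  host: ${host}
  port: 3306
  user: root
  password: "${MYSQL_ROOT_PASSWORD:-}"
  database: employees
  table: employees
out:
  type: stdout
EOF
}

# Usage:
#   write_seed ./mysql/seed.yml        # local MySQL container
#   write_seed ./mysql/seed.yml HOST   # another MySQL host, e.g. RDS
```

Passing the host as an argument keeps one template usable for both the local container and a remote instance.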

AWS

Create IAM user for Terraform

Create a .env-aws file:

AWS_ACCESS_KEY_ID=xxx
AWS_SECRET_ACCESS_KEY=xxx
AWS_DEFAULT_REGION=xxx

$ dip aws iam create-user --user-name embulk-mysql-rds-masking
$ dip aws iam create-access-key --user-name embulk-mysql-rds-masking
$ dip aws iam attach-user-policy \
  --policy-arn arn:aws:iam::aws:policy/AdministratorAccess \
  --user-name embulk-mysql-rds-masking
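
The access key issued by `create-access-key` is what the Terraform step expects in its env file. The sketch below writes the three variables to a file that only the current user can read. The function name `write_env_file` is hypothetical, and the commented usage assumes that `dip aws` forwards the standard AWS CLI `--query` and `--output` options unchanged.

```shell
# Hypothetical helper: write AWS credentials in env-file format.
# $1 = path, $2 = access key id, $3 = secret access key, $4 = region.
write_env_file() {
  local path="$1" key_id="$2" secret="$3" region="$4"
  (
    # Restrict permissions on a newly created file to the owner.
    umask 077
    printf 'AWS_ACCESS_KEY_ID=%s\nAWS_SECRET_ACCESS_KEY=%s\nAWS_DEFAULT_REGION=%s\n' \
      "$key_id" "$secret" "$region" > "$path"
  )
}

# Usage (bash; the region is a placeholder, as in the env files above):
#   read -r key_id secret < <(dip aws iam create-access-key \
#     --user-name embulk-mysql-rds-masking \
#     --query 'AccessKey.[AccessKeyId,SecretAccessKey]' --output text)
#   write_env_file .env-tf "$key_id" "$secret" xxx
```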

Terraform

Create a .env-tf file with the embulk-mysql-rds-masking user's credentials:

AWS_ACCESS_KEY_ID=xxx
AWS_SECRET_ACCESS_KEY=xxx
AWS_DEFAULT_REGION=xxx

# Generate zip files for the Lambda functions
$ cd terraform/lambda && zip -r create-onetime-rds.zip create-onetime-rds.js && zip -r delete-onetime-rds.zip delete-onetime-rds.js && cd -

$ dip terraform init
$ dip terraform plan
$ dip terraform apply
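
The zip step above names each Lambda file by hand. A loop over the directory is one possible alternative, sketched below. `zip_name` and `zip_lambdas` are hypothetical names, and the sketch assumes each Lambda is a single .js file with no dependencies to bundle.

```shell
# Derive the archive name for a Lambda source file, e.g. foo.js -> foo.zip.
zip_name() {
  printf '%s.zip\n' "$(basename "$1" .js)"
}

# Zip every .js file in the given directory into its own archive.
zip_lambdas() {
  local dir="$1" src
  for src in "$dir"/*.js; do
    # With no matches the glob stays unexpanded, so skip it.
    [ -e "$src" ] || continue
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$dir" && zip -q "$(zip_name "$src")" "$(basename "$src")" )
  done
}

# Usage:
#   zip_lambdas terraform/lambda
```

Since each archive holds a single file, the `-r` flag is not needed here.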

Load test data into RDS

$ docker-compose exec db /bin/bash -c 'mysql -h HOST -u dbuser -ppassword < employees.sql'

Create onetime RDS

$ dip aws stepfunctions start-execution \
  --state-machine-arn <value> \
  --input '{ "DBInstanceIdentifier": "RDS_IDENTIFIER" }'
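
`start-execution` returns as soon as the execution is accepted, not when the state machine finishes. The sketch below wraps the command above and adds a status check. The function names are hypothetical, and it assumes that `dip aws` forwards the standard AWS CLI `--query` and `--output` options unchanged.

```shell
# Build the JSON input for the state machine.
# Assumes the identifier needs no JSON escaping (RDS identifiers
# consist of letters, digits, and hyphens).
build_input() {
  printf '{ "DBInstanceIdentifier": "%s" }\n' "$1"
}

# Start an execution and print its ARN.
# $1 = state machine ARN, $2 = RDS instance identifier.
start_onetime_rds() {
  dip aws stepfunctions start-execution \
    --state-machine-arn "$1" \
    --input "$(build_input "$2")" \
    --query executionArn --output text
}

# Print the status of an execution: RUNNING, SUCCEEDED, FAILED,
# TIMED_OUT, or ABORTED. $1 = execution ARN.
execution_status() {
  dip aws stepfunctions describe-execution \
    --execution-arn "$1" --query status --output text
}

# Usage:
#   arn="$(start_onetime_rds <value> RDS_IDENTIFIER)"
#   execution_status "$arn"
```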

Transfer data from RDS to local MySQL

  1. Set the RDS host in mysql/seed.yml
  2. Run Embulk

$ docker-compose exec embulk bash
$ embulk guess -b bundle -o ./mysql/config.yml ./mysql/seed.yml
$ embulk preview -b bundle ./mysql/config.yml
$ embulk run -b bundle ./mysql/config.yml

Cleanup

Remove the AWS resources.

$ dip terraform destroy
