Comments (10)
What would you propose we use for threading? pthreads is only safe on PHP 7.2+, which would restrict usage to 7.2 and later, and that isn't very widely adopted yet.
Another, hackier way would be to let masquerade call itself through Symfony Process plus one of the parallel-process packages.
from masquerade.
While pthreads is a really nice extension, I think it's better to go for Symfony Process. I've seen several applications where the runtime calls itself to launch the subprocesses.
Aside from the PHP 7.2+ requirement, php-pthreads hasn't been packaged by most distros. I think the extension needs a bit more time to get adopted into the PHP ecosystem.
from masquerade.
I've done some preliminary work in the 24-subprocesses branch. See commit 09ce2ac. I start a parallel process per group here, not per table.
It works, but every running process immediately prints its own progress bar (as well as the logo and the 'Done anonymizing' message), which makes a mess:
_ _. _ _. _ .__. _| _
| | |(_|_>(_||_|(/_|(_|(_|(/_
|
by elgentos
v0.1.1
._ _ _. _ _. _ .__. _| _
| | |(_|_>(_||_|(/_|(_|(_|(/_
|
by elgentos
v0.1.1
Updating admin_user
0/26750 [>---------------------------] 0%
._ _ _. _ _. _ .__. _| _
| | |(_|_>(_||_|(/_|(_|(_|(/_
|
by elgentos
v0.1.1
Updating sales_creditmemo
Updating email_contact
Updating newsletter_subscriber
0/14 [>---------------------------] 0% 0/280 [>---------------------------] 0% 0/11616 [>---------------------------] 0% 0/11670 [>---------------------------] 0%
._ _ _. _ _. _ .__. _| _
| | |(_|_>(_||_|(/_|(_|(_|(/_
|
by elgentos
v0.1.1
._ _ _. _ _. _ .__. _| _
| | |(_|_>(_||_|(/_|(_|(_|(/_
|
by elgentos
v0.1.1
._ _ _. _ _. _ .__. _| _
| | |(_|_>(_||_|(/_|(_|(_|(/_
|
by elgentos
v0.1.1
Updating sales_invoice
Updating review_detail
0 [>---------------------------]
Done anonymizing
0/32941 [>---------------------------] 0%
._ _ _. _ _. _ .__. _| _
| | |(_|_>(_||_|(/_|(_|(_|(/_
|
by elgentos
v0.1.1
Updating sales_order
0/38379 [>---------------------------] 0%
Updating quote
0/78297 [>---------------------------] 0%
Done anonymizing
14/14 [============================] 100%^
from masquerade.
Cool stuff! It might be better to work with return values in the subprocesses, read them in the master process and create progress bars based on that.
Not sure if that's possible, this is new stuff for me haha.
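One way to picture the "report back to the master process" idea is a line-based protocol, sketched here in shell. The `<table> <processed> <total>` line format and the names `aggregate_progress` and `fake_worker` are assumptions made for illustration; masquerade does not emit this format. The workers only print counts, and a single reader folds them into one overall progress line instead of one bar per process.

```shell
#!/bin/sh
# Assumed protocol (not something masquerade emits): every worker prints
# lines of the form "<table> <processed> <total>" on stdout.

# aggregate_progress: read progress lines from stdin and print one combined
# "processed/total rows (percent%)" line after each update.
aggregate_progress() {
    awk '{
        processed[$1] = $2   # latest count reported for this table
        total[$1] = $3
        p = 0; t = 0
        for (name in total) { p += processed[name]; t += total[name] }
        pct = (t > 0) ? int(100 * p / t) : 0
        printf "%d/%d rows (%d%%)\n", p, t, pct
    }'
}

# fake_worker TABLE TOTAL: hypothetical stand-in for a subprocess that
# reports its start and its completion.
fake_worker() {
    printf '%s 0 %s\n' "$1" "$2"
    printf '%s %s %s\n' "$1" "$2" "$2"
}

# Usage: run workers in the background with a shared stdout and fold
# their reports into a single progress stream:
#   { fake_worker admin_user 14 & fake_worker sales_order 280 & wait; } | aggregate_progress
```

Because only the reader writes to the terminal, the logo and per-process bars would no longer interleave; the parent could just as well feed these counts into a single progress bar.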
from masquerade.
https://github.com/krakjoe/parallel
This is really nice, but unfortunately it's a PECL extension, which limits usage. So it's a no-go.
from masquerade.
FWIW, I wrote a little bash script to make it possible for now:
DATABASE="your_db_name"
MASQUERADE_PLATFORM="magento2"
# Collect the unique group names from the table printed by `masquerade groups`
MASQUERADE_GROUPS=($(bin/masquerade groups --platform "${MASQUERADE_PLATFORM}" | grep '|' | grep -v 'Group' | awk -F'|' '{print $3}' | sort -u))
for group in "${MASQUERADE_GROUPS[@]}"
do
    echo "Starting process for group $group"
    # Run each group in its own detached screen session
    screen -d -m -S "anonymize_${group}" bin/masquerade run --platform "${MASQUERADE_PLATFORM}" --database "${DATABASE}" --group "${group}"
done
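For setups without screen, a similar effect can be sketched with plain background jobs. This is only a sketch: `run_groups` and `anonymize_group` are hypothetical helper names, the platform and database values are the placeholders from the script above, and group names are assumed to contain no whitespace. Unlike detached screen sessions, the parent waits for every group and reports which ones failed.

```shell
#!/bin/sh
# anonymize_group GROUP: run masquerade for one group, using the same
# flags as the screen-based script (placeholder platform and database).
anonymize_group() {
    bin/masquerade run --platform "magento2" --database "your_db_name" --group "$1"
}

# run_groups CMD GROUP...: start "CMD GROUP" in the background for every
# group, wait for all of them, and return the number of groups that failed.
run_groups() {
    cmd=$1
    shift
    jobs_started=""
    for group in "$@"; do
        "$cmd" "$group" &
        jobs_started="$jobs_started $!:$group"   # remember PID and group name
    done
    failed=0
    for job in $jobs_started; do
        if ! wait "${job%%:*}"; then
            echo "Group ${job#*:} failed" >&2
            failed=$((failed + 1))
        fi
    done
    return "$failed"
}

# Usage (hypothetical group names):
#   run_groups anonymize_group customer sales newsletter
```

Like the screen version, this starts all groups at once; it does not cap the number of simultaneous processes.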
from masquerade.
I now create a subprocess per group/table combination.
from masquerade.
@tdgroot could you test it?
You can clone the https://github.com/elgentos/masquerade/tree/24-subprocesses branch and run bin/masquerade to test it.
from masquerade.
This is a nicer package with some more options: https://github.com/graze/parallel-process. Using this, we could do something like:
<?php

namespace Elgentos\Masquerade\Commands;

use Graze\ParallelProcess\Event\RunEvent;
use Graze\ParallelProcess\PriorityPool;
use Graze\ParallelProcess\RunInterface;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Input\InputOption;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\Process\Process;

class ExampleCommand extends Command
{
    /**
     * @var OutputInterface
     */
    private OutputInterface $output;

    protected function configure()
    {
        $this
            //
            ->addOption('subprocess', 's', InputOption::VALUE_OPTIONAL, 'Whether command is ran as subprocess', false);
    }

    protected function execute(InputInterface $input, OutputInterface $output)
    {
        $this->output = $output;

        if ($input->getOption('subprocess')) {
            sleep(2);
            $output->write(json_encode(['date' => date('d-m-Y H:i:s')]));
            return 0;
        }

        $pool = new PriorityPool();
        $pool->setMaxSimultaneous(5);

        for ($i = 0; $i < $pool->getMaxSimultaneous() * 4; $i++) {
            $pool->add(new Process(['php', 'application.php', '--subprocess=1']));
        }

        array_map([$this, 'addCallback'], $pool->getAll());
        $pool->run();

        return 0;
    }

    public function addCallback(RunInterface $run)
    {
        $run->addListener(
            RunEvent::SUCCESSFUL,
            function (RunEvent $event) {
                $data = json_decode($event->getRun()->getLastMessage(), true);
                $this->output->writeln('The date is ' . $data['date']);
            }
        );
    }
}
from masquerade.
The main problem this issue tried to solve was speed, and that was fixed in version 0.3.0. So closing this issue.
from masquerade.