maum-ai / faceshifter
Unofficial PyTorch Implementation for FaceShifter (https://arxiv.org/abs/1912.13457)
License: BSD 3-Clause "New" or "Revised" License
The old linked file (https://drive.google.com/file/d/1TAb6WNfusbL2Iv3tfRCpMXimZE9tnSUn/view?usp=sharing) no longer exists. Thanks a lot.
Can you please provide the identity encoder code?
Hi,
I'm wondering how much GPU memory is needed for training with batch size 1. I'm training on a 2080 Ti (11 GB), but I always get CUDA out of memory, even with batch size 1.
When running aei_inference.py, this error occurs:
__init__() got multiple values for argument 'hp'
Excuse me, my training dataset has only 1000 pictures, and I found len(AEI_Dataset) = len(self.files) * 5 = 5000 with my batch size of 20. So the number of train steps should be 250, but why is the train step count 125000250?
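For what it's worth, the expected per-epoch step count described in the question works out as follows (an illustrative sketch of the arithmetic only; the AEI_Dataset name and the ×5 repetition come from the question, everything else is hypothetical):

```python
# Sketch of the step-count arithmetic from the question above.
num_files = 1000
dataset_len = num_files * 5          # AEI_Dataset repeats each file 5 times -> 5000
batch_size = 20
steps_per_epoch = dataset_len // batch_size
print(steps_per_epoch)               # 250 steps per epoch
```

A much larger number in the progress bar usually means the trainer is counting total steps across all configured epochs, not steps for a single epoch.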
I tried to use the "Colab Example" but could not find the "FaceShifter.pth" file. Where can I find this?
The link is expired. Can someone provide the pre-trained Arcface model weights? Thanks!!!
Thanks for the awesome code! I am training my own model right now and have a question: shuffle for the training dataloader is not set to True. Did you use the same setting? Thanks!
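For reference, enabling shuffling is a one-flag change on the PyTorch DataLoader (a generic sketch with a toy dataset, not the repo's actual dataloader code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset standing in for AEI_Dataset
dataset = TensorDataset(torch.arange(10).float())

# shuffle=True re-randomizes the sample order every epoch
loader = DataLoader(dataset, batch_size=2, shuffle=True)

# Each sample is still visited exactly once per epoch, just in random order
seen = sorted(x.item() for batch in loader for x in batch[0])
```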
Hi, thanks for your great implementation!
In the original paper, batch normalization is conducted without affine parameters (they are replaced by the attribute and identity modulation parameters).
So we should explicitly set the affine flag of BatchNorm in the ADD layers to False, as follows:
self.BNorm = nn.BatchNorm2d(h_inchannel, affine=False)
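A minimal sketch of this pattern (not the repo's exact ADD layer, and ModulatedBN is a hypothetical name): normalize without learnable affine parameters, then apply externally predicted per-channel modulation, as the attribute/identity branches would:

```python
import torch
import torch.nn as nn

# BatchNorm with affine=False has no learnable scale/shift; the
# modulation parameters gamma/beta are supplied from outside instead.
class ModulatedBN(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels, affine=False)

    def forward(self, x, gamma, beta):
        return gamma * self.bn(x) + beta

x = torch.randn(4, 8, 16, 16)
gamma = torch.ones(4, 8, 1, 1)   # per-channel scale, broadcast over H, W
beta = torch.zeros(4, 8, 1, 1)   # per-channel shift
out = ModulatedBN(8)(x, gamma, beta)
```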
Thanks for your response.
I have a problem: what is the valset_dir? There is only one data folder after preprocessing; is that the output_dir?
Hello! First of all, thank you for your work, it is incredible! 🥇
Is there any final pre-trained model available for download, or do you plan to add one in the near future? Thanks a lot! :)
Hello again, thanks for sharing the code. I am trying to train on a dataset of 200k images, but tqdm shows the process will finish in 125 hours, because one epoch needs 2,206,000 iterations. How can I change this?
How many images are necessary to train and have good results?
Can I train with images at 128 * 128?
Can anyone share a pretrained model?
Thanks for your great work! I want to know how you trained your ArcFace. Was it trained only for this task, or can it be used as a general face recognition model?
I'd like to play with this, but I don't have the hardware to train with.
If someone who has trained this could share their weights, I would appreciate it.
Thanks for the implementation! Well done!
I am about to use your code to train my own model. I am curious how long it took to train your model. How many epochs did you use?
Thanks!
Hi, thanks for your great code.
I have a problem when using 2 GPUs.
With 1 GPU, the training speed is about 0.75 s/it (according to the progress bar), and with 2 GPUs it is about 1.33 s/it. Since the total number of iterations is halved with 2 GPUs, one epoch takes almost the same time in both cases (1 and 2 GPUs).
Would you please help me figure out what the problem is?
Thanks a lot.
Thank you for sharing such an excellent project. I would like to ask what loss coefficients you used. I tried the coefficients from the authors' paper, but the results were not very satisfactory. Could you please share your coefficients below? Thank you very much!
First of all, thank you for your nice work, it helped me a lot!
But while reading your code, I found a discrepancy: in the original paper, the identity loss is calculated with cosine similarity, while in this implementation it is calculated with bmm, i.e. an inner product.
I would like to know the reason for changing the cosine loss to bmm, and its influence on the final result. I would much appreciate a reply.
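For comparison, the paper's formulation can be sketched as follows (illustrative code, not the repo's implementation; id_loss_cosine is a hypothetical name). Note that if both identity embeddings are L2-normalized, the inner product computed with bmm equals the cosine similarity, so the two coincide in that case:

```python
import torch
import torch.nn.functional as F

def id_loss_cosine(z_id_src, z_id_out):
    # 1 - cos(z_src, z_out), averaged over the batch
    return (1 - F.cosine_similarity(z_id_src, z_id_out, dim=1)).mean()

z = torch.randn(4, 512)               # toy identity embeddings
loss_same = id_loss_cosine(z, z)      # identical embeddings give ~0 loss
```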
Hello, thanks for sharing the code. I tried to train with batch sizes 2, 4, 8, and 16 separately on a V100 16 GB, but I got the error "Validation sanity check: 0it [00:00, ?it/s]Killed". Do you know what the problem is?
Could you provide one for Colab? I would appreciate it.
Thanks for the great work!
When I train the AEI-Net with 30k images from the CelebA-HQ dataset using 6 P40 32 GB GPUs, I get the training curve below:
All the other settings are left at their defaults, and the generated swapped faces also look weird:
Should I continue training, or do you have any suggestions? Thanks in advance!
Hi, can someone tell me what I can do to fix this issue?
This is what I get when running it on a local machine or Google Colab.
2020-11-05 11:56:07.097272: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
CUDA_VISIBLE_DEVICES: [0]
/usr/local/lib/python3.6/dist-packages/pytorch_lightning/utilities/distributed.py:37: UserWarning: WORLD_SIZE environment variable (2) is not equal to the computed world size (1). Ignored.
warnings.warn(*args, **kwargs)
initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/1
----------------------------------------------------------------------------------------------------
distributed_backend=ddp
All DDP processes registered. Starting ddp with 1 processes
----------------------------------------------------------------------------------------------------
/usr/local/lib/python3.6/dist-packages/pytorch_lightning/utilities/distributed.py:37: UserWarning: Could not log computational graph since the `model.example_input_array` attribute is not set or `input_array` was not given
warnings.warn(*args, **kwargs)
| Name | Type | Params
---------------------------------------------------------
0 | G | ADDGenerator | 372 M
1 | E | MultilevelAttributesEncoder | 67 M
2 | D | MultiscaleDiscriminator | 8 M
3 | Z | ResNet | 43 M
4 | Loss_GAN | GANLoss | 0
5 | Loss_E_G | AEI_Loss | 0
Validation sanity check: 0it [00:00, ?it/s]/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:2494: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
Epoch 0: 0% 0/5000250 [00:00<?, ?it/s] Traceback (most recent call last):
File "aei_trainer.py", line 62, in <module>
main(args)
File "aei_trainer.py", line 40, in main
trainer.fit(model)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/states.py", line 48, in wrapped_fn
result = fn(self, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 1058, in fit
results = self.accelerator_backend.spawn_ddp_children(model)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/ddp_backend.py", line 123, in spawn_ddp_children
results = self.ddp_train(local_rank, mp_queue=None, model=model, is_master=True)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/ddp_backend.py", line 224, in ddp_train
results = self.trainer.run_pretrain_routine(model)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 1239, in run_pretrain_routine
self.train()
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/training_loop.py", line 394, in train
self.run_training_epoch()
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/training_loop.py", line 491, in run_training_epoch
batch_output = self.run_training_batch(batch, batch_idx)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/training_loop.py", line 844, in run_training_batch
self.hiddens
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/training_loop.py", line 1015, in optimizer_closure
hiddens)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/training_loop.py", line 1197, in training_forward
output = self.model(*args)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/overrides/data_parallel.py", line 170, in forward
output = self.module.training_step(*inputs[0], **kwargs[0])
File "/content/faceshifter/aei_net.py", line 54, in training_step
output, z_id, output_z_id, feature_map, output_feature_map = self(target_img, source_img)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/content/faceshifter/aei_net.py", line 44, in forward
output = self.G(z_id, feature_map)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/content/faceshifter/model/AEINet.py", line 132, in forward
x = self.model["layer_7"](x, z_att[7], z_id)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/content/faceshifter/model/AEINet.py", line 98, in forward
x1 = self.activation(self.add1(h_in, z_att, z_id))
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/content/faceshifter/model/AEINet.py", line 72, in forward
h_out = (1-m)*a + m*i
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 14.73 GiB total capacity; 12.56 GiB already allocated; 45.88 MiB free; 1.17 GiB cached)
Epoch 0: 0%| | 0/5000250 [00:36<?, ?it/s]