yaoyi-li / gca-matting
Official repository for Natural Image Matting via Guided Contextual Attention
License: MIT License
In the _composite_fg function of dataloader/data_generator.py:
alpha_tmp = 1 - (1 - alpha) * (1 - alpha2)
fg = fg.astype(np.float32) * alpha[:,:,None] + fg2.astype(np.float32) * (1 - alpha[:,:,None])
When computing the new fg, fg2 is multiplied by (1 - alpha). Shouldn't fg2 be multiplied by (1 - alpha2) instead?
https://github.com/Yaoyi-Li/GCA-Matting/blob/master/dataloader/data_generator.py#L520
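For reference, the standard "over" operator for stacking one (foreground, alpha) layer on top of another weights the lower layer by alpha2 * (1 - alpha) and then divides by the combined alpha. The sketch below shows that textbook form, not the repository's code; the function name composite_two_foregrounds and the eps guard are assumptions made for illustration. When alpha2 is 1 everywhere, it reduces to the expression quoted above.

```python
import numpy as np

def composite_two_foregrounds(fg, alpha, fg2, alpha2, eps=1e-6):
    """Layer (fg, alpha) over (fg2, alpha2) with the Porter-Duff 'over' operator.

    fg, fg2: H x W x 3 arrays; alpha, alpha2: H x W arrays in [0, 1].
    Returns the combined straight (non-premultiplied) foreground and alpha.
    """
    a = alpha.astype(np.float32)[:, :, None]
    a2 = alpha2.astype(np.float32)[:, :, None]
    # Combined opacity: a pixel is transparent only if both layers are.
    alpha_out = 1.0 - (1.0 - a) * (1.0 - a2)
    # Premultiplied colour: the lower layer shows through where the upper one is transparent.
    premult = fg.astype(np.float32) * a + fg2.astype(np.float32) * a2 * (1.0 - a)
    # Divide by the combined alpha to get a straight foreground colour.
    fg_out = premult / np.maximum(alpha_out, eps)
    return fg_out, alpha_out[:, :, 0]

# Small check: an opaque lower layer reproduces fg * a + fg2 * (1 - a).
fg = np.full((2, 2, 3), 200.0, dtype=np.float32)
fg2 = np.full((2, 2, 3), 100.0, dtype=np.float32)
alpha = np.full((2, 2), 0.5, dtype=np.float32)
alpha2 = np.ones((2, 2), dtype=np.float32)
fg_new, alpha_new = composite_two_foregrounds(fg, alpha, fg2, alpha2)
```

With a semi-transparent lower layer the two expressions differ, which is where the question above becomes relevant.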
Good day
I want to try training the model with a new dataset I'm creating.
In this new dataset, I have a ground truth alpha mask for each input image.
Here:
train_image_file = ImageFileTrain(alpha_dir=CONFIG.data.train_alpha,
fg_dir=CONFIG.data.train_fg,
bg_dir=CONFIG.data.train_bg)
It seems to ask for 3 folders: one for the ground truth alpha, and then two for fg and bg?
What are these?
If I have input images and ground truth alphas, how can I adapt them to work with the input of your model?
Thank you very much
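For context, matting training data in this style is commonly built from three ingredients per sample: a foreground image, its alpha matte, and an unrelated background. The training input is then synthesized with the compositing equation I = alpha * F + (1 - alpha) * B. The sketch below illustrates only that equation; the function name composite and the array shapes are assumptions, not the repository's loader.

```python
import numpy as np

def composite(fg, alpha, bg):
    """Blend fg over bg with I = alpha * F + (1 - alpha) * B.

    fg, bg: H x W x 3 arrays in [0, 255]; alpha: H x W array in [0, 1].
    """
    a = alpha.astype(np.float32)[:, :, None]
    merged = fg.astype(np.float32) * a + bg.astype(np.float32) * (1.0 - a)
    return np.clip(merged, 0, 255).astype(np.uint8)

# Toy example: a flat foreground over a flat background.
fg = np.full((2, 2, 3), 200, dtype=np.uint8)
bg = np.full((2, 2, 3), 100, dtype=np.uint8)
alpha = np.array([[0.0, 0.5], [1.0, 0.25]], dtype=np.float32)
merged = composite(fg, alpha, bg)
```

If only already-composited images and their alphas are available, there is no separate F and B to recombine, so either the data has to be brought into this three-folder layout or the loader has to be changed.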
Thanks for this great work!
Any suggestions on how we can improve the performance of the model on real world images? I have seen that almost all of the image matting models perform pretty well on the data which is present in the example images, but don't give good results on actual real world data.
After modifying config.py for my own dataset, I get this error: FileNotFoundError: [Errno 2] No such file or directory: '/home/liyaoyi/dataset/coco_bg'
But I did modify the paths, and printing the config gives:
{'data': {'augmentation': True, 'crop_size': 512, 'random_interp': False, 'test_alpha': '/data/Adobe_Dataset/Test_set/comp/alpha', 'test_merged': '/data/Adobe_Dataset/Test_set/comp/image', 'test_trimap': '/data/Adobe_Dataset/Test_set/comp/trimap', 'train_alpha': '/data/Adobe_Dataset/Training_set/all/alpha', 'train_bg': '/data/Adobe_Dataset/background_all', 'train_fg': '/data/Adobe_Dataset/Training_set/all/fg', 'workers': 0}, 'dist': False, 'gpu': [0, 1], 'is_default': True, 'local_rank': 0, 'log': {'checkpoint_path': './checkpoints', 'checkpoint_step': 10000, 'logging_level': 'DEBUG', 'logging_path': './logs/stdout', 'logging_step': 10, 'tensorboard_image_step': 500, 'tensorboard_path': './logs/tensorboard', 'tensorboard_step': 100}, 'model': {'arch': {'decoder': 'res_shortcut_decoder_22', 'discriminator': None, 'encoder': 'res_shortcut_encoder_29'}, 'batch_size': 16, 'imagenet_pretrain': True, 'imagenet_pretrain_path': './pretrain/gca-dist.pth', 'trimap_channel': 3}, 'phase': 'train', 'test': {'alpha': None, 'alpha_path': None, 'batch_size': 1, 'checkpoint': 'best_model', 'cpu': False, 'fast_eval': True, 'merged': None, 'scale': 'origin', 'trimap': None}, 'train': {'G_lr': 0.001, 'beta1': 0.5, 'beta2': 0.999, 'clip_grad': True, 'comp_weight': 0, 'gabor_weight': 0, 'grad_weight': 0, 'rec_weight': 1, 'reset_lr': False, 'resume_checkpoint': None, 'smooth_l1_weight': 0, 'total_step': 100000, 'val_step': 1000, 'warmup_step': 5000}, 'version': 'baseline', 'world_size': 1}
{'data': {'augmentation': True, 'crop_size': 512, 'random_interp': False, 'test_alpha': '/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/alpha_copy', 'test_merged': '/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/merged', 'test_trimap': '/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/trimaps', 'train_alpha': '/home/liyaoyi/dataset/Adobe/train/alpha', 'train_bg': '/home/liyaoyi/dataset/coco_bg', 'train_fg': '/home/liyaoyi/dataset/Adobe/train/fg', 'workers': 4}, 'dist': True, 'gpu': [0, 1], 'is_default': False, 'local_rank': 0, 'log': {'checkpoint_path': './checkpoints', 'checkpoint_step': 2000, 'logging_level': 'INFO', 'logging_path': './logs/stdout', 'logging_step': 10, 'tensorboard_image_step': 2000, 'tensorboard_path': './logs/tensorboard', 'tensorboard_step': 100}, 'model': {'arch': {'decoder': 'res_gca_decoder_22', 'discriminator': None, 'encoder': 'resnet_gca_encoder_29'}, 'batch_size': 10, 'imagenet_pretrain': True, 'imagenet_pretrain_path': 'pretrain/model_best_resnet34_En_nomixup.pth', 'trimap_channel': 3}, 'phase': 'train', 'test': {'alpha': '/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/alpha_copy', 'alpha_path': 'prediction', 'batch_size': 1, 'checkpoint': 'gca-dist', 'cpu': False, 'fast_eval': True, 'merged': '/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/merged', 'scale': 'origin', 'trimap': '/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/trimaps'}, 'train': {'G_lr': 0.0004, 'beta1': 0.5, 'beta2': 0.999, 'clip_grad': True, 'comp_weight': 0, 'gabor_weight': 0, 'grad_weight': 0, 'rec_weight': 1, 'reset_lr': False, 'resume_checkpoint': None, 'smooth_l1_weight': 0, 'total_step': 200000, 'val_step': 2000, 'warmup_step': 5000}, 'version': 'gca-dist', 'world_size': 1}
The two printouts show different paths. What is going wrong?
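One plausible reading of the two printouts: the first (with 'is_default': True) holds the defaults edited in config.py, and the second (with 'is_default': False and 'version': 'gca-dist') holds the same structure after values from a config file were merged on top, so the file's paths win. The sketch below shows that kind of recursive override in general terms; merge_config is a hypothetical helper, not a function from the repository.

```python
def merge_config(defaults, overrides):
    """Return a copy of defaults with values from overrides applied recursively."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Descend into nested sections such as 'data' or 'train'.
            merged[key] = merge_config(merged[key], value)
        else:
            # Leaf values from the override replace the default.
            merged[key] = value
    return merged

# Paths taken from the two printouts above.
defaults = {"data": {"train_bg": "/data/Adobe_Dataset/background_all", "workers": 0}}
from_file = {"data": {"train_bg": "/home/liyaoyi/dataset/coco_bg"}}
effective = merge_config(defaults, from_file)
```

Under this assumption, the path that still points to /home/liyaoyi/... would have to be changed in the config file that is loaded at launch, not only in config.py.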
Thank you for the excellent work. I used your pretrained model gca-dist and tested it on the Composition-1k dataset, but I got the SAD and MSE below, which are a little different from the paper:
[03-12 15:15:46] INFO: TEST NUM: 1000
[03-12 15:15:46] INFO: MSE: 0.010313104958540355
[03-12 15:15:46] INFO: SAD: 37.83985428515623
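When comparing numbers like these with published ones, it helps to pin down the metric convention. The sketch below assumes a common Composition-1k convention: both metrics are restricted to the unknown region of the trimap (value 128), SAD is reported in thousands, and MSE is averaged over unknown pixels. These conventions and the function name sad_and_mse are assumptions, and the repository's evaluation code may differ in detail.

```python
import numpy as np

def sad_and_mse(pred, target, trimap):
    """Compute SAD (in thousands) and MSE over the unknown trimap region.

    pred, target: H x W alpha mattes in [0, 1]; trimap: H x W with 128 marking unknown pixels.
    """
    mask = (trimap == 128).astype(np.float32)
    diff = (pred.astype(np.float32) - target.astype(np.float32)) * mask
    sad = np.abs(diff).sum() / 1000.0
    # Guard against an empty unknown region.
    mse = (diff ** 2).sum() / max(mask.sum(), 1.0)
    return float(sad), float(mse)

# Toy example: a constant error of 0.5 over a fully unknown 10 x 10 trimap.
pred = np.zeros((10, 10), dtype=np.float32)
target = np.full((10, 10), 0.5, dtype=np.float32)
trimap = np.full((10, 10), 128, dtype=np.uint8)
sad, mse = sad_and_mse(pred, target, trimap)
```

Differences in rounding, resizing, or how the test images were composited can shift these values, so a small gap from a paper does not by itself indicate a bug.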
I tried to train a model on my own dataset, but training was interrupted.
How can I resume training?
Resuming returns an error:
RuntimeError: Caught RuntimeError in replica 0 on device 0.
I want to train a matting model on my own dataset. It will be used on high-resolution images.
Why is the output not clear? How can I train my model for use on high-resolution images?
By the way, why is the image size 640?
Thanks for your great job!
I have a question. I see that "gca-dist-all-data: Model of the GCA Matting trained on both Adobe Image Matting Dataset and the Composition-1K testing set for alphamatting.com online benchmark. Save to ./checkpoints/gca-dist-all-data/."
So you trained the model on the Composition-1K testing set rather than the training set?
Where is the ground truth of the testing set?
Hi, thanks for your reply. I have one more question, similar to lfxx's.
How do I produce the files in the test_merged directory ( test_merged = "/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/merged" )?
Could you share a shell script, as you did with copy_testing_alpha.sh?
Hoping for your kind reply! @Yaoyi-Li
Thanks for your awesome work. Now I want to use my own dataset to train the model. While preparing the dataset, I got confused about its structure. Reading your code, I found these paths:
train_fg = "/home/liyaoyi/dataset/Adobe/all/fg"
train_alpha = "/home/liyaoyi/dataset/Adobe/all/alpha"
train_bg = "/home/liyaoyi/dataset/coco_bg"
test_merged = "/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/merged"
test_alpha = "/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/alpha_copy"
test_trimap = "/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/trimaps"
What do these paths mean? Could you upload some demo pictures to make this clearer? Hoping for your kind reply! @Yaoyi-Li
Hi Yaoyi,
Thank you for your great work. I hope to discuss it with you at the AAAI-20 Technical Program in February, if possible!
One question: I am curious about the total time needed to train on the whole DIM training set. It seems your paper only reports the total number of iterations for the training phase.
Regards,
Mingfu Liang
Hi, thanks for sharing your work. Just wondering -- why do you normalize the alpha image but not the trimap when training?
Can you help compute the FLOPs of your network? I have tried third-party libraries like thop and ptflops to get the FLOPs, but they seem to ignore a lot of operations, such as spectral norm. Thanks.
closing because of sorted issue
Hi Yaoyi,
Thank you for your remarkable work!
I noticed the README says that 8GB is needed when testing on the Composition-1k dataset. However, when I run the tests on a GPU with 11GB of memory (RTX 2080 Ti), the program raises an out-of-memory error. Do I need to resize the images before running, or change the 'scale' option in the toml file? Thank you!
My testing dataset is generated by the Composition_code.py provided with the Adobe Image Matting dataset.
In the AIM test set, one fg has multiple trimaps, which is why I can't load the test dataloader: your code looks for alpha, fg, and trimaps with the same name. How could I fix this without renaming the trimaps?
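One way to avoid renaming files is to map each trimap name back to its shared alpha name when building the file list. The sketch below assumes a purely hypothetical naming scheme in which trimaps carry a numeric suffix such as name_3.png while the alpha is stored as name.png; the helper alpha_name_for_trimap is illustrative and would need to be adapted to the actual file names.

```python
import re

def alpha_name_for_trimap(trimap_name):
    """Strip a trailing '_<digits>' before the extension, e.g. 'person_3.png' -> 'person.png'.

    Names without such a suffix are returned unchanged.
    """
    return re.sub(r"_\d+(\.\w+)$", r"\1", trimap_name)

# Build (trimap, alpha) pairs for a few hypothetical file names.
trimaps = ["person_0.png", "person_1.png", "dog_12.png", "cat.png"]
pairs = [(name, alpha_name_for_trimap(name)) for name in trimaps]
```

A foreground whose own name ends in an underscore and digits would be mapped incorrectly by this pattern, so a lookup against the real alpha directory is a safer final check.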
Hi,
In config/gca-dist.toml, how should I set the params below if I use the dataset from the link?
train_fg = "/home/liyaoyi/dataset/Adobe/train/fg"
train_alpha = "/home/liyaoyi/dataset/Adobe/train/alpha"
train_bg = "/home/liyaoyi/dataset/coco_bg"
test_merged = "/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/merged"
test_alpha = "/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/alpha_copy"
test_trimap = "/home/liyaoyi/dataset/Adobe/Combined_Dataset/Test_set/trimaps"
If I use that dataset, how should I modify the config?
Thanks
If bg_num < batch_size, there is an error like the one in the title.
And it always returns the same image when bg_num = 1 and batch_size = 1.
Hi Yaoyi,
Thank you very much for this extremely powerful work!
I would like to reproduce your result using the provided pretrained model. I use your code Composition_code.py to generate the test set. After running your test.sh, however, I only get SAD 39.
The only clue is that it has warnings by libpng like:
libpng warning: iCCP: known incorrect sRGB profile
But I think this warning should not influence the matting result.
I note that you emphasize that the MATLAB evaluation code differs from your Python code. So what SAD should the Python evaluation give?
I am also wondering whether you included the 27 images (from the provided training set of alphamatting.com) in your training data, and what kind of trimaps you generated for these images during training?
Thanks so much!
closing because of sorted issue
Is it possible to use another encoder (with a ResNet-101 base, for example) with the GCA matting network? How would one pretrain (on ImageNet) another encoder that is suitable for your model?
In your README, you write "Default training requires 4 GPUs with 11GB memory, and the batch size is 10 for each GPU."
But I think it is 10 across all GPUs. I may be wrong, so I am asking you.
I tried to use the composition loss in my training.
I then found that the merged image's color differs from the input image.
Why is color jitter applied only to fg? Is that the reason for the color difference?
By the way, I found that regression_loss uses only the unknown area.
Will it work well if the composition loss is also computed on the unknown area?
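To make the question concrete: a composition loss recomposes an image from the predicted alpha together with a foreground and background, then compares it with the target image. The sketch below is a generic version restricted to the unknown region (trimap value 128); the function name, the mean-absolute-error form, and the masking are assumptions, not the repository's loss. It also shows why a foreground that was color-jittered after the target image was built leaves a nonzero loss even for a perfect alpha.

```python
import numpy as np

def composition_loss(alpha_pred, fg, bg, image, trimap):
    """Mean absolute error between the recomposed image and the target, over unknown pixels.

    alpha_pred: H x W in [0, 1]; fg, bg, image: H x W x 3; trimap: H x W with 128 = unknown.
    """
    mask = (trimap == 128).astype(np.float32)[:, :, None]
    a = alpha_pred.astype(np.float32)[:, :, None]
    recomposed = a * fg.astype(np.float32) + (1.0 - a) * bg.astype(np.float32)
    diff = np.abs(recomposed - image.astype(np.float32)) * mask
    # Normalize by the number of unknown pixels times the number of channels.
    return float(diff.sum() / max(mask.sum() * image.shape[2], 1.0))

# Toy data: the target image is composited from fg and bg with the true alpha.
fg = np.full((4, 4, 3), 200.0, dtype=np.float32)
bg = np.full((4, 4, 3), 100.0, dtype=np.float32)
alpha_true = np.full((4, 4), 0.5, dtype=np.float32)
trimap = np.full((4, 4), 128, dtype=np.uint8)
image = 0.5 * fg + 0.5 * bg

loss_consistent = composition_loss(alpha_true, fg, bg, image, trimap)
# A foreground changed after compositing no longer matches the target image.
loss_jittered = composition_loss(alpha_true, fg + 20.0, bg, image, trimap)
```

In this toy setup the consistent case gives zero loss, while the shifted foreground gives a nonzero loss for the same alpha.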
In the paper and code, are the original merged image and original image the same image?
Thanks for the great work and code :) Your implementation is really elegant and impressive. However, I found one part of the RandomCrop function a little confusing:
GCA-Matting/dataloader/data_generator.py
Line 296 in 92c40b0
Based on this line, I assume that you are sampling an unknown point as the center of the crop patch (given that you handle the border on all 4 sides).
However, the later code that uses the sampled unknown point seems to treat it as the top-left point of the crop patch:
GCA-Matting/dataloader/data_generator.py
Line 306 in 92c40b0
Am I misunderstanding something? Thanks!
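One way the two readings can be consistent, offered as an assumption since the referenced lines are not reproduced here: if unknown pixels are searched only inside a view trimmed by half the crop size on every side, then indices into that view are shifted by that margin. Using such an index directly as the top-left corner in full-image coordinates places the sampled unknown pixel at the crop center. The sketch below demonstrates the index arithmetic with a hypothetical helper sample_crop_top_left.

```python
import numpy as np

def sample_crop_top_left(trimap, crop_size, rng):
    """Pick a crop corner so that a sampled unknown pixel (value 128) sits at the crop center."""
    h, w = trimap.shape
    margin = crop_size // 2
    # Search only where a full crop centered on the pixel fits inside the image.
    inner = trimap[margin:h - margin, margin:w - margin]
    ys, xs = np.where(inner == 128)
    if len(ys) == 0:
        return None
    i = rng.integers(len(ys))
    # Indices are relative to the trimmed view: the pixel itself lies at
    # (ys[i] + margin, xs[i] + margin) in the full image, so the raw index
    # is exactly the top-left corner of a crop centered on that pixel.
    return int(ys[i]), int(xs[i])

# Toy trimap with a single unknown pixel at (4, 5).
trimap = np.zeros((8, 8), dtype=np.uint8)
trimap[4, 5] = 128
crop_size = 4
top, left = sample_crop_top_left(trimap, crop_size, np.random.default_rng(0))
crop = trimap[top:top + crop_size, left:left + crop_size]
```

Here the unknown pixel ends up at position (crop_size // 2, crop_size // 2) of the crop, even though the sampled index was used as a top-left corner.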
Hi, thanks for sharing this great research.
Can you kindly share a pre-trained model?
Hi,
Great work.
My question is: why do we only perform rotation augmentation on large fg images (width/height > 1024 pixels) during training?
https://github.com/Yaoyi-Li/GCA-Matting/blob/master/dataloader/data_generator.py
How much GPU memory do we need to train the model? @Yaoyi-Li