dong03 / mvss-net Goto Github PK

View Code? Open in Web Editor NEW

263.0 263.0 50.0 6.17 MB

code for Image Manipulation Detection by Multi-View Multi-Scale Supervision

Python 9.48% Jupyter Notebook 90.38% Shell 0.14%

mvss-net's People

Contributors

Stargazers

Watchers

mvss-net's Issues

training questions

Hello!
Thanks for your inspiring work. And I 'm very intersted in it，Among them, there is a little doubt in the training,How is batchsize set during training? After tampering and non-tampering images are shuffled in the dataloader, the loss function used by the images in batch will be different. My training setting, when the batchsize is 8, I don’t know how to calculate the loss，Is it to traverse or set batchsize to 1 or some other method?

Questions about the detail of ESB.

Hi,
Thanks for your great work. I read your paper and released code these days, and I have questions about the Edge-Supervised Branch.
In your paper, the predicted manipulation edge map, denoted as {Gedge(xi)}, obtained by transforming the output of the last ERB with a sigmoid layer.
But in your code, I didn't see the sigmoid layer, only the last ERB as the output.
if self.sobel:
res1 = self.erb_db_1(run_sobel(self.sobel_x1, self.sobel_y1, c1))
res1 = self.erb_trans_1(res1 + self.upsample(self.erb_db_2(run_sobel(self.sobel_x2, self.sobel_y2, c2))))
res1 = self.erb_trans_2(res1 + self.upsample_4(self.erb_db_3(run_sobel(self.sobel_x3, self.sobel_y3, c3))))
res1 = self.erb_trans_3(res1 + self.upsample_4(self.erb_db_4(run_sobel(self.sobel_x4, self.sobel_y4, c4))), relu=False)

    else:
        res1 = self.erb_db_1(c1)
        res1 = self.erb_trans_1(res1 + self.upsample(self.erb_db_2(c2)))
        res1 = self.erb_trans_2(res1 + self.upsample_4(self.erb_db_3(c3)))
        res1 = self.erb_trans_3(res1 + self.upsample_4(self.erb_db_4(c4)), relu=False)

    if self.constrain:
        x = rgb2gray(x)
        x = self.constrain_conv(x)
        constrain_features, _ = self.noise_extractor.base_forward(x)
        constrain_feature = constrain_features[-1]
        c4 = torch.cat([c4, constrain_feature], dim=1)

    outputs = []

    x = self.head(c4)
    x0 = F.interpolate(x[0], size, mode='bilinear', align_corners=True)
    outputs.append(x0)

    if self.aux:
        x1 = F.interpolate(x[1], size, mode='bilinear', align_corners=True)
        x2 = F.interpolate(x[2], size, mode='bilinear', align_corners=True)
        outputs.append(x1)
        outputs.append(x2)

    return res1, x0

I think 'res1' is the output, with no sigmoid layer.
Did I miss something? Could you help me? Thank you very much.
Best regards.

Questions about the Implementation and Time to release model.

Hello!
Thanks for your inspiring work. And I 'm very intersted in it.
Last two weeks, I read your great work and decided to re-Implement your network according to the paper.
However, after two weeks, the network is roughly finished, and I trained it in the CASIA_v2 dataset. But the performance is not so good. I thought there must be something wrong with my code. So I have a few questions here.

The red arrow in the Network structure fig means upsample by bilinear interpolate? It is a little confusing as the legend not mentioned it.
What is the epochs and scheduler of the training process? Following your paper, I set the epoch to 75, and [20, 35, 50] the learing rate will multiple 10^-1.
Questions about the loss. When it comes to authenic img, the loss only consists class-supervision, will this loss be multiplied by 0.04?
The network structure implemented by myself may contain other questions, so when will you release the model? I'm very excited and appreciated if you could opensource the model.
Thanks for your patience to read my questions. Sinerely hope that your research could comes to next level.

Ground truth edge maps for training

Dear authors,

Thanks for sharing the source code of MVSS-Net and it is really a great work!

I am actually wondering how the edge loss is calculated during training.
As CASIV v2 does not provide ground truth edge maps, did you extract the edge maps from the ground truth pixel maps on your own? If so, could you please share how you did it?

Thank you!

generalization problem

hi, cause there is no training code , i try pretrain model on my own dataset(data from real world may experience resample, resize and muti-compression) , however i have tried MVSS pretrained model trained CAISA and DEFACTO but results are really disappointed (worse than Mantra-net),so may be the model should trained on my dataset first? or i should change some predict parameters?

Pixel ground truth mask available?

Dear @dong03
Thank you for the code!
Is the pixel ground truth mask available for casia2, or we have to create it from the edge mask?

columbia groundtruth

Hello!May I ask whether you converted the groundtruth of Columbia dataset into a binary mask graph for training?After converting, I found that the conversion of tamper boundary was not thorough and the effect was not good enough.Can you share the groundtruth after your transformation?

Implement details of Sobel Layer?

Thanks for your work.
As the paper show, it consists four sublayer. But I wonder how the sobel result of x and y be fusioned together?

DOES THE CODE BELOW RIGHT?

def forward(self, x):
      sobel_x = F.conv2d(x, sobel_kernel_x, padding=1, bias=False)
      sobel_y = F.conv2d(y, sobel_kernel_y, padding=1, bias=False)
      sobel_rs = torch.pow(torch.pow(sobel_x, 2) + torch.pow(sobel_y, 2), 0.5)
      sobel_rs = F.normalize(x, p=2)
      sobel_rs = self.bn(sobel_rs)
      sobel_rs = F.sigmoid(sobel_rs)
      sobel_rs = x * sobel_rs

Questions about the Bayar Convolution(i.e. Constrainted Convolution) Layer in Noise Sensitive Branch.

Hi! Sorry to disturb you again.
I download the pretrain model from your link, and I found that the constraint conv layer only have one param and it's size is [1, 3, 24]. But I thought that it would be [3, 1, 5, 5] according to your figure.
Could you plz figure out where the wrong lied in?
Thanks sincerely.

15:32 Add
Does it mean: the center pixel of the conv kernel is always -1. And you implemented it by a array, which is further filled into the kernel matrix?

some questions about mvss-net, sincerely

Hello, highly appreciate your recent work achieved SOTA, but there is still something I want to figure out, the questions are listed as follows:

(1)In the SOTA comparison, I noticed you have used the CASIA V2 as the training datasets and other public datasets as the testing datasets, but what is the composition of the validation set?

(2)In the paper, the data online augmentation method was too short and brief, but the clear thing is that you simultaneously embrace the tampered images and non-tampered images. I wonder how to implement that "naive manipulations either by cropping and pasting a squared area", is it just cropping and pasting on the same image from the non-tampered datasets?

(3)Recently, another questions bother me greatly——since the previous work didn't take the non-tampered datasets into account, like GSR-Net, with this extra non-tampered datastes, mvss-net achieve greater progress(some metric even more that 10 percent points). However, is it just a fair comparison with the amount more than doubled?

Hope you fine, looking forward to your detailed answers!

about mvssnet++

Hello! Thanks for your inspiring work. Could you provide the model code for mvssnet++?

about the dataset

hello, i have just downloaded the casia dataset, however, the dataset don't have the mask files, i have tried to get the mask according to the pixel difference, but those are not very accurate. Could you please tell me how do you get the mask gt? Thanks a lot.

Is this work?

I am have my own small test image corpus. I am run this project. And... Result is around a ZERO! Casia or Defacto - no changes. May be MVSS work only with special cases? For example (original, fake, map, map on fake) and ignore text, please:

Could not find the TianChi Model

In the Readme it said:
Update: 22.02.17, Pretrained model for Real-World Image Foregery Localization Challange
But I could not find the model anywhere, could you please help me on that? Thank you!

columbia dataset

It seems like the names of the images of the columbia dataset are different from the version downloaded from here https://www.ee.columbia.edu/ln/dvmm/downloads/AuthSplicedDataSet/AuthSplicedDataSet.htm
Is this the right version of the dataset?

about the datasets

Hello!
Thanks for your inspiring work. And I 'm very intersted in it.
Since the DEFACTO dataset can't be download, could you share the DEFACTO dataset?

Have you ever considered releasing your train code? Want to train my own dataset to check the model

Cannot to reach mvssnet model link

your baidu link is not working can you give google drive link

image

MVSS-Net++ Implementation

Is the code for MVSS-net++ or the training code going to be available? Specifically there seems to be a non-trainable layer between the ReLU activation and the 1x1 Conv in the ConvGEM module based on the checkpoint you shared. Could you maybe share what this layer is?
Thank you!

questions about pixel-f1

Hi,
Thank you very much for your work. I read your paper and tested your code (casia1.0) these days, but there are some doubts about the pixel F1 score
As shown in the following image,In the case of fixed threshold , the image F1 and pixel f1 are consistent, but I want to get the best threshold test results,What can I do to get it? thanks

MVSS-Net++ Pretrained Weight

Hello, thank you for your works

can i ask for the pretrained weight of mvss-net++ to be uploaded in google drive too?

Thank you in advance

training

Hello! Thanks for your inspiring work. Could you provide the training code?

NIST16 dataset

Thank you for your very meaningful work. I have learned a lot. Could you please share the processed NIST16 data set? Some pictures of the NIST16 data set I downloaded were damaged.

WRONG metrics implementation when calculating the image level F1 score!

There is a severe error when calculating the image level F1 score. This question can be exactly found here:

MVSS-Net/common/utils.py

Lines 44 to 45 in cc2aed7

 spe = true_neg / (true_neg + false_pos + 1e-6) 

 f1 = 2 * sen * spe / (sen + spe)

The correct metric would be F1 = 2 * true_pos / (2 * true_pos + false_pos + false_neg + eps), where eps is a numerical stability factor.

Or calculate with the Recall in the coming formula instead of the Specificity:

And your implementation for pixel level F1 score is correct, you could refer to this part of the codes:

MVSS-Net/common/utils.py

Lines 57 to 59 in cc2aed7

 f1 = 2 * true_pos / (2 * true_pos + false_pos + false_neg + 1e-6) 

 precision = true_pos / (true_pos + false_pos + 1e-6) 

 recall = true_pos / (true_pos + false_neg + 1e-6)

I hope you can recalculate and revise the metrics of the image level F1 score results, and at least publish a patch Table of metrics on GitHub, otherwise, it will be very unfair for future research, as this calculation will bring higher results to your F1 score values.

To create a better research environment, we hope that you can value research integrity! Thank you very much for your inspiring work!!

question on :mvssnetplus casia.pt was used to inference

Traceback (most recent call last):
File "D:/PyCharmProject/MVSS-Net-main/inference.py", line 53, in
model.load_state_dict(checkpoint, strict=True)
File "D:\Anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1051, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for MVSSNet:
Missing key(s) in state_dict: "model.conv1.weight", "model.bn1.weight", "model.bn1.bias", "model.bn1.running_mean", "model.bn1.running_var", "model.layer1.0.conv1.weight", "model.layer1.0.bn1.weight", "model.layer1.0.bn1.bias", "model.layer1.0.bn1.running_mean", "model.layer1.0.bn1.running_var", "model.layer1.0.conv2.weight", "model.layer1.0.bn2.weight", "

questions about training in the paper

Hi, @dong03
Thanks for your nice work!
After reading your paper, I still have some questions:

You see 'We re-train FCN (Seg) and MVSS-Net(full setup) from scratch on CASIAv2.', do you pre-train ResFCN first and then use this model to init your MVSS-Net? or just train these two models independently?
What's the valid set when training?

Questions about model training process in TIANCHI

Hi, @dong03
sorry to disturb you, I find that MVSSNet is baseline model in TIANCHI challenge round1.
I wonder if checkpoint provided is only training on this dataset.
If so, there are all fake images, how you train model in this condition.
By the way, when will MVSSNet's training script is available? (open source when TIANCHI challenge round is over?)

To reproduce this repo, we need to generate edge map first to train edge_loss, but even if we have know cv2.findContours is used, it's still hard to reproduce.
I use default setting in opencv functions, but the edge extraction result is so bad. The normal process of edge extract as follow:

bgr2gray
generate binary image, use cv2.threshold to do this.
cv2.findContours also has some setting.

Considering above these setting unknown, it is hard to reproduce your result.
Can you share some code only about edge map extraction process?
I don't think the disclosure of these pre-processing codes will directly affect your benefit.

wishing for your reply and thanks again.

About the optimal threshold

From this picture, most models have relatively clear judgments on the tampered area in most cases. Why can the F1 scores of most models be doubled or even tripled only by adjusting the optimal threshold? Effect of the threshold seem excessive?

issue in loading new mvssnetplus_casia.pth weights

Hi,
Im trying to load mvssnetplus_casia.pth in demo.py but model.load_state_dict(checkpoint['model_dict'],strict=True), throws Unexpected Keys in state dict (below)
Do i need to ignore these or are there any changes in model for mvssnetplus?,
Thank you:

The images coming from the corel dataset that are included in CASIAv1plus.txt have different names with the corel images downloaded from here https://sites.google.com/site/dctresearch/Home/content-based-image-retrieval (this link is included in the data/README.md)

How to calculate edge loss, dataset has relevent annotation?

edge feature map down to (w/4, h/4) for dice loss calculating,

however, how can i get the relevent edge groud truth for loss deducting?

it is not mentioned, by edge search or already given by dataset?

	spe = true_neg / (true_neg + false_pos + 1e-6)
	f1 = 2 * sen * spe / (sen + spe)

	f1 = 2 * true_pos / (2 * true_pos + false_pos + false_neg + 1e-6)
	precision = true_pos / (true_pos + false_pos + 1e-6)
	recall = true_pos / (true_pos + false_neg + 1e-6)

dong03 / mvss-net Goto Github PK

mvss-net's People

Contributors

Stargazers

Watchers

Forkers

mvss-net's Issues

Recommend Projects

Recommend Topics

Recommend Org

Jobs