
endlesssora / deeperforensics-1.0

[CVPR 2020] A Large-Scale Dataset for Real-World Face Forgery Detection

Python 95.90% Shell 4.10%
benchmark cvpr2020 dataset deepfakes face-forensics face-forgery-detection face-manipulation method perturbations real-world videos


deeperforensics-1.0's Issues

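Looking forward to it

When will the dataset actually be released? If the download opens during the Chinese New Year holiday, I won't be able to enjoy the holiday in peace... haha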

Script to download them all

Dear DeeperForensics authors,
great work! Thank you so much. To download the whole dataset automatically, I wrote a bash script using gdown. It worked at first, but after a while it started failing, for large files only, with the following message:

Access denied with the following error:

        Too many users have viewed or downloaded this file recently. Please
        try accessing the file again later. If the file you are trying to
        access is particularly large or is shared with many people, it may
        take up to 24 hours to be able to view or download the file. If you
        still can't access a file after 24 hours, contact your domain
        administrator. 

You may still be able to access the file from the browser:

Update: I realized that posting the download script here is not appropriate. @EndlessSora, let me know if you would like me to share the script with you privately, so that you can provide it to people who have access to the dataset.

Dataset split for the standard set and training

Hi, thank you for your work.
According to your paper, the standard set only includes 1k YouTube videos and 1k manipulated videos (end_to_end), right?
And if one wants to train a model on "std+std/sing", one needs to apply the same perturbations to the real videos (1k from FF++ and the 100 actors' videos), since the provided real videos have no perturbations, right?

Looking forward to your reply

Another question about dataset split

Thanks for your interest in our work.

  1. Yes. The standard set only includes 1k YouTube videos and 1k manipulated videos (end_to_end).
  2. Almost correct. Perturbations with a similar distribution should be applied to both the real videos (the 1k YouTube videos from FF++; the 100 actors' videos do not need them, since they are source videos used for face manipulation) and the fake videos if one would like to train a model on "std+std/sing".

Originally posted by @EndlessSora in #10 (comment)
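
One way to read the answer above is that every real and every fake video gets its perturbation drawn from the same distribution. The following is a minimal sketch under that reading. The perturbation names, the level values, and the function `assign_perturbations` are hypothetical placeholders for illustration, not the dataset's official configuration or tooling.

```python
import random
from collections import Counter

# Hypothetical perturbation names and levels, used only for illustration.
PERTURBATIONS = ["blur", "noise", "jpeg", "contrast"]
LEVELS = [1, 2, 3, 4, 5]


def assign_perturbations(video_ids, seed, level=None):
    """Draw one (perturbation, level) pair per video from a shared distribution.

    If `level` is given, every video uses that fixed level.
    Otherwise the level is drawn uniformly from LEVELS.
    """
    rng = random.Random(seed)
    plan = {}
    for vid in sorted(video_ids):
        kind = rng.choice(PERTURBATIONS)
        lvl = level if level is not None else rng.choice(LEVELS)
        plan[vid] = (kind, lvl)
    return plan


real_ids = [f"real_{i:03d}" for i in range(1000)]
fake_ids = [f"fake_{i:03d}" for i in range(1000)]

# Different seeds give independent draws, but the distribution is the same.
real_plan = assign_perturbations(real_ids, seed=0)
fake_plan = assign_perturbations(fake_ids, seed=1)

# Compare how often each perturbation type was assigned to each class.
print(Counter(kind for kind, _ in real_plan.values()))
print(Counter(kind for kind, _ in fake_plan.values()))
```

Comparing the two histograms is a quick check that neither class ends up with a perturbation type or level the other class never sees, which would otherwise let a detector separate real from fake by perturbation alone.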

About std/x dataset in your experiments in the paper

Hi!
Thank you for your wonderful work.
I noticed that you used different data settings in your experiments: std/sing, std/rand, and std/mix. I am unsure whether you added the same perturbations to the original video data as you did to the manipulated data in these experiments.

About DF-VAE

Thank you for your work!

Do you intend to release the relevant code and training scripts about the DF-VAE?

Questions of benchmark

As the title says, I am having difficulty reproducing the results of the XceptionNet baseline.
I hope you can share some non-private details of your experiments, if you still remember them, or point out the errors in my own process.

Thank you anyway.

Our full process is as follows:

  1. Use a face detection method (MTCNN) on all frames of the FF++_C23 videos to get the original face bounding boxes. 【Boxes only from FF++_c23】

  2. Enlarge each bounding box with scale = 1.3 (also trying to keep it a rectangular box). Then use the boxes to extract faces from both the FF++_C23 videos and the corresponding DF1.0 end_to_end fake videos. 【1.3 faces from both】

  2.5. This gives two big folders, each with 1,000 sub-folders of images (1000 + 1000 == 2000).

  3. XceptionNet is trained on the two folders (train:val:test is about 7:1:2, so about 0.7 x 2000 == 1400 sub-folders for training), and each video/sub-folder yields 270 frames at regular intervals (like frame_0, frame_2, ..., frame_538, if the total number of frames is larger than 540). 【270 frames from each video】

  4. The parameters for XceptionNet are:
    4.1) batch_size = 32, epochs = 40
    4.2) optimizer_ft = optim.Adam(model.parameters(), lr=0.0002) # other arguments left at defaults
    4.3) exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=2, gamma=0.9)
    4.4) Validation is run after each epoch of training

  5. Testing is done with all the images of the test sub-folders (about 0.2 x 2000 == 400).

  6. When testing on another dataset, like end_to_end_level_1, the test set is built the same way (about 0.2 x 2000 == 400 sub-folders).
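
The preprocessing and schedule described in the steps above can be sketched in plain Python to make the arithmetic easy to check. This is only one reading of the description, not the authors' pipeline: the helper names are hypothetical, `enlarge_box` assumes the crop is made square around the box center, `sample_frame_indices` assumes an integer stride capped at 270 frames, and `step_lr` reproduces the closed form of a step decay that advances once per epoch.

```python
def enlarge_box(x1, y1, x2, y2, scale, img_w, img_h):
    """Grow a face box about its center into a square crop.

    The side is scale * max(width, height); the result is clipped to the image.
    The square shape is an assumption made for this sketch.
    """
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half = scale * max(x2 - x1, y2 - y1) / 2.0
    nx1 = max(0, int(round(cx - half)))
    ny1 = max(0, int(round(cy - half)))
    nx2 = min(img_w, int(round(cx + half)))
    ny2 = min(img_h, int(round(cy + half)))
    return nx1, ny1, nx2, ny2


def sample_frame_indices(total_frames, num_samples=270):
    """Pick up to num_samples frame indices at a regular integer stride."""
    if total_frames <= 0:
        return []
    stride = max(1, total_frames // num_samples)
    return list(range(0, total_frames, stride))[:num_samples]


def step_lr(base_lr, epoch, step_size=2, gamma=0.9):
    """Learning rate at a given epoch for a step decay stepped once per epoch."""
    return base_lr * gamma ** (epoch // step_size)


# Example values chosen for illustration only.
print(enlarge_box(100, 100, 200, 220, 1.3, 1000, 1000))
indices = sample_frame_indices(540)
print(indices[0], indices[1], indices[-1], len(indices))
for epoch in (0, 2, 10, 39):
    print(epoch, step_lr(0.0002, epoch))
```

With 540 frames the sampler returns the even indices from 0 to 538, matching the frame_0, frame_2, ..., frame_538 pattern in step 3. Printing the learning rate at a few epochs shows how far the rate has decayed by the end of the 40 epochs.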

Unable to reproduce the experimental results of the paper

Can you provide the training log corresponding to the experiments in the paper?

I applied the same distortions to the source videos of FaceForensics++ and used this dataset to train the face detection model. The model converges quickly during training, but its performance on the hidden dataset is very poor. Do you know what the problem is?

In addition, this is my training log. Thank you very much!

2020-09-22 10:27:08,954 - INFO: Epoch:0 || Iter:0/549 || Loss:0.69330(0.69330) || Accuracy:0.56250(0.56250)
2020-09-22 10:27:13,801 - INFO: Epoch:0 || Iter:10/549 || Loss:0.33249(0.54624) || Accuracy:0.83594(0.72301)
2020-09-22 10:27:18,548 - INFO: Epoch:0 || Iter:20/549 || Loss:0.10217(0.38276) || Accuracy:0.96875(0.81659)
2020-09-22 10:27:23,361 - INFO: Epoch:0 || Iter:30/549 || Loss:0.09741(0.29400) || Accuracy:0.93750(0.86139)
2020-09-22 10:27:28,143 - INFO: Epoch:0 || Iter:40/549 || Loss:0.09310(0.24562) || Accuracy:0.96094(0.88472)
2020-09-22 10:27:32,883 - INFO: Epoch:0 || Iter:50/549 || Loss:0.05807(0.20966) || Accuracy:0.98438(0.90227)
2020-09-22 10:27:37,603 - INFO: Epoch:0 || Iter:60/549 || Loss:0.07660(0.18957) || Accuracy:0.98438(0.91342)
2020-09-22 10:27:42,391 - INFO: Epoch:0 || Iter:70/549 || Loss:0.05925(0.17513) || Accuracy:0.96875(0.92066)
2020-09-22 10:27:47,103 - INFO: Epoch:0 || Iter:80/549 || Loss:0.07028(0.16092) || Accuracy:0.96875(0.92824)
2020-09-22 10:27:51,832 - INFO: Epoch:0 || Iter:90/549 || Loss:0.06247(0.14881) || Accuracy:0.96094(0.93389)
2020-09-22 10:27:56,724 - INFO: Epoch:0 || Iter:100/549 || Loss:0.09728(0.13896) || Accuracy:0.96875(0.93858)
2020-09-22 10:28:01,467 - INFO: Epoch:0 || Iter:110/549 || Loss:0.04423(0.13061) || Accuracy:0.97656(0.94264)
2020-09-22 10:28:06,290 - INFO: Epoch:0 || Iter:120/549 || Loss:0.09134(0.12317) || Accuracy:0.96875(0.94602)
2020-09-22 10:28:11,023 - INFO: Epoch:0 || Iter:130/549 || Loss:0.02772(0.11808) || Accuracy:0.99219(0.94853)
2020-09-22 10:28:15,779 - INFO: Epoch:0 || Iter:140/549 || Loss:0.02995(0.11276) || Accuracy:0.98438(0.95063)
2020-09-22 10:28:20,518 - INFO: Epoch:0 || Iter:150/549 || Loss:0.02055(0.10843) || Accuracy:1.00000(0.95266)
2020-09-22 10:28:25,399 - INFO: Epoch:0 || Iter:160/549 || Loss:0.04992(0.10441) || Accuracy:0.96094(0.95414)
2020-09-22 10:28:30,255 - INFO: Epoch:0 || Iter:170/549 || Loss:0.02497(0.10071) || Accuracy:0.99219(0.95587)
2020-09-22 10:28:35,166 - INFO: Epoch:0 || Iter:180/549 || Loss:0.03729(0.09727) || Accuracy:0.98438(0.95740)
2020-09-22 10:28:39,957 - INFO: Epoch:0 || Iter:190/549 || Loss:0.03673(0.09374) || Accuracy:0.97656(0.95877)
2020-09-22 10:28:44,687 - INFO: Epoch:0 || Iter:200/549 || Loss:0.03946(0.09064) || Accuracy:0.99219(0.96028)
2020-09-22 10:28:49,426 - INFO: Epoch:0 || Iter:210/549 || Loss:0.02468(0.08788) || Accuracy:0.98438(0.96131)
2020-09-22 10:28:54,239 - INFO: Epoch:0 || Iter:220/549 || Loss:0.04746(0.08512) || Accuracy:0.98438(0.96249)
2020-09-22 10:28:58,963 - INFO: Epoch:0 || Iter:230/549 || Loss:0.03039(0.08289) || Accuracy:0.98438(0.96341)
2020-09-22 10:29:03,685 - INFO: Epoch:0 || Iter:240/549 || Loss:0.08809(0.08134) || Accuracy:0.95312(0.96398)
2020-09-22 10:29:08,470 - INFO: Epoch:0 || Iter:250/549 || Loss:0.02432(0.07950) || Accuracy:0.97656(0.96473)
2020-09-22 10:29:13,233 - INFO: Epoch:0 || Iter:260/549 || Loss:0.02534(0.07781) || Accuracy:1.00000(0.96558)
2020-09-22 10:29:18,048 - INFO: Epoch:0 || Iter:270/549 || Loss:0.03035(0.07645) || Accuracy:0.97656(0.96616)
2020-09-22 10:29:22,891 - INFO: Epoch:0 || Iter:280/549 || Loss:0.01610(0.07478) || Accuracy:0.99219(0.96694)
2020-09-22 10:29:27,696 - INFO: Epoch:0 || Iter:290/549 || Loss:0.02178(0.07328) || Accuracy:0.98438(0.96770)
2020-09-22 10:29:32,495 - INFO: Epoch:0 || Iter:300/549 || Loss:0.01254(0.07157) || Accuracy:1.00000(0.96846)
2020-09-22 10:29:37,226 - INFO: Epoch:0 || Iter:310/549 || Loss:0.01840(0.06979) || Accuracy:0.99219(0.96928)
2020-09-22 10:29:41,951 - INFO: Epoch:0 || Iter:320/549 || Loss:0.02171(0.06842) || Accuracy:0.98438(0.96982)
2020-09-22 10:29:46,696 - INFO: Epoch:0 || Iter:330/549 || Loss:0.00467(0.06701) || Accuracy:1.00000(0.97035)
2020-09-22 10:29:51,466 - INFO: Epoch:0 || Iter:340/549 || Loss:0.01416(0.06609) || Accuracy:1.00000(0.97063)
2020-09-22 10:29:56,222 - INFO: Epoch:0 || Iter:350/549 || Loss:0.01028(0.06513) || Accuracy:1.00000(0.97106)
2020-09-22 10:30:00,983 - INFO: Epoch:0 || Iter:360/549 || Loss:0.02263(0.06392) || Accuracy:0.99219(0.97156)
2020-09-22 10:30:05,720 - INFO: Epoch:0 || Iter:370/549 || Loss:0.03179(0.06285) || Accuracy:0.97656(0.97197)
2020-09-22 10:30:10,445 - INFO: Epoch:0 || Iter:380/549 || Loss:0.03527(0.06230) || Accuracy:0.98438(0.97240)
2020-09-22 10:30:15,216 - INFO: Epoch:0 || Iter:390/549 || Loss:0.00949(0.06134) || Accuracy:1.00000(0.97279)
2020-09-22 10:30:20,041 - INFO: Epoch:0 || Iter:400/549 || Loss:0.05724(0.06046) || Accuracy:0.97656(0.97317)
2020-09-22 10:30:24,888 - INFO: Epoch:0 || Iter:410/549 || Loss:0.00370(0.05961) || Accuracy:1.00000(0.97354)
2020-09-22 10:30:29,716 - INFO: Epoch:0 || Iter:420/549 || Loss:0.04780(0.05885) || Accuracy:0.96875(0.97391)
2020-09-22 10:30:34,467 - INFO: Epoch:0 || Iter:430/549 || Loss:0.04402(0.05810) || Accuracy:0.96875(0.97419)
2020-09-22 10:30:39,184 - INFO: Epoch:0 || Iter:440/549 || Loss:0.05830(0.05733) || Accuracy:0.98438(0.97456)
2020-09-22 10:30:43,892 - INFO: Epoch:0 || Iter:450/549 || Loss:0.02611(0.05658) || Accuracy:0.97656(0.97490)
2020-09-22 10:30:48,628 - INFO: Epoch:0 || Iter:460/549 || Loss:0.02152(0.05582) || Accuracy:0.98438(0.97519)
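
A log in this format is easier to inspect once it is parsed into records. The sketch below is a hypothetical helper, not part of the repository. It assumes the value in parentheses after each metric is a running average, which the log format suggests but does not state. The pattern tolerates line breaks between fields.

```python
import re

# Matches one log entry; whitespace around "||" may include line breaks.
LOG_PATTERN = re.compile(
    r"Epoch:(?P<epoch>\d+)\s*\|\|\s*"
    r"Iter:(?P<it>\d+)/(?P<total>\d+)\s*\|\|\s*"
    r"Loss:(?P<loss>[\d.]+)\((?P<loss_avg>[\d.]+)\)\s*\|\|\s*"
    r"Accuracy:(?P<acc>[\d.]+)\((?P<acc_avg>[\d.]+)\)"
)


def parse_log(text):
    """Return one dict per log entry found in the text."""
    records = []
    for m in LOG_PATTERN.finditer(text):
        records.append({
            "epoch": int(m["epoch"]),
            "iter": int(m["it"]),
            "total_iters": int(m["total"]),
            "loss": float(m["loss"]),
            "loss_avg": float(m["loss_avg"]),  # assumed running average
            "acc": float(m["acc"]),
            "acc_avg": float(m["acc_avg"]),    # assumed running average
        })
    return records


# Two entries copied from the log above.
sample = (
    "2020-09-22 10:27:08,954 - INFO: Epoch:0 || Iter:0/549 || "
    "Loss:0.69330(0.69330) || Accuracy:0.56250(0.56250)\n"
    "2020-09-22 10:27:13,801 - INFO: Epoch:0 || Iter:10/549 || "
    "Loss:0.33249(0.54624) || Accuracy:0.83594(0.72301)\n"
)

records = parse_log(sample)
for r in records:
    print(r["epoch"], r["iter"], r["loss"], r["acc"])
```

Parsed records make it straightforward to plot training accuracy against validation or hidden-set accuracy, which is the comparison this issue is about.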

About training set and test set

Do you use the entire DeeperForensics_1.0\source_videos as the training set? If so, when generating the swapped dataset, the model has already been trained on all source images, so the whole model is not subject-agnostic. Is that true?
