fnd-bootstrap's Introduction

Bootstrapping Your Own Representations for Fake News Detection

This repo is built upon "Masked Autoencoders: A PyTorch Implementation"

  • Data Preparation. Prepare the data with the scripts in ./data_prepare. Only Weibo, Weibo-21, and GossipCop are supported so far, and the data must be obtained from exactly the following sources. Weibo/Weibo-21: email Dr Qiong Nan. GossipCop: email the original authors of GossipCop (apologies to Dr Singhal for my earlier wrong direction and the confusion and bother it caused). In our experience, they will kindly help.

  • After you process the data, run the .sh scripts for training or testing.

  • We also provide an alternative network design in ./models/UAMFDv2_Net.py. The differences are minor: 1) ELU is replaced with SimpleGate, where the tensor is split into two halves and the second half reweights the first, which also ensures non-linearity. 2) AdaIN controls the mean and std of the refined representations, replacing the original reweighting MLPs. To implement exactly the network design reported in the paper, use UAMFD_Net instead of UAMFDv2_Net, though the latter performs slightly better in our later tests.
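The two changes in UAMFDv2_Net can be illustrated with a minimal sketch. It is written in plain Python rather than PyTorch so that it runs without dependencies, and it is not the code from ./models/UAMFDv2_Net.py: the function names `simple_gate` and `adain`, and the choice to pass the target mean and std as plain numbers, are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def simple_gate(features: Sequence[float]) -> List[float]:
    """Split a feature vector into two halves; the second half reweights the first."""
    if len(features) % 2 != 0:
        raise ValueError("feature length must be even")
    half = len(features) // 2
    first, second = features[:half], features[half:]
    # The element-wise product of the two halves is what makes the gate non-linear.
    return [a * b for a, b in zip(first, second)]


def adain(features: Sequence[float], target_mean: float, target_std: float,
          eps: float = 1e-5) -> List[float]:
    """Normalize a vector, then rescale it to the given mean and std."""
    n = len(features)
    mean = sum(features) / n
    var = sum((v - mean) ** 2 for v in features) / n
    std = math.sqrt(var + eps)  # eps guards against division by zero
    return [target_std * (v - mean) / std + target_mean for v in features]


# Toy input: 8 features become 4 after gating.
gated = simple_gate([1.0, 2.0, 3.0, 4.0, 0.5, -1.0, 0.0, 2.0])
# Hypothetical target statistics; in a network these would be predicted.
restyled = adain(gated, target_mean=1.0, target_std=2.0)
```

Note that the gate halves the feature dimension, so any layer that follows it must expect half as many features as went in.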

Pre-training

The pre-trained MAE models can be downloaded from "Masked Autoencoders: A PyTorch Implementation".

Because of upload size restrictions, we are unable to upload the pretrained models and the processed data. We will open-source them on GitHub after the anonymous review process.

License

We have been granted permission to use the Weibo/Weibo-21/GossipCop datasets for academic studies only.

fnd-bootstrap's Issues

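About the Twitter dataset

Thank you for your work! I have a few questions about the Twitter dataset. First, I do not know how the files at the following two paths in Twitter_dataset.py were obtained:
train_path = root_path + '/train_twitters.xlsx'
test_path = root_path+'/test_datasets.xlsx'

Second, the same applies to the "Data/twitter-merged/" files in process_data_twitter.py and the file in path = "/home/groupshare/mae-main/ztrain_en.xlsx".

Finally, could you upload the Twitter files under root_path directly? I expect they contain everything I need. Thank you very much!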

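Data labels

Hello, I would like to ask how the ground-truth labels for the single views are determined in this paper: by the threshold mentioned in the paper, or by the label of the whole news item?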

MAE pre-train

Is MAE pre-trained separately, outside the model, with mae_pretrain_vit_base.pth then loaded into the model?
Was MAE pre-trained on the datasets used in the experiments?

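About the validation and test sets

Hello, and thank you for your excellent work! I noticed that you split the dataset into a training set and a test set, and that validation loads the test set directly, with no separate held-out test set. How were the results in the paper obtained? Are they the average of the highest accuracy over several training runs?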

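About iMMoE

1. When iMMoE is used to refine the feature ris, eis0 and eis1 are generated. What is the difference between eis0 and eis1? Are they output by two different gates when passing through iMMoE? If so, on what basis or criterion does each go through a different gate?
2. When iMMoE refines the fused features [eis1, et1], and when the bootstrapping stage guides the refinement of the multi-view representations [wis; wip; wm; wx; wt], do the features only need to pass through a single gate network?
3. How do the roles of the three expert networks in iMMoE differ?
I hope to get your answer. Thank you very much.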
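For readers unfamiliar with the terms in these questions, a generic multi-gate mixture-of-experts (MMoE) layer can be sketched as follows. This shows the general technique only, in which every gate produces its own softmax-weighted mix of the same shared experts. It is not the repository's iMMoE and does not answer the questions above; the function `mmoe` and the toy experts and gates are hypothetical stand-ins for learned layers.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mmoe(x: Vector,
         experts: Sequence[Callable[[Vector], Vector]],
         gates: Sequence[Callable[[Vector], Sequence[float]]]) -> List[Vector]:
    """Return one output per gate: a softmax-weighted sum of all expert outputs."""
    expert_outs = [expert(x) for expert in experts]  # experts are shared by all gates
    dim = len(expert_outs[0])
    outputs = []
    for gate in gates:
        weights = softmax(gate(x))  # one weight per expert
        mixed = [sum(w * out[i] for w, out in zip(weights, expert_outs))
                 for i in range(dim)]
        outputs.append(mixed)
    return outputs


# Three toy experts and two toy gates (hypothetical, fixed instead of learned).
experts = [
    lambda v: [2.0 * a for a in v],
    lambda v: [a + 1.0 for a in v],
    lambda v: [-a for a in v],
]
gates = [
    lambda v: [0.0, 0.0, 0.0],    # equal scores: uniform weights over experts
    lambda v: [100.0, 0.0, 0.0],  # almost all weight on the first expert
]
out0, out1 = mmoe([1.0, 2.0], experts, gates)
```

In this generic form, two outputs from the same input differ only because their gates weight the shared experts differently.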

Datasets

Could you provide the pre-trained models and the processed datasets?

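GossipCop dataset

How do you handle text length in the GossipCop data? BERT cannot process texts that long. Did you truncate the text to a fixed length, or split it into multiple segments? Thank you very much for your answer.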
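The two options named in this question, fixed-length truncation and splitting into segments, can be sketched as below. This does not show which one the repository uses. The helpers `truncate` and `chunk`, the overlap via `stride`, and the toy token list are assumptions for illustration; in practice `max_len` would have to respect the encoder's input limit (512 tokens for standard BERT checkpoints).

```python
from typing import List


def truncate(tokens: List[str], max_len: int) -> List[str]:
    """Option 1: keep only the first max_len tokens and drop the rest."""
    return tokens[:max_len]


def chunk(tokens: List[str], max_len: int, stride: int) -> List[List[str]]:
    """Option 2: split into windows of at most max_len tokens, advancing by stride."""
    if max_len <= 0 or stride <= 0:
        raise ValueError("max_len and stride must be positive")
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # the current window already reaches the end of the text
        start += stride
    return chunks


# Toy example with tiny limits so the behavior is easy to inspect.
tokens = [f"tok{i}" for i in range(10)]
head = truncate(tokens, max_len=4)
windows = chunk(tokens, max_len=4, stride=3)  # stride < max_len gives overlap
```

Truncation is simpler but discards everything after the limit. Chunking keeps the whole text, but the per-segment outputs must then be combined, for example by averaging.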
