pyupya / pre-trained-vl-model

This project forked from jaeyun95/pre-trained-vlk-model

pre-trained vision and language model summary

Pretrained model summary

pretrained language model
pretrained image model
pretrained video model
pretrained image and language model
pretrained video and language model
pretrained knowledge and language model

pretrained language model

title	paper link	code link
Improving Language Understanding by Generative Pre-Training	[paper]	[code(pytorch)]
ELMo: Deep contextualized word representations	[paper]	[code(tensorflow)]
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding	[paper]	[code(tensorflow)][code(pytorch)]
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations	[paper]	[code(tensorflow)][code(pytorch)]
RoBERTa: A Robustly Optimized BERT Pretraining Approach	[paper]	[code(pytorch)]
Language Models are Unsupervised Multitask Learners	[paper]	[code(tensorflow)]
Language Models are Few-Shot Learners	[paper]	[code]
XLNet: Generalized Autoregressive Pretraining for Language Understanding	[paper]	[code(tensorflow)]

pretrained image model

title	paper link	code link
Identity Mappings in Deep Residual Networks	[paper]	[code]
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks	[paper]	[code(pytorch)]
Mask R-CNN	[paper]	[code(tensorflow)][code(pytorch)]
You Only Look Once: Unified, Real-Time Object Detection	[paper]	[code(tensorflow)]
YOLOv3: An Incremental Improvement	[paper]	[code(tensorflow)][code(pytorch)]
YOLOv4: Optimal Speed and Accuracy of Object Detection	[paper]	[code(tensorflow)]

pretrained video model

title	paper link	code link
Looking Fast and Slow: Memory-Guided Mobile Video Object Detection	[paper]	[code(tensorflow)]
Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection	[paper]	[code(tensorflow)]
Optimizing Video Object Detection via a Scale-Time Lattice	[paper]	[code(pytorch)]
Mobile Video Object Detection with Temporally-Aware Feature Maps	[paper]	[code(pytorch)]
X3D: Expanding Architectures for Efficient Video Recognition	[paper]	[code(pytorch)]

pretrained image and language model

summary table

paper and code

title	paper link	code link
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks	[paper]	[code(pytorch)]
12-in-1: Multi-Task Vision and Language Representation Learning	[paper]	[code(pytorch)]
LXMERT: Learning Cross-Modality Encoder Representations from Transformers	[paper]	[code(pytorch)]
VisualBERT: A Simple and Performant Baseline for Vision and Language	[paper]	[code(pytorch)]
VL-BERT: Pre-training of Generic Visual-Linguistic Representations	[paper]	[code(pytorch)]
UNITER: Learning Universal Image-Text Representations	[paper]	[code(pytorch)]
Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training	[paper]	[code(pytorch)]
Large-Scale Adversarial Training for Vision-and-Language Representation Learning	[paper]	[code(pytorch)]
Fusion of Detected Objects in Text for Visual Question Answering	[paper]	[code(tensorflow)]
ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph	[paper]	[code]
X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers	[paper]	[code]

pretrained video and language model

title	paper link	code link
VideoBERT: A Joint Model for Video and Language Representation Learning	[paper]	[code]
UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation	[paper]	[code]
Multi-modal Circulant Fusion for Video-to-Language and Backward	[paper]	[code]
Video-Grounded Dialogues with Pretrained Generation Language Models	[paper]	[code]
Deep Extreme Cut: From Extreme Points to Object Segmentation	[paper]	[code(pytorch)]
Integrating Multimodal Information in Large Pretrained Transformers	[paper]	[code(pytorch)]
Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text	[paper]	[code(caffe)]

pretrained knowledge and language model

title	paper link	code link
Knowledge Enhanced Contextual Word Representations	[paper]	[code(pytorch)]
Why Do Masked Neural Language Models Still Need Commonsense Repositories to Handle Semantic Variations in Question Answering?	[paper]	[code]
SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge	[paper]	[code]
Acquiring Knowledge from Pre-trained Model to Neural Machine Translation	[paper]	[code]
Knowledge-Aware Language Model Pretraining	[paper]	[code]
Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model	[paper]	[code]
