misery0424

followers: 1 following: 0 repos: 40 gists: 0

Type: User

misery0424's Projects

asymmetric_vqgan

awesome-multimodal-ml

Reading list for research topics in multimodal machine learning

clip

Contrastive Language-Image Pretraining

clip-1

CLIP: Connecting Text and Image (Learning Transferable Visual Models From Natural Language Supervision)

clip-featurevis

code for reproducing some of the diagrams in the paper "Multimodal Neurons in Artificial Neural Networks"

clipbert

[CVPR 2021 Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning for image-text and video-text tasks.

cross-modal-retrieval

Activity image-based video retrieval

decord

An efficient video loader for deep learning with smart shuffling that's super easy to digest

diffae

Official implementation of Diffusion Autoencoders

diffusion-models-papers-survey-taxonomy

Diffusion model papers, survey, and taxonomy

dit

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

ernie

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

grok-1

Grok open release

llmspeculativesampling

Fast inference from large language models via speculative decoding

lostgans

Official implementation of our ICCV19 paper "Image Synthesis From Reconfigurable Layout and Style"

mil-nce_howto100m

PyTorch GPU distributed training code for MIL-NCE HowTo100M

ml-visuals

🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.

mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

navit

My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"

ofa

Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

open-sora

Open-Sora: Democratizing Efficient Video Production for All

paintmind

Fast and controllable text-to-image model.

poster_template

some academic posters as references. May we have in-person poster session soon!

pytorch-image-models

PyTorch image models, scripts, pretrained weights -- (SE)ResNet/ResNeXT, DPN, EfficientNet, MixNet, MobileNet-V3/V2, MNASNet, Single-Path NAS, FBNet, and more

pytorchvideo

A deep learning library for video understanding research.

rq-transformer

Implementation of RQ Transformer, proposed in the paper "Autoregressive Image Generation using Residual Quantization"

rq-vae-transformer

The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)

slowfast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

soho

[CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning
