thinhhnt's Projects
Angular Jsoneditor that works with Angular 4 through Angular 15
Official implementation of AnimateDiff.
Forked from https://github.com/deepinsight/insightface
A collection of literature after or concurrent with Masked Autoencoder (MAE) (Kaiming He et al.).
Official codebase used to develop Vision Transformer, MLP-Mixer, LiT and more.
CodeTF: One-stop Transformer Library for State-of-the-art Code LLM
Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
DocDiff: Document Enhancement via Residual Diffusion Models. The first diffusion-based model designed for diverse document enhancement tasks. This model is lightweight, efficient, and flexible, and can also be used for img2img tasks in natural scenes.
DocEnTr: An end-to-end document image enhancement transformer - ICPR 2022
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
A Bulletproof Way to Generate Structured JSON from Language Models
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
[ACL 2022] LinkBERT: A Knowledgeable Language Model Pretrained with Document Links
Inference code for LLaMA models
Customizable admin dashboard template based on Angular 10+
Implementation of Nougat: Neural Optical Understanding for Academic Documents
An open-source framework for training large multimodal models
Operating LLMs in production
Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment on server, mobile, embedded and IoT devices)
[3DV 22] Official implementation of Robust RGB-D Fusion Network for Saliency Detection
RGB-D Salient Object Detection: A Survey
Forked from https://github.com/lvwerra/trl