Guangyan Sun's Projects
Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments).
Chain-of-Spot: Interactive Reasoning Improves Large Vision-language Models
LaTeX template for top-tier conferences, for fast adaptation
UCB CS61B 2018 summer
Grounded-SAM: Marrying Grounding-DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Config files for my GitHub profile.
A launch point for your personal nvim configuration
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Unify Efficient Fine-Tuning of 100+ LLMs
[NeurIPS 2023 Oral] Visual Instruction Tuning: LLaVA (Large Language-and-Vision Assistant) built towards GPT-4V level capabilities.
Accelerating the development of large multimodal models (LMMs) with lmms-eval
**Official** Hung-yi Lee (李宏毅) Machine Learning 2022 Spring
Collected ideas from Zhihu, Twitter, etc., along with some of my own thoughts
LaTeX Beamer Theme for RIT
PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"