Topic: vlm Goto Github
Something interesting about vlm
vlm,Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detection and segmentation.
User: adithya-s-k
vlm,Python companion to Low Speed Aerodynamics by Joseph Katz and Allen Plotkin
User: alwinw
vlm,Computational Aerodynamics Lab
User: andreagalle
vlm,The Cradle framework is a first attempt at General Computer Control (GCC). Cradle enables agents to master any computer task through strong reasoning abilities, self-improvement, and skill curation, in a standardized general environment with minimal requirements.
Organization: baai-agents
Home Page: https://baai-agents.github.io/Cradle/
vlm,This repo is a live list of papers on game-playing agents and large multimodal models - "A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges".
Organization: baai-agents
vlm,DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
Organization: baaivision
Home Page: https://huggingface.co/datasets/BAAI/DenseFusion-1M
vlm,EVE: Encoder-Free Vision-Language Models from BAAI
Organization: baaivision
vlm,A system for prompted weak supervision.
Organization: batsresearch
vlm,Vortex lattice method for inviscid lifting-surface aerodynamics
Organization: byuflowlab
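Several entries above implement the vortex lattice method, which ultimately recovers lift from bound circulation via the Kutta-Joukowski theorem. A minimal sketch of that final step (the numbers are illustrative, not taken from any listed project):

```python
# Kutta-Joukowski theorem: lift per unit span L' = rho * V * Gamma.
# Vortex lattice methods solve for the circulation Gamma of each panel,
# then apply this relation to recover aerodynamic forces.
rho = 1.225    # air density at sea level, kg/m^3 (illustrative value)
V = 30.0       # freestream speed, m/s (illustrative value)
gamma = 2.0    # bound circulation, m^2/s (illustrative value)

lift_per_span = rho * V * gamma  # N/m
print(lift_per_span)  # 73.5
```

In a full VLM solver, gamma comes from solving a linear system built from the influence coefficients of the horseshoe vortices; this snippet only shows the force-recovery step.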
vlm,Ptera Software is a fast, easy-to-use, and open-source software package for analyzing flapping-wing flight.
User: camurban
vlm,An implementation of the Vortex Lattice Method (VLM) and the Doublet Lattice Method (DLM) for aeroelasticity.
Organization: dlr-ae
vlm,[ICLR 2024 Spotlight] - [Best Paper Award, SoCal NLP 2023] - Jailbreak in Pieces: Compositional Adversarial Attacks on Multi-Modal Language Models
User: erfanshayegani
Home Page: https://iclr.cc/virtual/2024/poster/17767
vlm,[ICRA 2024] Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models
User: flycole
Home Page: https://www.robot-learning.uk/dream2real
vlm,Official implementation of paper "Unveiling the Tapestry of Consistency in Large Vision-Language Models".
Organization: foundation-multimodal-models
vlm,A toolbox meant for aircraft design analyses.
User: godotmisogi
Home Page: https://godotmisogi.github.io/AeroFuse.jl/
vlm,Famous Vision Language Models and Their Architectures
User: gokayfem
vlm,Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
User: gokayfem
vlm,A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
User: haorand
vlm,[MICCAI 2024] HLSS, the first study to explore hierarchical information inherent in histopathology images and their language descriptions for strong multi-modal representation learning
User: hasindri
vlm,Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the "ColPali: Efficient Document Retrieval with Vision Language Models" paper.
Organization: illuin-tech
Home Page: https://huggingface.co/vidore
vlm,Phi-3 for Mac: Locally-run Vision and Language Models for Apple Silicon
User: josefalbers
vlm,MATLAB implementation to simulate the non-linear dynamics of a fixed-wing unmanned aerial glider. Includes tools to calculate aerodynamic coefficients using a vortex lattice method implementation, and to extract longitudinal and lateral linear systems around the trimmed gliding state.
Organization: jrgenerative
vlm,Fluid-Structure Interaction Analysis Using FEM and UVLM
User: krproject-tech
vlm,[CVPR'24] Code for Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models
User: letitiabanana
vlm,LLaRA: Large Language and Robotics Assistant
User: lostxine
vlm,Seamlessly integrate state-of-the-art transformer models into robotics stacks
Organization: mbodiai
Home Page: https://mbodi.ai/
vlm,[CVPR 2024] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
Organization: mbzuai-oryx
Home Page: https://mbzuai-oryx.github.io/GeoChat
vlm,ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24)
User: niuzaisheng
Home Page: https://arxiv.org/abs/2402.07945
vlm,PsyDI: An MBTI agent that helps you understand your personality type through relaxed multi-modal interaction.
Organization: opendilab
Home Page: https://psydi.opendilab.org.cn
vlm,M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. Furthermore, M3DBench provides a new benchmark to assess large models across 3D vision-centric tasks.
Organization: openm3d
Home Page: https://m3dbench.github.io/
vlm,ezaero - Easy aerodynamics in Python :airplane:
User: partmor
Home Page: https://ezaero.readthedocs.io
vlm,Aircraft design optimization made fast through modern automatic differentiation. Composable analysis tools for aerodynamics, propulsion, structures, trajectory design, and much more.
User: peterdsharpe
Home Page: https://peterdsharpe.github.io/AeroSandbox/
vlm,Python scripts for captioning images with VLMs
User: progamergov
vlm,Okra, your all-in-one personal AI assistant
User: s4mpl3r
vlm,Awesome LLM-related papers and repos covering a comprehensive range of topics.
User: shure-dev
Home Page: https://shorturl.at/bmuwC
vlm,Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision"
User: sid2697
Home Page: https://sid2697.github.io/hoi-ref/
vlm,KarmaVLM: a family of efficient and powerful visual language models.
User: thomas-yanxin
vlm,A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
User: thuccslab
Home Page: https://github.com/ThuCCSLab/Awesome-LM-SSP
vlm,Official code for Paper "Mantis: Multi-Image Instruction Tuning"
Organization: tiger-ai-lab
Home Page: https://tiger-ai-lab.github.io/Mantis/
vlm,This repository includes the official implementation of our paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
Organization: ucsc-vlaa
vlm,[NeurIPS 2023 Oral] Quilt-1M: One Million Image-Text Pairs for Histopathology.
User: wisdomikezogwo
Home Page: https://quilt1m.github.io/
vlm,Customizing Visual-Language Foundation Models for Multi-modal Anomaly Detection and Reasoning
User: xiaohao-xu
Home Page: https://arxiv.org/pdf/2403.11083.pdf
vlm,OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Organization: xlang-ai
Home Page: https://os-world.github.io
vlm,Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
Organization: xlang-ai
Home Page: https://spider2-v.github.io