Topic: cross-modal

Something interesting about cross-modal

👇 Here are 49 public repositories matching this topic...

bitreidgroup / dscnet

DSCNet Visible-Infrared Person ReID (TIFS 2022)

User: bitreidgroup

Home Page: https://ieeexplore.ieee.org/document/9963944

computer-vision cross-modal deep-learning re-identification visible-infrared

buaadreamer / ccrk

[KDD 2024] Improving the Consistency in Cross-Lingual Cross-Modal Retrieval with 1-to-K Contrastive Learning

User: buaadreamer

Home Page: https://arxiv.org/abs/2406.18254

cross-lingual cross-modal retrieval iglue swin-transformer xlm-roberta mscoco multi30k wit xflickrco

caoyue10 / aaai17-cdq

The implementation of the AAAI-17 paper "Collective Deep Quantization for Efficient Cross-Modal Retrieval"

User: caoyue10

quantization deep-learning cross-modal similarity-search

catalina17 / xflow

Generalized cross-modal NNs; new audiovisual benchmark (IEEE TNNLS 2019)

User: catalina17

deep-neural-networks audiovisual-classification multimodal cross-modal keras multimedia audiovisual classification

clt29 / semantic_neighborhoods

Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval [ECCV 2020]

User: clt29

Home Page: http://www.cs.pitt.edu/~chris/semantic_neighborhoods

eccv2020 retrieval computer-vision cross-modal-retrieval cross-modal visual-semantic-embedding code goodnews politics mscoco-dataset

crossmodallearning / crossmodallearning.github.io

Website for Cross Modal Learning and Application workshop - ACM ICMR 2019

User: crossmodallearning

webiste cross-modal deep-learning

docarray / docarray

Represent, send, store and search multimodal data

Organization: docarray

Home Page: https://docs.docarray.org/

docarray data-structures multimodal cross-modal neural-search deep-learning nested-data qdrant weaviate nearest-neighbor-search

drsy / motis

[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)

User: drsy

ios-swift ai image-search clip vector-search knn lsh semantic-search random-projection knowledge-distillation retrieval cross-modal k-means k-means-clustering naacl

eaphan / upidet

Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [NeurIPS2023]

User: eaphan

3d-object-detection cross-modal multi-modal

gorjanradevski / vsepp_tensorflow

Implementation of "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives" in TensorFlow.

User: gorjanradevski

tensorflow vse cross-modal multimodal-deep-learning image-text-search cross-modal-retrieval vsepp

gt-ripl / xmodal-ctx

Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning

Organization: gt-ripl

Home Page: https://sites.google.com/view/xmodal-context

clip cross-modal image-captioning vision-and-language

haihuangcode / cmg

The official implementation of Achieving Cross Modal Generalization with Multimodal Unified Representation (NeurIPS '23)

User: haihuangcode

cross-modal multimodal pretrained-models cross-modal-generalization

jina-ai / discoart

🪩 Create Disco Diffusion artworks in one line

Organization: jina-ai

creative-ai disco-diffusion cross-modal dalle generative-art multimodal diffusion prompts midjourney imgen

jizhizili / rim

[CVPR 2023] Referring Image Matting

User: jizhizili

cross-modal image-matting image-segmentation multimodal matting

krantiparida / awesome-audio-visual

A curated list of different papers and datasets in various areas of audio-visual processing

User: krantiparida

awesome audio-visual cross-modal mutli-modal localization source-separation awesome-list

kuanghuei / scan

PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)

User: kuanghuei

visual-semantic cross-modal image-captioning neural-network deep-learning pytorch computer-vision

kywen1119 / dsran

Code for the journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.

User: kywen1119

pytorch image-text-matching tcsvt cross-modal computer-vision

mako443 / text2pos-cvpr2022

Code, dataset and models for our CVPR 2022 publication "Text2Pos"

User: mako443

pytorch deep-learning localization nlp language-processing cross-modal cross-modal-retrieval cross-modal-learning computer-vision cvpr

marslanm / multimodality-representation-learning

This repository provides a comprehensive collection of research papers on multimodal representation learning, all of which are cited and discussed in the recently accepted survey (https://dl.acm.org/doi/abs/10.1145/3617833).

User: marslanm

cross-modal multimodal-datasets multimodal-deep-learning multimodal-pre-trained-model transformer-models vision-language-pretraining multimodal-applications multimodal-pretext

mesnico / aladin

Official implementation of the paper "ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval"

User: mesnico

computer-vision cross-modal cross-modal-retrieval deep-learning language-and-vision natural-language-processing pytorch

mhsangar / im-multimodal-bounce

User: mhsangar

processing cross-modal html css

mmact19 / 2019

MMAct: A Large-Scale Dataset for Cross Modal Learning on Human Action Understanding

User: mmact19

Home Page: https://mmact19.github.io/2019/

dataset action-recognition cross-modal deep-learning action-detection

nataliakoliou / music-visualization-network

cDCGAN model for audio-to-image generation: a cross-modal analysis using deep-learning techniques

User: nataliakoliou

audio-encoder audio-to-image cdcgan cross-modal deep-learning generative-adversarial-network image-generation music-visualization pytorch

neuronelab / rs-datasetshub

A hub hosting essential remote sensing datasets.

Organization: neuronelab

captioning-images classification cross-modal datasets deep-learning remote-sensing satellite-data satellite-imagery zero-shot-classification

ovshake / cobra

Code for COBRA: Contrastive Bi-Modal Representation Algorithm (https://arxiv.org/abs/2005.03687)

User: ovshake

representation-learning cross-modal contrastive-learning machine-learning pytorch

paranioar / sherl

[ECCV2024] The code of "SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning"

User: paranioar

cross-modal memory-efficient-learning memory-efficient-tuning parameter-efficient-fine-tuning parameter-efficient-learning parameter-efficient-tuning transfer-learning

paranioar / unipt

[CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"

User: paranioar

cross-modal parameter-efficient-learning parameter-efficient-tuning transfer-learning memory-efficient-learning memory-efficient-tuning parameter-efficient-fine-tuning

petarv- / x-cnn

Cross-modal convolutional neural networks

User: petarv-

cross-modal convolutional-neural-networks python keras

prithivirajdamodaran / whatthefood

An intentionally simple Image to Food cross-modal search. Created by Prithiviraj Damodaran.

User: prithivirajdamodaran

cross-modal-retrieval cross-modal cross-modal-learning multimodal

qcraftai / distill-bev

DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation (ICCV 2023)

User: qcraftai

3d-object-detection autonomous-driving bev cross-modal distillation knowledge-distillation lidar multi-camera multi-modal nuscenes point-cloud self-driving

qizhipei / biot5

BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings)

User: qizhipei

Home Page: https://arxiv.org/abs/2310.07276

bioinformatics computational-biology cross-modal machine-learning nlp nlp-applications

roboflow / multimodal-maestro

Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥

Organization: roboflow

Home Page: https://maestro.roboflow.com

lmm multimodality segment-anything instance-segmentation object-detection gpt-4 gpt-4-vision llava prompt-engineering visual-prompting

rohitrango / objects-that-sound

Unofficial implementation of Google DeepMind's paper `Objects that Sound`

User: rohitrango

machine-learning deep-learning audio-video embeddings deeplearning deep-neural-networks deepmind audioset cross-modal

rsmbyk / objects-that-sound

Implementation of the `Objects that Sound` and `Look, Listen, and Learn` papers by Relja Arandjelović and Andrew Zisserman

User: rsmbyk

Home Page: https://deepmind.com/blog/objects-that-sound/

deep-learning audio-visual-correspondence cross-modal self-supervision convolutional-neural-networks

sarahesl / alignclip

AlignCLIP: Improving Cross-Modal Alignment in CLIP

User: sarahesl

alignment clip cross-modal

shaoxiongji / knowledge-graphs

A collection of research on knowledge graphs

User: shaoxiongji

Home Page: https://shaoxiongji.github.io/knowledge-graphs/

commonsense cross-modal dialogue-systems information-retrieval knowledge-graph knowledge-graph-completion meta-relational-learning natural-language-processing ner paper question-answering reasoning recommendation-systems relation-extraction representation-learning survey temporal-knowledge-graph

smallflyingpig / speech-to-image-translation-without-text

Code for the paper "Direct Speech-to-Image Translation"

User: smallflyingpig

Home Page: https://smallflyingpig.github.io/speech-to-image/main

speech-to-image gan cross-modal

smil-spcras / mer

Multi-Corpus Emotion Recognition Method based on Cross-Modal Gated Attention Fusion

Organization: smil-spcras

Home Page: https://smil-spcras.github.io/MER/

attention cross-modal emotion-recognition pattern-recognition

towhee-io / examples

Analyze unstructured data with Towhee: reverse image search, reverse video search, audio classification, question answering systems, molecular search, and more.

Organization: towhee-io

audio-classification cross-modal embeddings image-classification machine-learning nlp video-tagging

viresh-r / ml-cca

Implementation of Fast ml-CCA from the ICCV-2015 work "Multi-Label Cross-Modal Retrieval"

User: viresh-r

cross-modal image-to-text text-to-image canonical-correlation-analysis iccv

yangli18 / vltvg

Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning, CVPR 2022

User: yangli18

visual-grounding vision-language visual-linguistic cross-modal

yangliu9208 / divafn

[IEEE T-IP 2020] Deep Image-to-Video Adaptation and Fusion Networks for Action Recognition

User: yangliu9208

Home Page: https://yangliu9208.github.io/DIVAFN/

action-recognition cross-modal deep-learning domain-adaptation video-understanding

yangliu9208 / sakdn

[IEEE T-IP 2021] Semantics-aware Adaptive Knowledge Distillation for Cross-modal Action Recognition

User: yangliu9208

action-recognition cross-modal domain-adaptation knowledge-distillation transfer-learning video-understanding

yisun98 / solc

Remote sensing SAR-optical land-use classification in PyTorch: high-resolution remote sensing semantic segmentation, land-cover segmentation, and land-cover classification

User: yisun98

land-use-classification pytorch remote-sensing segmentation deeplabv3 oa-kappa cross-modal multi-modal multi-source sar-optical

yolo2233 / cross-modal-hasing-playground

Python implementation of cross-modal hashing algorithms

User: yolo2233

python cross-modal hashing-algorithm tensorflow

zengyi-qin / weakly-supervised-3d-object-detection

Weakly Supervised 3D Object Detection from Point Clouds (VS3D), ACM MM 2020

User: zengyi-qin

3d-object-detection vs3d ws3d weakly-supervised-detection kitti point-cloud cross-modal transfer-learning object-proposals tensorflow

zerovl / zerovl

[ECCV2022] Contrastive Vision-Language Pre-training with Limited Resources

User: zerovl

clip cross-modal deep-learning vision-and-language

zilliz-bootcamp / pedestrian_search

Search for target pedestrians using text descriptions.

Organization: zilliz-bootcamp

milvus deep-learning pedestrian cross-modal

zjukg / duet

[AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning

Organization: zjukg

Home Page: https://arxiv.org/abs/2207.01328

pretrained-language-model pytorch transformer zero-shot-learning cross-modal grounding semantic knowledge-transfer visual-grounding
