pritamqu / crisscross

[AAAI 2023 (Oral)] CrissCross: Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity

Home Page: https://www.pritamsarkar.com/

License: Other

Python 100.00%

action-recognition audioset dcase esc50 hmdb51 kinetics-datasets kinetics400 representation-learning self-supervised-learning sound-classification ucf101

crisscross's Introduction

👋 Hi, I’m Pritam. Please check my website for more information: www.pritamsarkar.com
👀 I’m interested in machine learning 🧠, photography 📷, and filmmaking 🎞️.
💞️ I’m always open to coffee and discussing research.
📫 Reach me: pritam[dot]sarkar[at]queensu[dot]ca.

crisscross's Issues

Category list for Kinetics-Sound dataset

Hi, I am planning to do research based on your model. In the paper you cited (Look, Listen and Learn), Kinetics-Sound has 34 classes, of which 32 are used in your research. Could you provide the category list? Many thanks for considering my request.

Upload all the code

Hello!
I studied your paper and am really interested in your work.
When will you upload all the code?

Could I ask when the training code will be released?

Hello, I am planning to do research based on this model. Could you release the entire code? Would it be available within this month? Thank you in advance.

Upload training code

May I ask when you will upload the training code?

About the training and testing datasets

Hi Pritam, thank you very much for your amazing work. I have some questions about the datasets used in this work. For the pretraining datasets (K400, AudioSet, and Kinetics-Sound), do you always use both audio and visual information, and do the videos always contain an audio stream? I am trying K400 and found that some videos are missing the audio stream. In addition, for the downstream datasets such as UCF-101 and HMDB-51, do you use audio-visual pairs or only the visual information for evaluation? It seems that video files in UCF-101 do not always contain an audio stream. Thank you very much.
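
For anyone hitting the same missing-audio problem, here is a minimal sketch of one way to separate clips that have an audio stream from those that do not. It is not taken from the CrissCross code and does not describe how the authors handled such files. It assumes the `ffprobe` tool from FFmpeg is installed and on the PATH; the function names `has_audio_stream`, `probe_video`, and `split_by_audio` are hypothetical.

```python
import json
import subprocess
from pathlib import Path


def has_audio_stream(probe: dict) -> bool:
    """Return True if a parsed ffprobe JSON report lists an audio stream."""
    return any(s.get("codec_type") == "audio" for s in probe.get("streams", []))


def probe_video(path: Path) -> dict:
    """Run ffprobe (assumed to be on PATH) and return its JSON report."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", str(path)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if ffprobe cannot read the file
    )
    return json.loads(result.stdout)


def split_by_audio(paths):
    """Partition video paths into (with_audio, without_audio) lists."""
    with_audio, without_audio = [], []
    for p in paths:
        target = with_audio if has_audio_stream(probe_video(p)) else without_audio
        target.append(p)
    return with_audio, without_audio
```

A call such as `split_by_audio(Path("k400/train").rglob("*.mp4"))` (the directory name is only an example) would return the two lists, so clips without audio can be skipped or handled separately during audio-visual pretraining.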
