xinhaomei / act
Source code for the paper 'Audio Captioning Transformer'
Hi, thanks for the code. I want to use the pretrained model to run inference on my own data, which consists only of audio files (no CSV files for training). I set up the environment, put the audio files in the data folder (waveforms), and updated settings.yaml as instructed (eval mode, pretrained model, etc.). Could you please guide me on how to run inference now? I tried running train.py, but it gives errors about a missing train.h5 file, etc. Thanks.
Hello, Dr. Mei! Thank you for open-sourcing this awesome work.
While the procedure to reproduce ACT_m_DeiT and ACT_m_scratch is sufficiently explained, I was wondering whether you could upload those models, or otherwise share the ones you used to produce the results table, just as you did with the ACT_*_AudioSet_DeiT series.
Thank you!
Hi,
Can you provide another way for us to download the AudioCaps data? I can't use Baidu.
Could I download word_list.p?
Hello, nice work!
After replicating your work, I trained a model from scratch myself (the encoder used the pretrained model audioset_deit).
When I try to evaluate the model, I set config.path.eval_model to the path of my model, but something goes wrong when I load the state_dict.
I do not know why there is nothing about the decoder. Could you help me with this problem? Thank you in advance for your reply!
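One way to narrow this down might be to list which top-level modules the saved checkpoint actually contains before loading it into the model. The sketch below is not from the repository: `summarize_prefixes` is a hypothetical helper, the parameter names are made-up placeholders, and it assumes the checkpoint is (or contains) a mapping from dotted parameter names to tensors.

```python
from collections import Counter
from typing import Iterable


def summarize_prefixes(keys: Iterable[str]) -> Counter:
    """Count parameter names by their first dotted component."""
    return Counter(key.split(".", 1)[0] for key in keys)


# Hypothetical parameter names, standing in for the keys of a loaded checkpoint.
example_keys = [
    "encoder.blocks.0.attn.qkv.weight",
    "encoder.blocks.0.attn.qkv.bias",
    "decoder.layers.0.self_attn.in_proj_weight",
]
summary = summarize_prefixes(example_keys)
print(dict(summary))
```

If the summary of the real checkpoint's keys shows no decoder entries, the file may hold only the encoder weights, or the full state_dict may be nested under another key in the saved object.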
Hi, thank you for this valuable resource.
The train.zip provided on Google Drive gives the following error message when unzipped:
file #9024: bad zipfile offset (local header sig): 4819319277
error: invalid zip file with overlapped components (possible zip bomb)
How can I resolve this issue?
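A possible first check is to verify the archive with a second tool, to see whether the download itself is damaged or whether only the local unzip program rejects it. The sketch below uses Python's standard `zipfile` module; `first_bad_member` is a hypothetical helper name, not part of the repository.

```python
import zipfile
from typing import Optional


def first_bad_member(path: str) -> Optional[str]:
    """Return the name of the first member that fails its CRC check, or None.

    Raises zipfile.BadZipFile if the file cannot be opened as a zip archive at all.
    """
    with zipfile.ZipFile(path) as archive:
        # testzip() reads every member and checks its CRC and header.
        return archive.testzip()
```

Calling `first_bad_member("train.zip")` and getting `None` would suggest the archive is intact and the problem lies with the extraction tool; an exception or a returned member name would point to an incomplete or corrupted download.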