
soumith / lua---audio


Module for torch to support audio i/o as well as do common operations like dFFT, generate spectrograms etc.

License: Other

CMake 2.77% C 68.32% Lua 28.91%

lua---audio's People

Contributors

linusu, samehkhamis, soumith, wydwww


lua---audio's Issues

load_and_save_example.lua not working

my code:

require 'audio'

aud, sample_rate = audio.load('what_a_wonderful_world.mp3')

print(#aud)
print(sample_rate)

audio.save('test_out.mp3', aud, sample_rate)

output:

 6094080
       2
[torch.LongStorage of size 2]

test_out.mp3 ends up being an empty MP3 file: the input file is 6.2 MB, while test_out.mp3 is 417 bytes with a duration of 00:00.

install problem

After installing the first two packages, I ran "luarocks install https://raw.githubusercontent.com/soumith/lua---audio/master/audio-0.1-0.rockspec" and got this error:

Scanning dependencies of target audio
[ 50%] Building C object CMakeFiles/audio.dir/audio.c.o
Linking C shared module libaudio.so
/usr/bin/ld: /usr/local/lib/libfftw3.a(mapflags.o): relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/local/lib/libfftw3.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
make[2]: *** [libaudio.so] Error 1
make[1]: *** [CMakeFiles/audio.dir/all] Error 2
make: *** [all] Error 2

Error: Build error: Failed building.

I tried to figure this out but failed.

formats: no handler for file extension `mp3'

I was following the steps in the README example when I encountered this error.

require 'audio'
require 'image'
voice = audio.samplevoice()
formats: no handler for file extension `mp3'
[read_audio_file] Failure to read file
Aborted (core dumped)

error with libfftw3 with installing audio

I'm getting the following error when installing with luarocks install https://raw.githubusercontent.com/soumith/lua---audio/master/audio-0.1-0.rockspec

Using https://raw.githubusercontent.com/soumith/lua---audio/master/audio-0.1-0.rockspec... switching to 'build' mode
Cloning into 'lua---audio'...
remote: Counting objects: 16, done.
remote: Compressing objects: 100% (15/15), done.
remote: Total 16 (delta 1), reused 8 (delta 0), pack-reused 0
Receiving objects: 100% (16/16), 156.86 KiB | 0 bytes/s, done.
Resolving deltas: 100% (1/1), done.
Checking connectivity... done.
   cmake -E make_directory build && cd build && cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/home/greg/torch/install/bin/.." -DCMAKE_INSTALL_PREFIX="/home/greg/torch/install/lib/luarocks/rocks/audio/0.1-0" && make

-- The C compiler identification is GNU 4.8.4
-- The CXX compiler identification is GNU 4.8.4
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Found Torch7 in /home/greg/torch/install
SOX_INCLUDE_DIR: /usr/include
SOX_LIBRARIES: /usr/lib/x86_64-linux-gnu/libsox.so
FFTW_INCLUDE_DIR: /usr/local/include
FFTW_LIBRARIES: /usr/local/lib/libfftw3.a
-- Configuring done
-- Generating done
-- Build files have been written to: /tmp/luarocks_audio-0.1-0-9939/lua---audio/build
Scanning dependencies of target audio
[ 25%] Building C object CMakeFiles/audio.dir/audio.c.o
Linking C shared module libaudio.so
/usr/bin/ld: /usr/local/lib/libfftw3.a(mapflags.o): relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/local/lib/libfftw3.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
make[2]: *** [libaudio.so] Error 1
make[1]: *** [CMakeFiles/audio.dir/all] Error 2
make: *** [all] Error 2

Error: Build error: Failed building.

Extreme range of audio data samples

Hi,
For 16-bit PCM audio sampled at 11,025 Hz, I would expect the audio samples to lie in the range [-32768, 32767] (i.e. the range of a 16-bit signed integer). This is what I see when I open the audio file with other libraries (for example the tuneR package in R), as shown below:

[Image: audio_sample_hist_r]

However, when I load the same audio file using lua---audio's audio.load() function, the numeric range is far larger: a DoubleTensor with range (-2147483648, 1719271424), as shown below:

[Image: audio_sample_hist_lua]

Can someone tell me why this is the case? Furthermore, how can I convert the DoubleTensor values to signed 16-bit integers?
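
One reading of the numbers above, offered as an assumption rather than a confirmed answer: SoX represents samples internally as 32-bit signed integers, so 16-bit PCM data would appear shifted left by 16 bits. The reported minimum, -2147483648, is exactly -32768 * 65536, and the reported maximum, 1719271424, is exactly 26234 * 65536, which fits that reading. A minimal plain-Lua sketch of the rescaling follows; the helper name `to_int16` is hypothetical and not part of lua---audio.

```lua
-- Hypothetical helper: rescale a sample from the 32-bit signed range
-- to the 16-bit signed range by dividing by 2^16.
-- Assumes the loaded values are 16-bit samples shifted left by 16 bits.
function to_int16(sample)
   return math.floor(sample / 65536)
end

-- The two extremes reported in the post.
print(to_int16(-2147483648))
print(to_int16(1719271424))
```

With Torch, the same scaling could presumably be applied to a whole tensor by dividing it by 65536 and then converting it with `:short()`; that variant is untested here.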

Apple .m4a?

I'm running Ubuntu 14.04. Is there a way to add support for the .m4a files that come from iTunes?

Here is the sox output:
AUDIO FILE FORMATS: 8svx aif aifc aiff aiffc al amb amr-nb amr-wb anb au avr awb caf cdda cdr cvs cvsd cvu dat dvms f32 f4 f64 f8 fap flac fssd gsm gsrt hcom htk ima ircam la lpc lpc10 lu mat mat4 mat5 maud mp2 mp3 nist ogg paf prc pvf raw s1 s16 s2 s24 s3 s32 s4 s8 sb sd2 sds sf sl sln smp snd sndfile sndr sndt sou sox sph sw txw u1 u16 u2 u24 u3 u32 u4 u8 ub ul uw vms voc vorbis vox w64 wav wavpcm wv wve xa xi
PLAYLIST FORMATS: m3u pls
AUDIO DEVICE DRIVERS: alsa ao oss ossdsp pulseaudio

Converting large mp3 to wav files - error: Unknown length

I am having an issue loading a large WAV file.
Here is what I did: I loaded an MP3 file containing more than 20 minutes of music and saved it as WAV.
But when I load the WAV file I just generated, I get the error below:

/home/achang/torch/install/share/lua/5.1/audio/init.lua:56: [read_audio] Unknown length at /tmp/luarocks_audio-0.1-0-130/lua---audio/generic/sox.c:45
stack traceback:
    [C]: in function 'load'
    /home/achang/torch/install/share/lua/5.1/audio/init.lua:56: in function 'load'
    mp3_play.lua:16: in function 'load_file'
    [string "x = load_file('music.wav')"]:1: in main chunk
    [C]: in function 'xpcall'
    /home/achang/torch/install/share/lua/5.1/trepl/init.lua:670: in function 'repl'
    ...hang/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
    [C]: at 0x00405d50

I use the code below to convert:

function convert_mp3_2_wav(file_path)
   local x, sample_rate = audio.load(file_path)
   -- escape the dot and anchor at the end so only the extension is replaced
   local outpath = file_path:gsub("%.mp3$", ".wav")
   audio.save(outpath, x, sample_rate)
end

When played in an audio player, the WAV file sounds fine.
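
A side note on the conversion helper above: in Lua patterns an unescaped `.` matches any character, and `gsub` replaces every match, so a pattern like ".mp3" can also rewrite text in the middle of a path. The plain-Lua sketch below shows the difference using a made-up file name; the function name `mp3_to_wav_path` is hypothetical. This concerns the output path only and is not claimed to be the cause of the "Unknown length" error.

```lua
-- Hypothetical helper: swap a trailing ".mp3" extension for ".wav".
-- "%." matches a literal dot and "$" anchors the match at the end.
-- The extra parentheses drop gsub's second return value (the match count).
function mp3_to_wav_path(path)
   return (path:gsub("%.mp3$", ".wav"))
end

-- Made-up path for illustration only.
local path = "my_mp3_track.mp3"

-- Unescaped dot: "_mp3" in the middle of the name matches as well.
print((path:gsub(".mp3", ".wav")))   -- my.wav_track.wav

-- Escaped and anchored: only the extension changes.
print(mp3_to_wav_path(path))         -- my_mp3_track.wav
```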

torch.CudaTensor doesn't contain field 'libsox'.

After finishing a computation on the GPU, I tried to save the resulting audio (a torch.CudaTensor) with
audio.save(('path'), music, 44100)

and got this error:

/home/exp/torch/install/bin/luajit: /home/exp/torch/install/share/lua/5.1/audio/init.lua:74: attempt to index field 'libsox' (a nil value)
stack traceback:
    /home/exp/torch/install/share/lua/5.1/audio/init.lua:74: in function 'save'
    sample_music.lua:155: in main chunk
    [C]: in function 'dofile'
    .../exp/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00405ea0

So I have to convert the torch.CudaTensor to a torch.DoubleTensor before saving.
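
The workaround described above can be wrapped in a small helper. This is a sketch, not part of lua---audio: `to_double_tensor` is a hypothetical name, and it relies only on the standard Torch tensor methods `type()` and `double()`.

```lua
-- Hypothetical helper: return `tensor` unchanged if it is already a
-- torch.DoubleTensor, otherwise return a DoubleTensor copy of it.
function to_double_tensor(tensor)
   if tensor:type() == 'torch.DoubleTensor' then
      return tensor
   end
   return tensor:double()
end

-- Intended use (requires torch, cutorch and audio to be installed):
--   audio.save('path', to_double_tensor(music), 44100)
```

Converting to a DoubleTensor mirrors what the post reports as working; whether other CPU tensor types are also accepted by audio.save is not established here.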

By the way, I can't save music to MP3 correctly either, and I don't know why: I get a very small file that won't play.

audio.load(...) leaks memory

I've been trying to find exactly where it leaks, but tools like valgrind don't cope well with LuaJIT. Still, my tests show that audio.load() does leak memory. To reproduce:

require 'audio'
while true do
   data = audio.load( 'voice.mp3' )
   collectgarbage()
end

It will eventually get killed by the OOM-killer if you let it run long enough.

Incompatible with lua 5.3

Builds, but doesn't import properly.

+ lua -e 'require '\''audio'\'''
lua: error loading module 'libaudio' from file '/Users/awiltschko/anaconda/envs/_test/lib/lua/5.3/libaudio.so':
    dlopen(/Users/awiltschko/anaconda/envs/_test/lib/lua/5.3/libaudio.so, 6): Symbol not found: _luaL_checkint
  Referenced from: /Users/awiltschko/anaconda/envs/_test/lib/lua/5.3/libaudio.so
  Expected in: flat namespace
 in /Users/awiltschko/anaconda/envs/_test/lib/lua/5.3/libaudio.so
stack traceback:
    [C]: in ?
    [C]: in function 'require'
    ...ltschko/anaconda/envs/_test/share/lua/5.3/audio/init.lua:37: in main chunk
    [C]: in function 'require'
    (command line):1: in main chunk
    [C]: in ?

audio.decompress from lua i/o?

Is there any way to do the following?

require 'audio'

file = io.open("foo.wav", "rb")  -- open in binary mode
-- reading all to get the RIFF header
-- could offset to get the raw audio if sample rate is known
contents = file:read("*all")
file:close()

chars = torch.CharTensor(#contents)
chars:storage():string(contents)

signal = audio.decompress(chars, 'wav')

I tried playing around with it a little, but audio.compress seems to produce a larger CharTensor than I expected. For example, suppose foo.wav is 1 second long with a sample rate of 16 kHz, so 16000 samples. Each sample is a short, so I expect 2 * 16000 = 32000 bytes (chars) when I compress it to a CharTensor. However, audio.compress produces a CharTensor of length 3 * 16000 = 48000.

I assume this has something to do with embedding the sample rate.
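
The size arithmetic above can be written out as a quick check. This plain-Lua sketch only reproduces the numbers in the post (mono, 16000 samples); the function name `pcm_bytes` is hypothetical, and the sketch does not explain why audio.compress produces the larger tensor.

```lua
-- Hypothetical helper: size in bytes of raw PCM audio without any header.
function pcm_bytes(num_samples, bytes_per_sample, channels)
   return num_samples * bytes_per_sample * channels
end

local expected = pcm_bytes(16000, 2, 1)  -- 16-bit mono, as the post assumes
local observed = 48000                   -- length reported in the post

-- Extra bytes beyond the expected payload, and bytes per sample implied
-- by the observed length.
print(observed - expected)
print(observed / 16000)
```

The difference is 16000 bytes, one extra byte per sample, which is considerably more than a stored sample rate alone would need.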

Thanks!

Dimension is different between lua---audio and librosa!

Thank you very much for your contribution Soumith.

When I read the same audio file (.mp3) with your library and with librosa in Python, I get outputs of different sizes.

Mono channel, sampling rate : 22050
lua---audio returns (417024x1)
librosa returns (417600x1)

Any idea what the reason might be?

Thank you very much.
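
One observation on the two lengths, offered as a hypothesis rather than a diagnosis: they differ by 576 samples, and both are exact multiples of 576, which is the number of samples in one MPEG-2 Layer III frame (the frame size used at a 22050 Hz sampling rate). The two decoders may therefore simply disagree by one frame, for example in how they treat encoder delay or padding. A plain-Lua sketch of the arithmetic, with illustrative names:

```lua
-- Samples per frame for MPEG-2 Layer III
-- (sampling rates 16000, 22050 and 24000 Hz).
SAMPLES_PER_FRAME = 576

-- Number of whole frames in a decoded length,
-- or nil if the length is not a multiple of the frame size.
function whole_frames(num_samples)
   if num_samples % SAMPLES_PER_FRAME ~= 0 then
      return nil
   end
   return num_samples / SAMPLES_PER_FRAME
end

print(whole_frames(417024))  -- length from lua---audio
print(whole_frames(417600))  -- length from librosa
print(417600 - 417024)       -- difference in samples
```

The lengths correspond to 724 and 725 frames respectively, so the difference is exactly one frame.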

Building against 5.2

I'm getting this error when building against Lua 5.2 (LuaJIT 2.0 works fine).

Does this make sense?

+ lua -e 'require '\''audio'\'''
lua: error loading module 'libaudio' from file '/Users/Alex/anaconda/envs/_test/lib/lua/5.2/libaudio.so':
    dlopen(/Users/Alex/anaconda/envs/_test/lib/lua/5.2/libaudio.so, 6): Symbol not found: _luaL_register
  Referenced from: /Users/Alex/anaconda/envs/_test/lib/lua/5.2/libaudio.so
  Expected in: flat namespace
