wiseman / py-webrtcvad

Python interface to the WebRTC Voice Activity Detector

License: Other

C 79.39% Python 4.75% Objective-C 3.17% C++ 12.68%


py-webrtcvad's Issues

how to save time stamps of the chunks generated with the file name of that chunk

Will this detect the speech segments in any scenario?

In case I have an audio with
tring_tring ... hold music... speaker_1 ... speaker_2 ... ... speaker_1 .... hold_music end_of_conversation.

If I were to pass the above audio to this library,
will it ignore all the music and extract just the speaker_1 and speaker_2 segments?
If so, how is this to be accomplished?

add parameters `frame shift` & `frame length`

How can I set the frame shift and frame length parameters?
See example.py, lines 53 and 56.
In my opinion, n represents the frame length and duration represents the frame shift,
but when I change it to
n = int(sample_rate * (frame_duration_ms / 1000.0) * 4), it raises the webrtcvad.Error
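Multiplying n by 4 makes every frame four times longer than frame_duration_ms, which takes it outside the 10, 20 or 30 ms durations the VAD accepts. If the goal is a frame shift that differs from the frame length, one option is to slice overlapping windows yourself and keep each window at an accepted duration. A minimal sketch, assuming 16-bit mono PCM; `overlapping_frames`, `frame_ms` and `shift_ms` are hypothetical names, not part of example.py:

```python
def overlapping_frames(pcm, sample_rate, frame_ms=30, shift_ms=10):
    """Yield (start_seconds, frame_bytes) windows from 16-bit mono PCM.

    frame_ms is the window length; shift_ms is the hop between windows.
    """
    bytes_per_sample = 2  # assumption: 16-bit samples
    frame_len = int(sample_rate * frame_ms / 1000) * bytes_per_sample
    shift_len = int(sample_rate * shift_ms / 1000) * bytes_per_sample
    if frame_len <= 0 or shift_len <= 0:
        raise ValueError("frame_ms and shift_ms must cover at least one sample")
    offset = 0
    while offset + frame_len <= len(pcm):
        start = offset / (bytes_per_sample * sample_rate)
        yield start, pcm[offset:offset + frame_len]
        offset += shift_len


pcm = bytes(16000 * 2)  # one second of silence at 16 kHz
frames = list(overlapping_frames(pcm, 16000))
print(len(frames), len(frames[0][1]))
```

Each yielded window could then be passed to vad.is_speech(frame_bytes, sample_rate) in place of the frames that example.py builds.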

PIP installation errors and fix (Can be added to readme/faq)

I was getting many errors when I ran the pip installation command.
The error message was similar to

SSL routines: SSL23_GET_SERVER_HELLO: tlsv1 alert internal error

The fix was to uninstall pip, reinstall openssl from https://github.com/openssl/openssl and then install pip again.

This fixed the errors during the pip installation of webrtcvad.

Comments for the example.py file?

Hi there,

I'm trying to use example.py to output files where I've got speech as opposed to no speech. At the moment it outputs files at different intervals.

I was wondering whether you have comments you could attach to the file to explain some of the details (what the vad_collector function does, for example), as I'm trying to tweak it a bit to get what I want.

Thanks,
Florin

Enhancement request - Limit audio file chunk sizes

Python 3.5.2

Ran a few small tests today using py-webrtcvad with a 19 second WAV file. I'm impressed with the results, as the output WAVs appear to respect what I will call 'word boundaries'. That is, words are not broken up on output.

Can an enhancement be made to limit the audio file chunk sizes? The audio chunk files created were between 1 second and 48 seconds. Apparently smaller WAV files are required for training purposes in Speech Recognition.
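Until something like this exists in the library, a long chunk can be cut down after the fact. A minimal sketch, assuming each chunk is raw 16-bit mono PCM bytes as yielded by vad_collector; `split_pcm` and `max_seconds` are hypothetical names. Note that a fixed-length cut like this gives up the word-boundary behaviour described above, because it splits purely by time:

```python
def split_pcm(pcm, sample_rate, max_seconds, bytes_per_sample=2):
    """Split raw mono PCM into pieces no longer than max_seconds."""
    max_bytes = int(sample_rate * max_seconds) * bytes_per_sample
    if max_bytes <= 0:
        raise ValueError("max_seconds is too small for this sample rate")
    return [pcm[i:i + max_bytes] for i in range(0, len(pcm), max_bytes)]


segment = bytes(16000 * 2 * 25)  # 25 s of 16 kHz, 16-bit audio
pieces = split_pcm(segment, 16000, max_seconds=10)
print([len(p) for p in pieces])
```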

AssertionError

Hello, I tried to run example.py on a WAV file, but it failed with an AssertionError on num_channels. What should I do to overcome this issue?

Update
I used pydub's split_to_mono to separate the channels; now I am getting another AssertionError, this time on sample_rate.
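The assertions in example.py check the properties of the input file, so it can help to list everything that is wrong before converting. A minimal sketch using only the standard `wave` module; `wav_problems` is a hypothetical helper, and the accepted values (mono, 16-bit, and the rates in `SUPPORTED_RATES`) are an assumption based on what the example and the README discussion in these issues describe:

```python
import io
import wave

SUPPORTED_RATES = (8000, 16000, 32000, 48000)  # assumption, see lead-in


def wav_problems(path_or_file):
    """Return a list of reasons the WAV would fail the example's checks."""
    with wave.open(path_or_file, "rb") as wf:
        problems = []
        if wf.getnchannels() != 1:
            problems.append("expected mono, got %d channels" % wf.getnchannels())
        if wf.getsampwidth() != 2:
            problems.append("expected 16-bit samples, got %d-bit" % (8 * wf.getsampwidth()))
        if wf.getframerate() not in SUPPORTED_RATES:
            problems.append("unsupported sample rate %d Hz" % wf.getframerate())
        return problems


# Build a stereo 44.1 kHz file in memory to show what the report looks like.
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(2)
    wf.setsampwidth(2)
    wf.setframerate(44100)
    wf.writeframes(bytes(4 * 100))
buf.seek(0)
report = wav_problems(buf)
print(report)
```

A file that reports an unsupported rate needs resampling to one of the accepted rates, not just channel splitting.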

ERROR: Failed building wheel for webrtcvad

Hello friends,

I'm experiencing an error when I try to install webrtcvad via pip. I'm trying to install using the conda CLI. Some information about my environment:

OS: Windows 10 Insider Preview 10.0.19013.1122.
pip Version: pip 19.3.1
Conda version: conda 4.7.12

Basically, the error is:

TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

I'll put the entire stack trace below, but I imagine that the error is just a 'null' value where a string was expected.

I'm also happy to help with any further information or tests that are needed.

C:\Users\Csorgo\Documents\projects\DeepFake\Real-Time-Voice-Cloning> pip install webrtcvad
Collecting webrtcvad
Using cached https://files.pythonhosted.org/packages/89/34/e2de2d97f3288512b9ea56f92e7452f8207eb5a0096500badf9dfd48f5e6/webrtcvad-2.0.10.tar.gz
Building wheels for collected packages: webrtcvad
Building wheel for webrtcvad (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: 'D:\Programas\Anaconda\envs\Voice DeepFake\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Csorgo\AppData\Local\Temp\pip-install-c9idg9gi\webrtcvad\setup.py'"'"'; file='"'"'C:\Users\Csorgo\AppData\Local\Temp\pip-install-c9idg9gi\webrtcvad\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\Csorgo\AppData\Local\Temp\pip-wheel-bg17sy2j' --python-tag cp37
cwd: C:\Users\Csorgo\AppData\Local\Temp\pip-install-c9idg9gi\webrtcvad
Complete output (55 lines):
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-3.7
copying webrtcvad.py -> build\lib.win-amd64-3.7
running build_ext
building 'webrtcvad' extension
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Csorgo\AppData\Local\Temp\pip-install-c9idg9gi\webrtcvad\setup.py", line 87, in <module>
'psutil', 'memory_profiler']
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\site-packages\setuptools\__init__.py", line 145, in setup
return distutils.core.setup(**attrs)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\site-packages\wheel\bdist_wheel.py", line 192, in run
self.run_command('build')
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\command\build.py", line 135, in run
self.run_command(cmd_name)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\site-packages\setuptools\command\build_ext.py", line 84, in run
_build_ext.run(self)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\command\build_ext.py", line 340, in run
self.build_extensions()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\command\build_ext.py", line 449, in build_extensions
self._build_extensions_serial()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\command\build_ext.py", line 474, in _build_extensions_serial
self.build_extension(ext)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\site-packages\setuptools\command\build_ext.py", line 205, in build_extension
_build_ext.build_extension(self, ext)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\command\build_ext.py", line 534, in build_extension
depends=ext.depends)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\_msvccompiler.py", line 346, in compile
self.initialize()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\_msvccompiler.py", line 239, in initialize
vc_env = _get_vc_env(plat_spec)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\site-packages\setuptools\msvc.py", line 171, in msvc14_get_vc_env
return EnvironmentInfo(plat_spec, vc_min_ver=14.0).return_env()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\site-packages\setuptools\msvc.py", line 1620, in return_env
if self.vs_ver >= 14 and isfile(self.VCRuntimeRedist):
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\genericpath.py", line 30, in isfile
st = os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

ERROR: Failed building wheel for webrtcvad
Running setup.py clean for webrtcvad
Failed to build webrtcvad
Installing collected packages: webrtcvad
Running setup.py install for webrtcvad ... error
ERROR: Command errored out with exit status 1:
command: 'D:\Programas\Anaconda\envs\Voice DeepFake\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Csorgo\AppData\Local\Temp\pip-install-c9idg9gi\webrtcvad\setup.py'"'"'; file='"'"'C:\Users\Csorgo\AppData\Local\Temp\pip-install-c9idg9gi\webrtcvad\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Csorgo\AppData\Local\Temp\pip-record-mriqpw4h\install-record.txt' --single-version-externally-managed --compile
cwd: C:\Users\Csorgo\AppData\Local\Temp\pip-install-c9idg9gi\webrtcvad
Complete output (57 lines):
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.7
copying webrtcvad.py -> build\lib.win-amd64-3.7
running build_ext
building 'webrtcvad' extension
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Csorgo\AppData\Local\Temp\pip-install-c9idg9gi\webrtcvad\setup.py", line 87, in <module>
'psutil', 'memory_profiler']
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\site-packages\setuptools\__init__.py", line 145, in setup
return distutils.core.setup(**attrs)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\site-packages\setuptools\command\install.py", line 61, in run
return orig.install.run(self)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\command\install.py", line 545, in run
self.run_command('build')
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\command\build.py", line 135, in run
self.run_command(cmd_name)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\site-packages\setuptools\command\build_ext.py", line 84, in run
_build_ext.run(self)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\command\build_ext.py", line 340, in run
self.build_extensions()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\command\build_ext.py", line 449, in build_extensions
self._build_extensions_serial()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\command\build_ext.py", line 474, in _build_extensions_serial
self.build_extension(ext)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\site-packages\setuptools\command\build_ext.py", line 205, in build_extension
_build_ext.build_extension(self, ext)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\command\build_ext.py", line 534, in build_extension
depends=ext.depends)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\_msvccompiler.py", line 346, in compile
self.initialize()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\distutils\_msvccompiler.py", line 239, in initialize
vc_env = _get_vc_env(plat_spec)
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\site-packages\setuptools\msvc.py", line 171, in msvc14_get_vc_env
return EnvironmentInfo(plat_spec, vc_min_ver=14.0).return_env()
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\site-packages\setuptools\msvc.py", line 1620, in return_env
if self.vs_ver >= 14 and isfile(self.VCRuntimeRedist):
File "D:\Programas\Anaconda\envs\Voice DeepFake\lib\genericpath.py", line 30, in isfile
st = os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
----------------------------------------
ERROR: Command errored out with exit status 1: 'D:\Programas\Anaconda\envs\Voice DeepFake\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Csorgo\AppData\Local\Temp\pip-install-c9idg9gi\webrtcvad\setup.py'"'"'; file='"'"'C:\Users\Csorgo\AppData\Local\Temp\pip-install-c9idg9gi\webrtcvad\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Csorgo\AppData\Local\Temp\pip-record-mriqpw4h\install-record.txt' --single-version-externally-managed --compile Check the logs for full command output.

I can not validate a chunk in the size of 1024

With your test files, or with whatever audio I generate, everything works as long as each chunk has a "size" of 480 (the variable n). If I use any other value, the application gets the following error:

>>> chunk = file.read(960)
>>> vad.is_speech(chunk, 8000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/webrtcvad.py", line 27, in is_speech
    return _webrtcvad.process(self._vad, sample_rate, buf, length)
webrtcvad.Error: Error while processing frame
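The byte count passed to is_speech determines the frame duration, and only 10, 20 or 30 ms frames are accepted. A small sketch of the arithmetic, assuming 16-bit mono PCM (2 bytes per sample); `frame_duration_ms` and `bytes_for` are hypothetical helper names:

```python
VALID_FRAME_MS = (10, 20, 30)


def frame_duration_ms(num_bytes, sample_rate, bytes_per_sample=2):
    """Duration in milliseconds of a raw PCM buffer."""
    return 1000.0 * num_bytes / (bytes_per_sample * sample_rate)


def bytes_for(sample_rate, frame_ms, bytes_per_sample=2):
    """Number of bytes to read for one frame of frame_ms milliseconds."""
    return int(sample_rate * frame_ms / 1000) * bytes_per_sample


print(frame_duration_ms(960, 8000))    # 60.0 ms: not an accepted duration
print(frame_duration_ms(1024, 16000))  # 32.0 ms: not an accepted duration
print(bytes_for(8000, 30))             # 480 bytes: a 30 ms frame at 8 kHz
```

Under these assumptions, reading 960 bytes at 8000 Hz gives a 60 ms frame, which would explain the error, while 480 bytes is exactly 30 ms.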

Question about performance

Hello @wiseman, thanks for the awesome interface!
I just want to know how much this affects performance. If I send audio to a server and detect speech 100 times per second, that is not a heavy task, but what if I have 200 users? That would be 20,000 requests per second plus the calculations; isn't that too much? How many CPUs should the server have to handle this? Or would it be better to do it on the client side in this case? Thanks!

How to install in conda?

Is there a way to install this module in a conda virtual environment?

AttributeError: module 'webrtcvad' has no attribute 'Vad'

I installed it on an Ubuntu server with Python 3.6.6 using pip install "", but I always get this error: "AttributeError: module 'webrtcvad' has no attribute 'Vad'"

How does ASR determine output?

Hello,

I am looking for a VAD that does not only measure the energy level in a frame but also distinguishes voiced from unvoiced segments (for example, by filtering the frequency bands used for voice and comparing their energy level with that of bands not used for voice).

When I use this library (with 10, 20 or 30 ms frames) and tap on my microphone, the VAD returns true. Is there a way to overcome this?

Is there documentation available on how the VAD processing works internally?

Thanks,
Josef

Frame duration limitation

A frame must be either 10, 20, or 30 ms in duration,
but I need a much smaller frame size, ideally frame by frame. Is there any possibility of the library offering a frame duration of 1 as well?

Please help.

Why is the result different?

Dear Wiseman,
When I use webrtcvad 2.0.10 on a WAV file, I find that the result is different when I comment out this code:

    num_voiced = len([f for f in ring_buffer
                              if vad.is_speech(f.bytes, sample_rate)])

env:
centos6.5 Python 2.7.14

before commenting out:

000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

after commenting out:

000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

is_speech returns 1 even when there is just noise and no speech in the attached wav file

I am running example.py as it is on my Windows 10 machine with Python 3.6. The sample WAV file being used was extracted from a video recorded on a phone. There are no speech frames in this 8 second audio, but the output from example.py is still exactly like the input, and it seems to mark most of the frames as is_speech = 1. I am puzzled, as the output should not have marked any frame as speech! Could there be some issue with this WAV file?

Sample wav file 11.wav

Executed with params -
example.py 3 11.wav

Terminal output is as below:
00001111111111+(0.12)1111111111111111111111111110111111111111111111111011100111101111111111111111111011110000000011111111111111111111111111111111111111111111111111101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111-(8.280000000000015)
Writing chunk-00.wav

Inconsistent result of is_speech

I am using the VAD algorithm by modifying the 'example.py' code you have uploaded.
While running the code, I found something strange.

I added some lines to example.py to print the is_speech() result for the same frame several times.

def vad_collector(sample_rate, frame_duration_ms, padding_duration_ms, vad, frames):
    num_padding_frames = int(padding_duration_ms / frame_duration_ms)
    ring_buffer = collections.deque(maxlen=num_padding_frames)
    triggered = False
    voiced_frames = []
    for frame in frames:
        sys.stdout.write(
            '1' if vad.is_speech(frame.bytes, sample_rate) else '0')
        if not triggered:
            ring_buffer.append(frame)


            # print vad results
            print(vad.is_speech(ring_buffer[0].bytes, sample_rate))
            print(vad.is_speech(ring_buffer[0].bytes, sample_rate))
            print(vad.is_speech(ring_buffer[0].bytes, sample_rate))
            print(vad.is_speech(ring_buffer[0].bytes, sample_rate))
            print(vad.is_speech(ring_buffer[0].bytes, sample_rate))
            print(vad.is_speech(ring_buffer[0].bytes, sample_rate))
            print(vad.is_speech(ring_buffer[0].bytes, sample_rate))
            print(vad.is_speech(ring_buffer[0].bytes, sample_rate))
            print(vad.is_speech(ring_buffer[0].bytes, sample_rate))
            print(vad.is_speech(ring_buffer[0].bytes, sample_rate))
            print(vad.is_speech(ring_buffer[0].bytes, sample_rate))
            print(vad.is_speech(ring_buffer[0].bytes, sample_rate))
            sys.exit()        # stop the code


            num_voiced = len([f for f in ring_buffer
                              if vad.is_speech(f.bytes, sample_rate)])
            if num_voiced > 0.9 * ring_buffer.maxlen:
                sys.stdout.write('+(%s)' % (ring_buffer[0].timestamp,))
                triggered = True
                voiced_frames.extend(ring_buffer)
                ring_buffer.clear()
        else:
            voiced_frames.append(frame)
            ring_buffer.append(frame)
            num_unvoiced = len([f for f in ring_buffer
                                if not vad.is_speech(f.bytes, sample_rate)])
            if num_unvoiced > 0.9 * ring_buffer.maxlen:
                sys.stdout.write('-(%s)' % (frame.timestamp + frame.duration))
                triggered = False
                yield b''.join([f.bytes for f in voiced_frames])
                ring_buffer.clear()
                voiced_frames = []
    if triggered:
        sys.stdout.write('-(%s)' % (frame.timestamp + frame.duration))
    sys.stdout.write('\n')
    if voiced_frames:
        yield b''.join([f.bytes for f in voiced_frames])

I thought every printed line should show the same boolean value (True or False), but there were different values among them.
Could you explain this result?
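One possible explanation, offered as an assumption rather than something confirmed in this thread: the underlying detector keeps internal state that is updated on every call, so classifying the same frame repeatedly need not return the same answer. If that is the cause, calling is_speech exactly once per frame and storing the decision avoids the effect. A minimal sketch of the collector logic with cached decisions; `collect_with_cached_decisions` is a hypothetical name, and `is_speech` is any callable taking a frame (for the real library that would be something like `lambda f: vad.is_speech(f.bytes, sample_rate)`):

```python
import collections


def collect_with_cached_decisions(frames, is_speech, padding_frames=10, ratio=0.9):
    """Group frames into voiced segments, calling is_speech once per frame."""
    ring = collections.deque(maxlen=padding_frames)  # (frame, decision) pairs
    triggered = False
    voiced = []
    for frame in frames:
        decision = is_speech(frame)  # the only call made for this frame
        ring.append((frame, decision))
        if not triggered:
            if sum(1 for _, d in ring if d) > ratio * ring.maxlen:
                triggered = True
                voiced.extend(f for f, _ in ring)
                ring.clear()
        else:
            voiced.append(frame)
            if sum(1 for _, d in ring if not d) > ratio * ring.maxlen:
                triggered = False
                yield voiced
                ring.clear()
                voiced = []
    if voiced:
        yield voiced


# Stand-in classifier: frames are just indices into a fixed decision list.
decisions = [False] * 5 + [True] * 20 + [False] * 20
calls = []


def fake_is_speech(i):
    calls.append(i)
    return decisions[i]


segments = list(collect_with_cached_decisions(range(len(decisions)), fake_is_speech))
print(len(calls), [(s[0], s[-1]) for s in segments])
```

This also bears on the question above about results changing when the num_voiced line is commented out, since that line re-classifies every frame in the ring buffer on each iteration.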

Couldn't install on python 3.6 windows 7

I got the following error after running the command
pip install webrtcvad

- --python-tag cp36:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-3.6
  copying webrtcvad.py -> build\lib.win-amd64-3.6
  running build_ext
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "C:\Users\engadmin\AppData\Local\Temp\pip-build-eqlnrdtl\webrtcvad\setu
p.py", line 87, in <module>
      'psutil', 'memory_profiler']
    File "C:\Users\engadmin\Anaconda3\lib\distutils\core.py", line 148, in setup

      dist.run_commands()
    File "C:\Users\engadmin\Anaconda3\lib\distutils\dist.py", line 955, in run_c
ommands
      self.run_command(cmd)
    File "C:\Users\engadmin\Anaconda3\lib\distutils\dist.py", line 974, in run_c
ommand
      cmd_obj.run()
    File "c:\users\engadmin\anaconda3\lib\site-packages\wheel\bdist_wheel.py", l
ine 179, in run
      self.run_command('build')
    File "C:\Users\engadmin\Anaconda3\lib\distutils\cmd.py", line 313, in run_co
mmand
      self.distribution.run_command(command)
    File "C:\Users\engadmin\Anaconda3\lib\distutils\dist.py", line 974, in run_c
ommand
      cmd_obj.run()
    File "C:\Users\engadmin\Anaconda3\lib\distutils\command\build.py", line 135,
 in run
      self.run_command(cmd_name)
    File "C:\Users\engadmin\Anaconda3\lib\distutils\cmd.py", line 313, in run_co
mmand
      self.distribution.run_command(command)
    File "C:\Users\engadmin\Anaconda3\lib\distutils\dist.py", line 974, in run_c
ommand
      cmd_obj.run()
    File "c:\users\engadmin\anaconda3\lib\site-packages\setuptools\command\build
_ext.py", line 75, in run
      _build_ext.run(self)
    File "c:\users\engadmin\anaconda3\lib\site-packages\Cython\Distutils\old_bui
ld_ext.py", line 185, in run
      _build_ext.build_ext.run(self)
    File "C:\Users\engadmin\Anaconda3\lib\distutils\command\build_ext.py", line
308, in run
      force=self.force)
    File "C:\Users\engadmin\Anaconda3\lib\distutils\ccompiler.py", line 1031, in
 new_compiler
      return klass(None, dry_run, force)
    File "C:\Users\engadmin\Anaconda3\lib\distutils\cygwinccompiler.py", line 28
2, in __init__
      CygwinCCompiler.__init__ (self, verbose, dry_run, force)
    File "C:\Users\engadmin\Anaconda3\lib\distutils\cygwinccompiler.py", line 15
7, in __init__
      self.dll_libraries = get_msvcr()
    File "C:\Users\engadmin\Anaconda3\lib\distutils\cygwinccompiler.py", line 86
, in get_msvcr
      raise ValueError("Unknown MS Compiler version %s " % msc_ver)
  ValueError: Unknown MS Compiler version 1900

  ----------------------------------------
  Failed building wheel for webrtcvad
  Running setup.py clean for webrtcvad
Failed to build webrtcvad
Installing collected packages: webrtcvad
  Running setup.py install for webrtcvad ... error
    Complete output from command C:\Users\engadmin\Anaconda3\python.exe -u -c "i
mport setuptools, tokenize;__file__='C:\\Users\\engadmin\\AppData\\Local\\Temp\\
pip-build-eqlnrdtl\\webrtcvad\\setup.py';f=getattr(tokenize, 'open', open)(__fil
e__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__,
'exec'))" install --record C:\Users\engadmin\AppData\Local\Temp\pip-k7jjwudm-rec
ord\install-record.txt --single-version-externally-managed --compile:
    running install
    running build
    running build_py
    creating build
    creating build\lib.win-amd64-3.6
    copying webrtcvad.py -> build\lib.win-amd64-3.6
    running build_ext
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\engadmin\AppData\Local\Temp\pip-build-eqlnrdtl\webrtcvad\se
tup.py", line 87, in <module>
        'psutil', 'memory_profiler']
      File "C:\Users\engadmin\Anaconda3\lib\distutils\core.py", line 148, in set
up
        dist.run_commands()
      File "C:\Users\engadmin\Anaconda3\lib\distutils\dist.py", line 955, in run
_commands
        self.run_command(cmd)
      File "C:\Users\engadmin\Anaconda3\lib\distutils\dist.py", line 974, in run
_command
        cmd_obj.run()
      File "c:\users\engadmin\anaconda3\lib\site-packages\setuptools\command\ins
tall.py", line 61, in run
        return orig.install.run(self)
      File "C:\Users\engadmin\Anaconda3\lib\distutils\command\install.py", line
545, in run
        self.run_command('build')
      File "C:\Users\engadmin\Anaconda3\lib\distutils\cmd.py", line 313, in run_
command
        self.distribution.run_command(command)
      File "C:\Users\engadmin\Anaconda3\lib\distutils\dist.py", line 974, in run
_command
        cmd_obj.run()
      File "C:\Users\engadmin\Anaconda3\lib\distutils\command\build.py", line 13
5, in run
        self.run_command(cmd_name)
      File "C:\Users\engadmin\Anaconda3\lib\distutils\cmd.py", line 313, in run_
command
        self.distribution.run_command(command)
      File "C:\Users\engadmin\Anaconda3\lib\distutils\dist.py", line 974, in run
_command
        cmd_obj.run()
      File "c:\users\engadmin\anaconda3\lib\site-packages\setuptools\command\bui
ld_ext.py", line 75, in run
        _build_ext.run(self)
      File "c:\users\engadmin\anaconda3\lib\site-packages\Cython\Distutils\old_b
uild_ext.py", line 185, in run
        _build_ext.build_ext.run(self)
      File "C:\Users\engadmin\Anaconda3\lib\distutils\command\build_ext.py", lin
e 308, in run
        force=self.force)
      File "C:\Users\engadmin\Anaconda3\lib\distutils\ccompiler.py", line 1031,
in new_compiler
        return klass(None, dry_run, force)
      File "C:\Users\engadmin\Anaconda3\lib\distutils\cygwinccompiler.py", line
282, in __init__
        CygwinCCompiler.__init__ (self, verbose, dry_run, force)
      File "C:\Users\engadmin\Anaconda3\lib\distutils\cygwinccompiler.py", line
157, in __init__
        self.dll_libraries = get_msvcr()
      File "C:\Users\engadmin\Anaconda3\lib\distutils\cygwinccompiler.py", line
86, in get_msvcr
        raise ValueError("Unknown MS Compiler version %s " % msc_ver)
    ValueError: Unknown MS Compiler version 1900

    ----------------------------------------
Command "C:\Users\engadmin\Anaconda3\python.exe -u -c "import setuptools, tokeni
ze;__file__='C:\\Users\\engadmin\\AppData\\Local\\Temp\\pip-build-eqlnrdtl\\webr
tcvad\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().repla
ce('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --reco
rd C:\Users\engadmin\AppData\Local\Temp\pip-k7jjwudm-record\install-record.txt -
-single-version-externally-managed --compile" failed with error code 1 in C:\Use
rs\engadmin\AppData\Local\Temp\pip-build-eqlnrdtl\webrtcvad\

Is there any way to distinguish silence and noise from the input?

Why is there a variation in output with various audio readers?

Hi,
I have been trying to read various .wav files with different Python libraries. I tried wave, as you did, and also librosa, ffmpeg and soundfile, but with different libraries I get a different number of chunks in the output. Can you please explain why the VAD works best with the wave reader compared with the other audio readers?
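A likely cause, stated as an assumption since the reading code is not shown: the VAD expects raw 16-bit PCM bytes, which is what `wave` returns, whereas librosa and soundfile return floating-point samples by default, and librosa also resamples to 22050 Hz unless told otherwise. A minimal sketch of converting float samples in the range [-1.0, 1.0] to 16-bit PCM with the standard library; `float_to_pcm16` is a hypothetical helper:

```python
import struct


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes.

    Values outside the range are clipped before scaling.
    """
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


pcm = float_to_pcm16([0.0, 0.5, -1.0, 2.0])
print(len(pcm), struct.unpack("<4h", pcm))
```

The sample rate passed to is_speech must also match the rate the reader actually produced.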

[start time, end time] of extracted segments

Thanks a lot for sharing your code.

I would like to know how I can get the [start time, end time] data of extracted segments. I am running "example.py".
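example.py already writes the boundaries to stdout as `+(start)` and `-(end)` markers inside the 0/1 stream. To get them as data, one option is to derive them from per-frame decisions. A minimal sketch, assuming you have one boolean per frame and a fixed frame duration; `speech_segments` is a hypothetical helper and it applies no padding or smoothing:

```python
def speech_segments(decisions, frame_ms):
    """Turn per-frame booleans into [start_seconds, end_seconds] pairs."""
    segments = []
    start = None  # index of the first frame of the current voiced run
    for i, voiced in enumerate(decisions):
        if voiced and start is None:
            start = i
        elif not voiced and start is not None:
            segments.append([start * frame_ms / 1000.0, i * frame_ms / 1000.0])
            start = None
    if start is not None:  # audio ended while still in a voiced run
        segments.append([start * frame_ms / 1000.0, len(decisions) * frame_ms / 1000.0])
    return segments


flags = [False, True, True, False, False, True]
print(speech_segments(flags, 30))
```

Pairing each [start, end] with the chunk file name written at the same point would also give the file-to-timestamp mapping asked about in the first issue on this page.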

README correction - also supports 48kHz

I was looking at the source and noticed that the chromium webrtc vad library also supports 48kHz audio, which is not mentioned in the README.

Might save some headaches if you add it to the README and the test for valid sample rates.

Edit: probably should've included a link to the source in case I'm misunderstanding something.

Get vad score

Is it possible to get the VAD score? I mean, by deleting these lines:

py-webrtcvad/cbits/webrtc/common_audio/vad/webrtc_vad.c

Lines 86 to 88 in 3b39545

 if (vad > 0) {
   vad = 1;
 }

I would like to set a custom threshold to consider that there is speech or not.

Thanks!

System Crash

Hi All,
I get a system crash whenever I try to execute these lines:
frames = frame_generator(30, audio, sample_rate)
frames = list(frames)

It seems frames = list(frames) takes up all the RAM (16 GB).

Please advise.
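Two things may be worth checking here, both assumptions since the values of audio and sample_rate are not shown. First, if the computed frame size comes out as zero bytes (for example because sample_rate is 0), a loop that advances by the frame size never ends, and list() would keep growing. Second, the frames do not have to be materialised at all; they can be consumed one at a time. A minimal sketch covering both; `lazy_frames` is a hypothetical stand-in, not the frame_generator from example.py:

```python
def lazy_frames(frame_ms, audio, sample_rate, bytes_per_sample=2):
    """Lazily yield fixed-size frames; refuses a zero-length frame size."""
    n = int(sample_rate * frame_ms / 1000) * bytes_per_sample
    if n <= 0:
        raise ValueError("frame size is %d bytes; check sample_rate and frame_ms" % n)
    for offset in range(0, len(audio) - n + 1, n):
        yield audio[offset:offset + n]


audio = bytes(16000 * 2)  # one second of silence at 16 kHz, 16-bit
# Iterate without building a list; only one frame is held at a time.
count = sum(1 for _ in lazy_frames(30, audio, 16000))
print(count)
```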

macos can't install

Hey guys. Thank you for your great work.

Installation was easy on other systems, but on macOS I ran into a problem.

1. pip install fails

Collecting webrtcvad
  Using cached https://files.pythonhosted.org/packages/89/34/e2de2d97f3288512b9ea56f92e7452f8207eb5a0096500badf9dfd48f5e6/webrtcvad-2.0.10.tar.gz
Building wheels for collected packages: webrtcvad
  Building wheel for webrtcvad (setup.py) ... error
  ERROR: Complete output from command /Users/liuxinxin/anaconda3/bin/python -u -c 'import setuptools, tokenize;__file__='"'"'/private/var/folders/78/tp45tbbn6pn1yvhmk_wnzgvc0000gn/T/pip-install-btcmvsjj/webrtcvad/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /private/var/folders/78/tp45tbbn6pn1yvhmk_wnzgvc0000gn/T/pip-wheel-i9d1b_vk --python-tag cp36:
  ERROR: running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.macosx-10.7-x86_64-3.6
  copying webrtcvad.py -> build/lib.macosx-10.7-x86_64-3.6
  running build_ext
  building '_webrtcvad' extension
  creating build/temp.macosx-10.7-x86_64-3.6
  creating build/temp.macosx-10.7-x86_64-3.6/cbits
  creating build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc
  creating build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio
  creating build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing
  creating build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/vad
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/pywebrtcvad.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/pywebrtcvad.o
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/complex_fft.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/complex_fft.o
  In file included from cbits/webrtc/common_audio/signal_processing/complex_fft.c:19:0:
  cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   void WebRtcSpl_Init();
   ^
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/resample_by_2_internal.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/resample_by_2_internal.o
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/energy.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/energy.o
  In file included from cbits/webrtc/common_audio/signal_processing/energy.c:18:0:
  cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   void WebRtcSpl_Init();
   ^
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/downsample_fast.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/downsample_fast.o
  In file included from cbits/webrtc/common_audio/signal_processing/downsample_fast.c:11:0:
  cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   void WebRtcSpl_Init();
   ^
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/spl_init.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/spl_init.o
  In file included from cbits/webrtc/common_audio/signal_processing/spl_init.c:17:0:
  cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   void WebRtcSpl_Init();
   ^
  cbits/webrtc/common_audio/signal_processing/spl_init.c:34:13: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   static void InitPointersToC() {
               ^
  cbits/webrtc/common_audio/signal_processing/spl_init.c:140:6: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   void WebRtcSpl_Init() {
        ^
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/cross_correlation.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/cross_correlation.o
  In file included from cbits/webrtc/common_audio/signal_processing/cross_correlation.c:11:0:
  cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   void WebRtcSpl_Init();
   ^
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/division_operations.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/division_operations.o
  In file included from cbits/webrtc/common_audio/signal_processing/division_operations.c:24:0:
  cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   void WebRtcSpl_Init();
   ^
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/get_scaling_square.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/get_scaling_square.o
  In file included from cbits/webrtc/common_audio/signal_processing/get_scaling_square.c:18:0:
  cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   void WebRtcSpl_Init();
   ^
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/min_max_operations.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/min_max_operations.o
  In file included from cbits/webrtc/common_audio/signal_processing/min_max_operations.c:27:0:
  cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   void WebRtcSpl_Init();
   ^
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/vector_scaling_operations.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/vector_scaling_operations.o
  In file included from cbits/webrtc/common_audio/signal_processing/vector_scaling_operations.c:23:0:
  cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   void WebRtcSpl_Init();
   ^
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/resample_fractional.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/resample_fractional.o
  In file included from cbits/webrtc/common_audio/signal_processing/resample_fractional.c:18:0:
  cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   void WebRtcSpl_Init();
   ^
  gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/real_fft.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/real_fft.o
  In file included from /usr/include/Availability.h:236:0,
                   from /usr/include/stdlib.h:61,
                   from cbits/webrtc/common_audio/signal_processing/real_fft.c:13:
  /usr/include/AvailabilityInternal.h:33:18: error: missing binary operator before token "("
   #if __has_include(<AvailabilityInternalPrivate.h>)
                    ^
  In file included from /usr/include/stdlib.h:61:0,
                   from cbits/webrtc/common_audio/signal_processing/real_fft.c:13:
  /usr/include/Availability.h:497:18: error: missing binary operator before token "("
   #if __has_include(<AvailabilityProhibitedInternal.h>)
                    ^
  In file included from cbits/webrtc/common_audio/signal_processing/real_fft.c:15:0:
  cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
   void WebRtcSpl_Init();
   ^
  error: command 'gcc' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for webrtcvad
  Running setup.py clean for webrtcvad
Failed to build webrtcvad
Installing collected packages: webrtcvad
  Running setup.py install for webrtcvad ... error
    ERROR: Complete output from command /Users/liuxinxin/anaconda3/bin/python -u -c 'import setuptools, tokenize;__file__='"'"'/private/var/folders/78/tp45tbbn6pn1yvhmk_wnzgvc0000gn/T/pip-install-btcmvsjj/webrtcvad/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/78/tp45tbbn6pn1yvhmk_wnzgvc0000gn/T/pip-record-mkmcvora/install-record.txt --single-version-externally-managed --compile:
    ERROR: running install
    running build
    running build_py
    creating build
    creating build/lib.macosx-10.7-x86_64-3.6
    copying webrtcvad.py -> build/lib.macosx-10.7-x86_64-3.6
    running build_ext
    building '_webrtcvad' extension
    creating build/temp.macosx-10.7-x86_64-3.6
    creating build/temp.macosx-10.7-x86_64-3.6/cbits
    creating build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc
    creating build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio
    creating build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing
    creating build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/vad
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/pywebrtcvad.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/pywebrtcvad.o
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/complex_fft.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/complex_fft.o
    In file included from cbits/webrtc/common_audio/signal_processing/complex_fft.c:19:0:
    cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     void WebRtcSpl_Init();
     ^
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/resample_by_2_internal.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/resample_by_2_internal.o
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/energy.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/energy.o
    In file included from cbits/webrtc/common_audio/signal_processing/energy.c:18:0:
    cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     void WebRtcSpl_Init();
     ^
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/downsample_fast.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/downsample_fast.o
    In file included from cbits/webrtc/common_audio/signal_processing/downsample_fast.c:11:0:
    cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     void WebRtcSpl_Init();
     ^
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/spl_init.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/spl_init.o
    In file included from cbits/webrtc/common_audio/signal_processing/spl_init.c:17:0:
    cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     void WebRtcSpl_Init();
     ^
    cbits/webrtc/common_audio/signal_processing/spl_init.c:34:13: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     static void InitPointersToC() {
                 ^
    cbits/webrtc/common_audio/signal_processing/spl_init.c:140:6: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     void WebRtcSpl_Init() {
          ^
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/cross_correlation.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/cross_correlation.o
    In file included from cbits/webrtc/common_audio/signal_processing/cross_correlation.c:11:0:
    cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     void WebRtcSpl_Init();
     ^
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/division_operations.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/division_operations.o
    In file included from cbits/webrtc/common_audio/signal_processing/division_operations.c:24:0:
    cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     void WebRtcSpl_Init();
     ^
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/get_scaling_square.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/get_scaling_square.o
    In file included from cbits/webrtc/common_audio/signal_processing/get_scaling_square.c:18:0:
    cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     void WebRtcSpl_Init();
     ^
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/min_max_operations.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/min_max_operations.o
    In file included from cbits/webrtc/common_audio/signal_processing/min_max_operations.c:27:0:
    cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     void WebRtcSpl_Init();
     ^
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/vector_scaling_operations.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/vector_scaling_operations.o
    In file included from cbits/webrtc/common_audio/signal_processing/vector_scaling_operations.c:23:0:
    cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     void WebRtcSpl_Init();
     ^
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/resample_fractional.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/resample_fractional.o
    In file included from cbits/webrtc/common_audio/signal_processing/resample_fractional.c:18:0:
    cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     void WebRtcSpl_Init();
     ^
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/real_fft.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/real_fft.o
    In file included from /usr/include/Availability.h:236:0,
                     from /usr/include/stdlib.h:61,
                     from cbits/webrtc/common_audio/signal_processing/real_fft.c:13:
    /usr/include/AvailabilityInternal.h:33:18: error: missing binary operator before token "("
     #if __has_include(<AvailabilityInternalPrivate.h>)
                      ^
    In file included from /usr/include/stdlib.h:61:0,
                     from cbits/webrtc/common_audio/signal_processing/real_fft.c:13:
    /usr/include/Availability.h:497:18: error: missing binary operator before token "("
     #if __has_include(<AvailabilityProhibitedInternal.h>)
                      ^
    In file included from cbits/webrtc/common_audio/signal_processing/real_fft.c:15:0:
    cbits/webrtc/common_audio/signal_processing/include/signal_processing_library.h:115:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
     void WebRtcSpl_Init();
     ^
    error: command 'gcc' failed with exit status 1
    ----------------------------------------
ERROR: Command "/Users/liuxinxin/anaconda3/bin/python -u -c 'import setuptools, tokenize;__file__='"'"'/private/var/folders/78/tp45tbbn6pn1yvhmk_wnzgvc0000gn/T/pip-install-btcmvsjj/webrtcvad/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/78/tp45tbbn6pn1yvhmk_wnzgvc0000gn/T/pip-record-mkmcvora/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/var/folders/78/tp45tbbn6pn1yvhmk_wnzgvc0000gn/T/pip-install-btcmvsjj/webrtcvad/

2 download install error

running install
running bdist_egg
running egg_info
writing webrtcvad.egg-info/PKG-INFO
writing dependency_links to webrtcvad.egg-info/dependency_links.txt
writing requirements to webrtcvad.egg-info/requires.txt
writing top-level names to webrtcvad.egg-info/top_level.txt
reading manifest file 'webrtcvad.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'webrtcvad.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.7-x86_64/egg
running install_lib
running build_py
running build_ext
building '_webrtcvad' extension
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/pywebrtcvad.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/pywebrtcvad.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/complex_fft.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/complex_fft.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/resample_by_2_internal.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/resample_by_2_internal.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/energy.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/energy.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/downsample_fast.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/downsample_fast.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/spl_init.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/spl_init.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/cross_correlation.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/cross_correlation.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/division_operations.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/division_operations.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/get_scaling_square.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/get_scaling_square.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/liuxinxin/anaconda3/include -arch x86_64 -I/Users/liuxinxin/anaconda3/include -arch x86_64 -DWEBRTC_POSIX -Icbits -I/Users/liuxinxin/anaconda3/include/python3.6m -c cbits/webrtc/common_audio/signal_processing/min_max_operations.c -o build/temp.macosx-10.7-x86_64-3.6/cbits/webrtc/common_audio/signal_processing/min_max_operations.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
In file included from /usr/include/Availability.h:236:0,
                 from /usr/include/stdlib.h:61,
                 from cbits/webrtc/common_audio/signal_processing/min_max_operations.c:27:
/usr/include/AvailabilityInternal.h:33:18: error: missing binary operator before token "("
 #if __has_include(<AvailabilityInternalPrivate.h>)
                  ^
In file included from /usr/include/stdlib.h:61:0,
                 from cbits/webrtc/common_audio/signal_processing/min_max_operations.c:27:
/usr/include/Availability.h:497:18: error: missing binary operator before token "("
 #if __has_include(<AvailabilityProhibitedInternal.h>)
                  ^
error: command 'gcc' failed with exit status 1
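
A possible reading of the two logs above: the `__has_include` errors come from the macOS system headers, and the style of the diagnostics suggests that the `gcc` found on PATH is an older GNU GCC rather than Apple's clang. That is an assumption, not something the logs state outright. The sketch below, with hypothetical helper names, shows one way to check which compiler family `gcc` resolves to.

```python
import shutil
import subprocess


def classify_compiler(version_text):
    """Guess the compiler family from `--version` output (heuristic only)."""
    text = version_text.lower()
    if "clang" in text or "llvm" in text:
        return "clang"
    if "free software foundation" in text or "gcc" in text:
        return "gnu-gcc"
    return "unknown"


def compiler_on_path(name="gcc"):
    """Return (path, family) for the first `name` on PATH, or None if absent."""
    path = shutil.which(name)
    if path is None:
        return None
    # Ask the executable to identify itself; some tools print to stderr.
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return path, classify_compiler(result.stdout + result.stderr)


if __name__ == "__main__":
    print(compiler_on_path("gcc"))
```

If this reports `gnu-gcc`, one thing to try is pointing the build at clang, for example by setting the `CC` environment variable to `clang` before running pip.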

Use webrtcvad in python 3.5

Dear John Wiseman,
I'm trying to use the voice activity detector (webrtcvad) in Python 3.5, but I can't because of a version problem. Is it possible to use it there? I have always used it with Python 2.7, where it works very efficiently.

The complete error is:
ImportError Traceback (most recent call last)
in ()
import webrtcvad

ImportError: Module use of python27.dll conflicts with this version of Python.

pip install webrtcvad error

ERROR: Command errored out with exit status 1:
command: 'C:\Users\tuvshintugs.b\Anaconda3\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\TUVSHI~1.B\AppData\Local\Temp\pip-install-z3ovods6\webrtcvad\setup.py'"'"'; __file__='"'"'C:\Users\TUVSHI~1.B\AppData\Local\Temp\pip-install-z3ovods6\webrtcvad\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\TUVSHI~1.B\AppData\Local\Temp\pip-record-rz7zvt7g\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\tuvshintugs.b\Anaconda3\Include\webrtcvad'
cwd: C:\Users\TUVSHI~1.B\AppData\Local\Temp\pip-install-z3ovods6\webrtcvad
Complete output (19 lines):
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.7
copying webrtcvad.py -> build\lib.win-amd64-3.7
running build_ext
building '_webrtcvad' extension
creating build\temp.win-amd64-3.7
creating build\temp.win-amd64-3.7\Release
creating build\temp.win-amd64-3.7\Release\cbits
creating build\temp.win-amd64-3.7\Release\cbits\webrtc
creating build\temp.win-amd64-3.7\Release\cbits\webrtc\common_audio
creating build\temp.win-amd64-3.7\Release\cbits\webrtc\common_audio\signal_processing
creating build\temp.win-amd64-3.7\Release\cbits\webrtc\common_audio\vad
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -D_WIN32 -Icbits -IC:\Users\tuvshintugs.b\Anaconda3\include -IC:\Users\tuvshintugs.b\Anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" /Tccbits\pywebrtcvad.c /Fobuild\temp.win-amd64-3.7\Release\cbits\pywebrtcvad.obj
pywebrtcvad.c
c:\users\tuvshintugs.b\anaconda3\include\pyconfig.h(203): fatal error C1083: Cannot open include file: 'basetsd.h': No such file or directory
error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe' failed with exit status 2
----------------------------------------
ERROR: Command errored out with exit status 1: 'C:\Users\tuvshintugs.b\Anaconda3\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\TUVSHI~1.B\AppData\Local\Temp\pip-install-z3ovods6\webrtcvad\setup.py'"'"'; __file__='"'"'C:\Users\TUVSHI~1.B\AppData\Local\Temp\pip-install-z3ovods6\webrtcvad\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\TUVSHI~1.B\AppData\Local\Temp\pip-record-rz7zvt7g\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\tuvshintugs.b\Anaconda3\Include\webrtcvad' Check the logs for full command output.

import _webrtcvad error

Hi,

I'm trying to use py-webrtcvad to find the start time of each word in a 48 kHz, single-channel WAV file of continuous speech, but I can't run example.py: it fails with an import error. Please help me get the package working.

Thanks & Regards,
Kavya B
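
Setting the import error aside (the report does not include the traceback), the timing part of this question can be sketched independently of the library. As far as I know, the VAD labels fixed-length frames as speech or non-speech rather than locating individual words, so the closest thing to a "start time" is the moment a run of speech frames begins. The function name and the 30 ms frame length below are assumptions for illustration.

```python
def speech_onsets(flags, frame_ms=30):
    """Return the start times (in seconds) of each run of speech frames.

    `flags` is a sequence of booleans, one per consecutive frame, such as
    the per-frame results a voice activity detector would produce.
    """
    onsets = []
    previous = False
    for index, flag in enumerate(flags):
        # A new speech run starts where a speech frame follows non-speech.
        if flag and not previous:
            onsets.append(index * frame_ms / 1000.0)
        previous = flag
    return onsets


# Frames 1-2 and frame 4 are speech, so two runs begin at 30 ms and 120 ms.
print(speech_onsets([False, True, True, False, True]))
```

Short pauses inside a sentence will split it into several runs, so some smoothing over neighbouring frames is usually needed before treating these onsets as word or utterance boundaries.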

How did you release the package to PyPi

Hi, I found that simply releasing the package with the following commands

python setup.py sdist bdist_wheel
twine upload --repository-url https://test.pypi.org/legacy/ dist/*

gives a platform tag error like

Binary wheel 'webrtcvad-2.0.11.dev0-cp36-cp36m-linux_x86_64.whl' has an unsupported platform tag 'linux_x86_64'. for url: https://test.pypi.org/legacy/

Have you ever encountered this problem, and how did you solve it?

using webrtcvad in realtime application

Hi, I was curious whether this tool can be used in a real-time application, that is, whether it can detect voice coming directly from a microphone in a noisy environment. If so, I would be glad if anyone could share or point me to such an implementation.
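
For what it's worth, the detector works on short fixed-length frames, so a live source mainly needs a buffer that re-slices whatever chunk sizes the microphone delivers into frames of the right length. The sketch below shows only that buffering step. The function names are hypothetical, the audio is assumed to be 16-bit mono PCM, and the classifier is passed in as a plain callable so the example has no dependencies.

```python
def frames_from_chunks(chunks, sample_rate=16000, frame_ms=30):
    """Re-slice arbitrary-sized byte chunks into fixed-size PCM frames."""
    frame_bytes = int(sample_rate * frame_ms / 1000) * 2  # 2 bytes per sample
    pending = b""
    for chunk in chunks:
        pending += chunk
        # Emit every complete frame currently in the buffer.
        while len(pending) >= frame_bytes:
            yield pending[:frame_bytes]
            pending = pending[frame_bytes:]
    # Any trailing partial frame is dropped.


def detect_stream(chunks, is_speech, sample_rate=16000, frame_ms=30):
    """Yield (start_time_s, speech_flag) for each complete frame."""
    frames = frames_from_chunks(chunks, sample_rate, frame_ms)
    for index, frame in enumerate(frames):
        yield index * frame_ms / 1000.0, is_speech(frame, sample_rate)
```

With the library installed, `is_speech` could be `webrtcvad.Vad(3).is_speech`, and `chunks` could be the read loop of whatever microphone library is in use. How well it copes with a noisy room is something to measure rather than assume.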

why divide duration by 2 in example.py

Hi,
In frame_generator of example.py, why is it written as follows?
n = int(sample_rate * (frame_duration_ms / 1000.0) * 2)
duration = (float(n) / sample_rate) / 2.0
It seems that the resulting timestamp will be half of the real timestamp. But the weird thing is that if I delete the *2 and /2.0, webrtcvad can't process the frame.
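
One likely reading, assuming example.py feeds 16-bit mono PCM (2 bytes per sample): `n` is a byte count rather than a sample count. The `* 2` converts samples to bytes, and the `/ 2.0` converts bytes back to samples, so `duration` comes out as the real frame length and is not halved. The helper names below are hypothetical and only restate that arithmetic.

```python
BYTES_PER_SAMPLE = 2  # assumption: 16-bit PCM, as read from a WAV file


def frame_size_bytes(sample_rate, frame_duration_ms):
    """Number of bytes in one frame of 16-bit mono PCM."""
    samples = int(sample_rate * frame_duration_ms / 1000.0)
    return samples * BYTES_PER_SAMPLE


def frame_duration_s(n_bytes, sample_rate):
    """Duration in seconds of n_bytes of 16-bit mono PCM."""
    return (float(n_bytes) / BYTES_PER_SAMPLE) / sample_rate


n = frame_size_bytes(16000, 30)         # 480 samples, stored in 960 bytes
duration = frame_duration_s(n, 16000)   # 0.03 s, the full 30 ms
print(n, duration)
```

This would also explain the failure after removing the factor: at 16 kHz, a 480-byte slice holds only 240 samples, or 15 ms, and the WebRTC VAD accepts only 10, 20, or 30 ms frames.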

ERROR: Command errored out with exit status 1:

Collecting webrtcvad
Using cached webrtcvad-2.0.10.tar.gz (66 kB)
Building wheels for collected packages: webrtcvad
Building wheel for webrtcvad (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: 'c:\users\asteroids\anaconda3\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\ASTERO~1\AppData\Local\Temp\pip-install-o2x864lh\webrtcvad\setup.py'"'"'; __file__='"'"'C:\Users\ASTERO~1\AppData\Local\Temp\pip-install-o2x864lh\webrtcvad\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\ASTERO~1\AppData\Local\Temp\pip-wheel-wvucsn6_'
cwd: C:\Users\ASTERO~1\AppData\Local\Temp\pip-install-o2x864lh\webrtcvad
Complete output (9 lines):
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-3.7
copying webrtcvad.py -> build\lib.win-amd64-3.7
running build_ext
building '_webrtcvad' extension
error: Microsoft Visual C++ 14.0 is required. Get it with "Build Tools for Visual Studio": https://visualstudio.microsoft.com/downloads/

ERROR: Failed building wheel for webrtcvad
Running setup.py clean for webrtcvad
Failed to build webrtcvad
Installing collected packages: webrtcvad
Running setup.py install for webrtcvad ... error
ERROR: Command errored out with exit status 1:
command: 'c:\users\asteroids\anaconda3\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\ASTERO~~1\AppData\Local\Temp\pip-install-o2x864lh\webrtcvad\setup.py'"'"'; file='"'"'C:\Users\ASTERO~~1\AppData\Local\Temp\pip-install-o2x864lh\webrtcvad\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\ASTERO1\AppData\Local\Temp\pip-record-ra8zvocp\install-record.txt' --single-version-externally-managed --compile --install-headers 'c:\users\asteroids\anaconda3\Include\webrtcvad'
cwd: C:\Users\ASTERO1\AppData\Local\Temp\pip-install-o2x864lh\webrtcvad
Complete output (9 lines):
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.7
copying webrtcvad.py -> build\lib.win-amd64-3.7
running build_ext
building '_webrtcvad' extension
error: Microsoft Visual C++ 14.0 is required. Get it with "Build Tools for Visual Studio": https://visualstudio.microsoft.com/downloads/
----------------------------------------
ERROR: Command errored out with exit status 1: 'c:\users\asteroids\anaconda3\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\ASTERO~~1\AppData\Local\Temp\pip-install-o2x864lh\webrtcvad\setup.py'"'"'; file='"'"'C:\Users\ASTERO~~1\AppData\Local\Temp\pip-install-o2x864lh\webrtcvad\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\ASTERO~1\AppData\Local\Temp\pip-record-ra8zvocp\install-record.txt' --single-version-externally-managed --compile --install-headers 'c:\users\asteroids\anaconda3\Include\webrtcvad' Check the logs for full command output.

help...

PortAudio error

Am I the only one consistently bumping into https://app.assembla.com/spaces/portaudio/tickets/268/details on all my machines at _webrtcvad.process? There are patches for this PortAudio issue, but none on its trunk, so I'm wondering whether other people have a setup where they can use the current project (py-webrtcvad) without bumping into it. Ubuntu 18.04 and 20.04 here.

Webrtcvad with sample width 1 instead of 2

I have an 8-bit recording but it seems like the webrtcvad only supports 16-bit recordings. Is there any way to get 8-bit recordings to work other than converting them to 16-bit representation?
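The README states that the VAD only accepts 16-bit mono PCM, so conversion appears to be unavoidable. It can be done in memory without writing a new file. A minimal sketch, assuming the 8-bit data is unsigned PCM as stored in WAV files (silence at 128); `pcm8_to_pcm16` is a hypothetical helper name.

```python
import struct


def pcm8_to_pcm16(data):
    """Convert unsigned 8-bit PCM bytes to signed 16-bit little-endian PCM bytes."""
    # Shift the midpoint from 128 to 0, then scale to the 16-bit range.
    samples = [(b - 128) << 8 for b in data]
    return struct.pack('<%dh' % len(samples), *samples)
```

The output has twice as many bytes as the input, so frame sizes must be computed with a sample width of 2 after conversion.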

VAD quality

The readme says:

The VAD that Google developed for the WebRTC project is reportedly one of the best available, being fast, modern and free.

However, I was unable to observe promising accuracy at any aggressiveness level (0-3). Is this statement based on any kind of benchmark or publication? Have you experienced useful accuracy levels in your setup using py-webrtcvad?

win10 python3 pip install error

OS: win10 64
python 3.6 64

cmd: pip install webrtcvad

I installed the Visual C++ Build Tools and set up the environment variables according to webrtcvad's error message:
INCLUDE:

C:\Program Files (x86)\Windows Kits\8.1\Include\um
C:\Program Files (x86)\Windows Kits\8.1\Include
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\include
C:\Program Files (x86)\Windows Kits\8.1\Include\shared
C:\Program Files (x86)\Windows Kits\10\Include\10.0.10240.0\ucrt

LIB:

C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\lib
C:\Program Files (x86)\Windows Kits\8.1\Lib\winv6.3\um
C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\ucrt
C:\Program Files (x86)\Windows Kits\8.1\Lib\winv6.3\um\x86
C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\ucrt\x86

PATH:

C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin

but ....

error info:

webrtc_vad.c
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:D:\tools\anaconda3\envs\py36\libs /LIBPATH:D:\tools\anaconda3\envs\py36\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\ucrt\x86" "/LIBPATH:C:\Program Files (x86)\Windows Kits\8.1\Lib\winv6.3\um\x86" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\lib" "/LIBPATH:C:\Program Files (x86)\Windows Kits\8.1\Lib" "/LIBPATH:C:\Program Files (x86)\Windows Kits\8.1\Lib\winv6.3\um" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\Lib" /EXPORT:PyInit__webrtcvad build\temp.win-amd64-3.6\Release\cbits\pywebrtcvad.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\complex_bit_reverse.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\complex_fft.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\cross_correlation.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\division_operations.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\downsample_fast.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\energy.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\get_scaling_square.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\min_max_operations.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\real_fft.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\resample_48khz.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\resample_by_2_internal.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\resample_fractional.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\spl_init.obj 
build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\signal_processing\vector_scaling_operations.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\vad\vad_core.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\vad\vad_filterbank.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\vad\vad_gmm.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\vad\vad_sp.obj build\temp.win-amd64-3.6\Release\cbits\webrtc\common_audio\vad\webrtc_vad.obj /OUT:build\lib.win-amd64-3.6_webrtcvad.cp36-win_amd64.pyd /IMPLIB:build\temp.win-amd64-3.6\Release\cbits_webrtcvad.cp36-win_amd64.lib
Creating library build\temp.win-amd64-3.6\Release\cbits_webrtcvad.cp36-win_amd64.lib and object build\temp.win-amd64-3.6\Release\cbits_webrtcvad.cp36-win_amd64.exp
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp__Py_BuildValue
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp___Py_NoneStruct
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp__PyExc_ValueError
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp__PyErr_Format
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp___Py_FalseStruct
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp__PyCapsule_GetPointer
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp__PyModule_AddObject
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp__PyModule_Create2
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp__PyErr_NewException
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp__PyCapsule_New
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp___Py_TrueStruct
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp__PyBuffer_Release
pywebrtcvad.obj : error LNK2001: unresolved external symbol __imp__PyArg_ParseTuple
build\lib.win-amd64-3.6_webrtcvad.cp36-win_amd64.pyd : fatal error LNK1120: 13 unresolved externals
error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\link.exe' failed with exit status 1120

----------------------------------------

Command "D:\tools\anaconda3\envs\py36\python.exe -u -c "import setuptools, tokenize;file='C:\Users\\AppData\Local\Temp\pip-install-9mghztwb\webrtcvad\setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record C:\Users\\AppData\Local\Temp\pip-record-_fr5rj05\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in C:\Users\**\AppData\Local\Temp\pip-install-9mghztwb\webrtcvad\

Is there a good way to solve this problem, or could you provide a precompiled .whl package?

Thanks

example.py does not return/specify the start-time of each segment in the original input wave file

Hi John,

example.py splits a wave file into segments. I would like to extract the start and stop time of each segment within the original wave file, for example, the exact second at which the second segment starts in the original input wave file.
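One way to recover segment times is to keep the frame index alongside each VAD decision, since frame `i` starts at `i * frame_ms` milliseconds into the file. The sketch below turns a list of per-frame speech flags into start and end times; `speech_segments` is a hypothetical helper, and it does none of the smoothing that example.py's collector applies.

```python
def speech_segments(flags, frame_ms=30):
    """Turn per-frame speech flags into a list of (start_s, end_s) tuples."""
    segments = []
    start = None
    for i, voiced in enumerate(flags):
        if voiced and start is None:
            start = i                          # a segment opens at this frame
        elif not voiced and start is not None:
            segments.append((start * frame_ms / 1000.0, i * frame_ms / 1000.0))
            start = None
    if start is not None:                      # audio ended while still in speech
        segments.append((start * frame_ms / 1000.0,
                         len(flags) * frame_ms / 1000.0))
    return segments
```

The flags would come from calling `vad.is_speech` on consecutive frames of equal length. Times are measured from the start of the audio data.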

Thanks in advance,
Soroosh

can't install

I get this error when I try to install.

Binary wheel for Windows posted on PyPI?

Could we get a binary wheel for Windows posted on PyPI? Anything I could do to help? FYI: https://packaging.python.org/guides/supporting-windows-using-appveyor/

Linux wheels would also be nice, but much less critical. https://docs.travis-ci.com/user/deployment/pypi/

BTW, thanks for the great package!

how to get the wav with the silence stripped?

Hi,
I have a wav file and I just want to use webrtcvad to strip the silence from it.
How can I do that?
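A simple approach is to cut the audio into fixed-size frames, ask the VAD about each one, and concatenate the frames it marks as speech. The sketch below takes the decision function as a parameter so that it does not depend on any particular VAD; `strip_silence` is a hypothetical helper and assumes 16-bit mono PCM.

```python
def strip_silence(pcm, sample_rate, is_speech, frame_ms=30):
    """Return pcm with frames dropped where is_speech(frame, sample_rate) is false.

    pcm is 16-bit mono PCM bytes; a trailing partial frame is discarded.
    """
    frame_bytes = sample_rate * frame_ms // 1000 * 2   # 2 bytes per 16-bit sample
    kept = []
    for offset in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        frame = pcm[offset:offset + frame_bytes]
        if is_speech(frame, sample_rate):
            kept.append(frame)
    return b''.join(kept)
```

With py-webrtcvad, `is_speech` can be the bound method `webrtcvad.Vad(2).is_speech`, and the result can be written back out with the standard `wave` module. Dropping single frames can sound choppy, so some padding around voiced runs is usually worth adding.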

Functional Description of Algorithm

Hi Py-WebrtcVAD-Team

Thank you for wrapping the webrtcvad. Everything worked straightforwardly and the algorithm performed pretty well. I want to integrate the algorithm in an embedded solution. As I'm already calculating features related to the log-power of the six bands used by the VAD on my device, I am interested in the detailed design of the algorithm so that I can adapt it to my features. Do you have any description, like a paper or a flowchart? What I found out from the code is that some sort of adaptive GMM (sequential EM?) is involved. Do you have any further information?

Thank you in advance.

Is there some adaptive technique happening in the back stage?

I have been using py-webrtcvad to detect the presence of voice in many short clips. What I observe is that if I do it this way (pseudocode):

vad1 = webrtcvad.Vad()
for filename in files:
    have_voice.append ( process_by_vad1(filename) )
get_count_have_voice (have_voice)

The resulting count of files with voice is not the same as when I do it this way:

for filename in files:
    vad1 = webrtcvad.Vad()
    have_voice.append ( process_by_vad1(filename) )
get_count_have_voice (have_voice)

So, unless I screwed something up, I guess something adaptive happens inside the Vad class: in the first approach, some coefficients get tuned as I read in more and more files, so the final result differs from the second approach.

Is the assumption correct?

Many Thanks.

Calling the script from Azure Function or Something similar to this in .NET framework

I have a requirement to invoke the example.py script from an Azure Function. Is there any way of doing that?

If not, is there a library similar to py-webrtcvad that can be used in Azure Functions?

Please advise.

PS: This is not an issue; it's more of a question.

Build for linux-ppc64le, pip install fails

I'm working on an IBM Power VM and there seems to be no wheel for it. How would I build one myself?

Python 3.6
Ubuntu 18

Thanks for a great project

   ERROR: Command errored out with exit status 1:
     command: /home/anaconda3/envs/wmlce-1.6.1/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-m8vmxfbt/webrtcvad/setup.py'"'"'; __file__='"'"'/tmp/pip-install-m8vmxfbt/webrtcvad/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-zhq51ldg/install-record.txt --single-version-externally-managed --compile
         cwd: /tmp/pip-install-m8vmxfbt/webrtcvad/
    Complete output (24 lines):
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-ppc64le-3.6
    copying webrtcvad.py -> build/lib.linux-ppc64le-3.6
    running build_ext
    building '_webrtcvad' extension
    creating build/temp.linux-ppc64le-3.6
    creating build/temp.linux-ppc64le-3.6/cbits
    creating build/temp.linux-ppc64le-3.6/cbits/webrtc
    creating build/temp.linux-ppc64le-3.6/cbits/webrtc/common_audio
    creating build/temp.linux-ppc64le-3.6/cbits/webrtc/common_audio/signal_processing
    creating build/temp.linux-ppc64le-3.6/cbits/webrtc/common_audio/vad
    gcc -pthread -B /home/anaconda3/envs/wmlce-1.6.1/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWEBRTC_POSIX -Icbits -I/home/anaconda3/envs/wmlce-1.6.1/include/python3.6m -c cbits/pywebrtcvad.c -o build/temp.linux-ppc64le-3.6/cbits/pywebrtcvad.o
    In file included from cbits/webrtc/common_audio/vad/include/webrtc_vad.h:19:0,
                     from cbits/pywebrtcvad.c:2:
    cbits/webrtc/typedefs.h:51:2: error: #error Please add support for your architecture in typedefs.h
     #error Please add support for your architecture in typedefs.h
      ^~~~~
    cbits/webrtc/typedefs.h:55:2: error: #error Define either WEBRTC_ARCH_LITTLE_ENDIAN or WEBRTC_ARCH_BIG_ENDIAN
     #error Define either WEBRTC_ARCH_LITTLE_ENDIAN or WEBRTC_ARCH_BIG_ENDIAN
      ^~~~~
    error: command 'gcc' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /home/anaconda3/envs/wmlce-1.6.1/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-m8vmxfbt/webrtcvad/setup.py'"'"'; __file__='"'"'/tmp/pip-install-m8vmxfbt/webrtcvad/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-zhq51ldg/install-record.txt --single-version-externally-managed --compile Check the logs for full command output.

Voiced frames duration

Hello, how do I control the duration of the voiced segments?
For now the results are too short.

I want to make them longer, e.g. 1 second of silence, then the voice, and then 1 second of silence at the end.
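If the voiced segments are available as start and end times, one option is to widen each by a fixed margin and merge any that overlap afterwards. A minimal sketch under that assumption; `pad_segments` and its parameters are hypothetical names, not part of example.py.

```python
def pad_segments(segments, pad_s=1.0, total_s=None):
    """Widen (start, end) segments by pad_s on each side and merge overlaps.

    total_s, if given, is the audio length in seconds and caps the end times.
    """
    padded = []
    for start, end in sorted(segments):
        start = max(0.0, start - pad_s)
        end = end + pad_s
        if total_s is not None:
            end = min(total_s, end)
        if padded and start <= padded[-1][1]:
            # Overlaps the previous padded segment: extend it instead.
            padded[-1] = (padded[-1][0], max(padded[-1][1], end))
        else:
            padded.append((start, end))
    return padded
```

The padded times can then be converted to byte offsets (`int(t * sample_rate) * 2` for 16-bit mono audio) to slice the original PCM data.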

what is aggressiveness

hi! first and foremost, thank you! this wrapper has been a life-saver for me. I am a linguist with some programming skills, but none in C/C++, and python is really the easiest for me to work with the data I have. Tangentially: I would like to cite you in my dissertation; do you have a publication that would be appropriate, or should I cite this repo? I'll do something like the following, if you don't object or offer an alternative:

Wiseman, J. (2017). Py-WebRTC VAD. GitHub repository, https://github.com/wiseman/py-webrtcvad/

And now, my actual question: I need to provide a basic description of the WebRTC VAD algorithm. I have looked and looked, but I have not found any detailed documentation and I don't fully understand the source.

I realize you are not the author, but I am hoping you might generously lend me your understanding. I know it's a Gaussian Mixture Model (GMM), but my math knowledge is weak, so I barely understand what that means. Is it correct to say it's a statistically based machine learning model, and for the case of VAD, one that has been trained to recognize speech vs. noise?

I don't think I really need to understand the details, but I do need to know to some degree, what the "aggressiveness" setting does wrt to the algorithm. Is it literally just stipulating a confidence threshold? Does it have any acoustic repercussions?

Any help is much appreciated! I am realizing I should perhaps have posted this on stackoverflow; perhaps I will if I don't hear back, but I'm posting anyway bc of the more-specific-to-you citation question :)

where is the _webrtcvad module

I found something unusual in webrtcvad.py: the 3rd line says import _webrtcvad, but I couldn't find any folder or file named _webrtcvad. Can you tell me where it is, so I can see what _webrtcvad.process() does?

Couldn't install on Windows by pip

I couldn't install it with pip on Windows and received the error message below:

error: command 'C:\Users\XXX\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Bin\cl.exe' failed with exit status 2

If somebody could advise me, it would be a great help!
Hiro

sample rate

When I use the program in the example to process audio with a 44100 Hz sampling rate, the program reports an error.

sample rate:44100
Traceback (most recent call last):
  File "vad2.py", line 143, in <module>
    main()
  File "vad2.py", line 138, in main
    filter(in_wav, out_dir, expand=False)
  File "vad2.py", line 118, in filter
    voiced_frames = vad_collector(sample_rate, vad, frames)
  File "vad2.py", line 87, in vad_collector
    is_speech = vad.is_speech(frame.bytes, sample_rate)
  File "/home/tian/anaconda3/envs/torch14/lib/python3.7/site-packages/webrtcvad.py", line 27, in is_speech
    return _webrtcvad.process(self._vad, sample_rate, buf, length)
webrtcvad.Error: Error while processing frame

And I want to know whether this tool can handle 44100 sample rate audio. I'm looking forward to your reply, thanks.
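According to the README, the VAD accepts only 8000, 16000, 32000, or 48000 Hz audio in frames of 10, 20, or 30 ms, so 44100 Hz audio would need to be resampled first. Since the library's error message does not say which constraint was violated, a small pre-check can help. This is a sketch with hypothetical names (`check_frame`, `SUPPORTED_RATES`, `SUPPORTED_FRAME_MS`), assuming 16-bit mono PCM.

```python
SUPPORTED_RATES = (8000, 16000, 32000, 48000)   # Hz, as listed in the README
SUPPORTED_FRAME_MS = (10, 20, 30)               # frame durations, as listed in the README


def check_frame(sample_rate, frame, sample_width=2):
    """Return None if the frame looks acceptable, else a description of the problem."""
    if sample_rate not in SUPPORTED_RATES:
        return 'unsupported sample rate: %d Hz' % sample_rate
    valid_sizes = [sample_rate * ms // 1000 * sample_width
                   for ms in SUPPORTED_FRAME_MS]
    if len(frame) not in valid_sizes:
        return 'frame is %d bytes; expected one of %s' % (len(frame), valid_sizes)
    return None
```

Calling `check_frame` before `vad.is_speech` turns the generic "Error while processing frame" into a message that names the offending rate or frame size.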

Error: Error while processing frame

Hi John,
I was trying to use it to classify my audio frames into speech and silence. When I segment my audio into 30ms, the code runs with no errors. However, when I try 25ms, I get an error that says:

 in is_speech(self, buf, sample_rate, length)
     25                 'buffer has %s frames, but length argument was %s' % (
     26                     int(len(buf) / 2.0), length))
---> 27         return _webrtcvad.process(self._vad, sample_rate, buf, length)
     28 
     29 

Error: Error while processing frame

This is my code:

source1 = path + "phone1.wav"
audio, sample_rate = read_wave(source1)
framesz=25.
frames = frame_generator(framesz, audio, sample_rate)
vad = webrtcvad.Vad(3)
frames = list(frames)
num_voiced = [1 if vad.is_speech(f, sample_rate) else 0 for f in frames]
