sindhu-hegde / pseudo-visual-speech-denoising

Official code for the paper "Visual Speech Enhancement Without A Real Visual Stream" published at WACV 2021

License: MIT License

Python 100.00%

pseudo-visual-speech-denoising's People

Contributors

prajwalkr, rudrabha, sindhu-hegde, snehitvaddi

pseudo-visual-speech-denoising's Issues

tmp.wav and temp.wav

What are tmp.wav and temp.wav in inference.py?
I get a "no such file or directory" error when I try to denoise using the pretrained models you provided.

I ran:
python inference.py --lipsync_student_model_path=<"C:\Users\savdo\OneDrive\Desktop\major\pseudo-visual-speech-denoising-main\lipsync\checkpoints\lipsync_student.pth"> --checkpoint_path=<"C:\Users\savdo\OneDrive\Desktop\major\pseudo-visual-speech-denoising-main\checkpoints\denoising.pt"> --input=<"C:\Users\savdo\OneDrive\Desktop\major\pseudo-visual-speech-denoising-main\timit_3spk.wav">

but I get the following error (please help me):

The system cannot find the file specified.
C:\Users\savdo\AppData\Local\Programs\Python\Python39\lib\site-packages\librosa\util\decorators.py:88: UserWarning: PySoundFile failed. Trying audioread instead.
  return f(*args, **kwargs)
Traceback (most recent call last):
  File "C:\Users\savdo\AppData\Local\Programs\Python\Python39\lib\site-packages\librosa\core\audio.py", line 155, in load
    context = sf.SoundFile(path)
  File "C:\Users\savdo\AppData\Local\Programs\Python\Python39\lib\site-packages\soundfile.py", line 629, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "C:\Users\savdo\AppData\Local\Programs\Python\Python39\lib\site-packages\soundfile.py", line 1183, in _open
    _error_check(_snd.sf_error(file_ptr),
  File "C:\Users\savdo\AppData\Local\Programs\Python\Python39\lib\site-packages\soundfile.py", line 1357, in _error_check
    raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening 'C:\Users\savdo\OneDrive\Desktop\major\pseudo-visual-speech-denoising-main\timit_3spk.wav': System error.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\savdo\OneDrive\Desktop\major\pseudo-visual-speech-denoising-main\inference.py", line 292, in <module>
    predict(args)
  File "C:\Users\savdo\OneDrive\Desktop\major\pseudo-visual-speech-denoising-main\inference.py", line 175, in predict
    inp_wav = load_wav(args)
  File "C:\Users\savdo\OneDrive\Desktop\major\pseudo-visual-speech-denoising-main\inference.py", line 22, in load_wav
    wav = audio.load_wav(wav_file, sampling_rate)
  File "C:\Users\savdo\OneDrive\Desktop\major\pseudo-visual-speech-denoising-main\audio\audio_utils.py", line 8, in load_wav
    return librosa.core.load(path, sr=sr)[0]
  File "C:\Users\savdo\AppData\Local\Programs\Python\Python39\lib\site-packages\librosa\util\decorators.py", line 88, in inner_f
    return f(*args, **kwargs)
  File "C:\Users\savdo\AppData\Local\Programs\Python\Python39\lib\site-packages\librosa\core\audio.py", line 174, in load
    y, sr_native = __audioread_load(path, offset, duration, dtype)
  File "C:\Users\savdo\AppData\Local\Programs\Python\Python39\lib\site-packages\librosa\core\audio.py", line 198, in __audioread_load
    with audioread.audio_open(path) as input_file:
  File "C:\Users\savdo\AppData\Local\Programs\Python\Python39\lib\site-packages\audioread\__init__.py", line 111, in audio_open
    return BackendClass(path)
  File "C:\Users\savdo\AppData\Local\Programs\Python\Python39\lib\site-packages\audioread\rawread.py", line 62, in __init__
    self._fh = open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\savdo\OneDrive\Desktop\major\pseudo-visual-speech-denoising-main\timit_3spk.wav'
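
The last line of the traceback says the file named by --input could not be opened, so before looking into tmp.wav it may be worth confirming that each path exists exactly as typed. The sketch below is not part of the repository: `clean_path_arg` and `missing_inputs` are hypothetical helpers, and it assumes the `<` and `>` around each path were placeholder markers copied along with the command rather than part of the path.

```python
from pathlib import Path

def clean_path_arg(raw: str) -> Path:
    """Strip whitespace, angle brackets, and quotes wrapped around a path."""
    return Path(raw.strip().strip("<>").strip("\"'"))

def missing_inputs(raw_paths):
    """Return the cleaned paths that do not point to an existing file."""
    cleaned = [clean_path_arg(raw) for raw in raw_paths]
    return [path for path in cleaned if not path.is_file()]

# Example: pass the same strings that were given on the command line.
for path in missing_inputs(['<"timit_3spk.wav">']):
    print(f"not found: {path}")
```

If a path is reported as not found here, the problem is likely the path itself (location, spelling, or the extra brackets) rather than anything inside inference.py.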

Realtime usage

Is the network able to process audio for interactive use?

inference --input missing value

Thanks for your quick response on the last issue, @Rudrabha! You are always doing great projects. I tried to run inference and it showed the following message, thanks:

  File "/usr/local/lib/python3.7/dist-packages/soundfile.py", line 629, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "/usr/local/lib/python3.7/dist-packages/soundfile.py", line 1184, in _open
    "Error opening {0!r}: ".format(self.name))
  File "/usr/local/lib/python3.7/dist-packages/soundfile.py", line 1357, in _error_check
    raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening 'tmp.wav': System error.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "inference.py", line 274, in <module>
    predict(args)
  File "inference.py", line 174, in predict
    inp_wav = load_wav(args)
  File "inference.py", line 22, in load_wav
    wav = audio.load_wav(wav_file, sampling_rate)
  File "/root/pseudo-visual-speech-denoising/audio/audio_utils.py", line 7, in load_wav
    return librosa.core.load(path, sr=sr)[0]
  File "/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py", line 142, in load
    y, sr_native = __audioread_load(path, offset, duration, dtype)
  File "/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py", line 164, in __audioread_load
    with audioread.audio_open(path) as input_file:
  File "/usr/local/lib/python3.7/dist-packages/audioread/__init__.py", line 111, in audio_open
    return BackendClass(path)
  File "/usr/local/lib/python3.7/dist-packages/audioread/rawread.py", line 62, in __init__
    self._fh = open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'tmp.wav'

dependency issues

I've tried multiple versions of Python, including Python 3.7.4.

pip usually runs into the following error:

ERROR: Could not find a version that satisfies the requirement torch~=1.6.0 (from versions: 1.7.0, 1.7.1, 1.8.0, 1.8.1, 1.9.0, 1.9.1, 1.10.0, 1.10.1, 1.10.2, 1.11.0)
ERROR: No matching distribution found for torch~=1.6.0

Sometimes it's other dependencies (e.g. on a later Python, I've seen it fail on tensorflow-gpu). Could you share the exact setup you tested with, i.e. which machine and OS were used, what software was installed, and which commands were run?
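
For reference, `torch~=1.6.0` is a compatible-release specifier: under PEP 440 it accepts versions that are at least 1.6.0 and still within the 1.6 series. The sketch below checks the versions pip listed in the error against that rule. It only handles plain numeric versions, and the helper names are hypothetical.

```python
def parse(version: str):
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def satisfies_compatible_release(candidate: str, spec: str) -> bool:
    """True if candidate satisfies `~=spec` (plain numeric versions only)."""
    cand, base = parse(candidate), parse(spec)
    # ~=1.6.0 means: same leading components (1.6) and not older than 1.6.0.
    prefix = base[:-1]
    return cand[:len(prefix)] == prefix and cand >= base

# Versions pip reported as available in the error message.
offered = ["1.7.0", "1.7.1", "1.8.0", "1.8.1", "1.9.0", "1.9.1",
           "1.10.0", "1.10.1", "1.10.2", "1.11.0"]
matching = [v for v in offered if satisfies_compatible_release(v, "1.6.0")]
print(matching)  # -> []
```

None of the offered versions falls in the 1.6 series, which is why pip reports no match. pip lists only the versions for which it found a build compatible with the running interpreter and platform, so the likely way forward is an environment for which torch 1.6 builds exist; which environments those are is not stated in this thread.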

How to Run file.

Hi Admin,

Could you describe, step by step, how to run the project?

Please help us.

cv2 module not found error occurred
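
`cv2` is the import name of OpenCV's Python bindings, which are published on PyPI as `opencv-python`. A minimal sketch for checking which imports are unavailable in the current environment follows. `missing_modules` and the `PIP_NAMES` table are illustrative and not part of the repository, and the module list is only an example taken from names that appear in this thread.

```python
import importlib.util

# Import name -> PyPI package name, for modules whose two names differ.
PIP_NAMES = {"cv2": "opencv-python"}

def missing_modules(modules):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]

for name in missing_modules(["cv2", "librosa", "soundfile"]):
    print(f"{name} is missing; try: pip install {PIP_NAMES.get(name, name)}")
```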

Real-time

Hey. I went through the paper and the results, and I must say the work is remarkable. Congratulations on achieving such tremendous results. However, I was wondering whether the model can be used in real-time applications. If it can, could you suggest how to do it?
