
achabotl / pambox


Python auditory modeling toolbox.

Home Page: http://pambox.org

License: BSD 3-Clause "New" or "Revised" License

Languages: Makefile 0.28%, Python 99.72%

Topics: auditory, modeling, hearing, speech

pambox's People

Contributors: achabotl, jfsantos


pambox's Issues

Experiment needs a way to set levels before or after the processing

I often run into the issue that the speech and noise levels should be set before the distortion, for example when the SNR is set at the source rather than at the ears, in an experiment with signals in space. Right now I have to write very inelegant overrides of the preprocessing function.

Adding an extra argument to the Experiment class, something like adjust_levels_before_processing, would really simplify things.

Additionally, it would solve the problem of adjusting the levels of binaural signals when HRTFs are applied: if the levels are adjusted before the distortion processing, then the signals should, in principle, still be binaural.
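
A minimal sketch of what such an option could look like, assuming hypothetical adjust_levels and apply_distortion steps inside the Experiment class; none of the names below are part of the current pambox API:

    class Experiment:
        # The flag and the two helper steps below are assumptions for
        # illustration, not the actual pambox implementation.
        def __init__(self, models, materials, snrs,
                     adjust_levels_before_processing=False):
            self.models = models
            self.materials = materials
            self.snrs = snrs
            self.adjust_levels_before_processing = adjust_levels_before_processing

        def preprocessing(self, target, masker, snr):
            if self.adjust_levels_before_processing:
                # Set the SNR "at the source", before any distortion is applied.
                target, masker = self.adjust_levels(target, masker, snr)
                target, masker = self.apply_distortion(target, masker)
            else:
                # Current behaviour: distort first, then set the SNR "at the ears".
                target, masker = self.apply_distortion(target, masker)
                target, masker = self.adjust_levels(target, masker, snr)
            return target, masker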

SRT conversion should allow for model-specific criteria

Experiment.srts_from_df (which is a bad name) should allow the srt_at argument to take model-specific values. For example, if a model outputs SII values, then the SRT shouldn't be at 50% but at some value between 0 and 1. The current API is

srts_from_df(df, col='Intelligibility', srt_at=50)

We should allow for model-specific criteria, like {('Model', 'Output'): criterion}. This has the downside that if the "default" srt_at shouldn't be 50, then all models must be part of the dictionary. Therefore, adding another keyword argument might be a better idea:

srts_from_df(df, col='Intelligibility', srt_at=50, model_srts=None)
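
A sketch of how the extra model_srts keyword could be resolved internally; the column names and the _srt_from_curve helper are assumptions, not existing pambox code:

    def srts_from_df(df, col='Intelligibility', srt_at=50, model_srts=None):
        """Sketch: per-model SRT criteria with a global fallback."""
        model_srts = model_srts or {}
        srts = {}
        for (model, output), group in df.groupby(['Model', 'Output']):
            # Use the model-specific criterion if one was given, otherwise
            # fall back to the global `srt_at` value.
            criterion = model_srts.get((model, output), srt_at)
            srts[(model, output)] = _srt_from_curve(group, col, criterion)
        return srts

    # e.g. keep 50% for most models, but 0.5 for a model that outputs SII values:
    # srts_from_df(df, srt_at=50, model_srts={('Sii', 'Value'): 0.5})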

Add an option to Experiment class to save complete data frame to HDF5

Since it's possible to save a pandas DataFrame directly to HDF5, it would be a good idea to offer that option when running an experiment. I think the default should be "off", because the resulting files would be too big, but it would certainly be useful for debugging, making plots of the internal representations, etc.
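
A sketch under the assumption that the Experiment produces a pandas DataFrame; DataFrame.to_hdf is actual pandas API (it requires PyTables), while the save_full_df flag and the helper itself are hypothetical:

    import pandas as pd

    def save_full_results(df: pd.DataFrame, path='experiment_full.h5',
                          save_full_df=False):
        # Hypothetical helper: only writes when explicitly asked, since the
        # full internal representations can make the file very large.
        if save_full_df:
            # 'table' format allows appending results from successive conditions.
            df.to_hdf(path, key='results', mode='a', format='table')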

Not possible to apply processing to the mixture of target and maskers

In Experiment.processing, the distortion is applied independently to the target and the masker. That is a problem for non-linear distortions, like spectral subtraction, which require the noisy signal and the noise alone.

I see two approaches for "fixing" that:

  1. If required, the user should subclass Experiment and replace the preprocessing method with one that applies the distortion however they want (see the sketch below).
  2. Add an additional option to Experiment to define the behavior inside the preprocessing method. That would require changing the level adjustment behavior as well. Actually, the level adjustment would have to be done before the application of the distortion.

Right now, I'd say we should stick to option 1.
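
A sketch of option 1, assuming the preprocessing method can be overridden as below; the signature, return contract, and the spectral_subtraction helper are illustrative only:

    from pambox.speech import Experiment

    class MixtureExperiment(Experiment):
        # Override so the non-linear distortion sees the mixture and the
        # masker together, instead of processing each signal independently.
        def preprocessing(self, target, masker, snr):
            target, masker = self.adjust_levels(target, masker, snr)
            mixture = target + masker
            # e.g. spectral subtraction needs the noisy mixture and the noise
            # alone; `spectral_subtraction` is a placeholder for that step.
            processed = spectral_subtraction(mixture, masker)
            return target, masker, processed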

utils.rms and utils.setdbspl fail with some signal sizes

If the signal is of shape 2xN, for example, utils.rms and utils.setdbspl raise a ValueError because of incompatible shapes. The issue is that both functions have to divide by, or subtract, the mean, and that does not fit NumPy's broadcasting rules.
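
A minimal illustration of the failure and one possible fix using keepdims; this is not necessarily how utils.rms is implemented internally:

    import numpy as np

    x = np.random.randn(2, 1000)                 # a 2xN, two-channel signal

    rms = np.sqrt(np.mean(x ** 2, axis=-1))      # shape (2,)
    # x / rms raises ValueError: operands could not be broadcast together
    # with shapes (2,1000) (2,)

    # Keeping the reduced axis lets the per-channel RMS broadcast cleanly:
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True))   # shape (2, 1)
    normalized = x / rms                         # shape (2, 1000)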

Modulation filtering stage produces different output from "butter" function

I noticed that the time output of the modulation filterbank is different from the time output obtained by using the sp.signal.butter function to create the coefficients and then filtering the signal with sp.signal.filtfilt.

Apparently, there is an extra "-1" factor when creating the frequency vector that should not be there. If we remove it, the output of the mod_filterbank function is the same as when using butter.

It would probably make sense to use butter, since we're using Butterworth filters anyway, instead of our own implementation. Additionally, because of the way the modulation filtering is currently done, the shape of the filter depends on the length of the input signal, because the signal length affects the resolution of the frequency vector.
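
A sketch of the butter/filtfilt route for a single low-pass modulation channel; the sampling rate, cutoff, and filter order are illustrative values, not the ones used in pambox:

    import numpy as np
    from scipy import signal

    fs = 22050.0                      # envelope sampling rate (illustrative)
    fc = 1.0                          # modulation cutoff in Hz (illustrative)
    b, a = signal.butter(N=1, Wn=fc / (fs / 2), btype='low')

    envelope = np.abs(np.random.randn(int(fs)))   # stand-in for a real envelope
    filtered = signal.filtfilt(b, a, envelope)    # zero-phase; the filter shape
                                                  # does not depend on the
                                                  # length of the input signal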

Implement the STOI intelligibility model

Taal, C. H., Hendriks, R. C., Heusdens, R., and Jensen, J. (2010). A short-time objective intelligibility measure for time-frequency weighted noisy speech. Proc. IEEE ICASSP, 4214--4217.

Pick a "reference level" for the toolbox

We should pick a "reference level" for signals. For example, should a signal with an RMS value of 1 correspond to 0 dB, 100 dB, or something else?

We could use a physical standard too, where an RMS of 20e-6 corresponds to 0 dB, i.e.

level = 20 * log10(rms / 20e-6)
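
The two conventions side by side, as a small illustration; the 100 dB full-scale choice is just an example value:

    import numpy as np

    rms = 1.0                                   # linear RMS of the signal

    # Convention A (example): an RMS of 1 corresponds to 100 dB SPL.
    level_full_scale = 100 + 20 * np.log10(rms)

    # Convention B: physical reference, 20 uPa corresponds to 0 dB SPL.
    level_physical = 20 * np.log10(rms / 20e-6)   # about 94 dB for rms = 1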

noctave_filtering should calculate the boundaries for each center frequency

The boundaries of each filter should be calculated independently, and not assume that the input center frequencies are properly spaced.

Right now, if we input two frequencies that are not spaced according to the width parameter, e.g. [63, 1000], the boundaries are [56.12661924, 70.71510904, 1122.46204831]. But there should be 4 boundaries.
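
A sketch of computing the edges per center frequency for fraction-of-an-octave bands (width=3 for third-octave); the function name is illustrative and the formula assumes base-2 band definitions:

    import numpy as np

    def band_edges(center_frequencies, width=3):
        # Lower and upper edge of each band, computed from its own center
        # frequency, so non-adjacent centers like [63, 1000] still get two
        # edges each (four boundaries in total).
        fc = np.asarray(center_frequencies, dtype=float)
        lower = fc * 2.0 ** (-1.0 / (2 * width))
        upper = fc * 2.0 ** (1.0 / (2 * width))
        return np.column_stack((lower, upper))

    # band_edges([63, 1000]) -> [[ 56.13,   70.72],
    #                            [890.90, 1122.46]]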

Level adjustment should work for binaural signals too

In speech.Experiment.adjust_levels, it should be possible to adjust the levels correctly even if the signals are binaural. A way to do this is simply:

average_level = np.mean(utils.dbspl(signal))

The average level is therefore always a single number, regardless of whether the signal has one, two, or more channels.
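
A sketch of a binaural-safe adjustment built on that idea; utils.dbspl is taken from the snippet above, while the function name and signature here are illustrative:

    import numpy as np
    from pambox import utils

    def set_average_level(signal, target_level_db):
        # One average level across all channels, then the same scalar gain for
        # every channel, which preserves the interaural level differences.
        average_level = np.mean(utils.dbspl(signal))
        gain = 10.0 ** ((target_level_db - average_level) / 20.0)
        return signal * gain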

Standardize the return values of the intelligibility models

Each intelligibility model returns a different type of prediction value. Sometimes it is an intelligibility percentage directly, but more often than not, it is some model-specific value that has to be transformed into intelligibility. A model can also return internal intermediate values, such as envelope powers, level spectra, etc. It would be great if the output of the models were standardized so that the models could be used interchangeably.
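
One possible standardized format, sketched as a dict with fixed keys; the key names and helper functions are suggestions, not an existing pambox convention:

    def predict(clean, mixture, noise):
        value = _raw_prediction(clean, mixture, noise)   # model-specific, hypothetical
        return {
            'prediction': value,                     # the model's native output
            'intelligibility': _to_percent(value),   # mapped to percent correct
            'internals': {},                         # envelope powers, level spectra, ...
        }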

Implement STI model

International Electrotechnical Commission (2003). IEC 60268-16:2003, Sound system equipment - Part 16: Objective rating of speech intelligibility by speech transmission index, 1--28.

Refactor Sepsm and MrSepsm

The predict function should be broken down for more modularity, so that there's no need to duplicate it for the mr-sEPSM. The abstraction level of all the calls in predict should be the same.
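
A sketch of one possible decomposition, where the multi-resolution model only overrides the step that differs; the method names are illustrative and do not match the current implementation:

    class Sepsm:
        def predict(self, clean, mixture, noise):
            bands = self.peripheral_filtering(mixture, noise)
            envelopes = self.envelope_extraction(bands)
            mod_powers = self.modulation_filtering(envelopes)
            snr_env = self.snr_env(mod_powers)
            return self.combine_bands(snr_env)

    class MrSepsm(Sepsm):
        # Only the multi-resolution modulation step changes; `predict` itself
        # is inherited unchanged from Sepsm.
        def modulation_filtering(self, envelopes):
            ...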

The sEPSM does a double compensation for filter bandwidth

When finding the bands above threshold in the sEPSM, there is a factor of 0.231 to compensate for the filter bandwidth. This factor is unnecessary because the diffuse-field hearing thresholds used for the comparison are already adjusted for the filter bandwidths.

The factor should be removed. Hopefully, that would not affect the predictions too much.
