sherpa_onnx.OnlineRecognizer( TypeError: OnlineRecognizer() takes no arguments (sherpa-onnx issue, closed)

lonngxiang commented on May 29, 2024

sherpa_onnx.OnlineRecognizer( TypeError: OnlineRecognizer() takes no arguments

Comments (12)

csukuangfj commented on May 29, 2024

Did you mix up sherpa-onnx and sherpa-ncnn?

Have you run the following?

pip install sherpa-onnx

lonngxiang commented on May 29, 2024

> Did you mix up sherpa-onnx and sherpa-ncnn?
>
> Have you run the following?
>
> pip install sherpa-onnx

Yes, it is installed:

Installing collected packages: sherpa-onnx
Successfully installed sherpa-onnx-1.9.10

Running on Windows:

python .\speech-recognition-from-microphone-onnx.py  --tokens=sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/tokens.txt --encoder=sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/encoder-epoch-20-avg-1-chunk-16-left-128.onnx   --decoder=sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/decoder-epoch-20-avg-1-chunk-16-left-128.onnx --joiner=sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/joiner-epoch-20-avg-1-chunk-16-left-128.onnx

#!/usr/bin/env python3

# Real-time speech recognition from a microphone with sherpa-onnx Python API
#
# Please refer to
# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html
# to download pre-trained models

import argparse
import sys
from pathlib import Path

try:
    import sounddevice as sd
except ImportError:
    print("Please install sounddevice first. You can use")
    print()
    print("  pip install sounddevice")
    print()
    print("to install it")
    sys.exit(-1)

import sherpa_onnx




def assert_file_exists(filename: str):
    assert Path(filename).is_file(), (
        f"{filename} does not exist!\n"
        "Please refer to "
        "https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it"
    )


def get_args():
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )

    parser.add_argument(
        "--tokens",
        type=str,
        help="Path to tokens.txt",
    )

    parser.add_argument(
        "--encoder",
        type=str,
        help="Path to the encoder model",
    )

    parser.add_argument(
        "--decoder",
        type=str,
        help="Path to the decoder model",
    )

    parser.add_argument(
        "--joiner",
        type=str,
        help="Path to the joiner model",
    )

    parser.add_argument(
        "--decoding-method",
        type=str,
        default="greedy_search",
        help="Valid values are greedy_search and modified_beam_search",
    )

    return parser.parse_args()


def create_recognizer():
    args = get_args()
    assert_file_exists(args.encoder)
    assert_file_exists(args.decoder)
    assert_file_exists(args.joiner)
    assert_file_exists(args.tokens)
    # Please replace the model files if needed.
    # See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html
    # for download links.
    print(args.tokens,
        args.encoder,
        args.decoder,
        args.joiner)
    recognizer = sherpa_onnx.OnlineRecognizer(
        tokens=args.tokens,
        encoder=args.encoder,
        decoder=args.decoder,
        joiner=args.joiner,
        num_threads=1,
        sample_rate=16000,
        feature_dim=80,
        decoding_method=args.decoding_method,
    )
    return recognizer


def main():
    recognizer = create_recognizer()
    print("Started! Please speak")

    # The model is using 16 kHz, we use 48 kHz here to demonstrate that
    # sherpa-onnx will do resampling inside.
    sample_rate = 48000
    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms
    last_result = ""
    stream = recognizer.create_stream()
    # last_result = ""
    i=0
    with sd.InputStream(channels=1, dtype="float32", samplerate=sample_rate) as s:
        while True:
            samples, _ = s.read(samples_per_read)  # a blocking read
            samples = samples.reshape(-1)
            stream.accept_waveform(sample_rate, samples)
            while recognizer.is_ready(stream):
                recognizer.decode_stream(stream)
            result = recognizer.get_result(stream)
            # if last_result != result:
            #     last_result = result
            #     print("\r{}".format(result), end="", flush=True)

            if last_result != result:
                if i==0:
                    print("{}".format(result),end='')
                    last_result = result
                    i=i+1
                else:
                    last_result_len=len(last_result)
                    
                    new_word = result[last_result_len:]
                    # print(last_result,result,new_word)
                    print("{}".format(new_word),end='', flush=True)
                    last_result = result


if __name__ == "__main__":
    devices = sd.query_devices()
    print(devices)
    default_input_device_idx = sd.default.device[0]
    print(f'Use default device: {devices[default_input_device_idx]["name"]}')

    try:
        main()
    except KeyboardInterrupt:
        print("\nCaught Ctrl + C. Exiting")

csukuangfj commented on May 29, 2024

import sherpa_onnx
print(sherpa_onnx.__file__)
print(help(sherpa_onnx.OnlineRecognizer))

What do these lines print?
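
A related check, as a minimal sketch: the helper below reports where a module would be imported from and which version of its distribution is installed, without importing it. The helper name `locate_package` is hypothetical, and the module name `sherpa_ncnn` for the sherpa-ncnn package is an assumption; only `sherpa_onnx` appears in this thread.

```python
import importlib.util
from importlib import metadata


def locate_package(module_name, dist_name):
    """Report the import location and installed version of a package."""
    # find_spec returns None when a top-level module cannot be found.
    spec = importlib.util.find_spec(module_name)
    origin = spec.origin if spec is not None else None
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {"module": module_name, "origin": origin, "version": version}


# Assumed module/distribution name pairs; adjust them to your environment.
for module_name, dist_name in [
    ("sherpa_onnx", "sherpa-onnx"),
    ("sherpa_ncnn", "sherpa-ncnn"),
]:
    print(locate_package(module_name, dist_name))
```

If `origin` points into a different environment than the one running the script, or the version is not the one pip reported, the wrong installation is probably being picked up.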

lonngxiang commented on May 29, 2024

```python
>>> import sherpa_onnx
>>> print(sherpa_onnx.__file__)
C:\Users\loong\.conda\envs\nlp\Lib\site-packages\sherpa_onnx\__init__.py
>>> print(help(sherpa_onnx.OnlineRecognizer))
Help on class OnlineRecognizer in module sherpa_onnx.online_recognizer:

class OnlineRecognizer(builtins.object)
 |  A class for streaming speech recognition.
 |
 |  Please refer to the following files for usages
 |   - https://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/python/tests/test_online_recognizer.py
 |   - https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/online-decode-files.py
 |
 |  Methods defined here:
 |
 |  create_stream(self, hotwords: Optional[str] = None)
 |
 |  decode_stream(self, s: _sherpa_onnx.OnlineStream)
 |
 |  decode_streams(self, ss: List[_sherpa_onnx.OnlineStream])
 |
 |  get_result(self, s: _sherpa_onnx.OnlineStream) -> str
 |
 |  is_endpoint(self, s: _sherpa_onnx.OnlineStream) -> bool
 |
 |  is_ready(self, s: _sherpa_onnx.OnlineStream) -> bool
 |
 |  reset(self, s: _sherpa_onnx.OnlineStream) -> bool
 |
 |  timestamps(self, s: _sherpa_onnx.OnlineStream) -> List[float]
 |
 |  tokens(self, s: _sherpa_onnx.OnlineStream) -> List[str]
 |
 |  ----------------------------------------------------------------------
 |  Class methods defined here:
 |
 |  from_paraformer(tokens: str, encoder: str, decoder: str, num_threads: int = 2, sample_rate: float = 16000, feature_dim: int = 80, enable_endpoint_detection: bool = False, rule1_min_trailing_silence: float = 2.4, rule2_min_trailing_silence: float = 1.2, rule3_min_utterance_length: float = 20.0, decoding_method: str = 'greedy_search', provider: str = 'cpu') from builtins.type
 |      Please refer to
 |      `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html>`_
 |      to download pre-trained models for different languages, e.g., Chinese,
 |      English, etc.
 |
 |      Args:
 |        tokens:
 |          Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two
 |          columns::
 |
 |              symbol integer_id
 |
 |        encoder:
 |          Path to ``encoder.onnx``.
 |        decoder:
 |          Path to ``decoder.onnx``.
 |        num_threads:
 |          Number of threads for neural network computation.
 |        sample_rate:
 |          Sample rate of the training data used to train the model.
 |        feature_dim:
 |          Dimension of the feature used to train the model.
 |        enable_endpoint_detection:
 |          True to enable endpoint detection. False to disable endpoint
 |          detection.
 |        rule1_min_trailing_silence:
 |          Used only when enable_endpoint_detection is True. If the duration
 |          of trailing silence in seconds is larger than this value, we assume
 |          an endpoint is detected.
 |        rule2_min_trailing_silence:
 |          Used only when enable_endpoint_detection is True. If we have decoded
 |          something that is nonsilence and if the duration of trailing silence
 |          in seconds is larger than this value, we assume an endpoint is
 |          detected.
 |        rule3_min_utterance_length:
 |          Used only when enable_endpoint_detection is True. If the utterance
 |          length in seconds is larger than this value, we assume an endpoint
 |          is detected.
 |        decoding_method:
 |          The only valid value is greedy_search.
 |        provider:
 |          onnxruntime execution providers. Valid values are: cpu, cuda, coreml.
 |
 |  from_transducer(tokens: str, encoder: str, decoder: str, joiner: str, num_threads: int = 2, sample_rate: float = 16000, feature_dim: int = 80, enable_endpoint_detection: bool = False, rule1_min_trailing_silence: float = 2.4, rule2_min_trailing_silence: float = 1.2, rule3_min_utterance_length: float = 20.0, decoding_method: str = 'greedy_search', max_active_paths: int = 4, hotwords_score: float = 1.5, blank_penalty: float = 0.0, hotwords_file: str = '', provider: str = 'cpu', model_type: str = '', lm: str = '', lm_scale: float = 0.1) from builtins.type
 |      Please refer to
 |      `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html>`_
 |      to download pre-trained models for different languages, e.g., Chinese,
 |      English, etc.
 |
 |      Args:
 |        tokens:
 |          Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two
 |          columns::
 |
 |              symbol integer_id
 |
 |        encoder:
 |          Path to ``encoder.onnx``.
 |        decoder:
 |          Path to ``decoder.onnx``.
 |        joiner:
 |          Path to ``joiner.onnx``.
 |        num_threads:
 |          Number of threads for neural network computation.
 |        sample_rate:
 |          Sample rate of the training data used to train the model.
 |        feature_dim:
 |          Dimension of the feature used to train the model.
 |        enable_endpoint_detection:
 |          True to enable endpoint detection. False to disable endpoint
 |          detection.
 |        rule1_min_trailing_silence:
 |          Used only when enable_endpoint_detection is True. If the duration
 |          of trailing silence in seconds is larger than this value, we assume
 |          an endpoint is detected.
 |        rule2_min_trailing_silence:
 |          Used only when enable_endpoint_detection is True. If we have decoded
 |          something that is nonsilence and if the duration of trailing silence
 |          in seconds is larger than this value, we assume an endpoint is
 |          detected.
 |        rule3_min_utterance_length:
 |          Used only when enable_endpoint_detection is True. If the utterance
 |          length in seconds is larger than this value, we assume an endpoint
 |          is detected.
 |        decoding_method:
 |          Valid values are greedy_search, modified_beam_search.
 |        max_active_paths:
 |          Use only when decoding_method is modified_beam_search. It specifies
 |          the maximum number of active paths during beam search.
 |        blank_penalty:
 |          The penalty applied on blank symbol during decoding.
 |        hotwords_file:
 |          The file containing hotwords, one words/phrases per line, and for each
 |          phrase the bpe/cjkchar are separated by a space.
 |        hotwords_score:
 |          The hotword score of each token for biasing word/phrase. Used only if
 |          hotwords_file is given with modified_beam_search as decoding method.
 |        provider:
 |          onnxruntime execution providers. Valid values are: cpu, cuda, coreml.
 |        model_type:
 |          Online transducer model type. Valid values are: conformer, lstm,
 |          zipformer, zipformer2. All other values lead to loading the model twice.
 |
 |  from_wenet_ctc(tokens: str, model: str, chunk_size: int = 16, num_left_chunks: int = 4, num_threads: int = 2, sample_rate: float = 16000, feature_dim: int = 80, enable_endpoint_detection: bool = False, rule1_min_trailing_silence: float = 2.4, rule2_min_trailing_silence: float = 1.2, rule3_min_utterance_length: float = 20.0, decoding_method: str = 'greedy_search', provider: str = 'cpu') from builtins.type
 |      Please refer to
 |      `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/wenet/index.html>`_
 |      to download pre-trained models for different languages, e.g., Chinese,
 |      English, etc.
 |
 |      Args:
 |        tokens:
 |          Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two
 |          columns::
 |
 |              symbol integer_id
 |
 |        model:
 |          Path to ``model.onnx``.
 |        chunk_size:
 |          The --chunk-size parameter from WeNet.
 |        num_left_chunks:
 |          The --num-left-chunks parameter from WeNet.
 |        num_threads:
 |          Number of threads for neural network computation.
 |        sample_rate:
 |          Sample rate of the training data used to train the model.
 |        feature_dim:
 |          Dimension of the feature used to train the model.
 |        enable_endpoint_detection:
```
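
The help output lists instance methods and the classmethods `from_paraformer`, `from_transducer`, and `from_wenet_ctc`, which is consistent with a class that is meant to be built through factory classmethods rather than by calling it directly. The sketch below reproduces that pattern with a hypothetical stand-in class (`FactoryOnlyRecognizer` is not the real sherpa_onnx code) to show how a direct call with keyword arguments can fail with a `TypeError` like the one reported.

```python
class FactoryOnlyRecognizer:
    """Hypothetical stand-in; this is not the real sherpa_onnx class."""

    @classmethod
    def from_transducer(cls, tokens, encoder, decoder, joiner):
        # Create the instance without going through __init__ and attach
        # the configuration directly.
        self = cls.__new__(cls)
        self.paths = {
            "tokens": tokens,
            "encoder": encoder,
            "decoder": decoder,
            "joiner": joiner,
        }
        return self


def public_classmethods(cls):
    """Return the names of public classmethods defined directly on cls."""
    return sorted(
        name
        for name, member in vars(cls).items()
        if isinstance(member, classmethod) and not name.startswith("_")
    )


def direct_call_error(cls, **kwargs):
    """Return the TypeError message raised by cls(**kwargs), or None."""
    try:
        cls(**kwargs)
    except TypeError as exc:
        return str(exc)
    return None


factories = public_classmethods(FactoryOnlyRecognizer)
error = direct_call_error(FactoryOnlyRecognizer, tokens="tokens.txt")
recognizer = FactoryOnlyRecognizer.from_transducer(
    "tokens.txt", "encoder.onnx", "decoder.onnx", "joiner.onnx"
)

print(factories)
print(error)
print(recognizer.paths["joiner"])
```

Because the stand-in defines no `__init__`, Python falls back to `object`'s constructor, which rejects any arguments; the factory classmethod is the only way to pass configuration in.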

csukuangfj commented on May 29, 2024

The code you wrote:

    recognizer = sherpa_onnx.OnlineRecognizer(
        tokens=args.tokens,
        encoder=args.encoder,
        decoder=args.decoder,
        joiner=args.joiner,
        num_threads=1,
        sample_rate=16000,
        feature_dim=80,
        decoding_method=args.decoding_method,
    )
    return recognizer

Where did this come from?

csukuangfj commented on May 29, 2024

https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/speech-recognition-from-microphone.py
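
For reference, a possible fix, sketched under the assumption that `from_transducer` (listed in the `help()` output earlier in this thread) is the intended entry point for a transducer model with encoder, decoder, and joiner files. The helper names `transducer_kwargs` and `create_recognizer` are hypothetical; every keyword passed is taken from the `from_transducer` signature shown in that output.

```python
def transducer_kwargs(tokens, encoder, decoder, joiner,
                      decoding_method="greedy_search"):
    """Collect the keyword arguments used by the script in this thread."""
    return {
        "tokens": tokens,
        "encoder": encoder,
        "decoder": decoder,
        "joiner": joiner,
        "num_threads": 1,
        "sample_rate": 16000,
        "feature_dim": 80,
        "decoding_method": decoding_method,
    }


def create_recognizer(args):
    """Build a streaming recognizer via the classmethod, not the class call."""
    # Imported here so the helper above can be used without sherpa_onnx.
    import sherpa_onnx

    # Assumption: from_transducer matches the signature in the help() output.
    return sherpa_onnx.OnlineRecognizer.from_transducer(
        **transducer_kwargs(
            tokens=args.tokens,
            encoder=args.encoder,
            decoder=args.decoder,
            joiner=args.joiner,
            decoding_method=args.decoding_method,
        )
    )
```

In the script posted above, this would amount to replacing `sherpa_onnx.OnlineRecognizer(` with `sherpa_onnx.OnlineRecognizer.from_transducer(` and leaving the arguments unchanged.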
