Comments (6)
@George0828Zhang , hello. Thanks for your idea. I created a new PR to implement this.
from faster-whisper.
> @George0828Zhang , hello. Thanks for your idea. I created a new PR to implement this.
Though I appreciate the quick action, this PR did not fully solve the issue. There are still the tokenizer and preprocessor that need to be handled. Here's what I'm using; feel free to add it to the PR, and let's hope it gets merged soon.
```python
import json
import os
from inspect import signature
from typing import List, Optional, Union

import ctranslate2
import tokenizers

from faster_whisper import WhisperModel
from faster_whisper.feature_extractor import FeatureExtractor
from faster_whisper.utils import download_model, get_logger


class MyWhisper(WhisperModel):
    def __init__(
        self,
        model_size_or_path: str,
        device: str = "auto",
        device_index: Union[int, List[int]] = 0,
        compute_type: str = "default",
        cpu_threads: int = 0,
        num_workers: int = 1,
        download_root: Optional[str] = None,
        local_files_only: bool = False,
        files: object = None,
        **kwargs,
    ):
        """Same as WhisperModel, but accepts in-memory model files via `files`."""
        self.logger = get_logger()

        tokenizer_bytes, preprocessor_bytes = None, None
        if files:
            # Loading from memory: pull the tokenizer and preprocessor bytes
            # out of the dict before handing the rest to ctranslate2.
            model_path = model_size_or_path
            tokenizer_bytes = files.pop("tokenizer.json", None)
            preprocessor_bytes = files.pop("preprocessor_config.json", None)
        elif os.path.isdir(model_size_or_path):
            model_path = model_size_or_path
        else:
            model_path = download_model(
                model_size_or_path,
                local_files_only=local_files_only,
                cache_dir=download_root,
            )

        self.model = ctranslate2.models.Whisper(
            model_path,
            device=device,
            device_index=device_index,
            compute_type=compute_type,
            intra_threads=cpu_threads,
            inter_threads=num_workers,
            files=files,
            **kwargs,
        )

        tokenizer_file = os.path.join(model_path, "tokenizer.json")
        if tokenizer_bytes:
            self.hf_tokenizer = tokenizers.Tokenizer.from_buffer(tokenizer_bytes)
        elif os.path.isfile(tokenizer_file):
            self.hf_tokenizer = tokenizers.Tokenizer.from_file(tokenizer_file)
        else:
            self.hf_tokenizer = tokenizers.Tokenizer.from_pretrained(
                "openai/whisper-tiny" + ("" if self.model.is_multilingual else ".en")
            )

        self.feat_kwargs = self._get_feature_kwargs(model_path, preprocessor_bytes)
        self.feature_extractor = FeatureExtractor(**self.feat_kwargs)
        self.num_samples_per_token = self.feature_extractor.hop_length * 2
        self.frames_per_second = (
            self.feature_extractor.sampling_rate // self.feature_extractor.hop_length
        )
        self.tokens_per_second = (
            self.feature_extractor.sampling_rate // self.num_samples_per_token
        )
        self.input_stride = 2
        self.time_precision = 0.02
        self.max_length = 448

    def _get_feature_kwargs(self, model_path, preprocessor_bytes=None) -> dict:
        preprocessor_config_file = os.path.join(model_path, "preprocessor_config.json")
        config = {}
        if preprocessor_bytes or os.path.isfile(preprocessor_config_file):
            try:
                if preprocessor_bytes:
                    config = json.loads(preprocessor_bytes)
                else:
                    with open(preprocessor_config_file, "r", encoding="utf-8") as json_file:
                        config = json.load(json_file)
                # Keep only the keys that FeatureExtractor.__init__ actually accepts.
                valid_keys = signature(FeatureExtractor.__init__).parameters.keys()
                config = {k: v for k, v in config.items() if k in valid_keys}
            except json.JSONDecodeError as e:
                self.logger.warning(
                    "Could not load preprocessor_config.json: %s", str(e)
                )
        return config
```
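The key-filtering step in `_get_feature_kwargs` is what makes an arbitrary `preprocessor_config.json` safe to splat into the constructor: it drops every key the constructor does not accept. A minimal, self-contained sketch of that technique (the `FeatureExtractor` here is a stand-in with an assumed signature, not the real `faster_whisper` class):

```python
import json
from inspect import signature


class FeatureExtractor:
    # Stand-in with a constructor resembling faster_whisper's FeatureExtractor.
    def __init__(self, feature_size=80, sampling_rate=16000, hop_length=160):
        self.feature_size = feature_size
        self.sampling_rate = sampling_rate
        self.hop_length = hop_length


# A preprocessor_config.json typically carries extra keys the constructor
# does not accept (e.g. "feature_extractor_type"); those must be filtered out.
raw = json.dumps({
    "feature_size": 128,
    "sampling_rate": 16000,
    "hop_length": 160,
    "feature_extractor_type": "WhisperFeatureExtractor",  # not a ctor kwarg
})

config = json.loads(raw)
valid_keys = signature(FeatureExtractor.__init__).parameters.keys()
config = {k: v for k, v in config.items() if k in valid_keys}

extractor = FeatureExtractor(**config)
print(extractor.feature_size)  # → 128
```

Without the filter, `FeatureExtractor(**config)` would raise a `TypeError` on the unexpected keyword.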
@George0828Zhang , I think that if you want to handle the tokenizer and preprocessor with other initialization data, you could edit the tokenizer.json and preprocessor_config.json files in your custom FW model after conversion, instead of using the default files.
@trungkienbkhn I'm not "handl[ing] tokenizer and preprocessor with other initialization data", I'm not modifying anything in any way. I'm loading these files from memory (rather than disk), i.e. a dictionary like so:
```python
files = {
    "config.json": open("config.json", "rb").read(),
    "tokenizer.json": open("tokenizer.json", "rb").read(),
    "model.bin": open("model.bin", "rb").read(),
    "vocabulary.txt": open("vocabulary.txt", "rb").read(),
    # preprocessor_config.json is optional
}
```
Naively passing this dict to the underlying `ctranslate2.models.Whisper` DOES NOT WORK.
You might ask: why read the files like this? Why not pass a path, or let whisper download them?
Well, this is specifically for the use case where the service (1) has no public internet access, (2) stores the model files on a NAS, and (3) has limited local storage. The solution is to read the bytes from the NAS over the local network, then load the model from those bytes.
I provided what does work, so it's not really an issue.
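Concretely, the loading step under those constraints reduces to reading each file's bytes from the mounted share into a dict, with nothing written to local storage. A minimal sketch (the temp directory here is a hypothetical stand-in for a NAS mount point such as `/mnt/nas/models/...`, and the file contents are dummy bytes):

```python
import os
import tempfile

# Hypothetical stand-in for a NAS mount point.
nas_dir = tempfile.mkdtemp()
for name in ("config.json", "tokenizer.json", "model.bin", "vocabulary.txt"):
    with open(os.path.join(nas_dir, name), "wb") as f:
        f.write(b"dummy bytes")  # real files would come from the converted model

# Read everything into memory once, over the local network; local disk
# only ever sees the mount, never a copy of the model.
files = {
    name: open(os.path.join(nas_dir, name), "rb").read()
    for name in os.listdir(nas_dir)
}

print(sorted(files))  # the dict shape expected by MyWhisper(files=...)
```

The resulting `files` dict is exactly what the `MyWhisper` constructor above consumes.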
@George0828Zhang , okay I updated my PR.
Since the PR got merged, I'm closing this. Thanks.