
Comments (8)

titu1994 commented on May 27, 2024

Hmm, can you try removing the trailing newline in your manifest file, and also check whether removing the JSON indent helps?

from nemo.

krishnacpuvvada commented on May 27, 2024

return func(*args, **kwargs)
TypeError: EncDecMultiTaskModel.transcribe() got an unexpected keyword argument 'audio'

Can you verify that you are working with the main branch of NeMo? The argument was renamed from paths2audio_files to audio between r1.23 and the main branch.

Also, as a sanity check, can you try the following on the main branch:

transcript = canary_model.transcribe(audio=['path_to_file.wav'])  # assuming path_to_file.wav is English audio


P15V commented on May 27, 2024

@titu1994 I am experiencing much the same issue on my end. I tried that and had no luck; I get the same JSON error as the one posted above.
After removing the JSON indent and trailing newline as suggested, I get the following error:
"json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 171 (char 170)"

When I fix the JSON errors directly, I end up with this error, which suggests it expects the JSON to have an 'answer' key somewhere:

"-packages/nemo/collections/common/data/lhotse/nemo_adapters.py", line 84, in __iter__
text=data[self.text_field],
KeyError: 'answer'
Transcribing: 0it [00:00, ?it/s]"

I want to run ASR, so expecting an 'answer' key is wonky.
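Both errors above can be caught before any model is loaded. The following is a minimal sketch of a pre-flight check, assuming the manifest holds one JSON object per line. The helper name `check_manifest_lines` is hypothetical, and the key list is copied from the manifest example in this thread rather than from NeMo's documentation, so treating every key as required is an assumption.

```python
import json

# Field names copied from the manifest example in this thread. Treating all of
# them as required is an assumption, not a documented NeMo schema.
REQUIRED_KEYS = (
    "audio_filepath",
    "duration",
    "taskname",
    "source_lang",
    "target_lang",
    "pnc",
    "answer",
)


def check_manifest_lines(lines, required_keys=REQUIRED_KEYS):
    """Return (line_number, problem) tuples for a line-delimited manifest."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as err:
            # Single-quoted strings and trailing commas both end up here.
            problems.append((number, f"invalid JSON: {err.msg} (column {err.colno})"))
            continue
        if not isinstance(entry, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = [key for key in required_keys if key not in entry]
        if missing:
            problems.append((number, "missing keys: " + ", ".join(missing)))
    return problems
```

Passing the result of `open(path).read().split("\n")` to this function would flag a trailing newline as a blank final line, a single-quoted value as invalid JSON, and an entry without 'answer' as a missing key.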


krishnacpuvvada commented on May 27, 2024

text=data[self.text_field],
KeyError: 'answer'

To fix this, please modify the input manifest file so that each entry includes an 'answer' field:

{
  "audio_filepath": "PathRemovedDueToPersonalName",
  "duration": 30.0,
  "taskname": "asr",
  "source_lang": "en",
  "target_lang": "en",
  "pnc": "yes",
  "answer": "na"
}
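Hand-edited entries are easy to get wrong, since JSON accepts only double-quoted strings and no trailing comma. A minimal sketch of generating the manifest from Python instead, assuming one entry per line with no indent and no trailing newline (as suggested earlier in the thread). The helper names `make_manifest_line` and `write_manifest` are hypothetical, and the fixed field values are copied from the example above.

```python
import json


def make_manifest_line(audio_filepath, duration, answer="na"):
    """Serialize one manifest entry as a single line of valid JSON."""
    entry = {
        "audio_filepath": audio_filepath,
        "duration": duration,
        "taskname": "asr",
        "source_lang": "en",
        "target_lang": "en",
        "pnc": "yes",
        "answer": answer,  # placeholder value so the 'answer' lookup succeeds
    }
    # No indent argument, so json.dumps emits the whole object on one line
    # with double-quoted keys and strings.
    return json.dumps(entry)


def write_manifest(path, entries):
    """Write (audio_filepath, duration) pairs as one JSON object per line."""
    lines = [make_manifest_line(audio, duration) for audio, duration in entries]
    with open(path, "w", encoding="utf-8") as handle:
        # Join with newlines so no trailing newline follows the last entry.
        handle.write("\n".join(lines))
```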

Also, we recently updated the .transcribe signature, so if you are using the main branch,

transcript = canary_model.transcribe(paths2audio_files="/home/pjstimac/NvidiaCanaryTest/transcribe_manifest.json", batch_size=16)

should be updated to:
transcript = canary_model.transcribe(audio="/home/pjstimac/NvidiaCanaryTest/transcribe_manifest.json", batch_size=16)
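If it is unclear which NeMo version is installed, the call can adapt to whichever keyword the installed `transcribe` accepts. This is a minimal sketch using the standard-library `inspect` module. The wrapper name `transcribe_compat` is hypothetical, and it assumes any decorator around `transcribe` preserves the original signature (as `functools.wraps` does).

```python
import inspect


def transcribe_compat(model, audio, **kwargs):
    """Call model.transcribe with whichever input keyword its signature accepts."""
    params = inspect.signature(model.transcribe).parameters
    if "audio" in params:
        # Newer signature, as described for the main branch in this thread.
        return model.transcribe(audio=audio, **kwargs)
    if "paths2audio_files" in params:
        # Older signature, as described for r1.23 in this thread.
        return model.transcribe(paths2audio_files=audio, **kwargs)
    raise TypeError("transcribe() accepts neither 'audio' nor 'paths2audio_files'")
```

With this in place, `transcribe_compat(canary_model, manifest_path, batch_size=16)` would work on either version without editing the call site.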


P15V commented on May 27, 2024

@krishnacpuvvada No luck, unfortunately. I tried that exact format (with both JSON and JSONL) and the updated transcript call. It yields this error:
"
return func(*args, **kwargs)
TypeError: EncDecMultiTaskModel.transcribe() got an unexpected keyword argument 'audio'
"


P15V commented on May 27, 2024

@krishnacpuvvada I'll double-check and update again when I'm home. But I was following the Nvidia tutorial and git-cloning the repo just last night, so I'm assuming it's the latest as of yesterday evening.

The good news is I finally got basic inference working with three lines of code by forgoing the JSON/JSONL manifest.

"
import nemo.collections.asr as nemo_asr
nemoasr_model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained("nvidia/canary-1b")
nemoasr_model.transcribe(['AudioClipDirectly.wav'])

"

To accomplish the transcription of a directory of audio clips, I wrote some Python code that loops through the directory and outputs the transcription results as JSON for my local model comparison app.
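The original loop was not posted, so the following is only a sketch of that approach, not the code actually used. It assumes a callable such as `nemoasr_model.transcribe` that takes a list of file paths and returns one result per path. The function name `transcribe_directory`, the `*.wav` pattern, and the `{filename: text}` output layout are all hypothetical choices.

```python
import json
from pathlib import Path


def transcribe_directory(transcribe_fn, audio_dir, output_path, pattern="*.wav"):
    """Transcribe each matching file in audio_dir and save {filename: text} as JSON."""
    results = {}
    # Sort so the output order is stable across runs.
    for audio_file in sorted(Path(audio_dir).glob(pattern)):
        outputs = transcribe_fn([str(audio_file)])
        # str() is a conservative assumption in case the model returns
        # result objects rather than plain strings.
        results[audio_file.name] = str(outputs[0])
    Path(output_path).write_text(json.dumps(results, indent=2), encoding="utf-8")
    return results
```

A call would look like `transcribe_directory(nemoasr_model.transcribe, "clips", "results.json")`, where both paths are placeholders.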


titu1994 commented on May 27, 2024

You can also use this high-level script to do inference with Canary: https://github.com/NVIDIA/NeMo/blob/main/examples/asr/transcribe_speech.py


P15V commented on May 27, 2024

@titu1994 I did try that as well, per the NVIDIA NeMo tutorial, and it led to the exact same issues I was running into elsewhere.
