Describe the bug This bug occurs when I try to use the canary-1b

canary-1b not processing JSON files correctly

LaiWeiQuan commented on May 27, 2024

canary-1b not processing JSON files correctly

from nemo.

Comments (8)

titu1994 commented on May 27, 2024

Hmm, can you try removing the trailing newline in your manifest file, and also see whether removing the JSON indent helps?


krishnacpuvvada commented on May 27, 2024

return func(*args, **kwargs)
TypeError: EncDecMultiTaskModel.transcribe() got an unexpected keyword argument 'audio

Can you verify that you are working with the main branch of NeMo? The argument was renamed from paths2audio_files to audio between r1.23 and the main branch.

Also, as a sanity check, can you try the following on the main branch:

transcript = canary_model.transcribe(audio=['path_to_file.wav'])  # assuming path_to_file.wav is English audio


P15V commented on May 27, 2024

@titu1994 I am experiencing much the same issue on my end. I tried your suggestion and had no luck; I get the same JSON error as the one posted above.
When I remove the JSON indent and trailing lines as suggested, I get the following error:
"json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 171 (char 170)"

When I fix the JSON errors directly, I end up with this error, which suggests it expects the JSON to have an 'answer' key somewhere:

"-packages/nemo/collections/common/data/lhotse/nemo_adapters.py", line 84, in iter
text=data[self.text_field],
KeyError: 'answer'
Transcribing: 0it [00:00, ?it/s]"

I want to run ASR, so expecting an 'answer' key is wonky.
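Both errors quoted above (the JSONDecodeError and the KeyError) can be caught before the manifest ever reaches the model. The following is a minimal sketch, assuming the manifest is read as one JSON object per line. The helper name `check_manifest_lines` and the default `required_keys` are hypothetical; only the 'answer' key is confirmed by the traceback in this thread.

```python
import json


def check_manifest_lines(lines, required_keys=("audio_filepath", "answer")):
    """Return a list of (line_number, problem) tuples; an empty list means no problems were found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            # A blank or trailing empty line cannot be parsed as an entry.
            problems.append((number, "blank line"))
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as err:
            # Single-quoted strings and trailing commas end up here.
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(entry, dict):
            problems.append((number, "not a JSON object"))
            continue
        for key in required_keys:
            if key not in entry:
                problems.append((number, f"missing key: {key}"))
    return problems
```

Passing the result of `open(path).read().split("\n")` to the function reports a trailing newline as a blank final line, which matches the first thing suggested in this thread to check.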


krishnacpuvvada commented on May 27, 2024

text=data[self.text_field],
KeyError: 'answer'

To fix this, please modify the input lines file to add an 'answer' field:

{
  "audio_filepath": "PathRemovedDueToPersonalName",
  "duration": 30.0,
  "taskname": "asr",
  "source_lang": "en",
  "target_lang": "en",
  "pnc": "yes",
  "answer": "na"
}
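Hand-edited manifest entries can easily pick up single quotes or a trailing comma, both of which the JSON parser rejects. A minimal sketch, assuming the manifest is read as one JSON object per line, is to generate the file with Python's `json` module instead. The helper names `make_entry` and `write_manifest` are hypothetical; the field names and values are the ones used in this thread.

```python
import json


def make_entry(audio_filepath, duration, source_lang="en", target_lang="en"):
    """Build one manifest entry using the fields shown in this thread."""
    return {
        "audio_filepath": audio_filepath,
        "duration": duration,
        "taskname": "asr",
        "source_lang": source_lang,
        "target_lang": target_lang,
        "pnc": "yes",
        "answer": "na",  # placeholder value, as suggested in this thread
    }


def write_manifest(entries, manifest_path):
    """Write one JSON object per line, with no indent and no trailing newline."""
    lines = [json.dumps(entry) for entry in entries]
    with open(manifest_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))
```

Because `json.dumps` always emits double-quoted strings and never a trailing comma, every line it produces can be parsed back with `json.loads`.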

Also, we recently updated the .transcribe signature, so if you are using the main branch,

transcript = canary_model.transcribe(paths2audio_files="/home/pjstimac/NvidiaCanaryTest/transcribe_manifest.json", batch_size=16)

should be updated to
transcript = canary_model.transcribe(audio="/home/pjstimac/NvidiaCanaryTest/transcribe_manifest.json", batch_size=16)
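Since the keyword differs between r1.23 (paths2audio_files) and the main branch (audio), one way to write code that runs on either is to inspect the installed signature before calling. This is only a sketch: the wrapper name `transcribe_compat` is hypothetical, and it assumes the model's `transcribe` method exposes its parameters to `inspect.signature`.

```python
import inspect


def transcribe_compat(model, audio, **kwargs):
    """Call model.transcribe with whichever input keyword its signature accepts."""
    params = inspect.signature(model.transcribe).parameters
    if "audio" in params:
        # Newer signature, as described for the main branch.
        return model.transcribe(audio=audio, **kwargs)
    if "paths2audio_files" in params:
        # Older signature, as described for r1.23.
        return model.transcribe(paths2audio_files=audio, **kwargs)
    # Neither name is present: fall back to passing the input positionally.
    return model.transcribe(audio, **kwargs)
```

With this wrapper, `transcribe_compat(canary_model, ['path_to_file.wav'], batch_size=16)` would avoid the "unexpected keyword argument" TypeError regardless of which of the two versions is installed.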


P15V commented on May 27, 2024

@krishnacpuvvada No luck, unfortunately. I tried that exact format (with both JSON and JSONL) and the updated transcript call. It yields this error:

return func(*args, **kwargs)
TypeError: EncDecMultiTaskModel.transcribe() got an unexpected keyword argument 'audio'


P15V commented on May 27, 2024

@krishnacpuvvada I'll double-check and update again when I'm home. But I was following the Nvidia tutorial and git-cloning the repo just last night, so I'm assuming it's the latest as of yesterday evening.

The good news is that I finally got basic inference working with three lines of code, forgoing the JSON/JSONL manifest:
import nemo.collections.asr as nemo_asr
nemoasr_model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained("nvidia/canary-1b")
nemoasr_model.transcribe(['AudioClipDirectly.wav'])

To accomplish the transcription of a directory of audio clips, I wrote some Python code that loops through the directory and outputs the transcription results as JSON for my local model comparison app.
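The directory loop described above might look roughly like the sketch below. This is not the code from the comment: the function name `transcribe_directory`, the output field names, and the `*.wav` pattern are all assumptions. The model call is passed in as `transcribe_fn`, a callable that takes a list of file paths and returns one result per path, so the loop does not depend on a particular NeMo version.

```python
import json
from pathlib import Path


def transcribe_directory(transcribe_fn, audio_dir, output_path, pattern="*.wav"):
    """Transcribe each matching file in audio_dir and save the results as JSON."""
    # Sort so the output order is stable across runs.
    paths = sorted(str(p) for p in Path(audio_dir).glob(pattern))
    results = transcribe_fn(paths) if paths else []
    records = [
        # str() keeps the record serializable if a result is not a plain string.
        {"audio_filepath": path, "text": str(result)}
        for path, result in zip(paths, results)
    ]
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2, ensure_ascii=False)
    return records
```

With the model loaded as in the three lines above, the call would be `transcribe_directory(nemoasr_model.transcribe, 'clips/', 'results.json')`, where both paths are placeholders.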


titu1994 commented on May 27, 2024

You can use this high-level script to do inference with Canary as well - https://github.com/NVIDIA/NeMo/blob/main/examples/asr/transcribe_speech.py


P15V commented on May 27, 2024

@titu1994 I did try that as well, per the Nvidia NeMo tutorial. It led to the exact same issues I was running into elsewhere.

