Comments (6)
(BTW I love how fast you are responding to issues here kudos 😀)
from transformers.js.
Updated in 6c9ea41 👍
Will close the issue when I make the 1.3.2 release.
Let me know if you have any other questions or suggestions :)
Hi. Yes, you are correct. I originally implemented it with a TODO comment somewhere there to return a type parameter, which would help users differentiate between what is being run in the callback function.
For the most part, this is because HF's transformers library doesn't have the same functionality (probably because it would cause some inconsistencies), so we had to make it ourselves (something which is very useful for streaming the output back).
I suppose we could change it to something like `chunk_callback`? Any thoughts/suggestions?
Yeah, I really like the stream part, so keep it. I would keep calling the streaming callback `callback_function` to stay consistent. `chunk_callback` makes sense here since it is only used in the chunk callback.
Personally, my use cases are as follows:
- `callback_function`: for visualization of the newly generated tokens. This gives the user feedback that the results are being generated.
- `chunk_callback`: to get the entire result for the chunk plus the timestamps.

`callback_function` would then render the initial results, while `chunk_callback` obtains the final result and adds the timestamps. (I'm not sure if this directly works, since you need all the chunks so far for timestamps if you use `pipeline.tokenizer._decode_asr(chunks, ...)`. I think the `chunk_callback` only receives the newly generated chunks. However, this is fixable by keeping track of all the state yourself in the callback.)
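The state-tracking workaround described above could look something like the following sketch. Note that `decodeAsr` here is a hypothetical stub standing in for `pipeline.tokenizer._decode_asr(chunks, ...)` (it just concatenates chunk texts), so the wiring can run without a model:

```javascript
// Sketch: keep every chunk yourself so each chunk_callback invocation can
// re-decode the full set, rather than only the newest chunk.
const allChunks = [];

// Hypothetical stand-in for pipeline.tokenizer._decode_asr(chunks, ...).
// Here it simply concatenates chunk texts so the example is self-contained.
function decodeAsr(chunks) {
  return chunks.map((chunk) => chunk.text).join('');
}

// Pass this as the chunk_callback option: it accumulates every chunk seen
// so far, then decodes the whole accumulated set.
function chunkCallback(newChunk) {
  allChunks.push(newChunk);
  return decodeAsr(allChunks);
}
```

The key point is that the accumulator lives outside the callback, so state survives across invocations.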
> (BTW I love how fast you are responding to issues here kudos 😀)
Haha yeah I'm trying my best to get all these things fixed! 😄 I'm online 24/7 ;)
> Yeah, I really like the stream part, so keep it. I would keep calling the streaming callback `callback_function` to stay consistent. `chunk_callback` makes sense here since it is only used in the chunk callback.
Okay, that sounds doable. This functionality (streaming while doing merging) isn't available in the Python implementation, so there aren't any "rules" to follow per se (other than that it should make sense haha).
> `callback_function` would then render the initial results, while `chunk_callback` obtains the final result and adds the timestamps. (I'm not sure if this directly works, since you need all the chunks so far for timestamps if you use `pipeline.tokenizer._decode_asr(chunks, ...)`. I think the `chunk_callback` only receives the newly generated chunks. However, this is fixable by keeping track of all the state yourself in the callback.)
Technically, it is possible to get the timestamps while you are generating, using a combination of the `callback_function` and `pipeline.tokenizer._decode_asr(chunks, ...)` with `force_full_sequences=false`. This would entail keeping track of all the output tokens as you generate, and you can detect a "new chunk" when the output token you get is the `<|startoftranscript|>` token. Then, you can perform the merging as you generate new tokens. The `force_full_sequences=false` is necessary, otherwise it will throw an error because the whole transcript isn't ready yet.
I can try to provide an example (later, possibly), but I am currently working on fixing some other bugs (see other issues).
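In the meantime, the "new chunk" detection described above could be sketched as a small pure function that splits a growing stream of output token ids on the `<|startoftranscript|>` token. The token id is taken as a parameter here, since its exact value depends on the tokenizer; in practice you would look it up via the tokenizer rather than hard-coding it:

```javascript
// Split a flat list of output token ids into per-chunk lists, starting a new
// chunk each time the <|startoftranscript|> id appears. The id itself is a
// parameter because it varies by tokenizer (this is a sketch, not the
// library's own implementation).
function splitIntoChunks(tokenIds, startOfTranscriptId) {
  const chunks = [];
  let current = [];
  for (const id of tokenIds) {
    if (id === startOfTranscriptId && current.length > 0) {
      chunks.push(current); // previous chunk is complete
      current = [];
    }
    current.push(id);
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Each returned chunk could then be passed along to `pipeline.tokenizer._decode_asr(chunks, ...)` with `force_full_sequences=false`, as described above, to merge and timestamp the partial transcript.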
Update is now live! (https://www.npmjs.com/package/@xenova/transformers/v/1.3.2)
Closing the issue now :)