Comments (10)
Sure, here's the code for it:
transformers.js/src/utils/tensor.js, lines 577 to 620 in 035f69f:
Nothing too fancy... and it assumes certain dimensions for the input.
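For readers without the repo open, here's a minimal sketch of masked mean pooling under similar assumptions. The function name and the flat [batch, seqLen, hidden] layout are illustrative, not the library's exact code:

```javascript
// Hypothetical sketch of masked mean pooling over token embeddings.
// Assumes `data` is a flat array with dims [batch, seqLen, hidden]
// and `mask` is a flat [batch, seqLen] array of 0s and 1s.
function meanPooling(data, dims, mask) {
  const [batch, seqLen, hidden] = dims;
  const out = new Float32Array(batch * hidden);
  for (let b = 0; b < batch; ++b) {
    let count = 0;
    for (let t = 0; t < seqLen; ++t) {
      if (!mask[b * seqLen + t]) continue; // skip padding tokens
      count += 1;
      for (let h = 0; h < hidden; ++h) {
        out[b * hidden + h] += data[(b * seqLen + t) * hidden + h];
      }
    }
    for (let h = 0; h < hidden; ++h) {
      out[b * hidden + h] /= count; // average over non-padded tokens only
    }
  }
  return out;
}
```

Padding tokens are excluded via the attention mask, so short sentences in a batch aren't dragged toward zero.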
from transformers.js.
You saved me hours! :P
The most likely reason is due to quantisation of the models. The model weights are reduced in precision from 32-bit to 8-bit to reduce model size by a factor of ~4 (very important for usage on a website).
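To see why the precision drop changes the outputs, here's a toy sketch of symmetric 8-bit quantisation (illustrative only; the actual ONNX quantisation scheme used by the conversion differs in detail):

```javascript
// Toy symmetric int8 quantisation: map floats in [-maxAbs, maxAbs]
// to integers in [-127, 127]. Illustrates the round-trip error that
// makes quantised and unquantised models give slightly different outputs.
function quantize(weights) {
  const maxAbs = Math.max(...weights.map(Math.abs));
  const scale = maxAbs / 127;
  const q = Int8Array.from(weights, (w) => Math.round(w / scale));
  return { q, scale };
}

function dequantize({ q, scale }) {
  return Float32Array.from(q, (v) => v * scale);
}

const weights = [0.3141, -0.9, 0.0071];
const restored = dequantize(quantize(weights));
// `restored` is close to, but not exactly, the original values.
```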
However, if you are okay with loading the full model, you can export the model yourself without quantising it, and this should produce the exact same outputs. The conversion script provided uses huggingface's optimum library under the hood to do the conversion, and they generally match the accuracy quite well.
Here's an example command (without --quantize):
python ./scripts/convert.py --model_id sentence-transformers/all-MiniLM-L6-v2 --from_hub --task default
and then just point to the location of your model (see readme)
That said, while trying to do this on my end, I did run into an issue where the pooled value wasn't being returned (most likely due to the newest version of optimum, which removed some of those nodes). So, I will implement the mean pooling myself (see here)
That should fix everything :) (since you will be able to use the original model, which is only 80MB, so, nothing too problematic)
Thanks for the quick reply!
I ran convert.py and generated an unquantized model and modified my code to look like this:
global.self = global;
const { pipeline, env } = require("@xenova/transformers");

env.onnx.wasm.numThreads = 1;
env.remoteModels = false;
env.localURL = "transformers.js/models/onnx/unquantized";

(async () => {
  let embedder = await pipeline('embeddings', 'sentence-transformers/all-MiniLM-L6-v2');
  let sentences = [
    'The quick brown fox jumps over the lazy dog.'
  ];
  let output = await embedder(sentences);
  console.log(output[0].length);
})();
and I now get this error:
TypeError: Cannot read properties of undefined (reading 'data')
at Function._call (node-transformers/node_modules/@xenova/transformers/src/pipelines.js:286:51)
which corresponds to this line in pipeline.js:
let embeddings = reshape(embeddingsTensor.data, embeddingsTensor.dims);
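For context, reshape here just nests the flat tensor data into JS arrays according to dims. A hypothetical sketch (not the library's actual implementation):

```javascript
// Hypothetical sketch: nest a flat data array into nested JS arrays
// according to `dims`, e.g. reshape([1,2,3,4], [2,2]) -> [[1,2],[3,4]].
function reshape(data, dims) {
  if (dims.length === 1) {
    return Array.from(data).slice(0, dims[0]);
  }
  const [first, ...rest] = dims;
  const stride = rest.reduce((a, b) => a * b, 1); // elements per sub-array
  const out = [];
  for (let i = 0; i < first; ++i) {
    out.push(reshape(data.slice(i * stride, (i + 1) * stride), rest));
  }
  return out;
}
```

The error above means embeddingsTensor itself was undefined (the pooled output presumably missing from the exported model), rather than a bug in reshape.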
I confirmed that the path for the model was correct by altering localURL to an invalid path, to which transformers.js responded with "File not found". So it appears that transformers.js is finding the exported model.
Yep, you did everything correctly.
That's the exact error message I got now (which is because of the version of optimum used to export). Busy fixing now! :)
Okay - these changes should fix it: 851815b
This also fixes the original issue of the outputs being different, since I wasn't correctly performing mean pooling + normalization the way the sentence-transformers library does. The outputs should now be much closer when quantized, and nearly identical when unquantized.
Before I make a full release, do you mind testing on your side to see if it functions correctly? I believe you can install an npm package directly from GitHub.
Also: I updated it so it returns a tensor (instead of nested JavaScript lists), for efficiency reasons. To get back the list, just call .tolist() on the tensor.
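For reference, the mean pooling + normalization recipe mentioned above ends with an L2 normalization step, which makes cosine similarity between embeddings a plain dot product. A minimal sketch (hypothetical helper, not the library's actual code):

```javascript
// Sketch of L2 normalisation of a pooled embedding vector:
// divide each component by the vector's Euclidean length.
function normalize(vec) {
  let norm = 0;
  for (const v of vec) norm += v * v;
  norm = Math.sqrt(norm);
  return vec.map((v) => v / norm);
}
```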
Just tested this out and everything looks great! Thank you! I generated embeddings for two sentences in both JS and Python and calculated the cos similarity (in their respective libraries) and they were identical.
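For anyone repeating this check, cosine similarity between two embedding vectors can be computed in a few lines (a generic sketch, not tied to either library's API):

```javascript
// Cosine similarity between two equal-length embedding vectors:
// dot product divided by the product of the vectors' lengths.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; ++i) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```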
Awesome! I'll push a new release.
v1.3.1 is now live with the changes :) https://www.npmjs.com/package/@xenova/transformers
Thanks again for reporting!
That said, while trying to do this on my end, I did run into an issue where the pooled value wasn't being returned (most likely due to the newest version of optimum, which removed some of those nodes). So, I will implement the mean pooling myself (see here)
That should fix everything :) (since you will be able to use the original model, which is only 80MB, so, nothing too problematic)
Do you have the implementation of this mean pooling in js?