seanlee97 / angle
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
Home Page: https://arxiv.org/abs/2309.12871
License: MIT License
Hi,
Awesome work! Can you share the details about what data was used for adapting WhereIsAI/UAE-Large-V1 from BGE-large? Can you share the data as well?
Thanks!
Hello, I cannot run examples/Angle-ATEC.ipynb: angle.fit() outputs nothing and the GPUs are not being used.
It might be a version issue. My environment: Successfully installed bitsandbytes-0.41.3.post2 boltons-23.1.1 peft-0.7.1 tokenizers-0.15.0 transformers-4.36.2
Dear author, thank you for your excellent work. I am looking to measure the semantic similarity between multiple answers generated by an LLM and the ground-truth answer. Can I directly use your model to extract features from both the generated answers and the ground-truth answer, and then use their cosine similarity as the semantic similarity score? Will the STS performance be affected?
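A minimal sketch of the scoring step described in this question, assuming the embeddings have already been extracted. The vectors below are made-up placeholders rather than real model outputs, and `cosine_similarity` and `score_answers` are hypothetical helper names, not part of angle_emb.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def score_answers(answer_vecs: Sequence[Sequence[float]],
                  truth_vec: Sequence[float]) -> List[float]:
    """One similarity score per generated answer, against the ground truth."""
    return [cosine_similarity(vec, truth_vec) for vec in answer_vecs]


# Placeholder embeddings (assumption: real ones would come from the model).
truth = [1.0, 0.0, 1.0]
answers = [[2.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
scores = score_answers(answers, truth)
```

The first placeholder answer points in the same direction as the ground truth and the second is orthogonal to it, so the two scores sit at the extremes of the non-negative range.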
Would you also be looking at other Llama based models, like Gemma?
I am able to create an embedding using sentence transformers, but I was not sure whether it is supported.
Hi Sean,
Thanks for the amazing work! I noticed that there might be a small bug in newer versions of the code that results in a device error when using angle-llama to get embeddings. I downgraded to version 0.3.0 and the problem disappeared.
To reproduce the error, simply execute the code given in the Angle-llama instructions.
Could you take a quick look at the problem? Thanks !
From a code perspective, the paper concludes by adding up all values of the complex loss function
This is a normal complex division formula and transformation. The purpose of the paper is to obtain the content of the red box.
But you ultimately add up, as shown in the following figure:
Is this the result the paper intends? Could you clarify? Thank you.
On the leaderboard I see a result for re-ranking. How is this done with these embeddings?
Hello, could you please add a short explanation of the difference between the Non-Retrieval and Retrieval tasks for UAE? Why would one be used instead of the other? I'm looking to create sentence embeddings to store in a database. Thank you!
Hi author, I have a question: the so-called cosine similarity is actually a vector dot product, and there is no real cosine at all. When the gradient is calculated, there is only multiplication, there is no cos at all, and there is no so-called saturation region where the gradient disappears. Can you explain?
When I use angle-bert-base-uncased-nli-en-v1 to evaluate STS performance, the results are inconsistent with the originally reported ones.
The command line I use:
python eval_nli.py \
--model_name_or_path /home/whzhu_st/Model/angle-bert-base-uncased-nli-en-v1 \
--task_set sts \
--pooling_strategy cls_avg
Environment:
torch 1.13.1
transformers 4.38.1
V100 GPU
Is this result acceptable within the error range, or is there something wrong with my command?
Hey,
I am trying to get the encoding using tiktoken to initiate token counter:
import tiktoken
from llama_index.callbacks import CallbackManager, TokenCountingHandler
enc = tiktoken.get_encoding("WhereIsAI/UAE-Large-V1")
token_counter = TokenCountingHandler(tokenizer= enc.encode)
But I am getting the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[20], line 3
1 import tiktoken
2 from llama_index.callbacks import CallbackManager, TokenCountingHandler
----> 3 enc = tiktoken.get_encoding("WhereIsAI/UAE-Large-V1")
4 token_counter = TokenCountingHandler(tokenizer= enc.encode)
File f:\pycharmprojects\llamaindex\venv\lib\site-packages\tiktoken\registry.py:68, in get_encoding(encoding_name)
65 assert ENCODING_CONSTRUCTORS is not None
67 if encoding_name not in ENCODING_CONSTRUCTORS:
---> 68 raise ValueError(
69 f"Unknown encoding {encoding_name}. Plugins found: {_available_plugin_modules()}"
70 )
72 constructor = ENCODING_CONSTRUCTORS[encoding_name]
73 enc = Encoding(**constructor())
ValueError: Unknown encoding WhereIsAI/UAE-Large-V1. Plugins found: ['tiktoken_ext.openai_public']
Is there any way to use the encodings with tiktoken?
Thanks
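The error message lists only the `tiktoken_ext.openai_public` plugin, so tiktoken does not know any encoding under a Hugging Face model name. One possible workaround, sketched under the assumption that the model repository ships a tokenizer loadable with `transformers.AutoTokenizer`, is to pass that tokenizer's `encode` method to the counter instead. `load_hf_encode` and `count_tokens` are hypothetical helper names introduced for illustration.

```python
from typing import Callable, Iterable, List


def load_hf_encode(model_name: str) -> Callable[[str], List[int]]:
    """Return the `encode` method of the Hugging Face tokenizer for `model_name`."""
    # Imported lazily so the rest of this sketch runs without transformers installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # Note: encode() adds special tokens by default, so they are included in counts.
    return tokenizer.encode


def count_tokens(encode: Callable[[str], list], texts: Iterable[str]) -> int:
    """Total number of tokens produced by `encode` over all texts."""
    return sum(len(encode(text)) for text in texts)


# Usage (needs transformers and a model download, so it is left commented out):
# encode = load_hf_encode("WhereIsAI/UAE-Large-V1")
# token_counter = TokenCountingHandler(tokenizer=encode)
```

Any callable that maps a string to a list of tokens fits the `tokenizer=` argument in the snippet from the question, which is why a plain `encode` method is enough here.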
Thank you for your awesome project!!
Can you provide the SimCSE-LLaMA2 code?
in angle_emb/angle.py
line 767
labels = inputs.pop("labels", None) <-- may be error
# labels = inputs.pop("labels", None) <-- may be ok
It has already been declared here:
def compute_loss(self, model, inputs, return_outputs=False):
    labels = inputs.pop("labels", None) <-- like this
It seems this will raise an error.
Dear author, I want to train bert-base-uncased on the NLI dataset with your method for some research. Could you provide the relevant training scripts so that I can better reproduce your experimental results? This is my training script, using the same data as your training. I cannot reproduce the evaluation results of your angle-bert-base-uncased-nli-en-v1 model.
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 --master_port=1234 train_nli.py \
--task NLI-STS --output_dir ckpts/NLI-STS-bert-cls \
--model_name_or_path ../models/bert-base-uncased \
--learning_rate 5e-5 --maxlen 50 \
--epochs 1 \
--batch_size 10 \
--logging_steps 500 \
--warmup_steps 0 \
--save_steps 1000 --seed 42 --do_eval 0 --gradient_accumulation_steps 4 --fp16 1 --torch_dtype 'float32' \
--pooling_strategy 'cls'
I am running out of memory on a Tesla T4. I have four of them, though, and I usually use accelerator for multi-GPU setups. How can I use them for AnglE semantic similarity?
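Hello, could the API be adjusted so that the model is spread across multiple GPUs during inference? I have several GPUs with 24 GB of memory each, and I would like to run LLaMA-7B for embedding.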
Hi,
To use the embeddings generated by WhereIsAI/UAE-Large-V1 in an LLM, do I first need to fine-tune a pre-trained LLM with AnglE so that the WhereIsAI/UAE-Large-V1 embeddings are compatible with the LLM? e.g.
angle = AnglE.from_pretrained('NousResearch/Llama-2-7b-hf', pretrained_lora_path='SeanLee97/angle-llama-7b-nli-v2')
Thank you !
When we use AnglE to build a (faiss) vector store for retrieval, do we need to customize a distance function that is consistent with the final objective function?
The default distance of the faiss vector store is L2_distance, and it has an option for COSINE.
Will the retrieval system perform well just with L2 or COSINE?
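One fact that bears on this question: for unit-length vectors, squared L2 distance and cosine similarity are tied by ‖a − b‖² = 2 − 2·cos(a, b), so both metrics rank neighbours identically once embeddings are L2-normalized. The sketch below checks this on made-up vectors in plain Python. It assumes the embeddings are normalized before indexing and says nothing about unnormalized vectors; the helper names are hypothetical.

```python
import math
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Placeholder vectors standing in for real embeddings.
query = l2_normalize([1.0, 2.0, 3.0])
docs = [l2_normalize(v) for v in ([1.0, 2.0, 2.5], [3.0, -1.0, 0.5], [0.5, 0.5, 4.0])]

# Rank documents by descending cosine similarity and by ascending L2 distance.
by_cosine = sorted(range(len(docs)), key=lambda i: -dot(query, docs[i]))
by_l2 = sorted(range(len(docs)), key=lambda i: squared_l2(query, docs[i]))
```

With normalized vectors the two orderings coincide, so the choice between the L2 and COSINE options mainly matters when the stored vectors are not normalized.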
I am facing an error while deploying the embedding model on AWS SageMaker. I ran the same script given on Hugging Face but got this error:
ModelError Traceback (most recent call last)
Cell In[5], line 32
26 # deploy model to SageMaker Inference
27 predictor = huggingface_model.deploy(
28 initial_instance_count=1, # number of instances
29 instance_type='ml.r5d.12xlarge' # ec2 instance type
30 )
---> 32 predictor.predict({
33 "inputs": "Today is a sunny day and I will get some ice cream.",
34 })
File ~/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/sagemaker/base_predictor.py:167, in Predictor.predict(self, data, initial_args, target_model, target_variant, inference_id)
137 """Return the inference from the specified endpoint.
138
139 Args:
(...)
161 as is.
162 """
164 request_args = self._create_request_args(
165 data, initial_args, target_model, target_variant, inference_id
166 )
--> 167 response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
168 return self._handle_response(response)
File ~/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/botocore/client.py:553, in ClientCreator._create_api_method.<locals>._api_call(self, *args, **kwargs)
549 raise TypeError(
550 f"{py_operation_name}() only accepts keyword arguments."
551 )
552 # The "self" in this scope is referring to the BaseClient.
--> 553 return self._make_api_call(operation_name, kwargs)
File ~/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/botocore/client.py:1009, in BaseClient._make_api_call(self, operation_name, api_params)
1005 error_code = error_info.get("QueryErrorCode") or error_info.get(
1006 "Code"
1007 )
1008 error_class = self.exceptions.from_code(error_code)
-> 1009 raise error_class(parsed_response, operation_name)
1010 else:
1011 return parsed_response
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "Could not load model /.sagemaker/mms/models/WhereIsAI__UAE-Large-V1 with any of the following classes: (\u003cclass \u0027transformers.models.auto.modeling_auto.AutoModel\u0027\u003e, \u003cclass \u0027transformers.models.bert.modeling_bert.BertModel\u0027\u003e)."
}
This is an amazing work. I have been working on something that would require me to evaluate the generated outputs of models like Mistral, using a prompt like:
"Fill the [MASK] token in the sentence. Generate a single output."
Now earlier, I would simply instruction fine-tune a Mistral Model. But I would like to explore the possibility of using these models with a bi-directional attention.
I see that the library allows me to access the backbone model underneath. But it is not clear to me if this model has the bi-directional attention. Can you please clarify this? If it does, I could simply use the backbone.generate() function for my purpose.
Thanks in advance!
Hi, could you please give me an overview of how to fine-tune an NLI model? Namely:
In the paper, at training time, it appears that you treat embeddings both as vectors of real numbers, used to calculate cosine similarity, and as vectors of complex numbers, used to calculate the angle between the two vectors. At inference time, what similarity metric should I use to measure semantic similarity?
In angle.py:
if end_with_eos:
    features = self.tokenizer.pad(
        {'input_ids': [feature['input_ids'] for feature in new_features]},
        padding=False,
        max_length=self.max_length - 1,
        return_tensors=return_tensors,
        truncation=True,
    )
    features['input_ids'] = [input_ids + [self.tokenizer.eos_token_id] for input_ids in features['input_ids']]
    features = self.tokenizer.pad(features, padding=self.padding, return_tensors=return_tensors)
TypeError: PreTrainedTokenizerBase.pad() got an unexpected keyword argument 'truncation'
I'm using AnglE 0.3.1 and tokenizers 0.15.1.
Here is my train_lora.py:
from datasets import load_dataset
from angle_emb import AnglE, AngleDataTokenizer
# 2. load dataset
# `text1`, `text2`, and `label` are three required columns.
def get_ds(path):
    ds = xxx
    return ds
# 3. transform data
rt = '../data/dataset/v02/'
data_files = {xxx}
ds = load_dataset(rt)
ds = ds.map(lambda obj: {"text1": str(obj["s1"]), "text2": str(obj['s2']), "label": obj['label']})
ds = ds.select_columns(["text1", "text2", "label"])
# 1. load pretrained model
# model_path = '../UAE-Large-V1' # first finetune based model
model_path = '../sts-b/2/ll10e1/best-checkpoint/' # second finetune based model
angle = AnglE.from_pretrained(model_path, max_length=50, pooling_strategy='cls', apply_lora=True, load_kbit=4, train_mode=True).cuda()
# 3. transform data
train_ds = ds['train'].shuffle().map(AngleDataTokenizer(angle.tokenizer, angle.max_length), num_proc=8)
valid_ds = ds['validation'].map(AngleDataTokenizer(angle.tokenizer, angle.max_length), num_proc=8)
batch_size = 32
save_steps = len(train_ds) // batch_size
lrb = 10
epoch = 5
output_dir = f'../sts-b/7/ll{lrb}e{epoch}'
print('save_steps:', save_steps, output_dir)
# 4. fit
angle.fit(
    train_ds=train_ds,
    valid_ds=valid_ds,
    output_dir=output_dir,
    batch_size=batch_size,
    epochs=epoch,
    learning_rate=lrb * (10 ** -5),
    save_steps=save_steps,
    eval_steps=1000,
    warmup_steps=0,
    gradient_accumulation_steps=4,
    loss_kwargs={
        'w1': 1.0,
        'w2': 35,
        'w3': 1.0,
        'cosine_tau': 20,
        'ibn_tau': 20,
        'angle_tau': 1.0
    },
    fp16=True,
    logging_steps=100
)
When I run this code to finetune with the first finetune model, this error occurs:
INFO:AnglE:lora_config={'task_type': <TaskType.FEATURE_EXTRACTION: 'FEATURE_EXTRACTION'>, 'r': 32, 'lora_alpha': 32, 'lora_dropout': 0.1}
INFO:AnglE:lora target modules=['base_layer', 'default']
INFO:peft.tuners.tuners_utils:Already found a peft_config attribute in the model. This will lead to having multiple adapters in the model. Make sure to know what you are doing!
Traceback (most recent call last):
  File "/mnt/bd/mlx-bytedrive-1378-622c9164/llm/uae/train_lora.py", line 22, in <module>
    angle = AnglE.from_pretrained(model_path, max_length=50, pooling_strategy='cls', apply_lora=True, load_kbit=4, train_mode=True).cuda() #
  File "/mnt/bd/mlx-bytedrive-1378-622c9164/llm/venv/lib/python3.9/site-packages/angle_emb/angle.py", line 847, in from_pretrained
    angle = AnglE(model_name_or_path,
  File "/mnt/bd/mlx-bytedrive-1378-622c9164/llm/venv/lib/python3.9/site-packages/angle_emb/angle.py", line 772, in __init__
    model = get_peft_model(model, peft_config)
  File "/mnt/bd/mlx-bytedrive-1378-622c9164/llm/venv/lib/python3.9/site-packages/peft/mapping.py", line 133, in get_peft_model
    return MODEL_TYPE_TO_PEFT_MODEL_MAPPING[peft_config.task_type](model, peft_config, adapter_name=adapter_name)
  File "/mnt/bd/mlx-bytedrive-1378-622c9164/llm/venv/lib/python3.9/site-packages/peft/peft_model.py", line 1835, in __init__
    super().__init__(model, peft_config, adapter_name)
  File "/mnt/bd/mlx-bytedrive-1378-622c9164/llm/venv/lib/python3.9/site-packages/peft/peft_model.py", line 125, in __init__
    self.base_model = cls(model, {adapter_name: peft_config}, adapter_name)
  File "/mnt/bd/mlx-bytedrive-1378-622c9164/llm/venv/lib/python3.9/site-packages/peft/tuners/lora/model.py", line 111, in __init__
    super().__init__(model, config, adapter_name)
  File "/mnt/bd/mlx-bytedrive-1378-622c9164/llm/venv/lib/python3.9/site-packages/peft/tuners/tuners_utils.py", line 90, in __init__
    self.inject_adapter(self.model, adapter_name)
  File "/mnt/bd/mlx-bytedrive-1378-622c9164/llm/venv/lib/python3.9/site-packages/peft/tuners/tuners_utils.py", line 247, in inject_adapter
    self._create_and_replace(peft_config, adapter_name, target, target_name, parent, **optional_kwargs)
  File "/mnt/bd/mlx-bytedrive-1378-622c9164/llm/venv/lib/python3.9/site-packages/peft/tuners/lora/model.py", line 202, in _create_and_replace
    new_module = self._create_new_module(lora_config, adapter_name, target, **kwargs)
  File "/mnt/bd/mlx-bytedrive-1378-622c9164/llm/venv/lib/python3.9/site-packages/peft/tuners/lora/model.py", line 355, in _create_new_module
    raise ValueError(
ValueError: Target module Dropout(p=0.1, inplace=False) is not supported. Currently, only the following modules are supported: torch.nn.Linear, torch.nn.Embedding, torch.nn.Conv2d, transformers.pytorch_utils.Conv1D.
If I load the model with
angle = AnglE.from_pretrained(model_path, max_length=50, pooling_strategy='cls', train_mode=True).cuda()
then this error occurs:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
What should I do to fine-tune my already fine-tuned adapter (PEFT) model again?
Thanks!
Thank you for your work. I'm new to NLP, and I want to know which feature to use to cluster similar sentences.
After UAE (non-retrieval), I get an (n, 1024) feature. Should I use the start token's feature, the same as with E5?
By the way, I found that with E5, "A red teddy bear wearing blue shirt" is very similar to "A blue teddy bear wearing red shirt". Similarly, "A man riding a horse" will be close to "A horse riding a man". Is that a problem for all algorithms?
Mac mini
14.2.1 (23C71)
angle-emb==0.4.5
from angle_emb import AnglE, Prompts
print('All predefined prompts:', Prompts.list_prompts())
angle = AnglE.from_pretrained('WhereIsAI/UAE-Large-V1', pooling_strategy='cls')
print("angle:", angle)
angle.set_prompt(prompt=Prompts.C)
I get this error:
angle: <angle_emb.angle.AnglE object at 0x152515d30>
Traceback (most recent call last):
File "/Volumes/NBDATA/JobProjects/Tsinghua/Data-chat/text_splitter/article_partition_splitter.py", line 19, in <module>
angle.set_prompt(prompt=Prompts.C)
AttributeError: 'AnglE' object has no attribute 'set_prompt'
I wanted to start by expressing my appreciation for your incredible model; its outstanding performance has significantly benefited my work, and for that, I am truly grateful.
I'm reaching out to inquire if you might consider incorporating Matryoshka Representation Learning into your model's training process. I believe that this technique could further amplify the model's capabilities and effectiveness, potentially boosting its performance even more.
Thank you for your time and for creating such a valuable tool.
Have you included the M3E model as a comparison?
I would like to ask whether it is possible, and how, to combine AnglE and bert-multilingual-base to obtain a model similar to angle-bert-multilingual-base-uncased-nli-en-v1.
Is there any information on whether this is also recommended for extracting embeddings from code snippets, in particular JavaScript and Solidity?
Hi, very interesting approach, and impressive results.
Have you run the MTEB STS benchmark on your trained models?
It would be interesting to see their performance vs. existing models of similar size.
Thanks
Hello, I am wondering what constant values you used for fine-tuning. The loss is L = w1 * L_cos + w2 * L_ibn + w3 * L_angle, but I did not find the values of w1, w2 and w3 in your paper.
The SNLI dataset contains contradiction pairs. It defines the labels as follows:
label: an integer whose value may be either 0, indicating that the hypothesis entails the premise, 1, indicating that the premise and hypothesis neither entail nor contradict each other, or 2, indicating that the hypothesis contradicts the premise. Dataset instances which don't have any gold label are marked with -1 label. Make sure you filter them before starting the training using datasets.Dataset.filter.
If I want to use AnglE to fine-tune on that kind of dataset, should I set -1 for contradiction pairs?
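The filtering step mentioned in the quoted label description can be sketched as a plain predicate. This covers only the removal of rows without a gold label (label -1); it does not settle how contradiction pairs should be labelled for AnglE. `has_gold_label` is a hypothetical helper name, and the rows are made-up examples.

```python
from typing import Dict, List


def has_gold_label(example: Dict) -> bool:
    """True for rows that carry a gold label (0, 1 or 2), False for -1."""
    return example["label"] != -1


# Made-up rows in the SNLI column layout (premise, hypothesis, label).
rows: List[Dict] = [
    {"premise": "A man is eating.", "hypothesis": "A person eats.", "label": 0},
    {"premise": "A dog runs.", "hypothesis": "It is raining.", "label": -1},
    {"premise": "A child sleeps.", "hypothesis": "A child is awake.", "label": 2},
    {"premise": "Two women talk.", "hypothesis": "They are friends.", "label": 1},
]

kept = [row for row in rows if has_gold_label(row)]

# With a Hugging Face dataset the same predicate can be passed to filter:
# ds = ds.filter(has_gold_label)
```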
Got this error while using this library to train an embedding model:
File "/usr/local/lib/python3.8/dist-packages/angle_emb/angle.py", line 986, in on_epoch_end
corrcoef, accuracy = self.evaluate_fn(self.valid_ds)
File "/usr/local/lib/python3.8/dist-packages/angle_emb/angle.py", line 1470, in evaluate
pred = (x_vecs[::2] * x_vecs[1::2]).sum(1)
ValueError: operands could not be broadcast together with shapes (38,384) (37,384)
I confirmed that valid_ds and train_ds were of even length, so ultimately I just modified one line of the evaluate method of the AnglE class. After this line:
x_vecs = l2_normalize(x_vecs)
I added:
if len(x_vecs) % 2 != 0:
    x_vecs = x_vecs[:-1]
Hopefully that doesn't break anything else? Any thoughts on what else might be the source of the issue?
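The shapes in the error, (38,384) and (37,384), mean that x_vecs held 75 rows at that point: the even-index slice gets 38 rows and the odd-index slice gets 37. The sketch below reproduces the pairing logic in plain Python with the same guard, to show what the workaround does: the last, unpaired vector is dropped. `paired_dot_products` is a hypothetical name, and the vectors are made-up placeholders.

```python
from typing import List, Sequence


def paired_dot_products(vecs: Sequence[Sequence[float]]) -> List[float]:
    """Dot product of rows (0, 1), (2, 3), ...; a trailing unpaired row is ignored."""
    usable = len(vecs) - (len(vecs) % 2)  # largest even count <= len(vecs)
    firsts = vecs[0:usable:2]
    seconds = vecs[1:usable:2]
    return [sum(a * b for a, b in zip(u, v)) for u, v in zip(firsts, seconds)]


# Five placeholder rows: two complete pairs plus one leftover row.
vecs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
preds = paired_dot_products(vecs)
```

Dropping the leftover row keeps the evaluation running, but the number of predictions then no longer matches the number of labels unless the labels are trimmed the same way, so the odd count is still worth tracing.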
Also, I attempted to restart training by running the same angle.fit() as I did when I started it, but adjusting from_pretrained to point to the most recent checkpoint:
angle = AnglE.from_pretrained('/checkpoint-1100', max_length=512, pooling_strategy='cls').cuda()
I don't see a resume_from_checkpoint=True argument option anywhere, so it's not clear that it's aware of how many epochs have already been run, etc.
A recent update to PEFT deprecates and removes references to prepare_model_for_8bit_training. Since AnglE's dependencies are unversioned and it references the deprecated method here, the model breaks when its constructor is invoked.
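One way to tolerate this kind of rename is to look up the first name that the installed package actually exposes. The sketch below is a generic helper, not AnglE's actual fix; `first_available` is a hypothetical name. It relies on PEFT offering `prepare_model_for_kbit_training` as the newer counterpart of the removed function. Pinning the peft version in the package requirements is the other obvious route.

```python
import importlib
from typing import Any, Sequence


def first_available(module_name: str, candidate_names: Sequence[str]) -> Any:
    """Return the first attribute from `candidate_names` that the module defines."""
    module = importlib.import_module(module_name)
    for name in candidate_names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(
        f"None of {list(candidate_names)} found in module {module_name!r}"
    )


# Usage with PEFT (assumes peft is installed, so it is left commented out):
# prepare_fn = first_available(
#     "peft",
#     ["prepare_model_for_kbit_training", "prepare_model_for_8bit_training"],
# )
# model = prepare_fn(model)
```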
I am trying to train my model using LLAMA-v2-nli. I was able to do so with the bert-nli model but when I try to run with LLAMA I get the following error:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
from angle_emb import AnglE, AngleDataTokenizer
angle = AnglE.from_pretrained('NousResearch/Llama-2-7b-hf', pretrained_lora_path='SeanLee97/angle-llama-7b-nli-v2').cuda()
train_ds = ds['train'].shuffle().map(AngleDataTokenizer(angle.tokenizer, angle.max_length), num_proc=8)
valid_ds = ds['valid'].map(AngleDataTokenizer(angle.tokenizer, angle.max_length), num_proc=8)
test_ds = ds['test'].map(AngleDataTokenizer(angle.tokenizer, angle.max_length), num_proc=8)
angle.fit(
    train_ds=train_ds,
    valid_ds=test_ds,
    output_dir='ckpts/sts-b',
    batch_size=16,
    epochs=5,
    learning_rate=2e-5,
    save_steps=100,
    eval_steps=1000,
    warmup_steps=0,
    gradient_accumulation_steps=1,
    loss_kwargs={
        'w1': 1.0,
        'w2': 1.0,
        'w3': 1.0,
        'cosine_tau': 20,
        'ibn_tau': 20,
        'angle_tau': 1.0
    },
    fp16=True,
    logging_steps=100
)
I used the same code for BERT (loading the BERT model instead) and it works without issues.
Since I only saw the training example of angle-bert-base-uncased-nli-en-v1, I was wondering if the UAE-Large-V1 training is the same. Thank you very much for your replies.
Thanks for your great work. Could you please provide the code for 2D Matryoshka Sentence Embeddings or any checkpoints?
I am using LlamaIndex to index documents into chromadb, and for that I use the HuggingFaceEmbedding abstraction like this:
embed_model = HuggingFaceEmbedding(model_name="WhereIsAI/UAE-Large-V1")
However, I read that one needs to specify prompt C in order to optimize the embedding for retrieval.
Hi, this is a really good and useful codebase. I tried to reproduce the results reported in the paper but failed. I used the code in README_ESE.md:
WANDB_MODE=disabled CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 --master_port=1234 -m angle_emb.angle_trainer \
--model_name_or_path WhereIsAI/UAE-Large-V1 \
--train_name_or_path SeanLee97/nli_for_simcse --save_dir ckpts/UAE-Large-Espresso \
--ibn_w 10.0 --cosine_w 0. --angle_w 1.0 --angle_tau 20.0 --learning_rate 1e-6 --maxlen 75 \
--workers 16 \
--pooling_strategy cls \
--epochs 1 \
--batch_size 128 \
--logging_steps 100 \
--warmup_steps 200 \
--save_steps 1000 \
--fp16 1 \
--gradient_accumulation_steps 4 \
--apply_ese 1 \
--ese_compression_size 128 \
--ese_kl_temperature 1.0
However, it only gave the following results:
| sts12 | sts13 | sts14 | sts15 | sts16 | STSB | SICKR | Avg. |
|---|---|---|---|---|---|---|---|
| 79.25 | 88.63 | 84.15 | 89.61 | 85.99 | 87.79 | 79.59 | 85.00 |
I also changed --cosine_w 0. to --cosine_w 1.0 and --ibn_w 10.0 to --ibn_w 35.0, but the results were even worse.
The results reported in your paper are:
| sts12 | sts13 | sts14 | sts15 | sts16 | STSB | SICKR | Avg. |
|---|---|---|---|---|---|---|---|
| 79.64 | 90.40 | 85.76 | 90.33 | 86.64 | 88.54 | 81.09 | 86.06 |
If I purely evaluate the WhereIsAI/UAE-Large-V1 model, the results are:
| sts12 | sts13 | sts14 | sts15 | sts16 | STSB | SICKR | Avg. |
|---|---|---|---|---|---|---|---|
| 79.09 | 89.62 | 85.02 | 89.51 | 86.61 | 89.06 | 82.09 | 85.86 |
This means fine-tuning gave me worse performance. In addition, I noticed that the more epochs I train, the worse the performance gets.
Besides, I also tried the code in examples/NLI/README.md to train Qwen1.5-0.5B:
CUDA_VISIBLE_DEVICES=1,2,3,4 torchrun --nproc_per_node=4 --master_port=1234 train_angle.py \
--task NLI-STS --save_dir ckpts/NLI-STS-angle-Qwen1.5-0.5B \
--model_name Qwen/Qwen1.5-0.5B \
--w2 35 --learning_rate 1e-4 --maxlen 50 \
--lora_r 32 --lora_alpha 32 --lora_dropout 0.1 \
--save_steps 500 --batch_size 120 --seed 42 --do_eval 0 --load_kbit 4 --gradient_accumulation_steps 4 --epochs 1
It gave me an average score of 70.23, whereas the paper reports 82.82.
I wonder whether these scripts are the ones you used to train your model, especially regarding the parameter values. It would be really helpful if you could assist me in reproducing the results so I can use this codebase. I really appreciate your time and help! Thank you!
I created an application that uses the UAE-large-V1 model inside Transformers.js and was able to embed sentences in a browser without issues. The model would return a single vector for a single input:
extractor = await pipeline("feature-extraction", "WhereIsAI/UAE-Large-V1", {
  quantized: true,
});
let result = await extractor(text, { pooling: "mean", normalize: true });
When I hosted the model on Hugging Face using their inference endpoint solution, it no longer worked as expected. Instead of returning a single vector, it returns a variable-length list of 1024-dimensional vectors.
Sample input:
{
"inputs": "Where are you"
}
This returns a list of lists of lists of numbers.
Is there a way to make the hosted model return a single vector? And why does the model act differently based on where it's hosted?
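The browser call above asks for `pooling: "mean"` and `normalize: true`, while a nested list of per-token vectors is what an unpooled feature-extraction output looks like. Assuming the endpoint response has the shape [batch][tokens][dim] and contains no padding tokens (true for a single input), the same pooling can be applied on the client side. The sketch below uses plain Python with made-up numbers; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def mean_pool(token_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-token vectors of one sentence, dimension by dimension."""
    count = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(tok[i] for tok in token_vectors) / count for i in range(dim)]


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def pool_response(response: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Turn a [batch][tokens][dim] response into one unit vector per input."""
    return [l2_normalize(mean_pool(tokens)) for tokens in response]


# Made-up response: one input, two tokens, two dimensions.
response = [[[1.0, 0.0], [3.0, 4.0]]]
sentence_vectors = pool_response(response)
```

For batched inputs with padding, the average would need to skip padded positions, which requires the attention mask and is not covered by this sketch.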
When I install angle-emb and then try to load the Llama model, I get an error because sentencepiece is not installed. Maybe sentencepiece needs to be added as a requirement of the package?