Comments (12)
An official ONNX model may be provided.
https://onnxruntime.ai/blogs/accelerating-phi-3
from ailia-models.
An official ONNX model has now been released.
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx
from ailia-models.
The generate API (the token generation loop) has to be written in Python.
from ailia-models.
Example inference code.
microsoft/onnxruntime#20448
from ailia-models.
With the beta version of onnxruntime, the following code works.
import onnxruntime_genai as og
import argparse
import time

# Raw string so the Windows backslashes are not read as escape sequences.
model = og.Model(r".\Phi-3-mini-128k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32")
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()

def input_llm(text):
    # Encode the prompt and set up a generator for it.
    print("Question:", text)
    input_tokens = tokenizer.encode(text)
    params = og.GeneratorParams(model)
    params.try_use_cuda_graph_with_max_batch_size(1)
    params.input_ids = input_tokens
    generator = og.Generator(model, params)
    return generator

def output_llm(generator):
    # Generate one token at a time until the generator reports completion.
    print("Answer:")
    stt = time.time()
    list_error = []
    list_sentence = []
    while not generator.is_done():
        generator.compute_logits()
        generator.generate_next_token()
        new_token = generator.get_next_tokens()[0]
        if new_token not in list_error:
            try:
                list_sentence.append(tokenizer_stream.decode(new_token))
            except Exception:
                # Tokens that fail to decode are recorded and kept as raw ids.
                list_error.append(new_token)
                list_sentence.append(new_token)
    print(list_sentence)
    fin = time.time()
    print(fin - stt)  # elapsed generation time in seconds
    return list_error
from ailia-models.
The onnxruntime_genai source code.
https://github.com/microsoft/onnxruntime-genai
from ailia-models.
generate is written in C++, so it seems better to bring over the implementation written for PyTorch.
from ailia-models.
For now, using transformers for the tokenizer seems like a good option.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

torch.random.manual_seed(0)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-128k-instruct",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-128k-instruct")

messages = [
    {"role": "system", "content": "You are a helpful digital assistant. Please provide safe, ethical and accurate information to the user."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving an 2x + 3 = 7 equation?"},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,
    "do_sample": False,
}

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct
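If the generation loop is written by hand against the ONNX model, the transformers pipeline is not there to turn the messages into a prompt. The following is a minimal sketch, assuming the chat format described on the model card (role markers such as `<|user|>` and `<|assistant|>`, with each turn closed by `<|end|>`). The helper name `build_phi3_prompt` is hypothetical, and the chat template shipped with the tokenizer should be treated as the authority if the two disagree.

```python
def build_phi3_prompt(messages):
    """Hypothetical helper: flatten chat messages into one prompt string.

    Assumes each turn is written as "<|role|>\ncontent<|end|>\n" and that
    the prompt ends with an open assistant turn for the model to complete.
    """
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}<|end|>\n")
    # Leave the assistant turn open so generation continues from here.
    parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {"role": "user", "content": "What about solving an 2x + 3 = 7 equation?"},
]
print(build_phi3_prompt(messages))
```

The resulting string can then be passed to the tokenizer's encode step before generation.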
from ailia-models.
For text generation, something like greedy search should do for now.
https://github.com/axinc-ai/ailia-models/blob/master/natural_language_processing/rinna_gpt2/utils_rinna_gpt2.py
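As a rough illustration of greedy search, here is a minimal sketch that is not taken from the linked file. `next_logits` is a hypothetical callable standing in for one forward pass of the model: it receives the token ids so far and returns one score per vocabulary entry. `toy_next_logits` is a toy stand-in so the loop can run without a model.

```python
def greedy_generate(next_logits, input_ids, eos_token_id, max_new_tokens=32):
    """Greedy search: always append the highest-scoring next token."""
    tokens = list(input_ids)
    for _ in range(max_new_tokens):
        logits = next_logits(tokens)
        # argmax over the vocabulary scores
        next_token = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_token)
        if next_token == eos_token_id:
            break
    return tokens


def toy_next_logits(tokens, vocab_size=8):
    # Toy stand-in for a model: favours (last token + 1), wrapping around.
    logits = [0.0] * vocab_size
    logits[(tokens[-1] + 1) % vocab_size] = 1.0
    return logits


print(greedy_generate(toy_next_logits, [2], eos_token_id=5))  # [2, 3, 4, 5]
```

With a real model, `next_logits` would run the ONNX session and return the logits of the last position. Sampling strategies could later replace the argmax step without changing the rest of the loop.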
from ailia-models.
It uses LlamaTokenizer.
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/tokenizer_config.json
from ailia-models.
from ailia-models.
It looks like an ordinary SentencePiece tokenizer.
from ailia-models.