
Add phi3-mini (ailia-models issue, open)

kyakuno commented on June 10, 2024

Add phi3-mini

from ailia-models.

Comments (12)

kyakuno commented on June 10, 2024

An official ONNX model may be provided.
https://onnxruntime.ai/blogs/accelerating-phi-3


kyakuno commented on June 10, 2024

An official ONNX model has now been released.
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx


kyakuno commented on June 10, 2024

The generate API needs to be written in Python.


kyakuno commented on June 10, 2024

Example inference code:
microsoft/onnxruntime#20448


kyakuno commented on June 10, 2024

With the beta version of onnxruntime, the following works.

import onnxruntime_genai as og
import time

# Raw string so the Windows backslashes are not treated as escape sequences.
model = og.Model(r".\Phi-3-mini-128k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32")
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()


def input_llm(text):
    print("Question:",text)
    input_tokens = tokenizer.encode(text)
    params = og.GeneratorParams(model)
    params.try_use_cuda_graph_with_max_batch_size(1)
    params.input_ids = input_tokens
    generator = og.Generator(model, params)
    return generator

def output_llm(generator):
    print("Answer:")
    stt = time.time()
    list_error = []
    list_sentence = []
    while not generator.is_done():
        generator.compute_logits()
        generator.generate_next_token()
        new_token = generator.get_next_tokens()[0]
        if new_token not in list_error:
            try:
                list_sentence.append(tokenizer_stream.decode(new_token))
            except Exception:
                # Remember tokens that fail to decode and keep the raw token id instead.
                list_error.append(new_token)
                list_sentence.append(new_token)
    print(list_sentence)
    fin = time.time()
    print(fin-stt)
    return list_error


kyakuno commented on June 10, 2024

The onnxruntime_genai source code:
https://github.com/microsoft/onnxruntime-genai


kyakuno commented on June 10, 2024

generate is written in C++, so it seems better to bring over the PyTorch implementation instead.


kyakuno commented on June 10, 2024

For now, using transformers for the tokenizer seems like a good approach.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

torch.random.manual_seed(0)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-128k-instruct", 
    device_map="cuda", 
    torch_dtype="auto", 
    trust_remote_code=True, 
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-128k-instruct")

messages = [
    {"role": "system", "content": "You are a helpful digital assistant. Please provide safe, ethical and accurate information to the user."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving an 2x + 3 = 7 equation?"},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,
    "do_sample": False,
}

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])

https://huggingface.co/microsoft/Phi-3-mini-128k-instruct
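
When porting away from the transformers pipeline, the messages list has to be turned into a single prompt string before tokenization. A minimal sketch of that step follows, assuming the `<|role|>` / `<|end|>` chat format described on the model card. The helper name `build_phi3_prompt` is hypothetical, and the authoritative template is the `chat_template` entry in tokenizer_config.json, which may also add a BOS token.

```python
from typing import Dict, List


def build_phi3_prompt(messages: List[Dict[str, str]],
                      add_generation_prompt: bool = True) -> str:
    """Flatten chat messages into one prompt string.

    Assumption: each turn is rendered as "<|role|>\\ncontent<|end|>\\n",
    following the chat format shown on the Phi-3 model card.
    """
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}<|end|>\n")
    if add_generation_prompt:
        # Ask the model to continue as the assistant.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {"role": "user", "content": "What about solving an 2x + 3 = 7 equation?"},
]
prompt = build_phi3_prompt(messages)
print(prompt)
```

The resulting string can then be passed to whichever tokenizer ends up being used.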


kyakuno commented on June 10, 2024

For text generation, something like greedy search should do for now.
https://github.com/axinc-ai/ailia-models/blob/master/natural_language_processing/rinna_gpt2/utils_rinna_gpt2.py
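
As a rough illustration of greedy search, here is a minimal sketch that is not taken from the rinna_gpt2 utility linked above. It assumes a hypothetical callable `next_logits` that wraps the model and returns the logits for the next token given all token ids so far; `toy_next_logits` is a stand-in so the loop can run without a model.

```python
from typing import Callable, List, Sequence


def greedy_search(next_logits: Callable[[List[int]], Sequence[float]],
                  prompt_ids: Sequence[int],
                  eos_id: int,
                  max_new_tokens: int = 32) -> List[int]:
    """Repeatedly append the highest-scoring token until EOS or the length limit."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_logits(ids)
        # Greedy choice: index of the largest logit (argmax).
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids


def toy_next_logits(ids: List[int]) -> List[float]:
    """Stand-in for a real model: always prefers (last id + 1) in a 5-token vocabulary."""
    vocab_size = 5
    target = (ids[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


result = greedy_search(toy_next_logits, prompt_ids=[0], eos_id=3)
print(result)
```

In a real port, `next_logits` would run the ONNX model on the current ids (ideally reusing the key/value cache) and return the last position's logits.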


kyakuno commented on June 10, 2024

The model uses LlamaTokenizer.
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/tokenizer_config.json


kyakuno commented on June 10, 2024

LlamaTokenizer
https://github.com/huggingface/transformers/blob/37fa1f654f17b68bbe30440c64e611f1a4d55bc7/src/transformers/models/llama/tokenization_llama.py#L55


kyakuno commented on June 10, 2024

It looks like a standard SentencePiece tokenizer.

