Add phi3-mini (ailia-models, OPEN)

kyakuno commented on June 10, 2024
Add phi3-mini

from ailia-models.

Comments (12)

kyakuno commented on June 10, 2024

An official ONNX model may be provided.
https://onnxruntime.ai/blogs/accelerating-phi-3

kyakuno commented on June 10, 2024

An official ONNX model has now been released.
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx

kyakuno commented on June 10, 2024

The generate API has to be written in Python.

kyakuno commented on June 10, 2024

Example inference code:
microsoft/onnxruntime#20448

kyakuno commented on June 10, 2024

With the beta version of onnxruntime, the following code works.

import time

import onnxruntime_genai as og

# Written against the beta onnxruntime-genai API in use at the time of this comment.
# Raw string so the Windows backslashes are not read as escape sequences.
model = og.Model(r".\Phi-3-mini-128k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32")
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()


def input_llm(text):
    """Tokenize the prompt and return a generator ready for decoding."""
    print("Question:", text)
    input_tokens = tokenizer.encode(text)
    params = og.GeneratorParams(model)
    params.try_use_cuda_graph_with_max_batch_size(1)
    params.input_ids = input_tokens
    generator = og.Generator(model, params)
    return generator


def output_llm(generator):
    """Decode token by token, print the pieces and the elapsed time."""
    print("Answer:")
    stt = time.time()
    list_error = []     # token ids the stream decoder failed on
    list_sentence = []  # decoded pieces (or raw ids when decoding failed)
    while not generator.is_done():
        generator.compute_logits()
        generator.generate_next_token()
        new_token = generator.get_next_tokens()[0]
        # Tokens that already failed once are skipped entirely.
        if new_token not in list_error:
            try:
                list_sentence.append(tokenizer_stream.decode(new_token))
            except Exception:
                list_error.append(new_token)
                list_sentence.append(new_token)
    print(list_sentence)
    fin = time.time()
    print(fin - stt)
    return list_error
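
The functions above encode the question text as-is. The Phi-3 instruct model card describes a chat format built from `<|user|>`, `<|assistant|>` and `<|end|>` markers, so the text probably needs to be wrapped before it is encoded. A minimal sketch, assuming that format; the helper name `build_phi3_prompt` is hypothetical and is not part of onnxruntime_genai or transformers.

```python
def build_phi3_prompt(messages):
    """Render {"role", "content"} dicts into one prompt string.

    Assumes the Phi-3 chat format: each turn is "<|role|>\ncontent<|end|>\n",
    and the prompt ends with "<|assistant|>\n" so the model answers next.
    """
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}<|end|>\n")
    parts.append("<|assistant|>\n")
    return "".join(parts)


prompt = build_phi3_prompt([{"role": "user", "content": "What is 2 + 3?"}])
print(prompt)
```

The resulting string could then be passed to `input_llm(prompt)` in place of the raw question.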

kyakuno commented on June 10, 2024

The onnxruntime_genai source code:
https://github.com/microsoft/onnxruntime-genai

kyakuno commented on June 10, 2024

generate is written in C++, so it seems better to bring over the implementation written for PyTorch.

kyakuno commented on June 10, 2024

For now, using transformers for the tokenizer seems like a good option.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

torch.random.manual_seed(0)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-128k-instruct", 
    device_map="cuda", 
    torch_dtype="auto", 
    trust_remote_code=True, 
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-128k-instruct")

messages = [
    {"role": "system", "content": "You are a helpful digital assistant. Please provide safe, ethical and accurate information to the user."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving an 2x + 3 = 7 equation?"},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,
    "do_sample": False,
}

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])

https://huggingface.co/microsoft/Phi-3-mini-128k-instruct

kyakuno commented on June 10, 2024

For text generation, something like greedy search will do for now.
https://github.com/axinc-ai/ailia-models/blob/master/natural_language_processing/rinna_gpt2/utils_rinna_gpt2.py
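
Greedy search here means picking the highest-scoring token at every step until an end-of-sequence token or a length limit is reached. A minimal sketch of that loop, not the code from the linked file: `next_logits` is a hypothetical callback standing in for one forward pass of the ONNX model, and `toy_logits` is a made-up stand-in so the example runs without a model.

```python
from typing import Callable, List, Sequence


def greedy_search(
    next_logits: Callable[[List[int]], Sequence[float]],
    input_ids: List[int],
    eos_token_id: int,
    max_new_tokens: int = 32,
) -> List[int]:
    """Append the argmax token each step; stop at EOS or max_new_tokens."""
    tokens = list(input_ids)
    for _ in range(max_new_tokens):
        logits = next_logits(tokens)  # scores over the vocabulary for the next token
        next_token = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_token)
        if next_token == eos_token_id:
            break
    return tokens


def toy_logits(tokens: List[int]) -> List[float]:
    """Toy 5-token 'model' that always prefers (last token + 1) mod 5."""
    target = (tokens[-1] + 1) % 5
    return [1.0 if i == target else 0.0 for i in range(5)]


print(greedy_search(toy_logits, [0], eos_token_id=4))  # prints [0, 1, 2, 3, 4]
```

A real implementation would also carry the key/value cache between steps instead of re-running the full sequence; that is left out here to keep the loop readable.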

kyakuno commented on June 10, 2024

It uses LlamaTokenizer.
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/tokenizer_config.json

kyakuno commented on June 10, 2024

LlamaTokenizer
https://github.com/huggingface/transformers/blob/37fa1f654f17b68bbe30440c64e611f1a4d55bc7/src/transformers/models/llama/tokenization_llama.py#L55

kyakuno commented on June 10, 2024

It looks like an ordinary SentencePiece tokenizer.
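
One practical consequence for detokenization: SentencePiece marks word boundaries with the "▁" character (U+2581) at the start of a piece, so decoded pieces have to be joined and the marker turned back into spaces. A minimal sketch of that step; the function name `pieces_to_text` and the sample pieces are made up for illustration and are not taken from the Phi-3 vocabulary. Byte-fallback pieces such as `<0xE3>` are not handled.

```python
SPIECE_UNDERLINE = "\u2581"  # "▁", SentencePiece's whitespace marker


def pieces_to_text(pieces):
    """Join SentencePiece pieces and restore spaces from the "▁" marker."""
    text = "".join(pieces).replace(SPIECE_UNDERLINE, " ")
    # The first piece usually carries a leading marker; drop that one space.
    return text[1:] if text.startswith(" ") else text


print(pieces_to_text(["\u2581Hello", "\u2581wor", "ld", "!"]))  # prints Hello world!
```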
