PROJECT-Alpaca

  • Period: 23.04.06 ~ 23.04.13

Generating the data

  • API connection

    ### .env file
    OPENAI_API_KEY = {OPENAI_API_KEY}
  • Run

    python -m generate_instruction generate_instruction_following_data

  • Troubleshooting

    TypeError: 'type' object is not subscriptable

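    import sys
    from typing import Optional, Sequence, Union, Dict
    
    def openai_completion(
        ### Type annotation error! Before Python 3.9 the built-in dict cannot be
        ### subscripted, so the original annotation below raises the TypeError:
        # prompts: Union[str, Sequence[str], Sequence[dict[str, str]], dict[str, str]],
        ### Fix: use typing.Dict in the annotation instead.
        prompts: Union[str, Sequence[str], Sequence[Dict[str, str]], Dict[str, str]],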
        decoding_args: OpenAIDecodingArguments,
        model_name="text-davinci-003",
        sleep_time=2,
        batch_size=1,
        max_instances=sys.maxsize,
        max_batches=sys.maxsize,
        return_text=False,
        **decoding_kwargs,
    ) -> Union[Union[StrOrOpenAIObject], Sequence[StrOrOpenAIObject], Sequence[Sequence[StrOrOpenAIObject]],]:
  • KoAlpaca

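    • Generate the dataset in English → translate the Instruction and Input → generate the output ⇒ 52k Korean dataset
    • In other words, the step that generates the dataset in English is the same as above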
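The KoAlpaca flow described above (translate the Instruction and Input, then generate a new output from the translated text) can be sketched roughly as follows. This is only an illustration under stated assumptions: `build_korean_dataset`, `translate`, and `generate_output` are hypothetical names, and the two callables stand in for whatever translation service and completion model are actually used.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def build_korean_dataset(
    english_records: List[Record],
    translate: Callable[[str], str],             # hypothetical: English text -> Korean text
    generate_output: Callable[[str, str], str],  # hypothetical: (instruction, input) -> output
) -> List[Record]:
    """Translate instruction/input, then regenerate the output from the translated text."""
    korean_records: List[Record] = []
    for record in english_records:
        instruction_ko = translate(record["instruction"])
        # Many records have an empty input; keep it empty instead of translating it.
        input_ko = translate(record["input"]) if record.get("input") else ""
        # The English output is not translated; a new output is generated instead.
        output_ko = generate_output(instruction_ko, input_ko)
        korean_records.append(
            {"instruction": instruction_ko, "input": input_ko, "output": output_ko}
        )
    return korean_records
```

Passing the two steps in as callables keeps the sketch independent of any specific translation or completion API, so stub functions can be used to check the data flow before spending API calls.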

Fine-tuning on a Code Q&A dataset

  1. Check the input / output format
  2. Generate an instruction dataset for a specific domain (code Q&A)
  3. Fine-tune the Alpaca model
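For step 1, it can help to see how one instruction / input / output record turns into a training prompt and target. The sketch below is modeled on the Stanford Alpaca prompt layout; treat the exact template wording as an assumption to verify against the training script you use. `build_training_pair` and the sample record are hypothetical.

```python
from typing import Dict

# Assumed templates, modeled on the Stanford Alpaca prompt layout.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)


def build_training_pair(record: Dict[str, str]) -> Dict[str, str]:
    """Return the prompt fed to the model and the completion it should learn."""
    template = PROMPT_WITH_INPUT if record.get("input") else PROMPT_NO_INPUT
    # str.format ignores unused keyword arguments, and braces inside the
    # substituted values (common in code) are not re-parsed.
    prompt = template.format(
        instruction=record["instruction"], input=record.get("input", "")
    )
    return {"prompt": prompt, "completion": record["output"]}


# Hypothetical code Q&A record, for illustration only.
sample = {
    "instruction": "What does this function return?",
    "input": "def f():\n    return {1: 2}",
    "output": "It returns the dict {1: 2}.",
}
pair = build_training_pair(sample)
```

Printing `pair["prompt"]` for a few records is a quick way to confirm that code snippets in the input field survive formatting unchanged.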

Contributors

  • chaewon-leee
