Using open source LLMs to build synthetic datasets for direct preference optimization
Lines in code retreat,
Synthetic haiku compete,
Nature's rhythm, neat.
Haiku (俳句) is a type of short-form poetry that originated in Japan. Traditional Japanese haiku consist of three phrases composed of 17 phonetic units (called on in Japanese, which are similar to syllables) in a 5, 7, 5 pattern - https://en.wikipedia.org/wiki/Haiku
A request for a haiku currently looks like this when using the TinyLlama/TinyLlama-1.1B-Chat-v1.0 model:
>>> Write a haiku about moss.
Mosses soft, green and shining,
Gracefully drifting in the breeze,
A symphony of light and sound.
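For reference, a request like this can be reproduced with the transformers pipeline API. The snippet below is a minimal sketch: the sampling parameters are illustrative (output will vary from run to run), not the exact settings used to produce the example above.

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# TinyLlama-Chat ships a chat template, so we can use it to format the prompt.
messages = [{"role": "user", "content": "Write a haiku about moss."}]
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Illustrative sampling settings, not the ones used for the example above.
output = pipe(prompt, max_new_tokens=64, do_sample=True, temperature=0.7)
print(output[0]["generated_text"])
```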
This output has some of the properties of a traditional haiku (as expressed in English): it has three lines, and themes of nature and the seasons are present. However, the syllable count is off.
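To make "the syllable count is off" concrete, here is one rough way to check a haiku against the 5-7-5 pattern. This is only a sketch using a naive vowel-group heuristic; English syllabification is genuinely hard, so a more serious check would use a pronouncing dictionary (e.g. CMUdict) or a dedicated syllable-counting library. The function names (count_syllables, is_5_7_5) are made up for illustration.

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count runs of consecutive vowels (treating "y" as a vowel).
    word = word.lower().strip(".,;:!?'\"-")
    if not word:
        return 0
    count = len(re.findall(r"[aeiouy]+", word))
    # A trailing silent "e" usually doesn't add a syllable ("stone", "breeze").
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)

def line_syllables(line: str) -> int:
    return sum(count_syllables(w) for w in line.split())

def is_5_7_5(haiku: str) -> bool:
    lines = [l for l in haiku.strip().splitlines() if l.strip()]
    return len(lines) == 3 and [line_syllables(l) for l in lines] == [5, 7, 5]
```

Even a crude checker like this is enough to flag the TinyLlama output above, whose first line alone has seven syllables rather than five.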
What happens if we do a little prompt engineering and ask the model to write a haiku with a 5, 7, 5 syllable pattern?
>>> Write a haiku about moss. Follow the traditional 5-7-5 syllable pattern.
Mosses, tiny green eyes, inchworm life,
Growing tall, they're hard to see,
Syllables flow, as the world moves by: haiku of life.
Prompt engineering didn't help us here. What if we use a slightly larger model? teknium/OpenHermes-2.5-Mistral-7B is a 7B parameter model that I've found to perform very well on many tasks.
>>> Write a haiku about moss. Follow the traditional 5-7-5 syllable pattern.
Moss, soft emerald hue,
Creeps on stone, whispers secrets -
Nature's silent truth.
This repository contains the code used to generate Synthetic Haiku DPO, a dataset of synthetic haiku generated using Distilabel.
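As a rough illustration of what generation with Distilabel can look like (assuming the distilabel 1.x API; the actual pipeline in this repository may be organized differently):

```python
from distilabel.llms import TransformersLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromDicts
from distilabel.steps.tasks import TextGeneration

with Pipeline(name="synthetic-haiku") as pipeline:
    # A single illustrative prompt; a real pipeline would load many.
    load_prompts = LoadDataFromDicts(
        data=[
            {
                "instruction": (
                    "Write a haiku about moss. "
                    "Follow the traditional 5-7-5 syllable pattern."
                )
            }
        ]
    )
    generate = TextGeneration(
        llm=TransformersLLM(model="teknium/OpenHermes-2.5-Mistral-7B"),
        num_generations=4,  # several candidates per prompt, to compare later
    )
    load_prompts >> generate

if __name__ == "__main__":
    distiset = pipeline.run()
```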
The goal of this repo is to help the author explore the process of using synthetic data to train a model for direct preference optimization (DPO).
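DPO training data takes the form of (prompt, chosen, rejected) triples. One way to turn raw generations into such pairs, sketched here with the hypothetical is_5_7_5 checker from earlier, is to prefer candidates that satisfy the 5-7-5 constraint. This is an illustration of the idea, not necessarily how the published dataset was built.

```python
def build_dpo_pairs(prompt: str, candidates: list[str]) -> list[dict]:
    # Pair each 5-7-5-conforming candidate with a non-conforming one.
    # Sketch only: relies on the heuristic is_5_7_5 checker defined earlier,
    # and assumes all candidates were generated for the same prompt.
    chosen = [c for c in candidates if is_5_7_5(c)]
    rejected = [c for c in candidates if not is_5_7_5(c)]
    return [
        {"prompt": prompt, "chosen": good, "rejected": bad}
        for good, bad in zip(chosen, rejected)
    ]
```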