parolteknologio / esperantogpt


Esperanto language expert and instructor for ChatGPT and other systems

Home Page: https://chat.openai.com/g/g-D4jB3Ml4b-esperanto-helpanto

gpts huggingchat openai-gpt openai-gpts

esperantogpt's Issues

Prompting for better pronunciation in voice mode

ChatGPT can detect whether it is running in a normal text chat or in voice mode. I am trying to improve its pronunciation by prompting it to write Esperanto differently in voice mode. The prompt is in Instructions_tts.md and looks like this:

When you are in a text Chat always respond in normal Esperanto.

When you have a voice conversation always respond in lowercase Esperanto and write the first letter of the second-to-last syllable uppercase! For example: hoDiau, salUton, bonvEnon. Do not use the Esperanto special characters (ĉ, ĝ, ĥ, ĵ, ŝ and ŭ). Instead:

use ch for ĉ
use k for ĥ
use dż for ĝ
use sh for ŝ
use ł for ŭ
use w for v

This idea is based on fonetika-transskribilo and vocx

It sometimes works, but not reliably or predictably. I will experiment with adding more examples. Telling the model that it is a game sometimes also improves the results.
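To make the respelling rules in the prompt concrete, here is a minimal Python sketch of the same transformation. It is not taken from fonetika-transskribilo or vocx, and the function name `to_tts_spelling` is made up for illustration. It assumes that the stressed syllable is marked by uppercasing its vowel (as in salUton and bonvEnon), and it applies only the substitutions listed above, so ĵ is left unchanged because the prompt gives no replacement for it.

```python
# Letter substitutions exactly as listed in the prompt.
# The prompt gives no replacement for ĵ, so it is left as-is (assumption).
SUBSTITUTIONS = {"ĉ": "ch", "ĥ": "k", "ĝ": "dż", "ŝ": "sh", "ŭ": "ł", "v": "w"}
VOWELS = "aeiou"  # ŭ is a semivowel and does not form its own syllable


def to_tts_spelling(word: str) -> str:
    """Lowercase a word, respell it, and uppercase the stressed vowel."""
    word = word.lower()
    # Esperanto stress falls on the second-to-last syllable, i.e. on the
    # second-to-last vowel. Words with a single vowel get no marker.
    vowel_positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    stress = vowel_positions[-2] if len(vowel_positions) >= 2 else None

    out = []
    for i, ch in enumerate(word):
        if i == stress:
            out.append(ch.upper())
        else:
            out.append(SUBSTITUTIONS.get(ch, ch))
    return "".join(out)


for w in ["saluton", "bonvenon", "ĉirkaŭ", "ĝis"]:
    print(w, "->", to_tts_spelling(w))
```

Generating the respelling in code like this could be used to produce more consistent examples for the prompt.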

Add Etymology dictionary or knowledge file

Some feedback on Reddit suggested that an etymology dictionary would be a nice feature.

HuggingChat Assistant

Here is a first experimental assistant on HuggingChat: https://hf.co/chat/assistant/65c145c565be046e86ee130f

Right now knowledge files are not supported, so I have to write a good, short system prompt containing the basic knowledge that the models are missing most of the time.

Experiment with python spell checker

For example, literumilo (https://pypi.org/project/literumilo/) could be run inside the chat to double-check spelling.

Test1:

Prompt:

Use the code interpreter to check Esperanto spelling like this:

import literumilo

Using the above method, literumilo's functions must be prefixed with the package name, as below:

result = literumilo.check_word("ĉirkaŭiris")

Alternatively, you can import the function names directly:

from literumilo import x_to_accent
from literumilo import check_word
from literumilo import analyze_string
from literumilo import analyze_file

The code samples below assume that the second method has been used:

analyze_string

This function has two modes, morpheme mode and spell checker mode. The first parameter is the string to analyze. The second is the mode. When the mode is True, analyze_string will divide every Esperanto word in the string into morphemes, and return the new string. For example:

TEXT = "Birdoj (Aves) estas klaso de vertebruloj kun ĉirkaŭ 9 ĝis 10 mil vivantaj specioj."
result = analyze_string(TEXT, True)
print(result)

The above will print out

Bird.oj (Aves) est.as klas.o de vertebr.ul.oj kun ĉirkaŭ 9 ĝis 10 mil viv.ant.aj speci.oj
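
The morpheme-mode output shown above is a plain string, so it can be post-processed without literumilo itself. The following sketch is an illustration only: `split_morphemes` is a hypothetical helper, not part of the literumilo package, and it assumes that morphemes are always separated by a single dot as in the sample output.

```python
import re

# One or more letters, followed by at least one ".letters" group.
# [^\W\d_] matches any Unicode letter, including ĉ, ĝ, ŭ and so on.
DOTTED_WORD = re.compile(r"[^\W\d_]+(?:\.[^\W\d_]+)+")


def split_morphemes(analyzed: str) -> dict:
    """Map each analyzed word (dots removed) to its list of morphemes."""
    result = {}
    for token in DOTTED_WORD.findall(analyzed):
        result[token.replace(".", "")] = token.split(".")
    return result


OUTPUT = ("Bird.oj (Aves) est.as klas.o de vertebr.ul.oj kun ĉirkaŭ "
          "9 ĝis 10 mil viv.ant.aj speci.oj")
for word, parts in split_morphemes(OUTPUT).items():
    print(word, parts)
```

Words that the analysis left undivided, such as kun or ĉirkaŭ, are skipped because they contain no dot.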

When the morpheme mode is False, analyze_string outputs a list of unknown words. This code,

TEXT = "Birdoj (Aves) estas klaso de vertebruloj kun ĉirkaŭ 9 ĝis 10 mil vivantaj specioj."
result = analyze_string(TEXT, False)
print(result)

Search Reta Vortaro offline xdxf file using code interpreter

You can search through the revo.xdxf file using the code interpreter:

Used prompt

I used this prompt, which contains a minified version of some example Python code and some information about the data structure:

# reta vortaro - revo.xdxf
When asked about the reta vortaro, revo or generally about complex multi-lingual dictionary questions, you can search revo.xdxf using python. The Esperanto words can be found in the <ar> elements, examples in <ex> and translations to other languages in <dtrn>. For non-Esperanto word searches always search in dtrn and return the corresponding Esperanto word and example if not specified differently. Here is an example of the data structure of translations: <dtrn> /de/ Beispiel, Muster, Vorbild</dtrn> There can be additional elements inside of the elements described above. Write robust code that can handle messy XML.

Here is an example of such a search for an Esperanto word:

import xml.etree.ElementTree as D
def A(file_path,word):
   C='def';E=D.parse(file_path);F=E.getroot()
   for A in F.findall('.//ar'):
   	B=A.find('k')
   	if B is not None and B.text.strip()==word:G=A.find(C).text if A.find(C)is not None else'No definition found';H=[A.text for A in A.findall('.//def/ex')]or['No examples found'];I=[A.text for A in A.findall('dtrn')]or['No translations found'];return{'word':word,'definition':G,'examples':H,'translations':I}
B='revo.xdxf'
C=A(B,'krokodili')
print(C)
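
For readers, here is a non-minified sketch of the same kind of lookup, extended with the translation search that the prompt asks for. It is an illustration under assumptions, not the code used in the GPT: the function names are made up, the sample XML is invented rather than copied from revo.xdxf, and the element layout (headword in k, definition in def, examples in ex, translations in dtrn with a leading /lang/ code) follows only what the prompt describes. It uses `iterparse` so that the whole file never has to be held in memory. Whether that is faster inside the code interpreter has not been measured.

```python
import io
import re
import xml.etree.ElementTree as ET

# Invented sample data in the layout described by the prompt (assumption).
SAMPLE = """<xdxf>
<ar><k>ekzemplo</k><def>Modela okazo.<ex>Doni ekzemplon.</ex>
<dtrn> /de/ Beispiel, Muster, Vorbild</dtrn></def></ar>
<ar><k>krokodili</k><def>Paroli nacian lingvon inter esperantistoj.</def></ar>
</xdxf>"""


def text_of(elem):
    """All text inside an element, including nested elements, stripped."""
    return "".join(elem.itertext()).strip()


def articles(source):
    """Yield each <ar> element in turn, freeing it afterwards."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "ar":
            yield elem
            elem.clear()  # keep memory use flat on large files


def lookup(source, word):
    """Return definition, examples and translations for an Esperanto word."""
    for ar in articles(source):
        key = ar.find(".//k")
        if key is None or text_of(key) != word:
            continue
        d = ar.find(".//def")
        return {
            "word": word,
            "definition": (d.text or "").strip() if d is not None else None,
            "examples": [text_of(e) for e in ar.iter("ex")],
            "translations": [text_of(t) for t in ar.iter("dtrn")],
        }
    return None


def find_by_translation(source, term, lang=None):
    """Return Esperanto headwords whose <dtrn> lists `term`."""
    term = term.casefold()
    hits = []
    for ar in articles(source):
        key = ar.find(".//k")
        if key is None:
            continue
        for dtrn in ar.iter("dtrn"):
            # Expected shape: "/de/ Beispiel, Muster, Vorbild"
            m = re.match(r"/([^/]+)/\s*(.*)", text_of(dtrn), re.S)
            code, words = (m.group(1), m.group(2)) if m else (None, text_of(dtrn))
            entries = [w.strip().casefold() for w in words.split(",")]
            if (lang is None or code == lang) and term in entries:
                hits.append(text_of(key))
                break
    return hits


print(lookup(io.BytesIO(SAMPLE.encode("utf-8")), "krokodili"))
print(find_by_translation(io.BytesIO(SAMPLE.encode("utf-8")), "Muster", "de"))
```

For the real file, pass the path `'revo.xdxf'` instead of the `io.BytesIO` object.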

Conclusion

IMO it is currently too slow to include in EsperantoGPT: looking up a single word takes almost one minute. I tried to make it use minified code to speed things up, but this has not worked so far.
