
parolteknologio / esperantogpt


Esperanto language expert and instructor for ChatGPT and other systems

Home Page: https://chat.openai.com/g/g-D4jB3Ml4b-esperanto-helpanto

gpts huggingchat openai-gpt openai-gpts

esperantogpt's People

Contributors

stefangrotz


esperantogpt's Issues

Prompting for better pronunciation in voice mode

ChatGPT can detect whether it is running in a normal text chat or in voice mode. I am trying to improve its pronunciation by prompting it to write Esperanto differently in voice mode. The prompt is in Instructions_tts.md and looks like this:

When you are in a text chat, always respond in normal Esperanto.

When you have a voice conversation, always respond in lowercase Esperanto and write the first letter of the second-to-last syllable in uppercase! For example: hoDiau, salUton, bonvEnon. Do not use the Esperanto special characters (ĉ, ĝ, ĥ, ĵ, ŝ and ŭ); instead:

use ch for ĉ
use k for ĥ
use dż for ĝ
use sh for ŝ
use ł for ŭ
use w for v

This idea is based on fonetika-transskribilo and vocx

It sometimes works, but not reliably or predictably. I will experiment with adding more examples. Telling the model that it is a game also sometimes improves the results.
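To judge how closely the voice-mode output follows the prompt, it may help to have a deterministic reference for the spelling it asks for. The following is only a minimal sketch under some assumptions: it uses the replacement table from the prompt (ĵ has no replacement there, so it is left unchanged), and it uppercases the vowel of the second-to-last syllable, as in salUton and bonvEnon, rather than the syllable's first letter. The function names `to_tts_spelling` and `to_tts_text` are made up for this illustration.

```python
import re
import unicodedata

# Replacement table copied from the prompt; "ĵ" is not listed there and stays as-is.
REPLACEMENTS = {"ĉ": "ch", "ĥ": "k", "ĝ": "dż", "ŝ": "sh", "ŭ": "ł", "v": "w"}
VOWELS = "aeiou"  # "ŭ" is a semivowel and does not form its own syllable


def to_tts_spelling(word):
    """Lowercase one word, mark the stressed vowel, apply the replacements."""
    word = unicodedata.normalize("NFC", word).lower()
    vowel_positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    # Esperanto stress falls on the second-to-last syllable, i.e. the
    # second-to-last vowel. Words with a single vowel get no mark.
    stressed = vowel_positions[-2] if len(vowel_positions) >= 2 else None
    out = []
    for i, ch in enumerate(word):
        if i == stressed:
            out.append(ch.upper())
        else:
            out.append(REPLACEMENTS.get(ch, ch))
    return "".join(out)


def to_tts_text(text):
    """Apply to_tts_spelling to every run of letters, keeping punctuation."""
    return re.sub(r"[^\W\d_]+", lambda m: to_tts_spelling(m.group(0)), text)


print(to_tts_text("Saluton, kiel vi fartas?"))
# → salUton, kIel wi fArtas?
```

Such a reference could be used to generate additional examples for the prompt or to compare against what the model writes in voice mode.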

Experiment with Python spell checker

For example, literumilo (https://pypi.org/project/literumilo/) could be run inside the chat to double-check spelling.

Test 1:

[screenshot]

Prompt:

Use the code interpreter to check Esperanto spelling like this:

import literumilo

Using the above method, literumilo's functions must be prefixed with the package name, as below:

result = literumilo.check_word("ĉirkaŭiris")

Alternatively, you can import the function names directly:

from literumilo import x_to_accent
from literumilo import check_word
from literumilo import analyze_string
from literumilo import analyze_file

The code samples below assume that the second method has been used:

analyze_string

This function has two modes, morpheme mode and spell checker mode. The first parameter is the string to analyze. The second is the mode. When the mode is True, analyze_string will divide every Esperanto word in the string into morphemes, and return the new string. For example:

TEXT = "Birdoj (Aves) estas klaso de vertebruloj kun ĉirkaŭ 9 ĝis 10 mil vivantaj specioj."
result = analyze_string(TEXT, True)
print(result)

The above will print out

Bird.oj (Aves) est.as klas.o de vertebr.ul.oj kun ĉirkaŭ 9 ĝis 10 mil viv.ant.aj speci.oj

When the morpheme mode is False, analyze_string outputs a list of unknown words. For example:

TEXT = "Birdoj (Aves) estas klaso de vertebruloj kun ĉirkaŭ 9 ĝis 10 mil vivantaj specioj."
result = analyze_string(TEXT, False)
print(result)
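The morpheme-mode output quoted in the prompt is a plain string, so any further processing in the code interpreter has to split it again. The sketch below does not call literumilo at all; it only assumes that the output looks like the example in the prompt (morphemes separated by dots, everything else unchanged). The helper name `split_morphemes` is made up for this illustration.

```python
import re


def split_morphemes(analyzed):
    """Return one list of morphemes per word in a morpheme-mode string."""
    # A token is a run of letters, optionally continued by ".letters" groups.
    tokens = re.findall(r"[^\W\d_]+(?:\.[^\W\d_]+)*", analyzed)
    return [token.split(".") for token in tokens]


# Output string copied from the example in the prompt.
sample = ("Bird.oj (Aves) est.as klas.o de vertebr.ul.oj kun ĉirkaŭ "
          "9 ĝis 10 mil viv.ant.aj speci.oj")

for parts in split_morphemes(sample):
    print(parts)
```

Numbers and punctuation are skipped, so "(Aves)" yields the single-morpheme entry ['Aves'].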

Search Reta Vortaro offline xdxf file using code interpreter

You can search through the revo.xdxf file using the code interpreter:

[screenshot: chatGPT-revo]

Used prompt

I used the following prompt, which includes a minified Python example and some information about the data structure:

# reta vortaro - revo.xdxf
When asked about the reta vortaro, revo or generally about complex multi-lingual dictionary questions, you can search revo.xdxf using python. The Esperanto words can be found in the <ar> elements, examples in <ex> and translations to other languages in <dtrn>. For non-Esperanto word searches always search in dtrn and return the corresponding Esperanto word and example if not specified differently. Here is an example of the data structure of translations: <dtrn> /de/ Beispiel, Muster, Vorbild</dtrn> There can be additional elements inside of the elements described above. Write robust code that can handle messy XML.

Here is an example of such a search for an Esperanto word:

import xml.etree.ElementTree as D
def A(file_path,word):
   C='def';E=D.parse(file_path);F=E.getroot()
   for A in F.findall('.//ar'):
   	B=A.find('k')
   	if B is not None and B.text.strip()==word:G=A.find(C).text if A.find(C)is not None else'No definition found';H=[A.text for A in A.findall('.//def/ex')]or['No examples found'];I=[A.text for A in A.findall('dtrn')]or['No translations found'];return{'word':word,'definition':G,'examples':H,'translations':I}
B='revo.xdxf'
C=A(B,'krokodili')
print(C)
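For readers, an un-minified sketch of the same kind of search may be easier to follow. It is not the code used in the prompt: it assumes only the element names mentioned there (`ar`, `k`, `def`, `ex`, `dtrn`) and the `/de/ ...` translation format, and it adds the reverse search through `dtrn` that the prompt describes. The function names and the small sample document are hypothetical; only the German translation line is taken from the prompt.

```python
import re
import xml.etree.ElementTree as ET


def text_of(element):
    """All text inside an element, including text of nested children."""
    return "".join(element.itertext()).strip()


def lookup_word(root, word):
    """Find an Esperanto headword; return its definition, examples, translations."""
    for article in root.iter("ar"):
        key = article.find("k")
        if key is not None and text_of(key) == word:
            definition = article.find("def")
            return {
                "word": word,
                # .text is only the text before the first child of <def>
                "definition": (definition.text or "").strip() if definition is not None else "",
                "examples": [text_of(ex) for ex in article.iter("ex")],
                "translations": [text_of(t) for t in article.iter("dtrn")],
            }
    return None


def find_by_translation(root, language, term):
    """Return headwords whose <dtrn> entry for `language` lists `term`."""
    hits = []
    for article in root.iter("ar"):
        key = article.find("k")
        if key is None:
            continue
        for dtrn in article.iter("dtrn"):
            match = re.match(r"/(\w+)/\s*(.*)", text_of(dtrn), re.S)
            if not match or match.group(1) != language:
                continue
            terms = [t.strip().lower() for t in match.group(2).split(",")]
            if term.lower() in terms:
                hits.append(text_of(key))
                break
    return hits


# Hypothetical miniature document; the real file would be loaded with
# ET.parse("revo.xdxf").getroot() instead.
SAMPLE = (
    "<xdxf><ar><k>ekzemplo</k>"
    "<def>Specimeno.<ex>Doni ekzemplon.</ex></def>"
    "<dtrn> /de/ Beispiel, Muster, Vorbild</dtrn>"
    "<dtrn> /en/ example</dtrn></ar></xdxf>"
)
root = ET.fromstring(SAMPLE)
print(lookup_word(root, "ekzemplo"))
print(find_by_translation(root, "de", "Muster"))
```

Using `iter()` and `itertext()` instead of direct-child `find()` calls is one way to cope with the "additional elements inside of the elements" that the prompt warns about.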

Conclusion

IMO it is currently too slow to include in EsperantoGPT: looking up a word takes almost one minute. I tried having it use minified code to speed things up, but so far this has not helped.
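It is not clear from the experiment whether the time goes into parsing the file or into the model writing and running the code. If parsing turns out to matter, one untested idea is to parse the file once per session and keep a headword index, so that later lookups are plain dictionary accesses. The sketch below assumes the same `ar`/`k` structure as above; `build_index` and the two-entry sample are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_index(root):
    """Map each headword (<k> text) to its <ar> element, first occurrence wins."""
    index = {}
    for article in root.iter("ar"):
        key = article.find("k")
        if key is not None:
            headword = "".join(key.itertext()).strip()
            index.setdefault(headword, article)
    return index


# Hypothetical stand-in for ET.parse("revo.xdxf").getroot()
SAMPLE = (
    "<xdxf>"
    "<ar><k>krokodili</k><def>...</def></ar>"
    "<ar><k>ekzemplo</k><def>...</def></ar>"
    "</xdxf>"
)
index = build_index(ET.fromstring(SAMPLE))

article = index.get("krokodili")  # dictionary access, no second scan of the tree
print(article is not None)
```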
