
luteorg / lute-v3


LUTE = Learning Using Texts: learn languages through reading. Python/Flask.

License: MIT License

Python 61.82% CSS 7.00% JavaScript 7.73% HTML 15.23% Gherkin 7.37% Shell 0.72% Dockerfile 0.12%
language-learning lute

lute-v3's Introduction

Lute v3


This repo contains the source code for Lute (Learning Using Texts) v3, a Python/Flask tool for learning foreign languages through reading.

To learn more about Lute v3, or to install it for your own use and study, please see the Lute v3 manual.

Lute v3 demo

Getting Started

Users

See the Lute v3 manual. Hop onto the Discord too.

Developing

For more information on building and developing, please see Development.

Contributing

If you'd like to contribute code to Lute (hooray!), check out the Contribution Guidelines. And with every repo star, an angel gets its wings.

License

Lute uses the MIT license: LICENSE

lute-v3's People

Contributors

jzohrab, webofpies, imamcr, cghyzel, satyen-akolkar, robby1066, mzraly, sakolkar, dgc08, jayanth-parthsarathy, barash-asenov, eterdas1, anand-s23, yue-dongchen, eikemenzel, drm00, fanyingfx


lute-v3's Issues

Only set translation for parent terms?

When creating a new Term and adding a new Parent, currently both get the same translation. E.g. from the Tutorial, click on "dogs" and create a new Term with new parent "dog", translation "woof." When saved, "dogs: woof" and "dog: woof" are both created -- but that's kind of redundant.

Would it be better to save the translation with the parent only?

I've created branch set_translation_for_parent_only which implements this, but it makes Lute look like it loses data. eg create new term for "dogs":

image

On save, the "dogs" and "dog" term hovers look good, however, when I click on "dogs" again I see the following:

image

This looks like some data has been lost.

Add "words read" statistics

  • words read today
  • words read cumulative

Todo:

  • add texts.TxWordCount - should be updated when page text is updated
  • user might read the same page several times, and this should be included in the word count ... perhaps log reading in a separate table? Nope, good enough just to keep the one running track of the counts.

Support non-consecutive multi-word terms

Is your feature request related to a problem? Please describe.

Germanic languages like German or Dutch have words, verbs in particular, whose parts can be separated and spread across a sentence.

For example: "Ich lade dich zu meiner Party ein." means "I invite you to my party." The verb is "einladen", but in the sentence "lade" and "ein" are separated (this is not only common, it is mandatory under the grammar). These kinds of verbs are very common.

Describe the solution you'd like

Current multi-word selection doesn't work, and shift+clicking on words is used for bulk selection. Maybe alt+clicking or some other combination could work.

Describe alternatives you've considered

I don't think there are any, or at least I can't imagine one. The suggested solution might work for adding terms, but I have no idea how the information would be shown back to the reader, to be honest. The main reason for this feature request is in case I'm missing something obvious that could work as a solution (short of trying to parse natural grammars).

Feature request: Add option to remove image from word

I once chose an image for a word, and then realized that it would be very difficult to find an image that actually represented that particular word.

It would be great to be able to click on a selected image again to detach that image from the word.

Add "book package" export and import

It would be nice to be able to export individual books into files that someone else can load into their Lute. It'd include all the words/definitions, the audio, the bookmarks, everything. I think a feature like that would really help form a Lute community. I know that I'd love to share my texts with other Czech learners.

This would be something like an "anki package" zip file. Work involved:

  • define export, import format
  • export file versioning, so that importers can deal with old formats
  • export book and audio only, no terms?
  • export book and terms?
  • handle term images
  • overwrite existing terms on import, or ignore?
  • book with same title allowed or not?
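
A rough sketch of what a versioned package could look like, using only the Python standard library. Everything here is an assumption made for illustration: the file names inside the zip, the `format_version` field, and the function names are hypothetical, and audio and term images are left out.

```python
import io
import json
import zipfile

PACKAGE_FORMAT_VERSION = 1  # hypothetical; bump whenever the layout changes


def export_book_package(title, pages, terms):
    """Return the bytes of a zip holding a manifest, the pages, and the terms."""
    manifest = {"format_version": PACKAGE_FORMAT_VERSION, "title": title}
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("manifest.json", json.dumps(manifest))
        zf.writestr("pages.json", json.dumps(pages, ensure_ascii=False))
        zf.writestr("terms.json", json.dumps(terms, ensure_ascii=False))
    return buf.getvalue()


def import_book_package(data):
    """Read a package, refusing versions newer than this code understands."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        if manifest["format_version"] > PACKAGE_FORMAT_VERSION:
            raise ValueError("package was written by a newer version")
        pages = json.loads(zf.read("pages.json"))
        terms = json.loads(zf.read("terms.json"))
    return manifest["title"], pages, terms
```

Keeping the version in a separate manifest means an importer can decide how to handle an old or unknown format before touching the rest of the archive. Whether to overwrite existing terms or allow duplicate titles would still have to be decided by the caller.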

Integrate Golden dict app with Lute

Is your feature request related to a problem? Please describe.
GoldenDict is an app that can hold many dictionaries. It would be good to integrate it with Lute.

Describe the solution you'd like

One idea: could GoldenDict be opened as a web page (e.g. on localhost), so that I could use its link as a dictionary URL, the same way as DeepL or Google Translate? I'm not sure whether this is feasible!

Describe alternatives you've considered


Additional context

Screen Shot 2023-02-16 at 13 31 16

Add "term count" statistics

Terms created today/cumulative -- when reading, sometimes my reading is creating too many status = 1 terms in one day, too much to bite off. If I create too much new stuff, there's not enough time to digest everything.

"Autopopulate" button for the parent term

In most languages, the parent is pretty visibly related to the child, but with a few letter changes. It'd be nice not to have to copy the whole word from below, but just press a button and then type the end of the word.

I'm not sure how this would be done for most languages ... this feels extremely tough.

Add "page break" markers (e.g "---") to text to force breaks of text during book creation

Currently pages break by tokens. Sometimes it would be nice to break chapters or sections forcibly.

e.g., creating a book with text

Hello.
---
Goodbye.

creates a book with two pages: "Hello." and "Goodbye." The page break marker does not change the max words per page; it works alongside it.

Test cases:

  • text with and without breaks
  • no blank pages should be created -- e.g. two separator lines right next to each other shouldn't produce an empty page
  • no blank lines at top or bottom of page when split
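
The test cases above could be exercised against something like the following. This is only a sketch: the function name is made up, it handles the forced breaks only, and the existing max-words-per-page splitting would still need to run on each resulting chunk.

```python
def _trim_blank_lines(lines):
    """Drop blank lines from the start and end of a list of lines."""
    start, end = 0, len(lines)
    while start < end and not lines[start].strip():
        start += 1
    while end > start and not lines[end - 1].strip():
        end -= 1
    return lines[start:end]


def split_on_page_breaks(text, marker="---"):
    """Split text into pages at lines that consist only of the marker."""
    pages = []
    current = []
    for line in text.splitlines():
        if line.strip() == marker:
            pages.append(current)
            current = []
        else:
            current.append(line)
    pages.append(current)
    # Trim blank lines at the page edges, then drop pages left empty,
    # so adjacent markers never create a blank page.
    joined = ["\n".join(_trim_blank_lines(p)) for p in pages]
    return [p for p in joined if p]
```

For the example above, `split_on_page_breaks("Hello.\n---\nGoodbye.")` gives the two pages "Hello." and "Goodbye.".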

Import new language option

Is your feature request related to a problem? Please describe.

Not all languages are supported and the regex/link stuff isn't possible for "non-techies."

Describe the solution you'd like

When adding a new language, there's the option to "load from predefined." It would be a great "stepping stone" if there was a simple text file that could be made and shared to "import" languages according to the settings that work well for another user.

This could also allow for more "default" languages to be supported if they're just a small text file that can be downloaded and added to future releases.

Describe alternatives you've considered

Additional context

This is how a file might look:
czechlanguagelute.txt

Allow navigate to arbitrary page in book

Either a slider, text box, or select box. When reading, I sometimes want to jump to the first page, or the last, arbitrarily. Shouldn't have to go through the book page by page.

Prerequisite for this: #86

Support for Korean parsing

Is your feature request related to a problem? Please describe.

I'd like for there to be a way to parse Korean texts as I'm learning Korean.

Describe the solution you'd like

Implement a Korean parser based on MeCab-Ko.

Describe alternatives you've considered

I tried to use MeCab to parse a Korean text, but it didn't work, even though MeCab and MeCab-Ko seem to have similarities based on my online research.

(I was using \p{Hangul} as the Regex for character matching, but I'm not sure if that's correct either so that could have been the issue.)

[Classical Chinese] preload default Dictionary 1 - clicking a word redirects to dictionary 1 entry page

Description
When using https://ctext.org/dictionary.pl?if=en&char=### as dictionary 1, clicking a word in the text redirects the whole page to the dictionary 1 entry page.

To Reproduce

Steps to reproduce the behavior, e.g.:

  1. Create new language and use Classical Chinese preload settings
  2. Import a Chinese language text and click on any highlighted word
  3. Page will be redirected to the ctext.org definition page

This issue does not come up with other dictionaries such as:
https://www.archchinese.com/chinese_english_dictionary.html?find=###

Extra software info, if not already included in the Description:

  • OS: Both Mac and Windows
  • setup: docker
  • Version: 2.1.3 (@latest version)

Export terms

Is it possible to add ability to export terms as a CSV or TXT file format? It would be awesome if we could export filtered terms, not all of them.

Feature request: Add created date to Books section

Would it be possible to add a column in the Books section with the book creation date and an option to sort based on it (or maybe a simple ordinal number)?

I can use tags to mark records as newest (or a special naming convention), but that is not as convenient.

Add Audio controls to play text mp3.

Store audio in same parent folder as where images are stored, perhaps ... or stream from URI?

Requirements:

  • page content to be ajaxed in, not URL-per-page.
  • store or stream audio file
  • audio player and controls

Import subtitle file, add auto-bookmarks

From Discord:

It'd be great if you could upload an .srt or other subtitle file with an audio file and have it convert it to a txt for reading, but also add bookmarks for the different pages (or even for all the sentences and next to them, there's a little button that, basically, says "skip to this sentence in the audio"). This would make Lute amazing to use with anything audio based alongside Whisper getting better and better.

Challenges I can see with this request:

  • bookmarks aren't associated with pages or sentences, so there will be many many bookmarks in the audio timeline and no clear way to jump to the text
  • during parsing, the timestamp data isn't stored, it's just another token. This could potentially be worked around with the base parser doing a preliminary pass to get timestamps, and then the actual parsers being called for each section between stamps. This is a big change from the current method, but perhaps is doable. I'm not sure of the payoff, but I'd need to work on a new language to fully understand the ins and outs.
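
As a rough illustration of the "preliminary pass to get timestamps" idea, here is a minimal .srt reader. It is a sketch only: the function name is hypothetical, it keeps just the start time of each cue, and it is not tied to Lute's parsers in any way.

```python
import re

# Matches the timing line of a cue, e.g. "00:00:01,500 --> 00:00:03,000".
_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def parse_srt(srt_text):
    """Return a list of (start_seconds, text) pairs, one per subtitle cue."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = _TIMING.match(line.strip())
            if match:
                hours, minutes, seconds, millis = (int(g) for g in match.groups())
                start = hours * 3600 + minutes * 60 + seconds + millis / 1000
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append((start, text))
                break
    return cues
```

Each cue's text could then be handed to the language parser on its own, with the start time kept alongside it as a candidate bookmark.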

Possibly ignore word accents when saving terms in DB

Notes from a slack chat:

I will attempt to articulate what I think the deal breaker could potentially be without getting into the weeds of how Ancient Greek actually works. You may have noticed that the words contain accented characters. There are various features of the language that cause those accents to change without changing anything about the meaning of the word. For example, I would have to define γὰρ twice because it can appear as either γὰρ OR γάρ. That's one of the most common words in the language meaning something like "for" or "since" or "because". Now, it only comes in those two flavors but I think you could see how quickly it would become tedious to define words over and over just because of diacritics.

With respect to this question of accents, I have noticed that Chrome is character agnostic when it does its "find in page" search. If I search γὰρ it will highlight γάρ and even γαρ. Perhaps Lute could have an option, specific to a language, to ignore accents as well?

Let's continue to use the example of γὰρ, when I am reading the text, I would still see the orthography displayed as the author intended but, behind the scenes in the database, as far as Lute is concerned, γὰρ, γάρ, and γαρ share the same entry.

I know the original LWT let you do character substitutions but it actually just hotswapped one character for another and that fact was reflected in the actual text that you are reading. Basically it would see the character set as consisting of only 24 characters (not accounting for uppercase). The unaccented Greek alphabet.

My thoughts:

Rendered TextTokens (i.e., words shown in the reading pane) would include the accents, but Terms (stored in the db) would be without accents, and the rendered TextTokens would be associated to Terms w/o accents.

No idea at the moment if this would be tough or not!
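
One possible way to compute the accent-free form, sketched with Python's standard `unicodedata` module. The function name is hypothetical, and whether stripping every combining mark is right for a given language is exactly why this would need to be a per-language option.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks so accent variants share one lookup key."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers the Greek accents and breathings.
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", stripped)
```

With this, γὰρ, γάρ, and γαρ all map to γαρ, so the Term could be stored under that key while the rendered text keeps the original spelling.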

Follow standards for javascript "data-" attribute names

Currently, lute/templates/read/textitem.html has the following:

      tid="{{ item.text_id }}"
      lid="{{ item.lang_id }}"
      paraid="{{ item.para_id }}"
      seid="{{ item.se_id }}"
      data_text="{{ item.text }}"
      data_status_class="{{ item.status_class }}"
      data_order="{{ item.order }}"
{% if item.wo_id is not none %}
      data_wid="{{ item.wo_id }}"

This doesn't follow the standard naming for custom "data-" attributes, e.g. as outlined at https://dev.to/dev-harbiola/custom-data-attributes-in-html-a-guide-to-data--373.

These could be changed as follows:

tid => data-tid (or data-text-id)
lid => data-lid (or data-lang-id)
paraid => data-para-id
seid => data-se-id (or data-sentence-id)
data_status_class => data-status-class
data_order => data-order
data_wid => data-wid (or data-word-id)

I believe that these are only referenced in lute/static/js/lute.js:

(.venv) MacBook-Pro:lute-v3 jeff$ for t in tid lid paraid seid data_text data_status_class data_order data_wid; do
>   echo ------------------------------------
>   echo $t
>   inv search $t | grep lute.js    # limit search to only lute.js
> done
------------------------------------
tid
lute/static/js/lute.js:function prepareTextInteractions(textid) {
------------------------------------
lid
lute/static/js/lute.js:  elid = parseInt(el.attr('data_wid'));
lute/static/js/lute.js:    url: `/read/termpopup/${elid}`,
lute/static/js/lute.js:  const lid = parseInt(el.attr('lid'));
lute/static/js/lute.js:  const url = `/read/termform/${lid}/${sendtext}?${extras}`;
lute/static/js/lute.js:  const langid = firstel.attr('lid');
------------------------------------
paraid
lute/static/js/lute.js:    attr_name = 'paraid';
lute/static/js/lute.js:    attr_value = w.attr('paraid');
------------------------------------
seid
lute/static/js/lute.js:  let attr_name = 'seid';
lute/static/js/lute.js:  let attr_value = w.attr('seid');
------------------------------------
data_text
lute/static/js/lute.js:  let text = extra_args.textparts ?? [ el.attr('data_text') ];
------------------------------------
data_status_class
lute/static/js/lute.js: * Terms have data_status_class attribute.  If highlights should be shown,
lute/static/js/lute.js:/** Add the data_status_class to the term's classes. */
lute/static/js/lute.js:  el.addClass(el.attr("data_status_class"));
lute/static/js/lute.js:    el.removeClass(el.attr("data_status_class"));
lute/static/js/lute.js:    const st = nextword.attr('data_status_class');
lute/static/js/lute.js:  let update_data_status_class = function (e) {
lute/static/js/lute.js:        .attr('data_status_class',`${newClass}`);
lute/static/js/lute.js:  $('span.kwordmarked').each(update_data_status_class);
lute/static/js/lute.js:  $('span.wordhover').each(update_data_status_class);
------------------------------------
data_order
lute/static/js/lute.js:let save_curr_data_order = function(el) {
lute/static/js/lute.js:  LUTE_CURR_TERM_DATA_ORDER = parseInt(el.attr('data_order'));
lute/static/js/lute.js:  save_curr_data_order($(this));
lute/static/js/lute.js:  save_curr_data_order($(this));
lute/static/js/lute.js:  const first = parseInt(start_el.attr('data_order'))
lute/static/js/lute.js:  const last = parseInt(end_el.attr('data_order'));
lute/static/js/lute.js:    const ord = $(this).attr("data_order");
lute/static/js/lute.js:  save_curr_data_order(el);
lute/static/js/lute.js:    return $(a).attr('data_order') - $(b).attr('data_order');
lute/static/js/lute.js:  const i = words.toArray().findIndex(x => parseInt(x.getAttribute('data_order')) === LUTE_CURR_TERM_DATA_ORDER);
lute/static/js/lute.js:  save_curr_data_order(curr);
------------------------------------
data_wid
lute/static/js/lute.js:  elid = parseInt(el.attr('data_wid'));
(.venv) MacBook-Pro:lute-v3 jeff$ 

I don't know if this work is worth it ... following standards is good, but not critical. Is this make-work only?

Add opinionated Anki export

(I've revised this issue based on new thoughts)

Summary

I wrote Lute based off of LWT, but dropped the SRS feature of LWT: the code was brutal, and for the initial MVP (minimal viable product) release of Lute I didn't feel that it was a necessary feature. I still don't :-) for a few reasons:

  1. A brute-force approach of "just test everything" isn't the best. In some cases like verb inflections, I don't need to test every permutation -- perhaps I should only see the parent, and a few child examples. Also, there are many words that I've only seen once so far in my reading, and may never see again. I think I should be able to select the terms I want/need to test.
  2. I question whether Anki testing falls within the primary use-case of Lute, which is just to get you reading, and to hopefully encourage you to keep reading. I read a lot with Lute, and would vastly prefer to focus on reading, rather than testing.
  3. Testing by seeing sentences I've read, and then regurgitating those (or similar), isn't very fun for me!

Even Steve Kaufmann of LingQ doesn't really recommend using their testing feature, probably for the same reasons as I have above. :-) (He does recommend their "sentence mode" for building sentences, I believe.)

Exporting terms to a CSV, and images to somewhere else, may be trickier than needed, so I'll start with using AnkiConnect as the first iteration of this.

This will be an opinionated export: it will assume certain note types, deck name, field names, etc.

Design/UX notes

The following are some rough ideas only. I'd need to try implementations to really get a handle on the UX.

Config

  • Some ppl don't use Anki, so they shouldn't see it if they don't want it. Default should be to see it.
  • add setting to set the Ankiweb address and port; default should be whatever needed

Exports from term listing

The term listing has a checkbox. Users could select the terms they want to export, and then click an "export to anki" button.

AnkiConnect supports exporting images, see FooSoft/anki-connect#158.

Lute doesn't store sample sentences for terms, but it does have a reference lookup that could get the latest sentence for any term. The sentences table is loaded on opening a page for reading, so even if the page isn't marked read something should be in the table. Need to update the export to include a non-read page's sentences.

Fields to export:

  • term id (can change if people delete terms)
  • language id
  • term
  • audio (left blank, as lute doesn't have audio, but people might add it later)
  • language
  • image
  • translation
  • parent term
  • tags
  • sample sentence as "manual cloze" (with term replaced by ____)
  • sample sentence with cloze removed
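
A sketch of how the two sentence fields and the AnkiConnect request body might be built. The request shape follows AnkiConnect's `addNote` action; the deck name, note type name, and field names are placeholders made up for this example, and nothing is actually sent here.

```python
import re


def make_cloze(sentence, term, blank="____"):
    """Replace every case-insensitive occurrence of term with a blank."""
    return re.sub(re.escape(term), blank, sentence, flags=re.IGNORECASE)


def build_add_note_request(term, translation, sentence,
                           deck="Lute", model="LuteTerm"):
    """Build the JSON body for AnkiConnect's addNote action (not sent here)."""
    return {
        "action": "addNote",
        "version": 6,
        "params": {
            "note": {
                "deckName": deck,     # hypothetical deck name
                "modelName": model,   # hypothetical note type
                "fields": {           # hypothetical field names
                    "Term": term,
                    "Translation": translation,
                    "SentenceCloze": make_cloze(sentence, term),
                    "Sentence": sentence,
                },
                "tags": ["lute"],
            }
        },
    }
```

Note that the plain substring match would also blank out "dog" inside "dogs"; a real implementation would probably want to work from Lute's own tokens instead. The resulting dict would be POSTed as JSON to wherever AnkiConnect is listening, which is where the failure cases below would surface.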

failed exports

  • model doesn't exist
  • deck name doesn't exist
  • ankiconnect isn't on
  • misc errors?

Anki note/card templates

I'll put some kind of pre-designed note type in a public place so people can access it ... that's probably the easiest thing to do. Creating a single-card shared deck on AnkiWeb would be easiest. AnkiConnect apparently does let you create models using the API, I'm not sure how tough that would be. https://foosoft.net/projects/anki-connect/index.html#model-actions

It would be nice to have Anki cards be able to click back to Lute, if Lute's running, so that people can see the term and its sample sentences again.

Add book "chapter" markings and table of contents

It would be very nice to have a chapter marking or similar, and then have a table of contents or similar, showing the current chapter number at the top of the page perhaps.

When reading long texts, I sometimes want to know how many pages until I reach the end of the chapter.

Add audio files for word pronunciation (TTS)

Audio files could be stored in user's data folder, and files could be found by md5 of term, e.g.

Big effort required:

  • web service calls to some uri to get a file
  • support different endpoints? polly, azure, forvo, etc etc --
  • API tokens in settings for the different services
  • selecting the service and voice you want to call
  • how to store the file, naming etc
  • allow only one sample, or multiple, for any given word?
  • include audio in any anki export
  • play on mouse over setting? or just mouse over the speaker?

Lots of things to do here.

Add import .epub files

Lute can import .txt files, would be nice to also support importing .epub.

There are python libraries for importing .epubs, eg: https://andrew-muller.medium.com/getting-text-from-epub-files-in-python-fbfe5df5c2da

Don't know if that's the best one.

Code outline

The code in develop has things in place for the epub import to be implemented:

  • /lute/book/routes.py method _get_file_content(filefielddata) has a check for the filename extension, and calls the service.py for epub parsing
  • /lute/book/service.py has a stub method get_epub_content(epub_file_field_data) to be implemented.
  • /tests/acceptance/book.feature has a commented-out epub import test. The implementation should add a short sample epub file to tests/acceptance/sample_files/.
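
To give a feel for the parsing step, here is a deliberately naive, standard-library-only sketch. It is not the stub from service.py: the function name and signature are invented here, it reads the archive's (X)HTML files in file-name order rather than following the epub's declared reading order, and a dedicated epub library would likely do a better job.

```python
import io
import zipfile
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text, ending a line at the close of each block-level element."""

    BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "br"}
    SKIP_TAGS = {"head", "script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def get_epub_text(epub_bytes):
    """Concatenate the text of every (X)HTML file in the archive, by file name."""
    chunks = []
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        for name in sorted(zf.namelist()):
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = _TextExtractor()
                parser.feed(zf.read(name).decode("utf-8", errors="replace"))
                parser.close()
                lines = (ln.strip() for ln in "".join(parser.parts).splitlines())
                chunks.append("\n".join(ln for ln in lines if ln))
    return "\n\n".join(c for c in chunks if c)
```

The same in-memory zip trick used to test this could also produce the short sample epub file for the acceptance test.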

Todo items

The code has a few comments with "todo epub:", where things should be updated:

$ inv todos | grep -i epub
Group: epub
  ./pyproject.toml                                  :  # TODO epub: add epub parsing library to dependencies
  ./tests/acceptance/book.feature                   :  # TODO epub: add an epub file to sample_files, activate this test.
  ./lute/book/service.py                            :  raise ValueError("TODO epub: to be implemented.")
  ./lute/book/forms.py                              :  # TODO epub: add epub to the list, change prompt.

Add "term export"

Make it easy to export terms. This would let users share data, ppl could group up to make data mappings, etc.

Initial solution: just export everything into a CSV. :-) Good enough for now.
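
The "export everything" version could be as small as this sketch, which uses the standard `csv` module. The column names are assumptions for the example, not Lute's actual schema.

```python
import csv
import io


def terms_to_csv(terms, fieldnames=("term", "parent", "translation", "status", "tags")):
    """Serialize an iterable of term dicts to CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for term in terms:
        # Missing values become empty cells instead of raising an error.
        writer.writerow({k: term.get(k, "") for k in fieldnames})
    return out.getvalue()
```

Because the function just takes an iterable, a filtered export would only need the caller to pass in the filtered terms.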

Possible long-term solution: somehow combine this with the "filters" in the Term Listing page, so that once a filter is applied, only those terms would be exported.

"Easy exporting and syncing of "parent" database (users learning the same lang could crowdsource)" - "crowdsourcing" to me implies some kind of central place to store definitions, choose the best, filter out trash etc -- that's a different beast.

Text imports mix up lines

Description

I've noticed that some lines (not very common, but some) are out of order from the original text file that was imported. I noticed this while listening along with an audiobook and certain things would be skipped then gone back to in a few moments. It's not an issue with the text file (see attachments).

To Reproduce

Steps to reproduce the behavior, e.g.:

  1. Import using a text file
  2. Notice that some lines are out of order

Screenshots

This is the text I confirmed with. Harry Potter a Fenixuv rad - J. K. Rowling.txt. An example of a line that is out of place is S mou drahou starou matičkou, ano which is on line 1889.

Here it is in Lute out of order:
lutemixingupsomelines
lutemixingupsomelines2

Extra software info, if not already included in the Description:

  • OS (e.g., iOS, windows): Pop!_OS
  • Browser (e.g., chrome, safari): Chromium
  • How you've installed Lute (Docker, python, source): Python (v3)
  • Version: 3.0.0b11

Mass term adding

Is your feature request related to a problem? Please describe.

I sometimes have some dead time in my day when I'd like to just "add more words to my dictionary." I don't really want to read, I just want to mindlessly add more terms and definitions so that when I come across them in reading, it's more seamless.

Describe the solution you'd like

To be able to enter a special view of a book where it only has each new/unknown word ONCE, in order of frequency. Then I can just go through them.

It'd also be cool to have the same, but in alphabetical order so that you might find word families and be able to "kill a bunch of birds with one copy and paste." (Add the head word, then copy it and paste it as a parent in the next 5-6 in the list)
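
The frequency-ordered list could be produced along these lines. This is a sketch with a made-up function name: the `\w+` regex is only a stand-in for Lute's per-language parsing, and "known" is simplified to a plain list of strings.

```python
import re
from collections import Counter


def unknown_words_by_frequency(text, known_terms):
    """Return (word, count) pairs for words not in known_terms, most frequent first."""
    known = {t.lower() for t in known_terms}
    words = re.findall(r"\w+", text.lower())
    counts = Counter(w for w in words if w not in known)
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Sorting the same pairs by `item[0]` alone would give the alphabetical view that helps spot word families.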

Add support for custom fonts

Discord discussion notes with user "Jiggle":

before I was editing the original css files (styles, styles-compact) and was storing the font files in the same folder as these css files
but unfortunately the custom_styles is not a static file as I understand

css file edits:

@font-face {
font-family: "MYFONT";
src: url("Rubik-Regular.woff") format("woff");

font-style: normal;
font-weight: normal;
}

if the font is in the same folder as the css file, it works

Notes:

  • The "custom styles" are actually a flask route. I'm not sure where the files would need to be stored for them to be available.
  • For docker containers, this would have to be a mounted directory; could potentially just store them in the "data" folder for Docker. Maybe should just store them there for pip users too ... would require a custom config.yml file, no good way to work around that, I think.

Investigate use of spaCy or NLTK for parsing

This is a complicated issue.

spaCy and the Stanford Stanza project are very good parsing libraries, and it would be nice to use something like them instead of (hacky?) regex solutions. Unfortunately, spaCy is very slow, so things would need to change quite a lot to make it usable within Lute.

Currently Lute parses very frequently:

  • when a new book is created
  • when a page is open for reading
  • when a new term is created

To get around the frequent parsing (for reading), Lute could:

  • parse the book once, and store the tokens and token boundaries (zero-width strings) in the texts.TxText field. Then, when reading, everything is already parsed
  • if creating terms from the reading pane, the zero-width strings (spans with spaces) could be sent to the route that creates the terms. No extra parsing would be needed
  • creating terms from the term index page would still require parsing ... that would be slow. The parsing library could be loaded one-time only and then kept available for any runtime session
  • I'm not sure how the spaCy dictionaries would be loaded, especially for Docker. They'd have to be ... mounted somehow, in the user data folder.
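
The "parse once and store the boundaries" idea might look like this in its simplest form. The helper names are hypothetical, and using U+200B (zero-width space) as the boundary is an assumption; any character that cannot occur in a token would do.

```python
ZERO_WIDTH_SPACE = "\u200b"


def store_tokens(tokens):
    """Join pre-parsed tokens into one string suitable for a text column."""
    if any(ZERO_WIDTH_SPACE in t for t in tokens):
        raise ValueError("token contains the boundary character")
    return ZERO_WIDTH_SPACE.join(tokens)


def load_tokens(stored):
    """Recover the token list without re-running the parser."""
    return stored.split(ZERO_WIDTH_SPACE) if stored else []
```

Removing the boundary characters from the stored string gives back the original text, so the reading pane could still show it unchanged.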

Add user pref toggle to not pause audio on click

Currently, when the term form is displayed while reading, it sends an event to pause the audio. Some users want to be able to disable that, i.e. audio continues even when the form is opened.

Add "sentence notes" for term references

I think there are many use cases for this:

  • There are often cases where a term's usage, or special meaning, is only rarely given, and it's good to keep track of those specially
  • Some sentences may show special grammar, interesting constructions
  • Useful place to record questions etc -- like a running notebook of stuff
  • EDIT: could also use these for sentence translations as well

These notes could be tagged by category, or by term, say, and when looking at a term, the associated notes would be returned too.

Sentences are "Value objects", the id could be tracked by md5 etc of the sentence text, including the language id as part of the md5. case-insens md5 too.
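
A minimal sketch of such an id, with a hypothetical function name. Collapsing runs of whitespace before hashing is an extra assumption on my part, added so that trivial spacing differences don't produce different ids.

```python
import hashlib


def sentence_key(lang_id, sentence):
    """Return an md5 hex digest identifying a sentence within a language."""
    # casefold() makes the key case-insensitive; split/join collapses whitespace.
    normalized = " ".join(sentence.split()).casefold()
    return hashlib.md5(f"{lang_id}\n{normalized}".encode("utf-8")).hexdigest()
```

Notes could then be stored against this key rather than against a row id, so they would survive a book being re-imported or re-parsed.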

Add control (slider or dropdown?) to increase/decrease text size on the reading screen.

Currently users set font size through custom settings, but a nicer method would be a slider or dropdown.

  • The font size should be applied to all span.textitem elements on the web page.
  • On move to a next/previous page, the font size should stay at the user's setting -- so store it in the localsettings or whatever.
  • I'm not sure if the font-size should be set in px, em, or rem. From my reading, rem is the way to go. I guess that the rem could go between, what, 25% up to 500%? No idea what range makes sense.

For the first pass implementation, don't bother storing this in the db settings table (i.e. where the custom css is stored). That would require a web service call to set the value, another to reload at launch, etc, a bunch of code for little value. If it's easy enough to adjust, it should suffice.

Example: https://codepen.io/p-mohamed-elsawy/pen/bJGgaZ

Improve image search and save

  • see if can add a text box to refine image search results, sometimes the default search images aren't that great.
  • possible to search by the parent term, instead of the child? e.g., if "dogs" has parent "dog", then the image search should be done with "dog".

ChatGPT Integration

Is your feature request related to a problem? Please describe.

Many small issues with texts or dictionaries are easily solved by ChatGPT and a few prompts. But copy/pasting things between Lute and ChatGPT can be cumbersome and slow. It would be great to have them integrated a little.

Describe the solution you'd like

I would like to see a few ChatGPT features.

  1. Add ChatGPT similar to a dictionary source. When you click on a word/sentence, you have a button to send that variable (plus a predefined prompt) to ChatGPT and then receive a response. It would be great for looking up words that don't appear in your dictionary or for getting an explanation for something. ChatGPT can also provide cultural context or come up with mnemonics. There are tons of possibilities if it's configurable in the language settings.
  2. Allow ChatGPT to reformat pages. Since Lute only works on text, some of the pages can be imported weird, and it'd be really handy to have a button that just sends that text to ChatGPT, tells it to reformat it to better fit Lute, then to replace that page with the new response. There'd need to be some prompt engineering, but it would be very useful. I've personally been coming across lots of typos in the webnovels that I'm reading which ChatGPT would solve in an instant.

Additional context

Here're some example prompts that I've been using:

For defining stubborn words:

Help me translate this word as it doesn't appear in dictionaries.

The word is: olizovala

Format the response like this, but replace the capitalized words with the correct information:

WORD
UNCONJUGATED, UNDECLINED DICTIONARY FORM OF THE WORD
PART OF SPEECH

  1. TRANSLATION IN ENGLISH
  2. OTHER MEANING (ONLY IF APPLICABLE)

SHORT EXAMPLE SENTENCE USING THE WORD

VERY SHORT EXPLANATION OF THE SIGNIFICANCE OF THE WORD USING SIMPLE ENGLISH

For reformatting:

The following passage has a few typos and formatting issues. Please rewrite the passage exactly the same, but fix any typos and reformat it to be more readable. Keep all the "artistic" choices made by the author.

Here's the passage:

Hotkey arrow up and down to increase/decrease status number

Hotkey left-right moves to terms, so up/down could change status.

Would need to go up relative to the current status. If multiple terms are chosen, could just start with the lowest status. Go from 1-2-3-4-5-WellKnown, skip "ignored".

The function to update is in lute.js, handle_keydown -- at least, that is how I was intending to do it. If there is a better option, LMK.
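
The stepping rule described above can be sketched as follows. This is only the logic, written in Python for illustration; the real change would live in the JavaScript of lute.js. The status codes (99 for Well Known, 98 for Ignored) and the `step_status` name are assumptions:

```python
from typing import Optional

# Assumed status codes: 1-5 are learning stages, 99 is Well Known.
STATUS_ORDER = [1, 2, 3, 4, 5, 99]
# Assumed code for "ignored"; it is not in STATUS_ORDER, so it is skipped.
IGNORED = 98


def step_status(statuses: list[int], direction: int) -> Optional[int]:
    """Return the new status for the selected terms, or None for no change.

    statuses: current statuses of all selected terms.
    direction: +1 for arrow up, -1 for arrow down.
    Starts from the lowest status among the selected terms and
    clamps at both ends of STATUS_ORDER.
    """
    positions = [STATUS_ORDER.index(s) for s in statuses if s in STATUS_ORDER]
    if not positions:
        return None  # e.g. only ignored terms selected
    start = min(positions)
    new_pos = max(0, min(len(STATUS_ORDER) - 1, start + direction))
    return STATUS_ORDER[new_pos]


print(step_status([4, 2], +1))
print(step_status([5], +1))
```

Returning `None` when nothing steppable is selected is one possible choice; another would be to start such terms at status 1.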

Increase acceptance test coverage (master list)

Commit ce50ee5112f27 added a basic acceptance (browser-level) test of Lute using Panther: reading a text, and creating Terms and multi-word Terms.

Per https://github.com/jzohrab/lute/blob/develop/tests/acceptance/README.md#tests-to-write, there are a bunch of tests to write, and if extensive work is done on any section of Lute then some of these acceptance tests might be useful.

  • Languages
    • List languages
    • Create new lang
  • Create text
    • from textbox
    • from file
    • import web page
  • Texts
    • archive text
    • view archive
    • unarchive text
    • delete text
  • Terms
    • list terms
    • search for terms
    • create new term from main form
    • bulk map parents from listing
  • Term Tags
    • list all
    • create
    • delete
  • Reading
    • Update refreshes multiple terms
    • Update on one page updates other books
    • hotkeys (done?)
  • Parent term mapping
    • export book file
    • export language file
    • import mapping file
  • Backups
    • backup setting defaults
    • set backup settings
    • create a backup (done, just need to verify file)
  • Version and software info

Change how dictionaries are defined and used

Copying over notes from jzohrab/lute#21.

Currently, Lute stores "dictionary 1" and "dictionary 2" URLs in the Language table, with placeholders for term substitution. This creates a few limitations:

  • use weird "*" character to designate a pop-up dictionary
  • restricted to only using HTML dictionaries, no easy way to handle json, plug-ins, or other types of dictionaries
  • limited to 2 dicts per language

It is potentially worth it to change dictionaries into first-class entities, e.g. with a brand new user form like this:

| field | notes |
| --- | --- |
| dictionary URL | textbox, the url with "###" placeholders -- better yet, change the placeholder to "[LUTETERM]" or similar, since "#" is a valid URL entry (e.g., looking up "https://en.m.wiktionary.org/wiki/essere#Italian" would use a URL like "https://en.m.wiktionary.org/wiki/[LUTETERM]#Italian") |
| opens in pop-up? | checkbox |
| encoding | dropdown or textbox |
| returns | dropdown (html default, or json -- reason for the "json" option is that some languages seem to only have dictionaries available via a json API) |
| active | checkbox. Sometimes some dictionaries will be more useful than others -- eg, when offline, any online dicts are useless, so I could potentially deactivate the online dicts and only use an offline Kobo dict or whatever. |

These would be stored in a new dictionaries table, and would be linked to the Languages. A first-draft UI implementation could be a dedicated screen to define dictionaries, which would be easiest (it's possible to create child subforms, but I haven't done that yet in Symfony :-) ).

One dict would have to be marked as primary. A language could define one or multiple dicts.
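
As a rough sketch of what a first-class dictionary entity might look like, here is a small Python dataclass mirroring the fields listed above. The class name, attribute names, and helper functions are hypothetical, and no database mapping is shown:

```python
from dataclasses import dataclass
from urllib.parse import quote

PLACEHOLDER = "[LUTETERM]"  # placeholder name suggested in the notes above


@dataclass
class Dictionary:
    """Hypothetical dictionary entity; field names are assumptions."""
    url: str                      # lookup URL containing PLACEHOLDER
    opens_in_popup: bool = False  # replaces the "*" prefix convention
    encoding: str = "utf-8"
    returns: str = "html"         # "html" or "json"
    is_active: bool = True
    is_primary: bool = False

    def lookup_url(self, term: str) -> str:
        """Substitute the URL-encoded term for the placeholder."""
        return self.url.replace(PLACEHOLDER, quote(term, encoding=self.encoding))


def active_dictionaries(dicts: list[Dictionary]) -> list[Dictionary]:
    """Return only active dictionaries, with the primary one first."""
    return sorted((d for d in dicts if d.is_active), key=lambda d: not d.is_primary)


wiktionary = Dictionary(
    "https://en.m.wiktionary.org/wiki/[LUTETERM]#Italian", is_primary=True
)
print(wiktionary.lookup_url("essere"))
```

Because the placeholder no longer contains "#", the URL fragment ("#Italian") passes through untouched.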

Add book text search

If I know a book has a term, I just want to search for it somewhere, and have the pages where it shows up.
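
A minimal sketch of such a search, assuming a book is available as a list of page texts. The function name is hypothetical, and plain case-insensitive substring matching is an assumption; a real implementation would probably need to respect each language's parsing rules:

```python
def pages_containing(pages: list[str], term: str) -> list[int]:
    """Return 1-based page numbers whose text contains the term, ignoring case."""
    needle = term.casefold()
    return [
        number
        for number, text in enumerate(pages, start=1)
        if needle in text.casefold()
    ]


book = ["El perro corre.", "Un gato duerme.", "Perros y gatos."]
print(pages_containing(book, "perro"))
```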
