GithubHelp home page GithubHelp logo

jackelsey / pleco-history-analyzer Goto Github PK

View Code? Open in Web Editor NEW
6.0 2.0 0.0 6 KB

helps you identify words that you look up over and over again in Pleco

Python 100.00%
chinese pleco vocabulary vocabulary-learning

pleco-history-analyzer's Introduction

What This Code Does

The code in this GitHub repository helps you identify words you look up repeatedly in Pleco's reader function. It takes history backup files from Pleco, determines what subset of words were queried at least twice (at times that were at least five days apart), and then saves those words as a list in a text file.

The five-day time threshold (which can be changed on line 8 of 1_process_pleco_backups.py) is to account for articles that take more than one day to read.

Requirements

You will need:

  • multiple history backup files from Pleco (e.g. settingsbackup-1509041749.xml). Pleco only saves the most recent lookup in the reader history, so this code needs more than one history backup file to work.
  • the ability to run Python code on your computer. If you are new to Python I recommend downloading the (free and open-source) Anaconda distribution of Python, because the Python scripts (files with .py file extension) in this GitHub repository require some libraries that you won't get if you install Python from the official website. The Python scripts can be run from the command line; you don't have to install a sophisticated IDE. This code has been tested to work with Python 3.7.4, but any later version of Python should be OK.

Steps for Use

  1. Download this repository into a folder on your computer (green "Code" button > Download zip).
  2. Export your reader lookup history from Pleco: hamburger icon > Settings > Backup Settings and/or History > Backup History Only.
  3. Save the .xml backup file in the backups folder.
  4. Read lots of Chinese, look up lots of words in the Pleco reader function, and repeat steps 2 to 3 every couple of weeks. As explained above, this code needs multiple Pleco backup files to work.
  5. Run the first Python script. (You can do this on the command line by typing python 1_process_pleco_backups.py and pressing enter.) After a few seconds you should see a file pleco_import.xml appear in the folder where you saved the code. The file pleco_import.xml is structured just like the Pleco backup files you've been saving, except that it only contains reader lookup history entries for words you've looked up on two days that were at least three days apart.
  6. Clear your reader lookup history: hamburger icon > History > READER > three-dots icon on top right > Clear current list.
  7. Import the file pleco_import.xml into Pleco: hamburger icon > Settings > Restore from Backup.
  8. Create Pleco flashcards for the reader lookup history entries (you may want to create a separate category for these flashcards if you use Pleco's flashcard function to study): hamburger icon > History > READER > three-dots icon > Dump current list to flashcards. This will take a minute, during which your phone screen may be unresponsive.
  9. Export the flashcards as a text file: hamburger icon > Import / Export (under FLASHCARDS) > Export Cards > uncheck all boxes except "Remap if forbidden"
  10. Save the text file Pleco just created (it will be called something like flash-2008111841.txt) into the folder with the Python code.
  11. Open up the text file known_words.txt and enter the words you currently know. If you happened to accidentally look up any of these words in the Pleco reader function the code will make sure they don't end up in the final output file. One word per line, and it's OK if there is other stuff on the line after the 汉字 as long as it's separated by a tab. A text file export of cards from Anki will work as long as you rename the file known_words.txt.
  12. Run the second Python script (python 2_remove_known_words.py). This will create a final file called output.txt.
  13. Learn the words in output.txt that look useful.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.