Comments (5)

fasiha commented on August 17, 2024 8

every time the idea to build my own Memrise/Duolingo clone struck me again

Thank you @bunyk for considering Ebisu for building your app, that has always been my dream with Ebisu—to make it easy to create your perfect quiz app 😄

some prerequisite statistics books or online courses that would help with better understanding math behind all this

So first I'll say this: my goal is to make a library that anybody can use, without needing a Master's degree in statistics 😅! The mathematical details in the README are mostly for folks who want to improve the algorithm—you don't need to understand this to use the Python/JavaScript/etc. library in your app.

However, if you're really very very motivated to learn this, I think an introductory course on probability and Bayesian inference should be all you need to check my derivation of the posterior.

  • MIT's courses tend to be extremely advanced and might be good for self-study: this is their probability/Bayesian course, which has this prerequisite course on advanced multivariate calculus (but we don't really need much multivariate calculus, just the normal single-variate calculus should be fine).
  • If you're a student, you might also look at what courses your university teaches to first-year graduate students in the statistics department (though a lot of that (real analysis, linear regression, etc.) is not useful for Ebisu; also note that Ebisu does not use any tools from machine learning or data science, so you don't need to learn anything about classification/regression to understand what we're doing here).
    • You can also look for the course that teaches BUGS or JAGS or STAN or PyMC, which are MCMC/numerical libraries to solve the Bayesian problem that Ebisu solves analytically. Here's the course I took (fifteen years ago…) at my school: Bayesian Analysis. This kind of course would be my top suggestion if you're in school and want to learn how Ebisu works, especially if they teach you how to use calculus, not just the MCMC numerical libraries. (STAN etc. can solve much bigger problems than Ebisu, but I've tried very hard to make Ebisu's model simple enough to have an analytical solution so predictRecall and updateRecall are fast ("fast"…) and don't need Monte Carlo sampling or numerical integration, because I wanted Ebisu to be able to run on a phone.)

I'm sorry that unfortunately we don't have a good way to teach/learn mathematics the way we have excellent resources for programming 😢.

Call predictRecall for each, and select the ones with the lowest recall?

This is what I do. (I also like to gently randomize the flashcard to review, so instead of the card with the absolute lowest pRecall I find the bottom percentile of pRecall and then pick a random one from that, but that's a nice-to-have.)
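The selection described above could be sketched roughly as follows. This is a minimal illustration and not code from Ebisu: `pick_card`, the `bottom_fraction` parameter, and the example recall values are all assumptions, and the recall probabilities are presumed to have been computed already (for example by calling predictRecall on each card).

```python
import random


def pick_card(recalls, bottom_fraction=0.1, rng=random):
    """Pick a random card from the lowest-recall slice of the deck.

    `recalls` maps a card id to its current recall probability.
    `bottom_fraction` (hypothetical parameter) is the share of the deck,
    counted from the lowest recall upward, that is eligible to be picked.
    """
    if not recalls:
        raise ValueError("no cards to choose from")
    # Sort card ids from lowest to highest recall probability.
    ranked = sorted(recalls, key=recalls.get)
    # Always keep at least one candidate, even for tiny decks.
    n_candidates = max(1, int(len(ranked) * bottom_fraction))
    return rng.choice(ranked[:n_candidates])


# Made-up recall probabilities, for illustration only.
deck = {"cat": 0.92, "dog": 0.35, "bird": 0.60, "fish": 0.41, "horse": 0.88}
card = pick_card(deck, bottom_fraction=0.4)  # either "dog" or "fish"
```

With a small `bottom_fraction` this behaves almost like "always pick the lowest", while a larger value adds more variety at the cost of sometimes skipping the weakest card.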

And then have some threshold, like "if recall probability is > 80%", learn new facts instead now, as reviewing those will not be so efficient?

Some people do this. I don't really like apps that do this though, because some days I don't want to review (just learn), other days I don't want to learn (just review), so I prefer apps that let me pick whether to learn or to review. That also simplifies the app so I don't need a numerical threshold—the threshold for switching from reviewing to learning is my boredom 😃.

if fact A had lower recall probability today than fact B, then probably the same should be true tomorrow

As @MNastri mentioned (thank you for weighing in!), because cards decay at differing rates, you have to be quite careful doing this.
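To make that concrete, here is a small worked example. It uses plain exponential decay with a fixed half-life per card as a stand-in model; that is an assumption for illustration and not Ebisu's actual predictRecall, and the half-lives and elapsed times are made up.

```python
def recall(elapsed_days, halflife_days):
    """Stand-in forgetting curve: recall halves every `halflife_days`."""
    return 2.0 ** (-elapsed_days / halflife_days)


# Hypothetical cards: (days since last review, half-life in days).
card_a = (0.5, 1.0)    # weak memory, reviewed recently
card_b = (6.0, 10.0)   # strong memory, reviewed a while ago

today_a, today_b = recall(*card_a), recall(*card_b)
# One day later, both cards have one more day of elapsed time.
tomorrow_a = recall(card_a[0] + 1, card_a[1])
tomorrow_b = recall(card_b[0] + 1, card_b[1])

print(f"today:    A={today_a:.2f}  B={today_b:.2f}")
print(f"tomorrow: A={tomorrow_a:.2f}  B={tomorrow_b:.2f}")
```

Under these assumptions B has the lower recall today (about 0.66 versus 0.71), but tomorrow A does (about 0.35 versus 0.62), so an ordering computed today cannot simply be reused tomorrow.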

But like they alluded to, I am working on v3 of the algorithm that changes a lot of how the statistics work (see #43 for the details). This will have a much simpler predictRecall (no beta functions, just arithmetic). And this is going to make it straightforward to calculate recall probability in SQL: you can either do a full-table scan to find the card with the lowest recall probability, or of course you can cache that smartly and update it as needed (newly-learned cards need to be recalculated every 5 minutes; mature cards can be updated every week; something like that).
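The caching idea could look something like the sketch below. It does not depend on v3 or on any Ebisu API: `needs_refresh` and the `MATURE_HALFLIFE` cutoff are hypothetical, and only the two refresh intervals come from the rough figures mentioned above.

```python
from datetime import datetime, timedelta

# Refresh intervals taken from the rough figures in the comment above.
NEW_CARD_REFRESH = timedelta(minutes=5)
MATURE_CARD_REFRESH = timedelta(weeks=1)
# Hypothetical cutoff: the comment does not say what counts as "mature".
MATURE_HALFLIFE = timedelta(days=30)


def needs_refresh(halflife, cached_at, now):
    """Return True if a card's cached recall probability is stale."""
    if halflife >= MATURE_HALFLIFE:
        interval = MATURE_CARD_REFRESH
    else:
        interval = NEW_CARD_REFRESH
    return now - cached_at >= interval


now = datetime(2024, 8, 17, 12, 0)
# A newly learned card whose cached value is 10 minutes old: stale.
new_card_stale = needs_refresh(timedelta(hours=4), now - timedelta(minutes=10), now)
# A mature card whose cached value is 2 days old: still usable.
mature_card_stale = needs_refresh(timedelta(days=90), now - timedelta(days=2), now)
```

A real app would probably want more than two tiers, but the principle is the same: the slower a card decays, the longer its cached recall probability stays close enough to the truth.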

I was hoping to publish v3 six months ago but got very busy. I'm hoping to release it in the next few months 🙏.

I would prefer to have an app that I could open at any time for 5 minutes (waiting in line, sitting on a toilet, etc.), and it will just give the next best thing to review

I also love this!!! I have apps that rerun predictRecall on hundreds of flashcards before each quiz: it's not as efficient as it could be but it's not the bottleneck. I encourage you to design the actual app—getting the UI/UX right is going to be so important. We'll be able to help you speed things up when predictRecall becomes your bottleneck.

Please feel free to post any more questions or comments if any of the above is unclear. Thanks for writing!

from ebisu.

zxl777 commented on August 17, 2024 2

I'm developing a language learning app for iOS/Android, and both my users and I want to have an SRS system to really master the new vocabulary that's being added.

I discovered your Ebisu. Intuition tells me that this algorithm is alive and well, despite your intimidating mathematical arguments 😄

I am going to embed the Python program directly into my server, and I am also looking forward to the 3.0 update.

Keep it up! Good job!

MNastri commented on August 17, 2024 1

I like the idea of no longer testing old facts depending on their probability of recall. This could help calculate the number of facts you still have to test today, so you don't have to review indefinitely.

With regard to your last question, I think that facts can be forgotten at different speeds, so even if the recall probability of fact A is lower than the recall probability of fact B now, at a future time this might not be the case because of the different speeds of forgetting. One fact might be "weaker" and decay more rapidly than the other.

If I remember correctly, Fasiha was working on something to address this change in speed of forgetting/learning, which he was calling acceleration. I might be forgetting exactly what he wrote, but I think it was about multiplying the half-life by some factor after a recall.

fasiha commented on August 17, 2024 1

Closing this for now, please feel free to reopen or add further questions/comments!

bunyk commented on August 17, 2024

Oh, I see: since this forgetting is not linear, the only way to determine what is best to review at any given moment is to call predictRecall, with all those beta functions, for each item and then sort. I guess that should work if you do it like Duolingo and prepare a learning session with N items, so you don't have to sort for every single item. And I should probably create an app first and only then worry about scaling it. :)
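Preparing a session of N items might be sketched like this. The `build_session` helper and the recall values are made up for illustration, and the recall probabilities are assumed to have been computed beforehand (for example with predictRecall); `heapq.nsmallest` from the Python standard library avoids sorting the whole deck when N is small.

```python
import heapq


def build_session(recalls, n):
    """Return the ids of the `n` cards with the lowest recall probability.

    `recalls` maps a card id to its current recall probability. The result
    is ordered from lowest to highest recall.
    """
    return heapq.nsmallest(n, recalls, key=recalls.get)


# Made-up recall probabilities, for illustration only.
deck = {"cat": 0.92, "dog": 0.35, "bird": 0.60, "fish": 0.41, "horse": 0.88}
session = build_session(deck, 3)  # ["dog", "fish", "bird"]
```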

This could help calculate the number of facts you still have to test today

I'm lazy, and don't like the words "have to". :) I would prefer to have an app that I could open at any time for 5 minutes (waiting in line, sitting on a toilet, etc.), and it will just give the next best thing to review. Something like the Facebook algorithm, but actually useful. So not for daily reviews, but for reviews multiple times per day, at random moments.