Coursera data science capstone project: R n-gram text predictor, Shiny app
- See Word Prediction: Exploratory Data Analysis for an introduction to the details of the methods. The first model used about 5% of the whole training dataset.
- The final model uses around 10% of the whole training dataset, resulting in a top-one precision of 10% and a top-three precision of 15%. The `data.table` package is used to process data frames, which makes the prediction process faster.
- See the Shiny App for a demo.
- See these slides for a brief explanation of the project.
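
As a rough illustration of the two ideas mentioned above, the sketch below shows a keyed `data.table` lookup for next-word candidates and how top-one and top-three precision can be computed. It is not the project's actual code: the toy bigram table, its counts, and the helper names `predict_next` and `top_k_precision` are all assumptions made up for this example.

```r
library(data.table)

# Toy bigram table with made-up counts (illustration only, not project data).
bigrams <- data.table(
  prev  = c("new", "new", "new", "new", "thank", "thank"),
  nxt   = c("york", "year", "car", "job", "you", "goodness"),
  count = c(50L, 30L, 10L, 5L, 80L, 3L)
)

# Keying sorts the table by `prev`, so subsetting on it uses binary search
# instead of scanning every row.
setkey(bigrams, prev)

# Hypothetical helper: up to k most frequent continuations of `word`.
predict_next <- function(word, k = 3L) {
  hits <- bigrams[.(word), nomatch = NULL]  # keyed subset; no rows if unseen
  head(hits[order(-count), nxt], k)
}

# Hypothetical helper: share of test cases whose true next word appears
# among the top k predictions.
top_k_precision <- function(prev_words, true_words, k) {
  hit <- mapply(function(p, t) t %in% predict_next(p, k), prev_words, true_words)
  mean(hit)
}

# Tiny made-up test set: previous word and the word that actually followed.
test_prev <- c("new", "new", "thank", "new")
test_true <- c("york", "car", "you", "job")

top1 <- top_k_precision(test_prev, test_true, 1L)  # 0.5 on this toy data
top3 <- top_k_precision(test_prev, test_true, 3L)  # 0.75 on this toy data
```

Top-three precision can never be lower than top-one precision, because the top-one prediction is always among the top three. The toy numbers here are unrelated to the 10% and 15% reported for the real model.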