This is the GitHub repository for a short introductory course on R and how to use it for data science.
The slides and content consolidate material from lessons and courses I took during my graduate studies at Columbia University.
Slides:
- Class 1 - Introduction
- Class 2 - Data Wrangling
- Class 2 - Data Visualisation
- Class 3 - EDA & Missing Data
- Class 4 - Functional Programming
Practice Assignments:
Datasets:
- Cleveland Heart Disease dataset taken from UC Irvine
- Class 1
  - Basic R Programming
  - Importing and Writing Data
- Class 2
  - Data wrangling and manipulation with `dplyr`
  - Data visualization with `ggplot2`
- Class 3
  - Exploratory Data Analysis
  - Dealing with Missing Data using `naniar`
  - Work on Assignment 1 together
- Class 4
  - Functional Programming
- Class 5
  - Basic Regression Analysis
- Class 6
  - Basic Machine Learning Concepts with `caret`
- Class 7
  - Working with strings using the `stringr` and `rebus` packages
  - Simple NLP using `tidytext`, `tm` and `wordcloud`
- Class 8
  - Web scraping with `rvest`
  - API with `httr`
- Class 9
  - Dashboard & Website Building with `shiny`
Shoutout to raw.githack for making it possible to view the raw HTML slides.