The datasets are downloaded into a folder called "UCI HAR Dataset". It contains the test and train folders, as well as the activity labels, the feature list and supporting files.
run_analysis.R
- It loads the activity, feature, test and train datasets.
- It merges the test and train datasets for x, y and subject respectively.
- It creates a character vector of all the relevant column headers, i.e. subject, activity, and the mean and std measurements.
- It subsets the original data table with that vector to give the required data table.
- It uses the activity labels to factorise the Activity column, so that the activity numbers are replaced by the activity labels.
- It also gives the dataset variables descriptive names using gsub with regular expressions.
- It averages all the values per activity per subject.
- The data table is then written to a file called "Data.txt".
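
The steps above can be sketched roughly as follows. This is a minimal base R illustration, not the actual contents of run_analysis.R. The function name `tidy_har`, the regular expressions, the `Subject` and `Activity` column names and the `row.names = FALSE` setting are all assumptions made for the example. The file names follow the usual layout of the UCI HAR Dataset folder.

```r
# Hypothetical helper: builds the tidy table from the dataset folder.
tidy_har <- function(data_dir = "UCI HAR Dataset") {
  # Load the activity labels and the feature names.
  activities <- read.table(file.path(data_dir, "activity_labels.txt"),
                           col.names = c("id", "label"),
                           stringsAsFactors = FALSE)
  features <- read.table(file.path(data_dir, "features.txt"),
                         col.names = c("index", "name"),
                         stringsAsFactors = FALSE)

  # Stack the train and test rows for one file family (X, y or subject).
  read_part <- function(prefix) {
    files <- file.path(data_dir, c("train", "test"),
                       paste0(prefix, "_", c("train", "test"), ".txt"))
    do.call(rbind, lapply(files, read.table))
  }
  x <- read_part("X")
  y <- read_part("y")
  subject <- read_part("subject")

  # Keep only the mean and std measurements (assumed pattern).
  keep <- grep("-(mean|std)\\(\\)", features$name)
  tidy <- data.frame(Subject = subject[[1]],
                     Activity = y[[1]],
                     x[, keep, drop = FALSE])
  names(tidy) <- c("Subject", "Activity", features$name[keep])

  # Replace the activity numbers with the activity labels.
  tidy$Activity <- factor(tidy$Activity,
                          levels = activities$id,
                          labels = activities$label)

  # Relabel the variables with gsub (example substitutions only).
  nm <- names(tidy)
  nm <- gsub("^t", "Time", nm)
  nm <- gsub("^f", "Frequency", nm)
  nm <- gsub("-mean\\(\\)", "Mean", nm)
  nm <- gsub("-std\\(\\)", "Std", nm)
  nm <- gsub("-", "", nm)
  names(tidy) <- nm

  # Average every measurement per activity per subject.
  aggregate(. ~ Subject + Activity, data = tidy, FUN = mean)
}

# Write the result only if the dataset folder is present.
if (dir.exists("UCI HAR Dataset")) {
  write.table(tidy_har(), "Data.txt", row.names = FALSE)
}
```

The sketch uses base R so that it has no package dependencies. The original script may rely on a different package or on different name substitutions.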
CodeBook.md This describes all the variables and data, along with any transformations that occurred while tidying the data.
Data.txt This file contains the final dataset after cleaning.