getdata-project

This README explains how run_analysis.R, the script written for this assignment, works.

Here are the steps from run_analysis.R explained:

  1. Read activity_labels.txt from the dataset for later use. While reading, assign meaningful column names.

  2. Read the features.txt for later use. Each row of this file corresponds to a column in the X train and X test files.

  3. Define a function that reads the test or train data into a usable form. It is called once for each of the two sets and does the following:

  4. read the Y_ file. Each row defines the activity from which the corresponding X_ file row was measured. While reading, assign a meaningful column name.

  5. read the X_ file. Use the features information read in step 2 to define meaningful column names while reading.

  6. clean up the resulting column names by removing the "." characters introduced when the feature names from the dataset are converted to acceptable column names.

  7. read the subject_ file. Each row defines the subject performing the activity from the corresponding row in the X_ file.

  8. combine the subject, activity, and measurement (subject, Y_, and X_) data into a single data frame.

  9. Merge in the activity labels based on the match of activity ID between the activity_labels and Y_ data.

  10. The above function is called once for the test data and once for the training data.

  11. The results are combined into a single data frame.

  12. The "measurements on the mean and standard deviation for each measurement" are extracted using subset and a pattern match on the column names: columns with "mean" or "std" in the name are selected, without picking up the meanFrequency columns.

  13. Note: descriptive activity names and meaningful variable labels have been included in the result by way of the earlier construction of the data frame.

  14. Use ddply (from the plyr package) to compute the mean of each column per subject per activity. The result is a dataset described in the included codebook.

  15. Use write.table to write the result to a text file for submission. Note: the read.table command to read in the data is included in the source file run_analysis.R.
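
The steps above can be sketched end to end on a toy dataset. This is a minimal illustration, not the code from run_analysis.R. The directory layout, the file names under `root`, the helper names `write_set` and `read_set`, the column names, and all values are assumptions made up for the example. It also swaps in base R's `aggregate` for `ddply` so that it runs without the plyr package.

```r
# Toy stand-in for the dataset, written to a temporary directory (made-up values).
root <- file.path(tempdir(), "toy_dataset")
dir.create(root, showWarnings = FALSE)
writeLines(c("1 WALKING", "2 SITTING"), file.path(root, "activity_labels.txt"))
writeLines(c("1 tBodyAcc-mean()-X", "2 tBodyAcc-std()-X", "3 fBodyAcc-meanFreq()-X"),
           file.path(root, "features.txt"))

# Hypothetical helper: writes the subject, y and X files for one set.
write_set <- function(set, subject, y, x) {
  dir.create(file.path(root, set), showWarnings = FALSE)
  writeLines(as.character(subject), file.path(root, set, paste0("subject_", set, ".txt")))
  writeLines(as.character(y), file.path(root, set, paste0("y_", set, ".txt")))
  write.table(x, file.path(root, set, paste0("X_", set, ".txt")),
              row.names = FALSE, col.names = FALSE)
}
write_set("train", c(1, 1, 2), c(1, 1, 2), matrix(c(1, 3, 5, 10, 30, 50, 7, 8, 9), ncol = 3))
write_set("test", c(3, 3), c(2, 2), matrix(c(2, 4, 20, 40, 6, 6), ncol = 3))

# Steps 1-2: read the lookup files with meaningful column names.
activity_labels <- read.table(file.path(root, "activity_labels.txt"),
                              col.names = c("activity_id", "activity"))
features <- read.table(file.path(root, "features.txt"),
                       col.names = c("feature_id", "feature"),
                       stringsAsFactors = FALSE)

# Steps 3-9: read one set ("train" or "test") into a single data frame.
read_set <- function(set) {
  path <- function(prefix) file.path(root, set, paste0(prefix, "_", set, ".txt"))
  y <- read.table(path("y"), col.names = "activity_id")
  # read.table turns "-", "(" and ")" in the feature names into "."
  x <- read.table(path("X"), col.names = features$feature)
  names(x) <- gsub(".", "", names(x), fixed = TRUE)
  subject <- read.table(path("subject"), col.names = "subject")
  # Bind the columns first: rows are still aligned here, and merge() may reorder them.
  combined <- cbind(subject, y, x)
  merge(combined, activity_labels, by = "activity_id")
}

# Steps 10-11: read both sets and stack them.
all_data <- rbind(read_set("train"), read_set("test"))

# Step 12: keep mean/std columns, but not the mean-frequency ones.
keep <- grepl("mean|std", names(all_data)) & !grepl("meanFreq", names(all_data))
selected <- all_data[, c("subject", "activity", names(all_data)[keep])]

# Step 14: mean of every measurement per subject per activity
# (base R aggregate used here in place of plyr::ddply).
tidy <- aggregate(. ~ subject + activity, data = selected, FUN = mean)

# Step 15: write the result, then read it back in.
out_file <- file.path(root, "tidy.txt")
write.table(tidy, out_file, row.names = FALSE)
check <- read.table(out_file, header = TRUE)
print(tidy)
```

Binding the subject, activity, and measurement columns before the merge matters in this sketch: the three files are matched purely by row position, and `merge` does not promise to preserve row order.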

Coursera getdata-004 course project