As requested in the assignment, this repo contains the following:
- tidy data set generated by the script (wearable_means.txt)
- script for downloading, manipulating, and tidying the raw data into the above data set (run_analysis.R)
- code book describing the tidy data set (CodeBook.md)
- readme (this file)

The raw data is stored in the UCI HAR Dataset directory. The run_analysis.R script writes the code book as an .Rmd file as its final step; that file was then manually edited and knit to produce the included CodeBook.md. The data manipulations are described in the code book (replicated below) and in the documentation within the run_analysis.R script.

The data set was originally sourced from UCI HAR and tidied into the presented format. The training and test data sets were each processed the same way: activity levels were mapped to activity names as described in activity_labels.txt, variable names were cleaned up (extended descriptions were added to the labels after merging), and only the mean and standard deviation variables were extracted. The two sets were then merged, and the combined data set was used to calculate the average of each retained variable (mean and standard deviation) for each activity and subject.
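
The steps described above can be sketched in base R roughly as follows. This is a minimal illustration on toy data, not the code in run_analysis.R: the helper names (make_set, tidy_one), the toy values, and the exact name-cleaning rules are assumptions made for the example.

```r
# Toy stand-in for activity_labels.txt (illustrative values only).
activity_labels <- data.frame(id = 1:2, name = c("WALKING", "SITTING"))

# Hypothetical helper that builds a small data set shaped like the raw data:
# a subject id, an activity id, and a few feature columns.
make_set <- function(subjects, activities) {
  data.frame(
    subject = subjects,
    activity_id = activities,
    "tBodyAcc-mean()-X" = seq_along(subjects) * 1.0,
    "tBodyAcc-std()-X" = seq_along(subjects) * 0.1,
    "tBodyAcc-max()-X" = seq_along(subjects) * 2.0,
    check.names = FALSE
  )
}

train <- make_set(c(1, 1, 2, 2), c(1, 1, 2, 2))
test <- make_set(c(3, 3), c(1, 2))

# Hypothetical helper applied identically to the training and test sets.
tidy_one <- function(df) {
  # Map activity levels to descriptive activity names.
  df$activity <- factor(df$activity_id,
                        levels = activity_labels$id,
                        labels = as.character(activity_labels$name))
  # Keep only the mean() and std() measurements.
  keep <- names(df)[grepl("mean\\(\\)|std\\(\\)", names(df))]
  df <- df[, c("subject", "activity", keep)]
  # Clean up variable names (assumed rule: drop "()" and replace "-" with "_").
  names(df) <- gsub("-", "_", gsub("\\(\\)", "", names(df)))
  df
}

# Merge the two processed sets.
combined <- rbind(tidy_one(train), tidy_one(test))

# Average every retained variable for each subject and activity.
measure_cols <- setdiff(names(combined), c("subject", "activity"))
wearable_means <- aggregate(
  combined[measure_cols],
  by = list(subject = combined$subject, activity = combined$activity),
  FUN = mean
)

print(wearable_means)
```

The result has one row per subject and activity combination that occurs in the data, and one column per retained mean or standard deviation variable.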