rgdk/GettingAndCleaningData

#An explanation of the run_analysis.R script

##Part 1
###Here the training and test sets are extracted into separate data frames:

  • The test and training activity data sets are then read into the test_data_activities and train_data_activities data frames
  • The column in each of the test_data_activities and train_data_activities data frames is renamed to 'activity_id'
  • The test and training subject data are then read into the test_data_subject and train_data_subject data frames
  • The column in each of the test_data_subject and train_data_subject data frames is renamed to 'subject_id'
  • The activity column labels are extracted and stored within the activity_labels data frame
  • The features column labels are extracted and stored within the features data frame
  • The columns in the activity_labels data frame are renamed to something more meaningful (activity_id and activity)
  • The columns in the features data frame are renamed to something more meaningful (feature_id and feature)
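The read-and-rename steps above can be sketched as follows. This is a minimal illustration, not the script itself: the inline strings are made-up stand-ins for the raw text files (the real script reads from files on disk, whose paths are not shown here), and only the test side is shown since the training side follows the same pattern.

```r
# Toy stand-ins for the raw files (assumed contents, for illustration only).
test_activity_txt <- "5\n4\n1"
test_subject_txt  <- "2\n2\n4"
labels_txt        <- "1 WALKING\n4 SITTING\n5 STANDING"
features_txt      <- "1 tBodyAcc-mean()-X\n2 tBodyAcc-sd()-X"

# Single-column files: read them and name the column in one step.
test_data_activities <- read.table(text = test_activity_txt, col.names = "activity_id")
test_data_subject    <- read.table(text = test_subject_txt,  col.names = "subject_id")

# Two-column lookup tables: read first, then rename, as described above.
activity_labels <- read.table(text = labels_txt, stringsAsFactors = FALSE)
names(activity_labels) <- c("activity_id", "activity")

features <- read.table(text = features_txt, stringsAsFactors = FALSE)
names(features) <- c("feature_id", "feature")
```

With real files, the `text =` argument would be replaced by a file path as the first argument to `read.table()`.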

##Part 2
###Here, descriptive activity names are used to name the activities in the data set

  • A column that contains the test activity description based on the activity_id is added to the test activity data
  • Then the id column is removed
  • A column that contains the training activity description based on the activity_id is added to the training activity data
  • Then the id column is removed
  • The test_data_subject data frame is merged with the test_data frame
  • The train_data_subject data frame is merged with the train_data frame
  • The activities data frame is then merged with the test_data frame
  • The activities data frame is also merged with the train_data frame
  • The test and train data frames are then concatenated
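One way to picture the steps above is the sketch below. It assumes that "merged" means binding columns side by side (the subject, activity and measurement data share row order and have no common key), and it uses `match()` for the lookup so that row order is preserved. The helper `label_and_bind()` and all data values are hypothetical and are not taken from the script.

```r
# Toy lookup table and toy test/train inputs (made-up values).
activity_labels <- data.frame(activity_id = c(1, 4, 5),
                              activity = c("WALKING", "SITTING", "STANDING"),
                              stringsAsFactors = FALSE)

test_data_activities  <- data.frame(activity_id = c(5, 4, 1))
test_data_subject     <- data.frame(subject_id = c(2, 2, 4))
test_data             <- data.frame(V1 = c(0.1, 0.2, 0.3), V2 = c(-0.5, -0.4, -0.3))

train_data_activities <- data.frame(activity_id = c(1, 5))
train_data_subject    <- data.frame(subject_id = c(1, 3))
train_data            <- data.frame(V1 = c(0.4, 0.5), V2 = c(-0.2, -0.1))

# Hypothetical helper: add the activity description, drop the id column,
# then bind subject, activity and measurements column-wise.
label_and_bind <- function(subject, activities, measurements, labels) {
  idx <- match(activities$activity_id, labels$activity_id)
  activities$activity <- labels$activity[idx]
  activities$activity_id <- NULL
  cbind(subject, activities, measurements)
}

test_data  <- label_and_bind(test_data_subject, test_data_activities,
                             test_data, activity_labels)
train_data <- label_and_bind(train_data_subject, train_data_activities,
                             train_data, activity_labels)

# Concatenate the two sets row-wise.
all_data <- rbind(test_data, train_data)
```

Using `merge()` for the lookup instead of `match()` would also work, but `merge()` sorts its result by the key by default, which would break the row alignment with the subject and measurement data.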

##Part 3
###The data set is appropriately labelled with descriptive variable names.

  • The columns in the merged data set are renamed based on the feature data frame
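As a rough sketch of this renaming step, the measurement columns can take their names directly from the feature column of the features data frame. The example assumes the first two columns are `subject_id` and `activity` and that the remaining columns appear in the same order as the feature list; the data values and feature names are made up.

```r
# Toy feature list and toy merged data set (made-up values).
features <- data.frame(feature_id = 1:3,
                       feature = c("tBodyAcc-mean()-X", "tBodyAcc-sd()-X", "tBodyAcc-max()-X"),
                       stringsAsFactors = FALSE)

all_data <- data.frame(subject_id = c(2, 4),
                       activity = c("SITTING", "WALKING"),
                       V1 = c(0.1, 0.2), V2 = c(-0.9, -0.8), V3 = c(0.5, 0.6),
                       stringsAsFactors = FALSE)

# Identify the measurement columns (everything except the two id columns).
measure_cols <- !(names(all_data) %in% c("subject_id", "activity"))

# Guard: there must be exactly one feature name per measurement column.
stopifnot(sum(measure_cols) == nrow(features))

# Rename the measurement columns from the feature list.
names(all_data)[measure_cols] <- features$feature
```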

##Part 4
###Extract only the measurements on the mean and standard deviation for each measurement.

  • Separate data frames are set up to hold the mean columns and the standard deviation columns
  • The names of the mean measures are taken from the existing features list and stored as the rows of mean_col_names. A measure counts as a mean measure if its name contains 'mean()'.
  • The names of the standard deviation measures are taken from the existing features list and stored as the rows of stdev_col_names. A measure counts as a standard deviation measure if its name contains 'sd()'.
  • The mean and standard deviation measures are stored in separate data frames
  • Blank data frames with the correct number of rows are set up for the mean and standard deviation measures
  • The mean data columns are bound together
  • The standard deviation data columns are bound together
  • The row_num column, which was only a placeholder to establish the correct number of rows in each data frame, is then removed from each one
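The selection described above builds the result column by column. A more direct alternative, shown here only as a sketch and not as the script's method, is to pick the matching column names with `grep()` and subset once. The patterns follow the text above ('mean()' and 'sd()'); the toy column names are assumptions, so the patterns should be adjusted to whatever suffixes the real feature list uses.

```r
# Toy labelled data set; check.names = FALSE keeps the parentheses and hyphens.
all_data <- data.frame(subject_id = c(2, 4),
                       activity = c("SITTING", "WALKING"),
                       `tBodyAcc-mean()-X` = c(0.1, 0.2),
                       `tBodyAcc-sd()-X` = c(-0.9, -0.8),
                       `tBodyAcc-max()-X` = c(0.5, 0.6),
                       `tBodyAcc-meanFreq()-X` = c(0.3, 0.4),
                       check.names = FALSE, stringsAsFactors = FALSE)

# fixed = TRUE treats the pattern as a literal string, so the parentheses
# are matched as-is and 'meanFreq()' is not picked up by 'mean()'.
mean_col_names  <- grep("mean()", names(all_data), fixed = TRUE, value = TRUE)
stdev_col_names <- grep("sd()",   names(all_data), fixed = TRUE, value = TRUE)

# Keep the id columns plus the selected measures in a single subset.
mean_sd_data <- all_data[, c("subject_id", "activity", mean_col_names, stdev_col_names)]
```

Subsetting by name avoids the placeholder row_num column entirely, because the result inherits its row count from the source data frame.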

##Part 5
###A second, independent tidy data set with the average of each variable for each activity and each subject is then created.

  • The data.table package is loaded
  • The resultant data set from Part 3 is converted into a data.table so that grouped calculations can be performed on it
  • A variable is set up to hold only the names of the columns for which means are required
  • The means are calculated across all numeric columns, grouped by activity and subject
  • The data is then ordered and written to a file
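The grouping steps above might look roughly like the following with data.table. This is a sketch under assumptions: the input values, the column names, the sort order and the output location (a temporary file) are all made up for illustration, and the data.table package must be installed.

```r
library(data.table)

# Toy labelled data set (made-up values).
labelled <- data.frame(subject_id = c(1, 1, 2, 2),
                       activity = c("WALKING", "WALKING", "SITTING", "SITTING"),
                       `tBodyAcc-mean()-X` = c(0.2, 0.4, 0.1, 0.3),
                       `tBodyAcc-sd()-X` = c(-0.9, -0.7, -0.6, -0.4),
                       check.names = FALSE, stringsAsFactors = FALSE)

# Convert to a data.table so grouped calculations can be expressed concisely.
dt <- as.data.table(labelled)

# Names of the columns to average: everything except the grouping columns.
measure_cols <- setdiff(names(dt), c("subject_id", "activity"))

# Mean of every measure column for each (activity, subject) pair.
tidy_means <- dt[, lapply(.SD, mean), by = .(activity, subject_id), .SDcols = measure_cols]

# Order the result, then write it out (a temp file is used here as a placeholder).
setorder(tidy_means, activity, subject_id)
out_file <- tempfile(fileext = ".txt")
write.table(tidy_means, out_file, row.names = FALSE)
```

Restricting the calculation with `.SDcols` keeps the grouping columns out of the `mean()` call, which would otherwise fail or warn on the character `activity` column.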
