Exploring One-Way ANOVA (Part III)
Calculations of One-Way ANOVA
Instructions:
This is a tough post, so for now just run this code on your own data. We will discuss the equations in class tomorrow, after the website.
‘When you see purple text, commit and state what you just did in your commit message.’
library(dplyr)  # provides %>%, mutate(), group_by(), summarise()
library(broom)  # provides tidy()
mod_mtcars = mtcars %>%
  mutate(cyl = as.factor(cyl))
Implementation of the One-Way ANOVA Model
model1 <- aov(mpg ~ cyl, mod_mtcars)
tidy(model1)
## # A tibble: 2 × 6
##   term         df sumsq meansq statistic  p.value
##   <chr>     <dbl> <dbl>  <dbl>     <dbl>    <dbl>
## 1 cyl           2  825.  412.       39.7  4.98e-9
## 2 Residuals    29  301.   10.4      NA   NA
We are going to explain each number in this table.
K: number of groups (distinct cylinder counts)
n: total sample size
In this problem, K = 3 (cylinder types: 4, 6, and 8) and n = 32 (cars).
K = 3
n = 32
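Rather than typing K and n by hand, both can be read off the data. The following is a minimal sketch using base R only; the name `cyl_factor` is introduced here for illustration and does not appear elsewhere in this post.

```r
# Sketch: derive K and n from the data instead of hard-coding them.
cyl_factor <- as.factor(mtcars$cyl)  # same conversion used to build mod_mtcars
K <- nlevels(cyl_factor)             # number of distinct cylinder groups
n <- nrow(mtcars)                    # number of cars (rows)
c(K = K, n = n)
```

This should reproduce K = 3 and n = 32, and it keeps working if you swap in your own data.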
‘Commit Here: state the sample size and number of groups’
The degrees of freedom of the cyl term is K - 1 = 2.
The degrees of freedom of the Residuals term is n - K = 29.
df_groups = K - 1
df_residuals = n - K
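As a quick sanity check, the two degrees of freedom should add up to the total degrees of freedom, n - 1. The symbol names below are shorthand chosen here to mirror the code variables.

```latex
\[
df_{\text{groups}} + df_{\text{residuals}} = (K - 1) + (n - K) = n - 1,
\qquad 2 + 29 = 31 .
\]
```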
‘Commit Here: determine the degrees of freedom’
Sum of Squares: cyl
overall_mean = mean(mod_mtcars$mpg, na.rm = TRUE)
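SS_groups_df = mod_mtcars %>%
  group_by(cyl) %>%
  # mean and size of each cylinder group
  summarise(group_means = mean(mpg, na.rm = TRUE), group_sample_size = n()) %>%
  # distance of each group mean from the overall mean, then squared
  mutate(mean_diff = group_means - overall_mean) %>%
  mutate(mean_diffsq = mean_diff^2) %>%
  # weight each squared distance by the group's sample size
  mutate(sam_mean_diffsq = group_sample_size * mean_diffsq)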
(SS_groups = sum(SS_groups_df$sam_mean_diffsq))
## [1] 824.7846
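Written as a formula, the pipeline above appears to compute the following. The symbols are notation assumed here: \(n_k\) is `group_sample_size`, \(\bar{y}_k\) is `group_means`, and \(\bar{y}\) is `overall_mean`.

```latex
\[
SS_{\text{groups}} = \sum_{k=1}^{K} n_k \left( \bar{y}_k - \bar{y} \right)^2
\]
```

Each group contributes its squared distance from the overall mean, weighted by how many cars are in that group.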
‘Commit Here: calculate the sum of squares of groups’
Sum of Squares: Residuals
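SS_residuals_df = mod_mtcars %>%
  group_by(cyl) %>%
  # standard deviation and size of each cylinder group
  summarise(sd_group = sd(mpg, na.rm = TRUE), group_sample_size = n()) %>%
  # (n_k - 1) * variance recovers each group's within-group sum of squares
  mutate(sp2 = (group_sample_size - 1) * (sd_group^2))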
(SS_residuals = sum(SS_residuals_df$sp2))
## [1] 301.2626
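The code above relies on the fact that a sample variance times its degrees of freedom gives back a sum of squared deviations. In notation assumed here (\(s_k\) is `sd_group`, \(n_k\) is `group_sample_size`, and \(y_{ik}\) is the mpg of car \(i\) in group \(k\)):

```latex
\[
SS_{\text{residuals}}
  = \sum_{k=1}^{K} (n_k - 1)\, s_k^2
  = \sum_{k=1}^{K} \sum_{i=1}^{n_k} \left( y_{ik} - \bar{y}_k \right)^2
\]
```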
‘Commit Here: calculate the sum of squares of residuals’
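One way to check both sums of squares is to confirm that they add up to the total sum of squares of mpg. The following is a rough sketch in base R; every name ending in `_check`, along with `y`, `g`, and `group_mean`, is introduced here for illustration only.

```r
# Sketch: verify SS_total = SS_groups + SS_residuals on mtcars.
y <- mtcars$mpg
g <- as.factor(mtcars$cyl)
group_mean <- ave(y, g)  # each car's own group mean (ave() defaults to mean)

SS_total_check     <- sum((y - mean(y))^2)           # spread around the overall mean
SS_groups_check    <- sum((group_mean - mean(y))^2)  # between-group part
SS_residuals_check <- sum((y - group_mean)^2)        # within-group part

all.equal(SS_total_check, SS_groups_check + SS_residuals_check)
```

If the decomposition holds, `all.equal()` returns TRUE, and the two parts should match the 824.7846 and 301.2626 computed above.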
Mean Square
Groups
(MS_groups = SS_groups/df_groups)
## [1] 412.3923
‘Commit Here: calculate the mean square of groups’
Residuals
(MS_residual = SS_residuals/df_residuals)
## [1] 10.38837
‘Commit Here: calculate the mean square of residuals’
Test Statistic
(TS = MS_groups / MS_residual)
## [1] 39.69752
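Putting the pieces together, the test statistic is the ratio of the two mean squares, each of which is a sum of squares divided by its degrees of freedom. The subscript names are shorthand assumed here to match the code variables.

```latex
\[
F = \frac{MS_{\text{groups}}}{MS_{\text{residual}}}
  = \frac{SS_{\text{groups}} / (K - 1)}{SS_{\text{residuals}} / (n - K)}
  = \frac{824.7846 / 2}{301.2626 / 29}
  = \frac{412.3923}{10.38837}
  \approx 39.70
\]
```

Under the null hypothesis that all group means are equal, this statistic follows an F distribution with K - 1 and n - K degrees of freedom, which is why those two numbers are passed to `pf()`.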
‘Commit Here: calculate the test statistic’
p-value
1 - pf(TS, df_groups, df_residuals)
## [1] 4.978919e-09
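An alternative worth knowing: `pf()` can return the upper-tail probability directly through its `lower.tail` argument, which avoids subtracting a number very close to 1. This is a small sketch; `TS_example` and `p_upper` are names made up here, and the statistic and degrees of freedom are copied from the results above.

```r
# Sketch: upper-tail F probability without computing 1 - pf(...).
TS_example <- 39.69752  # test statistic reported above
p_upper <- pf(TS_example, df1 = 2, df2 = 29, lower.tail = FALSE)
p_upper
```

For p-values this small, the two approaches should agree to the digits shown, but the `lower.tail = FALSE` form tends to be the safer habit when p-values get extremely close to zero.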
‘Commit Here: calculate the p-value’