# Effect of Vitamin C on Tooth Growth in Guinea Pigs

## Basic Inferential Data Analysis on ToothGrowth dataset (part of Statistical Inference by Johns Hopkins University)

This assignment was completed for the Statistical Inference course offered by Johns Hopkins University on Coursera, part of the Data Science Specialization.

Source code available on GitHub

## Overview

The goal is to conduct some simple hypothesis testing on the ToothGrowth dataset available in the R datasets package.

Some assumptions:

• equal variances among groups
• standard deviation estimated from the samples
• significance level α is set to 5%
• samples are not paired
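
Under these assumptions, each comparison is a pooled two-sample t-test. As a reminder, in standard textbook form (the symbols are introduced here and are not part of the original write-up: $\bar{x}$, $\bar{y}$ are the sample means, $s_x^2$, $s_y^2$ the sample variances, and $n_x$, $n_y$ the group sizes):

$$
s_p^2 = \frac{(n_x - 1)\,s_x^2 + (n_y - 1)\,s_y^2}{n_x + n_y - 2}, \qquad
t = \frac{\bar{x} - \bar{y}}{s_p \sqrt{\dfrac{1}{n_x} + \dfrac{1}{n_y}}}, \qquad
\mathrm{df} = n_x + n_y - 2
$$

When both groups have the same size, $s_p^2$ reduces to $(s_x^2 + s_y^2)/2$, which is the square of the quantity `p.sd` computed in the code further down. The null hypothesis of equal means is rejected when the two-sided p-value falls below $\alpha = 0.05$.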

## Data processing

We import the data and directly set the dose as a factor.

library(ggplot2)
library(datasets)
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)
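
Because the power calculations later on depend on the number of animals per group, it can help to check how the observations are distributed over the design. A minimal sketch (the name `counts` is introduced here for illustration and is not part of the original analysis):

```r
library(datasets)

# Reload the data so that this snippet runs on its own
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)

# Cross-tabulate delivery method (rows) against dose (columns)
counts <- table(tg$supp, tg$dose)
print(counts)

# Marginal group sizes for the delivery-method and dose comparisons
print(table(tg$supp))
print(table(tg$dose))
```

The design is balanced, so every combination of delivery method and dose contributes the same number of observations.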


A quick glimpse at the data:

str(tg)

## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: Factor w/ 3 levels "0.5","1","2": 1 1 1 1 1 1 1 1 1 1 ...

### Has the delivery method an impact on tooth growth?

# x is restored from context: supp has only the two levels "OJ" and "VC"
x = tg[tg$supp=="OJ", "len"]
y = tg[tg$supp=="VC", "len"]
# assumption: the original value of n was lost; the per-group sample size is used here
n = length(x)
delta = mean(x) - mean(y)
p.sd = sqrt((var(x)+var(y))/2)

t.res <- t.test(x, y, alternative = "two.sided", mu = 0, paired = FALSE, var.equal = TRUE)
p.res <- power.t.test(n, delta, p.sd, sig.level=0.05, type="two.sample", alternative="two.sided")

The p-value (6.04%) is larger than 5%, and the 95% confidence interval (-0.167, 7.567) covers the value 0. We therefore fail to reject the null hypothesis in this case.

### Has the dose an impact on tooth growth?

We test the difference in means between each pair of dosages (3 tests: 0.5 vs 1, 0.5 vs 2, 1 vs 2).

# n is the per-group sample size passed to power.t.test
# (note: each dose level has 20 observations in ToothGrowth)
n = 10
x = tg[tg$dose=="0.5", "len"]
y = tg[tg$dose=="1", "len"]
delta = mean(x) - mean(y)
p.sd = sqrt((var(x)+var(y))/2)

t.res.a <- t.test(x, y, alternative = "two.sided", mu = 0, paired = FALSE, var.equal = TRUE)
p.res.a <- power.t.test(n, delta, p.sd, sig.level=0.05, type="two.sample", alternative="two.sided")

n = 10
x = tg[tg$dose=="0.5", "len"]
y = tg[tg$dose=="2", "len"]
delta = mean(x) - mean(y)
p.sd = sqrt((var(x)+var(y))/2)

t.res.b <- t.test(x, y, alternative = "two.sided", mu = 0, paired = FALSE, var.equal = TRUE)
p.res.b <- power.t.test(n, delta, p.sd, sig.level=0.05, type="two.sample", alternative="two.sided")

n = 10
x = tg[tg$dose=="1", "len"]
y = tg[tg$dose=="2", "len"]
delta = mean(x) - mean(y)
p.sd = sqrt((var(x)+var(y))/2)

t.res.c <- t.test(x, y, alternative = "two.sided", mu = 0, paired = FALSE, var.equal = TRUE)
p.res.c <- power.t.test(n, delta, p.sd, sig.level=0.05, type="two.sample", alternative="two.sided")
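
The three comparisons above repeat the same steps, so they could also be expressed with a small helper. The sketch below is one possible way to do this and to assemble a summary table; the function `compare_doses` and the object `res` are hypothetical names, and `n = 10` simply mirrors the value used above.

```r
library(datasets)

tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)

# Hypothetical helper: pooled two-sample t-test plus power for two dose levels
compare_doses <- function(data, low, high, n = 10, alpha = 0.05) {
  x <- data[data$dose == low, "len"]
  y <- data[data$dose == high, "len"]
  delta <- mean(x) - mean(y)
  p.sd <- sqrt((var(x) + var(y)) / 2)  # pooled SD for equal group sizes

  t.res <- t.test(x, y, alternative = "two.sided", mu = 0,
                  paired = FALSE, var.equal = TRUE)
  # abs(delta): only the magnitude matters for a two-sided power calculation
  p.res <- power.t.test(n = n, delta = abs(delta), sd = p.sd,
                        sig.level = alpha, type = "two.sample",
                        alternative = "two.sided")

  c("p-value"           = t.res$p.value,
    "conf-interval-low" = t.res$conf.int[1],
    "conf-interval-up"  = t.res$conf.int[2],
    "power"             = p.res$power)
}

# One column per pairwise comparison
res <- cbind(
  dose.0.5v1 = compare_doses(tg, "0.5", "1"),
  dose.0.5v2 = compare_doses(tg, "0.5", "2"),
  dose.1v2   = compare_doses(tg, "1", "2")
)
print(res)
```

Keeping the computation in one function makes it harder for the three chunks to drift apart, for example through a mistyped dose level.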

##                      dose.0.5v1      dose0.5v2      dose.1v2
## p-value            1.266297e-07  2.837553e-14  1.810829e-05
## conf-interval-low -1.198375e+01 -1.815352e+01 -8.994387e+00
## conf-interval-up  -6.276252e+00 -1.283648e+01 -3.735613e+00
## power              9.909607e-01  1.000000e+00  9.057799e-01


## Conclusions

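We failed to reject the null hypothesis that the delivery method (OJ vs VC) has no effect on mean tooth growth: at the 5% significance level the p-value was 6.04% and the confidence interval included 0.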

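The dose, in contrast, had a statistically significant effect: all three pairwise tests rejected the null hypothesis of equal means, with p-values far below 5% and confidence intervals that exclude 0. In each comparison the higher dose was associated with greater mean tooth growth.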
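
To see the direction of these effects rather than only their significance, the group means can be tabulated. A minimal sketch (the names `mean.by.dose` and `mean.by.supp` are introduced here for illustration):

```r
library(datasets)

tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)

# Mean tooth length per dose level and per delivery method
mean.by.dose <- tapply(tg$len, tg$dose, mean)
mean.by.supp <- tapply(tg$len, tg$supp, mean)

print(mean.by.dose)
print(mean.by.supp)
```

Mean tooth length increases with each dose level, which matches the negative confidence intervals reported for the lower-minus-higher dose differences.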
