HS/HSL R Workshop Live Code


Statistical analyses of a built-in dataset

  1. load the dataset called mtcars into your current workspace (it comes with R by default)

    data("mtcars")
  2. show the first few rows of the mtcars data

    head(mtcars)
  3. view the mtcars data

    View(mtcars)
  4. show the documentation for the mtcars data

    help(mtcars)
  5. summarize the mtcars data

    summary(mtcars)
  6. access a single column of data, the mpg column

    mtcars$mpg
  7. access a single column of data, the wt column

    mtcars$wt
  8. perform a T-test comparing mpg between two groups of cars
    the tilde "~" means "explained by", so the following tests whether mpg differs by car transmission type (am: 0 = automatic, 1 = manual)

    t.test(mpg ~ am, data=mtcars)
  9. assign the T-test result into a variable

    tt = t.test(mpg ~ am, data=mtcars)
  10. show the T-test on demand

    tt
  11. extract only the p-value

    tt$p.value
  12. extract only the confidence interval

    tt$conf.int
  13. perform a correlation test over two variables, mpg and wt

    cor.test(mtcars$mpg, mtcars$wt)
  14. assign the correlation test result into a variable

    ct = cor.test(mtcars$mpg, mtcars$wt)
  15. show the correlation test on demand

    ct
  16. extract only the p-value

    ct$p.value
  17. extract only the estimate

    ct$estimate
  18. extract only the confidence interval

    ct$conf.int
  19. create a linear model showing mpg explained by wt

    fit = lm(mpg ~ wt, mtcars)
  20. summarize the fit

    summary(fit)
  21. extract the matrix of coefficients

    coef(summary(fit))
  22. store the matrix of coefficients in a variable

    co = coef(summary(fit))
  23. get the first column (the coefficient estimates)

    co[, 1]
  24. get the fourth column (the p-values)

    co[, 4]
  25. use the predict function for our existing cars

    predict(fit)
  26. predict for a car at 4500 pounds
    first, summarize the fit to read off the coefficients

    summary(fit)
  27. add together the intercept term (37.2851) and the weight coefficient
    (-5.3445) times our new weight, which is 4.5 (wt is measured in thousands of pounds)

    37.2851 + (-5.3445) * 4.5
  28. use the built-in predict function to get the same answer as above
    first, create a data frame containing the predictor values we wish to use (wt = 4.5, i.e. 4500 lbs)

    newcar = data.frame(wt=4.5)
  29. pass the predict function the new data frame

    predict(fit, newcar)
  30. plot the data with the fitted linear model line (requires the ggplot2 package, which must be loaded first)

    library(ggplot2)
    plot12 <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + geom_smooth(method="lm") + ggtitle("Linear model of the relationship between car weight and efficiency")
  31. show plot

    plot12
  32. save the plot as a .png in your current working directory

    ggsave(filename="cars.png", plot12)
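
Steps 27 to 29 above type the rounded coefficients in by hand. As a minimal sketch, the same prediction can be computed by pulling the coefficients out of the fitted model, which avoids rounding and copy errors. The variable names `b`, `new_wt`, `manual` and `auto` are made up for this illustration and are not part of the workshop script.

```r
# Refit the model from step 19 so this block runs on its own.
fit <- lm(mpg ~ wt, data = mtcars)

# coef() on the fit returns a named vector: "(Intercept)" and "wt".
b <- coef(fit)

# Hypothetical new car: wt is in thousands of pounds, so 4.5 means 4500 lbs.
new_wt <- 4.5

# Manual prediction: intercept plus slope times weight, at full precision.
manual <- unname(b["(Intercept)"] + b["wt"] * new_wt)

# The same prediction from the built-in predict function.
auto <- unname(predict(fit, newdata = data.frame(wt = new_wt)))

# Compare the two values with a numeric tolerance.
all.equal(manual, auto)
```

The hand calculation in step 27 should agree with both values up to the rounding of the coefficients shown by summary(fit).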