fastfood Assignment Description
For this assignment, you must name your R file fastfood.R
For all questions you should load tidyverse and openintro. You should not need to use any other libraries.
If the tidyverse package is not installed, you’ll need to do a one-time installation from the Console Window in RStudio like this:
install.packages("tidyverse")
You cannot attempt to install packages in code that you submit to CodeGrade.
If the openintro package is not installed, you’ll need to do a one-time installation from the Console Window in RStudio like this:
install.packages("openintro")
Load tidyverse with:
suppressPackageStartupMessages(library(tidyverse))
Load openintro with:
suppressPackageStartupMessages(library(openintro))
The actual data set is called fastfood.
Round all float/dbl values to two decimal places.
All statistics should be run with variables in the order I state.
E.g., “Run a regression predicting mileage from mpg, make, and type” would be:
lm(mileage ~ mpg + make + type…)
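As a minimal sketch of the ordering rule, the example below uses R's built-in mtcars data, which is not part of this assignment and is chosen only because it needs no extra packages. The predictors (wt, hp, qsec) are illustrative assumptions.

```r
# Sketch: predictors enter the formula in the order they are stated.
# mtcars ships with base R and is used here only for illustration.
# "Predict mpg from wt, hp, and qsec" becomes:
model <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Coefficient names follow the stated order, after the intercept
coef_names <- names(coef(model))
print(coef_names)

# Reported values rounded to two decimal places
print(round(coef(model), 2))
```

Changing the order of the predictors does not change the fit, but it does change the order of the coefficients in the output.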
Before attempting to answer these, please review all CodeGrade information provided in the CodeGrade: Intro module. If you do not, you are likely to lose points.
To access the fastfood data, run the following:
fastfood <- openintro::fastfood
Looking only at Burger King and Chick-Fil-A, which item has the highest calories?
The answer may be a tibble or a dataframe. Assign it to Q1. Note: CodeGrade is only concerned with the name of the item.
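One possible pattern for this kind of question, sketched on a hypothetical toy tibble (the restaurant names, items, and values below are made up and are not the fastfood data), is to filter to the restaurants of interest and then keep the row with the largest value.

```r
library(dplyr)

# Hypothetical toy data; not the real fastfood values
toy <- tibble(
  restaurant = c("A", "A", "B", "C"),
  item       = c("a1", "a2", "b1", "c1"),
  calories   = c(300, 750, 520, 900)
)

top_item <- toy %>%
  filter(restaurant %in% c("A", "B")) %>%  # keep only the two restaurants
  slice_max(calories, n = 1) %>%           # row with the largest calories
  select(item)                             # keep only the item name

print(top_item)
```

Note that item c1 has the most calories overall but is excluded by the filter.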
What is the mean sugar amount for all items from Subway?
Assign the answer to Q2.
What is the mean value of calories for all items from Taco Bell?
Assign the answer to Q3.
Create a variable named fatXsugar equal to total_fat multiplied by sugar. Produce a tibble that has the restaurant, item, and fatXsugar for the 3 items with the highest fatXsugar, sorted from highest to lowest.
Your answer should be a 3 x 3 tibble assigned to Q4 and look something like this:
# A tibble: 3 x 3
  restaurant item   fatXsugar
1 [name]     [name] [value]
2 [name]     [name] [value]
3 [name]     [name] [value]
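The general shape of a "create a variable, then keep the top rows" pipeline can be sketched as follows. The toy tibble and its values are hypothetical; only the column names mirror the ones named in the question.

```r
library(dplyr)

# Hypothetical toy data; not the real fastfood values
toy <- tibble(
  restaurant = c("A", "A", "B", "B", "C"),
  item       = c("a1", "a2", "b1", "b2", "c1"),
  total_fat  = c(10, 20, 5, 30, 8),
  sugar      = c(4, 3, 10, 1, 1)
)

top3 <- toy %>%
  mutate(fatXsugar = total_fat * sugar) %>%  # new variable: product of two columns
  select(restaurant, item, fatXsugar) %>%    # keep only the requested columns
  arrange(desc(fatXsugar)) %>%               # highest to lowest
  head(3)                                    # first three rows

print(top3)
```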
How many restaurants have an average saturated fat over 10?
Your answer should be one integer assigned to Q5.
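A sketch of counting groups whose average exceeds a threshold, using a hypothetical toy tibble (the sat_fat column name and all values here are assumptions made for illustration):

```r
library(dplyr)

# Hypothetical toy data; not the real fastfood values
toy <- tibble(
  restaurant = c("A", "A", "B", "B", "C", "C"),
  sat_fat    = c(12, 14, 5, 9, 11, 10)
)

n_over <- toy %>%
  group_by(restaurant) %>%
  summarise(mean_sat_fat = mean(sat_fat)) %>%  # one row per restaurant
  filter(mean_sat_fat > 10) %>%                # keep averages above the threshold
  nrow()                                       # nrow() returns an integer

print(n_over)
```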
pizza Assignment Description
For this assignment, name your R file pizza.R
For all questions you should load tidyverse. You should not need to use any other libraries.
If the tidyverse package is not installed, you’ll need to do a one-time installation from the Console Window in RStudio like this:
install.packages("tidyverse")
You cannot attempt to install packages in code that you submit to CodeGrade.
Load tidyverse with:
suppressPackageStartupMessages(library(tidyverse))
Download the pizza.csv file from Brightspace and place it in the same folder/directory as your script file. Then in RStudio, set your Working Directory to your Source File location (Session > Set Working Directory > To Source File Location).
Load the pizza.csv file like this:
pizza <- read_csv('pizza.csv')
Round all float/dbl values to two decimal places.
If your rounding does not work the way you expect, convert the tibble to a dataframe by using as.data.frame()
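The rounding note above can be illustrated with a small sketch on made-up numbers. The assumption here is that the surprise comes from how tibbles are printed: by default a tibble shows roughly three significant digits, so a correctly rounded value may still look unrounded on screen.

```r
library(dplyr)

# Hypothetical values chosen to show the display difference
vals <- tibble(x = c(1234.5678, 2.3456))

rounded <- vals %>%
  mutate(x = round(x, 2))   # the stored values are rounded to two decimals

# Printing the tibble may abbreviate the first value
print(rounded)

# Converting to a data frame prints the stored values directly
rounded_df <- as.data.frame(rounded)
print(rounded_df)
```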
All statistics should be run with variables in the order I state.
E.g., “Run a regression predicting mileage from mpg, make, and type” would be:
lm(mileage ~ mpg + make + type…)
In each of these you must use at least two dplyr functions. You may use Google to look up how to do certain aspects.
Before attempting to answer these, please review all CodeGrade information provided in the CodeGrade: Intro module. If you do not, you are likely to lose points.
Create a tibble containing the driver names for orders where free_wine = 1, discount_customer = 1, and the order contained more than 4 pizzas. (There will be repeated names.)
Assign the tibble to Q1.
The list portion of your answer (which CodeGrade will be looking at) should look something like this:
1 [value]
2 [value]
3 [value]
4 [value]
5 [value]
6 [value]
7 [value]
8 [value]
9 [value]
Create a variable named ratio that is the ratio of bill to pizzas. What is the mean of that value (call the value mean_ratio)?
Your answer should be a 1×1 dataframe. Assign this to Q2.
For each day of the week, what is the variance in pizzas?
The created values should be called var_pizzas.
The answer should be a dataframe assigned to Q3 and look something like this:
1 Friday [value]
2 Monday [value]
3 Saturday [value]
4 Sunday [value]
5 Thursday [value]
6 Tuesday [value]
7 Wednesday [value]
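A grouped summary of this shape might be sketched as below. The toy tibble is hypothetical (two days, made-up pizza counts); only the day, pizzas, and var_pizzas names come from the question.

```r
library(dplyr)

# Hypothetical toy data; not the real pizza.csv values
toy <- tibble(
  day    = c("Monday", "Monday", "Tuesday", "Tuesday", "Tuesday"),
  pizzas = c(1, 3, 2, 2, 2)
)

by_day <- toy %>%
  group_by(day) %>%
  summarise(var_pizzas = round(var(pizzas), 2)) %>%  # sample variance per day
  as.data.frame()                                    # plain data frame output

print(by_day)
```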
Which operator had the higher average bill?
The answer should be a tibble assigned to Q4 and look something like this:
  operator
1 [name]
What was the highest amount of free wine given by day/driver combination? (For instance, Friday Bruno was 13, while Wednesday Salvator was 12)
The answer should be a tibble assigned to Q5 and look something like this:
  day   driver n
1 [day] [name] [value]
Depending on how you do this, you might need to convert a double (<dbl>) column to an integer (<int>). You can convert a variable using as.integer().
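The sketch below shows one way a two-variable grouping and an integer conversion can fit together. It assumes that "amount" means the sum of a 0/1 free_wine column, which is only one reading of the question, and it runs on a hypothetical toy tibble.

```r
library(dplyr)

# Hypothetical toy data; not the real pizza.csv values
toy <- tibble(
  day       = c("Mon", "Mon", "Mon", "Tue", "Tue"),
  driver    = c("X", "X", "Y", "X", "Y"),
  free_wine = c(1, 1, 1, 0, 1)
)

top <- toy %>%
  group_by(day, driver) %>%
  summarise(n = sum(free_wine), .groups = "drop") %>%  # sum() of doubles is a double
  slice_max(n, n = 1) %>%                              # keep the highest total
  mutate(n = as.integer(n))                            # convert <dbl> to <int>

print(top)
```

If the totals are produced with count() instead of sum(), the n column is already an integer and the conversion is unnecessary.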
fastfoodStats Assignment Description
For this assignment, name your R file fastfoodStats.R
For all questions you should load tidyverse, openintro, and lm.beta. You should not need to use any other libraries.
If the tidyverse package is not installed, you’ll need to do a one-time installation from the Console Window in RStudio like this:
install.packages("tidyverse")
You cannot attempt to install packages in code that you submit to CodeGrade.
If the openintro package is not installed, you’ll need to do a one-time installation from the Console Window in RStudio like this:
install.packages("openintro")
If the lm.beta package is not installed, you’ll need to do a one-time installation from the Console Window in RStudio like this:
install.packages("lm.beta")
Load tidyverse with:
suppressPackageStartupMessages(library(tidyverse))
Load openintro with:
suppressPackageStartupMessages(library(openintro))
Load lm.beta with:
suppressPackageStartupMessages(library(lm.beta))
The actual data set is called fastfood.
Round all float/dbl values to two decimal places.
All statistics should be run with variables in the order I state.
E.g., “Run a regression predicting mileage from mpg, make, and type” would be:
lm(mileage ~ mpg + make + type…)
Before attempting to answer these, please review all CodeGrade information provided in the CodeGrade: Intro module. If you do not, you are likely to lose points.
To access the fastfood data, run the following:
fastfood <- openintro::fastfood
Create a correlation matrix for the relations between calories, total_fat, sugar, and calcium for all items at Sonic, Subway, and Taco Bell, omitting missing values with na.omit().
Assign the matrix to Q1. It should look something like this:
          calories total_fat sugar   calcium
calories  [value]  [value]   [value] [value]
total_fat [value]  [value]   [value] [value]
sugar     [value]  [value]   [value] [value]
calcium   [value]  [value]   [value] [value]
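A correlation matrix with missing values dropped can be sketched in base R as follows. The data frame and its column names (a, b, c) are hypothetical; the point is only the order of operations: remove incomplete rows, correlate, then round.

```r
# Hypothetical toy data with one missing value
toy <- data.frame(
  a = c(1, 2, 3, 4, NA),
  b = c(2, 4, 6, 8, 10),
  c = c(5, 3, 4, 1, 2)
)

# na.omit() drops every row that contains an NA (here, the fifth row)
complete_rows <- na.omit(toy)

# cor() on a data frame of numeric columns returns a square matrix
cor_matrix <- round(cor(complete_rows), 2)

print(cor_matrix)
```

Row and column names of the result follow the column order of the data frame, so selecting the columns in the stated order controls the layout of the matrix.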
Create a regression predicting whether or not a restaurant is McDonalds or Subway based on calories, sodium, and protein. (McDonalds should be 1, Subway 0) Hint: make sure you know how McDonalds is spelled in the dataset.
Assign the model coefficients to Q2. Your output should look something like this:
(Intercept) calories sodium  protein
[value]     [value]  [value] [value]
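The sketch below assumes that a binary (1/0) outcome is meant to be modeled with a logistic regression via glm(); the assignment text does not name the function, so treat that as an assumption. It uses R's built-in iris data purely as a stand-in: keep two groups, code one of them as 1, and extract the coefficients.

```r
# iris ships with base R and is used here only for illustration
two_groups <- subset(iris, Species %in% c("versicolor", "virginica"))

# Code the outcome as 1/0 (virginica = 1, versicolor = 0)
two_groups$is_virginica <- as.integer(two_groups$Species == "virginica")

# Logistic regression; predictors in the stated order
model <- glm(is_virginica ~ Sepal.Length + Sepal.Width,
             data = two_groups, family = binomial)

# Named numeric vector of coefficients, rounded to two decimals
coefs <- round(coef(model), 2)
print(coefs)
```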
Run the same regression as in Q2 but remove sodium as a predictor. Which is the better model?
Use the classical AIC (k=2).
Assign the AIC of the better model to Q3.
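Comparing two nested models by AIC can be sketched as below, again on the built-in mtcars data rather than the assignment data. The lower AIC indicates the preferred model, and k = 2 is the default penalty in AIC().

```r
# mtcars ships with base R and is used here only for illustration
full    <- glm(vs ~ wt + disp, data = mtcars, family = binomial)
reduced <- glm(vs ~ wt,        data = mtcars, family = binomial)

# Classical AIC uses a penalty of k = 2 per parameter
aics <- c(full = AIC(full, k = 2), reduced = AIC(reduced, k = 2))

# The model with the smaller AIC is preferred
better   <- names(which.min(aics))
best_aic <- round(min(aics), 2)

print(better)
print(best_aic)
```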
Run a regression predicting calories from saturated fat, fiber, and sugar. Based on standardized regression coefficients, identify the strongest predictor.
Assign the unstandardized regression coefficient of the strongest predictor to Q4.
(You can access the coefficients by indexing the model object.)
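To make the distinction between standardized and unstandardized coefficients concrete, here is a base R sketch that does not call lm.beta(). It assumes all predictors are numeric, in which case a standardized coefficient is the raw coefficient multiplied by sd(x) / sd(y). The mtcars data and the chosen variables are illustrative only.

```r
# mtcars ships with base R and is used here only for illustration
model <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Unstandardized slopes (drop the intercept)
b <- coef(model)[-1]

# Standardized slopes for numeric predictors: b * sd(x) / sd(y)
sds   <- sapply(mtcars[, names(b)], sd)
std_b <- b * sds / sd(mtcars$mpg)

# Strongest predictor = largest absolute standardized coefficient
strongest <- names(which.max(abs(std_b)))

# Index the model's coefficients by name to get the unstandardized value
unstd <- round(coef(model)[strongest], 2)

print(strongest)
print(unstd)
```

The strongest predictor is chosen on the standardized scale, but the value reported at the end is read from the original, unstandardized model.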
For this question, use data only from restaurants with between 50 and 60 items in the data set. Predict total fat from cholesterol, total carbs, vitamin A, and restaurant. Remove any nonsignificant predictors and run the regression again.
Assign the strongest standardized regression coefficient to Q5. Your output should look something like this:
[variable name]
[value]
pizzaStats Assignment Description
For this assignment, name your R file pizzaStats.R
For all questions you should load tidyverse and lm.beta. You should not need to use any other libraries.
If the tidyverse package is not installed, you’ll need to do a one-time installation from the Console Window in RStudio like this:
install.packages("tidyverse")
You cannot attempt to install packages in code that you submit to CodeGrade.
If the lm.beta package is not installed, you’ll need to do a one-time installation from the Console Window in RStudio like this:
install.packages("lm.beta")
Load tidyverse with:
suppressPackageStartupMessages(library(tidyverse))
Load lm.beta with:
suppressPackageStartupMessages(library(lm.beta))
Download the pizza.csv file from Brightspace and place it in the same folder/directory as your script file. Then in RStudio, set your Working Directory to your Source File location (Session > Set Working Directory > To Source File Location).
Load the pizza.csv file like this:
pizza <- read_csv('pizza.csv')
Round all float/dbl values to two decimal places unless otherwise specified.
All statistics should be run with variables in the order I state.
E.g., “Run a regression predicting mileage from mpg, make, and type” would be:
lm(mileage ~ mpg + make + type…)
Before attempting to answer these, please review all CodeGrade information provided in the CodeGrade: Intro module. If you do not, you are likely to lose points.
Create a correlation matrix for temperature, bill, pizzas, and got_wine.
Assign the matrix to Q1. It should look something like this:
            temperature bill    pizzas  got_wine
temperature [value]     [value] [value] [value]
bill        [value]     [value] [value] [value]
pizzas      [value]     [value] [value] [value]
got_wine    [value]     [value] [value] [value]
Create a correlation matrix of the relationships between time, temperature, bill, and pizzas for Laura in the East branch.
Assign the matrix to Q2. It should look something like this:
            time    temperature bill    pizzas
time        [value] [value]     [value] [value]
temperature [value] [value]     [value] [value]
bill        [value] [value]     [value] [value]
pizzas      [value] [value]     [value] [value]
Run a regression predicting whether or not wine was ordered from temperature, bill, and pizzas.
Assign the coefficients of the summary of the model to Q3. It should look something like this:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) [value]  [value]    [value] [value]
temperature [value]  [value]    [value] [value]
bill        [value]  [value]    [value] [value]
pizzas      [value]  [value]    [value] [value]
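The "coefficients of the summary" of a model form a matrix with one row per term. A sketch on the built-in mtcars data (not the pizza data), assuming a logistic regression fitted with glm(), which is what produces a z value column:

```r
# mtcars ships with base R and is used here only for illustration
model <- glm(vs ~ wt + disp, data = mtcars, family = binomial)

# summary() returns a list; its coefficients element is a numeric matrix
coef_table <- round(summary(model)$coefficients, 2)

print(coef_table)
```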
Run a regression predicting bill from temperature, pizzas, and got_wine.
Assign the standardized regression coefficients to Q4 by using the lm.beta() function. You should not round these values. The output should look something like this:
Call:
lm(formula = [label] ~ [label] + [label] + [label], data = [label])
Standardized Coefficients::
(Intercept) temperature pizzas  got_wine
[value]     [value]     [value] [value]
Note: CodeGrade will be grading you based on the last line (just the values).
Add operator to the regression from Q4. Which is the better model?
Assign the better model’s AIC to Q5.
Use the classical AIC (k=2).