This assessment has 6 multi-part questions that will all use the setup below.
1) Game attendance in baseball varies partly as a function of how well a team is playing. Load the Lahman library. The Teams data frame contains an attendance column. This is the total attendance for the season. To calculate average attendance, divide by the number of games played, as follows:
library(tidyverse)
library(broom)
library(Lahman)
Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance/G)
Use linear models to answer the following 3-part question about Teams_small.
Use runs (R) per game to predict average attendance.
For every 1 run scored per game, average attendance increases by how much?
Use home runs (HR) per game to predict average attendance.
For every 1 home run hit per game, average attendance increases by how much?
2)
Game wins, runs per game and home runs per game are positively correlated with attendance. We saw in the course material that runs per game and home runs per game are correlated with each other. Are wins and runs per game or wins and home runs per game correlated?
What is the correlation coefficient for wins and runs per game?
What is the correlation coefficient for wins and home runs per game?
Answer:
The following R code answers both questions; the output is shown along with each command:
> rpg=Teams_small$R/Teams_small$G #defining runs per game
> hrpg=Teams_small$HR/Teams_small$G #defining home runs per game
> m1=lm(Teams_small$avg_attendance~rpg) #in general, model_name=lm(response_variable~predictor)
> m1
Call:
lm(formula = Teams_small$avg_attendance ~ rpg)
Coefficients:
(Intercept)          rpg
      -7213         4117
> m2=lm(Teams_small$avg_attendance~hrpg) #in general, model_name=lm(response_variable~predictor)
> m2
Call:
lm(formula = Teams_small$avg_attendance ~ hrpg)
Coefficients:
(Intercept)         hrpg
       3783         8113
> cor(Teams_small$W,rpg) #correlation coefficient between wins and runs per game
[1] 0.4116491
> cor(Teams_small$W,hrpg) #correlation coefficient between wins and home runs per game
[1] 0.2744313
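The same two models can also be fit with the data argument of lm() and summarized with broom::tidy(), which the setup already loads. This is a sketch of an equivalent approach, not the code used in the answer above; the column names R_per_game and HR_per_game are illustrative choices that do not appear in the assignment.

```r
library(tidyverse)
library(broom)
library(Lahman)

# Same setup as in the question, plus two per-game columns
# (R_per_game and HR_per_game are illustrative names).
Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G,
         R_per_game = R / G,
         HR_per_game = HR / G)

# Response on the left of ~, predictor on the right.
fit_runs <- lm(avg_attendance ~ R_per_game, data = Teams_small)
fit_hr   <- lm(avg_attendance ~ HR_per_game, data = Teams_small)

# tidy() returns one row per coefficient, with estimate, std.error,
# statistic and p.value columns.
tidy(fit_runs)
tidy(fit_hr)
```

Passing data = Teams_small keeps the formula free of Teams_small$ prefixes, and the estimate column of each tidy() result should correspond to the coefficients printed for m1 and m2.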
Question 1:
From the coefficients of model m1, the regression equation for predicting average attendance from runs per game is:
avg_attendance = -7213 + 4117 * runs_per_game
The coefficient for runs per game (rpg) in model m1 is 4117.
Interpretation: for every increase of 1 run scored per game, the predicted average attendance increases by about 4117 people per game.
From the coefficients of model m2, the regression equation for predicting average attendance from home runs per game is:
avg_attendance = 3783 + 8113 * home_runs_per_game
The coefficient for home runs per game (hrpg) in model m2 is 8113.
Interpretation: for every increase of 1 home run hit per game, the predicted average attendance increases by about 8113 people per game.
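As a quick check on what the slopes mean, the two fitted lines can be evaluated by hand. The sketch below uses the rounded coefficients printed for m1 and m2; the function names predict_from_runs and predict_from_hr are made up for illustration, and the input values (4, 5, 1.0, 1.1) are arbitrary.

```r
# Fitted lines, using the rounded coefficients printed for m1 and m2.
predict_from_runs <- function(rpg)  -7213 + 4117 * rpg
predict_from_hr   <- function(hrpg)  3783 + 8113 * hrpg

# Moving from 4 to 5 runs per game changes the prediction by exactly the slope.
predict_from_runs(5) - predict_from_runs(4)   # 4117

# A smaller step: 0.1 extra home runs per game gives one tenth of the slope.
predict_from_hr(1.1) - predict_from_hr(1.0)   # about 811.3
```

The difference between two predictions depends only on the slope and the size of the step, not on the intercept.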
Question 2:
From the cor() output above:
The correlation coefficient between wins and runs per game is 0.4116, and
the correlation coefficient between wins and home runs per game is 0.2744.
Thus, both pairs are positively correlated: the correlation between wins and runs per game is moderate, while that between wins and home runs per game is weak.
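One rough way to compare the two strengths is to square each coefficient, which gives the share of the variance in wins that a simple linear fit on that variable would account for. The sketch below only does arithmetic on the two values reported by cor(); the vector names are illustrative.

```r
# Correlation coefficients as reported by cor() in the answer.
r <- c(wins_vs_runs_per_game = 0.4116491,
       wins_vs_home_runs_per_game = 0.2744313)

# Squared correlation = share of variance explained in a
# one-predictor linear regression.
r_squared <- r^2
round(r_squared, 3)   # 0.169 and 0.075
```

On this reading, runs per game accounts for roughly 17% of the variation in wins and home runs per game for under 8%.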