Instructions: The assignment involves conducting a regression and correlation analysis on any topic of your choosing

Subject: Math

Instructions:

The assignment involves conducting a regression and correlation analysis on any topic of your choosing. It must be based on yearly data for any economic or business variable, for a period of at least 20 years.

· The assignment should be three to five pages long.

· The assignment should distinguish between dependent and independent variables; determine the regression equation by the least squares method; plot the regression line on a scatter diagram; interpret the meaning of the regression coefficients; use the regression equation to predict values of the dependent variable for selected values of the independent variable; construct forecast intervals; and calculate the standard error of estimate, the coefficient of determination (r²), and the coefficient of correlation (r), interpreting the meaning of r² and r. The regression and correlation analysis must:

1. Graph the data (scatter diagram).

2. Use the method of least squares to derive a trend equation and trend values.

3. Use a check column to verify the computations: ∑(Y − Yc) = 0.

4. Superimpose the trend equation on the scatter diagram.

5. Use the model to predict the movement of the variable for the next year.

6. Compare the predictions with the actual behavior of the variable during the 21st year.

Answer Preview

Suppose you have N pairs of data points (X_1, Y_1), (X_2, Y_2), ... (X_N, Y_N) and you would like to know the relationship of the Y's to the X's and, if possible, predict one from the other. The X's and Y's could be whatever you like, but let's say for now that they are heights and weights. You would expect there to be a clear relationship between the two; on average, you would expect taller people to be heavier than shorter people.

A good way to start might be to model the relationship with a linear equation, Y = mX + b, where m is the slope of the line and b is the Y-intercept, as you learned in high school algebra. The question now is how best to estimate m and b, given your pairs of X's and Y's.

The standard way to do this is least squares regression. We call it least squares regression because the line that we choose is the one for which the sum of the squares of the differences between the predicted and observed values is as small as possible.
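As a rough illustration of the least squares idea, here is a minimal Python sketch that computes m and b from the usual closed-form expressions, m = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² and b = Ȳ − mX̄. The function name `least_squares_fit` and the height and weight figures are made up for this example; they are not data from the assignment.

```python
def least_squares_fit(xs, ys):
    """Return (m, b) for the line y = m*x + b that minimizes the sum of squared residuals."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of cross-products and squares about the means.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    m = sxy / sxx
    b = y_mean - m * x_mean
    return m, b


# Hypothetical heights (cm) and weights (kg), for illustration only.
heights = [150, 160, 165, 170, 180, 185]
weights = [52, 58, 63, 66, 75, 80]

m, b = least_squares_fit(heights, weights)
residuals = [y - (m * x + b) for x, y in zip(heights, weights)]

print(f"slope m = {m:.4f}, intercept b = {b:.4f}")
# For a least squares line with an intercept, the residuals sum to zero
# (up to floating-point rounding); this is the "check column" idea.
print(f"sum of residuals = {sum(residuals):.10f}")
```

The residual sum printed at the end should be zero apart from rounding error, which gives a quick way to verify hand computations.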

1. Scatter Plot of Y and X1

The scatter plot of sales against calls suggests a linear trend between the two. The trendline indicates that the higher the number of calls, the higher the sales tend to be.

2. Best fit line

Using the Regression option in the Excel Data Analysis menu, we obtain the following output.

From this, the best-fit line equation is Sales = Intercept + (Coefficient of Calls) * Calls

i.e., Sales = 22.52 + 0.1237 * Calls
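To show how the fitted equation is used for prediction, here is a small Python sketch that plugs call counts into Sales = 22.52 + 0.1237 * Calls. The coefficients come from the text above; the function name `predict_sales` and the call counts in the loop are assumptions chosen for illustration.

```python
INTERCEPT = 22.52  # intercept from the fitted equation in the text
SLOPE = 0.1237     # coefficient of Calls from the fitted equation in the text


def predict_sales(calls):
    """Point prediction of Sales for a given number of calls."""
    return INTERCEPT + SLOPE * calls


# Hypothetical call counts, chosen only to illustrate the calculation.
for calls in (50, 100, 150):
    print(f"calls = {calls:>3}: predicted sales = {predict_sales(calls):.2f}")
```

Each additional call raises the predicted sales by the slope, 0.1237, while the intercept, 22.52, is the predicted sales at zero calls.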

3. Coefficient of Correlation

It denotes the strength of association between two variables. The sign denotes the direction of association.

In Excel, we calculate the correlation coefficient as CORREL(X1Array, YArray).

This gives a value of 0.318.

This means that calls and sales are weakly positively associated: as one quantity increases, the other also tends to increase. Please note that this does not imply causation, i.e., we CANNOT say that a rise or fall in one is causing the change in the other.
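For readers working outside Excel, the same Pearson correlation coefficient can be sketched in plain Python. The function name `pearson_r` and the calls and sales figures below are hypothetical; the actual data set behind the value 0.318 is not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, for illustration only (not the assignment's data).
calls = [20, 40, 60, 80, 100, 120]
sales = [30, 22, 41, 28, 45, 33]

r = pearson_r(calls, sales)
print(f"r = {r:.3f}")
```

The result always lies between -1 and 1; the sign gives the direction of the association and the magnitude gives its strength.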

4. Coefficient of Determination

It is more commonly known as the R-squared value. It measures how close the data points are to the best-fit line. In other words, it gives the proportion of the variability in the dependent variable that can be explained by the independent variable. The higher the R-squared value, the better the model fits the data.

From the Excel regression output, we get an R-squared value (coefficient of determination) of 0.101; that is, calls explain about 10% of the variability in sales.
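In a regression with a single independent variable, the coefficient of determination equals the square of the correlation coefficient, which is why r = 0.318 and R squared = 0.101 agree. The sketch below computes R squared as 1 − SSE/SST on made-up data and then squares the reported r; the function name `r_squared` and the data are assumptions for illustration.

```python
def r_squared(xs, ys):
    """Coefficient of determination, 1 - SSE/SST, for a simple least squares line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # SSE: variation left unexplained by the line; SST: total variation in y.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_mean) ** 2 for y in ys)
    return 1 - sse / sst


# Hypothetical data, for illustration only (not the assignment's data).
calls = [20, 40, 60, 80, 100, 120]
sales = [30, 22, 41, 28, 45, 33]
print(f"R^2 on the illustrative data = {r_squared(calls, sales):.3f}")

# Squaring the correlation coefficient reported in the text.
reported_r = 0.318
print(f"r^2 = {reported_r ** 2:.3f}")  # prints r^2 = 0.101
```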

5. Utility of Regression model

The F test can be used to test the utility of the model.

Null Hypothesis: Beta coefficient of Calls = 0; i.e., Calls is NOT linearly associated with Sales

Alternate Hypothesis: Beta coefficient of Calls ≠ 0; i.e., Calls is linearly associated with Sales

Let us choose the significance level, α = 0.05.

From the regression ANOVA output, we get a p value (or significance value) for the F test of 0.0012 (< 0.05) for the given degrees of freedom (highlighted). We therefore reject the null hypothesis and conclude that Calls is linearly associated with Sales.
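The decision rule, and the way the F statistic is built from R squared in a one-predictor regression, can be sketched as follows. The sample size is not stated in the text, so `N_HYPOTHETICAL` is purely an assumption and the F value it produces will not correspond to the reported p value of 0.0012; the function names are also made up for this example.

```python
def f_statistic(r_squared, n):
    """F statistic for a one-predictor regression: (R^2 / 1) / ((1 - R^2) / (n - 2))."""
    if n <= 2:
        raise ValueError("need more than two observations")
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    return r_squared / ((1 - r_squared) / (n - 2))


def reject_null(p_value, alpha=0.05):
    """Reject the null hypothesis when the p value is below the significance level."""
    return p_value < alpha


R_SQUARED = 0.101    # reported in the text
P_VALUE = 0.0012     # reported in the text
N_HYPOTHETICAL = 50  # ASSUMPTION: the sample size is not given in the text

print(f"F with n = {N_HYPOTHETICAL} (hypothetical): "
      f"{f_statistic(R_SQUARED, N_HYPOTHETICAL):.2f}")
print(f"reject null at alpha = 0.05: {reject_null(P_VALUE)}")  # prints True
```

Turning an F statistic into a p value requires the F distribution with (1, n − 2) degrees of freedom, which Excel reports directly in the ANOVA table.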

The 95% confidence interval for the coefficient of Calls (β1) is [0.0498, 0.1976].

Interpretation: a 95% confidence interval means that if this regression analysis were repeated for other samples from the population, about 95% of the resulting intervals would contain the true value of β1. In simpler terms, we can say that we are 95% confident that the true value of β1 lies in our interval.

8. Sales = 22.52 + 0.1237 * Calls

Let us say calls = 100.

95% confidence interval for β1 = [0.0498, 0.1976]

Thus, lower limit of Sales value, Y_low = 22.52 + 0.0498 * 100 = 27.5

Upper limit of Sales value, Y_high = 22.52 + 0.1976 * 100 = 42.28

Thus, for calls = 100, Sales can be expected to lie in the range [27.5, 42.28], based on the uncertainty in the slope alone.
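The arithmetic above can be checked with a short Python sketch. The intercept and the slope limits are taken from the text; the function name `sales_range` is an assumption, and the calculation follows the text's approach of varying only the slope while holding the intercept fixed.

```python
INTERCEPT = 22.52    # intercept from the fitted equation in the text
SLOPE_LOW = 0.0498   # lower 95% limit for the coefficient of Calls
SLOPE_HIGH = 0.1976  # upper 95% limit for the coefficient of Calls


def sales_range(calls):
    """Sales range implied by the slope's confidence limits (intercept held fixed)."""
    if calls < 0:
        raise ValueError("calls must be non-negative")
    low = INTERCEPT + SLOPE_LOW * calls
    high = INTERCEPT + SLOPE_HIGH * calls
    return low, high


low, high = sales_range(100)
print(f"[{low:.2f}, {high:.2f}]")  # prints [27.50, 42.28]
```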

Please see the attached file for the complete solution
