
Descriptive Statistics and Data Visualization with R: The purpose of this assignment is to familiarize you with the foundations of data mining through descriptive statistics and data visualization in R

Subject: Business

Descriptive Statistics and Data Visualization with R:

  • The purpose of this assignment is to familiarize you with the foundations of data mining through descriptive statistics and data visualization in R.
  • In this assignment we will use R functions to get hands-on experience calculating correlations by graphical and numerical methods.
  • We will create correlation heatmaps to observe how the correlations among the variables are distributed, and calculate descriptive statistics for those correlation coefficients.
  • Please refer to the R code in Chapter 3 for creating Figures 3 and 7 in Data Mining for Business Analytics: Concepts, Techniques, and Applications in R (in this week's Reading & Resources). You may use the open resources on the publisher's website to recreate the same figures and become familiar with the contents of Chapter 3.
  • Then use the attached dataset NewYorkHousing.csv to answer the following questions. Please also open the second attached file for sample R code, which you can easily revise to generate Figures 3.1 to 3.4 and 3.5 to 3.8 as required in this assignment:
  1. Create a heatmap with values (running the sample R code will produce it).
  2. Calculate the minimum, maximum, median, and standard deviation of ALL the correlations, excluding the correlations equal to 1 in the diagonal cells of the heatmap. (Hints: use R functions instead of reading the values off the heatmap visually. summary(cor.mat) gives the min, max, and median, and sd() gives the standard deviation.)
  3. Create a scatterplot matrix (hint: use ggpairs in R) of MEDV with these predictors: INDUS, CHAS, NOX, RM, AGE, DIS, TAX. State which predictor has the strongest correlation with MEDV.
  4. Please copy and paste screenshots of your work in R into a Word document for submission. Be sure to provide a narrative of your answers (i.e., do not just copy and paste your answers without explaining what you did or what you found).
  5. Please make sure you run install.packages("????") before you call library(????); otherwise you will get errors.
  6. Please include an Introduction, R code with outputs, and figures with explanations, along with cover and reference pages. A good conclusion to wrap up the assignment is also expected.
  7. Please follow APA format.
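
As a rough illustration of questions 2 and 3, here is a minimal base-R sketch. It is not the course's sample code. Because the attached CSV is not reproduced here, it builds a small synthetic data frame (housing.df, a made-up stand-in) with the same column names; in the real assignment you would presumably load the file with read.csv("NewYorkHousing.csv") instead. The names off.diag, stats, medv.cor, and strongest are illustrative. One assumption to note: each pair of variables is counted once (lower triangle only). Counting both triangles leaves the min, max, and median unchanged but shifts the standard deviation slightly.

```r
# Synthetic stand-in data (an assumption for illustration only).
# For the assignment, replace this block with something like:
#   housing.df <- read.csv("NewYorkHousing.csv")
set.seed(1)
n   <- 100
RM  <- rnorm(n, mean = 6, sd = 0.7)
NOX <- runif(n, 0.4, 0.8)
housing.df <- data.frame(
  INDUS = runif(n, 1, 25),
  CHAS  = rep(c(0, 1), times = c(90, 10)),
  NOX   = NOX,
  RM    = RM,
  AGE   = runif(n, 10, 100),
  DIS   = runif(n, 1, 10),
  TAX   = runif(n, 200, 700),
  MEDV  = 5 * RM - 10 * NOX + rnorm(n)   # made-up relationship
)

# Correlation matrix of all variables (Pearson by default).
cor.mat <- cor(housing.df)

# Question 2: drop the diagonal (all 1s). The matrix is symmetric,
# so the lower triangle holds each pair of variables exactly once.
off.diag <- cor.mat[lower.tri(cor.mat)]

stats <- c(
  min    = min(off.diag),
  max    = max(off.diag),
  median = median(off.diag),
  sd     = sd(off.diag)
)
print(round(stats, 3))

# Question 3 (numerical check to accompany the scatterplot matrix):
# correlation of each predictor with MEDV, ranked by absolute value.
predictors <- c("INDUS", "CHAS", "NOX", "RM", "AGE", "DIS", "TAX")
medv.cor   <- cor.mat[predictors, "MEDV"]
print(round(sort(abs(medv.cor), decreasing = TRUE), 3))

strongest <- names(which.max(abs(medv.cor)))
cat("Strongest correlation with MEDV:", strongest, "\n")
```

Note that summary(cor.mat) on a matrix reports its statistics column by column and includes the diagonal 1s, so extracting the off-diagonal values into a single vector first, as above, is one way to get a single min, max, median, and standard deviation over all the correlations.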

References:

The R Guide (http://cran.fhcrc.org/doc/contrib/Owen-TheRGuide.pdf)
