How to develop a VBA program to determine the quality and category of each wine sample in the data collection?


Subject: Computer Science

How to develop a VBA program to determine the quality and category of each wine sample in the data collection?

Background info:

An organization collected physicochemical test results for red wine samples from the north of Portugal. The goal is to determine each wine's quality and category from the test results of the samples.

The template input file for this assignment contains two worksheets. The worksheet named "Raw Data" holds the test results of 1599 wine samples, and its title row gives the name of each attribute tested.

Note that the "Sample ID" is not part of the test result. Values in the "Sample ID" column are used to identify the samples.

When the researchers built the model to predict wine quality, they used standardized data (between 0 and 1) instead of raw data. To standardize the data, treat each column as a sequence and find its maximum (max_v) and minimum (min_v). Then, for each data item (data_v) in the sequence, the standardized value (standardized) is given by the following formula:

standardized = (data_v - min_v) / (max_v - min_v)
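A minimal VBA sketch of this formula, applied to one column held in an array. The function name `StandardizeValues` and the demo values are hypothetical, and mapping a constant column (max_v equal to min_v) to 0 is an assumption, since the assignment does not say how to handle that case.

```vba
Option Explicit

' Min-max standardization of one column of values.
' Returns an array with the same bounds as the input.
Function StandardizeValues(dataV() As Double) As Double()
    Dim result() As Double
    Dim minV As Double, maxV As Double
    Dim i As Long

    ' Find the minimum and maximum of the sequence
    minV = dataV(LBound(dataV))
    maxV = minV
    For i = LBound(dataV) To UBound(dataV)
        If dataV(i) < minV Then minV = dataV(i)
        If dataV(i) > maxV Then maxV = dataV(i)
    Next i

    ReDim result(LBound(dataV) To UBound(dataV))
    For i = LBound(dataV) To UBound(dataV)
        If maxV > minV Then
            result(i) = (dataV(i) - minV) / (maxV - minV)
        Else
            result(i) = 0   ' assumption: a constant column maps to 0
        End If
    Next i

    StandardizeValues = result
End Function

' Small demo with made-up values 4, 6, 10
Sub DemoStandardize()
    Dim raw(1 To 3) As Double
    Dim scaled() As Double
    raw(1) = 4: raw(2) = 6: raw(3) = 10
    scaled = StandardizeValues(raw)
    Debug.Print scaled(1), scaled(2), scaled(3)
End Sub
```

With the demo values, min_v is 4 and max_v is 10, so the three results are 0, about 0.333, and 1.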

The model the researchers have established is a linear regression model. It calculates the quality of each sample as the sum, over all attributes, of the sample's attribute value multiplied by that attribute's coefficient. The coefficients are in the worksheet named "Coefficients".
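The sum of products described above could be sketched in VBA as follows. `PredictQuality` is a hypothetical name, and the sketch assumes the attribute values and coefficients are stored in the same order and that the model has no intercept term, since the assignment text mentions none.

```vba
Option Explicit

' Quality = sum of (attribute value * attribute coefficient).
' Assumes attrs and coefs list the attributes in the same order.
Function PredictQuality(attrs() As Double, coefs() As Double) As Double
    Dim j As Long, n As Long
    Dim total As Double

    n = UBound(attrs) - LBound(attrs)
    If n <> UBound(coefs) - LBound(coefs) Then
        Err.Raise vbObjectError + 513, "PredictQuality", _
                  "Attribute and coefficient counts differ."
    End If

    For j = 0 To n
        total = total + attrs(LBound(attrs) + j) * coefs(LBound(coefs) + j)
    Next j

    PredictQuality = total
End Function

' Demo with made-up numbers: 0.5 * 10 + 0.2 * 20 = 9
Sub DemoPredictQuality()
    Dim a(1 To 2) As Double, c(1 To 2) As Double
    a(1) = 0.5: a(2) = 0.2
    c(1) = 10: c(2) = 20
    Debug.Print PredictQuality(a, c)
End Sub
```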

If the calculated quality result of a sample is 100 or above, this wine sample is categorized as very good.

If the calculated quality result of a sample is between 50 and 100, this wine sample is categorized as good.

If the calculated quality result of a sample is between 20 and 50, this wine sample is categorized as average.

If the calculated quality result of a sample is below 20, this wine sample is categorized as bad.
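The four rules above can be sketched as a single VBA function. `WineCategory` is a hypothetical name. The rules do not say which category a quality of exactly 50 belongs to, so treating it as "good" here is an assumption. A quality of exactly 20 is not "below 20", so it falls under "average".

```vba
Option Explicit

' Maps a calculated quality value to its category label.
Function WineCategory(ByVal quality As Double) As String
    If quality >= 100 Then
        WineCategory = "very good"
    ElseIf quality >= 50 Then      ' assumption: exactly 50 counts as "good"
        WineCategory = "good"
    ElseIf quality >= 20 Then      ' 20 is not below 20, so it is "average"
        WineCategory = "average"
    Else
        WineCategory = "bad"
    End If
End Function
```

Checking the thresholds from highest to lowest means each branch only needs a lower bound, because the upper bound is already excluded by the branch before it.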

 

The template file is A10-input.xlsx: https://easyupload.io/h8yrap

 

I am not sure how to:

  1. Calculate the standardized data of each attribute (except the Sample ID) and write the processed data, the title row, and the original Sample IDs to a newly added "Standardized Data" worksheet. This worksheet should have the same layout as the "Raw Data" worksheet.
  2. Find the minimum and maximum values of each attribute (column), which are needed to standardize the data.
  3. Read the coefficient values from the "Coefficients" worksheet into an array first, because they are used many times in the next step.
  4. For each sample (each row), use the linear regression model to calculate the quality and write it to column M of the corresponding row in the "Standardized Data" worksheet, then write the sample's category, based on its quality, to column N of the same row.
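One possible way to put the four steps together is sketched below. Several details are assumptions, because the template file's layout is not described in full: row 1 is the title row, Sample ID is in column A, the 11 attributes are in columns B to L (consistent with quality going to column M), the coefficients are in cells B2:B12 of "Coefficients" in the same order as the attribute columns, and no "Standardized Data" worksheet exists yet. The macro name and the "Quality" and "Category" headers are illustrative only.

```vba
Option Explicit

Sub ProcessWineSamples()
    ' Assumed layout: titles in row 1, Sample ID in column A,
    ' attributes in columns B to L (2 to 12).
    Const FIRST_ATTR As Long = 2
    Const LAST_ATTR As Long = 12
    Const QUALITY_COL As Long = 13    ' column M
    Const CATEGORY_COL As Long = 14   ' column N

    Dim wsRaw As Worksheet, wsStd As Worksheet, wsCoef As Worksheet
    Dim data As Variant, coefCells As Variant
    Dim coefs(FIRST_ATTR To LAST_ATTR) As Double
    Dim lastRow As Long, r As Long, c As Long
    Dim minV As Double, maxV As Double, quality As Double

    Set wsRaw = ThisWorkbook.Worksheets("Raw Data")
    Set wsCoef = ThisWorkbook.Worksheets("Coefficients")

    ' Step 1: add the output sheet (fails if the name is already in use)
    Set wsStd = ThisWorkbook.Worksheets.Add( _
        After:=ThisWorkbook.Worksheets(ThisWorkbook.Worksheets.Count))
    wsStd.Name = "Standardized Data"

    ' Read titles, IDs and attributes into one array, with room for M and N
    lastRow = wsRaw.Cells(wsRaw.Rows.Count, 1).End(xlUp).Row
    data = wsRaw.Range(wsRaw.Cells(1, 1), _
                       wsRaw.Cells(lastRow, CATEGORY_COL)).Value
    data(1, QUALITY_COL) = "Quality"      ' illustrative header
    data(1, CATEGORY_COL) = "Category"    ' illustrative header

    ' Step 2: find min and max per column, then standardize in place
    For c = FIRST_ATTR To LAST_ATTR
        minV = data(2, c): maxV = data(2, c)
        For r = 3 To lastRow
            If data(r, c) < minV Then minV = data(r, c)
            If data(r, c) > maxV Then maxV = data(r, c)
        Next r
        For r = 2 To lastRow
            If maxV > minV Then
                data(r, c) = (data(r, c) - minV) / (maxV - minV)
            Else
                data(r, c) = 0   ' assumption for a constant column
            End If
        Next r
    Next c

    ' Step 3: read the coefficients into an array (assumed to be in B2:B12)
    coefCells = wsCoef.Range("B2:B12").Value
    For c = FIRST_ATTR To LAST_ATTR
        coefs(c) = coefCells(c - FIRST_ATTR + 1, 1)
    Next c

    ' Step 4: quality and category for each sample
    For r = 2 To lastRow
        quality = 0
        For c = FIRST_ATTR To LAST_ATTR
            quality = quality + data(r, c) * coefs(c)
        Next c
        data(r, QUALITY_COL) = quality
        Select Case quality
            Case Is >= 100: data(r, CATEGORY_COL) = "very good"
            Case Is >= 50: data(r, CATEGORY_COL) = "good"   ' 50 as "good" is assumed
            Case Is >= 20: data(r, CATEGORY_COL) = "average"
            Case Else: data(r, CATEGORY_COL) = "bad"
        End Select
    Next r

    ' Write everything to the new sheet in one operation
    wsStd.Range("A1").Resize(lastRow, CATEGORY_COL).Value = data
End Sub
```

Reading the sheet into a Variant array and writing it back once tends to be much faster than touching cells one at a time. If the real workbook uses different columns or cell ranges, the constants and the `B2:B12` address are the places to adjust.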

 

Any help or suggestions would be great!

 
