This data set consists of observations taken from account holders at a large financial services firm. The accounts represent consumers of home equity lines of credit, automobile loans, and other short- to medium-term credit instruments. The BANK data set contains the original data in its raw form. The target variables relate to whether the account holder purchased a new product from the bank in the past year. The data set contains more than one million rows and 24 columns. The “dataset” column identifies whether each observation is to be used for training (60% of the observations), validation (20%), or testing (20%).
Client's Need:
· Your client would like to embark on a direct marketing campaign to increase bank revenues. To do that, they want to understand what drives a customer to try new products/services (B_TGT), what drives total new sales (INT_TGT), and what drives the total number of new products and services purchased by customers (CNT_TGT). You have been given sample data from which you are to build, validate, and test your predictive models. Your client requires three models, one for each of the three target variables, to help them target their direct marketing campaign. Your client would also like to see how your models perform against the test holdout dataset, and will use that performance both to decide on long-term consulting relationships and to determine whose model will be deployed.
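Because the three targets are of different types (binary, interval, count), each model is usually scored on the test holdout with a metric suited to its target. The question does not name any metrics, so the choices below (log loss for B_TGT, RMSE for INT_TGT, mean Poisson deviance for CNT_TGT) are assumptions, as are the function names and the toy values in the usage lines. This is a minimal sketch, not a prescribed evaluation method:

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood for a binary target such as B_TGT."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error for an interval target such as INT_TGT."""
    squared = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def mean_poisson_deviance(y_true, mu_pred):
    """Mean Poisson deviance for a count target such as CNT_TGT.

    Each predicted mean in mu_pred must be strictly positive.
    """
    total = 0.0
    for y, mu in zip(y_true, mu_pred):
        # y * log(y / mu) is taken as 0 when y == 0
        term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2 * (term - (y - mu))
    return total / len(y_true)

# Illustrative, made-up values only (not taken from the BANK data).
print(log_loss([1, 0, 0], [0.8, 0.3, 0.1]))
print(rmse([100.0, 0.0], [90.0, 5.0]))
print(mean_poisson_deviance([0, 2, 1], [0.5, 1.5, 1.0]))
```

Lower is better for all three metrics; they should be computed only on rows where the dataset column marks the test partition.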
Data Dictionary:
Target Variables
B_TGT | New Product (Binary) (yes/no)
INT_TGT | New Sales (Interval)
CNT_TGT | Count Number New Products
Categorical Inputs
CAT_INPUT1 | Account Activity Level
CAT_INPUT2 | Customer Value Level
Interval Inputs
RFM1 | Average Sales Past Three Years
RFM2 | Average Sales Lifetime
RFM3 | Average Sales Past Three Years Dir Promo Resp
RFM4 | Last Product Purchase Amount
RFM5 | Count Purchased Past 3 Years
RFM6 | Count Purchased Lifetime
RFM7 | Count Purchased Past 3 Years Dir Promo Resp
RFM8 | Count Purchased Lifetime Dir Promo Resp
RFM9 | Months Since Last Purchase
RFM10 | Count Total Promos Past Year
RFM11 | Count Direct Promos Past Year
RFM12 | Customer Tenure
Demographic Inputs
DEMOG_AGE | Customer Age
DEMOG_GENF | Female Binary (yes/no)
DEMOG_GENM | Male Binary (yes/no)
DEMOG_HO | Homeowner Binary (yes/no)
DEMOG_HOMEVAL | Home Value
DEMOG_INC | Income
DEMOG_PR | Percentage retired in the area
Dataset (NOTE: do not use the test or validation dataset to train your models)
dataset | 1 = training dataset, 2 = validation dataset, 3 = test dataset
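The partition codes above can be used to separate the rows before any model is fitted. The following is a minimal sketch using only the Python standard library; the three-row sample, its values, and the helper name split_by_dataset are hypothetical, and only the column names and the codes 1/2/3 come from the data dictionary:

```python
import csv
import io

# Hypothetical excerpt for illustration; the real BANK data has 24 columns
# and more than one million rows.
SAMPLE = """B_TGT,RFM1,dataset
1,12.5,1
0,8.0,2
0,15.25,3
"""

# Codes as documented in the data dictionary.
PARTITION_NAMES = {"1": "train", "2": "validation", "3": "test"}

def split_by_dataset(rows):
    """Group rows by the partition named in their 'dataset' column."""
    parts = {name: [] for name in PARTITION_NAMES.values()}
    for row in rows:
        code = row["dataset"].strip()
        parts[PARTITION_NAMES[code]].append(row)
    return parts

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
parts = split_by_dataset(rows)

# Fit on parts["train"], tune on parts["validation"], and report final
# performance on parts["test"] only.
print({name: len(members) for name, members in parts.items()})
```

Keeping the test rows in a separate container from the start makes it harder to use them for training or tuning by accident.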