Please read the question carefully. MS Excel and Word will be used.
Baker Bank & Trust, Inc. is interested in identifying different attributes of its customers; sample data for 30 customers is given below. In the Personal loan column, 0 represents a customer who has not taken a personal loan, and 1 represents a customer who has taken a personal loan.
Use the k-Nearest Neighbors (KNN) approach to classify the data, considering up to k = 5 nearest neighbors (cutoff value = 0.5). Use Age and Income as the input variables and Personal loan as the output variable. Be sure to normalize the input data (i.e., using z-scores) if necessary, and classify the personal loan status of a new client, Billy Lee (33 years old, $80k income), that is, whether he has taken a personal loan, based on the similarity of his Age and Income to those of the observations in the training set (the 30-customer sample data).
(Hint: you may want to use Euclidean distance to identify the nearest-neighbor observations.)
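For reference, the two quantities named in the question can be written out as follows. The symbols are assumptions introduced here, not part of the original problem: x is a raw Age or Income value, x-bar and s are the mean and standard deviation of that variable over the 30 training observations, and i indexes a training observation compared against the new client.

```latex
z = \frac{x - \bar{x}}{s},
\qquad
d_i = \sqrt{\left(z_{\text{Age},i} - z_{\text{Age},\text{new}}\right)^2
          + \left(z_{\text{Income},i} - z_{\text{Income},\text{new}}\right)^2}
```

The new client's values would be standardized with the same training mean and standard deviation, so that both inputs are on a comparable scale before distances are computed.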
Obs.   Age   Income (in $1000s)   Personal loan
1      47    53                   1
2      26    22                   1
3      38    29                   1
4      37    32                   1
5      44    32                   0
6      55    45                   0
7      44    50                   0
8      30    22                   0
9      63    56                   0
10     34    23                   0
11     52    29                   1
12     55    34                   1
13     52    45                   1
14     63    23                   1
15     51    32                   0
16     41    21                   1
17     37    43                   1
18     46    23                   1
19     30    18                   1
20     48    34                   0
21     50    21                   1
22     56    24                   0
23     35    23                   1
24     39    29                   1
25     48    34                   0
26     51    39                   1
27     27    26                   1
28     57    49                   1
29     33    39                   1
30     58    32                   0
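The question asks for Excel, but a short script can serve as a cross-check of the spreadsheet work. The following is a minimal Python sketch of the procedure described above, under stated assumptions: the names `DATA`, `zscores`, and `knn_classify` are hypothetical; the sample standard deviation is used (comparable to Excel's STDEV.S); and a neighbor share equal to the cutoff is classified as 1. None of these choices is fixed by the question itself.

```python
import math
import statistics

# (obs, age, income in $1000s, personal loan), copied from the sample data above
DATA = [
    (1, 47, 53, 1), (2, 26, 22, 1), (3, 38, 29, 1), (4, 37, 32, 1),
    (5, 44, 32, 0), (6, 55, 45, 0), (7, 44, 50, 0), (8, 30, 22, 0),
    (9, 63, 56, 0), (10, 34, 23, 0), (11, 52, 29, 1), (12, 55, 34, 1),
    (13, 52, 45, 1), (14, 63, 23, 1), (15, 51, 32, 0), (16, 41, 21, 1),
    (17, 37, 43, 1), (18, 46, 23, 1), (19, 30, 18, 1), (20, 48, 34, 0),
    (21, 50, 21, 1), (22, 56, 24, 0), (23, 35, 23, 1), (24, 39, 29, 1),
    (25, 48, 34, 0), (26, 51, 39, 1), (27, 27, 26, 1), (28, 57, 49, 1),
    (29, 33, 39, 1), (30, 58, 32, 0),
]


def zscores(values, new_value):
    """Standardize training values and the new value with the training mean/std."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample std, like Excel's STDEV.S
    return [(v - mu) / sd for v in values], (new_value - mu) / sd


def knn_classify(data, age, income, k=5, cutoff=0.5):
    """Return (nearest obs numbers, share of 1s among them, predicted class)."""
    ages_z, age_z = zscores([row[1] for row in data], age)
    incomes_z, income_z = zscores([row[2] for row in data], income)

    scored = []
    for row, a, i in zip(data, ages_z, incomes_z):
        # Euclidean distance in standardized (Age, Income) space
        dist = math.hypot(a - age_z, i - income_z)
        scored.append((dist, row[0], row[3]))

    scored.sort()  # closest first; equal distances fall back to obs number
    nearest = scored[:k]
    share = sum(label for _, _, label in nearest) / k
    # assumption: a share equal to the cutoff counts as class 1
    return [obs for _, obs, _ in nearest], share, int(share >= cutoff)


# Billy Lee: 33 years old, $80k income
neighbors, share, label = knn_classify(DATA, age=33, income=80, k=5)
print("nearest observations:", neighbors)
print("share with a loan:", share, "-> predicted class:", label)
```

Calling `knn_classify` with k = 1 through 5 shows how the vote changes as neighbors are added. For even k, a share of exactly 0.5 is possible, so the tie rule assumed in the code should match whatever convention the course or spreadsheet tool uses.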