Problem Statement¶
Context¶
Businesses like banks which provide service have to worry about problem of 'Customer Churn' i.e. customers leaving and joining another service provider. It is important to understand which aspects of the service influence a customer's decision in this regard. Management can concentrate efforts on improvement of service, keeping in mind these priorities.
Objective¶
You as a Data scientist with the bank need to build a neural network based classifier that can determine whether a customer will leave the bank or not in the next 6 months.
Data Dictionary¶
CustomerId: Unique ID which is assigned to each customer
Surname: Last name of the customer
CreditScore: It defines the credit history of the customer.
Geography: A customer’s location
Gender: It defines the Gender of the customer
Age: Age of the customer
Tenure: Number of years for which the customer has been with the bank
NumOfProducts: refers to the number of products that a customer has purchased through the bank.
Balance: Account balance
HasCrCard: It is a categorical variable which decides whether the customer has credit card or not.
EstimatedSalary: Estimated salary
isActiveMember: Is is a categorical variable which decides whether the customer is active member of the bank or not ( Active member in the sense, using bank products regularly, making transactions etc )
Exited : whether or not the customer left the bank within six month. It can take two values
** 0=No ( Customer did not leave the bank ) ** 1=Yes ( Customer left the bank )
Importing necessary libraries¶
#importing tensorflow
import tensorflow as tf
print(tf.__version__)
WARNING:tensorflow:From C:\Users\bruce\AppData\Roaming\Python\Python311\site-packages\keras\src\losses.py:2976: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead. 2.15.0
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
import tensorflow as tf
from sklearn import preprocessing
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score, f1_score, precision_recall_curve, classification_report
from sklearn.preprocessing import LabelEncoder, OneHotEncoder,StandardScaler
import matplotlib.pyplot as plt
from tensorflow.keras import optimizers
from sklearn.decomposition import PCA
import seaborn as sns
import keras
import tensorflow as tf
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense, Dropout
import time
# import SMOTE for sampling
# Oversample with SMOTE and random undersample for imbalanced dataset
from collections import Counter
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from imblearn.pipeline import Pipeline
Loading the dataset¶
#Defining the path of the dataset
dataset_file = 'Churn.csv'
#reading dataset
data = pd.read_csv(dataset_file)
Data Overview¶
data.head()
| RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.00 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.80 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.00 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.10 | 0 |
data.shape
(10000, 14)
Observations
There are 10,000 rows with 14 columns.
Check for missing values¶
data.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 10000 entries, 0 to 9999 Data columns (total 14 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 RowNumber 10000 non-null int64 1 CustomerId 10000 non-null int64 2 Surname 10000 non-null object 3 CreditScore 10000 non-null int64 4 Geography 10000 non-null object 5 Gender 10000 non-null object 6 Age 10000 non-null int64 7 Tenure 10000 non-null int64 8 Balance 10000 non-null float64 9 NumOfProducts 10000 non-null int64 10 HasCrCard 10000 non-null int64 11 IsActiveMember 10000 non-null int64 12 EstimatedSalary 10000 non-null float64 13 Exited 10000 non-null int64 dtypes: float64(2), int64(9), object(3) memory usage: 1.1+ MB
# checking shape of the data
print("There are", data.shape[0], 'rows and', data.shape[1], "columns.")
There are 10000 rows and 14 columns.
Observation
- This shows that there are 10000 instances and 14 attributes including
RowNumberandExitedattribute.
As you can see there are no null values in any of the colum
- There are three columns that are object data types:
Surname,Geography, andGendern
data.isnull().sum()
RowNumber 0 CustomerId 0 Surname 0 CreditScore 0 Geography 0 Gender 0 Age 0 Tenure 0 Balance 0 NumOfProducts 0 HasCrCard 0 IsActiveMember 0 EstimatedSalary 0 Exited 0 dtype: int64
Observation
There are no missing values
Check for duplicate values¶
data.duplicated().sum()
0
Observation
There are no duplicate values, which you would expect with a column named RowNumber.
df_no_row_number = data.drop('RowNumber', axis=1)
df_no_row_number.duplicated().sum()
0
Observation
There are no duplicate values without a column named RowNumber.
View object data type¶
obj_columns = data.select_dtypes(include='object').columns
# Print the count of unique categorical levels in each column
for column in obj_columns:
print(data[column].value_counts())
print("-" * 50)
Smith 32
Scott 29
Martin 29
Walker 28
Brown 26
..
Izmailov 1
Bold 1
Bonham 1
Poninski 1
Burbidge 1
Name: Surname, Length: 2932, dtype: int64
--------------------------------------------------
France 5014
Germany 2509
Spain 2477
Name: Geography, dtype: int64
--------------------------------------------------
Male 5457
Female 4543
Name: Gender, dtype: int64
--------------------------------------------------
# Print the percentage of unique categorical levels in each column
for column in obj_columns:
print(data[column].value_counts(normalize=True))
print("-" * 50)
Smith 0.0032
Scott 0.0029
Martin 0.0029
Walker 0.0028
Brown 0.0026
...
Izmailov 0.0001
Bold 0.0001
Bonham 0.0001
Poninski 0.0001
Burbidge 0.0001
Name: Surname, Length: 2932, dtype: float64
--------------------------------------------------
France 0.5014
Germany 0.2509
Spain 0.2477
Name: Geography, dtype: float64
--------------------------------------------------
Male 0.5457
Female 0.4543
Name: Gender, dtype: float64
--------------------------------------------------
Observations
- Half of the data shows rows from France, with Germany and Spain splitting the other half
- There are a few more males than females
Exploratory Data Analysis¶
pd.options.display.float_format = "{:,.2f}".format
df_no_row_number.describe().T
| count | mean | std | min | 25% | 50% | 75% | max | |
|---|---|---|---|---|---|---|---|---|
| CustomerId | 10,000.00 | 15,690,940.57 | 71,936.19 | 15,565,701.00 | 15,628,528.25 | 15,690,738.00 | 15,753,233.75 | 15,815,690.00 |
| CreditScore | 10,000.00 | 650.53 | 96.65 | 350.00 | 584.00 | 652.00 | 718.00 | 850.00 |
| Age | 10,000.00 | 38.92 | 10.49 | 18.00 | 32.00 | 37.00 | 44.00 | 92.00 |
| Tenure | 10,000.00 | 5.01 | 2.89 | 0.00 | 3.00 | 5.00 | 7.00 | 10.00 |
| Balance | 10,000.00 | 76,485.89 | 62,397.41 | 0.00 | 0.00 | 97,198.54 | 127,644.24 | 250,898.09 |
| NumOfProducts | 10,000.00 | 1.53 | 0.58 | 1.00 | 1.00 | 1.00 | 2.00 | 4.00 |
| HasCrCard | 10,000.00 | 0.71 | 0.46 | 0.00 | 0.00 | 1.00 | 1.00 | 1.00 |
| IsActiveMember | 10,000.00 | 0.52 | 0.50 | 0.00 | 0.00 | 1.00 | 1.00 | 1.00 |
| EstimatedSalary | 10,000.00 | 100,090.24 | 57,510.49 | 11.58 | 51,002.11 | 100,193.91 | 149,388.25 | 199,992.48 |
| Exited | 10,000.00 | 0.20 | 0.40 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 |
Observation
- There are binary columns:
Exited,HasCrCard, andIsActiveMemberalthough both are represented asint64in the data CreditScore,Age,Tenure,ExpectedSalaryseem to be normal distribution (not skewed)Balance
# skewness along the index axis
df_no_row_number.skew(axis = 0, skipna = True, numeric_only=True)
CustomerId 0.00 CreditScore -0.07 Age 1.01 Tenure 0.01 Balance -0.14 NumOfProducts 0.75 HasCrCard -0.90 IsActiveMember -0.06 EstimatedSalary 0.00 Exited 1.47 dtype: float64
The following functions need to be defined to carry out the Exploratory Data Analysis.¶
# function to plot a boxplot and a histogram along the same scale.
def histogram_boxplot(data, feature, figsize=(12, 7), kde=False, bins=None):
"""
Boxplot and histogram combined
data: dataframe
feature: dataframe column
figsize: size of figure (default (12,7))
kde: whether to the show density curve (default False)
bins: number of bins for histogram (default None)
"""
f2, (ax_box2, ax_hist2) = plt.subplots(
nrows=2, # Number of rows of the subplot grid= 2
sharex=True, # x-axis will be shared among all subplots
gridspec_kw={"height_ratios": (0.25, 0.75)},
figsize=figsize,
) # creating the 2 subplots
sns.boxplot(
data=data, x=feature, ax=ax_box2, showmeans=True, color="violet"
) # boxplot will be created and a triangle will indicate the mean value of the column
sns.histplot(
data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins, palette="winter",
hue=feature
) if bins else sns.histplot(
data=data, x=feature, kde=kde, ax=ax_hist2
) # For histogram
ax_hist2.axvline(
data[feature].mean(), color="green", linestyle="--"
) # Add mean to the histogram
ax_hist2.axvline(
data[feature].median(), color="black", linestyle="-"
) # Add median to the histogram
# function to create labeled barplots
def labeled_barplot(data, feature, perc=False, n=None):
"""
Barplot with percentage at the top
data: dataframe
feature: dataframe column
perc: whether to display percentages instead of count (default is False)
n: displays the top n category levels (default is None, i.e., display all levels)
"""
total = len(data[feature]) # length of the column
count = data[feature].nunique()
if n is None:
plt.figure(figsize=(count + 1, 5))
else:
plt.figure(figsize=(n + 1, 5))
plt.xticks(rotation=90, fontsize=15)
ax = sns.countplot(
data=data,
x=feature,
palette="Paired",
order=data[feature].value_counts().index[:n].sort_values(),
legend=False,
hue=feature
)
for p in ax.patches:
if perc == True:
label = "{:.1f}%".format(
100 * p.get_height() / total
) # percentage of each class of the category
else:
label = p.get_height() # count of each level of the category
x = p.get_x() + p.get_width() / 2 # width of the plot
y = p.get_height() # height of the plot
ax.annotate(
label,
(x, y),
ha="center",
va="center",
size=12,
xytext=(0, 5),
textcoords="offset points",
) # annotate the percentage
plt.show() # show the plot
# function to plot stacked bar chart
def stacked_barplot(data, predictor, target):
"""
Print the category counts and plot a stacked bar chart
data: dataframe
predictor: independent variable
target: target variable
"""
count = data[predictor].nunique()
sorter = data[target].value_counts().index[-1]
tab1 = pd.crosstab(data[predictor], data[target], margins=True).sort_values(
by=sorter, ascending=False
)
print(tab1)
print("-" * 120)
tab = pd.crosstab(data[predictor], data[target], normalize="index").sort_values(
by=sorter, ascending=False
)
tab.plot(kind="bar", stacked=True, figsize=(count + 1, 5))
plt.legend(
loc="lower left", frameon=False,
)
plt.legend(loc="upper left", bbox_to_anchor=(1, 1))
plt.show()
### Function to plot distributions
def distribution_plot_wrt_target(data, predictor, target):
fig, axs = plt.subplots(2, 2, figsize=(12, 10))
target_uniq = data[target].unique()
axs[0, 0].set_title("Distribution of target for target=" + str(target_uniq[0]))
sns.histplot(
data=data[data[target] == target_uniq[0]],
x=predictor,
kde=True,
ax=axs[0, 0],
color="teal",
hue=target,
legend=False
)
axs[0, 1].set_title("Distribution of target for target=" + str(target_uniq[1]))
sns.histplot(
data=data[data[target] == target_uniq[1]],
x=predictor,
kde=True,
ax=axs[0, 1],
color="orange",
hue=target,
legend=False
)
axs[1, 0].set_title("Boxplot w.r.t target")
sns.boxplot(data=data, x=target, y=predictor, ax=axs[1, 0], palette="gist_rainbow",
hue=target,
legend=False
)
axs[1, 1].set_title("Boxplot (without outliers) w.r.t target")
sns.boxplot(
data=data,
x=target,
y=predictor,
ax=axs[1, 1],
showfliers=False,
palette="gist_rainbow",
hue=target,
legend=False
)
plt.tight_layout()
plt.show()
Univariate Analysis¶
Credit Score¶
histogram_boxplot(data, 'CreditScore')
Geography¶
labeled_barplot(data=data,feature='Geography',perc=True)
Age¶
histogram_boxplot(data, 'Age')
Observation
Age is slightly skewed right. Should investigate how Age is correlated to other features.
Tenure¶
histogram_boxplot(data, 'Tenure', bins=11)
Observation
Tenure has a normal distribution. Not skewed left or right.
Balance¶
histogram_boxplot(data, 'Balance')
Observation
Balance overall is a normal distribution with the exception of a large number with a zero balance. This makes the feature left skewed.
Number of products¶
histogram_boxplot(data, 'NumOfProducts', bins=4)
Observation
Balance is right skewed.
Has credit card¶
labeled_barplot(data=data,feature='HasCrCard', perc=True)
Observation
70% has a credit card.
Is active member¶
labeled_barplot(data=data,feature='IsActiveMember',perc=True)
Observation
Active membership is just about evenly split.
Gender¶
labeled_barplot(data=data,feature='Gender',perc=True)
Observation
There are more males with accounts by about a 10% margin.
Surname¶
print(data['Surname'].nunique())
2932
Observation
There are 2932 distinct surnames.
filtered_df = data.groupby('Surname').filter(lambda x: len(x) > 10)
labeled_barplot(data=filtered_df,feature='Surname')
Estimated Salary¶
histogram_boxplot(data, 'EstimatedSalary')
Observation
Salary is normalized without skewedness.
Exited¶
labeled_barplot(data=data,feature='Exited',perc=True)
Observation
About 80% of the target (customers who have exited) have accounts. About 20% have exited.
Bivariate Analysis¶
plt.figure(figsize=(10,5))
numeric_data = df_no_row_number.select_dtypes(include='number')
sns.heatmap(numeric_data.corr(),annot=True,cmap='Spectral',vmin=-1,vmax=1)
plt.show()
Observations
Generally there is no strong correlations in either direction between the features.
Credit Score vs Exited¶
distribution_plot_wrt_target(numeric_data, 'CreditScore', 'Exited')
sns.scatterplot(data=data, x='CreditScore', y='Balance', hue='Exited');
Observation
There doesn't seem to be a grouping between CreditScore and Balance wrt Exited
Age vs Exited¶
sns.stripplot(data=data, x='Tenure', y='Age', hue='Exited');
Geography vs Exited¶
stacked_barplot(data, 'Geography', 'Exited')
Exited 0 1 All Geography All 7963 2037 10000 Germany 1695 814 2509 France 4204 810 5014 Spain 2064 413 2477 ------------------------------------------------------------------------------------------------------------------------
Credit Score vs Age¶
sns.scatterplot(data=data, x='CreditScore', y='Age', hue='Exited');
sns.lmplot(data=data, x='CreditScore', y='Age', hue='Exited',ci=False);
Observation
Generally older customers have exited regardless of credit score.
Age vs Exited¶
distribution_plot_wrt_target(numeric_data, 'Age', 'Exited')
Observation
This pair (age and exited) were the highest correlation. The chart shows a higher mean age for those exiting and higher IRQ. The data holds the same with outliers and without outliers.
Is Active Member vs Exited¶
stacked_barplot(data, 'IsActiveMember', 'Exited')
Exited 0 1 All IsActiveMember All 7963 2037 10000 0 3547 1302 4849 1 4416 735 5151 ------------------------------------------------------------------------------------------------------------------------
Has Credit Card vs Exited¶
stacked_barplot(data, 'HasCrCard', 'Exited')
Exited 0 1 All HasCrCard All 7963 2037 10000 1 5631 1424 7055 0 2332 613 2945 ------------------------------------------------------------------------------------------------------------------------
Observation
Although many more have a credit card, it does not seem to be correlated to whether the customer has exited.
Estimated Salary vs Exited¶
distribution_plot_wrt_target(numeric_data, 'EstimatedSalary', 'Exited')
Balance vs Exited¶
distribution_plot_wrt_target(numeric_data, 'Balance', 'Exited')
Observation
Those leaving the bank had an average higher balance.
Gender vs Exited¶
stacked_barplot(data, 'Gender', 'Exited')
Exited 0 1 All Gender All 7963 2037 10000 Female 3404 1139 4543 Male 4559 898 5457 ------------------------------------------------------------------------------------------------------------------------
Observation
Even though there are more males, females are more likely to leave the bank than males.
sns.stripplot(data=data, x='HasCrCard', y='Balance', hue='Exited');
Data Preprocessing¶
### Drop the column as they will not add value to the modeling
## Also drop the target
X = data.drop(['RowNumber', 'CustomerId', 'Surname', 'Exited'], axis=1)
Y = data['Exited']
Column binning¶
For this run, we will not "bin" any of the features. Although in future runs it might be helpful to bin:
SalaryBalanceThis might be valuable because so many have zero balance
Dummy Variable Creation¶
object_columns = X.select_dtypes(include='object').columns
print("object columns:", object_columns)
object columns: Index(['Geography', 'Gender'], dtype='object')
Convert the objects to data
# Encoding the categorical variables using one-hot encoding
X = pd.get_dummies(
X,
columns=object_columns,
drop_first=True,
)
Train-validation-test Split¶
# Splitting the dataset into the Training and Test set.
X_train, X_test, y_train, y_test = train_test_split(X,Y, test_size = 0.2, random_state = 42,stratify = Y)
# Splitting the Train dataset into the Training and Validation set.
X_train, X_valid, y_train, y_valid = train_test_split(X_train,y_train, test_size = 0.2, random_state = 42,stratify = y_train)
# Printing the shapes.
print(X_train.shape,y_train.shape)
print(X_valid.shape,y_valid.shape)
print(X_test.shape,y_test.shape)
(6400, 11) (6400,) (1600, 11) (1600,) (2000, 11) (2000,)
Data Normalization¶
We will have the tensorflow function split the validation data sets
num_columns = X.select_dtypes(include='number').columns
print(num_columns)
Index(['CreditScore', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard',
'IsActiveMember', 'EstimatedSalary', 'Geography_Germany',
'Geography_Spain', 'Gender_Male'],
dtype='object')
#Standardizing the numerical variables to zero mean and unit variance.
autosacler = StandardScaler()
X = autosacler.fit_transform(X)
X
array([[-0.32622142, 0.29351742, -1.04175968, ..., -0.57873591,
-0.57380915, -1.09598752],
[-0.44003595, 0.19816383, -1.38753759, ..., -0.57873591,
1.74273971, -1.09598752],
[-1.53679418, 0.29351742, 1.03290776, ..., -0.57873591,
-0.57380915, -1.09598752],
...,
[ 0.60498839, -0.27860412, 0.68712986, ..., -0.57873591,
-0.57380915, -1.09598752],
[ 1.25683526, 0.29351742, -0.69598177, ..., 1.72790383,
-0.57380915, 0.91241915],
[ 1.46377078, -1.04143285, -0.35020386, ..., -0.57873591,
-0.57380915, -1.09598752]])
Utility functions¶
def plot(history, name):
"""
Function to plot loss/accuracy
history: an object which stores the metrics and losses.
name: can be one of Loss or Accuracy
"""
fig, ax = plt.subplots() #Creating a subplot with figure and axes.
plt.plot(history.history[name]) #Plotting the train accuracy or train loss
plt.plot(history.history['val_'+name]) #Plotting the validation accuracy or validation loss
plt.title('Model ' + name.capitalize()) #Defining the title of the plot.
plt.ylabel(name.capitalize()) #Capitalizing the first letter.
plt.xlabel('Epoch') #Defining the label for the x-axis.
fig.legend(['Train', 'Validation'], loc="outside right upper") #Defining the legend, loc controls the position of the legend.
# defining a function to compute different metrics to check performance of a classification model built using statsmodels
def model_performance_classification(
model, predictors, target, threshold=0.5
):
"""
Function to compute different metrics to check classification model performance
model: classifier
predictors: independent variables
target: dependent variable
threshold: threshold for classifying the observation as class 1
"""
# checking which probabilities are greater than threshold
pred = model.predict(predictors) > threshold
# pred_temp = model.predict(predictors) > threshold
# # rounding off the above values to get classes
# pred = np.round(pred_temp)
acc = accuracy_score(target, pred) # to compute Accuracy
recall = recall_score(target, pred, average='weighted') # to compute Recall
precision = precision_score(target, pred, average='weighted') # to compute Precision
f1 = f1_score(target, pred, average='weighted') # to compute F1-score
# creating a dataframe of metrics
df_perf = pd.DataFrame(
{"Accuracy": acc, "Recall": recall, "Precision": precision, "F1 Score": f1,},
index=[0],
)
return df_perf
def make_confusion_matrix(cf,
group_names=None,
categories='auto',
count=True,
percent=True,
cbar=True,
xyticks=True,
xyplotlabels=True,
sum_stats=True,
figsize=None,
cmap='Blues',
title=None):
'''
This function will make a pretty plot of an sklearn Confusion Matrix cm using a Seaborn heatmap visualization.
Arguments
'''
# CODE TO GENERATE TEXT INSIDE EACH SQUARE
blanks = ['' for i in range(cf.size)]
if group_names and len(group_names)==cf.size:
group_labels = ["{}\n".format(value) for value in group_names]
else:
group_labels = blanks
if count:
group_counts = ["{0:0.0f}\n".format(value) for value in cf.flatten()]
else:
group_counts = blanks
if percent:
group_percentages = ["{0:.2%}".format(value) for value in cf.flatten()/np.sum(cf)]
else:
group_percentages = blanks
box_labels = [f"{v1}{v2}{v3}".strip() for v1, v2, v3 in zip(group_labels,group_counts,group_percentages)]
box_labels = np.asarray(box_labels).reshape(cf.shape[0],cf.shape[1])
# CODE TO GENERATE SUMMARY STATISTICS & TEXT FOR SUMMARY STATS
if sum_stats:
#Accuracy is sum of diagonal divided by total observations
accuracy = np.trace(cf) / float(np.sum(cf))
#if it is a binary confusion matrix, show some more stats
if len(cf)==2:
#Metrics for Binary Confusion Matrices
precision = cf[1,1] / sum(cf[:,1])
recall = cf[1,1] / sum(cf[1,:])
f1_score = 2*precision*recall / (precision + recall)
stats_text = "\n\nAccuracy={:0.3f}\nPrecision={:0.3f}\nRecall={:0.3f}\nF1 Score={:0.3f}".format(
accuracy,precision,recall,f1_score)
else:
stats_text = "\n\nAccuracy={:0.3f}".format(accuracy)
else:
stats_text = ""
# SET FIGURE PARAMETERS ACCORDING TO OTHER ARGUMENTS
if figsize==None:
#Get default figure size if not set
figsize = plt.rcParams.get('figure.figsize')
if xyticks==False:
#Do not show categories if xyticks is False
categories=False
# MAKE THE HEATMAP VISUALIZATION
plt.figure(figsize=figsize)
sns.heatmap(cf,annot=box_labels,fmt="",cmap=cmap,cbar=cbar,xticklabels=categories,yticklabels=categories)
if xyplotlabels:
plt.ylabel('True label')
plt.xlabel('Predicted label' + stats_text)
else:
plt.xlabel(stats_text)
if title:
plt.title(title)
Random Forest¶
Before we dive into the neural network model building, let's see how Random Forest behaves -- as a baseline
from sklearn.ensemble import RandomForestClassifier
random_forest = RandomForestClassifier(n_estimators=100)
# Pandas Series.ravel() function returns the flattened underlying data as an ndarray.
random_forest.fit(X_train,y_train.values.ravel()) # np.ravel() Return a contiguous flattened array
RandomForestClassifier()In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
RandomForestClassifier()
random_forest_prediction = random_forest.predict(X_test)
random_forest.score(X_test,y_test)
0.8645
labels = ['True Negative','False Positive','False Negative','True Positive']
categories=['Not exited', 'Exited']
make_confusion_matrix(confusion_matrix(y_test, random_forest_prediction),
group_names=labels,
categories=categories,
cmap='Blues')
Neural Network with SGD Optimizer without class weight¶
Let's use the same batch size and the same number of epochs for this run
# defining the batch size and # epochs upfront as we'll be using the same values for all models
epochs = 25
batch_size = 64
Let's set a baseline for our model, and then improve it step by step.
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
WARNING:tensorflow:From C:\Users\bruce\AppData\Roaming\Python\Python311\site-packages\keras\src\backend.py:277: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.
#Initializing the neural network
model = Sequential()
model.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
model.add(Dense(7,activation="relu"))
model.add(Dense(1,activation="sigmoid")) # binary classification exiting or not
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 14) 168
dense_1 (Dense) (None, 7) 105
dense_2 (Dense) (None, 1) 8
=================================================================
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
optimizer = tf.keras.optimizers.SGD() # defining SGD as the optimizer to be used
model.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
# this includes the class_weight parameter
history = model.fit(X_train, y_train, validation_data=(X_valid,y_valid) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/25 WARNING:tensorflow:From C:\Users\bruce\AppData\Roaming\Python\Python311\site-packages\keras\src\utils\tf_utils.py:492: The name tf.ragged.RaggedTensorValue is deprecated. Please use tf.compat.v1.ragged.RaggedTensorValue instead. WARNING:tensorflow:From C:\Users\bruce\AppData\Roaming\Python\Python311\site-packages\keras\src\engine\base_layer_utils.py:384: The name tf.executing_eagerly_outside_functions is deprecated. Please use tf.compat.v1.executing_eagerly_outside_functions instead. 100/100 [==============================] - 1s 5ms/step - loss: 15917120.0000 - accuracy: 0.7828 - val_loss: 0.6250 - val_accuracy: 0.7962 Epoch 2/25 100/100 [==============================] - 0s 1ms/step - loss: 0.6022 - accuracy: 0.7962 - val_loss: 0.5823 - val_accuracy: 0.7962 Epoch 3/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5682 - accuracy: 0.7962 - val_loss: 0.5558 - val_accuracy: 0.7962 Epoch 4/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5469 - accuracy: 0.7962 - val_loss: 0.5389 - val_accuracy: 0.7962 Epoch 5/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5332 - accuracy: 0.7962 - val_loss: 0.5280 - val_accuracy: 0.7962 Epoch 6/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5243 - accuracy: 0.7962 - val_loss: 0.5209 - val_accuracy: 0.7962 Epoch 7/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5184 - accuracy: 0.7962 - val_loss: 0.5161 - val_accuracy: 0.7962 Epoch 8/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5144 - accuracy: 0.7962 - val_loss: 0.5129 - val_accuracy: 0.7962 Epoch 9/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5117 - accuracy: 0.7962 - val_loss: 0.5107 - val_accuracy: 0.7962 Epoch 10/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5099 - accuracy: 0.7962 - val_loss: 0.5091 - val_accuracy: 0.7962 Epoch 11/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5086 - accuracy: 0.7962 - val_loss: 0.5081 - val_accuracy: 0.7962 Epoch 12/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5077 - accuracy: 0.7962 - val_loss: 0.5073 - val_accuracy: 0.7962 Epoch 13/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5071 - accuracy: 0.7962 - val_loss: 0.5068 - val_accuracy: 0.7962 Epoch 14/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5066 - accuracy: 0.7962 - val_loss: 0.5065 - val_accuracy: 0.7962 Epoch 15/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5063 - accuracy: 0.7962 - val_loss: 0.5062 - val_accuracy: 0.7962 Epoch 16/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5061 - accuracy: 0.7962 - val_loss: 0.5060 - val_accuracy: 0.7962 Epoch 17/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5060 - accuracy: 0.7962 - val_loss: 0.5059 - val_accuracy: 0.7962 Epoch 18/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5058 - accuracy: 0.7962 - val_loss: 0.5058 - val_accuracy: 0.7962 Epoch 19/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5058 - accuracy: 0.7962 - val_loss: 0.5057 - val_accuracy: 0.7962 Epoch 20/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5057 - accuracy: 0.7962 - val_loss: 0.5057 - val_accuracy: 0.7962 Epoch 21/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5057 - accuracy: 0.7962 - val_loss: 0.5056 - val_accuracy: 0.7962 Epoch 22/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5056 - accuracy: 0.7962 - val_loss: 0.5056 - val_accuracy: 0.7962 Epoch 23/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5056 - accuracy: 0.7962 - val_loss: 0.5056 - val_accuracy: 0.7962 Epoch 24/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5056 - accuracy: 0.7962 - val_loss: 0.5056 - val_accuracy: 0.7962 Epoch 25/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5056 - accuracy: 0.7962 - val_loss: 0.5056 - val_accuracy: 0.7962
print("Time taken in seconds ",end-start)
Time taken in seconds 3.936260461807251
plot(history,'loss')
model_0_train_perf = model_performance_classification(model, X_train, y_train)
model_0_train_perf
200/200 [==============================] - 0s 670us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.80 | 0.80 | 0.63 | 0.71 |
model_0_valid_perf = model_performance_classification(model, X_valid, y_valid)
model_0_valid_perf
50/50 [==============================] - 0s 687us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.80 | 0.80 | 0.63 | 0.71 |
The warning may be indicating something isn't working as expected from my data
from collections import Counter
print(Counter(y_train))
print(Counter(y_valid))
print(Counter(y_test))
Counter({0: 5096, 1: 1304})
Counter({0: 1274, 1: 326})
Counter({0: 1593, 1: 407})
Model Building¶
Model Evaluation Criterion¶
A model can make wrong predictions in the following ways:¶
- Predicting a customer will exit when they are loyal.
- Predicting customer will stay when they are leaving.
Which case is more important?¶
Both cases are actually important for the purposes of this case study. For this particular business case, we want to minimize the number of false positives and the number of false negatives.
By predicting a customer to exit when they are staying, it means banking insitutution resources are not being used effectively. And predicting a customer will stay when they are leaving is equally resources being used ineffectively.
How to reduce this loss i.e need to reduce False Negatives as well as False Positives?¶
Since both errors are important for us to minimize, the company would want the F1 Score evaluation metric to be maximized
Hence, the focus should be on increasing the F1 score rather than focusing on just one metric, such as Recall or Precision
As we have are dealing with an imbalance in class distribution, use class weights to allow the model to give proportionally more importance to the minority class.
'''
# Calculate class weights for imbalanced dataset
cw = (y_train.shape[0]) / np.bincount(y_train)
# Create a dictionary mapping class indices to their respective class weights
cw_dict = {}
for i in range(cw.shape[0]):
cw_dict[i] = cw[i]
cw_dict
'''
from sklearn.utils import class_weight
class_weight_array = class_weight.compute_class_weight(class_weight=None,
classes=np.unique(y_train),
y=y_train)
# convert the array into a dictionary for keras to use
cw_dict = {index: value for index, value in enumerate(class_weight_array)}
Neural Network with SGD Optimizer with class weight¶
To establish a baseline, let's start with a neural network consisting of:
- two hidden layers with 14 and 7 neurons respectively
- activation function of ReLU.- SGD as the optimizer
- class weight as described in the dictionary in the previous section
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network
model = Sequential()
model.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
model.add(Dense(7,activation="relu"))
model.add(Dense(1,activation="sigmoid")) # binary classification exiting or not
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 14) 168
dense_1 (Dense) (None, 7) 105
dense_2 (Dense) (None, 1) 8
=================================================================
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
optimizer = tf.keras.optimizers.SGD() # defining SGD as the optimizer to be used
model.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
Include the class_weight parameter in the model
start = time.time()
# this includes the class_weight parameter
history = model.fit(X_train, y_train, validation_data=(X_valid,y_valid) , batch_size=batch_size, epochs=epochs,class_weight=cw_dict)
end=time.time()
Epoch 1/25 100/100 [==============================] - 1s 2ms/step - loss: 1985.8293 - accuracy: 0.7925 - val_loss: 0.6241 - val_accuracy: 0.7962 Epoch 2/25 100/100 [==============================] - 0s 1ms/step - loss: 0.6015 - accuracy: 0.7962 - val_loss: 0.5818 - val_accuracy: 0.7962 Epoch 3/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5678 - accuracy: 0.7962 - val_loss: 0.5555 - val_accuracy: 0.7962 Epoch 4/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5466 - accuracy: 0.7962 - val_loss: 0.5387 - val_accuracy: 0.7962 Epoch 5/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5330 - accuracy: 0.7962 - val_loss: 0.5279 - val_accuracy: 0.7962 Epoch 6/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5241 - accuracy: 0.7962 - val_loss: 0.5208 - val_accuracy: 0.7962 Epoch 7/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5183 - accuracy: 0.7962 - val_loss: 0.5160 - val_accuracy: 0.7962 Epoch 8/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5143 - accuracy: 0.7962 - val_loss: 0.5128 - val_accuracy: 0.7962 Epoch 9/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5117 - accuracy: 0.7962 - val_loss: 0.5106 - val_accuracy: 0.7962 Epoch 10/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5098 - accuracy: 0.7962 - val_loss: 0.5091 - val_accuracy: 0.7962 Epoch 11/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5086 - accuracy: 0.7962 - val_loss: 0.5081 - val_accuracy: 0.7962 Epoch 12/25 100/100 [==============================] - 0s 2ms/step - loss: 0.5077 - accuracy: 0.7962 - val_loss: 0.5073 - val_accuracy: 0.7962 Epoch 13/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5071 - accuracy: 0.7962 - val_loss: 0.5068 - val_accuracy: 0.7962 Epoch 14/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5066 - accuracy: 0.7962 - val_loss: 0.5065 - val_accuracy: 0.7962 Epoch 15/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5063 - accuracy: 0.7962 - val_loss: 0.5062 - val_accuracy: 0.7962 Epoch 16/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5061 - accuracy: 0.7962 - val_loss: 0.5060 - val_accuracy: 0.7962 Epoch 17/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5060 - accuracy: 0.7962 - val_loss: 0.5059 - val_accuracy: 0.7962 Epoch 18/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5059 - accuracy: 0.7962 - val_loss: 0.5058 - val_accuracy: 0.7962 Epoch 19/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5058 - accuracy: 0.7962 - val_loss: 0.5057 - val_accuracy: 0.7962 Epoch 20/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5057 - accuracy: 0.7962 - val_loss: 0.5057 - val_accuracy: 0.7962 Epoch 21/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5057 - accuracy: 0.7962 - val_loss: 0.5056 - val_accuracy: 0.7962 Epoch 22/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5056 - accuracy: 0.7962 - val_loss: 0.5056 - val_accuracy: 0.7962 Epoch 23/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5056 - accuracy: 0.7962 - val_loss: 0.5056 - val_accuracy: 0.7962 Epoch 24/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5056 - accuracy: 0.7962 - val_loss: 0.5056 - val_accuracy: 0.7962 Epoch 25/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5056 - accuracy: 0.7962 - val_loss: 0.5056 - val_accuracy: 0.7962
print("Time taken in seconds ",end-start)
Time taken in seconds 3.598320722579956
plot(history,'loss')
model_1_train_perf = model_performance_classification(model, X_train, y_train)
model_1_train_perf
200/200 [==============================] - 0s 667us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.80 | 0.80 | 0.63 | 0.71 |
model_1_valid_perf = model_performance_classification(model, X_valid, y_valid)
model_1_valid_perf
50/50 [==============================] - 0s 820us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.80 | 0.80 | 0.63 | 0.71 |
Observation
The model using class_weight starts at a very high value and then fails to learn.
Model Performance Improvement¶
Neural Network with Adam Optimizer¶
Let's change the optimizer to Adam. This will introduce momentum as well as an adaptive learning rate
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network
model = Sequential()
model.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
model.add(Dense(7,activation="relu"))
model.add(Dense(1,activation="sigmoid")) # binary classification: exiting or not
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 14) 168
dense_1 (Dense) (None, 7) 105
dense_2 (Dense) (None, 1) 8
=================================================================
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
optimizer = tf.keras.optimizers.Adam() # defining Adam as the optimizer to be used
model.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history = model.fit(X_train, y_train, validation_data=(X_valid,y_valid) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/25 100/100 [==============================] - 1s 2ms/step - loss: 26.2987 - accuracy: 0.7573 - val_loss: 0.8260 - val_accuracy: 0.7894 Epoch 2/25 100/100 [==============================] - 0s 1ms/step - loss: 0.8251 - accuracy: 0.7870 - val_loss: 0.7847 - val_accuracy: 0.7794 Epoch 3/25 100/100 [==============================] - 0s 1ms/step - loss: 0.7782 - accuracy: 0.7861 - val_loss: 0.7292 - val_accuracy: 0.7906 Epoch 4/25 100/100 [==============================] - 0s 1ms/step - loss: 0.7258 - accuracy: 0.7884 - val_loss: 0.6795 - val_accuracy: 0.7906 Epoch 5/25 100/100 [==============================] - 0s 1ms/step - loss: 0.6744 - accuracy: 0.7895 - val_loss: 0.6345 - val_accuracy: 0.7919 Epoch 6/25 100/100 [==============================] - 0s 1ms/step - loss: 0.6292 - accuracy: 0.7925 - val_loss: 0.6012 - val_accuracy: 0.7925 Epoch 7/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5966 - accuracy: 0.7917 - val_loss: 0.5841 - val_accuracy: 0.7844 Epoch 8/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5636 - accuracy: 0.7931 - val_loss: 0.5563 - val_accuracy: 0.7937 Epoch 9/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5488 - accuracy: 0.7947 - val_loss: 0.5451 - val_accuracy: 0.7962 Epoch 10/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5421 - accuracy: 0.7950 - val_loss: 0.5372 - val_accuracy: 0.7962 Epoch 11/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5336 - accuracy: 0.7959 - val_loss: 0.5313 - val_accuracy: 0.7962 Epoch 12/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5278 - accuracy: 0.7966 - val_loss: 0.5261 - val_accuracy: 0.7962 Epoch 13/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5236 - accuracy: 0.7962 - val_loss: 0.5222 - val_accuracy: 0.7962 Epoch 14/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5198 - accuracy: 0.7959 - val_loss: 0.5190 - val_accuracy: 0.7962 Epoch 15/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5166 - accuracy: 0.7962 - val_loss: 0.5162 - val_accuracy: 0.7962 Epoch 16/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5141 - accuracy: 0.7962 - val_loss: 0.5141 - val_accuracy: 0.7962 Epoch 17/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5122 - accuracy: 0.7962 - val_loss: 0.5127 - val_accuracy: 0.7956 Epoch 18/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5107 - accuracy: 0.7962 - val_loss: 0.5114 - val_accuracy: 0.7962 Epoch 19/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5094 - accuracy: 0.7961 - val_loss: 0.5096 - val_accuracy: 0.7962 Epoch 20/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5084 - accuracy: 0.7962 - val_loss: 0.5090 - val_accuracy: 0.7962 Epoch 21/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5076 - accuracy: 0.7962 - val_loss: 0.5084 - val_accuracy: 0.7962 Epoch 22/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5070 - accuracy: 0.7962 - val_loss: 0.5083 - val_accuracy: 0.7956 Epoch 23/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5067 - accuracy: 0.7962 - val_loss: 0.5071 - val_accuracy: 0.7962 Epoch 24/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5062 - accuracy: 0.7962 - val_loss: 0.5068 - val_accuracy: 0.7962 Epoch 25/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5060 - accuracy: 0.7962 - val_loss: 0.5066 - val_accuracy: 0.7962
print("Time taken in seconds ",end-start)
Time taken in seconds 3.829555034637451
plot(history,'loss')
model_2_train_perf = model_performance_classification(model, X_train, y_train)
model_2_train_perf
200/200 [==============================] - 0s 665us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.80 | 0.80 | 0.63 | 0.71 |
model_2_valid_perf = model_performance_classification(model, X_valid, y_valid)
model_2_valid_perf
50/50 [==============================] - 0s 656us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.80 | 0.80 | 0.63 | 0.71 |
Neural Network with Adam Optimizer and Dropout¶
X_train.shape
(6400, 11)
Model-3
- We will use a simple NN made of 5 fully-connected layers with ReLu activation. The NN takes a vector of length 11 as input. This represents the information related to each transactions, ie each line with 11 columns from the dataset. For each transaction, the final layer will output a probability distribution (sigmoid activation function) and classify either as not exiting (0) or exiting (1).
- Two dropout steps are included to prevent overfitting.
Dropout
Dropout is a regularization technique for neural network models proposed by Srivastava, et al. in their 2014 paper Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Dropout is a technique where randomly selected neurons are ignored during training. They are “dropped-out” randomly.
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
model = Sequential()
model.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
model.add(Dropout(0.5))
model.add(Dense(7,activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(1,activation="sigmoid"))
# Create optimizer with default learning rate
# Compile the model
optimizer = tf.keras.optimizers.Adam() # defining Adam as the optimizer to be used
model.compile(optimizer=optimizer,loss='binary_crossentropy',metrics=['accuracy'])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 14) 168
dropout (Dropout) (None, 14) 0
dense_1 (Dense) (None, 7) 105
dropout_1 (Dropout) (None, 7) 0
dense_2 (Dense) (None, 1) 8
=================================================================
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
start = time.time()
history = model.fit(X_train, y_train, validation_data=(X_valid,y_valid) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/25 100/100 [==============================] - 1s 3ms/step - loss: 15722.9365 - accuracy: 0.4459 - val_loss: 3570.3752 - val_accuracy: 0.3019 Epoch 2/25 100/100 [==============================] - 0s 1ms/step - loss: 6158.7163 - accuracy: 0.5323 - val_loss: 1367.8448 - val_accuracy: 0.3094 Epoch 3/25 100/100 [==============================] - 0s 1ms/step - loss: 3067.6174 - accuracy: 0.5462 - val_loss: 482.8052 - val_accuracy: 0.3719 Epoch 4/25 100/100 [==============================] - 0s 1ms/step - loss: 1524.9984 - accuracy: 0.5945 - val_loss: 66.0271 - val_accuracy: 0.4569 Epoch 5/25 100/100 [==============================] - 0s 1ms/step - loss: 872.7789 - accuracy: 0.6827 - val_loss: 0.5723 - val_accuracy: 0.7962 Epoch 6/25 100/100 [==============================] - 0s 1ms/step - loss: 527.4734 - accuracy: 0.6978 - val_loss: 0.5366 - val_accuracy: 0.7962 Epoch 7/25 100/100 [==============================] - 0s 1ms/step - loss: 332.3477 - accuracy: 0.7073 - val_loss: 0.5300 - val_accuracy: 0.7962 Epoch 8/25 100/100 [==============================] - 0s 1ms/step - loss: 156.6253 - accuracy: 0.7391 - val_loss: 0.5937 - val_accuracy: 0.7956 Epoch 9/25 100/100 [==============================] - 0s 1ms/step - loss: 94.9946 - accuracy: 0.7603 - val_loss: 0.5209 - val_accuracy: 0.7962 Epoch 10/25 100/100 [==============================] - 0s 1ms/step - loss: 64.5001 - accuracy: 0.7722 - val_loss: 0.5453 - val_accuracy: 0.7962 Epoch 11/25 100/100 [==============================] - 0s 1ms/step - loss: 39.9264 - accuracy: 0.7709 - val_loss: 0.5265 - val_accuracy: 0.7962 Epoch 12/25 100/100 [==============================] - 0s 1ms/step - loss: 24.1317 - accuracy: 0.7788 - val_loss: 0.5404 - val_accuracy: 0.7962 Epoch 13/25 100/100 [==============================] - 0s 1ms/step - loss: 21.4563 - accuracy: 0.7847 - val_loss: 0.5320 - val_accuracy: 0.7962 Epoch 14/25 100/100 [==============================] - 0s 1ms/step - loss: 13.0958 - accuracy: 0.7875 - val_loss: 0.5405 - val_accuracy: 0.7962 Epoch 15/25 100/100 [==============================] - 0s 1ms/step - loss: 14.7540 - accuracy: 0.7802 - val_loss: 0.5050 - val_accuracy: 0.7962 Epoch 16/25 100/100 [==============================] - 0s 1ms/step - loss: 6.1719 - accuracy: 0.7852 - val_loss: 0.5116 - val_accuracy: 0.7962 Epoch 17/25 100/100 [==============================] - 0s 1ms/step - loss: 9.7568 - accuracy: 0.7788 - val_loss: 0.5095 - val_accuracy: 0.7962 Epoch 18/25 100/100 [==============================] - 0s 1ms/step - loss: 8.3190 - accuracy: 0.7822 - val_loss: 0.5226 - val_accuracy: 0.7962 Epoch 19/25 100/100 [==============================] - 0s 1ms/step - loss: 7.7841 - accuracy: 0.7770 - val_loss: 0.5129 - val_accuracy: 0.7962 Epoch 20/25 100/100 [==============================] - 0s 1ms/step - loss: 5.5043 - accuracy: 0.7788 - val_loss: 0.5027 - val_accuracy: 0.7962 Epoch 21/25 100/100 [==============================] - 0s 1ms/step - loss: 5.9339 - accuracy: 0.7823 - val_loss: 0.5007 - val_accuracy: 0.7962 Epoch 22/25 100/100 [==============================] - 0s 1ms/step - loss: 2.9321 - accuracy: 0.7819 - val_loss: 0.5010 - val_accuracy: 0.7962 Epoch 23/25 100/100 [==============================] - 0s 1ms/step - loss: 4.2278 - accuracy: 0.7825 - val_loss: 0.5033 - val_accuracy: 0.7962 Epoch 24/25 100/100 [==============================] - 0s 1ms/step - loss: 3.7639 - accuracy: 0.7822 - val_loss: 0.5077 - val_accuracy: 0.7962 Epoch 25/25 100/100 [==============================] - 0s 1ms/step - loss: 3.8067 - accuracy: 0.7830 - val_loss: 0.5018 - val_accuracy: 0.7962
print("Time taken in seconds ",end-start)
Time taken in seconds 3.9581198692321777
plot(history,'loss')
model_3_train_perf = model_performance_classification(model, X_train, y_train)
model_3_train_perf
200/200 [==============================] - 0s 697us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.80 | 0.80 | 0.63 | 0.71 |
model_3_valid_perf = model_performance_classification(model, X_valid, y_valid)
model_3_valid_perf
50/50 [==============================] - 0s 951us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.80 | 0.80 | 0.63 | 0.71 |
Neural Network with Balanced Data (by applying SMOTE) and SGD Optimizer¶
X_train.shape
(6400, 11)
y_train.shape
(6400,)
Synthetic Minority Oversampling Technique to solve the idea that with imbalanced classification is that there may be too few examples of the minority class for a model to effectively learn the decision boundary.
Run without SMOTE Kernal initializer with batch normalization¶
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
'''#initialize the model
model = Sequential()
# This adds the input layer (by specifying input dimension) AND the first hidden layer (units)
model.add(Dense(units=24, input_dim = X_train.shape[1],activation='relu')) # input of 29 columns as shown above
# hidden layer
model.add(Dense(units=24,activation='relu'))
model.add(Dense(24,activation='relu'))
model.add(Dense(24,activation='relu'))
# Adding the output layer
# Notice that we do not need to specify input dim.
# we have an output of 1 node, which is the the desired dimensions of our output (fraud or not)
# We use the sigmoid because we want probability outcomes
model.add(Dense(1,activation='sigmoid')) # binary classification fraudulent or not
'''
from tensorflow.keras.layers import BatchNormalization
model = Sequential()
model.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
model.add(BatchNormalization())
model.add(Dense(7,activation="relu"))
model.add(BatchNormalization())
model.add(Dense(1,activation="sigmoid"))
Use the SGD optimizer
# Create optimizer with default learning rate
# Compile the model
optimizer = tf.keras.optimizers.Adam() # defining SGD as the optimizer to be used
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 14) 168
batch_normalization (Batch (None, 14) 56
Normalization)
dense_1 (Dense) (None, 7) 105
batch_normalization_1 (Bat (None, 7) 28
chNormalization)
dense_2 (Dense) (None, 1) 8
=================================================================
Total params: 365 (1.43 KB)
Trainable params: 323 (1.26 KB)
Non-trainable params: 42 (168.00 Byte)
_________________________________________________________________
start = time.time()
history = model.fit(X_train, y_train, validation_data=(X_valid,y_valid) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/25 100/100 [==============================] - 2s 3ms/step - loss: 0.6951 - accuracy: 0.5692 - val_loss: 0.6614 - val_accuracy: 0.7075 Epoch 2/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5909 - accuracy: 0.7717 - val_loss: 0.6013 - val_accuracy: 0.7506 Epoch 3/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5436 - accuracy: 0.7956 - val_loss: 0.5332 - val_accuracy: 0.7962 Epoch 4/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5163 - accuracy: 0.7962 - val_loss: 0.5182 - val_accuracy: 0.7962 Epoch 5/25 100/100 [==============================] - 0s 1ms/step - loss: 0.5032 - accuracy: 0.7962 - val_loss: 0.5129 - val_accuracy: 0.7962 Epoch 6/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4979 - accuracy: 0.7961 - val_loss: 0.5064 - val_accuracy: 0.7962 Epoch 7/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4974 - accuracy: 0.7964 - val_loss: 0.5073 - val_accuracy: 0.7962 Epoch 8/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4973 - accuracy: 0.7962 - val_loss: 0.5047 - val_accuracy: 0.7962 Epoch 9/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4960 - accuracy: 0.7962 - val_loss: 0.5063 - val_accuracy: 0.7962 Epoch 10/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4960 - accuracy: 0.7962 - val_loss: 0.5055 - val_accuracy: 0.7962 Epoch 11/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4953 - accuracy: 0.7961 - val_loss: 0.5069 - val_accuracy: 0.7962 Epoch 12/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4970 - accuracy: 0.7962 - val_loss: 0.5048 - val_accuracy: 0.7962 Epoch 13/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4953 - accuracy: 0.7962 - val_loss: 0.5048 - val_accuracy: 0.7962 Epoch 14/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4961 - accuracy: 0.7962 - val_loss: 0.5040 - val_accuracy: 0.7962 Epoch 15/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4959 - accuracy: 0.7962 - val_loss: 0.5043 - val_accuracy: 0.7962 Epoch 16/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4960 - accuracy: 0.7962 - val_loss: 0.5049 - val_accuracy: 0.7962 Epoch 17/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4962 - accuracy: 0.7962 - val_loss: 0.5049 - val_accuracy: 0.7962 Epoch 18/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4953 - accuracy: 0.7962 - val_loss: 0.5051 - val_accuracy: 0.7962 Epoch 19/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4954 - accuracy: 0.7964 - val_loss: 0.5048 - val_accuracy: 0.7962 Epoch 20/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4955 - accuracy: 0.7961 - val_loss: 0.5041 - val_accuracy: 0.7962 Epoch 21/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4954 - accuracy: 0.7962 - val_loss: 0.5048 - val_accuracy: 0.7962 Epoch 22/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4960 - accuracy: 0.7964 - val_loss: 0.5052 - val_accuracy: 0.7962 Epoch 23/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4958 - accuracy: 0.7962 - val_loss: 0.5045 - val_accuracy: 0.7962 Epoch 24/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4960 - accuracy: 0.7962 - val_loss: 0.5056 - val_accuracy: 0.7962 Epoch 25/25 100/100 [==============================] - 0s 1ms/step - loss: 0.4953 - accuracy: 0.7962 - val_loss: 0.5050 - val_accuracy: 0.7962
print("Time taken in seconds ",end-start)
Time taken in seconds 5.147068977355957
plot(history,'loss')
model_4_train_perf = model_performance_classification(model, X_train, y_train)
model_4_train_perf
200/200 [==============================] - 0s 706us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.80 | 0.80 | 0.63 | 0.71 |
model_4_valid_perf = model_performance_classification(model, X_valid, y_valid)
model_4_valid_perf
50/50 [==============================] - 0s 741us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.80 | 0.80 | 0.63 | 0.71 |
Run with SMOTE¶
# check version number
import imblearn
print(imblearn.__version__)
0.11.0
Use SMOTE
The challenge of working with imbalanced datasets is that most machine learning techniques will ignore, and in turn have poor performance on, the minority class, although typically it is performance on the minority class that is most important.
One approach to addressing imbalanced datasets is to oversample the minority class. The simplest approach involves duplicating examples in the minority class, although these examples don’t add any new information to the model. Instead, new examples can be synthesized from the existing examples. This is a type of data augmentation for the minority class and is referred to as th_e Synthetic Minority Oversampling Techniq_ue, o_r SMO_TE for short.
X_train.shape
(6400, 11)
# summarize class distribution
counter = Counter(y_train)
print(counter)
y_train.shape
Counter({0: 5096, 1: 1304})
(6400,)
The original paper on SMOTE suggested combining SMOTE with random undersampling of the majority class.
The imbalanced-learn library supports random undersampling via the RandomUnderSampler clas2,000).).
# define pipeline
over = SMOTE(random_state=42)
under = RandomUnderSampler(random_state=42)
steps = [('o', over), ('u', under)]
pipeline = Pipeline(steps=steps)
# transform the dataset
X_train_res, y_train_res = pipeline.fit_resample(X_train, y_train)
# summarize the new class distribution
counter = Counter(y_train_res)
print(counter)
y_train_res.shape
Counter({0: 5096, 1: 5096})
(10192,)
Apply balanced data using SMOTE as the training data and execute the same model. Use the same validation data
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network
model = Sequential()
model.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
model.add(Dense(7,activation="relu"))
model.add(Dense(1,activation="sigmoid"))
# Create optimizer with default learning rate
# Compile the model
optimizer = tf.keras.optimizers.SGD() # defining SGD as the optimizer to be used
model.compile(optimizer=optimizer,loss='binary_crossentropy',metrics=['accuracy'])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 14) 168
dense_1 (Dense) (None, 7) 105
dense_2 (Dense) (None, 1) 8
=================================================================
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
start = time.time()
history = model.fit(X_train_res, y_train_res, validation_data=(X_valid,y_valid) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/25 160/160 [==============================] - 1s 2ms/step - loss: 1496480150293177171968.0000 - accuracy: 0.4959 - val_loss: 0.6929 - val_accuracy: 0.7962 Epoch 2/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4973 - val_loss: 0.6924 - val_accuracy: 0.7962 Epoch 3/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4984 - val_loss: 0.6927 - val_accuracy: 0.7962 Epoch 4/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4912 - val_loss: 0.6928 - val_accuracy: 0.7962 Epoch 5/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4939 - val_loss: 0.6927 - val_accuracy: 0.7962 Epoch 6/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4937 - val_loss: 0.6926 - val_accuracy: 0.7962 Epoch 7/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4933 - val_loss: 0.6921 - val_accuracy: 0.7962 Epoch 8/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4959 - val_loss: 0.6928 - val_accuracy: 0.7962 Epoch 9/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4925 - val_loss: 0.6930 - val_accuracy: 0.7962 Epoch 10/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4967 - val_loss: 0.6929 - val_accuracy: 0.7962 Epoch 11/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4914 - val_loss: 0.6931 - val_accuracy: 0.7962 Epoch 12/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4953 - val_loss: 0.6929 - val_accuracy: 0.7962 Epoch 13/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4965 - val_loss: 0.6925 - val_accuracy: 0.7962 Epoch 14/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4947 - val_loss: 0.6926 - val_accuracy: 0.7962 Epoch 15/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4920 - val_loss: 0.6933 - val_accuracy: 0.2037 Epoch 16/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4955 - val_loss: 0.6932 - val_accuracy: 0.2037 Epoch 17/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4973 - val_loss: 0.6925 - val_accuracy: 0.7962 Epoch 18/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4955 - val_loss: 0.6920 - val_accuracy: 0.7962 Epoch 19/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4965 - val_loss: 0.6919 - val_accuracy: 0.7962 Epoch 20/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4927 - val_loss: 0.6921 - val_accuracy: 0.7962 Epoch 21/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4978 - val_loss: 0.6928 - val_accuracy: 0.7962 Epoch 22/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4894 - val_loss: 0.6926 - val_accuracy: 0.7962 Epoch 23/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4973 - val_loss: 0.6931 - val_accuracy: 0.7962 Epoch 24/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4904 - val_loss: 0.6932 - val_accuracy: 0.2037 Epoch 25/25 160/160 [==============================] - 0s 1ms/step - loss: 0.6932 - accuracy: 0.4865 - val_loss: 0.6934 - val_accuracy: 0.2037
print("Time taken in seconds ",end-start)
Time taken in seconds 4.89260458946228
plot(history,'loss')
model_5_train_perf = model_performance_classification(model, X_train_res, y_train_res)
model_5_train_perf
319/319 [==============================] - 0s 652us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.50 | 0.50 | 0.25 | 0.33 |
model_5_valid_perf = model_performance_classification(model, X_valid, y_valid)
model_5_valid_perf
50/50 [==============================] - 0s 698us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.20 | 0.20 | 0.04 | 0.07 |
Observation
This model fails to learn.
Neural Network with Balanced Data (by applying SMOTE) and Adam Optimizer¶
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network
model = Sequential()
model.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
model.add(Dense(7,activation="relu"))
model.add(Dense(1,activation="sigmoid"))
# Create optimizer with default learning rate
# Compile the model
optimizer = tf.keras.optimizers.Adam() # defining Adam as the optimizer to be used
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 14) 168
dense_1 (Dense) (None, 7) 105
dense_2 (Dense) (None, 1) 8
=================================================================
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Use the SMOTE generated data
start = time.time()
history = model.fit(X_train_res, y_train_res, validation_data=(X_valid,y_valid) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/25 160/160 [==============================] - 1s 2ms/step - loss: 2070.7112 - accuracy: 0.5360 - val_loss: 160.4182 - val_accuracy: 0.6162 Epoch 2/25 160/160 [==============================] - 0s 1ms/step - loss: 129.1697 - accuracy: 0.5230 - val_loss: 97.9669 - val_accuracy: 0.6531 Epoch 3/25 160/160 [==============================] - 0s 1ms/step - loss: 65.6014 - accuracy: 0.5182 - val_loss: 32.3367 - val_accuracy: 0.5181 Epoch 4/25 160/160 [==============================] - 0s 1ms/step - loss: 37.4658 - accuracy: 0.5124 - val_loss: 33.8103 - val_accuracy: 0.6150 Epoch 5/25 160/160 [==============================] - 0s 1ms/step - loss: 28.7437 - accuracy: 0.5167 - val_loss: 25.9918 - val_accuracy: 0.7781 Epoch 6/25 160/160 [==============================] - 0s 1ms/step - loss: 21.5707 - accuracy: 0.5229 - val_loss: 14.6681 - val_accuracy: 0.5481 Epoch 7/25 160/160 [==============================] - 0s 1ms/step - loss: 23.9261 - accuracy: 0.5233 - val_loss: 10.2083 - val_accuracy: 0.6906 Epoch 8/25 160/160 [==============================] - 0s 1ms/step - loss: 34.2104 - accuracy: 0.5272 - val_loss: 38.7474 - val_accuracy: 0.2450 Epoch 9/25 160/160 [==============================] - 0s 1ms/step - loss: 25.7206 - accuracy: 0.5099 - val_loss: 16.1738 - val_accuracy: 0.6513 Epoch 10/25 160/160 [==============================] - 0s 1ms/step - loss: 19.5789 - accuracy: 0.5268 - val_loss: 18.9150 - val_accuracy: 0.5231 Epoch 11/25 160/160 [==============================] - 0s 1ms/step - loss: 21.0254 - accuracy: 0.5313 - val_loss: 41.1450 - val_accuracy: 0.3319 Epoch 12/25 160/160 [==============================] - 0s 1ms/step - loss: 23.2318 - accuracy: 0.5221 - val_loss: 21.5615 - val_accuracy: 0.3856 Epoch 13/25 160/160 [==============================] - 0s 1ms/step - loss: 19.1610 - accuracy: 0.5315 - val_loss: 16.4162 - val_accuracy: 0.4344 Epoch 14/25 160/160 [==============================] - 0s 1ms/step - loss: 26.6901 - accuracy: 0.5277 - val_loss: 14.2036 - val_accuracy: 0.6913 Epoch 15/25 160/160 [==============================] - 0s 1ms/step - loss: 24.2173 - accuracy: 0.5259 - val_loss: 20.7533 - val_accuracy: 0.6769 Epoch 16/25 160/160 [==============================] - 0s 1ms/step - loss: 22.1955 - accuracy: 0.5284 - val_loss: 22.8077 - val_accuracy: 0.5825 Epoch 17/25 160/160 [==============================] - 0s 1ms/step - loss: 19.7749 - accuracy: 0.5266 - val_loss: 25.6457 - val_accuracy: 0.5188 Epoch 18/25 160/160 [==============================] - 0s 1ms/step - loss: 21.7492 - accuracy: 0.5268 - val_loss: 10.2733 - val_accuracy: 0.7638 Epoch 19/25 160/160 [==============================] - 0s 1ms/step - loss: 22.8700 - accuracy: 0.5357 - val_loss: 28.6685 - val_accuracy: 0.4006 Epoch 20/25 160/160 [==============================] - 0s 1ms/step - loss: 24.7165 - accuracy: 0.5314 - val_loss: 5.4990 - val_accuracy: 0.6112 Epoch 21/25 160/160 [==============================] - 0s 1ms/step - loss: 19.4898 - accuracy: 0.5380 - val_loss: 21.3184 - val_accuracy: 0.2569 Epoch 22/25 160/160 [==============================] - 0s 1ms/step - loss: 18.5398 - accuracy: 0.5362 - val_loss: 16.3858 - val_accuracy: 0.7056 Epoch 23/25 160/160 [==============================] - 0s 1ms/step - loss: 22.2999 - accuracy: 0.5367 - val_loss: 7.8195 - val_accuracy: 0.5550 Epoch 24/25 160/160 [==============================] - 0s 1ms/step - loss: 12.5105 - accuracy: 0.5502 - val_loss: 27.5460 - val_accuracy: 0.5806 Epoch 25/25 160/160 [==============================] - 0s 2ms/step - loss: 18.1980 - accuracy: 0.5468 - val_loss: 26.8802 - val_accuracy: 0.5138
print("Time taken in seconds ",end-start)
Time taken in seconds 5.451287746429443
plot(history,'loss')
model_6_train_perf = model_performance_classification(model, X_train_res, y_train_res)
model_6_train_perf
319/319 [==============================] - 0s 670us/step
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.58 | 0.58 | 0.59 | 0.58 |
model_6_valid_perf = model_performance_classification(model, X_valid, y_valid)
model_6_valid_perf
50/50 [==============================] - 0s 703us/step
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.51 | 0.51 | 0.73 | 0.56 |
Neural Network with Balanced Data (by applying SMOTE), Adam Optimizer, and Dropout¶
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
Include dropout to the model
#Initializing the neural network
model = Sequential()
model.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
model.add(Dropout(0.5))
model.add(Dense(7,activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(1,activation="sigmoid"))
Use Adam Optimizer
# Create optimizer with default learning rate
# Compile the model
optimizer = tf.keras.optimizers.Adam() # defining Adam as the optimizer to be used
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
Use SMOTE data
start = time.time()
history = model.fit(X_train_res, y_train_res, validation_data=(X_valid,y_valid) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/25 160/160 [==============================] - 1s 2ms/step - loss: 5737.2070 - accuracy: 0.5182 - val_loss: 9.5873 - val_accuracy: 0.7381 Epoch 2/25 160/160 [==============================] - 0s 1ms/step - loss: 1096.4552 - accuracy: 0.5134 - val_loss: 1.4944 - val_accuracy: 0.4725 Epoch 3/25 160/160 [==============================] - 0s 1ms/step - loss: 409.4463 - accuracy: 0.4982 - val_loss: 0.8929 - val_accuracy: 0.7962 Epoch 4/25 160/160 [==============================] - 0s 1ms/step - loss: 206.0640 - accuracy: 0.5093 - val_loss: 1.0632 - val_accuracy: 0.7962 Epoch 5/25 160/160 [==============================] - 0s 1ms/step - loss: 116.9651 - accuracy: 0.5033 - val_loss: 1.9489 - val_accuracy: 0.4725 Epoch 6/25 160/160 [==============================] - 0s 1ms/step - loss: 74.9371 - accuracy: 0.5047 - val_loss: 0.6003 - val_accuracy: 0.7962 Epoch 7/25 160/160 [==============================] - 0s 1ms/step - loss: 60.3058 - accuracy: 0.5126 - val_loss: 0.9451 - val_accuracy: 0.4725 Epoch 8/25 160/160 [==============================] - 0s 1ms/step - loss: 51.5475 - accuracy: 0.5008 - val_loss: 0.9024 - val_accuracy: 0.4725 Epoch 9/25 160/160 [==============================] - 0s 1ms/step - loss: 40.9379 - accuracy: 0.4974 - val_loss: 2.1763 - val_accuracy: 0.4725 Epoch 10/25 160/160 [==============================] - 0s 1ms/step - loss: 28.0309 - accuracy: 0.5030 - val_loss: 0.6803 - val_accuracy: 0.7962 Epoch 11/25 160/160 [==============================] - 0s 1ms/step - loss: 18.8631 - accuracy: 0.5005 - val_loss: 0.9078 - val_accuracy: 0.7962 Epoch 12/25 160/160 [==============================] - 0s 1ms/step - loss: 16.0324 - accuracy: 0.5079 - val_loss: 0.8146 - val_accuracy: 0.4725 Epoch 13/25 160/160 [==============================] - 0s 1ms/step - loss: 13.8177 - accuracy: 0.5107 - val_loss: 0.8078 - val_accuracy: 0.4725 Epoch 14/25 160/160 [==============================] - 0s 1ms/step - loss: 7.5131 - accuracy: 0.5058 - val_loss: 1.6459 - val_accuracy: 0.4725 Epoch 15/25 160/160 [==============================] - 0s 1ms/step - loss: 6.7458 - accuracy: 0.5041 - val_loss: 1.3728 - val_accuracy: 0.4725 Epoch 16/25 160/160 [==============================] - 0s 1ms/step - loss: 5.3867 - accuracy: 0.5090 - val_loss: 0.7230 - val_accuracy: 0.4787 Epoch 17/25 160/160 [==============================] - 0s 1ms/step - loss: 6.1176 - accuracy: 0.5023 - val_loss: 0.6234 - val_accuracy: 0.7962 Epoch 18/25 160/160 [==============================] - 0s 1ms/step - loss: 2.9383 - accuracy: 0.5117 - val_loss: 0.6369 - val_accuracy: 0.7962 Epoch 19/25 160/160 [==============================] - 0s 1ms/step - loss: 4.1585 - accuracy: 0.5052 - val_loss: 0.6959 - val_accuracy: 0.5894 Epoch 20/25 160/160 [==============================] - 0s 1ms/step - loss: 2.3146 - accuracy: 0.5091 - val_loss: 0.6842 - val_accuracy: 0.6519 Epoch 21/25 160/160 [==============================] - 0s 1ms/step - loss: 2.2331 - accuracy: 0.4996 - val_loss: 0.8741 - val_accuracy: 0.5725 Epoch 22/25 160/160 [==============================] - 0s 1ms/step - loss: 3.1681 - accuracy: 0.5082 - val_loss: 0.6725 - val_accuracy: 0.7962 Epoch 23/25 160/160 [==============================] - 0s 1ms/step - loss: 1.9785 - accuracy: 0.5120 - val_loss: 0.6885 - val_accuracy: 0.6313 Epoch 24/25 160/160 [==============================] - 0s 1ms/step - loss: 1.1643 - accuracy: 0.5049 - val_loss: 0.6661 - val_accuracy: 0.7962 Epoch 25/25 160/160 [==============================] - 0s 1ms/step - loss: 1.6094 - accuracy: 0.5079 - val_loss: 0.6490 - val_accuracy: 0.7962
print("Time taken in seconds ",end-start)
Time taken in seconds 5.657803297042847
plot(history,'loss')
model_7_train_perf = model_performance_classification(model, X_train_res, y_train_res)
model_7_train_perf
319/319 [==============================] - 0s 675us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.50 | 0.50 | 0.25 | 0.33 |
model_7_valid_perf = model_performance_classification(model, X_valid, y_valid)
model_7_valid_perf
50/50 [==============================] - 0s 563us/step
C:\Users\bruce\anaconda3\Lib\site-packages\sklearn\metrics\_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior. _warn_prf(average, modifier, msg_start, len(result))
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.80 | 0.80 | 0.63 | 0.71 |
Model Performance Comparison and Final Model Selection¶
Model comparision¶
# training performance comparison
models_train_comp_df = pd.concat(
[
model_0_train_perf.T,
model_1_train_perf.T,
model_2_train_perf.T,
model_3_train_perf.T,
model_4_train_perf.T,
model_5_train_perf.T,
model_6_train_perf.T,
model_7_train_perf.T
],
axis=1,
)
models_train_comp_df.columns = [
"Neural Network with SGD Optimizer without class weight",
"Neural Network with SGD Optimizer with class weight",
"Neural Network with Adam Optimizer",
"Neural Network with Adam Optimizer and Dropout",
"Neural Network with Balanced Data (by applying SMOTE) with Batch Normalization",
"Neural Network with Balanced Data (by applying SMOTE) and SGD Optimizer",
"Neural Network with Balanced Data (by applying SMOTE) and Adam Optimizer",
"Neural Network with Balanced Data (by applying SMOTE), Adam Optimizer, and Dropout"
]
#Validation performance comparison
models_valid_comp_df = pd.concat(
[
model_0_valid_perf.T,
model_1_valid_perf.T,
model_2_valid_perf.T,
model_3_valid_perf.T,
model_4_valid_perf.T,
model_5_valid_perf.T,
model_6_valid_perf.T,
model_7_valid_perf.T,
],
axis=1,
)
models_valid_comp_df.columns = [
"Neural Network with SGD Optimizer without class weight",
"Neural Network with SGD Optimizer with class weight",
"Neural Network with Adam Optimizer",
"Neural Network with Adam Optimizer and Dropout",
"Neural Network with Balanced Data (by applying SMOTE) with Batch Normalization",
"Neural Network with Balanced Data (by applying SMOTE) and SGD Optimizer",
"Neural Network with Balanced Data (by applying SMOTE) and Adam Optimizer",
"Neural Network with Balanced Data (by applying SMOTE), Adam Optimizer, and Dropout"
]
models_train_comp_df
| Neural Network with SGD Optimizer without class weight | Neural Network with SGD Optimizer with class weight | Neural Network with Adam Optimizer | Neural Network with Adam Optimizer and Dropout | Neural Network with Balanced Data (by applying SMOTE) with Batch Normalization | Neural Network with Balanced Data (by applying SMOTE) and SGD Optimizer | Neural Network with Balanced Data (by applying SMOTE) and Adam Optimizer | Neural Network with Balanced Data (by applying SMOTE), Adam Optimizer, and Dropout | |
|---|---|---|---|---|---|---|---|---|
| Accuracy | 0.80 | 0.80 | 0.80 | 0.80 | 0.80 | 0.50 | 0.58 | 0.50 |
| Recall | 0.80 | 0.80 | 0.80 | 0.80 | 0.80 | 0.50 | 0.58 | 0.50 |
| Precision | 0.63 | 0.63 | 0.63 | 0.63 | 0.63 | 0.25 | 0.59 | 0.25 |
| F1 Score | 0.71 | 0.71 | 0.71 | 0.71 | 0.71 | 0.33 | 0.58 | 0.33 |
models_valid_comp_df
| Neural Network with SGD Optimizer without class weight | Neural Network with SGD Optimizer with class weight | Neural Network with Adam Optimizer | Neural Network with Adam Optimizer and Dropout | Neural Network with Balanced Data (by applying SMOTE) with Batch Normalization | Neural Network with Balanced Data (by applying SMOTE) and SGD Optimizer | Neural Network with Balanced Data (by applying SMOTE) and Adam Optimizer | Neural Network with Balanced Data (by applying SMOTE), Adam Optimizer, and Dropout | |
|---|---|---|---|---|---|---|---|---|
| Accuracy | 0.80 | 0.80 | 0.80 | 0.80 | 0.80 | 0.20 | 0.51 | 0.80 |
| Recall | 0.80 | 0.80 | 0.80 | 0.80 | 0.80 | 0.20 | 0.51 | 0.80 |
| Precision | 0.63 | 0.63 | 0.63 | 0.63 | 0.63 | 0.04 | 0.73 | 0.63 |
| F1 Score | 0.71 | 0.71 | 0.71 | 0.71 | 0.71 | 0.07 | 0.56 | 0.71 |
models_train_comp_df.loc["F1 Score"] - models_valid_comp_df.loc["F1 Score"]
Neural Network with SGD Optimizer without class weight 0.00 Neural Network with SGD Optimizer with class weight 0.00 Neural Network with Adam Optimizer 0.00 Neural Network with Adam Optimizer and Dropout 0.00 Neural Network with Balanced Data (by applying SMOTE) with Batch Normalization 0.00 Neural Network with Balanced Data (by applying SMOTE) and SGD Optimizer 0.26 Neural Network with Balanced Data (by applying SMOTE) and Adam Optimizer 0.02 Neural Network with Balanced Data (by applying SMOTE), Adam Optimizer, and Dropout -0.37 Name: F1 Score, dtype: float64
Final model selection¶
Several models tied with Neural Network with Adam Optimizer and Dropout had slightly better accuracy than Neural Network with Adam Optimizer. Although Neural Network with SGD Optimizer with class weight was a simpler model. These two models were a quite a bit better with F1 score than our Random Forest model that served as a baseline.
Neural Network with Adam Optimizer and Dropout seemed to learn the result over several epochs.
Each of the models tested well and did not overfit.
Balancing the data with SMOTE was expected to provide a better fit, did not seem to live up to its promise in all cases. A more sophisticated neural net may have gotten better results. Batch Normalization did not improve the results either.
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
model = Sequential()
model.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
model.add(Dropout(0.5))
model.add(Dense(7,activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(1,activation="sigmoid"))
# Create optimizer with default learning rate
# Compile the model
optimizer = tf.keras.optimizers.Adam() # defining Adam as the optimizer to be used
model.compile(optimizer=optimizer,loss='binary_crossentropy',metrics=['accuracy'])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 14) 168
dropout (Dropout) (None, 14) 0
dense_1 (Dense) (None, 7) 105
dropout_1 (Dropout) (None, 7) 0
dense_2 (Dense) (None, 1) 8
=================================================================
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
start = time.time()
history = model.fit(X_train, y_train, validation_data=(X_valid,y_valid) , batch_size=batch_size, epochs=epochs, class_weight=cw_dict)
end=time.time()
Epoch 1/25 100/100 [==============================] - 1s 5ms/step - loss: 15709.1074 - accuracy: 0.5131 - val_loss: 2989.4175 - val_accuracy: 0.7962 Epoch 2/25 100/100 [==============================] - 0s 1ms/step - loss: 5375.0508 - accuracy: 0.5788 - val_loss: 1243.5782 - val_accuracy: 0.7962 Epoch 3/25 100/100 [==============================] - 0s 1ms/step - loss: 2576.6531 - accuracy: 0.6580 - val_loss: 296.6782 - val_accuracy: 0.7962 Epoch 4/25 100/100 [==============================] - 0s 1ms/step - loss: 1200.6157 - accuracy: 0.6812 - val_loss: 135.3955 - val_accuracy: 0.7962 Epoch 5/25 100/100 [==============================] - 0s 1ms/step - loss: 706.2908 - accuracy: 0.7000 - val_loss: 57.9750 - val_accuracy: 0.7962 Epoch 6/25 100/100 [==============================] - 0s 1ms/step - loss: 382.9190 - accuracy: 0.7255 - val_loss: 17.1516 - val_accuracy: 0.7962 Epoch 7/25 100/100 [==============================] - 0s 1ms/step - loss: 261.1572 - accuracy: 0.7344 - val_loss: 0.5685 - val_accuracy: 0.7962 Epoch 8/25 100/100 [==============================] - 0s 1ms/step - loss: 165.5640 - accuracy: 0.7478 - val_loss: 0.5664 - val_accuracy: 0.7956 Epoch 9/25 100/100 [==============================] - 0s 1ms/step - loss: 118.9462 - accuracy: 0.7569 - val_loss: 0.5504 - val_accuracy: 0.7962 Epoch 10/25 100/100 [==============================] - 0s 1ms/step - loss: 107.8973 - accuracy: 0.7600 - val_loss: 0.5425 - val_accuracy: 0.7962 Epoch 11/25 100/100 [==============================] - 0s 1ms/step - loss: 76.8876 - accuracy: 0.7678 - val_loss: 0.5358 - val_accuracy: 0.7962 Epoch 12/25 100/100 [==============================] - 0s 1ms/step - loss: 59.1148 - accuracy: 0.7622 - val_loss: 0.5295 - val_accuracy: 0.7962 Epoch 13/25 100/100 [==============================] - 0s 1ms/step - loss: 63.2185 - accuracy: 0.7725 - val_loss: 0.5246 - val_accuracy: 0.7962 Epoch 14/25 100/100 [==============================] - 0s 1ms/step - loss: 42.8717 - accuracy: 0.7705 - val_loss: 0.5209 - val_accuracy: 0.7962 Epoch 15/25 100/100 [==============================] - 0s 1ms/step - loss: 32.6726 - accuracy: 0.7672 - val_loss: 0.5172 - val_accuracy: 0.7962 Epoch 16/25 100/100 [==============================] - 0s 1ms/step - loss: 29.6535 - accuracy: 0.7766 - val_loss: 0.5147 - val_accuracy: 0.7962 Epoch 17/25 100/100 [==============================] - 0s 1ms/step - loss: 16.6469 - accuracy: 0.7800 - val_loss: 0.5127 - val_accuracy: 0.7962 Epoch 18/25 100/100 [==============================] - 0s 1ms/step - loss: 15.9111 - accuracy: 0.7769 - val_loss: 0.5111 - val_accuracy: 0.7962 Epoch 19/25 100/100 [==============================] - 0s 1ms/step - loss: 14.7296 - accuracy: 0.7816 - val_loss: 0.5098 - val_accuracy: 0.7962 Epoch 20/25 100/100 [==============================] - 0s 1ms/step - loss: 11.6211 - accuracy: 0.7797 - val_loss: 0.5088 - val_accuracy: 0.7962 Epoch 21/25 100/100 [==============================] - 0s 1ms/step - loss: 9.0398 - accuracy: 0.7808 - val_loss: 0.5080 - val_accuracy: 0.7962 Epoch 22/25 100/100 [==============================] - 0s 1ms/step - loss: 7.3933 - accuracy: 0.7817 - val_loss: 0.5074 - val_accuracy: 0.7962 Epoch 23/25 100/100 [==============================] - 0s 1ms/step - loss: 6.8600 - accuracy: 0.7817 - val_loss: 0.5069 - val_accuracy: 0.7962 Epoch 24/25 100/100 [==============================] - 0s 1ms/step - loss: 3.3733 - accuracy: 0.7816 - val_loss: 0.5066 - val_accuracy: 0.7962 Epoch 25/25 100/100 [==============================] - 0s 1ms/step - loss: 1.8299 - accuracy: 0.7834 - val_loss: 0.5063 - val_accuracy: 0.7962
print("Time taken in seconds ",end-start)
Time taken in seconds 4.231794834136963
plot(history,'loss')
y_train_pred = model.predict(X_train)
y_valid_pred = model.predict(X_valid)
y_test_pred = model.predict(X_test)
200/200 [==============================] - 0s 647us/step 50/50 [==============================] - 0s 636us/step 63/63 [==============================] - 0s 647us/step
import warnings
warnings.filterwarnings('ignore')
print("Classification Report - Train data",end="\n\n")
cr = classification_report(y_train,y_train_pred>0.5)
print(cr)
Classification Report - Train data
precision recall f1-score support
0 0.80 1.00 0.89 5096
1 0.00 0.00 0.00 1304
accuracy 0.80 6400
macro avg 0.40 0.50 0.44 6400
weighted avg 0.63 0.80 0.71 6400
print("Classification Report - Validation data",end="\n\n")
cr = classification_report(y_valid,y_valid_pred>0.5)
print(cr)
Classification Report - Validation data
precision recall f1-score support
0 0.80 1.00 0.89 1274
1 0.00 0.00 0.00 326
accuracy 0.80 1600
macro avg 0.40 0.50 0.44 1600
weighted avg 0.63 0.80 0.71 1600
print("Classification Report - Test data",end="\n\n")
cr = classification_report(y_test,y_test_pred>0.5)
print(cr)
Classification Report - Test data
precision recall f1-score support
0 0.80 1.00 0.89 1593
1 0.00 0.00 0.00 407
accuracy 0.80 2000
macro avg 0.40 0.50 0.44 2000
weighted avg 0.63 0.80 0.71 2000
Observation
The weighted F1 score on the test data is ~0.71 with an accuracy of about 80%/
An F1 score of ~0.71 indicates a good balance between precision and recall, suggesting moderate performance in accurately classifying instances with minimal false positives and false negatives.
Model can be further tuned to deal with minority class.
Actionable Insights and Business Recommendations¶
Observations
The financial institution can deploy the final model from this exercise to determine whether a customer will leave the bank or not in the next 6 months.
The company should prioritize gender diversity initiatives
Although women have a smaller representation, they represent a high number of exixits.
Bank activity shows a slight correlation to exiting, meaning those who are have activity with their accounts are less likely
Additional research
Aging clients may represent an opportunity for reducing exits, although additional research may be required. Age may related to deaths, severe illness, and a consolidation of accounts with a single financial institution. Additional research may indicate there is an opportunity to grow this business, depending on the underlying reason for the exits.to exit.
Additional study using more complex neural nets could lead to better results, such as the following, may show better results.tions.
'''
#initialize the model
model = Sequential()
#This adds the input layer (by specifying input dimension) AND the first hidden layer (units)
model.add(Dense(units=24, input_dim = X_train.shape[1],activation='relu')) # input of 29 columns as shown above
#hidden layer
model.add(Dense(units=24,activation='relu'))
model.add(Dense(24,activation='relu'))
model.add(Dense(24,activation='relu'))
#Adding the output layer
#Notice that we do not need to specify input dim.
#we have an output of 1 node, which is the the desired dimensions of our output (fraud or not)
#We use the sigmoid because we want probability outcomes
model.add(Dense(1,activation='sigmoid')) # binary classification fraudulent or not
'''
print("Power Ahead")
Power Ahead
Power Ahead ___
Bruce D. Kyle, Oct 14, 2024