5 Ways to Detect Fake Dollar Bills Using Python Machine Learning

Machine learning models and neural networks are increasingly used to detect fraud. Modern technology lets us detect scams without having to identify and scrutinize everything manually. Even in an era where digital currency dominates, verifying that physical currency is authentic remains a growing concern.

Counterfeit money can pose a problem to all of the stakeholders in an economy. It can result in financial losses to businesses, individuals, and also to the government. With the increase in technological advancement, fake currency has become more sophisticated and tougher to differentiate from actual notes or bills.

In this article, we will detect fake one-dollar bills in a dataset using five classification techniques: Logistic Regression, Naive Bayes, K-Nearest Neighbour (KNN), Support Vector Machine (SVM), and a Neural Network. We will then compare the results of these techniques on metrics such as true positive rate, F1 score, and geometric mean, and conclude which one is the most accurate at telling fake bills apart from real dollar bills. Let’s get started!

A brief overview of the fake bills dataset

The fake bills dataset contains 1500 entries, where 1000 entries are instances of a real one-dollar bill and the remaining 500 entries are instances of fake dollar bills. You can find the dataset here.

Besides having 1500 rows, it also has 7 columns representing the characteristics of a dollar bill. They are:

is_genuine: This is our target variable and contains the boolean values True or False. A genuine dollar bill is labelled True; a fake one is labelled False.
diagonal: This is the diagonal measurement of the dollar bill (float).
height_left: This is the height of the left side of the dollar bill (float).
height_right: This is the height of the right side of the dollar bill (float).
margin_low: This is the lower margin length of the dollar bill (float).
margin_up: This is the upper margin length of the dollar bill (float).
length: This is the length of the dollar bill (float).

Figure: Fake Bills Dataset

We will be using the sklearn, numpy, and pandas libraries extensively in this article, along with statsmodels, TensorFlow, and keras.

Method 1: Logistic Regression

Logistic regression is a statistical method used mainly for binary classification tasks, where the target variable takes one of two values. It estimates the probability that a given instance belongs to a particular category by fitting a logistic (sigmoid) function to the data.
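As a minimal sketch of this idea (not the model fitted in this article), the logistic function turns a linear score into a probability between 0 and 1. The weights, bias, and feature values below are hypothetical and chosen only for illustration.

```python
from math import exp

def sigmoid(z):
    """Logistic function: maps any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + exp(-z))

def predict_proba(features, weights, bias):
    """Probability of the positive class for one instance."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)

# Hypothetical weights, bias, and features (illustration only)
weights = [1.5, -2.0]
bias = 0.25
features = [1.0, 0.5]

probability = predict_proba(features, weights, bias)
label = probability >= 0.5  # assign the positive class at a 0.5 threshold
print(f"probability={probability:.4f}, label={label}")
```

Fitting the model amounts to choosing the weights and bias that make these probabilities agree with the training labels as closely as possible.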

The code for fitting a logistic regression model with the fake bill dataset is given below:

# Logistic regression
# Import the libraries needed to run the logistic regression model
from math import sqrt

import pandas as pd
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split

print("Logistic Regression classification=")

# Read the data from a CSV file
data = pd.read_csv("/content/fake_bills (1).csv", delimiter=";")

# Drop rows with missing values
data.dropna(inplace=True)

# Split the data into features (X) and the target variable (y)
x = data[['diagonal', 'height_left', 'height_right', 'margin_low', 'margin_up', 'length']]
y = data['is_genuine']
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

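# Train the scikit-learn logistic regression model on the training split
logreg_model = LogisticRegression(random_state=42)
logreg_model.fit(x_train, y_train)

# Fit a statsmodels Logit on the full dataset (no intercept term) to get the coefficient summary
log_reg = sm.Logit(y.astype(int), x).fit()
print(log_reg.summary())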

# Predictions on the test set
y_pred = logreg_model.predict(x_test)

# Confusion matrix: rows are actual classes, columns are predicted classes (False, True)
conf_matrix = confusion_matrix(y_test, y_pred)
print("Confusion Matrix : \n", conf_matrix)

# Precision
precision = precision_score(y_test, y_pred)
print(f"Precision: {precision:.4f}")

# F-Measure (F1 Score)
f_measure = f1_score(y_test, y_pred)
print(f"F-Measure (F1 Score): {f_measure:.4f}")

# Accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")

# True Positive Rate (Sensitivity, Recall)
tpr = recall_score(y_test, y_pred)
print(f"True Positive Rate (Sensitivity, Recall): {tpr:.4f}")

# False Positive Rate
fpr = conf_matrix[0, 1] / (conf_matrix[0, 1] + conf_matrix[0, 0])
print(f"False Positive Rate: {fpr:.4f}")

# True Negative Rate (Specificity)
tnr = conf_matrix[0, 0] / (conf_matrix[0, 1] + conf_matrix[0, 0])
print(f"True Negative Rate (Specificity): {tnr:.4f}")

# False Negative Rate
fnr = conf_matrix[1, 0] / (conf_matrix[1, 0] + conf_matrix[1, 1])
print(f"False Negative Rate: {fnr:.4f}")

# Geometric Mean
g_mean = sqrt(tpr * tnr)
print(f"Geometric Mean: {g_mean:.4f}")

# Specificity (same as the true negative rate)
specificity = tnr
print(f"Specificity: {specificity:.4f}")

print()

The output of the above code is:

Logistic Regression classification=
Optimization terminated successfully.
         Current function value: 0.027098
         Iterations 12
                           Logit Regression Results                           
==============================================================================
Dep. Variable:             is_genuine   No. Observations:                 1463
Model:                          Logit   Df Residuals:                     1457
Method:                           MLE   Df Model:                            5
Date:                Tue, 05 Mar 2024   Pseudo R-squ.:                  0.9576
Time:                        11:30:16   Log-Likelihood:                -39.644
converged:                       True   LL-Null:                       -934.20
Covariance Type:            nonrobust   LLR p-value:                     0.000
================================================================================
                   coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------
diagonal        -0.4755      0.727     -0.654      0.513      -1.901       0.950
height_left     -1.5227      1.053     -1.446      0.148      -3.587       0.541
height_right    -3.4686      1.145     -3.030      0.002      -5.712      -1.225
margin_low      -6.0609      0.993     -6.103      0.000      -8.007      -4.115
margin_up      -10.4068      2.183     -4.768      0.000     -14.685      -6.129
length           5.8826      0.874      6.734      0.000       4.170       7.595
================================================================================

Possibly complete quasi-separation: A fraction 0.53 of observations can be
perfectly predicted. This might indicate that there is complete
quasi-separation. In this case some parameters will not be identified.
Confusion Matrix : 
 [[ 96   0]
 [  0 197]]
Precision: 1.0000
F-Measure (F1 Score): 1.0000
Accuracy: 1.0000
True Positive Rate (Sensitivity, Recall): 1.0000
False Positive Rate: 0.0000
True Negative Rate (Specificity): 1.0000
False Negative Rate: 0.0000
Geometric Mean: 1.0000
Specificity: 1.0000

Interpretation: The logistic regression summary shows the relationship between the target variable and each independent variable. The p-values of the independent variables can help you perform feature selection: you can keep only those variables whose p-value is less than 0.05.
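The selection rule described here can be sketched with the p-values copied from the summary output above. The 0.05 threshold comes from the text; the dictionary and variable names are only for illustration.

```python
# P>|z| values copied from the Logit summary above
p_values = {
    "diagonal": 0.513,
    "height_left": 0.148,
    "height_right": 0.002,
    "margin_low": 0.000,
    "margin_up": 0.000,
    "length": 0.000,
}

ALPHA = 0.05  # significance threshold mentioned in the text

selected = [name for name, p in p_values.items() if p < ALPHA]
dropped = [name for name, p in p_values.items() if p >= ALPHA]

print("Selected features:", selected)
print("Candidates to drop:", dropped)
```

With these values, diagonal and height_left are the candidates to drop. Because the summary also reports possible quasi-separation, the p-values should be read with some caution.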

As seen from the F1 score, accuracy, geometric mean, and the true and false positive rates, the model performs extremely well: every one of these values is perfect. This means the model classifies all 293 bills in the test set correctly.

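Note: The quasi-separation message tells you that some feature values, or combinations of them, perfectly predict the class for part of the data (here, a fraction 0.53 of the observations). This makes detecting fake bills easier, but as the warning says, some of the coefficients in the summary may not be identified. To know more about quasi-separation, click here.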

Method 2: Naive Bayes

The Naive Bayes classifier is a probabilistic classifier based on Bayes' theorem. It assumes that the predictor (independent) variables are independent of each other given the class. It calculates a probability for each class and assigns the class with the highest probability. Let’s get into the code for this classifier.
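Before the scikit-learn version, here is a rough plain-Python sketch of what a Gaussian Naive Bayes classifier does: it estimates a mean and variance per feature and per class, then picks the class with the highest log-posterior. The function names and the toy data are hypothetical and are not taken from the fake bills dataset.

```python
from math import log, pi

def fit_gaussian_nb(rows, labels):
    """Estimate the prior, per-feature mean, and per-feature variance of each class."""
    model = {}
    for cls in set(labels):
        members = [r for r, l in zip(rows, labels) if l == cls]
        n = len(members)
        means = [sum(col) / n for col in zip(*members)]
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9  # small constant avoids division by zero
            for col, m in zip(zip(*members), means)
        ]
        model[cls] = (n / len(rows), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest log-posterior for one instance."""
    def log_posterior(cls):
        prior, means, variances = model[cls]
        # Independence assumption: the per-feature log-likelihoods simply add up
        log_likelihood = sum(
            -0.5 * log(2 * pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
        return log(prior) + log_likelihood
    return max(model, key=log_posterior)

# Hypothetical two-feature toy data (illustration only)
rows = [[4.0, 113.0], [4.2, 113.3], [4.1, 112.9], [5.2, 111.5], [5.4, 111.8], [5.1, 111.6]]
labels = [True, True, True, False, False, False]

model = fit_gaussian_nb(rows, labels)
print(predict(model, [4.1, 113.1]))
print(predict(model, [5.3, 111.7]))
```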

# Naive Bayes
import random
from math import sqrt

import pandas as pd
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

print("Naive Bayes Classification=")


# Read the data from a CSV file
data = pd.read_csv("/content/fake_bills (1).csv", delimiter=";")


# Drop rows with missing values
data.dropna(inplace=True)

# Split the data into features (X) and the target variable (y)
X = data[['diagonal', 'height_left', 'height_right', 'margin_low', 'margin_up', 'length']]
y = data['is_genuine']
# The seed is drawn at random between 30 and 40, so the split and the exact metrics can vary between runs
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=random.randint(30,40))

# Gaussian Naive Bayes classifier
gnb = GaussianNB()
# Loop for as many iterations as you want
for i in range(1):
  print(i+1,"th iteration is ")
  y_pred = gnb.fit(X_train, y_train).predict(X_test)
  print("Number of mislabeled points out of a total %d points : %d"
      % (X_test.shape[0], (y_test != y_pred).sum()))
  print(classification_report(y_test,y_pred))
  conf_matrix = confusion_matrix(y_test, y_pred)
  print('Confusion matrix= ',conf_matrix)

  # True Positive Rate (Sensitivity, Recall)
  tpr = recall_score(y_test, y_pred)
  print(f"True Positive Rate (Sensitivity, Recall): {tpr:.4f}")

  # False Positive Rate
  fpr = conf_matrix[0, 1] / (conf_matrix[0, 1] + conf_matrix[0, 0])
  print(f"False Positive Rate: {fpr:.4f}")

  # True Negative Rate (Specificity)
  tnr = conf_matrix[0, 0] / (conf_matrix[0, 1] + conf_matrix[0, 0])
  print(f"True Negative Rate (Specificity): {tnr:.4f}")

  # False Negative Rate
  fnr = conf_matrix[1, 0] / (conf_matrix[1, 0] + conf_matrix[1, 1])
  print(f"False Negative Rate: {fnr:.4f}")

  # Precision
  precision = precision_score(y_test, y_pred)
  print(f"Precision: {precision:.4f}")

  # F-Measure (F1 Score)
  f_measure = f1_score(y_test, y_pred)
  print(f"F-Measure (F1 Score): {f_measure:.4f}")

  # Geometric Mean
  g_mean = sqrt(tpr * tnr)
  print(f"Geometric Mean: {g_mean:.4f}")

  # Accuracy
  accuracy = accuracy_score(y_test, y_pred)
  print(f"Accuracy: {accuracy:.4f}")

  # Specificity
  specificity = tnr
  print(f"Specificity: {specificity:.4f}")

  print()

The output would be:

Naive Bayes Classification=
1 th iteration is 
Number of mislabeled points out of a total 439 points : 2
              precision    recall  f1-score   support

       False       0.99      0.99      0.99       156
        True       1.00      1.00      1.00       283

    accuracy                           1.00       439
   macro avg       1.00      1.00      1.00       439
weighted avg       1.00      1.00      1.00       439

Confusion matrix=  [[155   1]
 [  1 282]]
True Positive Rate (Sensitivity, Recall): 0.9965
False Positive Rate: 0.0064
True Negative Rate (Specificity): 0.9936
False Negative Rate: 0.0035
Precision: 0.9965
F-Measure (F1 Score): 0.9965
Geometric Mean: 0.9950
Accuracy: 0.9954
Specificity: 0.9936

Interpretation: This classifier also works well, with minimal false negative and false positive rates. It gives us 99.54% accuracy and 99.36% specificity, which is slightly lower than the logistic regression model. Note that this run uses a 30% test split with a randomly chosen seed, so the figures can vary slightly between runs.
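As a worked check, the metrics reported above can be recomputed by hand from the confusion matrix alone. This sketch uses only the four counts printed in the Naive Bayes output; the variable names are for illustration.

```python
from math import sqrt

# Confusion matrix from the Naive Bayes output above:
# rows are actual classes (False, True), columns are predicted classes (False, True)
conf_matrix = [[155, 1],
               [1, 282]]

tn, fp = conf_matrix[0]
fn, tp = conf_matrix[1]

tpr = tp / (tp + fn)          # sensitivity / recall
fpr = fp / (fp + tn)          # fake bills wrongly accepted as genuine
tnr = tn / (tn + fp)          # specificity
fnr = fn / (fn + tp)          # genuine bills wrongly flagged as fake
precision = tp / (tp + fp)
f1 = 2 * precision * tpr / (precision + tpr)
accuracy = (tp + tn) / (tp + tn + fp + fn)
g_mean = sqrt(tpr * tnr)

print(f"TPR={tpr:.4f} FPR={fpr:.4f} TNR={tnr:.4f} FNR={fnr:.4f}")
print(f"Precision={precision:.4f} F1={f1:.4f} Accuracy={accuracy:.4f} G-mean={g_mean:.4f}")
```

The geometric mean combines sensitivity and specificity, so it drops to 0 whenever a model fails completely on one of the two classes, even if its accuracy looks acceptable.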

Method 3: K-Nearest Neighbour (KNN)

K-nearest neighbour, or KNN, is an instance-based classification technique that assigns a data point to the majority class among its k nearest neighbours, where k is a predefined positive integer. k can be any number from 1 up to the total number of data points in the given dataset. You can set it as you please, or you can experiment with various values of k, compare metrics such as accuracy and specificity, and select accordingly. We have selected k=10 here. The implementation of KNN to detect fake bills is given below.
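Before the scikit-learn version, the voting rule itself can be sketched in a few lines of plain Python. This is a simplified illustration using Euclidean distance; the function name and the toy data are hypothetical.

```python
from collections import Counter
from math import dist

def knn_predict(train_rows, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the training points by Euclidean distance to the query and keep the k closest
    neighbours = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature toy data (illustration only)
train_rows = [[4.0, 113.0], [4.2, 113.3], [4.1, 112.9], [5.2, 111.5], [5.4, 111.8], [5.1, 111.6]]
train_labels = [True, True, True, False, False, False]

print(knn_predict(train_rows, train_labels, [4.3, 113.1], k=3))
print(knn_predict(train_rows, train_labels, [5.0, 111.9], k=3))
```

Because the vote depends directly on distances, features measured on very different scales can dominate the result, which is one reason to try several values of k and compare the metrics.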

# KNN
from math import sqrt

import pandas as pd
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

print("KNN classification=")

# Read the data from a CSV file
data = pd.read_csv("/content/fake_bills (1).csv", delimiter=";")

# Drop rows with missing values
data.dropna(inplace=True)

# Split the data into features (X) and the target variable (y)
X = data[['diagonal', 'height_left', 'height_right', 'margin_low', 'margin_up', 'length']]
y = data['is_genuine']

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


# Define the K values to try
k_values = [10]  # A single k value here; add more values to compare (adjust as needed)

# Initialize a list to store the AUC value for each K
auc_values = []

for k in k_values:
    # Initialize the KNN classifier for the current k value
    knn_classifier = KNeighborsClassifier(n_neighbors=k)

    # Train the KNN classifier
    knn_classifier.fit(X_train, y_train)

    print('for k= ',k,'the measures are= ')

    # Calculate the confusion matrix
    y_pred = knn_classifier.predict(X_test)
    conf_matrix = confusion_matrix(y_test, y_pred)
    print('Confusion matrix= ',conf_matrix)


    # True Positive Rate (Sensitivity, Recall)
    tpr = recall_score(y_test, y_pred)
    print(f"True Positive Rate (Sensitivity, Recall): {tpr:.4f}")

    # False Positive Rate
    fpr = conf_matrix[0, 1] / (conf_matrix[0, 1] + conf_matrix[0, 0])
    print(f"False Positive Rate: {fpr:.4f}")

    # True Negative Rate (Specificity)
    tnr = conf_matrix[0, 0] / (conf_matrix[0, 1] + conf_matrix[0, 0])
    print(f"True Negative Rate (Specificity): {tnr:.4f}")

    # False Negative Rate
    fnr = conf_matrix[1, 0] / (conf_matrix[1, 0] + conf_matrix[1, 1])
    print(f"False Negative Rate: {fnr:.4f}")

    # Precision
    precision = precision_score(y_test, y_pred)
    print(f"Precision: {precision:.4f}")

    # F-Measure (F1 Score)
    f_measure = f1_score(y_test, y_pred)
    print(f"F-Measure (F1 Score): {f_measure:.4f}")

    # Geometric Mean
    g_mean = sqrt(tpr * tnr)
    print(f"Geometric Mean: {g_mean:.4f}")

    # Accuracy
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Accuracy: {accuracy:.4f}")

    # Specificity
    specificity = tnr
    print(f"Specificity: {specificity:.4f}")

    # ROC-AUC, computed here from the hard class predictions rather than predicted probabilities
    roc_auc = roc_auc_score(y_test, y_pred)
    auc_values.append(roc_auc)
    print(f"ROC-AUC Score: {roc_auc:.4f}")

    print()

The output would be:

KNN classification=
for k=  10 the measures are= 
Confusion matrix=  [[ 96   0]
 [  1 196]]
True Positive Rate (Sensitivity, Recall): 0.9949
False Positive Rate: 0.0000
True Negative Rate (Specificity): 1.0000
False Negative Rate: 0.0051
Precision: 1.0000
F-Measure (F1 Score): 0.9975
Geometric Mean: 0.9975
Accuracy: 0.9966
Specificity: 1.0000
ROC-AUC Score: 0.9975

Interpretation: This classifier also works well with our dataset, with 100% specificity and 99.66% accuracy. It misclassifies a single genuine bill as fake.

Method 4: Support Vector Machine

A support vector machine is a supervised learning algorithm that finds the optimal hyperplane separating the classes in the input dataset, that is, the hyperplane that maximizes the margin to the closest data points of each class. These closest points are called support vectors. Let’s take a look at how to implement it in Python.
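To make the idea of a margin concrete, here is a small sketch that measures the margin of two candidate hyperplanes on hypothetical 2-D points. It is a simplified, linear illustration only: it does not train an SVM, and the implementation below uses an RBF kernel rather than a linear one.

```python
from math import sqrt

def margin(w, b, points, labels):
    """Smallest signed distance from the hyperplane w.x + b = 0 to any point.

    Labels are +1 or -1; a negative result means some point is misclassified.
    """
    norm = sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )

# Hypothetical, linearly separable 2-D points (illustration only)
points = [[1.0, 1.0], [2.0, 1.0], [4.0, 4.0], [5.0, 4.0]]
labels = [-1, -1, 1, 1]

# Two hypothetical candidate hyperplanes that both separate the classes
candidates = {
    "vertical line x = 3": ([1.0, 0.0], -3.0),
    "diagonal x + y = 5.5": ([1.0, 1.0], -5.5),
}

for name, (w, b) in candidates.items():
    print(f"{name}: margin = {margin(w, b, points, labels):.4f}")
```

Both lines classify every point correctly, but the diagonal one leaves more room on either side. An SVM searches for the separating hyperplane with the largest such margin.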

# SVM classifier
import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Read the data from a CSV file
df = pd.read_csv("/content/fake_bills (1).csv", delimiter=";")

# Drop rows with missing values
df.dropna(inplace=True)

for i in range(1):
  print(i+1,"iteration is :")
  # The first column (is_genuine) is the target; the remaining columns are the features
  y = df.iloc[:, 0].values
  X = df.iloc[:, 1:].values
  svm = SVC(kernel="rbf", gamma=0.5, C=1.0)

  # Split the data, train the model, and predict on the test set
  X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
  svm.fit(X_train, y_train)
  y_pred = svm.predict(X_test)

  cm = confusion_matrix(y_test,y_pred)
  print(cm)

  # Hold-out validation: read the counts from the confusion matrix
  TP = cm[1, 1]
  FP = cm[0, 1]
  TN = cm[0, 0]
  FN = cm[1, 0]

  # Metrics
  accuracy = accuracy_score(y_test, y_pred)
  precision = precision_score(y_test, y_pred)
  recall = recall_score(y_test, y_pred)
  f1 = f1_score(y_test, y_pred)
  specificity = TN / (TN + FP)
  tpr = recall
  fpr = 1 - specificity
  tnr = specificity
  fnr = 1 - tpr

  # Geometric Mean
  geometric_mean = (tpr * tnr)**0.5

  print(f"Accuracy: {accuracy}")
  print(f"Precision: {precision}")
  print(f"Recall (True Positive Rate): {tpr}")
  print(f"False Positive Rate: {fpr}")
  print(f"True Negative Rate: {tnr}")
  print(f"False Negative Rate: {fnr}")
  print(f"F1 Score: {f1}")
  print(f"Geometric Mean: {geometric_mean}")
  print(f"Specificity: {specificity}")

  print()

The output would be:

1 iteration is :
[[ 96   0]
 [  0 197]]
Accuracy: 1.0
Precision: 1.0
Recall (True Positive Rate): 1.0
False Positive Rate: 0.0
True Negative Rate: 1.0
False Negative Rate: 0.0
F1 Score: 1.0
Geometric Mean: 1.0
Specificity: 1.0

Interpretation: The SVM classifier works perfectly with our dataset, as all of the measures show 100% accuracy and zero false positive and false negative rates. This method is on par with the logistic regression method and is the best one for our dataset.

Method 5: Neural Network

A neural network loosely mimics the working of a human brain: it takes an input, transforms it through multiple layers, and gives an output that classifies the input into one of the predefined classes. It first trains itself on labelled data and then predicts.
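As a rough plain-Python sketch of "transforming an input through layers", the snippet below runs one forward pass through a network with the same shape as the Keras model that follows (6 inputs, 40 hidden neurons, 1 output, sigmoid activations). The weights are random and untrained, and the bill measurements are hypothetical, so the printed probability has no meaning beyond illustrating the mechanics.

```python
import random
from math import exp

def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + exp(-z))

def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output neuron."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + b
        for neuron_weights, b in zip(weights, biases)
    ]

random.seed(0)  # fixed seed so the sketch is repeatable
n_inputs, n_hidden = 6, 40

# Randomly initialised (untrained) parameters, for illustration only
w_hidden = [[random.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [[random.uniform(-1, 1) for _ in range(n_hidden)]]
b_out = [0.0]

# Hidden layer: 6*40 weights + 40 biases; output layer: 40 weights + 1 bias
n_params = n_hidden * n_inputs + n_hidden + n_hidden * 1 + 1
print("Trainable parameters:", n_params)

# Hypothetical measurements for one bill, in the order of the six feature columns
bill = [171.8, 104.0, 103.9, 4.5, 3.1, 113.0]

hidden = [sigmoid(z) for z in dense(bill, w_hidden, b_hidden)]
output = sigmoid(dense(hidden, w_out, b_out)[0])
print("Predicted probability of a genuine bill:", output)
```

Training adjusts the weights and biases so that this output moves towards 1 for genuine bills and towards 0 for fake ones.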

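# Neural network
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, Activation
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split

# Read the data from an Excel file
dataset = pd.read_excel("/content/fake_bills.xlsx", header=0)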

# Create the skeleton model
model = Sequential()
model.add(Dense(40, input_shape=(6,))) # Hidden layer with 40 neurons and 6 input features
model.add(Activation('sigmoid'))      # Activation function in the hidden layer
model.add(Dense(1))                   # Output layer definition
model.add(Activation('sigmoid'))      # Activation function in the output layer
# Compile the model and track accuracy as a metric:
model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['accuracy']) # sgd: stochastic gradient descent
# Print a summary of the Keras model:
model.summary()

y = dataset.is_genuine.values
x = dataset.drop('is_genuine', axis=1)

x_train, x_test, y_train, y_test= train_test_split(x, y, train_size=0.8)

mdl = model.fit(x_train,y_train, epochs=100)

y_pred = model.predict(x_test)

y_pclass = np.where(y_pred>=0.5,1,0)

cm = confusion_matrix(y_test, y_pclass)
print (cm)
print(f1_score(y_test, y_pclass))
print(accuracy_score(y_test, y_pclass))

# Hold-out validation: read the counts from the confusion matrix
TP = cm[1, 1]
FP = cm[0, 1]
TN = cm[0, 0]
FN = cm[1, 0]

# Metrics
accuracy = accuracy_score(y_test, y_pclass)
precision = precision_score(y_test, y_pclass)
recall = recall_score(y_test, y_pclass)
f1 = f1_score(y_test, y_pclass)
specificity = TN / (TN + FP)
tpr = recall
fpr = 1 - specificity
tnr = specificity
fnr = 1 - tpr

# Geometric Mean
geometric_mean = (tpr * tnr)**0.5

print(f"Accuracy: {accuracy}")
print(f"Precision: {precision}")
print(f"Recall (True Positive Rate): {tpr}")
print(f"False Positive Rate: {fpr}")
print(f"True Negative Rate: {tnr}")
print(f"False Negative Rate: {fnr}")
print(f"F1 Score: {f1}")
print(f"Geometric Mean: {geometric_mean}")
print(f"Specificity: {specificity}")

The output will be:

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense_2 (Dense)             (None, 40)                280       
                                                                 
 activation_2 (Activation)   (None, 40)                0         
                                                                 
 dense_3 (Dense)             (None, 1)                 41        
                                                                 
 activation_3 (Activation)   (None, 1)                 0         
                                                                 
=================================================================
Total params: 321 (1.25 KB)
Trainable params: 321 (1.25 KB)
Non-trainable params: 0 (0.00 Byte)
Epoch 1/100
38/38 [==============================] - 1s 4ms/step - loss: 0.2395 - accuracy: 0.6083
Epoch 2/100
38/38 [==============================] - 0s 5ms/step - loss: 0.2265 - accuracy: 0.6550
.......
Epoch 100/100
38/38 [==============================] - 0s 2ms/step - loss: 0.2188 - accuracy: 0.6600
10/10 [==============================] - 0s 2ms/step
[[  0  86]
 [  0 214]]
0.8326848249027238
0.7133333333333334
Accuracy: 0.7133333333333334
Precision: 0.7133333333333334
Recall (True Positive Rate): 1.0
False Positive Rate: 1.0
True Negative Rate: 0.0
False Negative Rate: 0.0
F1 Score: 0.8326848249027238
Geometric Mean: 0.0
Specificity: 0.0

Interpretation: The neural network performs poorly on our dataset. As the confusion matrix shows, it predicts every bill as genuine, so its specificity is 0 and its accuracy of about 71% is very low when compared to the other models.

Inference

From the above results, we see that logistic regression and the support vector machine perform the best classification on the given fake bills dataset when compared to the other techniques. In logistic regression, there is possible complete quasi-separation, which might be problematic in some cases. Hence, the best technique in this exercise for detecting fake dollar bills is the support vector machine. Hope this was a fun project for you all to try out!
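For a side-by-side view, the accuracy and geometric mean reported in each section can be collected and ranked. The numbers are copied from the outputs above. Keep in mind that the runs are not strictly like-for-like: Naive Bayes uses a 30% test split with a random seed, and the neural network split is unseeded.

```python
# (accuracy, geometric mean) copied from the outputs reported in each section
results = {
    "Logistic Regression": (1.0000, 1.0000),
    "Naive Bayes": (0.9954, 0.9950),
    "K-Nearest Neighbour": (0.9966, 0.9975),
    "Support Vector Machine": (1.0000, 1.0000),
    "Neural Network": (0.7133, 0.0000),
}

# Rank by accuracy, breaking ties with the geometric mean
ranking = sorted(results.items(), key=lambda item: item[1], reverse=True)

print(f"{'Model':<24}{'Accuracy':>10}{'G-mean':>10}")
for name, (accuracy, g_mean) in ranking:
    print(f"{name:<24}{accuracy:>10.4f}{g_mean:>10.4f}")
```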