Supervised Models for Loan Fraud Analysis using Big Data Approach

Attigeri, Girija V and Pai, Manohara M.M. and Pai, Radhika M (2021) Supervised Models for Loan Fraud Analysis using Big Data Approach. Engineering Letters, 29 (4). ISSN 1816-093X

Abstract

In the last few years, banking and financial institutions have faced mounting pressure from increased defaults by individuals and firms, a repercussion of fraudulent activities. This adversely affects not only banks but also the other financial sectors that depend on them. It is therefore imperative to study ways of preventing such defaults rather than remedying them after the fact. However, banks face two challenges in identifying Non-Performing Assets (NPAs) and wilful defaults. The first is due diligence on firms and individuals before a loan is extended. The second is the need for automated safeguards to reduce fraud originating from human behavior. Wilful defaults are committed mainly in loan and credit services for personal benefit and turn into bad loans. Bad loans are NPAs, and wilful defaults are a subset of them. Hence, it is very important to control NPAs. The objective of the paper is to design and evaluate machine learning based supervised models for NPA detection. Designing the models requires considering all historical and current data, which in turn requires fast access to large volumes of heterogeneous data. Hence, the supervised models are implemented using big data techniques for fraud detection and analytics. Five supervised models, namely Logistic Regression, Support Vector Machine, Random Forest, Neural Network, and Naive Bayes, are designed for loan data and run as experiments using MapReduce on the Hadoop platform. These models are evaluated on several performance metrics. The empirical results show that the Neural Network model performs best on precision, recall, relative commission error, and the kappa statistic for NPA prediction. The best-performing model can be integrated into an existing loan management system for the early identification of NPA cases.
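
As an illustration of three of the evaluation metrics named in the abstract, the following is a minimal sketch of how precision, recall, and Cohen's kappa can be computed from a binary confusion matrix (NPA versus non-NPA). It is not the authors' implementation: the function name `binary_metrics` and the example counts are hypothetical, and relative commission error is left out because the abstract does not define it.

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, and Cohen's kappa from binary confusion-matrix counts.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives (positive class = NPA).
    The function name and interface are illustrative assumptions.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Precision: share of predicted NPAs that really are NPAs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual NPAs that the model caught.
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # Observed agreement between predictions and labels.
    observed = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    # Kappa: agreement beyond chance, scaled by the maximum possible.
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 0.0

    return {"precision": precision, "recall": recall, "kappa": kappa}


# Hypothetical counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
print(metrics)
```

Kappa is useful alongside precision and recall when classes are imbalanced, as is typical for loan defaults, because it discounts the agreement a classifier would reach by chance.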

Item Type: Article
Uncontrolled Keywords: Loan Frauds, Non-Performing Assets, Machine Learning, Supervised Models, Big Data Approach, Hadoop Platform
Subjects: Engineering > MIT Manipal > Information and Communication Technology
Depositing User: MIT Library
Date Deposited: 08 Jan 2022 05:25
Last Modified: 08 Jan 2022 05:25
URI: http://eprints.manipal.edu/id/eprint/158015