An integrated method for detecting financial statement fraud
Chandan Goyal1*, Dr. Bharat Khurana2
1 Student, Bachelor of Commerce, Department of Commerce,
Zakir Husain Delhi College, University of Delhi, New Delhi
chandan.goyalstd@gmail.com
2 Professor, Department of Commerce, Zakir Husain Delhi College, University of Delhi,
New Delhi
Abstract:
Financial statement fraud jeopardises corporate transparency, investor trust,
and the integrity of capital markets, which calls for better detection methods.
This study proposes an integrated strategy for identifying financial statement
fraud that merges behavioural signals drawn from narrative disclosures with
standardised financial indicators and governance monitoring factors. The
analysis begins by standardising all variables using z-scores to ensure
comparability across scales. The dataset consists of company-year observations
labelled as High Risk or Low Risk based on composite scoring. Descriptive
statistics confirm that the financial and governance variables are properly
normalised, and early comparisons show that High Risk observations exhibit
stronger financial pressure signals and weaker governance monitoring.
Correlation analysis shows that the Final Fraud Risk Score (FRS) has a strong
association with financial pressure and a moderate association with the
governance opportunity and narrative behaviour indicators, demonstrating that
fraud risk is multifaceted. Logistic regression using only the underlying
z-score variables provides interpretable insights and highlights accruals as a
key predictor of high-risk categorisation. To further improve predictive
performance, several machine learning models (SVM, Random Forest, and Decision
Tree) are evaluated using stratified training and holdout testing. The results
show that SVM and Random Forest achieve the best accuracy when the engineered
sub-scores, namely the Financial Pressure Score (FPS), Governance Opportunity
Score (GOS), and Narrative Behaviour Score (NBS), are included alongside the
z-score indicators. The findings suggest that auditors, regulators, and
forensic practitioners can benefit from an integrated, multi-layered approach
to fraud detection, since it improves accuracy and dependability.
Keywords: Fraud Detection, Z-Score Indicators, Governance Monitoring, Narrative
Behaviour, Machine Learning, Fraud Risk Scoring, SVM, Random Forest
INTRODUCTION
One of the most important problems facing modern financial
systems is financial statement fraud, which has been seriously undermining
market efficiency, shareholder confidence, and the reliability of company
reporting. The intentional manipulation, misrepresentation, or omission of
financial data with the intention of misleading the company's
stakeholders—including investors, regulators, creditors, and auditors—is known
as financial statement fraud (Beneish, M. D., 2017, 57–82). Due to the intense competition and globalisation of
today's economy, businesses are always under pressure to satisfy their
financial obligations, turn a profit, raise the value of their stock, and
maintain a favourable reputation in the marketplace. These pressures are often
imposed by investors, regulation, performance requirements, and executive
compensation schemes that depend on financial success. As a result, managers
and corporate executives may resort to unethical practices such as earnings
management, concealment of liabilities, improper revenue recognition, and
misleading disclosures in order to present an artificially improved financial
position. In addition to deceiving stakeholders, these actions have significant
negative economic consequences, including losses of investor wealth, systemic
instability in financial markets, and even business bankruptcies. The
complexity of modern corporate transactions, together with the evolution of
financial instruments and accounting practices, has made fraud more difficult
to track down. Because standard auditing processes and regulatory systems rely
on sampling techniques, rule-based assessments, and periodic reviews, they may
fail to detect latent manipulation tendencies or emerging fraud risks in time
(Chen, J., 2018, 1009–1028).
Numerous high-profile corporate scandals that have been revealed globally have
demonstrated the shortcomings of conventional fraud detection techniques as
well as the system's inefficiencies in terms of governance, monitoring, and
analysis. These failures highlight the urgent need for more sophisticated,
proactive, and data-driven systems that may detect financial misconduct early
on before it becomes a major catastrophe.
Combining statistics and machine learning techniques
has drawn a lot of attention as a potential solution to improve fraud
detection. Compared to conventional approaches, these techniques can analyse
enormous volumes of organised and unstructured data, identify intricate
correlations between variables, and generate more accurate prediction insights (Dechow,
P. M., 2017, 881–915). Financial indicators such as accruals, leverage ratios,
and profitability metrics have been widely used to detect anomalies, and
governance-related factors such as board independence, audit quality, and
ownership structure have been widely used to provide an understanding of the
organisational environment that can either encourage or prevent fraudulent
activity. Furthermore, narrative disclosures in annual reports, management
discussions and analyses, and other company communications contain abundant
behavioural information that can serve as indicators of fraud risk. Due to the
intricacy of financial statement fraud, academics and practitioners
increasingly recognise that no single approach is sufficient to capture every
aspect of fraudulent conduct. Instead, to improve detection efficacy, a
comprehensive approach that integrates financial, governance, and behavioural
indicators is required. Standardisation techniques such as z-score
normalisation are crucial for ensuring that variables measured on different
scales can be compared and that the data support more robust statistical
analysis and model construction. Furthermore, machine learning models like
Support Vector Machines (SVM), Random Forest, and Decision Trees enhance
predictive power by capturing nonlinear relationships and interactions among
variables, while classification models like logistic regression provide
interpretable information on the significance of particular predictors.
This study contributes to the existing body of knowledge by proposing an
integrated framework for financial statement fraud detection that spans several
analytical dimensions. The proposed approach provides a comprehensive picture
of fraud risk by combining standardised financial metrics, governance
monitoring indicators, and narrative-based behavioural signals (Dong, W., 2020,
113–123). To support the dependability and generalisability of the results, the
study also follows a structured research process comprising data normalisation,
composite risk scoring, and model validation using stratified training and
testing data. The findings are expected to help auditors, regulators, forensic
accountants, and company management by enabling the early identification of
problem companies and improving decision-making. Ultimately, putting such
integrated, data-driven solutions into practice can improve corporate
governance, boost transparency, and restore trust in financial reporting
systems, all of which support the stability and long-term viability of the
world's financial markets.
Financial Statement Fraud and Its Determinants
Financial statement fraud is a deliberate crime that
involves attempting to present an inaccurate image of a company's performance
and financial situation. It is often driven by a variety of factors that are
frequently explained by the fraud triangle hypothesis, which includes pressure,
opportunity, and rationalisation. Financial pressure can arise from bad
performance, debt obligations, or fierce competition, which forces management
to manipulate profits to meet expectations. Inadequate governance frameworks,
lax internal controls, and a lack of regulatory supervision create
opportunities for fraudulent activity to occur and go unnoticed (Gepp,
A., 2018, 102–115).
Rationalisation, in turn, is the justification of unethical behaviour by those
who believe it is harmless or that they are acting morally. High-powered
executive incentives, weak corporate governance, a lack of transparency, and
complicated financial reporting are among the elements that contribute to
financial statement fraud (Kotsiantis, S., 2018, 326–336). Additionally,
off-balance-sheet transactions and sophisticated accounting techniques have
become more common and have made fraudulent activity easier to conceal.
Understanding these determinants is essential for building efficient detection
tools, because it allows stakeholders to identify risk factors and take
preventive action. Organisations can improve internal controls and reduce the
likelihood of false financial reporting by addressing these underlying causes
(Kukreja, G., 2020, 773–784).
Need for an Integrated Fraud Detection Approach
Due to the complexity and growing prevalence of financial statement fraud,
detection techniques that go beyond traditional auditing must be used.
Conventional methods rely mostly on financial ratio analysis and manual review,
which are not always enough to spot hidden patterns or fraud risks (Li, Y.,
2021, 145–160). The evolution of fraud schemes necessitates advanced analytical
tools that can analyse large amounts of data, identify irregularities, and
provide accurate predictions. An integrated approach combines several aspects
of analysis, including financial measures, governance factors, and behavioural
indicators derived from narrative disclosures (Perols, J., 2018, 1–20).
Financial indicators help detect anomalies in accounting figures, while
governance variables provide information on the effectiveness of an
organisation's monitoring procedures. Narrative analysis, in turn, examines
qualitative disclosures in order to spot discrepancies or misleading messages
that could point to fraud. Statistical standardisation techniques (such as
z-score normalisation) ensure the consistency and comparability of variables,
which improves analytical accuracy (Ravisankar, P., 2017, 491–500). Machine
learning models such as Support Vector Machines, Random Forest, and Decision
Trees are used to find complicated and nonlinear relationships between
variables that more traditional approaches could overlook. By combining these
techniques, organisations can create more effective and efficient fraud
detection systems, which ultimately strengthens stakeholder credibility,
accountability, and transparency.
Objectives of the Study
Limitations of the Study and Managerial Implications
Limitations of the Study:
There are certain limitations to this research. First, the study relies on
secondary data, which may be skewed or erroneous. Second, information on
governance and narrative disclosure may not always be readily available across
companies and over time. Third, despite their power, machine learning models
can overfit or become difficult to interpret under some circumstances.
Additionally, because businesses and regions may differ in their regulatory
frameworks and reporting traditions, the results may not be entirely applicable
to all of them.
Managerial Implications:
Despite these drawbacks, managers, auditors, and regulators can benefit from
the research. The integrated fraud detection system may help organisations
identify early indicators of financial statement fraud and so manage risks more
effectively. The results can help managers strengthen corporate governance
procedures and internal controls and make financial reporting more transparent.
Auditors and forensic experts might use the suggested models to flag high-risk
cases and allocate resources more efficiently. In general, the research
supports the use of data-driven decision-making strategies to reduce fraud risk
and promote ethical corporate conduct.
REVIEW OF LITERATURE
Sodnomdavaa, T. (2025) Previous research on financial statement fraud detection
has extensively covered the use of financial ratios and mathematical models to
identify irregularities in corporate reporting. Early studies emphasised ratio
analysis as a key tool for spotting anomalies in profitability, liquidity, and
leverage, suggesting that unusual shifts in these metrics may point to
potential manipulation. Researchers found that the Beneish M-score and Altman
Z-score models are useful for flagging possible earnings management and
predicting financial distress. However, these approaches typically rely heavily
on historical financial data and may not take into account the qualitative
aspects of fraud. Subsequent studies argue that financial indicators are
important but, because fraud schemes are dynamic, cannot be used on their own.
As a result, academics began advocating for the inclusion of characteristics
other than financial data. This research indicates that both conventional and
more advanced financial analysis techniques need to be taken into account in
order to improve detection accuracy and reduce false positives.
Haq, M. A. (2024) The effectiveness of corporate governance in preventing and
identifying financial statement fraud has been extensively studied in the
literature. According to this research, the most crucial factors in reducing
the likelihood of fraud are sound governance structures, including an
independent board, a strong audit committee, and transparent ownership.
Conversely, inadequate governance creates an environment in which managerial
opportunism may flourish. Empirical evidence shows that companies with lax
internal controls and weak regulatory oversight are more likely to experience
fraudulent reporting and earnings manipulation. Furthermore, external auditors
and institutional investors have been found to enhance accountability and
oversight. However, a number of studies demonstrate that governance mechanisms
alone are insufficient to eliminate the risk of fraud completely, since
sophisticated managers may still exploit systemic flaws. As a result,
researchers have proposed hybrid models that incorporate both financial
measures and governance indicators. Overall, the research demonstrates that
governance variables are important components of fraud risk assessment and must
be included in any overall framework for fraud detection.
Al-Shammari, M. (2024) Research on fraud detection has been greatly influenced
by recent advances in machine learning and data analytics. Scholars have
investigated the classification of fraudulent and non-fraudulent financial
statements using algorithms such as Decision Trees, Support Vector Machines
(SVM), Artificial Neural Networks, and Random Forest. Because these models can
handle large datasets and identify complicated, nonlinear relationships between
variables, they are typically more successful than traditional statistical
methods. Studies comparing machine learning methods with logistic regression
have found that ensemble methods, such as Random Forest, often perform better.
However, model interpretability and transparency remain difficult problems,
particularly in regulatory and auditing settings where explainability is
crucial. A potential solution is to strike a balance between interpretable
models and high-performance algorithms so that both accuracy and transparency
are achieved. The literature suggests that machine learning can transform fraud
detection, especially when paired with domain expertise and traditional
analytical techniques.
Elshafie, H. (2022) The analysis of narrative disclosures and textual data to
detect financial fraud is another important area of study. As unstructured data
in annual reports, management discussions, and corporate communications becomes
more accessible, scholars are examining the linguistic and behavioural traces
of fraud. According to this research, dishonest businesses frequently use
ambiguous language, overstated prospects, and complex terminology when
reporting bad news. Sentiment analysis, readability metrics, and keyword
frequency analysis have been employed to extract useful information from
textual data. These techniques complement quantitative financial analysis by
offering an additional perspective on managers' intentions and behavioural
patterns. However, text data are difficult to standardise, and reporting
formats are inconsistent. Despite these drawbacks, adding narrative analysis to
fraud detection models has shown positive results, improving the detection of
manipulation that would be hard to identify from numerical data alone.
Zhao, Q., Lai, D. (2022) The most recent literature has emphasised the need for
integrated and multifaceted fraud detection techniques. According to
researchers, financial statement fraud is a complex phenomenon influenced by a
combination of behavioural tendencies, governance failures, and financial
hardship. To improve predictive performance, studies have proposed integrated
frameworks that include financial ratios, governance metrics, and textual data.
Data standardisation techniques, such as z-score normalisation, have been
emphasised as crucial for making variables measured on different scales
comparable. Empirical evidence indicates that integrated models are more
accurate, dependable, and robust than single-method approaches. The use of
hybrid approaches that combine statistical models with machine learning
algorithms is another recent development. These techniques not only enhance
detection capabilities but also provide auditors and regulators with practical
guidance. Overall, the body of research strongly supports integrated frameworks
as a more effective way to address the complexity of financial statement fraud
in the modern corporate environment.
METHODOLOGY
Research Design and Analytical Framework
This work develops and evaluates an integrated strategy for identifying
financial statement fraud using a quantitative, multi-stage analytical design.
The study framework builds a comprehensive fraud risk model by integrating
narrative disclosure measures, governance monitoring indicators, and
conventional financial ratios. The design follows a sequential framework that
begins with data preprocessing and standardisation and proceeds through
descriptive analytics, correlation testing, logistic regression for
interpretability, and machine learning classification for predictive
evaluation. The Financial Pressure Score (FPS), Governance Opportunity Score
(GOS), and Narrative Behaviour Score (NBS) are the three main components of a
composite fraud risk scoring system that the study uses to classify
observations as high or low risk. These scores reflect the distinct factors
that contribute to fraud risk and allow the efficacy of the integrated model to
be gauged. The overall architecture supports a comprehensive evaluation of both
the explanatory and the predictive components of fraud detection.
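The text does not give the exact formula behind the composite scores, so the following is only a minimal sketch of how such a scoring scheme could be assembled. The sign conventions, the equal weights, and the median cut-off are all assumptions made for illustration and are not taken from the study; only the variable names (Z_ROA, Z_Accrual, and so on) follow Table 1.

```python
from statistics import fmean, median

# ASSUMED sign conventions (hypothetical, not from the study): +1 means a higher
# z-score raises the sub-score, -1 means a higher z-score lowers it.
FPS_SIGNS = {"Z_ROA": -1, "Z_ROE": -1, "Z_CurrentRatio": -1,
             "Z_DebtEquity": +1, "Z_Accrual": +1}
GOS_SIGNS = {"Z_IndepDir": -1, "Z_AuditIndep": -1, "Z_Promoter": +1}


def sub_score(row, signs):
    """Average the sign-adjusted z-scores that belong to one dimension."""
    return fmean(sign * row[name] for name, sign in signs.items())


def fraud_risk_scores(rows, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return one (FPS, GOS, NBS, FRS) tuple per row.

    NBS is read directly from the row because it comes from text analysis.
    The equal weights are an assumption for illustration.
    """
    w_f, w_g, w_n = weights
    results = []
    for row in rows:
        fps = sub_score(row, FPS_SIGNS)
        gos = sub_score(row, GOS_SIGNS)
        nbs = row["NBS"]
        results.append((fps, gos, nbs, w_f * fps + w_g * gos + w_n * nbs))
    return results


def label_by_median(frs_values):
    """Label observations above the median FRS as High Risk (assumed cut-off)."""
    cut = median(frs_values)
    return ["High Risk" if v > cut else "Low Risk" for v in frs_values]


# Two hypothetical company-years: one neutral, one with very high accruals.
neutral = dict.fromkeys(list(FPS_SIGNS) + list(GOS_SIGNS), 0.0)
rows = [dict(neutral, NBS=0.0), dict(neutral, Z_Accrual=5.0, NBS=0.0)]
scores = fraud_risk_scores(rows)
labels = label_by_median([s[3] for s in scores])
```

Because the labels in such a scheme are derived from the same sub-scores, any model that later uses FPS, GOS, or NBS as features is partly predicting its own inputs, which is worth keeping in mind when reading accuracy figures.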
Data Collection, Variables, and Standardization
Each company-year observation in the dataset was labelled as either "High Risk"
or "Low Risk" to facilitate comparisons. To maintain uniformity among
scale-dependent variables, all governance and financial metrics were converted
to standardised z-scores. According to descriptive statistics, the
normalisation succeeded in making the variables more comparable by bringing
their means close to zero and their standard deviations close to one. The key
variables are accrual levels, audit independence metrics, board independence
measures, promoter shareholding patterns, liquidity measures (such as the
Current Ratio), leverage indicators (such as the Debt-to-Equity Ratio), and
profitability ratios. Together, these variables capture both financial
performance pressure and the quality of governance, two important determinants
of the likelihood of fraud. In addition, narrative indicators obtained from
textual analysis of annual report disclosures were used to add a behavioural
dimension to the dataset. This organised set of factors is in line with the
theoretical frameworks of the fraud triangle and fraud diamond, which highlight
the financial, governance, and behavioural aspects of deceit.
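The standardisation step can be illustrated with a short sketch. It uses the usual definition z = (x - mean) / standard deviation; the choice of the population standard deviation and the sample values are assumptions for illustration, since the text does not state which variant was used.

```python
from statistics import fmean, pstdev


def z_scores(values):
    """Standardise a list of numbers to mean 0 and population std 1."""
    mu = fmean(values)
    sigma = pstdev(values, mu)
    if sigma == 0:
        raise ValueError("cannot standardise a constant variable")
    return [(v - mu) / sigma for v in values]


# Hypothetical debt-to-equity ratios for five company-years (illustrative only).
debt_equity = [0.4, 1.2, 0.9, 2.5, 0.7]
z_debt_equity = z_scores(debt_equity)
```

After this transformation every variable is expressed in standard-deviation units, so a ratio and a percentage shareholding can enter the same model on an equal footing.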
Correlation and Multicollinearity Assessment
Before building the prediction models, we used correlation analysis to examine
the relationships among the Final Fraud Risk Score (FRS), the underlying
z-score variables, and the FPS, GOS, and NBS component scores. This allowed us
to check the degree of multicollinearity and to see whether the variables
behaved in accordance with our theoretical expectations. The results showed
that FRS had a strong correlation with financial pressure, a moderate
correlation with governance opportunity and narrative behaviour, and that the
subcomponent scores were only weakly correlated with one another, confirming
that each dimension represents different risk features. Because correlations
across predictors were low to moderate, these variables could be used in
logistic regression and machine learning models without major multicollinearity
difficulties. This assessment supports the statistical soundness of the
subsequent analyses.
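One common way to quantify multicollinearity is the variance inflation factor (VIF), which for standardised predictors equals the corresponding diagonal element of the inverse correlation matrix. The study does not say it computed VIFs, so the sketch below is an illustration only; it applies the idea to the FPS, GOS, and NBS correlations reported in Table 2, and the helper `invert` is a hypothetical utility written for this example.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # The right-hand half now holds the inverse.
    return [row[n:] for row in aug]


# Sub-score correlations as reported in Table 2 (order: FPS, GOS, NBS).
names = ["FPS", "GOS", "NBS"]
R = [[1.0, -0.067, 0.079],
     [-0.067, 1.0, 0.095],
     [0.079, 0.095, 1.0]]

R_inv = invert(R)
vif = {name: R_inv[i][i] for i, name in enumerate(names)}
```

With pairwise correlations below 0.10 in absolute value, the resulting VIFs sit very close to 1, which is consistent with the claim that the three sub-scores carry largely separate information.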
Logistic Regression for Interpretability
Logistic regression was used as an interpretable baseline model to determine
whether the standardised factors are associated with a company-year
observation's High Risk classification. The final regression model used only
the raw z-score variables, since engineered sub-scores such as FPS, GOS, and
NBS produced perfect-separation warnings. To limit class-imbalance bias, a
balanced subset of 60 observations was created, with 30 classified as high risk
and 30 as low risk. According to the regression analysis, accrual levels were a
statistically significant predictor of high-risk categorisation. However, owing
to the limited sample, the coefficients for the profitability and leverage
variables should be interpreted with care, even though they were directionally
meaningful. The primary functions of the logistic regression were to provide
interpretability and to illustrate the underlying contribution of the various
variables.
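For reference, the baseline model can be written out as a standard binary logit. The predictor set shown here is the one listed in Table 3; the exact specification is not stated in the text, so this should be read as an assumed form rather than the authors' definitive equation. Here $p_i$ denotes the probability that observation $i$ is classified as High Risk.

```latex
\[
\ln\!\left(\frac{p_i}{1-p_i}\right)
  = \beta_0
  + \beta_1\,\mathrm{Z\_ROA}_i
  + \beta_2\,\mathrm{Z\_ROE}_i
  + \beta_3\,\mathrm{Z\_DebtEquity}_i
  + \beta_4\,\mathrm{Z\_Accrual}_i ,
\qquad
\mathrm{OR}_j = e^{\beta_j}
\]
```

Because every predictor is a z-score, each odds ratio $\mathrm{OR}_j$ describes the multiplicative change in the odds of a High Risk classification for a one-standard-deviation increase in that variable, holding the others fixed.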
Machine Learning Classification and Model Evaluation
Three machine learning classifiers were trained using a 70/30 stratified
train-test split: Decision Tree, Support Vector Machine (SVM) with a Radial
Basis Function (RBF) kernel, and Random Forest. Their predictive performance
was then evaluated on two feature sets. One set included only the standardised
z-score predictors, whereas the other also included the engineered composite
sub-scores: the Financial Pressure Score (FPS), Governance Opportunity Score
(GOS), and Narrative Behaviour Score (NBS). Confusion matrices and overall
accuracy were used to evaluate model performance. With the engineered scores
included, Support Vector Machine and Random Forest both reached the highest
accuracy of 0.9688. The Decision Tree classifier performed noticeably worse,
reaching 0.8750 on the same feature set. These results underline the value of a
multidimensional framework for fraud detection and suggest that the proposed
integrated method is robust.
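The evaluation protocol described above can be sketched without reference to any particular classifier. The helper names, the seed, and the 100-observation label vector below are hypothetical; the sketch only shows how a stratified 70/30 split keeps the High Risk and Low Risk proportions equal in both parts and how a confusion matrix is tallied.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) with class proportions preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def confusion_counts(y_true, y_pred, positive="High Risk"):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    return {
        "TP": sum(t == positive and p == positive for t, p in pairs),
        "FP": sum(t != positive and p == positive for t, p in pairs),
        "FN": sum(t == positive and p != positive for t, p in pairs),
        "TN": sum(t != positive and p != positive for t, p in pairs),
    }


def accuracy(counts):
    """Overall accuracy: correct predictions divided by all predictions."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total


# Hypothetical balanced label vector (the study's full sample size is not stated).
labels = ["High Risk"] * 50 + ["Low Risk"] * 50
train_idx, test_idx = stratified_split(labels, test_fraction=0.3, seed=1)
```

Any classifier, including the SVM, Random Forest, and Decision Tree models named above, would be fitted on `train_idx` and scored on `test_idx`, with its predictions passed to `confusion_counts`.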
RESULTS
Descriptive Statistics of Standardized Variables
Descriptive statistics summarise the distributional properties of the
standardised financial and governance variables used to build the fraud-risk
metrics. All predictors were transformed into z-scores, so their means cluster
around zero and their standard deviations approach one, which confirms
effective normalisation across variables with varying scales. This
standardisation was vital for the later comparisons in the regression and
machine-learning analyses. No outliers judged likely to skew the model
estimates were found for profitability (Z_ROA, Z_ROE), liquidity
(Z_CurrentRatio), leverage (Z_DebtEquity), accrual intensity (Z_Accrual), or
the governance measures (Z_IndepDir, Z_AuditIndep, Z_Promoter), although some
extreme standardised values remain (for example, a minimum of -5.56 for Z_ROE
and a maximum of 6.20 for Z_CurrentRatio in Table 1). Notably, compared to Low
Risk companies, High Risk ones showed weaker governance and greater average
financial pressure signals. These trends are in line with fraud-risk theory,
which suggests that weak monitoring systems and financial stress can raise the
chance of misreporting.
Table 1: Descriptive Statistics of Standardized Variables (Z-Scores)
| Variable | Mean | Std Dev | Min | Max |
|---|---|---|---|---|
| Z_ROA | 0.00 | 0.99 | -2.64 | 4.39 |
| Z_ROE | 0.00 | 0.99 | -5.56 | 3.27 |
| Z_CurrentRatio | 0.00 | 0.99 | -1.60 | 6.20 |
| Z_DebtEquity | 0.00 | 0.99 | -1.44 | 4.09 |
| Z_Accrual | 0.00 | 0.99 | -3.28 | 4.93 |
| Z_IndepDir | 0.00 | 0.99 | -1.87 | 2.65 |
| Z_AuditIndep | 0.00 | 0.99 | -1.16 | 2.07 |
| Z_Promoter | 0.06 | 1.00 | -2.64 | 2.15 |
Correlation Analysis of Composite and Underlying Indicators
Correlation analysis was used to examine the composite scores: the Final Fraud
Risk Score (FRS), the Financial Pressure Score (FPS), the Governance
Opportunity Score (GOS), and the Narrative Behaviour Score (NBS). The goal was
to evaluate the level of multicollinearity prior to modelling and to find out
whether the variables moved in the theoretically predicted directions. A high
positive correlation between FRS and FPS (r = .851) suggests that financial
pressure has a substantial influence on risk categorisation. Both GOS
(r = .410) and NBS (r = .352) showed moderate positive correlations with FRS,
indicating that governance and narrative behaviour add explanatory value. The
weak correlations among FPS, GOS, and NBS (all below .10 in absolute value)
indicate that each captures a different aspect of fraud risk, which supports
combining them in one model.
Table 2: Correlation Matrix
| | FPS | GOS | NBS | FRS |
|---|---|---|---|---|
| FPS | 1 | -0.067 | 0.079 | 0.851 |
| GOS | -0.067 | 1 | 0.095 | 0.410 |
| NBS | 0.079 | 0.095 | 1 | 0.352 |
| FRS | 0.851 | 0.410 | 0.352 | 1 |
Logistic Regression Output for High-Risk Classification
Logistic regression provided an interpretable baseline model for examining the
association between the standardised predictors and classification into the
High Risk and Low Risk groups. To avoid perfect-separation warnings and
circularity arising from the engineered scores, the final model incorporated
only the raw z-score predictors. To ensure transparency and minimise
class-imbalance bias, the analysis was performed on a balanced subsample of 60
observations, 30 classified as high risk and 30 as low risk. The overall model
was statistically significant (p < .001). Among the predictors, accruals
(Z_Accrual) stood out as a strong positive predictor (p = .005), indicating
that higher accrual intensity substantially raises the probability of being
classified as high risk. Leverage (Z_DebtEquity) had a significant negative
coefficient (p = .032), suggesting a link between leverage anomalies and risk
classification, while the coefficients for Z_ROA and Z_ROE were not
statistically significant. The wide confidence intervals for the odds ratios
reflect the small sample, so the estimates should be interpreted with caution.
Table 3: Binary Logistic Regression Results
| Predictor | B | SE | Z | p-value | Odds Ratio | 95% CI (OR) |
|---|---|---|---|---|---|---|
| Constant | 0.081 | 0.776 | 0.104 | .917 | 1.084 | [0.237, 4.964] |
| Z_ROA | -4.370 | 2.779 | -1.562 | .118 | 0.013 | [0.000, 3.049] |
| Z_ROE | 2.104 | 3.012 | 0.699 | .485 | 8.201 | [0.022, 3001.030] |
| Z_DebtEquity | -4.557 | 2.125 | -2.144 | .032 | 0.010 | [0.000, 0.676] |
| Z_Accrual | 5.006 | 1.770 | 2.829 | .005 | 149.299 | [4.653, 4790.283] |
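The columns of Table 3 are linked by standard Wald formulas: Z = B / SE, the two-sided p-value follows from the standard normal distribution, the odds ratio is exp(B), and the 95% interval is exp(B ± 1.96 × SE). The sketch below recomputes these quantities from the reported B and SE values as a consistency check; the function name `wald_summary` is hypothetical, and small differences from the table are expected because the published coefficients are rounded.

```python
import math


def wald_summary(b, se, z_crit=1.96):
    """Derive Wald z, two-sided p, odds ratio, and 95% CI from B and SE."""
    z = b / se
    # Two-sided p-value under the standard normal: P(|Z| > |z|).
    p = math.erfc(abs(z) / math.sqrt(2))
    return {
        "z": z,
        "p": p,
        "odds_ratio": math.exp(b),
        "ci_low": math.exp(b - z_crit * se),
        "ci_high": math.exp(b + z_crit * se),
    }


# Coefficients and standard errors as reported in Table 3.
accrual = wald_summary(5.006, 1.770)
leverage = wald_summary(-4.557, 2.125)
```

The recomputed values agree with the table to within rounding, and they make the width of the intervals explicit: the upper bound for Z_Accrual is roughly a thousand times its lower bound, which reflects the 60-observation sample.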
Machine Learning Model Accuracy (Z-Only Feature Set)
Machine-learning classifiers were evaluated for predictive performance using a
stratified 70/30 train-test split. Under the Z-only feature set, Random Forest
and Support Vector Machine (SVM) with an RBF kernel both obtained an accuracy
of 0.9375 on the 32 test observations, with just two misclassifications each.
The Decision Tree model reached a more moderate accuracy of 0.8125. These
results show that non-linear classifiers can successfully identify patterns
linked to financial statement fraud risk even without the engineered scores.
Table 4: ML Accuracy: Z-Only Feature Set
| Model | Accuracy |
|---|---|
| SVM (RBF) | 0.9375 |
| Random Forest | 0.9375 |
| Decision Tree | 0.8125 |
Performance with Integrated Feature Set (Z + FPS/GOS/NBS)
All of the models improved when the engineered composite scores were included
in the feature set. SVM and Random Forest each achieved an accuracy of 0.9688,
up from 0.9375. As shown by the confusion matrices, both models misclassified
only a single observation, indicating strong discriminating power. The Decision
Tree's accuracy rose from 0.8125 to 0.8750. These results support the idea that
combining financial, governance, and narrative aspects makes models more robust
and helps to better distinguish between observations with high risk and those
with low risk.
Table 5: ML Accuracy: Integrated (Z + FPS/GOS/NBS) Feature Set
| Model | Accuracy |
|---|---|
| SVM (RBF) | 0.9688 |
| Random Forest | 0.9688 |
| Decision Tree | 0.8750 |
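Because the test set is small, it helps to translate the accuracies in Tables 4 and 5 back into counts. The sketch below does this under the assumption that both feature sets were scored on the same 32 test observations mentioned for the Z-only models; the dictionary layout and function name are illustrative.

```python
N_TEST = 32  # test observations reported for the stratified 70/30 split

# Accuracies as reported in Tables 4 and 5.
reported = {
    "z_only": {"SVM (RBF)": 0.9375, "Random Forest": 0.9375, "Decision Tree": 0.8125},
    "integrated": {"SVM (RBF)": 0.9688, "Random Forest": 0.9688, "Decision Tree": 0.8750},
}


def implied_errors(accuracy, n=N_TEST):
    """Number of misclassified observations implied by an accuracy on n cases."""
    return n - round(accuracy * n)


errors = {
    feature_set: {model: implied_errors(acc) for model, acc in accs.items()}
    for feature_set, accs in reported.items()
}

# Change in misclassifications when the engineered scores are added.
reduction = {
    model: errors["z_only"][model] - errors["integrated"][model]
    for model in reported["z_only"]
}
```

Read this way, the gain for SVM and Random Forest corresponds to one additional correctly classified observation each, and to two for the Decision Tree, so the improvement is consistent in direction but rests on very few cases.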
CONCLUSION
In comparison to conventional single-indicator methods, this study's results
suggest that an integrated, multidimensional strategy provides a more solid and
trustworthy way to identify financial statement fraud. By merging financial
pressure indicators with governance opportunity metrics and narrative behaviour
signals, the proposed methodology captures the complexity of fraud risk better
than models that depend exclusively on financial or structural data. The
logistic regression findings show that accrual activity and leverage anomalies
significantly affect fraud-risk categorisation, while the descriptive
statistics and correlation analysis indicate that each dimension adds distinct
explanatory power. The high performance of the machine-learning models,
specifically SVM and Random Forest, supports the combined use of engineered
risk scores and standardised predictors. Predictive accuracy increased with the
inclusion of the composite measures, demonstrating the value of integrating
several fraud-risk indicators into a single analytical framework. The
consistently strong performance across models is in line with the view that
fraud risk is a multi-layered phenomenon shaped by pressures, opportunities,
and behavioural intent. In sum, the research lends credence to the idea that
businesses would be wise to implement data-driven, integrated strategies for
fraud detection. The results also have practical implications for auditors,
regulators, and financial analysts who are looking for early warning systems
that can spot high-risk firms before fraudulent activities escalate. Although
the results are encouraging, future studies using larger samples and validated
fraud cases should further test and improve the model's predictive power and
its applicability across businesses and regulatory environments.
References