Predictive Analytics and Machine Learning-Based Models for E-Commerce Fraud Prevention

Authors

  • Sachin Bagoria, Research Scholar, SKD University, Hanumangarh, Rajasthan
  • Dr. Kavita, Professor, SKD University, Hanumangarh, Rajasthan

DOI:

https://doi.org/10.29070/sjdaxv21

Keywords:

E-commerce Fraud Detection, Predictive Analytics, Machine Learning, XGBoost, Cybersecurity, Fraud Prevention, Classification Models, Data Mining, E-commerce Security, Artificial Intelligence

Abstract

E-commerce has grown rapidly, and so have online fraud and related crime. Online marketplaces are complex, and fraudsters keep adopting new techniques, which makes traditional fraud detection methods ineffective. This study designs a machine learning and predictive analytics system for e-commerce fraud prevention. The system draws on URL structure, HTML content, technology profiles, SSL certificates, HTTP headers, and external reputation indicators. The models were built and evaluated on 2,031 e-commerce sites, of which 739 were fraudulent and 1,292 legitimate. Fifty features were extracted and evaluated with seven classifiers: XGBoost, Random Forest, Support Vector Machine, Logistic Regression, k-Nearest Neighbour, AdaBoost, and Naïve Bayes. In the experiments, XGBoost trained on all features achieved an F1-score of 0.9688 and an accuracy of 97.78%, compared with 0.9653 and 97.49% for the baseline. A comparative analysis indicated that the proposed system outperforms existing fraud detection systems. The results suggest that machine learning-based predictive analytics can be a scalable and effective tool for protecting online transactions and detecting fraudulent e-commerce sites.
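As a rough illustration of two ideas in the abstract, the sketch below derives a handful of lexical features from a URL and computes accuracy and F1-score from confusion-matrix counts. It is a minimal sketch under stated assumptions, not the study's pipeline: the feature names, the `url_features` and `accuracy_and_f1` helpers, and the example URL are all hypothetical, and the abstract does not say which of its 50 features come from the URL.

```python
from urllib.parse import urlparse
import ipaddress


def url_features(url):
    """Return a few illustrative URL-structure features.

    Assumption: this feature set is hypothetical and is not the
    feature list used in the study.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "url_length": len(url),
        "uses_https": parsed.scheme == "https",
        "host_dot_count": host.count("."),
        "host_hyphen_count": host.count("-"),
        "host_digit_count": sum(ch.isdigit() for ch in host),
        "host_is_ip": host_is_ip,
        # Number of non-empty path segments, e.g. "/sale/items" has 2.
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
    }


def accuracy_and_f1(tp, fp, fn, tn):
    """Compute accuracy and F1-score, with "fraudulent" as the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Hypothetical URL, used only to show the shape of the output.
    print(url_features("https://cheap-shoes-4u.example.com/sale/items?id=3"))
    # Made-up counts for illustration; these are not results from the study.
    print(accuracy_and_f1(tp=8, fp=2, fn=2, tn=8))
```

Because the dataset is imbalanced (739 fraudulent versus 1,292 legitimate sites), F1-score is a useful complement to accuracy: it ignores true negatives and so is not inflated by correct predictions on the majority class.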

Published

2026-06-01

How to Cite

[1]
“Predictive Analytics and Machine Learning-Based Models for E-Commerce Fraud Prevention”, JASRAE, vol. 23, no. 3, pp. 267–286, June 2026, doi: 10.29070/sjdaxv21.