Journal Article (Open Access)

When to Use Standardization and Normalization: Empirical Evidence From Machine Learning Models and XAI

Authors

Khaled Mahmud Sujon, Rohayanti Hassan, Zeba Tusnia Towshi, Manal A. Othman, …

Author Affiliations

University of Technology Malaysia, Independent University, Princess Nourah bint Abdulrahman University, Yeungnam University

Published In: IEEE Access

Year: 2024

Citations: 94

DOI: 10.1109/access.2024.3462434

Abstract

Optimizing machine learning (ML) model performance relies heavily on appropriate data preprocessing techniques. Despite the widespread use of standardization and normalization, empirical comparisons across different models, dataset sizes, and domains remain sparse. This study bridges this gap by evaluating five machine learning algorithms, namely Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Adaptive Boosting (AdaBoost), on datasets of varying sizes from the business, health, and agriculture domains. Each model was assessed under three conditions: without scaling, with standardized data, and with normalized data. The comparative analysis reveals that standardization consistently improves the performance of linear models like SVM and LR on large and medium datasets, whereas normalization improves the performance of linear models on small datasets.…
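The abstract does not define the two scaling methods it compares. Assuming the conventional definitions (z-score standardization and min-max normalization to the range [0, 1]), a minimal sketch of both transforms might look like the following. The function names `standardize` and `normalize` and the sample feature values are hypothetical and are not taken from the article.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score scaling (assumed definition): (x - mean) / population std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def normalize(values):
    """Min-max scaling (assumed definition): (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has zero range; map every value to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical single feature column, for illustration only.
feature = [2.0, 4.0, 6.0, 8.0]
z = standardize(feature)   # zero mean, unit standard deviation
m = normalize(feature)     # bounded between 0 and 1
```

In an experiment of the kind described, the scaling parameters would normally be computed on the training split and reused on the test split. The abstract does not say whether the authors did this.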


Fields & Keywords

Social Sciences; Business, Management and Accounting; Management Information Systems; Big Data and Business Intelligence; Machine Learning and Data Classification; Artificial intelligence; Machine learning; Data science; Operating system; Anthropology