Authors: Fayez, Mustafa Adil; Kurnaz, Sefer
Dates: 2024-12-05; 2024-12-05; 2024
Citation: Fayez, M. A., Kurnaz, S. (2024). Advanced hybrid and preprocessing models for diagnosis challenges in data classification. Journal of Advances in Information Technology, 15(11), 1264-1272.
DOI: 10.12720/jait.15.11.1264-1272
URI: https://hdl.handle.net/20.500.12939/5075
Abstract: Machine Learning (ML) is often viewed as a cutting-edge technology best suited to qualified specialists, which limits its accessibility to other physicians and scientists in the medical profession. In this work, we present a new technique for medical applications, particularly cardiac diagnostics. We propose a novel advanced hybrid optimization model with two essential parts. First, we apply a high-performance hybrid resampling technique for feature engineering and pre-processing. This approach, which combines Synthetic Minority Oversampling Technique Edited Nearest Neighbors (SMOTEENN) with Neighborhood Cleaning Rules (NCL), addresses class imbalance in the data. We then developed a hybrid optimization model that incorporates hyper-parameter optimization, advanced Application Programming Interface (API) functions, and a super-learner ensemble model to improve diagnostic accuracy when datasets are imbalanced. Furthermore, we developed high-performance prediction models using advanced Support Vector Machines (SVMs). On re-sampled Cardiovascular Disease (CVD) data, the advanced hybrid optimization model attained an accuracy of 98%. By comparison, an advanced SVM model obtained 96% accuracy, and an advanced deep learning model obtained 95.5% accuracy. Our hybrid optimization machine learning models may significantly improve physicians’ interpretation of ML results.
This strategy could make it easier to apply AI methods at scale in clinical practice, which would eventually improve patient outcomes and diagnostic accuracy.
Language: en
Access rights: info:eu-repo/semantics/openAccess
Keywords: Application Programming Interface (API) function; Cardiovascular Disease (CVD); Hybrid advanced models; Neighborhood Cleaning Rules (NCL); Optimization; Synthetic Minority Oversampling Technique Edited Nearest Neighbors (SMOTEENN)
Title: Advanced hybrid and preprocessing models for diagnosis challenges in data classification
Type: Article
Volume: 15
Issue: 11
Start page: 1264
End page: 1272
Scopus ID: 2-s2.0-85210263799 (Q2)
Web of Science ID: WOS:001374446300002 (N/A)