Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification

Jabir, Brahim and Díez, Isabel De la Torre and Bautista Thompson, Ernesto and Ramírez-Vargas, Debora L. and Kuc Castilla, Ángel Gabriel (2023) Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification. IEEE Access. p. 1. ISSN 2169-3536

Full text not available from this repository.

Abstract

Classification is a widely used data mining technique, applied in fields such as sentiment analysis, fraud detection, and fault diagnosis. Multiclass classification, which involves more than two classes, is more complex than binary classification. There are two main ways to approach it: extend a binary classifier into a multiclass classifier through various strategies, or divide the multiclass problem into multiple binary problems (binarization). Two popular binarization approaches are One vs One (OvO) and One vs All (OvA). Aggregating the outputs of the binary classifiers becomes simpler as the number of classifiers decreases. However, using fewer classifiers causes an imbalance between the numbers of positive and negative samples, which degrades the performance of each binary classifier. In this article, we contribute to ensemble learning and multiclass classification by proposing a new method, Ensemble Partition Sampling (EPS), which operates within the OvA framework. Its primary goal is to tackle data imbalance by incorporating ensemble learning and preprocessing techniques into each binary dataset. In the study, EPS was the most effective method for imbalanced and multiclass imbalanced classification, outperforming OvA, SMOTE, k-means-SMOTE, Bagging-RB, DES-MI, OvO-EASY, and OvO-SMB. The study used CART, Random Forest, and SVM as classifiers, and EPS consistently outperformed all other algorithms. These findings suggest that EPS is a highly effective method for improving classification performance on imbalanced and multiclass imbalanced datasets.
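The full text is not available here, so the following is only a rough sketch of the general idea the abstract describes (OvA binarization, with an ensemble trained on balanced partitions of each binary dataset); it is not the authors' EPS algorithm. Every name in it is hypothetical, and the nearest-centroid base learner is an assumption chosen to keep the example dependency-free, standing in for classifiers such as CART, Random Forest, or SVM.

```python
import math
import random
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def partition(items, n_parts, rng):
    """Shuffle items and deal them round-robin into n_parts chunks."""
    shuffled = items[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]


def fit_ova_partition_ensemble(X, y, seed=0):
    """For each class, build one OvA binary problem and split its negatives.

    Assumes at least two classes. Each ensemble member sees all positives
    plus one negative chunk of roughly the same size, so no member is
    trained on a heavily imbalanced binary dataset.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    model = {}
    for cls, pos in by_class.items():
        neg = [xi for xi, yi in zip(X, y) if yi != cls]
        n_parts = max(1, round(len(neg) / len(pos)))
        # Hypothetical base learner: a (positive, negative) centroid pair.
        model[cls] = [(centroid(pos), centroid(chunk))
                      for chunk in partition(neg, n_parts, rng)]
    return model


def score(members, x):
    """Average member margin: larger means x looks more like the class."""
    margins = [math.dist(x, neg_c) - math.dist(x, pos_c)
               for pos_c, neg_c in members]
    return sum(margins) / len(margins)


def predict(model, x):
    """OvA aggregation: pick the class whose ensemble scores x highest."""
    return max(model, key=lambda cls: score(model[cls], x))
```

With a minority class of 2 samples against 18 negatives, this sketch trains 9 members for that class, each on 2 positives and 2 negatives, rather than one classifier on a 1:9 split.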

Item Type: Article
Uncontrolled Keywords: Ensemble Partition Sampling (EPS); One vs One (OvO); One vs All (OvA); Multi-Class Classification; Imbalanced learning; multiclass imbalanced classification
Subjects: Subjects > Engineering
Divisions: Europe University of Atlantic > Research > Scientific Production
Fundación Universitaria Internacional de Colombia > Research > Articles and books
Ibero-american International University > Research > Scientific Production
Universidad Internacional do Cuanza > Research > Scientific Production
Date Deposited: 09 May 2023 23:30
Last Modified: 09 May 2023 23:30
URI: https://repositorio.unincol.edu.co/id/eprint/7028

DiabSense: early diagnosis of non-insulin-dependent diabetes mellitus using smartphone-based human activity recognition and diabetic retinopathy analysis with Graph Neural Network

Non-Insulin-Dependent Diabetes Mellitus (NIDDM) is a chronic health condition caused by high blood sugar levels, and if not treated early, it can lead to serious complications such as blindness. Human Activity Recognition (HAR) offers potential for early NIDDM diagnosis, which is emerging as a key application for HAR technology. This research introduces DiabSense, a state-of-the-art smartphone-dependent system for early staging of NIDDM. DiabSense combines HAR and Diabetic Retinopathy (DR) analysis, leveraging two different Graph Neural Networks (GNNs). HAR uses a comprehensive array of 23 human activities resembling diabetes symptoms, and DR is a prevalent complication of NIDDM. A Graph Attention Network (GAT) for HAR achieved 98.32% accuracy on sensor data, while a Graph Convolutional Network (GCN) on the Aptos 2019 dataset scored 84.48%, surpassing other state-of-the-art models. The trained GCN analyzed retinal images of four experimental human subjects for DR report generation, and the GAT generated their average duration of daily activities over 30 days. The daily activities in non-diabetic periods of diabetic patients were measured and compared with the daily activities of the experimental subjects, which helped generate risk factors. Fusing risk factors with DR conditions enabled early diagnosis recommendations for the experimental subjects despite the absence of any apparent symptoms. The DiabSense system outcome was compared with clinical diagnosis reports for the experimental subjects using the A1C test. The test results confirmed that the system accurately assessed the early diagnosis requirements of the experimental subjects. Overall, DiabSense exhibits significant potential for ensuring early NIDDM treatment, improving millions of lives worldwide.

Scientific Production

Md Nuho Ul Alam, Ibrahim Hasnine, Erfanul Hoque Bahadur, Abdul Kadar Muhammad Masum, Mercedes Briones Urbano (mercedes.briones@uneatlantico.es), Manuel Masías Vergara (manuel.masias@uneatlantico.es), Jia Uddin, Imran Ashraf, Md. Abdus Samad

Clinical phenotypes and short-term outcomes based on prehospital point-of-care testing and on-scene vital signs

Emergency medical services (EMSs) face critical situations that require patient risk classification based on analytical and vital signs. We aimed to establish clustering-derived phenotypes based on prehospital analytical and vital signs that allow risk stratification. This was a prospective, multicenter, EMS-delivered, ambulance-based cohort study considering six advanced life support units, 38 basic life support units, and four tertiary hospitals in Spain. Adults with unselected acute diseases managed by the EMS and evacuated with discharge priority to emergency departments were considered between January 1, 2020, and June 30, 2023. Prehospital point-of-care testing and on-scene vital signs were used as inputs to an unsupervised machine learning method (clustering) to determine the phenotypes. The phenotypes were then compared with the primary outcome, cumulative all-cause mortality at 2, 7, and 30 days. A total of 7909 patients were included. The median (IQR) age was 64 (51–80) years, 41% were women, and 26% were living in rural areas. Three clusters were identified: alpha 16.2% (1281 patients), beta 28.8% (2279), and gamma 55% (4349). The mortality rates for alpha, beta, and gamma were 18.6%, 4.1%, and 0.8%, respectively, at 2 days; 24.7%, 6.2%, and 1.7% at 7 days; and 33%, 10.2%, and 3.2% at 30 days. Based on standard vital signs and blood test biomarkers in the prehospital scenario, three clusters were identified: alpha (high-risk), beta (medium-risk), and gamma (low-risk). This permits the EMS system to quickly identify patients who are potentially compromised and to proactively implement the necessary interventions.

Scientific Production

Raúl López-Izquierdo, Carlos del Pozo Vegas, Ancor Sanz-García, Agustín Mayo Íscar, Miguel A. Castro Villamor, Eduardo René Silva Alvarado (eduardo.silva@funiber.org), Santos Gracia Villar (santos.gracia@uneatlantico.es), Luis Alonso Dzul López (luis.dzul@uneatlantico.es), Silvia Aparicio Obregón (silvia.aparicio@uneatlantico.es), Rubén Calderón Iglesias (ruben.calderon@uneatlantico.es), Joan B. Soriano, Francisco Martín-Rodríguez

Novel model to authenticate role-based medical users for blockchain-based IoMT devices

The IoT (Internet of Things) has played a promising role in e-healthcare applications during the last decade. Medical sensors record a variety of data and transmit them over the IoT network to facilitate remote patient monitoring. When a patient visits a hospital, they may need to connect or disconnect medical devices from the medical healthcare system frequently. Also, multiple entities (e.g., doctors and medical staff) need access to patient data, and each requires a distinct set of that data. Because of the dynamic nature of medical devices, medical users require frequent access to data, which raises complex security concerns. Granting access to a whole set of data creates privacy issues. Each of these medical users must also be granted access rights to a specific set of medical data, which is quite a tedious task. To provide role-based access to medical users, this study proposes a blockchain-based framework for authenticating multiple entities based on the trust domain to reduce the administrative burden. The framework is further validated by simulation on the Infura blockchain using Solidity and Python. The results demonstrate that role-based authorization and multi-entity authentication have been implemented, and that the owner of medical data can control access rights at any time and grant medical users easy access to a set of data in a healthcare system. The system has minimal latency compared to existing blockchain systems that lack multi-entity authentication and role-based authorization.

Scientific Production

Shadab Alam, Muhammad Shehzad Aslam, Ayesha Altaf, Faiza Iqbal, Natasha Nigar, Juan Castanedo Galán (juan.castanedo@uneatlantico.es), Daniel Gavilanes Aray (daniel.gavilanes@uneatlantico.es), Isabel de la Torre Díez, Imran Ashraf

Ultra-Wide Band Radar Empowered Driver Drowsiness Detection with Convolutional Spatial Feature Engineering and Artificial Intelligence

Driving while drowsy poses significant risks, including reduced cognitive function and the potential for accidents, which can lead to severe consequences such as trauma, economic losses, injuries, or death. The use of artificial intelligence can enable effective detection of driver drowsiness, helping to prevent accidents and enhance driver performance. This research aims to address the crucial need for real-time and accurate drowsiness detection to mitigate the impact of fatigue-related accidents. Ultra-wideband radar data collected over five minutes were segmented into one-minute chunks and transformed into grayscale images. Spatial features were extracted from the images using a two-dimensional Convolutional Neural Network. These features were then used to train and test multiple machine learning classifiers. The ensemble classifier RF-XGB-SVM, which combines Random Forest, XGBoost, and Support Vector Machine using a hard voting criterion, achieved an accuracy of 96.6%. Additionally, the proposed approach was validated with a k-fold score of 97% and a standard deviation of 0.018. The dataset was then augmented using Generative Adversarial Networks, resulting in improved accuracies for all models. Among them, the RF-XGB-SVM model outperformed the rest with an accuracy score of 99.58%.
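To illustrate the hard voting criterion mentioned above, here is a minimal sketch in plain Python. It only shows how majority voting combines per-model labels; the function name, the "drowsy"/"awake" labels, and the example predictions are hypothetical and not taken from the paper.

```python
from collections import Counter


def hard_vote(per_model_predictions):
    """Majority vote across models for each sample.

    per_model_predictions: list of equal-length label lists, one per model.
    On a tie, the label voted first among the models wins.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*per_model_predictions)]


# Hypothetical label predictions from three models on three samples.
rf_pred = ["drowsy", "awake", "awake"]
xgb_pred = ["drowsy", "drowsy", "awake"]
svm_pred = ["awake", "drowsy", "awake"]

print(hard_vote([rf_pred, xgb_pred, svm_pred]))
# prints ['drowsy', 'drowsy', 'awake']
```

Hard voting uses only the predicted labels, so the three models do not need calibrated or comparable probability outputs.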

Scientific Production

Hafeez Ur Rehman Siddiqui, Ambreen Akmal, Muhammad Iqbal, Adil Ali Saleem, Muhammad Amjad Raza, Kainat Zafar, Aqsa Zaib, Sandra Dudley, Jon Arambarri (jon.arambarri@uneatlantico.es), Ángel Gabriel Kuc Castilla, Furqan Rustam

A Comparison of the Clinical Characteristics of Short-, Mid-, and Long-Term Mortality in Patients Attended by the Emergency Medical Services: An Observational Study

Aim: The development of predictive models for patients treated by emergency medical services (EMS) is on the rise in the emergency field. However, how these models evolve over time has not been studied. The objective of the present work is to compare the characteristics of patients who present mortality in the short, medium and long term, and to derive and validate a predictive model for each mortality time. Methods: A prospective multicenter study was conducted, which included adult patients with unselected acute illness who were treated by EMS. The primary outcome was noncumulative mortality from all causes by time windows including 30-day mortality, 31- to 180-day mortality, and 181- to 365-day mortality. Prehospital predictors included demographic variables, standard vital signs, prehospital laboratory tests, and comorbidities. Results: A total of 4830 patients were enrolled. The noncumulative mortalities at 30, 180, and 365 days were 10.8%, 6.6%, and 3.5%, respectively. The best predictive value was shown for 30-day mortality (AUC = 0.930; 95% CI: 0.919–0.940), followed by 180-day (AUC = 0.852; 95% CI: 0.832–0.871) and 365-day (AUC = 0.806; 95% CI: 0.778–0.833) mortality. Discussion: Rapid characterization of patients at risk of short-, medium-, or long-term mortality could help EMS to improve the treatment of patients suffering from acute illnesses.
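As background for the AUC values reported above: the AUC can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The sketch below computes it by direct pairwise comparison; the function name and the toy scores are hypothetical and unrelated to the study data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risks; labels: 1 for the event, 0 otherwise.
    Each (event, non-event) pair counts 1 if the event case scores
    higher, 0.5 on a tie, and 0 otherwise.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # prints 0.75
```

The pairwise form is quadratic in the number of patients, which is fine for illustration; rank-based formulas give the same value more efficiently.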

Scientific Production

Rodrigo Enriquez de Salamanca Gambara, Ancor Sanz-García, Carlos del Pozo Vegas, Raúl López-Izquierdo, Irene Sánchez Soberón, Juan F. Delgado Benito, Raquel Martínez Díaz (raquel.martinez@uneatlantico.es), Cristina Mazas Pérez-Oleaga (cristina.mazas@uneatlantico.es), Nohora Milena Martínez López (nohora.martinez@uneatlantico.es), Irma Dominguez Azpíroz (irma.dominguez@unini.edu.mx), Francisco Martín-Rodríguez