Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification

Article. Access: Open. Language: English.
Jabir, Brahim; de la Torre Díez, Isabel; Bautista Thompson, Ernesto; Ramírez-Vargas, Debora L. and Kuc Castilla, Ángel Gabriel (2023) Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification. IEEE Access. p. 1. ISSN 2169-3536

Full text not available.

Official URL: http://doi.org/10.1109/ACCESS.2023.3273925

Abstract

Classification is a commonly used technique in data mining, applied in fields such as sentiment analysis, fraud detection, and fault diagnosis. Multiclass classification, which involves more than two classes, is more complex than binary classification. There are two main ways to approach it: extend a binary classifier into a multiclass classifier through various strategies, or divide the multiclass problem into multiple binary problems (binarization). Two popular binarization approaches are One vs One (OvO) and One vs All (OvA). Aggregating the outputs of the binary classifiers becomes simpler as the number of classifiers decreases. However, using fewer classifiers causes an imbalance between the numbers of positive and negative samples, which degrades the performance of each binary classifier. This article contributes to ensemble learning and multi-class classification by proposing Ensemble Partition Sampling (EPS), a new approach to multiclass classification within the OvA framework. Its primary goal is to tackle data imbalance by incorporating ensemble learning and preprocessing techniques into each binary dataset. In the study, EPS was the most effective method for imbalanced and multiclass imbalanced classification, outperforming OvA, SMOTE, k-means-SMOTE, Bagging-RB, DES-MI, OvO-EASY, and OvO-SMB. With CART, Random Forest, and SVM as classifiers, EPS consistently outperformed all other algorithms. The findings suggest that EPS is highly effective for improving classification performance on imbalanced and multiclass imbalanced datasets.
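The trade-off described in the abstract can be illustrated with a small sketch. It is not the authors' EPS implementation, since the abstract does not specify the procedure. It only shows, on made-up labels, how OvA needs fewer classifiers than OvO, how each OvA binary problem becomes imbalanced, and one plausible way (an assumption) to partition the negatives into balanced subsets on which ensemble members could be trained. The function names `ova_binarize`, `imbalance_ratio`, and `partition_negatives` are hypothetical.

```python
from collections import Counter
import random


def ova_binarize(labels, positive_class):
    """Relabel a multiclass label list as 1 (positive class) vs 0 (all others)."""
    return [1 if y == positive_class else 0 for y in labels]


def imbalance_ratio(binary_labels):
    """Ratio of negative to positive samples in one OvA binary problem."""
    counts = Counter(binary_labels)
    return counts[0] / counts[1]


def partition_negatives(indices_neg, n_pos, seed=0):
    """Shuffle negative indices and split them into chunks of about n_pos each.

    Assumption: this is only one plausible reading of "partition sampling";
    the abstract does not describe the actual procedure.
    """
    rng = random.Random(seed)
    shuffled = indices_neg[:]
    rng.shuffle(shuffled)
    return [shuffled[i:i + n_pos] for i in range(0, len(shuffled), n_pos)]


# Toy, made-up labels: 4 classes with 10 samples each.
labels = [c for c in "ABCD" for _ in range(10)]
k = len(set(labels))

n_ova = k                   # OvA: one classifier per class
n_ovo = k * (k - 1) // 2    # OvO: one classifier per pair of classes

# Binarize for class "A": 10 positives against 30 negatives.
binary = ova_binarize(labels, "A")
ratio = imbalance_ratio(binary)

# Each chunk of negatives, paired with all positives, gives a balanced subset.
neg_idx = [i for i, b in enumerate(binary) if b == 0]
chunks = partition_negatives(neg_idx, n_pos=binary.count(1))
chunk_sizes = [len(c) for c in chunks]

print(n_ova, n_ovo, ratio, chunk_sizes)  # prints: 4 6 3.0 [10, 10, 10]
```

In this toy case a balanced four-class problem already yields a 3:1 negative-to-positive ratio under OvA, and the ratio grows with the number of classes.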

Document Type:	Article
Keywords:	Ensemble Partition Sampling (EPS); One vs One (OvO); One vs All (OvA); Multi-Class Classification; Imbalanced learning; multiclass imbalanced classification
Subject classification:	Subjects > Engineering
Divisions:	Universidad Europea del Atlántico > Research > Scientific Production; Fundación Universitaria Internacional de Colombia > Research > Articles and books; Universidad Internacional Iberoamericana México > Research > Scientific Production; Universidad Internacional Iberoamericana Puerto Rico > Research > Scientific Production; Universidad Internacional do Cuanza > Research > Scientific Production
Deposited:	09 May 2023 23:30
Last Modified:	21 Oct 2024 23:30
URI:	https://repositorio.unincol.edu.co/id/eprint/7028


Ultra Wideband radar-based gait analysis for gender classification using artificial intelligence

Gender classification plays a vital role in various applications, particularly in security and healthcare. While several biometric methods such as facial recognition, voice analysis, activity monitoring, and gait recognition are commonly used, their accuracy and reliability often suffer due to challenges like body part occlusion, high computational costs, and recognition errors. This study investigates gender classification using gait data captured by Ultra-Wideband radar, offering a non-intrusive and occlusion-resilient alternative to traditional biometric methods. A dataset comprising 163 participants was collected, and the radar signals underwent preprocessing, including clutter suppression and peak detection, to isolate meaningful gait cycles. Spectral features extracted from these cycles were transformed using a novel integration of Feedforward Artificial Neural Networks and Random Forests, enhancing discriminative power. Among the models evaluated, the Random Forest classifier demonstrated superior performance, achieving 94.68% accuracy and a cross-validation score of 0.93. The study highlights the effectiveness of Ultra-Wideband radar and the proposed transformation framework in advancing robust gender classification.

Scientific Production

Adil Ali Saleem, Hafeez Ur Rehman Siddiqui, Muhammad Amjad Raza, Sandra Dudley, Julio César Martínez Espinosa (ulio.martinez@unini.edu.mx), Luis Alonso Dzul López (luis.dzul@uneatlantico.es), Isabel de la Torre Díez

Association between blood cortisol levels and numerical rating scale in prehospital pain assessment

Background: Currently, no correlation between cortisol levels and pain has been established in the prehospital setting. The aim of this work was to determine the ability of prehospital cortisol levels to correlate with pain. Cortisol levels were compared with those of the numerical rating scale (NRS).

Methods: This is a prospective observational study of adult patients with acute disease managed by Emergency Medical Services (EMS) and transferred to the emergency department of two tertiary care hospitals. Epidemiological variables, vital signs, and prehospital blood analysis data were collected. A total of 1516 patients were included; the median age was 67 years (IQR: 51–79; range: 18–103) and 42.7% were female. The primary outcome was pain evaluation by NRS, which was categorized as pain-free (0 points), mild (1–3), moderate (4–6), or severe (≥7). Analysis of variance, correlation, and classification capacity, expressed as the area under the receiver operating characteristic curve (AUC), were used to prospectively evaluate the association of cortisol with NRS.

Results: The median NRS and cortisol level were 1 point (IQR: 0–4) and 282 nmol/L (IQR: 143–433). There were 584 pain-free patients (38.5%), 525 with mild pain (34.6%), 244 with moderate pain (16.1%), and 163 with severe pain (10.8%). Cortisol levels differed across NRS categories (p < 0.001). The correlation coefficient between the cortisol level and NRS was 0.87 (p < 0.001). The AUC of cortisol to classify patients into each NRS category was 0.882 (95% CI: 0.853–0.910), 0.496 (95% CI: 0.446–0.545), 0.837 (95% CI: 0.803–0.872), and 0.981 (95% CI: 0.970–0.991) for the pain-free, mild, moderate, and severe categories, respectively.

Conclusions: Cortisol levels provide a pain evaluation similar to that of the NRS, with high correlation for the NRS pain categories except mild pain. Therefore, cortisol evaluation via the EMS could provide information regarding pain status.
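As a rough illustration of the per-category AUC reported above, the sketch below computes a one-vs-rest AUC through its rank interpretation: the probability that a randomly chosen patient in the target category has a higher score than a randomly chosen patient outside it, with ties counted as one half. The values are made up for the example and are not study data. The function name `auc_one_vs_rest` is hypothetical, and the abstract does not state how the study computed its AUCs.

```python
def auc_one_vs_rest(scores, categories, target):
    """AUC of `scores` for separating `target` from all other categories.

    Uses the pairwise-comparison (Mann-Whitney) form of the AUC:
    a win counts 1, a tie counts 0.5, a loss counts 0.
    """
    pos = [s for s, c in zip(scores, categories) if c == target]
    neg = [s for s, c in zip(scores, categories) if c != target]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up cortisol values (nmol/L) and NRS categories, for illustration only.
cortisol = [100, 150, 300, 320, 500, 650]
nrs_cat = ["pain-free", "pain-free", "mild", "moderate", "severe", "severe"]

auc_severe = auc_one_vs_rest(cortisol, nrs_cat, "severe")
auc_mild = auc_one_vs_rest(cortisol, nrs_cat, "mild")

print(auc_severe, auc_mild)  # prints: 1.0 0.4
```

An AUC near 0.5, like the 0.496 reported for the mild category, means the score separates that category from the rest no better than chance.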

Scientific Production

Raúl López-Izquierdo, Elisa A. Ingelmo-Astorga, Carlos del Pozo Vegas, Santos Gracia Villar (santos.gracia@uneatlantico.es), Luis Alonso Dzul López (luis.dzul@uneatlantico.es), Silvia Aparicio Obregón (silvia.aparicio@uneatlantico.es), Rubén Calderón Iglesias (ruben.calderon@uneatlantico.es), Ancor Sanz-García, Francisco Martín-Rodríguez

Detecting hate in diversity: a survey of multilingual code-mixed image and video analysis

The proliferation of damaging content on social media in today’s digital environment has increased the need for efficient hate speech identification systems. A thorough examination of hate speech detection methods in a variety of settings, such as code-mixed, multilingual, visual, audio, and textual scenarios, is presented in this paper. Unlike previous research focusing on single modalities, our study thoroughly examines hate speech identification across multiple forms. We classify the numerous types of hate speech, showing how it appears on different platforms and emphasizing the unique difficulties in multi-modal and multilingual settings. We fill research gaps by assessing a variety of methods, including deep learning, machine learning, and natural language processing, especially for complicated data like code-mixed and cross-lingual text. Additionally, we offer key technique comparisons, suggesting future research avenues that prioritize multi-modal analysis and ethical data handling, while acknowledging its benefits and drawbacks. This study attempts to promote scholarly research and real-world applications on social media platforms by acting as an essential resource for improving hate speech identification across various data sources.

Scientific Production

Hafiz Muhammad Raza Ur Rehman, Mahpara Saleem, Muhammad Zeeshan Jhandir, Eduardo René Silva Alvarado (eduardo.silva@funiber.org), Helena Garay (helena.garay@uneatlantico.es), Imran Ashraf

Ensemble stacked model for enhanced identification of sentiments from IMDB reviews

The emergence of social media platforms has led to the sharing of ideas, thoughts, events, and reviews. The shared views and comments contain people's sentiments, and the analysis of these sentiments has emerged as one of the most popular fields of study. Sentiment analysis in the Urdu language is an important research problem, as in other languages, but it has not been investigated thoroughly. On social media platforms like X (Twitter), millions of native Urdu speakers use the Urdu script, which makes sentiment analysis in the Urdu language important. In this regard, an ensemble model, RRLS, is proposed that stacks random forest, recurrent neural network, logistic regression (LR), and support vector machine (SVM). The Internet Movie Database (IMDB) movie reviews and Urdu tweets are examined in this study using Urdu sentiment analysis. The Urdu hack library was used to preprocess the Urdu data, with operations such as normalizing individual letters, merging them, and handling spaces around punctuation. The normalization module fixes the problem of accurately encoding Urdu characters and replaces Arabic letters with their Urdu equivalents. Several models are adopted in this study for extensive evaluation of their accuracy for Urdu sentiment analysis. The results are promising: among machine learning models, the SVM and LR attained an accuracy of 87%, according to performance criteria such as F-measure, accuracy, recall, and precision. The accuracy of the long short-term memory (LSTM) and bidirectional LSTM (BiLSTM) was 84%. The proposed ensemble RRLS model performs better than other learning algorithms and achieves a 90% accuracy rate, outperforming current methods. The use of the synthetic minority oversampling technique (SMOTE) is observed to improve the performance and lead to 92.77% accuracy.

Scientific Production

Komal Azim, Alishba Tahir, Mobeen Shahroz, Hanen Karamti, Annia A. Vázquez (annia.almeyda@uneatlantico.es), Angel Olider Rojas Vistorte (angel.rojas@uneatlantico.es), Imran Ashraf
