Real Word Spelling Error Detection and Correction for Urdu Language

Open access. Language: English.

Aziz, Romila; Anwar, Muhammad Waqas; Jamal, Muhammad Hasan; Bajwa, Usama Ijaz; Kuc Castilla, Ángel Gabriel; Uc-Rios, Carlos (carlos.uc@unini.edu.mx); Bautista Thompson, Ernesto (ernesto.bautista@unini.edu.mx) and Ashraf, Imran (2023) Real Word Spelling Error Detection and Correction for Urdu Language. IEEE Access. p. 1. ISSN 2169-3536

Text
Real_Word_Spelling_Error_Detection_and_Correction_for_Urdu_Language.pdf
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Spelling errors generally fall into two types: non-word and real-word errors. Non-word errors are misspelled words that do not exist in the lexicon, while real-word errors are misspelled words that exist in the lexicon but are used out of context in a sentence. The lexicon-based lookup approach is widely used for non-word errors, but it cannot handle real-word errors because they require contextual information. In contrast to English, real-word error detection and correction for low-resourced languages like Urdu is an unexplored area. This paper presents a real-word spelling error detection and correction approach for the Urdu language. We develop an extensive lexicon of 593,738 words and use this lexicon to develop a dataset for real-word errors comprising 125,562 sentences and 2,552,735 words. Based on the developed lexicon and dataset, we then develop a contextual spell checker that detects and corrects real-word errors. For the real-word error detection phase, word-gram features are used along with five machine learning classifiers, achieving a precision, recall, and F1-score of 0.84, 0.79, and 0.81, respectively. We also test the proposed approach with a 40% error density. For real-word error correction, the Damerau-Levenshtein distance is used along with the n-gram model for further ranking of the suggested candidate words, achieving an accuracy of up to 83.67%.
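The correction step described in the abstract (edit-distance candidates re-ranked by an n-gram model) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the restricted (optimal string alignment) variant of the Damerau-Levenshtein distance, a bigram model built from raw counts, a distance threshold of 1, and a toy English lexicon and corpus in place of the Urdu resources. The names `osa_distance` and `rank_candidates` are hypothetical.

```python
from collections import Counter


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "or" <-> "ro"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def rank_candidates(prev_word, flagged, lexicon, bigram_counts, max_dist=1):
    """Return lexicon words near `flagged`, best contextual fit first."""
    candidates = [
        w for w in lexicon
        if w != flagged and osa_distance(flagged, w) <= max_dist
    ]
    # Higher bigram count first, then smaller edit distance, then alphabetical.
    return sorted(
        candidates,
        key=lambda w: (-bigram_counts[(prev_word, w)], osa_distance(flagged, w), w),
    )


# Toy data (assumption): "form" is a real word used out of context after "letter".
lexicon = {"from", "form", "farm", "firm", "forum"}
corpus = "a letter from home . fill in the form . a letter from school".split()
bigram_counts = Counter(zip(corpus, corpus[1:]))

suggestions = rank_candidates("letter", "form", lexicon, bigram_counts)
print(suggestions)
```

In this toy setting the flagged word is itself in the lexicon, so a lexicon lookup alone would not catch it. Only the bigram context separates the candidates, which all sit at the same edit distance.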

Document Type: Article
Keywords: Real-word errors, spelling correction, spelling detection, spell checker
Subject Classification: Subjects > Engineering
Divisions: Universidad Europea del Atlántico > Research > Scientific Production
Fundación Universitaria Internacional de Colombia > Research > Articles and Books
Universidad Internacional Iberoamericana México > Research > Scientific Production
Universidad Internacional Iberoamericana Puerto Rico > Research > Scientific Production
Universidad Internacional do Cuanza > Research > Scientific Production
Deposited: 14 Sep 2023 23:30
Last Modified: 14 Sep 2023 23:30
URI: https://repositorio.unincol.edu.co/id/eprint/8800

Ultra-Wide Band Radar Empowered Driver Drowsiness Detection with Convolutional Spatial Feature Engineering and Artificial Intelligence

Driving while drowsy poses significant risks, including reduced cognitive function and the potential for accidents, which can lead to severe consequences such as trauma, economic losses, injuries, or death. The use of artificial intelligence can enable effective detection of driver drowsiness, helping to prevent accidents and enhance driver performance. This research aims to address the crucial need for real-time and accurate drowsiness detection to mitigate the impact of fatigue-related accidents. Ultra-wideband radar data collected over five minutes were segmented into one-minute chunks and transformed into grayscale images. Spatial features were extracted from the images using a two-dimensional Convolutional Neural Network and then used to train and test multiple machine learning classifiers. The ensemble classifier RF-XGB-SVM, which combines Random Forest, XGBoost, and Support Vector Machine using a hard voting criterion, achieved an accuracy of 96.6%. Additionally, the proposed approach was validated with a k-fold score of 97% and a standard deviation of 0.018. The dataset was then augmented using Generative Adversarial Networks, resulting in improved accuracies for all models. Among them, the RF-XGB-SVM model outperformed the rest with an accuracy score of 99.58%.
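A hard voting criterion means that each base classifier casts one vote per sample and the majority label wins. The sketch below illustrates only that combination rule in plain Python. The prediction lists are made-up placeholders rather than outputs of the study's Random Forest, XGBoost, or SVM models, and both the `hard_vote` helper and its tie-breaking rule (keep the label from the earliest-listed model) are assumptions.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model label lists by majority vote, sample by sample."""
    voted = []
    for labels in zip(*predictions_per_model):
        counts = Counter(labels)
        top = max(counts.values())
        # On a tie, keep the label from the earliest model among those tied
        # (assumption; with three binary classifiers no tie can occur).
        voted.append(next(label for label in labels if counts[label] == top))
    return voted


# Placeholder predictions (1 = drowsy, 0 = alert); not data from the study.
rf_pred = [1, 0, 1, 1, 0]
xgb_pred = [1, 1, 0, 1, 0]
svm_pred = [0, 0, 1, 1, 1]

ensemble_pred = hard_vote([rf_pred, xgb_pred, svm_pred])
print(ensemble_pred)
```

With an odd number of binary classifiers the vote is always decisive, which is one practical reason to combine three models in this way.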

Scientific Production

Hafeez Ur Rehman Siddiqui, Ambreen Akmal, Muhammad Iqbal, Adil Ali Saleem, Muhammad Amjad Raza, Kainat Zafar, Aqsa Zaib, Sandra Dudley, Jon Arambarri (jon.arambarri@uneatlantico.es), Ángel Gabriel Kuc Castilla, Furqan Rustam

A Comparison of the Clinical Characteristics of Short-, Mid-, and Long-Term Mortality in Patients Attended by the Emergency Medical Services: An Observational Study

Aim: The development of predictive models for patients treated by emergency medical services (EMS) is on the rise in the emergency field. However, how these models evolve over time has not been studied. The objective of the present work is to compare the characteristics of patients who present mortality in the short, medium and long term, and to derive and validate a predictive model for each mortality time. Methods: A prospective multicenter study was conducted, which included adult patients with unselected acute illness who were treated by EMS. The primary outcome was noncumulative mortality from all causes by time windows including 30-day mortality, 31- to 180-day mortality, and 181- to 365-day mortality. Prehospital predictors included demographic variables, standard vital signs, prehospital laboratory tests, and comorbidities. Results: A total of 4830 patients were enrolled. The noncumulative mortalities at 30, 180, and 365 days were 10.8%, 6.6%, and 3.5%, respectively. The best predictive value was shown for 30-day mortality (AUC = 0.930; 95% CI: 0.919–0.940), followed by 180-day (AUC = 0.852; 95% CI: 0.832–0.871) and 365-day (AUC = 0.806; 95% CI: 0.778–0.833) mortality. Discussion: Rapid characterization of patients at risk of short-, medium-, or long-term mortality could help EMS to improve the treatment of patients suffering from acute illnesses.

Scientific Production

Rodrigo Enriquez de Salamanca Gambara, Ancor Sanz-García, Carlos del Pozo Vegas, Raúl López-Izquierdo, Irene Sánchez Soberón, Juan F. Delgado Benito, Raquel Martínez Díaz (raquel.martinez@uneatlantico.es), Cristina Mazas Pérez-Oleaga (cristina.mazas@uneatlantico.es), Nohora Milena Martínez López (nohora.martinez@uneatlantico.es), Irma Dominguez Azpíroz (irma.dominguez@unini.edu.mx), Francisco Martín-Rodríguez

Human‐based new approach methodologies to accelerate advances in nutrition research

Much of nutrition research has conventionally been based on simplistic in vitro systems or animal models, which have been extensively employed to better understand the relationships between diet and complex diseases and to evaluate food safety. Although these models have undeniably contributed to increasing our mechanistic understanding of basic biological processes, they do not adequately model complex human physiopathological phenomena, creating concerns about translatability to humans. During the last decade, extraordinary advances in stem cell culturing, three-dimensional cell cultures, sequencing technologies, and computer science have produced a wealth of novel human-based and more physiologically relevant tools. These tools, also known as “new approach methodologies,” comprise patient-derived organoids, organs-on-chip, and multi-omics approaches, along with computational models and analysis, and they offer innovative ways to advance nutrition research from a human-biology-oriented perspective. After considering some shortcomings of conventional in vitro and in vivo approaches, here we describe the main novel available and emerging tools that are appropriate for designing more human-relevant nutrition research. Our aim is to encourage discussion on the opportunity to explore innovative paths in nutrition research and to promote a paradigm change toward a more human biology-focused approach to better understand human nutritional pathophysiology, to evaluate novel food products, and to develop more effective targeted preventive or therapeutic strategies, while helping to reduce and replace the animals employed in nutrition research.

Scientific Production

Manuela Cassotta (manucassotta@gmail.com), Danila Cianciosi, Maria Elexpuru Zabaleta (maria.elexpuru@uneatlantico.es), Iñaki Elío Pascual (inaki.elio@uneatlantico.es), Sandra Sumalla Cano (sandra.sumalla@uneatlantico.es), Francesca Giampieri (francesca.giampieri@uneatlantico.es), Maurizio Battino (maurizio.battino@uneatlantico.es)

Design and development of patient health tracking, monitoring and big data storage using Internet of Things and real time cloud computing

With the outbreak of the COVID-19 pandemic, social isolation and quarantine have become commonplace across the world. IoT health monitoring solutions eliminate the need for regular doctor visits and interactions among patients and medical personnel. Many patients in wards or intensive care units require continuous monitoring of their health. Continuous patient monitoring is a demanding practice in hospitals with limited staff; in a pandemic situation like COVID-19, it becomes much more difficult because hospitals are working at full capacity and there is still a risk of medical workers being infected. In this study, we propose an Internet of Things (IoT)-based patient health monitoring system that collects real-time data on important health indicators such as pulse rate, blood oxygen saturation, and body temperature, and that can be expanded to include more parameters. Our system comprises a hardware component that collects and transmits data from sensors to a cloud-based storage system, where it can be accessed and analyzed by healthcare specialists. The ESP-32 microcontroller interfaces with the multiple sensors and wirelessly transmits the collected data to the cloud storage system. Our system uses a pulse oximeter to measure blood oxygen saturation and body temperature, and a heart rate monitor to measure pulse rate. A web-based interface is also implemented, allowing healthcare practitioners to access and visualize the collected data in real time, making remote patient monitoring easier. Overall, our IoT-based patient health monitoring system represents a significant advancement in remote patient monitoring, allowing healthcare practitioners to access real-time data on important health metrics and detect potential health issues before they escalate.

Scientific Production

Md. Milon Islam, Imran Shafi, Sadia Din, Siddique Farooq, Isabel de la Torre Díez, Jose Breñosa (josemanuel.brenosa@uneatlantico.es), Julio César Martínez Espinosa (ulio.martinez@unini.edu.mx), Imran Ashraf

Pneumonia Detection Using Chest Radiographs With Novel EfficientNetV2L Model

Pneumonia is a potentially life-threatening infectious disease that is typically diagnosed through physical examinations and diagnostic imaging techniques such as chest X-rays, ultrasounds, or lung biopsies. Accurate diagnosis is crucial, as a wrong diagnosis, inadequate treatment, or lack of treatment can cause serious consequences for patients and may become fatal. Advances in deep learning have significantly helped medical experts diagnose pneumonia by assisting in their decision-making process. By leveraging deep learning models, healthcare professionals can enhance diagnostic accuracy and make informed treatment decisions for patients suspected of having pneumonia. In this study, six deep learning models, namely CNN, InceptionResNetV2, Xception, VGG16, ResNet50, and EfficientNetV2L, are implemented and evaluated. The study also incorporates the Adam optimizer, which effectively adjusts the epoch for all the models. The models are trained on a dataset of 5856 chest X-ray images and show 87.78%, 88.94%, 90.7%, 91.66%, 87.98%, and 94.02% accuracy for CNN, InceptionResNetV2, Xception, VGG16, ResNet50, and EfficientNetV2L, respectively. Notably, EfficientNetV2L demonstrates the highest accuracy and proves its robustness for pneumonia detection. These findings highlight the potential of deep learning models in accurately detecting and predicting pneumonia based on chest X-ray images, providing valuable support in clinical decision-making and improving patient treatment.

Scientific Production

Mudasir Ali, Mobeen Shahroz, Urooj Akram, Muhammad Faheem Mushtaq, Stefanía Carvajal-Altamiranda (stefania.carvajal@uneatlantico.es), Silvia Aparicio Obregón (silvia.aparicio@uneatlantico.es), Isabel De La Torre Díez, Imran Ashraf