Smart Maternity: Machine Learning for Safer Prenatal Clinic

Date
2024-05
Publisher
The European Academic Journal (EAJ)
Abstract
Maternal health is a major public health concern because it affects the well-being of both mother and child. Most maternal deaths can be prevented with timely intervention, so it is important to predict whether a pregnancy is high risk, mid risk, or low risk and direct prompt attention to the mother. This study develops and evaluates machine learning models that predict these three maternal risk levels from a dataset of seven maternal health attributes. Three algorithms were trained: Logistic Regression, Random Forest, and Support Vector Machine (SVM). Because the three classes were substantially imbalanced, the Synthetic Minority Over-sampling Technique (SMOTE) was used to balance them, and each algorithm was trained and evaluated on both the imbalanced and the balanced dataset. The data were split 80/20 into training and test sets so that performance could be measured on unseen data. The algorithms were compared on their accuracy in predicting maternal risk levels, and each was also assessed on randomly entered input values. Random Forest achieved the highest accuracy: 85.71 percent on the balanced dataset and 81.77 percent on the imbalanced dataset. In general, algorithms trained on the SMOTE-balanced dataset performed better than those trained on the imbalanced dataset. For a randomly entered set of attribute values, Random Forest and SVM predicted the risk level accurately.
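The balancing and splitting steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the study's code. The toy data (two made-up features, class sizes of 40, 20, and 10), the function names, the neighbour count `k=3`, the stratified split, and the choice to oversample only the training portion are all assumptions made for illustration; the abstract does not state these details. In practice a maintained library implementation of SMOTE would normally be used instead of a hand-written one.

```python
import math
import random
from collections import Counter


def stratified_split_80_20(rows, labels, seed=0):
    """Shuffle each class separately and send 80% to train, 20% to test."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = len(idx) * 4 // 5  # integer 80% cut point
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    pick = lambda seq, ids: [seq[i] for i in ids]
    return (pick(rows, train_idx), pick(labels, train_idx),
            pick(rows, test_idx), pick(labels, test_idx))


def smote_oversample(rows, labels, k=3, seed=0):
    """Grow every class to the majority size with SMOTE-style synthetic points.

    Each synthetic point lies on the line segment between a sampled point
    and one of its k nearest neighbours from the same class.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        members = [r for r, y in zip(rows, labels) if y == cls]
        if len(members) < 2:
            continue  # a single point has no neighbour to interpolate towards
        for _ in range(target - n):
            base = rng.choice(members)
            others = [m for m in members if m is not base]
            neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
            mate = rng.choice(neighbours)
            gap = rng.random()  # position along the segment, in [0, 1)
            out_rows.append([b + gap * (m - b) for b, m in zip(base, mate)])
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical imbalanced toy data standing in for the maternal dataset.
data_rng = random.Random(42)
rows, labels = [], []
for cls, n, centre in [("low risk", 40, 0.0), ("mid risk", 20, 3.0), ("high risk", 10, 6.0)]:
    for _ in range(n):
        rows.append([centre + data_rng.gauss(0, 1), centre + data_rng.gauss(0, 1)])
        labels.append(cls)

X_train, y_train, X_test, y_test = stratified_split_80_20(rows, labels)
X_bal, y_bal = smote_oversample(X_train, y_train)

print("train before SMOTE:", dict(Counter(y_train)))
print("train after SMOTE: ", dict(Counter(y_bal)))
print("test size:", len(X_test))
```

In this sketch the test set is left untouched, so any accuracy computed on it would reflect the original class proportions. Whether the study balanced the data before or after splitting is not stated in the abstract.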