Journal of Cancer Sciences
Research Article
Predictive Analytics for Disease Diagnosis: A Study on Healthcare Data with Machine Learning Algorithms and Big Data
Purna Chandra Rao Chinta1*, Chethan Sriharsha Moore2, Laxmana Murthy Karaka3, Manikanth Sakuru4, Varun Bodepudi5 and Srinivasa Rao Maka6
1Microsoft, Sr Technical Support Engineer
2Microsoft, Sr Technical Support Engineer
3Code Ace Solutions Inc, Software Engineer
4JP Morgan Chase, Lead Software Engineer
5Deloitte Consulting LLP, Senior Solution Specialist
6North Star Group Inc, Software Engineer
*Address for Correspondence: Purna Chandra Rao Chinta, Microsoft, Sr Technical Support Engineer
Email Id: chpurnachandrarao@gmail.com
Submission: 04 January, 2025
Accepted: 31 January, 2025
Published: 03 February, 2025
Copyright: © 2025 Chinta PCR, et al. This is an open access article
distributed under the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the
original work is properly cited.
Keywords: World Health Organization (WHO); Breast Cancer;
Tumour; Machine Learning; Healthcare; Disease Diagnosis;
Feedforward Neural Network (FFNN); Random Forest (RF); Decision Tree
(DT); Convolutional Neural Network (CNN)
Abstract
Breast cancer is currently the second leading cause of cancer-related
death among women, making it a major epidemiological issue. The
illness is often not caught early enough, and half of the one million
women diagnosed with breast cancer annually die from the condition.
This research aims to predict the occurrence of breast cancer using
several ML algorithms, namely a Feedforward Neural Network (FNN),
Random Forest (RF), and Decision Tree (DT), with the goal of reducing
the risk of death from this disease, which is the second most common
cause of death among women globally. The models are assessed on the
Breast Cancer Wisconsin (Diagnostic) dataset. The FNN model
outperformed RF and DT, achieving the best overall performance with
precision, recall, and accuracy all at 97.18%. These results highlight
the FNN’s robustness in minimising false positives and maximising true
positives, making it a reliable tool for breast cancer diagnosis. To
further enhance the accuracy of feature extraction and classification,
future research may look at incorporating stronger deep learning
models such as transformer architectures and Convolutional Neural
Networks (CNNs). The model’s generalisability and clinical usefulness
might be further validated by using bigger and more varied datasets.
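
For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, precision, and recall can be computed for a binary malignant/benign classifier. The function name `binary_metrics` and the toy labels are illustrative assumptions; they are not the study's code or data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels.

    Hypothetical helper for illustration only; not taken from the study.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    tn = len(pairs) - tp - fp - fn                                    # true negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # share of predicted positives that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of actual positives that are found
    return accuracy, precision, recall


# Toy labels (assumed): 1 = malignant, 0 = benign.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
acc, prec, rec = binary_metrics(y_true, y_pred)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

High precision corresponds to few false positives, and high recall to few missed malignant cases, which is the sense in which the abstract describes the FNN's results.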