Interpretable Machine Learning on Sanitation, Water, and Diarrheal Disease in Java
Luhur Akbar Devianto1*, Chaterine Hanindya Putri Situmorang1, Putri Setiani1, Kiki Gustinasari1
1) Department of Biosystem Engineering, Faculty of Agriculture Technology, Universitas Brawijaya
Jalan Veteran, Malang 65145, Indonesia
*Email: luhur.devianto[at]ub.ac.id
Abstract
Diarrheal disease remains a major public health concern in Indonesia. Java Island is among the most vulnerable regions because its dense population and varied environmental conditions heighten the risk of transmission. This study applies an interpretable machine learning framework to examine how sanitation, surface water quality, and environmental drivers shape the prevalence of diarrheal disease. An integrated dataset was compiled from official and reliable sources, combining household sanitation indicators, physicochemical and microbiological properties of surface water, and reported diarrheal case data across districts. Model interpretation was performed using SHapley Additive exPlanations (SHAP), which quantifies each predictor's contribution to disease risk and the direction of its influence. This approach provides both a global view of population-level determinants and local explanations for high-risk clusters. The findings offer actionable insights to guide water quality management, improvements in sanitation infrastructure, and the development of climate-resilient public health strategies tailored to regional needs.
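To make the attribution idea behind SHAP concrete, the following is a minimal sketch of an exact Shapley value computation on a toy risk function. It is not the study's pipeline: the abstract does not name the underlying model, and the sketch enumerates feature coalitions directly instead of calling the SHAP library. The function `toy_risk`, its coefficients, the three feature names, and the `baseline` and `district` values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline point.

    Features outside a coalition are held at their baseline value.
    Cost grows as 2**n, so this is only practical for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_risk(features):
    """Hypothetical risk score, NOT a fitted model from the study."""
    sanitation, ecoli, rainfall = features
    return (5.0 - 3.0 * sanitation + 2.0 * ecoli + 1.0 * rainfall
            + 1.5 * ecoli * rainfall)


# Hypothetical, rescaled feature values (illustration only)
feature_names = ["sanitation_access", "ecoli_level", "rainfall"]
baseline = [0.7, 0.2, 0.5]   # assumed "average" district
district = [0.4, 0.9, 0.8]   # assumed high-risk district

phi = shapley_values(toy_risk, district, baseline)
for name, value in zip(feature_names, phi):
    direction = "raises" if value > 0 else "lowers"
    print(f"{name}: {value:+.4f} ({direction} predicted risk)")
```

The sign of each value gives the direction of influence, and the values sum to the difference between the prediction for the district and the prediction at the baseline. Averaging absolute values over many districts would give a global ranking, while the per-district values are the local explanations.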