Explainable machine learning models for predicting topsoil metals and oxides

Purpose: Interpretable machine learning (ML) models can help researchers and policymakers make well-informed decisions about soil and environmental protection. This study aimed to predict the concentrations of metals (including potentially toxic elements) and their oxides in topsoil using ML techniques and to interpret the resulting models.
Materials and methods: We demonstrated the approach using a dataset of eleven elements and oxides (As, Co, Cr, Fe2O3, MnO, Ni, Pb, Sr, TiO2, V and Zn) measured in mineral and organic soils. Each element's concentration was predicted by a model that used the other elements, soil properties (pH, soil organic carbon (SOC) content and stock, bulk density), vegetation condition (as measured by NDVI), and land use type (abandoned and pristine soils, peatlands) as predictors. The Shapley Additive explanations (SHAP) approach, which is based on Shapley values from cooperative game theory, was used to interpret the ML models.
Results and discussion: Cross-validation showed that most elements were predicted accurately. Vanadium (MAE = 3.72 mg/kg, RMSE = 4.59 mg/kg, R² = 0.95, MEC = 0.93) and TiO2 (MAE = 0.03%, RMSE = 0.04%, R² = 0.94, MEC = 0.93) were predicted with the highest accuracy. Although other elements served as the primary predictors in the ML models, the SHAP analysis allowed us to identify the specific contribution (positive or negative) of each element, soil property and land use type. For instance, soil parameters and land use type contributed differently to predictions of higher or lower concentrations of specific elements, reflecting the different pedological and geochemical processes in mineral and organic soils. Furthermore, the method provided thresholds, that is, the predictor levels at which their influence on the output becomes positive or negative.
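The four accuracy measures quoted in the results can be illustrated with a short sketch. The abstract does not define them, so the definitions below are assumptions: MAE and RMSE in their usual form, R² as the squared Pearson correlation between observed and predicted values, and MEC as the modelling efficiency coefficient (one minus the ratio of the residual sum of squares to the total sum of squares around the observed mean). The observed and predicted values are invented for illustration and are not taken from the study.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2_pearson(obs, pred):
    """Squared Pearson correlation (assumed definition of R2)."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    ss_o = sum((o - mean_o) ** 2 for o in obs)
    ss_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (ss_o * ss_p)


def mec(obs, pred):
    """Modelling efficiency coefficient (assumed definition): 1 - SSE / SST."""
    mean_o = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - sse / sst


# Invented example values (mg/kg), NOT data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 41.0]

scores = {
    "MAE": mae(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "R2": r2_pearson(observed, predicted),
    "MEC": mec(observed, predicted),
}
print(scores)
```

Under these assumed definitions, R² ignores systematic bias while MEC penalizes it, which is one possible reason the two values can differ for the same model.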
Conclusion: Interpretable ML models are essential for understanding the complex relationships between soil elements and properties, paving the way for more accurate predictions and informed environmental management decisions.
• Topsoil metals and oxides were modeled using machine learning techniques.
• Models were interpreted using the Shapley Additive explanations (SHAP) approach.
• Positive or negative contributions of predictors were examined for each element.
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
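The idea of signed predictor contributions can be sketched with an exact Shapley value computation on a toy model. Everything here is an assumption made for illustration: `toy_model`, its coefficients, the feature names and the baseline are invented and are not the authors' fitted models, and a single baseline point stands in for the background dataset that SHAP implementations normally average over.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature for one prediction.

    Features outside a coalition are set to their baseline value.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_model(f):
    # Hypothetical response with one interaction term; coefficients are invented.
    return 20.0 + 8.0 * f["fe2o3"] - 3.0 * f["ph"] + 0.5 * f["fe2o3"] * f["soc"]


baseline = {"fe2o3": 3.0, "ph": 6.0, "soc": 2.0}  # invented reference sample
sample = {"fe2o3": 5.0, "ph": 7.0, "soc": 4.0}    # invented sample to explain

phi = shapley_values(toy_model, sample, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, and their signs show whether a predictor pushes the prediction up or down. Exact enumeration scales exponentially with the number of features, so practical SHAP tools rely on approximations or model-specific algorithms.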

Authors
Suleymanov Azamat R. 1, 2, 3; Nizamutdinov Timur I. 1; Polyakov Vyacheslav Igorevich 1; Shevchenko E.V. 4; Dinkelaker Natalia V. 5; Petrović Marko D. 6, 7; Stošić Lazar V. 8, 9; Adelmurzina Ilgiza F. 2; Abakumov Evgeny V. 1
Issue
12
Language
English
Pages
3967-3983
Status
Published
Volume
25
Year
2025
Affiliations
  • 1 Department of Applied Ecology, Saint Petersburg State University, Saint Petersburg, Russian Federation
  • 2 Department of Geodesy, Ufa University of Science and Technology, Ufa, Bashkortostan Republic, Russian Federation
  • 3 Laboratory of Soil Science, Ufa Institute of Biology of the Russian Academy of Sciences, Ufa, Bashkortostan Republic, Russian Federation
  • 4 And Nanoelectronics, Saint Petersburg State University, Saint Petersburg, Russian Federation
  • 5 Faculty of Energy and Ecotechnology, Saint Petersburg National Research University of Information Technologies, Mechanics and Optics University ITMO, Saint Petersburg, Russian Federation
  • 6 Institute for the Serbian Language of SASA, Belgrade, Serbia
  • 7 Department of Regional Economics and Geography, RUDN University, Moscow, Moscow Oblast, Russian Federation
  • 8 Faculty of Informatics and Computer Science, Univerzitet Union Nikola Tesla, Belgrade, Serbia
  • 9 Department of Scientific and Technical Information and Scientific Publications, Donskoj Gosudarstvennyj Tehniceskij Universitet, Rostov-on-Don, Rostov Oblast, Russian Federation
Keywords
Heavy metals; Machine learning; Potential toxic elements; SHAP; Shapley values; Soil