Gully erosion spatial modelling - Role of machine learning algorithms in selection of the best controlling factors and modelling process

A - Papers appearing in refereed journals

Pourghasemi, H. R., Sadhasivam, N., Kariminejad, N. and Collins, A. L. 2020. Gully erosion spatial modelling - Role of machine learning algorithms in selection of the best controlling factors and modelling process. Geoscience Frontiers.

Authors: Pourghasemi, H. R., Sadhasivam, N., Kariminejad, N. and Collins, A. L.
Abstract

This investigation assessed the efficacy of 10 widely used machine learning algorithms (MLAs), comprising the least absolute shrinkage and selection operator (LASSO), generalized linear model (GLM), stepwise generalized linear model (SGLM), elastic net (ENET), partial least squares (PLS), ridge regression, support vector machine (SVM), classification and regression trees (CART), bagged CART, and random forest (RF), for gully erosion susceptibility mapping (GESM) in Iran. The locations of 462 existing gully erosion sites were mapped through widespread field investigations, and the observations were arbitrarily divided into 70% (323) for algorithm calibration and 30% (139) for validation. Twelve controlling factors for gully erosion, namely soil texture, annual mean rainfall, digital elevation model (DEM), drainage density, slope, lithology, topographic wetness index (TWI), distance from rivers, aspect, distance from roads, plan curvature, and profile curvature, were ranked in terms of their importance using each MLA. The MLAs were compared using a training dataset for gully erosion and statistical measures: RMSE (root mean square error), MAE (mean absolute error), and R-squared. Based on these comparisons, the RF algorithm exhibited the minimum RMSE and MAE and the maximum R-squared, and was therefore selected as the best model. The variable importance evaluation using the RF model revealed that distance from rivers had the greatest influence on the occurrence of gully erosion, whereas plan curvature had the least. According to the GESM generated using RF, most of the study area is predicted to have a low (53.72%) or moderate (29.65%) susceptibility to gully erosion, whereas only a small area is identified as having a high (12.56%) or very high (4.07%) susceptibility. The output of the RF model was validated using the ROC (receiver operating characteristic) curve approach, which returned an area under the curve (AUC) of 0.985, indicating the excellent forecasting ability of the model. The GESM prepared using the RF algorithm can aid decision-makers in targeting remedial actions to minimize the damage caused by gully erosion.
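The evaluation workflow summarised in the abstract (a 70/30 calibration/validation split, followed by RMSE, MAE, R-squared and AUC) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the random seed and the toy labels and scores are hypothetical, and R-squared is computed here as the coefficient of determination, which may differ from the definition used in the paper (for example, a squared correlation).

```python
import math
import random


def split_70_30(items, seed=0):
    """Shuffle items and split them 70% / 30% (hypothetical helper)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def auc(y_true, scores):
    """Area under the ROC curve: the probability that a random positive
    outscores a random negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 462 site indices stand in for the mapped gully locations.
calibration, validation = split_70_30(range(462))

# Toy labels (1 = gully, 0 = no gully) and made-up susceptibility scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.1, 0.8, 0.4]
print(len(calibration), len(validation))
print(rmse(labels, scores), mae(labels, scores), r_squared(labels, scores), auc(labels, scores))
```

With 462 sites, rounding 70% gives 323 calibration and 139 validation observations, matching the counts reported in the abstract. The pairwise AUC shown here is quadratic in the number of samples, which is acceptable for a few hundred points but would usually be replaced by a rank-based routine on larger datasets.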

Keywords: Machine learning algorithm; Gully erosion; Random forest; Controlling factors; Variable importance
Year of Publication: 2020
Journal: Geoscience Frontiers
Digital Object Identifier (DOI): doi:10.1016/j.gsf.2020.03.005
Open access: Published as non-open access
Funder: Biotechnology and Biological Sciences Research Council; College of Agriculture, Shiraz University
Funder project or code: S2N - Soil to Nutrition - Work package 3 (WP3) - Sustainable intensification - optimisation at multiple scales; 97GRC1M271143
Output status: Published
Publication dates
Online: 25 Mar 2020
Publication process dates
Accepted: 09 Mar 2020
Publisher: Elsevier
ISSN: 1674-9871

Permalink - https://repository.rothamsted.ac.uk/item/97y94/gully-erosion-spatial-modelling-role-of-machine-learning-algorithms-in-selection-of-the-best-controlling-factors-and-modelling-process

Restricted files

Publisher's version

Under embargo indefinitely
