Estimating the soil water retention curve - comparison of multiple nonlinear regression approach and random forest data mining technique

A - Papers appearing in refereed journals

Rastgou, M., Bayat, H., Mansoorizadeh, M. and Gregory, A. S. 2020. Estimating the soil water retention curve - comparison of multiple nonlinear regression approach and random forest data mining technique. Computers and Electronics in Agriculture. 174, p. 105502.

Authors: Rastgou, M., Bayat, H., Mansoorizadeh, M. and Gregory, A. S.
Abstract

This study evaluates the performance of the random forest (RF) method in predicting the soil water retention curve (SWRC) and compares it with the performance of nonlinear regression (NLR) and Rosetta-based pedotransfer functions (PTFs), a comparison that has not been reported so far. Fifteen RF-based and fifteen NLR-based PTFs were constructed using readily available soil properties for 223 soil samples from Iran. The general performance of the RF and NLR-based PTFs was quantified by the integral root mean square error (IRMSE), Akaike's information criterion (AIC) and the coefficient of determination (R²). The results showed that the accuracy of the RF-based PTFs was significantly (P < 0.05) better than that of the NLR-based PTFs, and that the reliability of the NLR-based PTFs was significantly (P < 0.01) better than that of the RF-based PTFs and all of the Rosetta-based PTFs. Averaged over all PTFs, the IRMSE, AIC and R² of the RF method were 0.041 cm³ cm⁻³, -16997.7 and 0.987 for the training step and 0.053 cm³ cm⁻³, -15547.5 and 0.981 for the testing step, whereas the values for the NLR method were 0.046 cm³ cm⁻³, -16616.4 and 0.984 for the training step and 0.048 cm³ cm⁻³, -16355.6 and 0.983 for the testing step. PTF5 of the RF and NLR methods, with inputs of sand and clay contents, bulk density, and the water content at field capacity and permanent wilting point, had the greatest R² values (0.987 and 0.989, respectively) and the lowest IRMSE values (0.039 and 0.032 cm³ cm⁻³, respectively) among all PTFs in the testing step. Overall, the RF method was less reliable than the NLR method for predicting the SWRC because of overprediction, uncertainty in determining the forest scale, and instability in the testing step. These findings could provide a scientific basis for further research on the RF method.
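
The three evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact formulas the authors used, so everything here is an assumption: the van Genuchten form with m = 1 - 1/n (suggested only by the keyword list), the least-squares form of AIC, the IRMSE taken as the root of the mean squared difference between two curves integrated over log10 of suction, and the integration limits `h_min` and `h_max`. All function and parameter names are hypothetical.

```python
import math


def van_genuchten(h, theta_r, theta_s, alpha, n):
    """Water content (cm3 cm-3) at suction h (cm); assumes m = 1 - 1/n."""
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha * h) ** n) ** m


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((o - p) ** 2 for o, p in zip(measured, predicted))
    ss_tot = sum((o - mean) ** 2 for o in measured)
    return 1.0 - ss_res / ss_tot


def aic(measured, predicted, n_params):
    """Least-squares AIC, N * ln(SSE / N) + 2 * p (assumed form).

    Undefined when the fit is perfect (SSE = 0).
    """
    n_obs = len(measured)
    sse = sum((o - p) ** 2 for o, p in zip(measured, predicted))
    return n_obs * math.log(sse / n_obs) + 2 * n_params


def irmse(params_measured, params_predicted, h_min=1.0, h_max=15000.0, steps=200):
    """Integral RMSE between two curves over log10(h), trapezoid rule.

    h_min and h_max (cm) are illustrative limits, not values from the paper.
    """
    a, b = math.log10(h_min), math.log10(h_max)
    width = (b - a) / steps
    total = 0.0
    for i in range(steps + 1):
        h = 10.0 ** (a + i * width)
        diff = van_genuchten(h, *params_predicted) - van_genuchten(h, *params_measured)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * diff ** 2 * width
    return math.sqrt(total / (b - a))
```

Under these assumptions, a lower IRMSE and AIC and a higher R² indicate a better PTF, which is the sense in which the values in the abstract are compared.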

Keywords: Pedotransfer functions; Soil water retention curve; Soil texture; Soil structure; Van Genuchten
Year of Publication: 2020
Journal: Computers and Electronics in Agriculture
Journal citation: 174, p. 105502
Digital Object Identifier (DOI): doi:10.1016/j.compag.2020.105502
Open access: Published as non-open access
Funder: Biotechnology and Biological Sciences Research Council
Funder project or code: S2N - Soil to Nutrition [ISPG]
Output status: Submitted
Publisher: Elsevier Sci Ltd
ISSN: 0168-1699

Permalink - https://repository.rothamsted.ac.uk/item/979z1/estimating-the-soil-water-retention-curve-comparison-of-multiple-nonlinear-regression-approach-and-random-forest-data-mining-technique

Restricted files

Accepted author manuscript

Under embargo indefinitely
