Combining Variable Selection and Multiple Linear Regression for Soil Organic Matter and Total

A - Papers appearing in refereed journals

Li, H., Wang, J., Zhang, J., Liu, T., Acquah, G. and Yuan, H. 2022. Combining Variable Selection and Multiple Linear Regression for Soil Organic Matter and Total. Agronomy. 12 (3), p. 638. https://doi.org/10.3390/agronomy12030638

Authors: Li, H., Wang, J., Zhang, J., Liu, T., Acquah, G. and Yuan, H.
Abstract

The successful estimation of soil organic matter (SOM) and soil total nitrogen (TN) contents with mid-infrared (MIR) reflectance spectroscopy depends on choosing appropriate variable selection techniques and multivariate methods for regression analysis. This study aimed to explore the potential of combining a multivariate method and spectral variable selection for SOM and TN estimation using MIR spectroscopy. Five hundred and ten topsoil samples were collected from Quzhou County, Hebei Province, China, and their SOM and TN contents and reflectance spectra were measured using DRIFT-MIR spectroscopy (diffuse reflectance infrared Fourier transform in the mid-infrared range, MIR, wavenumber: 4000–400 cm⁻¹; wavelength: 2500–25,000 nm). Two multivariate methods (partial least-squares regression, PLSR; multiple linear regression, MLR) combined with two variable selection techniques (stability competitive adaptive reweighted sampling, sCARS; bootstrapping soft shrinkage approach, BOSS) were used for model calibration. The MLR model combined with the sCARS method yielded the most accurate estimation result for both SOM (Rp² = 0.72 and RPD = 1.89) and TN (Rp² = 0.84 and RPD = 2.50). Out of the 2382 wavenumbers in a full spectrum, sCARS determined that only 31 variables were important for SOM estimation (accounting for 1.30% of all variables) and 27 variables were important for TN estimation (accounting for 1.13% of all variables). The results demonstrated that sCARS was a highly efficient approach for extracting informative wavenumbers and eliminating redundant ones. In addition, the current study indicated that MLR, which is simpler than PLSR, can achieve high-precision prediction of SOM and TN content when combined with spectral variable selection. As such, DRIFT-MIR spectroscopy coupled with MLR and sCARS is a good alternative for estimating the SOM and TN of soils.
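The figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is not taken from the paper: the function names are hypothetical, and it assumes the common conventions that RPD is the sample standard deviation of the reference values divided by the prediction RMSE, and that R² is one minus the ratio of residual to total sum of squares. The paper's exact definitions may differ.

```python
from math import sqrt
from statistics import mean, stdev


def wavenumber_to_wavelength_nm(wavenumber_cm1: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in nm (1 cm = 1e7 nm)."""
    return 1e7 / wavenumber_cm1


def selected_fraction_percent(n_selected: int, n_total: int) -> float:
    """Share of spectral variables retained by a selection method, in percent."""
    return 100.0 * n_selected / n_total


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD(observed) / RMSE (assumed definition)."""
    rmse = sqrt(mean([(o - p) ** 2 for o, p in zip(observed, predicted)]))
    return stdev(observed) / rmse


if __name__ == "__main__":
    # Spectral range stated in the abstract: 4000-400 cm^-1.
    print(wavenumber_to_wavelength_nm(4000), wavenumber_to_wavelength_nm(400))
    # Variables retained by sCARS out of 2382 wavenumbers (SOM, then TN).
    print(round(selected_fraction_percent(31, 2382), 2))
    print(round(selected_fraction_percent(27, 2382), 2))
```

The two conversions reproduce the 2500–25,000 nm range given in the abstract, and the two fractions round to the reported 1.30% and 1.13%. The `r_squared` and `rpd` helpers are included only to make the reported metrics concrete; applying them requires observed and predicted values, which this record does not contain.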

Keywords: Precision agriculture; Mid-infrared soil spectroscopy; Spectral variable selection; Multiple linear regression
Year of Publication: 2022
Journal: Agronomy
Journal citation: 12 (3), p. 638
Digital Object Identifier (DOI): https://doi.org/10.3390/agronomy12030638
Open access: Published as ‘gold’ (paid) open access
Publisher's version
Output status: Published
Publication dates
Online: 05 May 2022
Publication process dates
Accepted: 02 Mar 2022
Publisher: MDPI
ISSN: 2073-4395

Permalink - https://repository.rothamsted.ac.uk/item/9893x/combining-variable-selection-and-multiple-linear-regression-for-soil-organic-matter-and-total
