The legacy effect of synthetic N fertiliser

Abstract Cumulative crop recovery of synthetic fertiliser nitrogen (N) over several cropping seasons (legacy effect) generally receives limited attention. The increment in crop N uptake after the first‐season uptake from fertiliser can be expressed as a fraction (∆RE) of the annual N application rate. This study aims to quantify ∆RE using data from nine long‐term experiments (LTEs). As such, ∆RE is the difference between first season (RE1st) and long‐term (RELT) recovery of synthetic fertiliser N. In this study, RE1st was assessed either by the 15N isotope method or by a zero‐N subplot freshly superimposed on a long‐term fertilised LTE treatment plot. RELT was calculated by comparing N uptake in the total aboveground crop biomass between a long‐term fertilised and long‐term control (zero‐N) treatment. Using a mixed linear effect model, the effects of climate, crop type, experiment duration, average N rate, and soil clay content on ∆RE were evaluated. Because the experimental setup required for the calculation of ∆RE is relatively rare, only nine suitable LTEs were found. Across these nine LTEs in Europe and North America, the mean ∆RE was 24.4% (±12.0%, 95% CI) of annual N application, with higher values for winter wheat than for maize. This result shows that fertiliser‐N retained in the soil and stubble may contribute substantially to crop N uptake in subsequent years. Our results suggest that an initial recovery of 43.8% (±11%, 95% CI) of N application may increase to around 66.0% (±15%, 95% CI) on average over time. Furthermore, we found that ∆RE was not clearly related to long‐term changes in topsoil total N stock. Our findings show that the—often used—first‐year recovery of synthetic fertiliser N application does not express the full effect of fertiliser application on crop nutrition. The fertiliser contribution to soil N supply should be accounted for when exploring future scenarios on N cycling, including crop N requirements and N balance schemes. 
Highlights Nine long‐term cereal experiments in Europe and USA were analysed for long‐term crop N recovery of synthetic N fertiliser. On average, and with application rates between 34 and 269 kg N/ha, crop N recovery increased from 43.8% in the first season to 66.0% in the long term. Delta recovery was larger for winter wheat than maize. Observed increases in crop N uptake were not explained by proportionate increases in topsoil total N stock.


Funding information
The Rothamsted Long-term Experiments National Capability (LTE-NC) (which includes Broadbalk and Hoosfield) is supported by the UK BBSRC (Biotechnology and Biological Sciences Research Council, BBS/E/C/000J0300) and the Lawes Agricultural Trust.

1 | INTRODUCTION
Long-term experiments (LTEs), such as the Broadbalk Wheat Experiment at Rothamsted Research (UK), show that external nitrogen (N) inputs can increase crop yields by a factor of two to three (Rasmussen et al., 1998). While ample N supply has led to increased food security in recent decades, it can cause environmental damage such as eutrophication of surface waters, loss of biodiversity and global warming. On the other hand, soil nutrient depletion, low yields and severe food scarcity are apparent in places with insufficient access to N inputs. Sustainable N management implies avoidance of excess application as well as avoidance of soil fertility depletion. This involves proper accounting for crop N requirements to meet given target yields, both in the short and the long term. This is especially relevant in regions where drastic changes in fertiliser-N input are advocated or expected.
Research on inorganic (also called synthetic or mineral) N fertilisers has largely focused on N uptake in the year of application, and recommendation systems commonly account only for first-season effects. Relatively few studies have aimed at quantifying the long-term effect of synthetic fertiliser N on soil N and crop N uptake, although the need has been recognised (e.g., by Thomsen et al., 2003). For organic manures, in contrast, long-term increments of total soil N, soil organic matter (SOM) and crop N uptake are well documented (Lund & Doss, 1980; Schröder et al., 2005).
Yet, inorganic fertiliser N inputs may also change the size or composition of the soil N pool in the long term, directly or via crop residues, and this could potentially sustain an increased annual soil N supply and associated crop N uptake and yield (i.e., the legacy effect of synthetic fertiliser N application). While multiple-year effects on soil N supply remain scarcely documented for synthetic N fertilisers, several estimates have been made in the UK based on trials with 15N-labelling. Sylvester-Bradley et al. (1987) calculated that 10% of synthetic fertiliser applied was re-mineralised in the second year, 3% of the remainder in the third year, and 1% in each of the following years. Similar values were found by others who later followed the fate of fertiliser-derived 15N over multiple years (e.g., Glendining et al., 2001; Macdonald et al., 2002; Smith & Chalk, 2018).
LTEs can be used to quantify long-term apparent fertiliser N recovery by comparing annual crop N uptake in plots that did or did not receive fertiliser N for many years. Long-term recovery thereby accounts for both the continuous depletion of the initial soil N stock in the absence of fertiliser inputs and the possible build-up of soil N under a regime of fertiliser input. Long-term N response curves, therefore, show steeper yield responses to N (at low and moderate N rates) than curves from the 1-year trials typically used to inform fertiliser recommendation systems (van Grinsven et al., 2022).
Trends in long-term N recovery, as seen in LTEs, may provide an upper estimate of the effect that sustained inorganic N inputs may have on soil N supply and crop yield. In the Bad Lauchstädt trial (Germany), a long-term increase in N recovery was observed between 1970 and 2016 (Figure S1). However, such trends do not necessarily reflect an increase in soil N supply caused by fertiliser input. They may also be caused by improvements in crop genotype or management or changes in climate. At LTE Ropsley (UK), Bhogal, Young, Sylvester-Bradley, O'Donnell, and Ralph (1997) found a positive trend in N recovery between 1978 and 1990. Interestingly, the positive trend in the long-term N recovery fraction over time appeared to be steeper at higher N rates at Ropsley.
Other long-term studies assessed the residual effect of historically applied N on current crop N uptake after changing N application rates (e.g., Maaz & Pan, 2017; Petersen et al., 2010; Thomsen et al., 2003). Petersen et al. (2010) studied several experiments in Scandinavia where a wide range of new N rates was superimposed on historical N rates. The effect of historical N rates was found to be small compared to the effect of the newly established N rates on crop N uptake.
In this study, we present a new analysis of the legacy effect of synthetic fertiliser N application that borrows elements from the above-cited studies. The objective of this study is to quantify the contribution of synthetic fertiliser N to soil N supply by evaluating the difference (ΔRE) between short-term recovery (RE1st) and long-term recovery (RELT) of fertiliser N in the aboveground crop biomass. We applied this method to a number of suitable cereal-based LTEs. We hypothesised that the long-term recovery of applied synthetic N is larger than the first-season recovery and that this difference would be mediated by factors including climate, crop type, experiment duration, average N application, soil clay content, calculation method for RE1st, and crop residue management.

2 | MATERIALS AND METHODS
First, the existing literature was searched for LTEs with suitable experimental set-ups (as described in Sections 2.1 and 2.2). Subsequently, RELT, RE1st and ΔRE were calculated for a number of data sets within each LTE (Sections 2.3, 2.4 and 2.5). Finally, a meta-analysis was conducted to find the mean ΔRE and to explain observed variation using a number of co-variables.

2.1 | Data selection and criteria
Data were collected from journal articles that reported information about LTEs. The selection criteria for inclusion in this study were as follows: (1) at least one long-term fertilised (NPK) and one long-term unfertilised control (PK) plot should be present to quantify RELT; (2) either a 15N subplot or a new control (PK) subplot superimposed on the long-term N-fertilised plot should be present to allow quantifying RE1st. Using the search terms 'Longterm' and 'Cereal' and '15N' and/or 'subplot' in Google Scholar and Web of Science, five useful experiments were selected. Another experiment was obtained from the CATCH-C database, which contains data from over 300 LTEs in Europe (Sandén et al., 2018). (None of the other LTEs in that extensive collection met our criteria.) Via personal communication, three more useful experiments were found. In total, this resulted in nine useful LTEs, which contained data from 11 experimental sites. When data were not fully provided in an article, they were obtained either by personal communication or by analysing figures from the article using WebPlotDigitizer (Rohatgi, 2020).

2.2 | Characteristics of LTEs included in the meta-analysis
The selected experiments suitable for the calculation of ΔRE were located in Europe and North America (Figure 1). Crop residues (excluding roots and stubble) were removed from the field, except for the LTEs in Kiel and Iowa. The duration of the experiments varied between 5 and 141 years, with an average value of 38 years (Figure S2). No experiments were found with a duration between 21 and 80 years. Although a duration as short as 5 years may not be regarded as 'long-term' by many, and while long-term effects may indeed become more apparent over time, we still retained all data, considering the scarcity of LTEs that allowed assessment of both RE1st and RELT. An overview of meta-data and slight deviations from the above methods to calculate ΔRE is provided in Table 1. Such deviations include irreversible modifications in the experimental setup (instead of using temporary subplots) enabling the calculation of RE1st. A detailed description of all experiments is provided in Table S1.

2.3 | Quantifying long-term N recovery
The long-term N recovery fraction (RELT) was defined as the fraction of annually applied fertiliser N recovered in the aboveground crop biomass. It includes recovery of fertiliser N applied in the current season, as well as recovery of previously applied fertiliser N, via uptake of fertiliser-derived soil N built up in previous seasons. It is expressed as a fraction of the annual application rate. RELT was calculated based on LTEs where fixed levels of synthetic fertiliser N were maintained over many years (Figure 2; Equation 1). To calculate RELT from an LTE, at least one N application rate and a control plot must be present in the experimental set-up. The control plot should have received zero N, with phosphorus (P) and potassium (K) application at the same rate as the fertilised plot. As this method takes a zero-N treatment as a reference, it should be referred to as 'long-term apparent recovery' (as opposed to labelled N recovery), but the term 'apparent' is omitted for brevity in the remainder of the text.

$$RE_{LT} = \frac{U_{N,LT} - U_{0N,LT}}{\mathrm{N\ rate}} \qquad (1)$$

with: U_N,LT, annual N uptake from the long-term fertilised plot (kg N/ha). U_0N,LT, annual N uptake from the long-term non-fertilised (control) plot (kg N/ha). N rate, amount of N applied annually (kg N/ha) to the long-term fertilised plot. Note that both the N and 0 N treatments here refer to the long-term treatments in the LTE that are still being continued, undisturbed by recent interventions made to assess first-season recovery. Note also that first-season recovery (Section 2.4) is part of RELT.

2.4 | Quantifying first-season N recovery
Within the long-term trial fields, two types of superimposed short-term experiments were considered suitable to calculate RE1st: (1) introduction of a 0 N subplot (as illustrated in Figure 2, upper subplot); (2) synthetic fertiliser application with a 15N isotope (Figure 2, lower subplot). RELT and RE1st were both calculated for the year in which such short-term treatments were added. For reliable estimation of the difference between RE1st and RELT, it is imperative that these two variables refer to the same year of observation, thus eliminating errors due to annual variation. Hence, the long-term treatments (control and fertilised) are to be continued unamended during the trial superimposed for the estimation of RE1st.

2.4.1 | Method 1: Introducing a subplot
Experiments with a newly introduced control subplot (receiving PK application only, without synthetic N fertiliser) enable the calculation of RE1st by subtracting measured N uptake in the control subplot from the N uptake in the long-term main plot and dividing by the long-term N application rate (Equation 2).

$$RE_{1st} = \frac{U_{N,LT} - U_{0N,ST}}{\mathrm{N\ rate}} \qquad (2)$$

with: U_N,LT, N uptake from the long-term fertilised plot (kg/ha). U_0N,ST, N uptake from the short-term non-fertilised (control) subplot (kg/ha), that is, where the historic long-term N rate was discontinued in the year of observation. N rate, amount of N applied annually (kg N/ha) to the long-term fertilised plot (as in Equation 1).

2.4.2 | Method 2: Using 15N
Alternatively, most of the LTEs allowed the calculation of RE1st from observations of first-season 15N recovery from fertiliser labelled with the 15N isotope (Powlson et al., 1986). This approach assumes that the two isotopes (14N and 15N) undergo chemical and biological processes in the same way (Equation 3).

$$RE_{1st} = \frac{U_{15N}}{^{15}\mathrm{N\ application\ rate}} \qquad (3)$$

with: U_15N, 15N uptake (kg/ha/year), and the 15N application rate expressed in kg N/ha/year.

FIGURE 1 Locations of long-term experiments in North America (a) and Europe (b) included in this study

2.5 | Delta recovery (ΔRE)
The main response variable in this analysis, delta recovery (ΔRE), was introduced to express the legacy effect of long-term synthetic N application on crop N uptake. We define ΔRE as the difference between first-season N recovery (RE1st) and long-term N recovery (RELT) in aboveground crop biomass, both measured in the same year. As explained, first-season N recovery (RE1st) refers to the fraction of N taken up from fertiliser in the year of application (Figure 3, large dotted arrow). Long-term N recovery (RELT), in contrast, also includes uptake of N applied in earlier years (Figure 3, solid black arrow). The difference between RE1st and RELT (ΔRE, Equation 4) results from the uptake of fertiliser N that was retained in the soil and released beyond the year of application. Therefore, ΔRE could be thought of as 'delayed N recovery' and is expressed as a fraction of the annual N application rate (%).

$$\Delta RE = RE_{LT} - RE_{1st} \qquad (4)$$
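The recovery definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the uptake values are hypothetical and chosen only to show the arithmetic. The last step shows that, with the Subplot method, the uptake of the fertilised plot cancels out, so ΔRE equals the extra soil N supply of the formerly fertilised subplot relative to the long-term control, per unit of N applied.

```python
def re_long_term(u_n_lt, u_0n_lt, n_rate):
    """Long-term apparent recovery: (U_N,LT - U_0N,LT) / N rate."""
    return (u_n_lt - u_0n_lt) / n_rate


def re_first_subplot(u_n_lt, u_0n_st, n_rate):
    """First-season recovery, Subplot method: (U_N,LT - U_0N,ST) / N rate."""
    return (u_n_lt - u_0n_st) / n_rate


def re_first_15n(u_15n, n15_rate):
    """First-season recovery, 15N method: labelled N uptake / labelled N applied."""
    return u_15n / n15_rate


def delta_re(re_lt, re_1st):
    """Delayed (legacy) recovery: RE_LT - RE_1st."""
    return re_lt - re_1st


# Hypothetical values in kg N/ha, for illustration only (not data from the study).
u_n_lt = 150.0   # uptake on the long-term fertilised plot
u_0n_lt = 50.0   # uptake on the long-term zero-N control
u_0n_st = 80.0   # uptake on the fresh zero-N subplot
n_rate = 144.0   # annual N application rate

re_lt = re_long_term(u_n_lt, u_0n_lt, n_rate)
re_1st = re_first_subplot(u_n_lt, u_0n_st, n_rate)
d_re = delta_re(re_lt, re_1st)

# With the Subplot method U_N,LT cancels, leaving (U_0N,ST - U_0N,LT) / N rate.
d_re_direct = (u_0n_st - u_0n_lt) / n_rate

print(f"RE_LT = {re_lt:.3f}, RE_1st = {re_1st:.3f}, dRE = {d_re:.3f}")
```

Both quantities must come from the same year of observation, as stressed in Section 2.4; otherwise annual variation in uptake leaks into ΔRE.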

2.6 | Meta-analysis
Relevant data from the nine LTEs were compiled in a database. Based on these, 66 sets of data were constructed, each of which allowed for calculating RE1st, RELT and ΔRE. This number (66) is larger than the number of experimental sites (11), because data from multiple years, crop types and N application rates were available for some of the LTEs (Table S3).
Every observation included information about experimental location, year, N application rate, RE1st calculation method and N uptake. Most studies included in this analysis did not provide a measure of variance for N uptake. Moreover, for some studies, every data point included only a single observation. Therefore, the number of underlying replicates was used as a weighting factor for the data points, including the number of years and the number of true field replicates. Experiment location and sampling year were included as random effects. For RE1st, RELT and ΔRE, the normality of distribution was checked using density plots (Figure S3).

FIGURE 2 Examples of short-term treatments within a long-term trial which allow for the calculation of ΔRE. The two shaded large fields have continuously received either synthetic N fertiliser or no N fertiliser. Often such treatments are part of a larger setup with multiple N rate treatments. At the right, examples of a 15N treatment and of a new control subplot are shown superimposed on the original (long-term fertilised) treatment. The 15N subplot receives the same N rate as the historic N rate, but now the fertiliser is 15N labelled. The numbers indicate the amount of applied synthetic N fertiliser (these are examples only). What is measured on these plots is indicated in brackets, with notation corresponding to Equations (1-3)

FIGURE 3 Hypothetical development of total N uptake and fertiliser N recovery, with continuous synthetic N application over time. The short-term recovery fraction (dark grey) increases over time in this graph, possibly by long-term improvements in, for example, cultivars or management. It could also decrease, for example, by changing biotic or abiotic stresses. The light grey area indicates the additional recovery (ΔRE) of synthetic fertiliser N via increased soil N supply. The total N recovery in a certain year is called long-term recovery (RELT), indicated by the solid black arrow. Uptake from the native soil N stock (that existed prior to the start of the LTE and dwindles over time) is not shown here
2.6.1 | Comparing RE1st calculation methods
First, we tested the extent to which the method used to calculate RE1st (either the 15N or the Subplot method) affected ΔRE. This comparison was performed on the whole dataset and, separately, on the subset of experiments where both methods were used. The latter gives the most straightforward comparison but can be applied to a few LTEs only. As this selection reduced the sample size and the other analyses were performed using all data points, the type of method was also added as an explanatory variable for ΔRE using a mixed-effects model from the nlme package in R (Pinheiro et al., 2020) on the whole dataset.

2.6.2 | Calculation of mean ΔRE
To calculate a weighted average of ΔRE across all studies, a mixed-effects model was used from the nlme package in R (Pinheiro et al., 2020). In cases where both methods to calculate RE1st were available, a separate value for ΔRE was included for each of the two methods.
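The role of the replicate-based weights can be illustrated with a plain weighted mean. This is a deliberately simplified sketch, assuming hypothetical ΔRE values and replicate counts; it ignores the random effects for location and year that the mixed-effects model in the study accounts for, so it is not a substitute for that model.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean; weights are, e.g., numbers of replicates."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("sum of weights must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical dRE observations (% of annual N rate) and replicate counts.
delta_re_pct = [30.0, 20.0, 10.0]
n_replicates = [4, 2, 2]

weighted = weighted_mean(delta_re_pct, n_replicates)
unweighted = sum(delta_re_pct) / len(delta_re_pct)

print(f"weighted mean: {weighted:.1f}%, unweighted mean: {unweighted:.1f}%")
```

Observations backed by more replicates pull the mean towards their value, which is also why weighting can bias the estimate towards conditions that happen to be better replicated.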

2.6.3 | Mixed-effects model estimation and selection
Besides quantifying ΔRE, this study also aimed to quantify the influence of several co-variables on ΔRE (Equation 5) such as crop type, experiment duration, average N application, soil type and climate. Co-variables were standardised to the same unit to enable comparison between studies. The effect of co-variables was tested in several combinations using mixed-effects models. To find the combination of co-variables that best fitted the data, a model selection was performed with the 'dredge' function from the MuMIn package (Barton, 2019), based on the corrected Akaike information criterion (AICc). Models were considered to be different when ΔAICc > 2. Fixed effects that were tested are provided in Equation 5. Co-variable values were mostly obtained from the published articles. In addition, the climate zone for each LTE was characterised by the Global Yield Gap Atlas approach, defining three main features: growing degree days (accumulated temperature sums for mean daily temperature above a base temperature), aridity index (annual total precipitation divided by annual total potential evapotranspiration), and temperature seasonality (quantified as the standard deviation of monthly average temperatures) (Van Wart et al., 2013). The variable named 'crop residue retention' indicated whether crop residues (straw) were removed or kept on the field after harvest (the latter being the case in two LTEs). Information about other co-variables is included in Table S3. However, these additional variables were not included in the analysis because the data were not available for all experiments.
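The AICc-based ranking can be illustrated with a short sketch. It is not the MuMIn 'dredge' procedure used in the study: it only applies the standard small-sample correction to AIC and ranks candidates by ΔAICc. The model names, log-likelihoods and parameter counts below are hypothetical.

```python
def aicc(log_likelihood, k, n):
    """Corrected AIC: AIC + 2k(k + 1) / (n - k - 1), with AIC = 2k - 2 logL."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def rank_models(models, n):
    """Return (name, AICc, delta AICc) tuples sorted from best to worst.

    models maps a model name to (log_likelihood, number_of_parameters).
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in models.items()}
    best = min(scores.values())
    return sorted(
        ((name, score, score - best) for name, score in scores.items()),
        key=lambda row: row[1],
    )


# Hypothetical candidate models fitted to n = 66 observations.
candidates = {
    "crop + method": (-250.0, 5),
    "crop + method + clay": (-249.5, 6),
    "no co-variables": (-262.0, 3),
}

ranking = rank_models(candidates, n=66)
for name, score, delta in ranking:
    verdict = "comparable to best" if delta <= 2 else "clearly worse"
    print(f"{name}: AICc = {score:.2f}, delta = {delta:.2f} ({verdict})")
```

In this hypothetical case the second model falls within 2 AICc units of the best one, so under the rule stated above the two would not be regarded as different.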
$$\Delta RE \sim \text{Growing degree days} + \text{Temperature seasonality} + \text{Aridity index} + \text{Crop type} + \text{Experiment duration} + \text{Average N application} + \text{Soil clay content} + \text{Method} + \text{Crop residue retention} + \varepsilon \qquad (5)$$

2.6.4 | Total soil N
In addition to the co-variables shown in Equation (5), the relation between ΔRE and the long-term change in total soil N stock was examined. As soil N data were only available for three experiments, this was done as a separate analysis. For those three experiments, the relative increase in total soil N (i.e., in fertilised plot versus unfertilised plot) was compared with the relative increase in soil N uptake (again in fertilised plot versus unfertilised plot; Equation 6).
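The comparison of relative increases can be sketched as follows. Equation 6 is not reproduced in this text, so the sketch assumes the common definition of a relative increase, (fertilised − control) / control, and uses hypothetical stock and uptake values; how soil-derived N uptake on the fertilised plot is obtained is not specified here.

```python
def relative_increase(fertilised, control):
    """Relative increase of the fertilised plot over the control plot."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return (fertilised - control) / control


# Hypothetical values, for illustration only (not data from the study).
soil_n_fertilised, soil_n_control = 2.2, 2.0   # topsoil total N stock (t N/ha)
uptake_fertilised, uptake_control = 60.0, 40.0  # soil-derived N uptake (kg N/ha)

soil_n_gain = relative_increase(soil_n_fertilised, soil_n_control)
uptake_gain = relative_increase(uptake_fertilised, uptake_control)

print(f"total soil N: +{soil_n_gain:.0%}, soil N uptake: +{uptake_gain:.0%}")
```

A much larger relative gain in uptake than in stock, as in this made-up example, is the kind of disproportion the separate analysis is designed to detect.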

3.1 | Observed ΔRE across nine experiments
In 61 out of 66 cases, RELT was larger than RE1st (Figure 4). Mean RE1st and RELT were 43.8% (±11%, 95% CI) and 66.0% (±15%, 95% CI) of the annual N application rate, respectively. For four observations, RELT exceeded 100%, meaning that the increment in N uptake (over the control treatment) was larger than the amount of N applied. Such points were retained in the overall analysis.

3.2 | Influence of co-variables on ΔRE
No severe collinearity was observed between the co-variables included in the full model (i.e., including all co-variables; further explained in Figure S4). The two models with the lowest AICc values include the variables crop type, method, and crop residue retention, with or without clay content (Table 2, Figure 5). Winter wheat showed a significantly higher ΔRE than maize (p = 0.002). ΔRE did not significantly differ from the other crop types. Soil clay percentage was included as a predictive variable for ΔRE in the second-best model. However, the estimated slope of 1.07% of the annual N application rate per percent clay content was not significant (p = 0.45) when clay content was included as the sole variable. The effect of clay content on RELT seemed more evident but was not significant (p = 0.24, Figure S5). Crop residue retention was included in all selected models. However, when crop residue retention was included as the sole variable, no significant difference in ΔRE (p = 0.38) was observed between retention and removal of crop residues. Crop residues were retained at two of the nine LTEs. Including experiment duration and level of N application did not improve the AICc value of any linear mixed-effects model used to explain variation in ΔRE (Table 2). When assessing the effect of experiment duration and N application as sole variables on ΔRE, no significant correlations were observed (Figure S2). As the correlation between duration and ΔRE may be more pronounced in earlier years (with RELT levelling off after some point in time), we also selected a subset of experiments with a duration between 5 and 21 years. In this period, no significant effect of duration could be found either (p = 0.399).
The type of method to assess RE1st (15N or Subplot method) was included as an explanatory variable in the best model when using data from all experiments (Figure 5c). In the experiments that allowed for both methods to calculate RE1st (Ropsley, Iowa-central and Iowa-southern, Figure 5d), ΔRE was not significantly different between the two calculation methods (p = 0.18). However, when removing one outlier, the 15N method showed a significantly higher ΔRE than the Subplot method (p = 0.008). The difference between the two methods amounted to 7.4% of annual N application, caused by lower RE1st values from 15N treatments compared to the Subplot treatments.
In three out of nine experiments, total soil N data were available. Topsoil total N was, on average, 10% higher on fertilised plots than on the control plots. N uptake from the soil in the fertilised plots, by contrast, was 86% higher on average than in the unfertilised control (Figure S6). There was no significant correlation between ΔRE and the long-term change in topsoil total N stock.

FIGURE 4 First-season recovery and long-term recovery for winter wheat, spring barley and maize (N = 66). The diagonal solid black line indicates RELT = RE1st; the difference between that black line and the red diagonal dashed line indicates the estimated average ΔRE. Point size indicates the weight based on sample size. Note that data from the LTE Bad Lauchstädt are excluded from this graph because separate values of long-term and first-season recovery could not be calculated and ΔRE was assessed differently (see Table S1; see Equation S2)

TABLE 2 Model results of a model without co-variables, with all co-variables and the four best models based on AICc model selection

4 | DISCUSSION AND CONCLUSION
4.1 | Long-term recovery is consistently higher than short-term recovery
Our results show a consistently positive ΔRE, which indicates that N originating from earlier synthetic fertiliser applications contributes to crop N uptake in years after its application. Based on our data set, the mean legacy effect of synthetic N fertiliser application (ΔRE) was 24.4% (±12.0%, 95% CI) of the annual N application rate. Figure 4 suggests that ΔRE may decrease with increasing values of RE1st. We therefore also assessed the legacy effect as a fraction of (1 − RE1st), rather than as a fraction of the N rate. However, expressing the legacy effect in this manner yielded no additional insights as, again, no correlations were found with explanatory variables (e.g., with experiment duration). Moreover, in practical cases, RE1st is often unknown beforehand. Expressing ΔRE as a fraction of N application, therefore, has much more practical value. We do recognise, however, that in a true equilibrium state the sum of RE1st and ΔRE cannot exceed 1 in situations where crop residues are removed from the field (as in most LTEs here).
Our observed ΔRE corresponds well with multiple 15N studies which followed the fate of a single synthetic 15N application over multiple years (e.g., Dourado-Neto et al., 2010; Glendining et al., 2001; Macdonald et al., 2002; Smith & Chalk, 2018). Nitrogen retention was assessed in those studies by measuring the fraction of applied 15N that ends up in the soil pool or crop. Additionally, when following 15N in the soil for multiple years, the return to the soil of fertiliser-derived N via crop residues was also measured in those studies. While that method allows following a 15N 'spike' (once applied) over several years, it does not allow quantification of the cumulative effect on crop N recovery of maintaining a fixed N rate over many years.
Jenkinson et al. (2004) followed such a single 15N pulse for nearly 20 years in old grassland, where the grass was harvested every year. In the year of application, about 47% of applied 15N was recovered. Cumulated over the following 18 years of the experiment, another 17% of the initially applied 15N was recovered in aboveground biomass, which is quite similar to the mean value of an additional 24.4% recovery for cereals found in our study.
Glendining et al. (1996) also found evidence for a positive ΔRE, using the same method that is here called the 'Subplot method', at Broadbalk where N application was withheld for one season. In the year that no N fertiliser was provided to the long-term plots, N uptake was higher on those 'withheld' plots that had previously received synthetic fertiliser N than on the long-term control plot which had never received N fertiliser. The maximum additional N uptake compared to the long-term 0 N treatment was found to be 29 kg N/ha, on a plot that had previously received 192 kg N/ha. The corresponding ΔRE value would be 15.1% of N applied. The interpretation of their results was somewhat difficult due to weed growth. Nonetheless, their reported value corresponds roughly to the value of ΔRE that was found in this study. (Our analysis included data from the same Broadbalk experiment, albeit from other years.)
In our study, weighting was used to give more importance to observations that were based on more replicates. However, weighting can also result in a bias towards those agro-ecological conditions for which a higher number of replicates happened to be available. Nonetheless, the estimated mean ΔRE only increased marginally (0.5%-point) when excluding weights in the mixed-effects model. Furthermore, the model selection results did not change when weights were excluded.
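The figures quoted above from Glendining et al. (1996) and Jenkinson et al. (2004) can be checked with simple arithmetic. The sketch below only restates numbers given in the text; the function name is illustrative.

```python
def pct_of_n_rate(extra_uptake_kg_ha, n_rate_kg_ha):
    """Express additional N uptake as a percentage of the N application rate."""
    return 100.0 * extra_uptake_kg_ha / n_rate_kg_ha


# Broadbalk (Glendining et al., 1996): 29 kg N/ha extra uptake on a plot
# that previously received 192 kg N/ha.
broadbalk_delta_re = pct_of_n_rate(29.0, 192.0)

# Old grassland (Jenkinson et al., 2004): about 47% recovered in the year of
# application plus another 17% over the following 18 years.
grassland_cumulative = 47.0 + 17.0

print(f"Broadbalk dRE: {broadbalk_delta_re:.1f}% of N applied")
print(f"Grassland cumulative 15N recovery: {grassland_cumulative:.0f}%")
```

The cumulative grassland recovery of about 64% is close to the mean long-term recovery of 66.0% reported for the cereal LTEs in this study.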

4.2 | Influence of crop type, soil clay content and RE1st calculation method
Crop type, soil clay content, crop residue retention and RE1st calculation method were the most important factors governing ΔRE, based on the model selection results. Winter wheat showed a significantly larger ΔRE than maize (Figure 5a), possibly because of its longer growing season, finer root system and ability to root deeper, which would enable it to use mineralised N from SOM more effectively (Thorup-Kristensen et al., 2009). Two out of the three experiments that cultivated maize used urea as fertiliser type, while all other experiments used a variation of ammonium nitrate. There was no significant difference in ΔRE between the two types of fertiliser, neither when evaluated on the whole dataset nor when based on the maize subset only (data not shown). Soil type may also affect the ability to store and re-mineralise fertiliser N. Soils with a higher clay content show a larger N retention capacity compared to sandy soils (Cheshire et al., 1999). However, neither RELT nor ΔRE was significantly influenced by soil clay content in our study (Figure S5). Retaining crop residues on the field was suggested to play an important role, based on the model selection results. However, crop residue retention did not significantly influence ΔRE when included as the sole variable in the model. Crop residues were retained in only two experiments, admittedly too few to obtain a firm view of the effect of crop residues on ΔRE. Finally, the method used to estimate RE1st (either the 15N or the Subplot method) was found to affect ΔRE. However, this was only significant within the subset of experiments where both the 15N and the Subplot method were available to calculate RE1st, and only so when one outlier data point was removed.

15N experiments are known to underestimate RE1st (Jenkinson et al., 1985; Quan et al., 2021), which would lead to higher estimates for ΔRE (the difference between RE1st and RELT becoming larger). Underestimation of RE1st could be caused by phenomena collectively referred to as 'added nitrogen interactions' (ANI). This includes pool substitution: labelled N replaces unlabelled N that would otherwise have been immobilised, leaving more unlabelled N available for plant uptake and so causing overestimation of the contribution of soil N to crop N uptake (Jenkinson et al., 1985). Stepwise N rate experiments avoid these difficulties but are potentially afflicted with the 'priming' issue (i.e., synthetic N application increasing soil N mineralisation). Quan et al. (2021) mention that differences between 15N and Subplot methods tend to increase over time. However, with increasing experiment duration, no significant change in the difference between the two methods was found in our data (p = 0.51, Figure S2). Differences between 15N and Subplot methods could also be caused by the 'law of diminishing returns', when the Subplot method uses different control N rates (Quan et al., 2021). The latter situation does not occur in our study: 0 N control plots were used in all experiments that relied on the Subplot method. Finally, more 15N experiments than 'subplot' experiments were found in the literature. This is most likely because 15N trials are less disturbing to the main setup of LTEs.

4.3 | Low variation in co-variable values
Climate, which was included in the regression model by using growing degree days, temperature seasonality, and an aridity index, did not significantly contribute to the explanation of observed variation in ΔRE, even when accounting for the influence of other co-variables. However, this could be due to the small variation in climate among the experimental locations. Most experimental sites were located around the same latitude, some with a more continental and others with a more maritime climate. For other climates, results may differ. However, no suitable long-term experiments were found beyond temperate climates.
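The three climate co-variables named above follow directly from their definitions in Section 2.6.3 and can be sketched as below. This is an illustration under stated assumptions, not the Global Yield Gap Atlas implementation: the base temperature default of 0 °C and the use of the population standard deviation are assumptions, and the function names are hypothetical.

```python
import statistics


def growing_degree_days(daily_mean_temps_c, base_temp_c=0.0):
    """Sum of daily mean temperature above a base temperature (degree days).

    The default base temperature of 0 degrees C is an assumption.
    """
    return sum(max(t - base_temp_c, 0.0) for t in daily_mean_temps_c)


def aridity_index(annual_precip_mm, annual_pet_mm):
    """Annual total precipitation divided by annual potential evapotranspiration."""
    if annual_pet_mm <= 0:
        raise ValueError("potential evapotranspiration must be positive")
    return annual_precip_mm / annual_pet_mm


def temperature_seasonality(monthly_mean_temps_c):
    """Standard deviation of monthly mean temperatures.

    The population standard deviation is used here; this choice is an assumption.
    """
    return statistics.pstdev(monthly_mean_temps_c)


# Tiny made-up inputs, for illustration only.
gdd = growing_degree_days([-2.0, 0.0, 5.0, 10.0])
ai = aridity_index(600.0, 800.0)
seasonality = temperature_seasonality([0.0, 10.0])

print(f"GDD = {gdd}, aridity index = {ai}, seasonality = {seasonality}")
```

Sites at similar latitudes yield similar values for all three indices, which is consistent with the limited climatic contrast discussed above.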
Somewhat surprisingly, the duration of the experiments did not affect ΔRE. A possible caveat is the lack of data points in this study with an experimental duration between around 20 and 80 years (Figure S2). When data points were clustered into two groups, the group with durations above 80 years (14 out of 67 observations) showed no higher ΔRE than the group with durations between 5 and 21 years (p = 0.24). A likely explanation is that contributions to crop uptake from remineralised N diminish over time, so that the increase in ΔRE from later years becomes progressively smaller.
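The diminishing contribution from older applications can be illustrated with a minimal sketch. It assumes, purely for illustration, that each annual application contributes a fixed fraction of the N rate to crop uptake in the following year and that this contribution shrinks geometrically in each later year. The function name `legacy_recovery` and the parameter values (`first_legacy=0.05`, `decay=0.8`) are hypothetical and are not estimates from this study.

```python
def legacy_recovery(years, first_legacy=0.05, decay=0.8):
    """Extra recovery (fraction of the annual N rate) in year `years` of
    continuous fertilisation, summed over all earlier applications.

    Hypothetical assumption: an application contributes `first_legacy` in the
    year after it was made, and `decay` times its previous-year contribution
    in every later year.
    """
    # An application made j years ago contributes first_legacy * decay**(j - 1).
    return sum(first_legacy * decay ** (j - 1) for j in range(1, years))


# Compare a short, a medium and a very long experiment duration.
for t in (5, 20, 80):
    print(t, round(legacy_recovery(t), 3))
```

Under these assumed parameters the summed contribution approaches a plateau of `first_legacy / (1 - decay)`, so an experiment of 80 years shows hardly more legacy recovery than one of 20 years.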
Because the experimental setup required for the calculation of ΔRE is relatively rare, the total number of observations used in this study was limited (N = 66, based on nine LTEs). As discussed, the limited size of our data pool, and especially the limited number of experiments included, may have caused the lack of observed correlations between co-variables and ΔRE, for example for experimental duration. Moreover, the co-variables were not evenly distributed across our dataset. For example, in contrast to winter wheat, there are no observations for maize with an experimental duration of over 26 years (Table S2). Similarly, most experiments included only one crop type and either retained residues on the field or removed them. Such a configuration can lead to confounding effects. To alleviate this problem, we used a mixed-effects model, as is common in meta-analyses, which accounts for multiple observations originating from the same experimental site. Nonetheless, the only significant difference in ΔRE that we found in our data was between winter wheat and maize. This does not rule out other correlations, but these could not be substantiated. More data points, with more variation in co-variables, may be required to reveal other possible co-variable effects.
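Why observations from the same site need to be accounted for can be shown with a simplified sketch. It is not the mixed-effects model used in this study but a cruder two-stage stand-in: values are first averaged within each site and only then across sites, so that a site with many observations does not dominate the crop mean. All site names and ΔRE values below are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: (site, crop, delta_RE as a fraction of the N rate).
observations = [
    ("site_A", "winter wheat", 0.30), ("site_A", "winter wheat", 0.34),
    ("site_A", "winter wheat", 0.32), ("site_A", "winter wheat", 0.36),
    ("site_B", "winter wheat", 0.20),
    ("site_C", "maize", 0.10), ("site_C", "maize", 0.14),
    ("site_D", "maize", 0.18),
]


def pooled_means(obs):
    """Mean per crop, treating every observation as independent."""
    by_crop = defaultdict(list)
    for _site, crop, value in obs:
        by_crop[crop].append(value)
    return {crop: mean(values) for crop, values in by_crop.items()}


def site_weighted_means(obs):
    """Mean per crop after first averaging within each site."""
    by_site = defaultdict(list)
    for site, crop, value in obs:
        by_site[(crop, site)].append(value)
    by_crop = defaultdict(list)
    for (crop, _site), values in by_site.items():
        by_crop[crop].append(mean(values))
    return {crop: mean(values) for crop, values in by_crop.items()}


print(pooled_means(observations))
print(site_weighted_means(observations))
```

In this made-up dataset the pooled winter wheat mean is pulled towards `site_A`, which supplies four of the five observations, whereas the site-weighted mean gives both sites equal weight. A mixed-effects model handles the same issue more formally by estimating a random effect per site.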

Total soil N
If a cumulative effect of synthetic fertiliser N on soil N supply exists beyond the years of application, one would also expect to find evidence of it in changing soil N stocks. Previous studies reported clear but relatively small increments in total soil N with increasing N application (Glendining & Powlson, 1995; Macdonald et al., 1989; Petersen et al., 2010). Glendining and Powlson (1995) indicated that total soil N increased under higher synthetic N application, but that mineralisable N increased proportionally more. This suggests that changes in the quality of the soil N pool, rather than an increased total N stock, govern ΔRE. A similar conclusion was drawn in another study by Glendining et al. (1996). A 'break point' has been reported at around 150-160 kg N application, above which N recovery increased more proportionally to total soil N than at lower N rates. Those findings correspond well with outcomes from our study, where the relative increase in total soil N after long-term synthetic N application is much smaller than the relative increase in N uptake from soil. This may imply that ΔRE reflects a change in the composition rather than the size of the total soil N pool, as Glendining and Powlson (1995) already suggested. To our knowledge, there is still no conclusive explanation for this disproportionality.

Implications of this study
Despite the relatively small number of observations and the potential difficulties afflicting the assessment of ΔRE, our results show that continuous, long-term application of synthetic fertiliser leads to a higher recovery of the applied synthetic fertiliser N than the first-season recovery. Additionally, crop type appears to be the most important factor governing this process (Table 2). More long-term experiments, with larger variations in all co-variables (e.g., assessing ΔRE in other situations, such as a tropical climate), can help to further develop an understanding of sustainable N cycling. The outcomes of this study suggest that a single-year N recovery of 43.8% (±11%, 95% CI) in the total aboveground biomass can become, on average, 66.0% (±15%, 95% CI) over time due to N retention in the soil and its subsequent release. This differs from the simplified value of 50% (e.g., Lassaletta et al., 2016) that is currently in common use and that does not consider fertiliser N retention by the soil.
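How first-season recovery, long-term recovery and ΔRE relate to each other can be sketched for the Subplot method, following the definitions given at the start of the paper: RELT compares a long-term fertilised plot with a long-term zero-N control, and RE1st compares the same fertilised plot with a zero-N subplot freshly superimposed on it. The function name `recovery_efficiency` and all uptake values below are hypothetical and serve only to show the arithmetic.

```python
def recovery_efficiency(uptake_fertilised, uptake_reference, n_rate):
    """Apparent recovery: extra crop N uptake per unit of N applied."""
    if n_rate <= 0:
        raise ValueError("n_rate must be positive")
    return (uptake_fertilised - uptake_reference) / n_rate


# Hypothetical values, all in kg N per hectare per year.
n_rate = 100.0                        # annual synthetic N application
uptake_long_term_fertilised = 120.0   # aboveground N uptake, fertilised plot
uptake_long_term_control = 55.0       # long-term zero-N control plot
uptake_fresh_zero_n_subplot = 75.0    # zero-N subplot new on the fertilised plot

re_lt = recovery_efficiency(
    uptake_long_term_fertilised, uptake_long_term_control, n_rate
)
re_1st = recovery_efficiency(
    uptake_long_term_fertilised, uptake_fresh_zero_n_subplot, n_rate
)
delta_re = re_lt - re_1st  # legacy effect as a fraction of the annual N rate

print(f"RE_LT = {re_lt:.2f}, RE_1st = {re_1st:.2f}, delta_RE = {delta_re:.2f}")
```

In this made-up case the fresh zero-N subplot still takes up more N than the long-term control because it benefits from earlier fertiliser applications, and that difference is what ΔRE captures.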
Due to the legacy effect of synthetic fertiliser N application (expressed here as ΔRE), N yield response curves based on long-term trials show steeper slopes than those based on short-term trials. As also argued by van Grinsven et al. (2022), this shift in slope should be taken into account in studies that seek to strike a balance between farm profit, food security and the environment. This is especially relevant in regions where N input rates are drastically changed, for example where grain output must rise steeply to feed a growing population, as in sub-Saharan Africa, or where N inputs are reduced to mitigate water pollution or greenhouse gas emissions, as in parts of Europe today.