Burden tests can be used to map causal genes for a simple metabolic trait in an exome‐sequenced polyploid mutant population

Forward genetic screens are an excellent tool to assign gene function, but it is often necessary to employ map-based cloning to identify the causal genes. This can be laborious and represents a bottleneck in plant fundamental and applied research. With advances in DNA technology, it is becoming increasingly affordable to sequence large populations. Krasileva et al. (2017) exome sequenced tetraploid and hexaploid wheat ethyl methanesulfonate (EMS) mutagenized populations, primarily to facilitate reverse genetic screens. Gene redundancy allows a very high mutant load of 35-40 mutations per kilobase, and the populations of ~1500 and ~1200 lines each harbour ~22-23 missense or truncation mutations per gene. Here, we show that burden tests, a simple form of rare-variant association analysis developed for human disease genetics (Lee et al., 2014), can be used to identify causal genes in the hexaploid wheat (Triticum aestivum) cv. Cadenza mutant population, without the need for map-based cloning.
The statistical power to detect association with rare variants is very limited (Lee et al., 2014), and most mutations in the Cadenza EMS population are singletons (Krasileva et al., 2017). Burden tests work by collapsing multiple variants within a gene (or other functional group) into a single test score, thereby increasing the frequency of the tested variable and providing greater power (Lee et al., 2014). However, this power relies on the selected variants mostly being causal and having the same direction and magnitude of effect (Lee et al., 2014). Such assumptions likely hold for mutant populations, where causal variants are most frequently deleterious (Meinke, 2013) and their severity can be predicted from sequence analysis (Kumar et al., 2009). The absence of genetic structure in mutant populations should simplify association studies. Collapsing homoeologous groups, which lack functional divergence in 'recent' polyploids like wheat (Krasileva et al., 2017), should also improve power.
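The collapsing step can be illustrated with a minimal Python sketch. It is not the analysis used in this study: the line names, gene names and phenotype values are hypothetical, every variant is given equal weight, and the association is tested with ordinary least squares rather than a mixed model. It only shows how several singletons in one gene become a single score per line that can be regressed against a trait.

```python
from collections import defaultdict
from math import sqrt


def burden_scores(variant_calls, lines):
    """Collapse (gene, line) variant calls into one score per gene and line.

    Each qualifying variant adds 1 to the score (equal weighting).
    Returns {gene: [score for each line, in the order given by `lines`]}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for gene, line in variant_calls:
        counts[gene][line] += 1
    return {
        gene: [per_line.get(line, 0) for line in lines]
        for gene, per_line in counts.items()
    }


def ols_slope_and_t(x, y):
    """Simple linear regression of y on x; returns (slope, t statistic)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    standard_error = sqrt(sse / (n - 2) / sxx)
    return slope, slope / standard_error


# Hypothetical toy data: three different singletons hit geneA in three lines.
lines = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"]
phenotype = [0.60, 0.62, 0.61, 0.70, 0.71, 0.69, 0.70, 0.72]
calls = [("geneA", "m1"), ("geneA", "m2"), ("geneA", "m3"), ("geneB", "m5")]

scores = burden_scores(calls, lines)
slope, t_stat = ols_slope_and_t(scores["geneA"], phenotype)
```

Each singleton in `geneA` would be carried by one line in eight if tested alone, whereas the collapsed score is non-zero in three lines, which is the gain in frequency that the burden test exploits.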
To investigate whether burden tests can be applied to the Cadenza population, we measured the fatty acid composition of lipids in individual M4 grains (caryopses) from 1188 exome-sequenced lines using gas chromatography and calculated the proportion of unsaturated fatty acids that are polyunsaturated (ω-6 desaturation efficiency or ω-6DE), which is a simple adaptive metabolic trait (Menard et al., 2017) and a determinant of edible oil quality (Hajiahmadi et al., 2020). As summarized in Figure 1a, we extracted a list of putative deleterious mutations in the M2 population (Krasileva et al., 2017) using BioMart within Ensembl Plants (https://plants.ensembl.org/biomart/martview) and collapsed them by gene and by homoeologous group (triad) (Ramírez-González et al., 2018). These mutations were given equal weight and include stop codon gained, start codon lost, splice donor and acceptor variants, and non-synonymous mutations with a SIFT (Sorting Intolerant From Tolerant) score <0.05 (Kumar et al., 2009). We then performed gene- and triad-based burden tests using a single-locus compressed mixed linear model (CMLM) implemented in GAPIT (Genome Association and Prediction Integrated Tool) (Lipka et al., 2012).
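The trait calculation and the variant selection described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used here: the consequence labels are assumed to follow Sequence Ontology terms, fatty acids are assumed to be named in carbon:double-bond shorthand, and the gene, triad and line identifiers are hypothetical.

```python
from collections import defaultdict

# Assumed consequence labels (Sequence Ontology style); the exact strings in a
# BioMart export may differ.
HIGH_IMPACT = {
    "stop_gained",
    "start_lost",
    "splice_donor_variant",
    "splice_acceptor_variant",
}
SIFT_CUTOFF = 0.05


def is_qualifying(consequence, sift=None):
    """True for high-impact variants or missense variants with SIFT < 0.05."""
    if consequence in HIGH_IMPACT:
        return True
    return (
        consequence == "missense_variant"
        and sift is not None
        and sift < SIFT_CUTOFF
    )


def desaturation_efficiency(composition):
    """Proportion of unsaturated fatty acids that are polyunsaturated.

    `composition` maps shorthand names such as "18:2" to relative amounts.
    """
    mono = poly = 0.0
    for shorthand, amount in composition.items():
        double_bonds = int(shorthand.split(":")[1])
        if double_bonds == 1:
            mono += amount
        elif double_bonds >= 2:
            poly += amount
    return poly / (mono + poly)


def collapse(mutations, gene_to_triad):
    """Count qualifying mutations per line, by gene and by triad (equal weight)."""
    by_gene = defaultdict(lambda: defaultdict(int))
    by_triad = defaultdict(lambda: defaultdict(int))
    for m in mutations:
        if not is_qualifying(m["consequence"], m.get("sift")):
            continue
        by_gene[m["gene"]][m["line"]] += 1
        triad = gene_to_triad.get(m["gene"])
        if triad is not None:
            by_triad[triad][m["line"]] += 1
    return (
        {g: dict(v) for g, v in by_gene.items()},
        {t: dict(v) for t, v in by_triad.items()},
    )


# Hypothetical toy input: three homoeologues of one gene forming one triad.
toy_mutations = [
    {"line": "line1", "gene": "geneX-A", "consequence": "stop_gained", "sift": None},
    {"line": "line2", "gene": "geneX-B", "consequence": "missense_variant", "sift": 0.01},
    {"line": "line3", "gene": "geneX-D", "consequence": "missense_variant", "sift": 0.40},
    {"line": "line2", "gene": "geneX-D", "consequence": "splice_donor_variant", "sift": None},
]
toy_triads = {"geneX-A": "triadX", "geneX-B": "triadX", "geneX-D": "triadX"}
gene_burden, triad_burden = collapse(toy_mutations, toy_triads)

toy_grain = {"16:0": 20.0, "18:0": 2.0, "18:1": 15.0, "18:2": 58.0, "18:3": 5.0}
toy_trait = desaturation_efficiency(toy_grain)
```

In the toy input, the tolerated missense variant in `line3` is dropped, and `line2` accumulates a triad score from mutations in two different homoeologues, which is how triad-level collapsing pools evidence across gene copies.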
TaFAD2 and TaROD1 transcripts are average length for wheat (~1.6 and ~1.5 kb), encoding proteins of ~390 and ~300 amino acid residues, respectively. The 1188 M4 lines that we screened contained 22-24 putative deleterious mutations in each TaFAD2 gene, and 6-9 in each TaROD1 gene, when the M2 generation was exome sequenced (Krasileva et al., 2017). To confirm that disruption of the TaFAD2 genes causes a reduction in ω-6DE, we selected two independent lines with mutations in each gene that had low ω-6DE in our screen (Figure 1e). We backcrossed them to wildtype and identified five homozygous and five wildtype segregant BC1F2 plants using KASP (kompetitive allele specific PCR) assays and further confirmed their genotype by DNA sequencing (Krasileva et al., 2017). We then analysed the fatty acid composition of their BC1F3 grains and found that ω-6DE is significantly (P < 0.05) lower in all the homozygous TaFAD2 mutants (M) versus wildtype (WT) segregants (Figure 1e). The decrease in ω-6DE is small (<9%), but owing to the high broad-sense heritability of the trait (H² ~0.9), the effect size is very large (Cohen's d > 0.8).
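A standardized effect size of this kind can be computed as in the short Python sketch below. It assumes the common pooled-standard-deviation form of Cohen's d, which may not be the exact variant used here, and the trait values are hypothetical numbers chosen only to show that a small absolute difference gives a large d when within-group variation is low.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation of two groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical trait values for five wildtype and five mutant segregants.
wildtype_values = [0.80, 0.81, 0.79, 0.80, 0.82]
mutant_values = [0.75, 0.76, 0.74, 0.75, 0.77]
effect_size = cohens_d(wildtype_values, mutant_values)
```

With these made-up values the group means differ by only 0.05, yet the low spread within each group yields a d far above the 0.8 threshold.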
In conclusion, we show that gene- and homoeologous group-based burden tests can identify causal genes for a simple metabolic trait in an exome-sequenced polyploid mutant population. Many rare-variant association analysis methods have been developed and may be applicable, including burden tests with more sophisticated weighting, variance-component tests and combined tests (Lee et al., 2014). We have collapsed point mutations in the Cadenza population, but deletions are also present (Krasileva et al., 2017) and could be included. The gene redundancy that exists in polyploid mutant populations likely provides a trade-off between power and effect size when applying burden tests. Redundancy allows polyploids to tolerate high mutant loads (Krasileva et al., 2017), providing smaller populations with more collapsible variants per gene (and homoeologous group). However, redundancy also hides the phenotypic effects of variants (Krasileva et al., 2017). It is intuitive that more heritable traits that are controlled by fewer (and larger) genes will likely be more amenable to genetic dissection using burden tests. Mutant populations of tetraploid wheat (Krasileva et al., 2017) and many other polyploid crops such as oilseed rape (Brassica napus) and false flax (Camelina sativa) might also be amenable to burden tests.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article.