Identification of universal grass genes and estimates of their monocot-/commelinid-/grass-specificity
The evolutionary success of grasses stems from resilience and fast growth in open habitats, characteristics that led grasses to underpin agriculture and that are attributable to many grass-specific traits. Genes responsible for these traits are likely to be specific to grasses, highly conserved, and present in all grasses (universal genes), because they perform functions essential for fitness. A bioinformatics pipeline was developed to identify such genes using 16 complete grass genomes in Ensembl Plants release 56. The first steps used existing gene models to generate groups of grass orthologs of rice and maize genes present in most grass species, then refined the membership of these groups to optimise the Hidden Markov Model (HMM) profile score from the HMMER package. These groups were then supplemented with new gene models found in grass genomes using the genBlastG tool; this step more than doubled the number of universal groups, giving 12,855 highly conserved, universal groups. Specificity for these groups was assessed using the closest matching gene models from non-monocot species. Possible cut-off values were tested with sets of known genes expected to have either a function common to all plants or a commelinid- / grass-specific function. A specificity metric based on the HMM score from grass group profiles performed better than percentage identity at discriminating between these common-function and specific-function test sets. Using an appropriate cut-off for this metric, 5,701 of the groups were identified as monocot- / commelinid- / grass-specific, of which 72% appeared to be grass-specific. These results comprise the universal_grass_peps database, available at doi.org/10.23637/rothamsted.98ywz. Researchers can search this database to determine whether their experimentally identified grass genes match universal groups and, for those that do, to obtain systematic estimates of monocot- / commelinid- / grass-specificity.
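The abstract does not define the specificity metric or its cut-off, so the following is only an illustrative sketch of how a score-based metric of this general kind could work. It assumes, purely for illustration, that each group has HMM bit scores for its grass members and one bit score for the closest outgroup match against the same profile. The function names, the formula (one minus the ratio of outgroup score to median grass score) and the cut-off of 0.5 are all hypothetical and are not taken from the article.

```python
from statistics import median


def specificity(grass_scores, outgroup_score):
    """Hypothetical specificity metric in the range [0, 1].

    grass_scores: HMM bit scores of the grass members of a group against
        the group's own profile.
    outgroup_score: bit score of the closest outgroup match against the
        same profile, or None if no match was found.

    This is an assumed formula for illustration, not the article's metric.
    """
    reference = median(grass_scores)
    if reference <= 0:
        raise ValueError("grass scores must have a positive median")
    if outgroup_score is None:
        # No outgroup match at all: treat the group as fully specific.
        return 1.0
    ratio = max(outgroup_score, 0.0) / reference
    # Clamp so that an outgroup hit scoring above the grass median gives 0.
    return max(0.0, 1.0 - ratio)


def is_specific(grass_scores, outgroup_score, cutoff=0.5):
    """Classify a group as specific when the metric reaches the cut-off.

    The default cut-off of 0.5 is an arbitrary placeholder.
    """
    return specificity(grass_scores, outgroup_score) >= cutoff


if __name__ == "__main__":
    # Made-up scores for two imaginary groups.
    print(is_specific([400, 420, 380], 100))  # weak outgroup match
    print(is_specific([400, 420, 380], 390))  # strong outgroup match
```

In a sketch like this, a cut-off would be chosen as the abstract describes: by checking how well candidate values separate test sets of genes with known common or specific function.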
| Item Type | Article |
|---|---|
| Open Access | Gold |
| Keywords | Monocot, Grass evolution, Gene model, Functional orthologs, Genomics |
| Project | Xylan arabinosyl transferases: identification and characterisation of their role in determining properties of grass cell walls, Designing Future Wheat (DFW) [ISPG] |
| Date Deposited | 05 Dec 2025 10:41 |
| Last Modified | 19 Dec 2025 14:57 |


