A - Papers appearing in refereed journals
Mitchell, R. A. C. 2024. Identification of universal grass genes and estimates of their monocot-/ commelinid-/ grass-specificity. Bioinformatics Advances. p. vbaf079. https://doi.org/10.1093/bioadv/vbaf079
Authors | Mitchell, R. A. C. |
---|---|
Abstract | The evolutionary success of grasses is due to resilience and fast growth in open habitats, characteristics that led to their underpinning of agriculture, and is attributable to many grass-specific traits. Genes responsible for these traits are likely to be specific to grasses, highly conserved and present in all grasses (universal genes), as they perform functions essential for fitness. A bioinformatics pipeline was developed to identify such genes using 16 full grass genomes in Ensembl Plants release 56. The first steps used existing gene models to generate groups of grass orthologs to rice and maize genes present in most grass species, and refined the membership of these groups so as to optimise the Hidden Markov Model (HMM) profile score from the HMMER package. These groups were then supplemented with new gene models found in grass genomes using the genBlastG tool; this step increased the number of universal groups by >2-fold to give 12,855 highly conserved, universal groups. Specificity for these groups was assessed using the closest matching gene models from non-monocot species. Possible cut-off values were tested with sets of known genes expected to be either of common function for all plants, or of commelinid- / grass-specific function. A specificity metric based on HMM score from grass group profiles performed better than % identity as a means of discriminating between these common and specific function test sets. Using an appropriate cut-off for this metric, 5,701 of the groups were identified as monocot- / commelinid- / grass-specific, of which 72% appeared to be grass-specific. These results comprise the universal_grass_peps database, available at https://doi.org/10.23637/rothamsted.98ywz. Researchers can search this database to determine whether their experimentally identified grass genes match universal groups and, for those that do, to obtain systematic estimates of monocot- / commelinid- / grass-specificity. |
Keywords | Monocot; Grass evolution; Gene model; Functional orthologs; Genomics |
Year of Publication | 2024 |
Journal | Bioinformatics Advances |
Journal citation | p. vbaf079 |
Digital Object Identifier (DOI) | https://doi.org/10.1093/bioadv/vbaf079 |
Open access | Published as ‘gold’ (paid) open access |
Funder | Biotechnology and Biological Sciences Research Council |
Funder project or code | Xylan arabinosyl transferases: identification and characterisation of their role in determining properties of grass cell walls |
Funder project or code | Designing Future Wheat (DFW) [ISPG] |
Output status | Published |
Publication dates | |
Online | 07 Apr 2025 |
Publication process dates | |
Accepted | 04 Apr 2025 |
ISSN | 1367-4803 |
Publisher | Oxford University Press (OUP) |
Permalink - https://repository.rothamsted.ac.uk/item/99007/identification-of-universal-grass-genes-and-estimates-of-their-monocot-commelinid-grass-specificity
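
The cut-off testing described in the abstract (choosing a threshold on a specificity metric that separates a "common function" test set from a "specific function" test set) can be illustrated with a minimal sketch. The abstract does not give the metric's formula, so the ratio used here (best non-monocot HMM score divided by the median within-group grass score), the function names, and the toy values are all illustrative assumptions and not the paper's method.

```python
from statistics import median


def specificity_ratio(grass_scores, best_outgroup_score):
    """Hypothetical metric (an assumption, not the published formula):
    best non-monocot HMM score relative to the median grass score for a group.
    Lower values suggest the group lacks a close non-monocot match."""
    return best_outgroup_score / median(grass_scores)


def best_cutoff(common_ratios, specific_ratios):
    """Return (cutoff, balanced_accuracy) for the cut-off that best separates
    the two test sets. Groups with ratio < cutoff are called 'specific'.
    On ties, the lowest candidate cut-off is kept."""
    candidates = sorted(set(common_ratios) | set(specific_ratios))
    best = None
    for cutoff in candidates:
        # Fraction of known-specific groups correctly called specific.
        tpr = sum(r < cutoff for r in specific_ratios) / len(specific_ratios)
        # Fraction of known-common groups correctly called common.
        tnr = sum(r >= cutoff for r in common_ratios) / len(common_ratios)
        accuracy = (tpr + tnr) / 2
        if best is None or accuracy > best[1]:
            best = (cutoff, accuracy)
    return best


# Toy ratios for the two test sets (made-up values for illustration only).
common = [0.9, 0.8, 0.75, 0.6]
specific = [0.2, 0.3, 0.5, 0.65]
cutoff, accuracy = best_cutoff(common, specific)
```

Balanced accuracy is used here only because it treats the two test sets equally regardless of their sizes; any other discrimination measure could be substituted in the same loop.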