universal_grass_peps
This dataset is the output of a bioinformatics pipeline, developed by Rowan Mitchell during 2018-2024, that seeks to identify all universal protein-coding genes in grasses and to estimate how specific they are to grasses. The dataset has five components: (1) universal_grass_peps.xlsx contains summary information on all the universal groups of peps (peptides) identified. (2) Files in genBlastG/* are genome annotation files for each novel gene model generated by genBlastG in the pipeline. (3) Files hmms/*.msa.fa are multiple sequence alignment FASTA files, one for each group. (4) Files hmms/final_db.hmms* are for searching the database with query sequences using the HMMER package. (5) Files in lookup/* allow users to find which groups a grass query pep ID is a member of, or associated with, for 16 different grass species.
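
As an illustration of component (4), the following is a minimal sketch of how a query protein sequence might be searched against the group HMMs with HMMER's `hmmscan` from Python. It rests on several assumptions that the dataset description does not confirm: that HMMER 3 is installed and on the `PATH`, that the `hmms/final_db.hmms*` files form a pressed HMM database whose base name is `hmms/final_db.hmms`, and that each HMM is named after its group. The file names `query.fa` and `hits.tbl`, the E-value cutoff, and the helper function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_hmmscan_command(hmm_db, query_fasta, tblout, evalue=1e-5):
    """Build an hmmscan command line (HMMER 3).

    hmm_db is ASSUMED to be the base name of a pressed HMM database,
    e.g. hmms/final_db.hmms; the E-value cutoff is an arbitrary example.
    """
    return [
        "hmmscan",
        "--tblout", str(tblout),  # per-target tabular summary
        "-E", str(evalue),        # reporting E-value threshold
        str(hmm_db),
        str(query_fasta),
    ]


def parse_tblout(text):
    """Parse hmmscan --tblout text into a list of hit dictionaries.

    Comment lines start with '#'. In hmmscan output, column 1 is the
    target HMM name (ASSUMED here to identify a group), column 3 is the
    query sequence name, and columns 5 and 6 are the full-sequence
    E-value and bit score.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(maxsplit=18)
        hits.append(
            {
                "group": fields[0],
                "query": fields[2],
                "evalue": float(fields[4]),
                "score": float(fields[5]),
            }
        )
    return hits


def search_groups(hmm_db, query_fasta, tblout="hits.tbl"):
    """Run hmmscan and return the parsed hits (requires HMMER installed)."""
    cmd = build_hmmscan_command(hmm_db, query_fasta, tblout)
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return parse_tblout(Path(tblout).read_text())
```

With HMMER available, a call such as `search_groups("hmms/final_db.hmms", "query.fa")` would return one dictionary per reported group hit. The parsing is kept separate from the subprocess call so that it can be checked on saved `--tblout` output without rerunning the search.
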
| Item Type | Data Collection |
|---|---|
| Creators | Mitchell, Rowan |
| Keywords | genomic features; data analysis; networks |
| Project | Xylan arabinosyl transferases: identification and characterisation of their role in determining properties of grass cell walls, Designing Future Wheat (DFW) [ISPG] |
| Date | 5 December 2023 |
| Data collection method | The genomic sequences that were used in the pipeline were taken from Ensembl Plants release 56 (February 2023). https://feb2023-plants.ensembl.org/index.html |
| Publisher | Rothamsted Research |
| Digital Object Identifier (DOI) | 10.23637/rothamsted.98ywz |

Files

| File | Name | Type | Size | Licence |
|---|---|---|---|---|
| description | folder_info | text/plain | 1MB | Creative Commons: Attribution 4.0 |
| folder_zip | folder_info | application/x-gzip | 49MB | Creative Commons: Attribution 4.0 |

