The distinct roles of genome, methylation, transcription, and translation on protein expression in Arabidopsis thaliana resolve the Central Dogma’s information flow
Background: We investigate the flow of genetic information from DNA to RNA to protein, as described by the Central Dogma of molecular biology, to determine the impact of intermediate genomic levels on plant protein expression.

Results: We perform genomic profiling of rosette leaves in two Arabidopsis accessions, Col-0 and Can-0, and assemble their genomes using long reads and chromatin interaction data. We measure gene and protein expression in biological replicates grown in a controlled environment, and also measure CpG methylation, ribosome-associated transcript levels, and tRNA abundance. Each omic level is highly reproducible between biological replicates and between accessions despite their ~1% sequence divergence; the single best predictor of any level in one accession is the corresponding level in the other. Within each accession, gene codon frequencies accurately model both mRNA and protein expression. The effects of a codon on mRNA and protein expression are highly correlated but independent of genome-wide codon frequencies or tRNA levels, which instead match genome-wide amino acid frequencies. Ribosome-associated transcripts closely track mRNA levels.

Conclusions: DNA codon frequencies and mRNA expression levels are the main predictors of protein abundance. In the absence of environmental perturbation, neither gene-body methylation, tRNA abundance, nor ribosome-associated transcript levels add appreciable information. The impact of constitutive gene-body methylation is mostly explained by gene codon composition. tRNA abundance tracks overall amino acid demand. However, genetic differences between accessions associate with differential gene-body methylation by inflating differential expression variation. Our data show that the dogma holds only if both sequence and abundance information in mRNA are considered.
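
To make the notion of per-gene codon frequencies concrete, the following is a minimal sketch of how such a feature vector could be computed from a coding sequence. The function name `codon_frequencies` and the toy sequence are hypothetical illustrations, not the authors' pipeline; the abstract does not specify how frequencies were derived or which model was fitted to them.

```python
from collections import Counter


def codon_frequencies(cds: str) -> dict[str, float]:
    """Return the relative frequency of each codon in a coding sequence.

    Assumption: `cds` is an in-frame DNA coding sequence; any trailing
    bases that do not complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # drop an incomplete final codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


# Toy example (hypothetical sequence): ATG, GCT, GCT, TAA
freqs = codon_frequencies("ATGGCTGCTTAA")
print(freqs)  # {'ATG': 0.25, 'GCT': 0.5, 'TAA': 0.25}
```

Vectors of this kind, one per gene, could then serve as predictors of mRNA or protein abundance in a regression model of the reader's choice.
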
| Item Type | Article |
|---|---|
| Open Access | Gold |
| Additional information | This work was funded by BBSRC grants BB/T002182/1, BB/X017877/1, BB/W019620/1, and by BB/W510543/1—21ROMITIGATIONFUND Rothamsted. Funding was also provided by the Max Planck Society and the European Research Council under the European Union’s Horizon 2020 Research and Innovation Programme (ERC Starting Grant number 803825-TransTempoFold). |
| Keywords | Gene-body methylation, Mim-tRNAseq, RNAseq, Ribosome-associated expression, Gene expression, Protein expression, Data-independent acquisition, Genome assembly, Chromatin interaction, Long reads, Central Dogma |
| Project | What determines protein abundance in plants?, FTMA4 - Training and mobility to support multi-omic analysis, 21ROMITIGATIONFUND, BB/W019620/1 |
| Date Deposited | 05 Dec 2025 10:46 |
| Last Modified | 19 Dec 2025 14:58 |


