High-quality, haplotype-phased de novo assembly of the highly heterozygous fig genome, a major genetic resource for fig breeding

C1 - Edited contributions to conferences/learned societies

Usai, G., Mascagni, F., Giordani, T., Vangelisti, A., Bosi, E., Zuccolo, A., Ceccarelli, M., King, R., Hassani-Pak, K., Solorzano, L. S., Cavallini, A. and Natali, L. 2021. High-quality, haplotype-phased de novo assembly of the highly heterozygous fig genome, a major genetic resource for fig breeding. International Society for Horticultural Science (ISHS). https://doi.org/10.17660/ActaHortic.2021.1310.4

Authors: Usai, G., Mascagni, F., Giordani, T., Vangelisti, A., Bosi, E., Zuccolo, A., Ceccarelli, M., King, R., Hassani-Pak, K., Solorzano, L. S., Cavallini, A. and Natali, L.
Abstract

The genome assembly of allogamous perennial species is often challenging because of their high heterozygosity and repeat content. In fruit trees, many important phenotypic traits of a specific genotype lie in its heterozygosity, which is maintained by widespread clonal propagation. The fig tree (Ficus carica L.) has great potential for expansion thanks to its valuable nutritional and nutraceutical characteristics, combined with its ability to adapt well to marginal soils and difficult environmental conditions. However, the fig is still poorly characterized at the genomic level, and only a preliminary genome sequence (of the Japanese cultivar 'Horaishi') has been released. Here we report a high-quality de novo assembly of the typical Italian fig cultivar 'Dottato', obtained by single-molecule real-time (SMRT) sequencing. PacBio reads (with an average length of 12,364 nt and corresponding to about 74 genome equivalents) allowed us to achieve sequence contiguity and resolve the repetitive component of the genome. The assembly, of approximately 333 Mb with an N50 of 823 kb, was haplotype-phased using FALCON-Unzip and is composed of 905 sequences, of which 407 were arranged in 13 chromosome-related pseudomolecules. This new reference genome improves on the N50 of the previous short-read-based fig assembly by about 5-fold. A curated genome annotation analysis identified 37,840 protein-coding genes and 1,685 non-coding genes. Furthermore, we found that repetitive sequences accounted for 37.39% of the assembly. The production of a high-quality, haplotype-phased reference genome sequence of fig offers interesting insights into the genomic structure of this species, opening great opportunities for speeding up the development of new cultivars and for applying genome editing to this species, a technology that seems especially suitable for changing the specific traits that currently limit the success of this ancient species.
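The abstract reports contiguity as an N50 value and read depth as genome equivalents. As a minimal sketch of how these two statistics are conventionally computed (not the authors' pipeline), the Python below uses hypothetical function names and toy input values that are illustrative assumptions, not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def genome_equivalents(read_count, mean_read_length, genome_size):
    """Return sequencing depth as total read bases divided by genome size."""
    return read_count * mean_read_length / genome_size


# Toy values (hypothetical, not taken from the fig assembly).
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))
print(genome_equivalents(1_000, 10_000, 1_000_000))
```

With real data, the lengths would typically be read from the assembly FASTA index, and the genome size would be an independent estimate rather than the assembly length.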

Keywords: Ficus carica; Genome annotation; Genome assembly; Single-molecule real-time sequencing
Year of Publication: 2021
Digital Object Identifier (DOI): https://doi.org/10.17660/ActaHortic.2021.1310.4
Open access: Published as non-open access
Journal: Acta Horticulturae
Journal citation: (1310), pp. 21-28
Publisher: International Society for Horticultural Science (ISHS)
Funder: Biotechnology and Biological Sciences Research Council
Output status: Published
Publication dates
Online: Sep 2021
ISSN: 2406-6168

Permalink - https://repository.rothamsted.ac.uk/item/987q8/high-quality-haplotype-phased-de-novo-assembly-of-the-highly-heterozygous-fig-genome-a-major-genetic-resource-for-fig-breeding

Restricted files

Publisher's version

Under embargo indefinitely
