Aquaculture Europe 2023

September 18 - 21, 2023

Vienna, Austria

Presentation slot: 19/09/2023, 10:30 - 10:45 (Europe/Vienna), Stolz 0

FUNCTIONAL ANNOTATION AND COMPARATIVE ANALYSIS OF THE DUPLICATED GENOMES OF SALMONID FISHES

D.J. Macqueen *1, D. Perojil Morata 1, M.-O. Baudement 2, P. Dewari 1, G. Gillard 2, D. Baranasic 3,4, M.K. Gundappa 1, T. Podgorniak 2, L. Grønvold 2, A. Laurent 5, A. Perquis 5, R. Ruiz Daniels 1, G. Ilsley 6, P. Harrison 6, D. Thybert 6, J. Bobe 5, C. Guyomar 5, T. Desvignes 7, C. Berthelot 8,9, E. Parey 5,8, A. Louis 8, F. Giudicelli 8, H. Roest Crollius 7, T. Hvidsten 2,10, S. Sandve 2, B. Lenhard 3,4, Y. Guiguen 5, M. Kent 2, S. Lien 2

 

1 The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, UK.

2 Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Centre for Integrative Genetics (CIGENE), Norwegian University of Life Sciences, Ås, Norway.

3 MRC London Institute of Medical Sciences, London, UK.

4 Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus, London, UK.

5 INRAE, LPGP, Rennes, France.

6 European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

7 1

8 Institut de Biologie de l’ENS (IBENS), Département de Biologie, École Normale Supérieure, CNRS, INSERM, Université PSL, Paris, France.

9 Institut Pasteur, Université Paris Cité, CNRS UMR 3525, INSERM UA12, Comparative Functional Genomics group, F-75015 Paris, France.

10 Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway.

 

* Email: daniel.macqueen@roslin.ed.ac.uk

 



Introduction

High-quality functional annotation of genomes underpins fundamental and applied genomics studies in all species. The genome sequences of aquatic species, however, typically have limited functional annotation datasets publicly available. Several projects within the international ‘Functional Annotation of Animal Genomes’ (FAANG) initiative (Giuffra et al. 2019) seek to enhance the quality of functional annotations in diverse farmed animal taxa. AQUA-FAANG (https://www.aqua-faang.eu/) has generated extensive functional annotation data for six commercially important finfish species, including Atlantic salmon (Salmo salar) and rainbow trout (Oncorhynchus mykiss). The comparative nature of the AQUA-FAANG datasets provides scope to explore the evolutionary conservation of fish-specific genomic features controlling gene expression phenotypes. The aim of this study was to comprehensively map expressed genes and regulatory elements across the genomes of both salmonid species, including their evolutionary conservation following the salmonid-specific whole genome duplication (ssWGD) event, which occurred 100 MYA (Lien et al. 2016; Gundappa et al. 2022).

Methods

For Atlantic salmon and rainbow trout, we generated matched functional genomics data for samples spanning ontogeny, including 14 stages of embryogenesis and a common panel of tissues (liver, brain, gill, intestine, muscle, head kidney, ovary and testis) from males and females at sexually immature and mature stages. The data were generated using standardized methods and represent approximately 1,000 sequencing libraries, including 130 mRNA-Seq, 70 small RNA-Seq, 80 ATAC-Seq and 270 ChIP-Seq (H3K27ac, H3K4me1, H3K4me3 and H3K27me3 histone marks) datasets. Primary analysis of the mRNA-Seq, ATAC-Seq and ChIP-Seq data was done using nf-core pipelines (Ewels et al. 2020). Small RNA data were annotated using the fish microRNA database (Desvignes et al. 2022). All data were submitted to the FAANG data portal (Harrison et al. 2021) and are being used to update gene annotations and produce the first regulatory annotations for any teleost species in the Ensembl Genome Browser (ATAC-Seq data for both species publicly available as of release 109). We generated novel comparative tools to navigate this new dataset, including a homology prediction approach integrating phylogenetic and synteny criteria to capture > 10,000 high-confidence duplicate gene pairs per species, in addition to a multispecies genome alignment including duplicated regions from ssWGD. ChromHMM (Ernst and Kellis 2012) was used to model chromatin states in embryonic stages and adult tissues by combining ATAC-Seq and ChIP-Seq data, separately for each species. We clustered and visualized gene expression and open chromatin regions across samples using self-organizing maps (SOM) and Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction. Enrichment of transcription factor (TF) binding within regulatory elements was assessed using GimmeMotifs (van Heeringen and Veenstra 2011).
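To illustrate the SOM clustering step described above, the following is a minimal sketch of a one-dimensional self-organizing map in plain Python. It is not the pipeline used in the study: the function names (train_som, best_unit), the parameter values and the toy expression profiles are all assumptions made for illustration, and a real analysis would use a dedicated SOM library on normalized expression or accessibility matrices.

```python
import math
import random


def best_unit(units, x):
    """Return the index of the SOM unit closest to profile x (squared Euclidean distance)."""
    return min(
        range(len(units)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(units[j], x)),
    )


def train_som(profiles, n_units=2, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM on equal-length profiles (hypothetical helper, toy settings)."""
    rng = random.Random(seed)
    dim = len(profiles[0])
    # Initialise units from evenly spaced input profiles (deterministic start).
    units = [list(profiles[(j * len(profiles)) // n_units]) for j in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)  # learning rate decays linearly
        radius = max(1e-3, (n_units / 2) * (1.0 - frac))  # neighbourhood shrinks
        order = list(range(len(profiles)))
        rng.shuffle(order)
        for i in order:
            x = profiles[i]
            bmu = best_unit(units, x)
            for j, w in enumerate(units):
                # Gaussian neighbourhood on the 1-D grid of units.
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return units


# Made-up expression profiles (rows = genes, columns = 4 samples).
toy_profiles = [
    [1.0, 0.9, 0.1, 0.0],  # expressed in the first two samples
    [0.9, 1.0, 0.0, 0.1],
    [0.0, 0.1, 0.9, 1.0],  # expressed in the last two samples
    [0.1, 0.0, 1.0, 0.9],
]
som_units = train_som(toy_profiles, n_units=2)
assignments = [best_unit(som_units, p) for p in toy_profiles]
```

Each gene is assigned to the unit whose learned profile it most resembles, so genes with similar expression patterns across samples end up in the same module.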

Results & Discussion

SOM clustering and UMAP identified modules of genes showing specific expression at different stages of embryogenesis in both species, for example genes upregulated during the maternal-to-zygotic transition. In adult tissues, we identified tissue-specific, stage-specific and sex-specific clusters of gene expression. ChromHMM identified several distinct chromatin states, including active and bivalent promoters; active, primed and poised enhancers; polycomb-repressed regions; and unmarked open chromatin. We observed the expected enrichment of predicted TF binding in promoter and enhancer elements according to sample type, e.g. Sox2/Oct in early embryo stages, Hox in mid-embryogenesis, DMRT3 in testis, HNF in liver and NeuroD in brain, among many others.
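As a rough illustration of what enrichment of predicted TF binding in a set of regulatory elements can mean statistically, the sketch below computes a one-sided hypergeometric p-value. This is a generic test chosen for illustration, not a description of how GimmeMotifs computes enrichment; the function name and the toy counts are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(n_total, n_with_motif, n_selected, n_selected_with_motif):
    """P(X >= n_selected_with_motif) when n_selected regions are drawn without
    replacement from n_total regions, of which n_with_motif carry the motif."""
    upper = min(n_with_motif, n_selected)
    denom = comb(n_total, n_selected)
    tail = sum(
        comb(n_with_motif, k) * comb(n_total - n_with_motif, n_selected - k)
        for k in range(n_selected_with_motif, upper + 1)
    )
    return tail / denom


# Toy counts (made up): 20 regions in total, 5 with the motif;
# 4 regions are specific to one tissue, 3 of which carry the motif.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
```

A small p-value indicates that the motif occurs in the selected regions more often than expected from its overall frequency.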

In Atlantic salmon, we identified 313,000 high-confidence open chromatin regions across sample types. SOM clustering and UMAP captured the dynamics of regulatory element activity across ontogeny and revealed regulatory elements specific to tissues, stages and sexes. In embryos, most stage-specific open chromatin regions were present at the end of segmentation, a stage where conserved non-coding elements were strongly overrepresented. We quantified the average divergence in expression of duplicated genes retained from ssWGD across sample types. We also quantified the conservation of chromatin accessibility and regulatory elements in duplicated and singleton (i.e. where one duplicated copy was lost) regions within the Atlantic salmon genome. These analyses revealed extensive variation in the evolutionary constraint acting on duplicated gene expression and regulation across ontogeny, broadly matching expectations of the hourglass model of development (Uesaka et al. 2022).
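The abstract does not state which metric was used to quantify expression divergence between duplicates. One simple possibility, sketched here under that caveat, is the mean absolute log2 expression ratio between the two copies of each pair, computed per sample type. The function name, the pseudocount, the gene identifiers and the expression values are all hypothetical.

```python
from math import log2
from statistics import mean


def duplicate_divergence(expr, pairs, pseudocount=1.0):
    """Mean |log2 ratio| between the two copies of each duplicate pair, per sample type.

    expr:  {gene: {sample_type: expression value}}
    pairs: list of (gene_a, gene_b) duplicate pairs
    """
    sample_types = sorted(expr[pairs[0][0]])
    result = {}
    for s in sample_types:
        ratios = [
            abs(log2((expr[a][s] + pseudocount) / (expr[b][s] + pseudocount)))
            for a, b in pairs
        ]
        result[s] = mean(ratios)
    return result


# Made-up expression values for two duplicate pairs in two sample types.
toy_expr = {
    "dupA1": {"liver": 7.0, "brain": 3.0},
    "dupA2": {"liver": 1.0, "brain": 3.0},
    "dupB1": {"liver": 3.0, "brain": 15.0},
    "dupB2": {"liver": 3.0, "brain": 0.0},
}
toy_pairs = [("dupA1", "dupA2"), ("dupB1", "dupB2")]
divergence = duplicate_divergence(toy_expr, toy_pairs)
```

Comparing such per-sample averages across embryonic stages is one way to ask whether duplicate expression is more constrained at some points in ontogeny than at others.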

As a next step, we will overlay our comprehensive annotations of regulatory elements with genetic variation, including single nucleotide polymorphisms and structural variations defined in the Atlantic salmon genome. This will enable the prioritization of candidate causal genetic variants responsible for gene expression phenotypes, with diverse future applications in support of sustainable and profitable aquaculture.

Acknowledgements: The AQUA-FAANG project received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 817923 (www.aqua-faang.eu).

References

Ernst J, Kellis M. (2012). Nat Methods. 9:215-216.

Ewels PA, Peltzer A, Fillinger S, et al. (2020). Nat Biotechnol. 38:276-278.

Desvignes T, Bardou P, Montfort J, et al. (2022). Mol Biol Evol. 39:msac004.

Giuffra E, Tuggle CK; FAANG Consortium. (2019). Annu Rev Anim Biosci. 7:65-88.

Gundappa MK, To TH, Grønvold L, et al. (2022). Mol Biol Evol. 39:msab310.

Harrison PW, Sokolov A, Nayak A, et al. (2021). Front Genet. 12:639238.

Lien S, Koop BF, Sandve SR, et al. (2016). Nature. 533:200-205.

Uesaka M, Kuratani S, Irie N. (2022). J Exp Zool B Mol Dev Evol. 338:76-86.

van Heeringen SJ, Veenstra GJ. (2011). Bioinformatics. 27:270-271.