Infectious diseases represent a major threat to farmed animals, with a large impact on animal health, welfare and production, and they compromise human food security. Genetic disease control strategies that focus on enhancing host response to infectious pathogens have gained attention in recent years. Disease resilience, the ability of an animal to cope with and survive an infectious disease, has emerged as a desirable breeding goal. However, resilience is a complex trait that involves various underlying mechanisms, including host resistance, endurance and infectivity, which are not easily disentangled.
In this study, we used whole-genome sequencing and imputation to genotype all the genetic variants, including structural variants, in a turbot breeding population challenged with the parasite Philasterides dicentrarchi, a ciliate that causes scuticociliatosis, resulting in high mortality in farmed flatfish. The design of this challenge allows the different components of resilience to be disentangled, and the goal of this study is to evaluate whether whole-genome sequencing combined with genome functional annotation might i) improve our ability to accurately measure these traits, ii) reveal their genetic architecture and iii) increase the accuracy of selection.
Materials and methods
The turbot population consisted of ~1,400 animals from 36 full-sib families, challenged with P. dicentrarchi using a complex semi-factorial design involving donor and recipient families (full design in Anacleto et al. 2019). This design enables the resilience trait to be decomposed into resistance (propensity to become infected), endurance (propensity to survive the infection) and infectivity (propensity to transmit the disease). For this project, we sequenced the whole genome of the 54 turbot parents of the challenged families on a NovaSeq 6000 with 150 bp paired-end reads. After quality control, filtered reads were aligned with BWA-MEM to the new turbot chromosome-level genome assembly (GCA_013347765.1). Single nucleotide polymorphisms (SNPs) were called with BCFtools and structural variants (SVs) with Smoove, following a custom pipeline. The genotypes of the 54 whole-genome sequenced (WGS) parents were used as the reference population to impute the 2b-RAD SNP genotypes of their offspring to whole-genome density using FImpute v.3. After quality control, the resulting imputed dataset was used to estimate genetic parameters of resilience traits with ASReml v.4.2, and to conduct a genome-wide association study (GWAS) for each trait using GCTA v.1.24.7. The functional annotation of the genome, including regulatory regions, obtained from the AQUA-FAANG project, was intersected with the genetic variants and used to test whether prioritizing annotated variants improved the accuracy of genomic selection using BayesRCO.
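The intersection of variants with functional annotation described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function name `annotate_variants`, the category labels and the coordinates are hypothetical, and the sketch assumes 1-based variant positions (VCF convention) and 0-based, half-open annotation intervals (BED convention). Each variant receives every category it overlaps, which is the kind of per-variant category assignment a model with overlapping annotation classes could take as input.

```python
from collections import defaultdict


def annotate_variants(variants, regions):
    """Assign annotation labels to variants by positional overlap.

    variants: iterable of (chrom, pos), pos is 1-based (VCF convention).
    regions:  iterable of (chrom, start, end, label), 0-based half-open
              intervals (BED convention).
    Returns a dict mapping (chrom, pos) to a sorted list of labels;
    variants overlapping no region get ["unannotated"].
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, label in regions:
        by_chrom[chrom].append((start, end, label))

    annotated = {}
    for chrom, pos in variants:
        # A 1-based position p lies inside a BED interval [start, end)
        # when start < p <= end.
        labels = {
            label
            for start, end, label in by_chrom.get(chrom, [])
            if start < pos <= end
        }
        annotated[(chrom, pos)] = sorted(labels) or ["unannotated"]
    return annotated


# Hypothetical toy data, for illustration only.
regions = [
    ("chr1", 100, 200, "promoter"),
    ("chr1", 150, 300, "enhancer"),
    ("chr2", 0, 50, "promoter"),
]
variants = [("chr1", 100), ("chr1", 101), ("chr1", 180), ("chr1", 300), ("chr3", 10)]
result = annotate_variants(variants, regions)
```

A linear scan per variant is adequate for a toy example; at whole-genome scale an interval tree or a dedicated tool for interval intersection would be the more practical choice.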
The average sequencing coverage across the 54 whole-genome sequenced samples was ~14× (± 8.6). After quality control, 2,825,587 SNPs and 8,722 high-confidence SVs remained for imputation. There were 10,301 SNPs in common between the WGS data and the 2b-RAD data of the offspring, which were used as anchors for imputation of the whole population to whole-genome genotypes. After imputation, 1,100,299 SNPs and 1,390 offspring passed filtering and quality control. The heritability estimate for the composite resilience trait, time to death, was 0.15 ± 0.04, similar to that previously reported, and GWAS using the imputed data indicated a polygenic genetic architecture for the trait, with no SNP surpassing the genome-wide significance threshold (Fig. 1).
Figure 1. Manhattan plot of the GWAS with GCTA software for the trait of resilience. The y-axis shows the −log10 of the P value and the x-axis the positions on the chromosomes. The red line is the 5% genome-wide significance threshold and the blue line is the 5% chromosome-wide significance threshold (Bonferroni correction).
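As a worked illustration of the Bonferroni thresholds in the caption, the following Python sketch converts a 5% significance level into a −log10(P) cut-off. It assumes the number of tests equals the number of SNPs analysed, which the abstract does not state explicitly: the genome-wide count is the 1,100,299 imputed SNPs reported in the text, whereas the per-chromosome count of 50,000 and the function name `bonferroni_threshold` are hypothetical. Under these assumptions the genome-wide threshold is about 7.34 on the −log10 scale.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the Bonferroni-corrected threshold on the -log10(P) scale."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return -math.log10(alpha / n_tests)


N_SNPS_GENOME = 1_100_299  # imputed SNPs passing QC, as reported in the text
N_SNPS_CHROM = 50_000      # hypothetical SNP count for a single chromosome

genome_wide = bonferroni_threshold(0.05, N_SNPS_GENOME)
chrom_wide = bonferroni_threshold(0.05, N_SNPS_CHROM)
print(f"genome-wide: {genome_wide:.2f}, chromosome-wide: {chrom_wide:.2f}")
```

Because each chromosome carries fewer SNPs than the whole genome, the chromosome-wide line always sits below the genome-wide line on the plot.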
We are now estimating the genetic parameters and performing GWAS for resistance, endurance and infectivity to understand the genetic basis of resilience to P. dicentrarchi in this turbot population. We will also assess the accuracy of genomic prediction using variants preselected with genome annotation information from the AQUA-FAANG project, with the aim of enriching the dataset with functional information and improving the predictive performance of the models. Incorporating biological knowledge into genomic prediction models can provide insights into the biological processes underlying disease-related traits and be useful for future functional studies.
survival in epidemics. Sci. Rep. 9, 1–12 (2019).
2. Martínez, P. et al. A genome-wide association study, supported by a new chromosome-level genome assembly, suggests sox2 as a main driver of the undifferentiatiated ZZ/ZW sex determination of turbot (Scophthalmus maximus). Genomics 113, 1705–1718 (2021).
3. Bertolotti, A. C. et al. The structural variation landscape in 492 Atlantic salmon genomes. (2020) doi:10.1038/s41467-020-18972-x.
4. Saura, M. et al. Disentangling genetic variation for resistance and endurance to scuticociliatosis in turbot using pedigree and genomic information. Front. Genet. 10, 539 (2019).