A genome assembly and transcriptome atlas of the inbred Babraham pig to illuminate porcine immunogenetic variation
The inbred Babraham pig is a valuable biomedical research model because of its high level of homozygosity, including at the major histocompatibility complex (MHC) loci and likely at other important immune-related gene complexes, which are generally highly diverse in outbred populations. Because inbred animals make it possible to control for this diversity, we sought to improve this resource by generating a long-read whole genome assembly and transcriptome atlas of a Babraham pig. The genome was assembled de novo from PacBio long reads and error-corrected with Illumina short reads. Assembled contigs were then mapped to the porcine reference assembly, Sscrofa11.1, to generate chromosome-level scaffolds. The resulting TPI_Babraham_pig_v1 assembly is nearly as contiguous as Sscrofa11.1, with a contig N50 of 34.95 Mb and a contig L50 of 23. The remaining sequence gaps generally result from poor assembly across large, highly repetitive regions such as the centromeres and tandemly duplicated gene families, including immune-related gene complexes, that often vary in gene content between haplotypes. We further confirm homozygosity across the Babraham MHC and characterize the allele content and tissue expression of several other immune-related gene complexes, including the antibody and T cell receptor loci, the natural killer complex, and the leukocyte receptor complex. The Babraham pig genome assembly provides an alternative, highly contiguous porcine genome assembly as a resource for the livestock genomics community. The assembly will also aid biomedical and veterinary research that uses this animal model, particularly where controlling for genetic variation is critical.
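For readers less familiar with the contiguity metrics quoted above, the following minimal Python sketch shows how contig N50 and L50 are conventionally computed from a list of contig lengths. N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length, and L50 is the number of contigs in that set. The function name `n50_l50` and the toy lengths are illustrative assumptions; they are not the TPI_Babraham_pig_v1 contig lengths, and this is not the tooling used to produce the reported values.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for an iterable of contig lengths (e.g. in bp)."""
    ordered = sorted(contig_lengths, reverse=True)  # longest contig first
    if not ordered:
        raise ValueError("need at least one contig length")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first contig that brings the running sum to at least half
        # of the assembly defines N50 (its length) and L50 (its rank).
        if running >= half_total:
            return length, count


# Hypothetical toy lengths, for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

Under this definition, a higher N50 together with a lower L50 indicates that fewer, longer contigs account for half of the assembly.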