US-AUSTRALIAN ACADEMIES JOINT WORKSHOP ON VERTEBRATE COMPARATIVE GENOMICS
Beckman Conference Centre, Irvine, California, 23-25 May 2007
Predicting genome structure from limited sequencing - the virtual sheep genome
by Brian Dalrymple
Dr Brian Dalrymple received his PhD in DNA Replication from the University of Leicester (UK) in 1982. He has since held research positions at the Biozentrum, University of Basel (Switzerland) and a number of divisions of CSIRO (Australia). He is currently a Senior Principal Research Scientist, Leader of the Computational and Systems Biology Stream and Leader of the Bioinformatics Group, CSIRO Livestock Industries. His current research interests include genome sequencing, assembly and annotation, in particular of the cattle and sheep genomes; utilisation of the sequence information by cattle and sheep researchers; non-coding RNA and gene expression; and systems biology, in particular for muscle structure and function.
The availability of a complete genome sequence underpins many of the tools used in modern genetics and genomics, such as high-throughput genotyping assays based on SNPs. Despite the availability of many fully, and even more partially, sequenced mammalian genomes, the genome sequences of many species of interest are not high on the sequencing priority list.
Here we provide an example of how to combine limited sequence from the organism of interest with the genome sequences of other mammals to create a virtual genome sequence. A high-coverage sheep BAC library was constructed and end-sequenced. By scaffolding the sheep BACs on the current cow, dog and human genome assemblies, around 50% of the ~200,000 BACs have been positioned on the human genome with both ends in a tail-to-tail configuration and between 10 and 500 kb apart. The use of three genomes substantially increased the number of BACs that could be positioned and hence the coverage of the human (and therefore sheep) genome in BAC-comparative genome contigs (BAC-CGCs).
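The placement criterion described above can be illustrated with a minimal sketch. It is not the pipeline used for the sheep BACs; the `EndHit` record and the `is_consistent_pair` function are hypothetical names, and the code assumes that a "tail to tail" pair means the two end-sequence hits lie on the same chromosome, on opposite strands, with the reads pointing towards each other. Only the 10 to 500 kb window comes from the text.

```python
from dataclasses import dataclass

# Span limits taken from the abstract: BAC ends must lie 10-500 kb apart.
MIN_SPAN = 10_000
MAX_SPAN = 500_000


@dataclass(frozen=True)
class EndHit:
    """Hypothetical record for one BAC end-sequence alignment."""
    chrom: str    # chromosome of the reference assembly
    start: int    # 0-based start of the hit
    end: int      # exclusive end of the hit
    strand: str   # '+' or '-'


def is_consistent_pair(a: EndHit, b: EndHit,
                       min_span: int = MIN_SPAN,
                       max_span: int = MAX_SPAN) -> bool:
    """Return True if the two end hits of one BAC form a plausible placement.

    Assumption: the leftmost hit must be on '+' and the rightmost on '-',
    so that both reads point into the cloned insert.
    """
    if a.chrom != b.chrom:
        return False
    left, right = sorted((a, b), key=lambda h: h.start)
    if (left.strand, right.strand) != ("+", "-"):
        return False
    span = right.end - left.start  # implied insert size on the reference
    return min_span <= span <= max_span
```

Applying such a test separately against each reference assembly and combining the results is one way a BAC that fails on one genome might still be positioned using another, which is the effect the text attributes to using three genomes.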
Using the sheep marker linkage map, the BAC-CGCs were oriented and ordered into our best guess of the structure of the sheep genome. Whilst many breakpoints and rearrangements could be positioned fairly accurately, the small number of markers on the sheep map meant that the location and orientation of many of the fragments were based on conserved synteny amongst vertebrate or mammalian genomes.
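As a rough illustration of ordering and orienting contigs from linkage markers, the sketch below assumes each contig carries a list of markers given as (offset within the contig in bp, sheep chromosome, linkage map position in cM). The function name, the data layout and the chromosome labels are hypothetical. Contigs with fewer than two informative markers get the orientation `?`; in the work described here such fragments were placed using conserved synteny, which this sketch does not model.

```python
from statistics import mean


def orient_and_order(contigs):
    """Order contigs along each sheep chromosome and infer their orientation.

    contigs: dict mapping contig name -> list of
             (offset_in_contig_bp, sheep_chrom, map_position_cM) tuples.
    Returns a list of (sheep_chrom, contig_name, orientation) tuples, where
    orientation is '+', '-' or '?' (not determinable from the markers).
    """
    placed = []
    for name, markers in contigs.items():
        if not markers:
            continue  # no linkage information for this contig
        chroms = {chrom for _, chrom, _ in markers}
        if len(chroms) != 1:
            continue  # markers disagree: a possible breakpoint, left unplaced
        ordered = sorted(markers)  # sort by offset within the contig
        first_cm, last_cm = ordered[0][2], ordered[-1][2]
        if last_cm > first_cm:
            orientation = "+"   # map position increases along the contig
        elif last_cm < first_cm:
            orientation = "-"   # map position decreases: contig is reversed
        else:
            orientation = "?"   # one marker, or markers at the same position
        centre = mean(cm for _, _, cm in markers)
        placed.append((chroms.pop(), centre, name, orientation))
    placed.sort()  # by chromosome label, then by mean map position
    return [(chrom, name, orient) for chrom, _, name, orient in placed]
```

Sorting by chromosome label is plain string order, so a real implementation would need a natural sort for numbered chromosomes.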
The resulting virtual sheep genome captures the annotation of the human, dog and cow genomes, ordered appropriately for the sheep research community. The virtual genome is an integral part of the development of other sheep genomics research tools, such as a sheep whole-genome SNP chip and the planned eventual sequencing of the complete genome.
Contact details:
CSIRO Livestock Industries
306 Carmody Road
St Lucia 4067
Tel (+61) 7 3214 2503
Email: brian.dalrymple@csiro.au
Web: www.livestockgenomics.csiro.au



