
Automated Analysis of Whole Genomes: Experiences and Lessons

Terry Gaasterland

Annotating thousands of open reading frames (ORFs) in a newly sequenced genome is a detailed process that involves assessing evidence of different types from many sources. We have built a system called MAGPIE to find coding regions, collect evidence to answer "Does it code?" and "For what?", and then facilitate confirmation and editing of annotations. Through MAGPIE, we have been exploring how to automate the annotation process, make decisions automatically, and update annotations regularly. We illustrate the steps and difficulties involved in annotation with the following cases:

- an ORF with "perfect" evidence, in which all tools agree
- an ORF with strong evidence that is sparse but clear
- an ORF with good but seemingly conflicting evidence
- an ORF with strong, partial evidence that covers half of the ORF
- an ORF with weak evidence

Finally, I will touch on other uses of MAGPIE, namely cross-genome comparisons, tracking random protein sequences, and comparing microbial pathogens to eukaryotes.

(This talk includes work done in collaboration with Christoph Sensen of the Institute for Marine Biosciences, Halifax, Nova Scotia.)
