On April 14, 2003, the National Human Genome Research Institute and its international partners, including the Baylor College of Medicine Human Genome Sequencing Center (BCM-HGSC), announced the completion of the Human Genome Project and the successful generation of a highly accurate and publicly available reference sequence of the human genome. The approximately 3 billion letters of ordered DNA sequence provided for the first time a human genetic blueprint, building a framework of knowledge for pursuing numerous new and exciting biological studies and eventually integrating them into diagnosis and therapies for human diseases.
To celebrate the anniversary of completion of this unprecedented project, carried out from 1990 to 2003 and considered one of the most ambitious and important scientific endeavors in human history, From the Labs sat with Dr. Richard Gibbs, director of the BCM-HGSC since its establishment in 1996, to learn about the role BCM has played in this landmark global scientific effort.
“We were engaged in the Human Genome Project from the very beginning,” Gibbs said. “Baylor has an appetite for genomics that began with the late Dr. C. Thomas Caskey, who built the genetics program at Baylor from the ground up. He founded the Department of Molecular and Human Genetics and promoted its growth into a national leader in the field.”
Back in 1985, Caskey had met Gibbs in Melbourne, Australia, and offered him a postdoctoral position in his Baylor laboratory. “Caskey provided an unparalleled opportunity and created an environment that fostered innovation and success,” Gibbs said. In the next several years, Gibbs developed various DNA sequencing methods and applied them to decode individual genes – the parts of the genome that code for the proteins that make up living cells.
Only a few genes had been sequenced with limited technologies, but the idea of sequencing the entire human genome had been brewing in the minds of scientists for some time. A special committee of the U.S. National Academy of Sciences outlined the original goals for the Human Genome Project in 1988, which included sequencing the entire human genome in addition to the genomes of several carefully selected non-human organisms.
The goal was to uncover information that would guide a new era for biomedical research in which detailed genetic information would inform medical diagnosis and treatments and lead to improved outcomes.
The national program began to warm up in 1994-’95. “All the time leading up to that we had been developing methods for DNA sequencing,” Gibbs said. “Beginning in 1993-’95, there were a series of competitions through which we were recognized as a leading sequencing center. We had begun to assemble a dream team. Things accelerated in 1996, when Donna Muzny and I formally declared our center the HGSC. Back in those days things were simpler; the center was born without pomp and ceremony but grew steadily with federal funding to fulfill the challenges of the program. We engaged many BCM faculty and brought on board many talented staff and associates. George Weinstock joined us from the University of Texas, so we were a truly multitalented organization.”
The International Human Genome Project team involved scientists from 20 institutions in six countries: France, Germany, Japan, China, the U.K. and the U.S. All of these countries played an important role in the project; however, five institutions generated more than 80% of the Human Genome Project data. Nicknamed the ‘G5,’ these were: Baylor College of Medicine in Houston; the Broad Institute/Whitehead Institute for Biomedical Research (MIT) in Cambridge, Massachusetts; the Department of Energy’s Joint Genome Institute in Walnut Creek, California; and Washington University in St. Louis, all in the U.S.; and the Wellcome Trust Sanger Institute in Cambridge, U.K.
In June 2000, the International Human Genome Sequencing Consortium announced that it had produced a draft human genome sequence that accounted for 90% of the human genome.
Baylor contributed about 10% of that draft. The draft sequence was useful – but far from perfect as it contained hundreds of thousands of errors and gaps where the DNA sequence was unknown because it could not be determined accurately.
“Although at that time it was debated whether the draft was useful enough, we were always committed to making the sequence as good as we could because it would need to be of high quality to be fully usable,” Gibbs said. “We continued to work on the program from 2000 to 2003, when we produced an essentially complete human genome sequence that was as good a job as we could possibly do with the technologies and resources that we had at the time. It accounted for 92% of the human genome and had less than 400 gaps; it was much more accurate than the draft.”
In part due to a deliberate focus on technology development, the Human Genome Project ultimately exceeded its initial set of goals. It was completed by 2003, two years ahead of its originally projected 2005 completion. By 2022, technologies had evolved that enabled scientists to finally sequence the remaining 8% of the genome that was not completed in 2003.
The genome was divided by chromosomes and chromosomal regions among the G5, depending on each site’s capacity and prior interest in particular chromosomes.
“We had interest in chromosomes 3, 12 and X, for slightly different reasons in each case,” Gibbs said. “We shared the sequencing of the X chromosome because many researchers were interested in its disease-associated genes. For chromosome 12, we first developed interest because it contains the gene for CD4, the major receptor for the HIV virus. Shortly after, we found another region of chromosome 12 that is very strongly conserved in human evolution. To sequence chromosome 12, we collaborated with Dr. Raju Kucherlapati, whose group had already been developing a map of this chromosome.”
The BCM team was interested in chromosome 3 because it includes major genes involved in cancer response. There are genes for chemokine receptors, important components of the immune response, on chromosome 3 that also are involved in infectious diseases. “We collaborated with Dr. Susan Naylor in San Antonio as well as Professor Huanming Yang in Beijing and their teams, to sequence chromosome 3,” Gibbs said.
What did the Human Genome Project really do?
From the beginning, the goal of the Human Genome Project was to make medicine more scientific by offering an understanding of the disease processes through genetics and genomics.
“To achieve this goal, there are three steps we must walk through: one is to describe what we have. Next, to understand how it works so we can finally alter it or improve it,” Gibbs said. “The genome project was in essence that first step. It was essentially a description of the genome. Once we had finished the description, suddenly we had learned quite a bit about how it works, but that journey is going to take us many years more. We have been steadily building on that information ever since.”
The project generated immense amounts of information, which stimulated computational developments to digitize the data, making it manageable and accessible to the scientific community. The genome project also has provided the knowledge of the overall organization of the genome and the ability to study it subsequently in a systematic way.
“When we completed the genome sequencing in 2003, that was the first time we had a global idea of the complement of the human genome content,” Gibbs said.
“It’s one thing to look down and narrow the view through a telescope, and another to have a panoramic view of the distribution of genes by functional categories. How many genes are involved in neural processes? How many in muscular development or in metabolism? The Human Genome Project gave us essentially a comprehensive view of the human genetic blueprint, that was really the most exciting moment.”
There was also a moment when scientists first saw that the number of human genes in the genome (between 20,000 and 25,000) was actually much smaller than what they had thought there would be (once, it was thought there were about 100,000 genes). That was a surprise.
Thanks to the Human Genome Project
“The genome project provided the foundation to move on to other programs, such as the Haplotype Mapping Project, which catalogued global patterns of genetic variation, and the Cancer Genome Atlas, which has helped us understand how normal cells change and become cancer cells,” Gibbs said.
The BCM-HGSC was also involved in several animal genome projects. Its teams have sequenced the genomes of the rat, mouse, macaque, bovine, sea urchin, fruit fly and dolphin. Each sequence has provided tools to further study that organism’s biology.
The sequence from the Human Genome Project was a mosaic, made from different people. Since then, new technologies have allowed scientists to sequence individual genomes. The BCM-HGSC team completed the first individual human genome, the James Watson project, in 2007. (Watson is a co-discoverer of the double-helix structure of DNA.)
BCM-HGSC has continued its contributions to technological advances, for instance with methods for rapid sequencing of only exons, the parts of genes that code for proteins, which represent just 1% of the total genome. The center also began personalized sequencing, which has shown that genomics could have significant practical applications.
“Our goal is to integrate genome analysis into diagnoses and therapies,” Gibbs said. “Our mission is to bring us to that point as quickly and effectively as possible as we tackle heart disease, Alzheimer’s disease, cancer and all other genetic maladies.”