The human genome has been fully mapped -- which means what, exactly?
Transcript
Host Amber Smith: Upstate Medical University in Syracuse, New York invites you to be The Informed Patient with the podcast that features experts from Central New York's only academic medical center. I'm your host, Amber Smith. Two decades ago, we celebrated the mapping of the human genome, but it wasn't until March 2022 that scientists announced a fully assembled genetic blueprint for human life. For help understanding why this matters so much to science, I'm talking with Upstate professor Stephen Glatt. His areas of scientific research include neuroscience and physiology, psychiatry and behavioral science, and public health and preventive medicine. Welcome back to The Informed Patient, Dr. Glatt.
Stephen Glatt, PhD: Oh, thanks Amber. It's great to be back.
Host Amber Smith: Can we start with some background, please?
Stephen Glatt, PhD: Well, yeah, I mean, it's such an exciting topic because we all took for granted that the first draft of the human genome was completed back in 2001. There was a huge public dissemination effort made around this, and that ushered in an era of expectation around personalized medicine. "Now we know what the draft of the human genome is; we can take advantage of that to design better medicines that take advantage of us knowing everyone's genetic background." But it simply wasn't the case. That was the first draft. And I think the scientists at the time were appropriately cautious to say, this is a blueprint, but it's not the full volume, right? It's not the unabridged version of the encyclopedia. It's just the draft. And so it's taken a long time to fill in the gaps, but it's estimated that that first draft covered maybe 70 plus percent of the genome. And a lot of it was still unmapped. Now, we're at a point here in 2022 where 99% or more of the genome is mapped. And so we can consider this a complete human genome. There's still probably intricacies in there that need to be worked out a little bit as well.
Host Amber Smith: So why was the human genome project originally launched? What did we want to learn from this? And why did we think it was important to embark on?
Stephen Glatt, PhD: Well, it's such a complicated question, but I'll try and do it justice. In essence, our DNA code holds the recipe for making proteins, and proteins are the parts of our bodies that combine to make things like receptors for drugs, or natural chemicals, or the elements of our muscle tissue or skin, or our nerve cells in our brains. Everything that you can lay your hands on, on your body, is made out of protein.
And the instructions for assembling those proteins are in our genomes. And the little, minute differences between you and me and other individuals in our DNA code determine subtle differences in the proteins that we'll make. And which proteins we have determines how effective certain drugs will be. For example, one type of protein that I might make based on the recipe in my DNA code may bind more strongly to a drug or to a natural chemical than the protein that you've made.
And that's in large part determined by what DNA code we've inherited. So the scientists intuited that if we understand everyone's recipe book, everybody's individualized recipe for how they're building proteins, we may understand why some people get sick and others don't, even though they're both exposed to the same environmental circumstances. It may be because one person's making a protein that's better or worse under some certain environmental condition than another person.
So it was really intended to give us a sense of everyone's reaction range, it's called, how they're going to respond to things in their environment, based on the proteins that they're capable of making.
Host Amber Smith: Do you know how many scientists worked on this project collectively, and were they just United States-based scientists?
Stephen Glatt, PhD: Oh no. The U.S. made a major push to formalize the human genome initiative and was one of the leaders in that initiative, but scientists from over 200 different groups participated in this, in countries including the UK, France, China, and Japan, and several other countries around the world. So this was not an initiative that was led or that could have been led by any one group, any one laboratory or even one country or government. This had to be a collaborative effort. And then all the results generated in all these labs around the world needed to be stitched together, which was another huge effort.
Host Amber Smith: So did all the different labs, did they all take different pieces of the genome? Do you know how it was, I mean, because there were roughly 3 billion letters sequenced. I mean, how did you divide the work up?
Stephen Glatt, PhD: That's right. You don't just have everyone set about it randomly and say, let's all see what we can produce. You assign certain chromosomes, of which we have dozens, and divide them amongst the different labs that are interested. And then each lab would go about sequencing some part of the chromosome or an entire chromosome, and then stitch those results back together based on the overlapping sequences that are detected.
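The stitching step described above, joining fragments where the end of one matches the start of another, can be illustrated with a small sketch. This is a toy greedy merge on made-up reads, not the method the Human Genome Project used; the function names `overlap` and `stitch` and the `min_len` threshold are assumptions for illustration only.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def stitch(reads: list[str], min_len: int = 3) -> list[str]:
    """Greedily merge the pair of reads with the largest overlap.

    Stops when no remaining pair overlaps by at least `min_len` letters
    and returns whatever merged pieces are left.
    """
    pieces = list(reads)
    while len(pieces) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left that overlaps
        merged = pieces[best_i] + pieces[best_j][best_k:]
        pieces = [p for idx, p in enumerate(pieces) if idx not in (best_i, best_j)]
        pieces.append(merged)
    return pieces


# Three hypothetical reads that overlap by three letters each.
print(stitch(["ATGCGT", "CGTTAC", "TACGGA"]))
```

Real assemblers work with millions of error-prone reads and use far more efficient data structures, so this only conveys the basic idea of overlap-based joining.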
Host Amber Smith: We recently heard that there were gaps in the human genome project. Can you explain what those gaps were that have recently been closed?
Stephen Glatt, PhD: Sure. Yeah. I think you could say that we sequenced the easy part, but none of it was easy. But it was just that it was more accessible. If you think about each chromosome as kind of a letter X, the part where the letter X, the two lines cross in the middle, that's called the centromere. And that's a place where the DNA is really tightly wound. And because it's so tightly wound, it's not really accessible to being broken down or sequenced by the technologies that were in place at that time. So the centromeres were kind of off limits, and they weren't part of the original sequence draft.
The same is true of the telomeres, which are the tips of each line in the X. So there's four tips on the line of the letter X. Those outer ends, they get degraded over time, and they're very difficult. They have highly repetitive sequences, which are hard to map too. So imagine if I asked you to remember a phone number. You could do that if the numbers have some repetitive pattern. Or if people remember the digits in Pi. Some people can memorize that. But what if the digits in Pi were just repetitive over and over again? 3 1 4, 3 1 4, 3 1 4, and so on. You might lose your place. And that's kind of what happens with sequencing technology. When there's a lot of repeated letters of the genome, those A's, C's, G's and T's that are the nucleotides in our genome. When those just repeat in sequence over and over again, it's very hard for the sequencing technology to keep track of where it is. And in the telomeres where there's these repetitive segments of DNA, they kind of left those areas out of the original draft, as well. So the extended outer arms of those chromosomes, as well as the central part of those chromosomes are what have been filled in now with better sequencing technology.
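The "losing your place" problem with repeats can be shown with a short sketch: a short read taken from a repetitive stretch fits equally well at several positions, so its true location is ambiguous. The sequences below are made up for illustration (the `TTAGGG` unit is the known human telomere repeat), and the helper name `match_positions` is an assumption.

```python
def match_positions(reference: str, read: str) -> list[int]:
    """Return every start position where `read` matches `reference` exactly."""
    last_start = len(reference) - len(read)
    return [i for i in range(last_start + 1) if reference.startswith(read, i)]


# A non-repetitive stretch: the read fits in exactly one place.
unique_region = "ACGTTGCAAGCTTAGC"
print(match_positions(unique_region, "TGCAAG"))

# A telomere-like stretch built from one repeated unit: the same kind of
# short read fits in several places, so its position cannot be pinned down.
repeat_region = "TTAGGG" * 4
print(match_positions(repeat_region, "TTAGGGTTA"))
```

Reads that are longer than the repeated stretch can span it and anchor in unique sequence on either side, which is one reason longer reads help with regions like these.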
Host Amber Smith: You mentioned the X chromosome. Does this apply to the Y chromosomes, too? They have the telomeres on the ends?
Stephen Glatt, PhD: Well, that's true that the Y has telomeres, and I wasn't referring to the sex chromosomes. I know it's confusing without a visual aid. But there's X and Y chromosomes that determine which sex we are. But I was using more of the visual symbol of an X, because all the autosomes, which are all the other chromosomes, besides your sex chromosomes, also have that X-like shape.
Host Amber Smith: I gotcha. So there's equipment that we have today that allows us to do more than we could do 20 years ago, essentially?
Stephen Glatt, PhD: That's right. The process for sequencing at that time, called Sanger sequencing, was very methodical, very precise, but very slow. And so a lot of different laboratories were deploying Sanger sequencing and then assembling those reads into one map. Now we have something called next generation sequencing, which uses a totally different technology to sequence lots of fragments of DNA in the same sample, in parallel. So it's not just reading one. It's not like pulling one thread through a needle eye. It's like passing many threads on a loom at the same time. That's what Sanger sequencing was: passing a thread through an eye of a needle, one strand of DNA. Whereas now, it's like multiple strands of DNA being sequenced simultaneously.
Host Amber Smith: This is Upstate's The Informed Patient podcast. I'm your host, Amber Smith, talking with Dr. Stephen Glatt. He's a professor at Upstate whose areas of research include neuroscience and physiology, psychiatry and behavioral science, and public health and preventive medicine. We're talking about the importance of scientists completing the mapping of the human genome.
What more can you tell us about this newly sequenced section of the genome? I think I read that it had something to do with aging?
Stephen Glatt, PhD: Well, it's true that those telomeres -- which are the outer tips of the chromosomes, not the centromere, the central part, but the telomeres -- those have long been thought to be involved in aging, because what we've noticed is that as people and animals age, their telomeres shorten. As cells go through division cycles, as we age, your cells are constantly dividing to regenerate to make new ones, and those cells pull apart, and they pull those chromosomes apart to make new cells that each have a set of those chromosomes. And the points of attachment for those chromosomes are the telomeres. So you can imagine if you're pulling those apart repetitively, you're going to lose some fabric at the ends, and that's what's happened in telomeres. So telomere shortening is actually a proxy for one's biological age. And so having the sequence of those telomeres and finding out what areas of the telomeres are being degraded and how that correlates with aging phenotypes or disease phenotypes, that's why that's so important because it's already an area of the chromosomes that we know is related to how well or poorly an animal or an individual ages.
Host Amber Smith: Can you envision concrete ways in which this completed genome might contribute to human health?
Stephen Glatt, PhD: Oh gosh, it boggles the mind. I don't want to over promise because what we've fallen into in the past, even with the first draft of the human genome, is we set up an expectation that by knowing the sequence of the human genome, it would enable personalized medicine, and all of our problems would be fixed. Now, none of us in science believed that, but sometimes the translation of the message gets lost.
And so while knowing the genome sequence has enabled so many great advances in medicine and technology, we also know that there's a lot more to do to translate just knowledge of the structure of the genome into things that help people. But what I envision for the future really is that having this sequence of the genome now much more complete will enable us to determine what are the differences between individuals in their DNA code and how that relates to what diseases they get or what diseases they're resistant to. And that, in turn, will help us design better interventions, whether those are medicines, whether those are nonmedical interventions, but things that target our genomes in ways that we couldn't envision or understand without knowing what our blueprints are.
I think it will also help in prediction of disease so that people can understand their risks so that they can avoid certain environmental exposures that might compound those risks. For example, if your DNA code says that you're at a heightened risk for lung cancer, you certainly might benefit knowing that information, so you could avoid that first cigarette. I envision a lot of uses both in terms of the design of better medicines, the interventions that can be done in the clinical practice of medicine, but also in prevention.
Host Amber Smith: Would we ever see a time when everyone's genome is mapped and recorded for medical or identification purposes, or is that science fiction?
Stephen Glatt, PhD: I think we will see that time. And even at this stage, it's possible for each of us to have a rough map of our own whole genome sequence derived for a few hundred dollars. We can generate whole genome sequences now pretty readily on most people. We may not have all the telomeric information or the centromeric information from everybody, but you can get a readout of your own blueprint for less than a thousand dollars.
The issue is not generating those genomes. It's storing the data and using the data. If you can believe it, this generates a massive amount of data, and to store and manipulate and move that data is much more difficult than faxing over an X-ray. It takes huge computational resources to process and align and store that DNA sequence data. And it's even more difficult once that data is generated and can be passed around on your personal flash drive or something, let's say, to make sense of that. A lot of science has to happen understanding how those DNA codes that we have relate to the proteins we can produce and the diseases that we're at risk for, or the medications we may or may not respond to. A lot more science has to be done first, before we can capitalize on knowing our sequence. We can generate and we can keep our sequences, but what does it all mean? How can we use it? That's the frontier that we haven't crossed just yet.
Host Amber Smith: So a person who got their entire genome done through one of the services that does that, they could bring that to their doctor today, but what is that doctor going to do with it?
Stephen Glatt, PhD: That's the question without an answer.
There's nothing. Aside from very significant, major changes in the structure of your chromosome, it's very hard to take information about tens of thousands of small changes in your genome and know what that means for your health. So for example, with knowledge of a family history of a significant disease, such as breast cancer from a BRCA-1 or -2 mutation, or early onset Alzheimer's disease from a presenilin-1 or -2 mutation, if you know that these diseases run in your family, it may very well behoove you to seek out your sequence information, maybe not even for your whole genome sequence, but for these loci that I've mentioned to see, am I a carrier of this gene that puts me at very significant risk for a life-altering diagnosis?
On the other hand, most of the diseases that have high morbidity or mortality, that cause a lot of sickness and death, are not those that are caused by a single genetic event. They're more like what we call complex disease or multifactorial disease. Things like mental illness, things like cancer, most forms of these diseases are caused by a combination of genes and environmental factors, and not just the effects of one gene or 10 genes, but hundreds, if not thousands of genes.
And when I say genes, I mean these little individual changes in your DNA code. And so the personalized suite of DNA changes that you possess may set some level of risk for you, and the environment in which you are raised or that you choose may set another level of risk. And then those two combine to determine how likely you are to have a disease like depression, or like colon cancer. And so knowing your sequence at any one of those thousands of genetic sequences is not going to help you much because none of those individual genes is necessary or sufficient to cause disease. They're just risk factors.
You can have your genome sequence, but then what? Like, what do you make of that? Just knowing it is not enough. It may be readily apparent in someone's genome sequence that they have a huge problem. And if they have a huge problem, that's worth knowing. But you still can't correct it. You can't fix it. You may change the way you deal with the disease that's owing to that. I mean like a portion of their entire genome, like millions of nucleotides, may be repeated or deleted. That's a big, big problem. That's not just you having an adenine and me having a cytosine in one position, one nucleotide. I'm talking millions of those being deleted. That's a big, big problem. And that has a lot of effects for the person. Knowing that helps manage the sequelae that come from that, right? But what can you do with those minor kind of risk predisposing alleles, most of which have a minute effect on your liability toward a disease? And none of which, as I mentioned, are necessary or sufficient. If you have it, it increases your risk. If you don't, you still can get the disease because there's lots of other redundant factors that increase your risk. So I think having whole genome sequences of ourselves at our fingertips is more noise than signal right now. That's where you need the bioinformatics and the data analytics to understand what are the important ones? What's their likelihood, in combination with other important ones and environmental factors, that this person will have a disease? Then you could start to intervene.
Host Amber Smith: You're listening to Upstate's The Informed Patient podcast. I'm your host, Amber Smith, talking about the human genome with Dr. Stephen Glatt. He's a professor at Upstate whose areas of research include neuroscience and physiology, psychiatry and behavioral science, and public health and preventive medicine.
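The idea that many small risk alleles each nudge liability a little, with no single one being necessary or sufficient, is often summarized as a weighted sum. The sketch below is only an illustration of that arithmetic: the variant names, the weights, and the function name `liability_score` are all hypothetical and do not come from any real study or from the interview.

```python
def liability_score(allele_counts: dict[str, int], weights: dict[str, float]) -> float:
    """Add up small per-variant effects, weighted by how many copies are carried.

    `allele_counts` maps a variant name to 0, 1, or 2 copies of the risk allele.
    `weights` maps a variant name to its (hypothetical) effect size.
    Variants missing from `allele_counts` are treated as 0 copies.
    """
    return sum(w * allele_counts.get(variant, 0) for variant, w in weights.items())


# Hypothetical effect sizes: each one is tiny, and one is even protective.
weights = {"variant_a": 0.02, "variant_b": 0.01, "variant_c": -0.015}

# Two hypothetical people who differ at single letters of their DNA code.
person_1 = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
person_2 = {"variant_a": 0, "variant_b": 0, "variant_c": 2}

print(liability_score(person_1, weights))
print(liability_score(person_2, weights))
```

A score like this only shifts a person's estimated risk up or down; it leaves out the environmental factors that combine with it, which is why a raw list of variants is hard to act on without further analysis.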
This now completed blueprint of a full human, is it a woman or a man?
Stephen Glatt, PhD: This is a misnomer to say that it's one individual's genome. It's actually an amalgamation of different individuals. And in this case, it's a cell line that was sequenced, that was taken from an individual long ago. But even the original draft of the human genome was an amalgamation of several individuals. And so the next frontier is for us to get much more representation from individuals across the globe of different ancestries, men and women, so that we have a better sense of the range of variation.
Host Amber Smith: I'm just curious now. So we have this blueprint of a full human. Can we clone this person now that we have this blueprint?
Stephen Glatt, PhD: Well, theoretically, and technically, it's possible that that sequence could be used, but it's not the sequence information that's going to allow the cloning. You could clone an individual today. And in fact, a scientist in China claimed to have cloned a human being a couple of years ago, and that was met with a lot of scorn in the scientific community and the ethical community, because it's not a thing that should be done, ever. But it's not knowing the sequence information that enables the cloning. Actually just having access to a DNA sample enables the cloning. You don't need to know what those sequences are in order to necessarily clone them. You could do that in a biological way. But I think having the access to the new sequence that we have enables some cloning of individual proteins that may be helpful for purifying these to make medicines. But I don't think anyone's intending, nor should anyone be intending or imagining, that we're going to use advanced sequencing technologies like this to develop clones of animals or humans.
Host Amber Smith: The sequences, would the sequence be different based on ethnic or racial differences in the person whose cell line it's drawn from?
Stephen Glatt, PhD: Indeed. There are major differences in the DNA sequence of different ancestral groups around the world. And in fact, even with existing technologies, we can readily detect those differences. So if we genotype individuals of Northern European ancestry, Latin American ancestry, East Asian ancestry on standard genotyping technologies, and then we look at everyone's code, we can sort people into bins of ancestry based on what DNA markers they have. So it's pretty clear that a lot of our DNA code is related to our ancestry, and that's how it's intended. We're intended to inherit these things. But the differences among the ancestral groups, and especially in America, start to dissolve. We are a melting pot, both sociologically and ideologically, but also genetically in America. And so mapping ancestry of people in a place that's highly admixed like America results in very different findings than if you were to genotype individuals in a more ethnically or ancestrally homogeneous place, where there's not a lot of immigration or emigration.
But yes, your ancestry does relate to which sequence variations you carry forward. Some of those will be related to the prevalence and risk for disease, and others of those are just related to the structure of the DNA and how it holds together.
Host Amber Smith: Have we completed genomes for other animals? Horses, cows, dogs, chickens?
Stephen Glatt, PhD: A lot of species have been sequenced over time. And those, each of those has been a huge breakthrough for science and there's been a queue: "let me get my species into the queue where we can generate its entire genome." And I think for farming, for livestock, for crops, agriculture, knowing those sequence variations, scientists want to capitalize on those the same way they do for human health to make better crops, to make more disease-resistant animals and so on. But there's a litany of species that have been sequenced in a similar manner.
Host Amber Smith: I'm assuming that the human genome is the most difficult. Is that right?
Stephen Glatt, PhD: No, that's not right. There are other species that have intricacies to them that make them much more difficult to sequence than humans, such as circular DNAs, for example. Ours are in an X shape, but some species have their DNA wrapped in other configurations that make it even more difficult to sequence. Highly repetitive sequences abound in other species as well. So I would say the human genome is no more or less difficult than the average animal.
Host Amber Smith: So I was wondering, do scientists learn anything that helps in human medicine from studying the genomes of plants or animals?
Stephen Glatt, PhD: There's this entire kind of scientific pursuit of genome biology, whether it's done in animals or plants or humans, whatever we learn from one pursuit, we want to disseminate widely and adopt those principles in the pursuit of advances in human health. And largely those are technological, but sometimes those are theoretical game-changing discoveries about the biology of the genome. So there's still a lot that we have to discover, and I think it's let a thousand flowers bloom. You know, study plant genomes, study animal genomes and human genomes at the same time.
Host Amber Smith: So, what do we have to look forward to next in human genomes? Because 20 years ago there was news coverage that it was complete, but it really wasn't entirely complete. But now we're being told it is entirely complete, but is it really?
Stephen Glatt, PhD: It's very, very close. There are still some aspects of the genome that are being mapped, but it's very, very close to a final map of an amalgam of individuals, or this one cell line. So what we obviously need to do is repeat this process many, many times, because the sequence information you glean from one, or even a handful of individuals may not contain all of the variation of individuals of a different ancestry. So we need to complete genomes of many, many people so that we have a reference set against which any new individual with some disease can be sequenced and compared to find what's different in their genome that might have caused them to have that debilitating disease. So we need to generate lots and lots more genome sequences themselves. And then the other aspect that's going to grow very fast is exploration of how those individual changes in the genome, that we all have, relate to the types of proteins that we make and how that in turn relates to our risk of disease or our response to interventions like medications.
Host Amber Smith: Would the genome have changed over time? If the genome was completed 200 years ago, versus today, would there be differences just because of the span of time?
Stephen Glatt, PhD: So genomes do change and evolve. But 200 years in genomic time is actually a relatively short period of time. The average genome of a human now is pretty much the same as it was of the average human 200 years ago. Although, genomes do change and acquire new mutations, especially due to cancer causing agents, things that are introduced into the germline and passed on from generation to generation. We also have DNA repair mechanisms. And so when those changes are introduced, we generally fix them. But sometimes mutations do sneak through. Sometimes they're advantageous and they lead to better fitness and passing on more of that gene to the next generation. Sometimes they're deleterious or harmful, and they lead to less fitness and less success in passing them onto the next generation. And sometimes they're neutral and they just go for the ride.
Host Amber Smith: Could prehistoric humans be sequenced, possibly from fossils or remains?
Stephen Glatt, PhD: Well, absolutely. In fossil remains, there are detectable amounts of DNA at times. So it is conceivable that the sequences of, for example, Neanderthal or other proto-humanoids will be sequenced. And that's interesting as well. You can already, for example, know the Neanderthal genome and compare when your genome is sequenced, how much Neanderthal do you have in your genome? I think I have 2%.
Host Amber Smith: Well, I appreciate you making time for this interview, Dr. Glatt.
Stephen Glatt, PhD: Oh, it's my absolute pleasure. It's the most exciting thing going on in my field of psychiatric genetics, but also in a lot of medicine. There's a lot of buzz about this, so I'm glad that you've dedicated the time to tackle it and share it with your listeners.
Host Amber Smith: My guest has been Upstate professor Stephen Glatt, whose areas of research include neuroscience and physiology, psychiatry and behavioral science, and public health and preventive medicine. The Informed Patient is a podcast covering health, science and medicine brought to you by Upstate Medical University in Syracuse, New York, and produced by Jim Howe. Find our archive of previous episodes at upstate.edu/Informed. This is your host, Amber Smith, thanking you for listening.