More Things in DNA, Horatio …

Ideas Roadshow
Feb 7, 2021


From: Our Human Variability

Biology fascinates me. But as a non-expert, I’m forced to think of things in pretty simple terms. So when I hear biologists talk about evolution, adaptability and natural selection, I always find myself asking: What’s going on, exactly? What are the physical mechanisms at play?

After all, if a species evolves through mutations of its members, then these mutations must be physically represented somewhere. And where else could that happen other than in our DNA, our own personal “instruction manual” of nucleotides and genes that we carry with us in every cell?

If evolution is as strong a force as we are led to believe, then, these sorts of variations must somehow be happening all around us, resulting in a world replete with manifold diversity and uniqueness that is layered upon our common humanity. Which is — to all intents and purposes — pretty well what we see when we look around and see such differences in the people on all sides of us. So far, so comprehensible.

But when The Human Genome Project announced that their DNA sequencing experiment demonstrated that we were all “99.9% identical”, things took a decided turn towards the unintelligible for me, and my first reaction was one of sceptical confusion, rapidly followed by one of embarrassed withdrawal.

Like many laymen, I found the conclusions downright perplexing, but who on earth was I to question the scientific consensus of thousands of expert researchers from around the world?

Stephen Scherer, on the other hand, a world-class geneticist who built an internationally renowned research program at Toronto’s Hospital for Sick Children, naturally felt less inclined to be deferential to the prevailing wisdom.

Howard Burton in conversation with Stephen Scherer, University of Toronto

“When the rough draft papers came out in 2000, which talked about how we’re 99.9% identical, I remember thinking, ‘But we’re not identical. My brothers and I share 50% of our DNA from our parents, and we’re nothing alike.’ You could probably pick us out in a crowd, but we’re really quite different.”

Let’s run the numbers. For a human genome of roughly 3.2 billion nucleotides, that 0.1% difference amounts to variations in about 3.2 million of the individual nucleotides that make up the “human genome”. So that’s one way to look at things.
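As a rough check of that arithmetic, here is a minimal Python sketch. The genome length of about 3.2 billion nucleotides is an assumed round figure, chosen because it is the one that matches the 3.2 million quoted above; the names are purely illustrative.

```python
# Rough check of the single-nucleotide arithmetic.
# GENOME_LENGTH is an assumed round figure, not a measured value.
GENOME_LENGTH = 3_200_000_000  # nucleotides (assumption)
SNV_FRACTION = 0.001           # the "0.1%" difference between two people


def differing_nucleotides(genome_length: int, fraction: float) -> int:
    """Return how many positions differ for a given fractional difference."""
    return round(genome_length * fraction)


snv_sites = differing_nucleotides(GENOME_LENGTH, SNV_FRACTION)
print(f"{snv_sites:,} nucleotides differ")
```

With a shorter assumed genome of 3 billion nucleotides, the same calculation gives 3 million rather than 3.2 million, so the headline number depends on which round figure is used.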

But, crucially, it’s not the only way.

Many years before the Human Genome Project reached its conclusion, geneticists had also recognized that some 0.4% of the population exhibited large-scale deviations from the norm — so-called “copy number variation” — where huge chunks of DNA, often millions of nucleotides long, were either missing from their genome or present in extra copies. All of these large-scale changes were associated with serious medical conditions like autism or Down syndrome.

There were, then, it seemed, two types of variation: one for the “diseased” and one for “the rest of us”. It was a picture that most geneticists and molecular biologists of the time unhesitatingly accepted. But not Stephen.

“I have this figure that I always show the students when I teach. If you plot out the number of different types of genetic variation and divide them into single nucleotide variation and the copy number variations, you’ll see that, in fact, 0.4% of the ‘normal’ population, the average population, carries big chromosome structural changes. Trisomy 21 is mainly associated with Down syndrome, but there are other big segments of DNA in a very small portion of the population that are different from each other. 0.4% of the population have these big, big changes, and we’ve known about that for 50 years.

“On the one hand, The Human Genome Project talked about those 3.2 million potential single-nucleotide changes that everyone is subjected to, and then on the other hand there’s 0.4% of the population who experience these large-scale chromosome changes.

“And when I was teaching back in 2002, I kept thinking to myself, ‘Biology favours balance. There have got to be a lot of other variants here. Why is it that we haven’t seen them yet?’

“Well, because we didn’t have the tools to see them.”

He didn’t develop the right tools himself. But as a self-confessed “technology guy”, Stephen had the presence of mind to aggressively seek out better and different techniques to see what others might have missed.

In 2003, he partnered with Craig Venter’s Celera Genomics to study the DNA of chromosome 7, his primary area of expertise. Venter had pioneered a different sort of DNA sequencing technique, called “shotgun cloning”, that had also been used for the Human Genome Project. Now there was a way of comparing and contrasting the two approaches.

“We published that in Science in 2003 with Craig Venter’s group. Figure 1 of that paper is probably the most underplayed figure in the field of genetic variation. In Figure 1, we compared the sequence we put together with the Celera group approach with the public Human Genome Project reference sequence.

“If you look at that figure we show that there were about 167 or so sites along the chromosome that, when we compared the sequences, showed significant differences, including pieces of DNA in one that were not in the other.

“The reviewers kept saying, ‘These are just technical mistakes. You guys screwed up. You made a mistake.’ We knew that wasn’t the case, because we had used another form of experiments to prove that, indeed, those variations existed. But they still didn’t believe us, and the editor wanted it taken out. But I said, ‘You’re not getting our paper unless you leave it in. The data support it.’

“Those were the first copy number variations that were identified.”

So “copy number variation” again, but this time not necessarily associated with any particular condition or disease. What Stephen and his colleagues had stumbled upon was the groundbreaking possibility that large-scale, DNA copy-number variation might be nothing less than a universal human trait, a key ingredient in allowing evolutionary variability — concrete evidence, in other words, that we were far more distinctive than the Human Genome Project was telling us we were.

More work, though, needed to be done — and, once again, with cutting-edge tools.

“The real breakthrough was this technology called microarrays, which allowed us to scan for dimensional differences in the DNA sequence. What we had previously looked for were binary differences: Was it an adenine or a thymine here? Or a cytosine or a guanine there? These are single letter changes — site by site. There was really no good technology that allowed you to look for what I would call a copy number difference, where instead of having two copies, you might have three copies, or one copy, or in some cases zero copies.

“We actually used DNA from a child that was autistic as our first set of experiments. I wanted to get the most bang for my buck — I wasn’t going to run just anyone’s DNA — these experiments cost thousands of dollars. At any rate, we knew that this boy had about a 6-million-base-pair deletion on one of his chromosome 7’s, right near the cystic fibrosis gene. We knew a lot about this.

“When we looked along his chromosome 7, starting at the beginning, there are a few blips, then you get to where his deletion is known to be, where he only has the one copy, and it drops down a bit, and afterwards it picks up again and continues along. But along the way there were all these other blips.

“It looked to be the same site on the chromosome; and we only saw them in some families and not others. That was really copy number variance. There were all these little blips along the chromosomes where there were segments of DNA of the order of 100,000 nucleotides long (an average gene is about 30,000 nucleotides), all the way up to millions of base pairs.

“Further studies and meta-studies revealed that copy number variation transcended autism entirely. Not only is it common to everyone; its contribution to our uniqueness is by far the largest, dwarfing that of single-nucleotide variations.

“We’ve now done a meta-analysis of all of the data that’s been generated in the last 10 years with respect to copy number variation. On average, you or I would have about 45 million nucleotides of DNA in our genome (which is roughly 1% of the entire genome) that is copy number variable with respect to the standard human reference sequence. The variable factor for SNPs was 0.1%, as I said a moment ago. So that’s more than ten times as much.”
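The comparison in that last quote can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 45 million and 0.1% figures come from the quote, while the genome length of about 3.2 billion nucleotides is an assumption.

```python
# Back-of-the-envelope comparison of copy number variation (CNV)
# with single-nucleotide variation (SNV).
GENOME_LENGTH = 3_200_000_000  # nucleotides (assumed round figure)
CNV_NUCLEOTIDES = 45_000_000   # copy-number-variable nucleotides, per the quote
SNV_FRACTION = 0.001           # the "0.1%" single-nucleotide figure

cnv_fraction = CNV_NUCLEOTIDES / GENOME_LENGTH
ratio = cnv_fraction / SNV_FRACTION

print(f"CNV share of the genome: {cnv_fraction:.1%}")
print(f"CNV relative to SNV: about {ratio:.0f} times as much")
```

Under these assumptions the copy-number-variable share comes out a little above 1%, which is where “more than ten times as much” comes from.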

Ten times as much variation? How, then, is it possible that so many others could have missed it? How could The Human Genome Project — one of the largest and most comprehensive scientific collaborations in human history — have overlooked such a humongous elephant in the room?

“The Human Genome Project made a consensus sequence of what a human DNA would look like, based on a lot of individuals. In fact, I think there were 708 different donors. To come up with a consensus you have to merge them. It’s like a grey picture of what a human genome would look like.

“And to do that, because there were lots of different pieces of DNA coming together from different individuals, you take the easiest explanation: you essentially force them together and come up with the most common, linear sequence.

“Just based on the design, then, you would not see these large pieces of DNA missing, because you erase that variation when you merge things. Our advantage was that we were comparing two different DNA assemblies: the private Celera one, and the public one. And that gave us hints.”
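To see why merging erases this kind of variation, consider a toy example in Python. It is not the Human Genome Project’s actual assembly method, just a hypothetical majority vote over a handful of made-up, already-aligned sequences, with `-` marking letters that one donor is missing.

```python
from collections import Counter


def consensus(aligned: list[str]) -> str:
    """Pick the most common symbol in each column of equal-length sequences."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*aligned))


# Four made-up donors; the third is missing four letters (shown as '-').
donors = [
    "ACGTTGCAAC",
    "ACGTTGCAAC",
    "ACG----AAC",
    "ACGTTGCAAC",
]

merged = consensus(donors)
print(merged)  # ACGTTGCAAC
```

The merged sequence carries no trace of the third donor’s missing segment, whereas comparing that donor’s sequence directly against any other would reveal it at once.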

For rigorously following his sceptical hunches, Stephen now firmly occupies a place in the pantheon of the scientific establishment. But however large the personal accolades become, they are dwarfed by the fundamental change that his research has brought to our understanding, not only of the extent of our genetic individuality but also, more importantly still, of what it means to be human.

“It seems, then, that there are all of these ‘normal human beings’ walking around with massive chunks of DNA either added or missing. I later found out that I’m carrying a copy number variant deletion that’s 800,000 nucleotides long.

“We knew about these sorts of things before, but they were always associated with disease. It turns out, however, that all of us carry lots of these chunks of DNA that are either missing in one copy, present in extra copies, or sometimes you don’t have any in a gene at all.

“It’s quite amazing when you think about it.”

Howard Burton

This is the introduction of the book called Our Human Variability. This thought-provoking book is based on an in-depth, filmed conversation between Howard Burton and Stephen Scherer, the GlaxoSmithKline Research Chair in Genome Sciences at the Hospital for Sick Children and University of Toronto.

The book is broken into chapters and includes questions for discussion at the end of each chapter. Visit the page for Stephen Scherer for further details about the book and the videos developed from the filmed conversation.

This book is also available as part of the 5-part Ideas Roadshow Collection called Conversations About Biology, also featuring Nick Lane, Frans de Waal, Matthew Walker and Alcino Silva.
