By Mark Blaxill
There’s a familiar rhythm to the most prominent autism gene hunt publications. Their authors hype their newly minted study aggressively in the media. The prestigious journals that publish them lend their imprimatur to press releases that say, “this study is a big deal.” The findings sound impressive in the press release (and the authors get plenty of time on camera and in leading newspapers to tell us how truly impressive they are). In the meantime--in papers that are so densely written that making sense of what they really say requires far more reflection than the media hype cycle permits--skillfully concealed evidence reveals the truly important news in the findings: the authors whisper quietly (if at all) that the new analysis negates the most important findings of some of the most prominent previous gene hunts, while crucial detail on their new findings is often relegated to “supplementary material” that’s not available on the publication date.
All of these patterns will almost certainly be on display today as the latest missive from the autism-genetics establishment bursts forth in the form of not just one, but two major papers in the journal Nature. But I warn you, don’t be fooled by the hype. These two studies report a few moderately interesting findings, which isn’t a bad thing. Broadly speaking, trustworthy and actionable biological findings about autism are something all autism parents should welcome, whether they’re about genes or the environment or the interaction between the two. And indeed, most autism parents I know generally agree that there OUGHT to be some kind of genetic susceptibility that we can discover in autism.
But what’s truly remarkable in these two papers is how so much will be made about so very little.
That said, the publication of these two papers--one on the risk of rare mutations (copy number variants) in “autism genes”, the other on common inherited genes (reported here in the form of “single nucleotide polymorphisms” or SNPs) that may increase autism risk--creates an opportunity to review the current state of the great autism gene hunt, something I’ve wanted to do for a while. I’ll break the review into four pieces:
1. What you should know about the lead authors and their funding
2. What the paper on “copy number variants” really says
3. Why the paper on common genetic variations will get the most hype
4. How to distinguish faith from reality in reading the results
1. The Lead Authors and their Motives
For the autism parent community, the biggest news in this PR blitz is not the work itself, but rather who stands behind it. The work is billed as a collaboration between the Autism Genetic Resource Exchange (AGRE) group from Autism Speaks and the Center of Applied Genomics at the Children’s Hospital of Philadelphia (CHOP).
That’s right. CHOP.
Autism Speaks’ AGRE program is right at the center of most American autism gene research so it’s not surprising to see them involved, but the interesting part in these two prominent studies is that Autism Speaks has clearly taken a back seat to CHOP in controlling this project. Researchers from CHOP spearheaded the study design and report writing. CHOP appears to have funded a material portion of the work on their own. And the corresponding author on both papers, Hakon Hakonarson, is the Director of CHOP’s Center for Applied Genomics (CAG).
With CAG and CHOP at the center of this research, it’s certainly reasonable to ask, what connection does this work have to Paul Offit? Especially in light of the $182 million that CHOP received last year from selling its rights to the Merck rotavirus vaccine, is the Merck vaccine not only making Paul Offit a wealthy man, but also financing his personal crusade to exonerate the environment in general (and vaccines in particular) from any role in causing autism?
As we reported previously (HERE) Offit and CHOP have declined comment on the distribution of the millions they received from the Merck vaccine, but one can get a few clues from CHOP’s official policy document on the distribution of patent royalty income (HERE): 30% goes to the inventor(s) (in this case Offit alone), 25% to the inventor’s lab and department and the remaining 45% to the general CHOP research budget.
It’s impossible to know where all that Merck money is going if CHOP won’t disclose it, but it’s reasonable to speculate. Could the lion’s share of these millions really be going to the Infectious Disease Section where Offit officially hangs his hat? That would be hard to imagine: if you visit the web-site (HERE) you will find that there’s really not much going on in infectious disease. As a research area, it’s certainly not particularly sexy. One clue to the patent royalty distribution might come from Offit’s other recent choice: when he was demonstrating his compassion for families affected by autism, Offit made a big show of donating his book royalties to CHOP’s Center for Autism Research (CAR). Is there a connection between CAG and CAR? Yes there is and it’s Dr. Hakonarson himself: The corresponding author on the two new autism-gene papers is not only the Director of CAG, he is also a researcher at the CAR at CHOP (HERE).
Follow the acronyms. Merck to CHOP to CAR and CAG. Any way you slice it, there’s a pretty direct path from Merck’s vaccine profits to Hakonarson’s gene study with Paul Offit in the middle. None of which should negate the evidence any gene study produces, but it should certainly influence anyone who wants to calibrate the PR blitz we’re about to see.
2. The New Church of the Immaculate Mutation
The least radical of the two papers, “Autism genome-wide copy number variation reveals ubiquitin and neuronal genes”, is actually the most interesting from an autism research perspective. CNVs have recently become a hot topic in autism genetic research. Most notably, researchers from the Cold Spring Harbor (NY) Laboratory (2007) and the Boston-based Autism Consortium (2008) recently published papers, in Science and the New England Journal of Medicine, respectively, arguing that de novo copy number variants could provide genetic answers where all the research on inherited genes could not. Their theory was essentially one of mutation: the idea that the defects in the DNA of autistic children that caused the disorder were not present in their parents’ DNA but rather the result of newly created (or de novo) variations in the egg or sperm that subsequently combined to create the child.
The de novo CNV theory looked promising: most previous autism gene research had assumed that “autism genes” had to come from the parents and be passed on; studies therefore used analytical techniques that looked at linkage, i.e., the transmission of genes from parent to child. They consistently found little. The idea that the defects could be de novo opened up an entirely new field of possibilities.
One part of the idea made a great deal of sense. De novo variation is consistent with an environmental theory. One doesn’t need to dismiss the tenfold increase in autism rates to argue for genetic causation. Instead, one could argue that the autism epidemic has resulted from all kinds of environmental toxins that were wreaking havoc on the reproductive fitness of modern parents, basically leading to widespread mutations in eggs and sperm. The corollary was that the relevant environmental injury had nothing to do with what happened to the embryo itself, not to mention the infant. All the damage took place before conception, occurred in sperm and egg cells, and of course had nothing to do with vaccines.
But if the de novo CNV theory was plausible at one level, it was absurd at another. The genetic mutations the theory proposed (because this was the best the available evidence could support) were completely non-specific. The copy variants were spread widely (even randomly) over the genome, the theory went. No individual mutation was responsible for autism, just the unhappy presence of the wrong one. And these non-specific mutations were not only widely spread, they were virtually undetectable in the infant: no dysmorphic features; generally normal birth and (in many cases) development; and the beautiful children we so often see affected by autism.
In other words these CNVs were a case of immaculate mutations.
And it was the perfect new project for the genetics research community. A wide open field of research opportunities. Lots of new money. And a chance to explain past failure away as part of the inexorable march towards genetic understanding.
Unfortunately, the religious zeal with which the genetics field took on CNVs began to run into difficulty as the evidence didn’t cooperate very well with the theory. The Cold Spring Harbor paper found de novo CNVs in 14 out of 195 autism cases as compared to 2 out of 196 controls. These were scattered around chromosomes 2, 3, 6, 7, 13, 15, 16, 20 and 22. But only a few of these were in regions that had ever been raised before in autism gene research. More to the point, the researchers had very little idea what common pathway might be involved in this seemingly random selection of individual mutations. The mutations--though more common in the autism genomes, they claimed--were limited in number, not really the outcome of a “fragile-genome disorder.” Most of them didn’t go along with any known genes. They confessed that “these studies do not address the mechanisms by which structural mutations of genes contribute to autism.”
And then, like the heritability studies that went before, the studies started contradicting each other. The Autism Consortium study, published in the NEJM, waited until the penultimate paragraph before conceding that “a comparison of our findings with those of other recent studies that have reported de novo copy-number variations in autism does not add evidence in favor of susceptibility to autism conferred by other recently highlighted genomic regions.”
There was one major exception, however, and that was a region on chromosome 16 (16p11.2 to be precise) that received all the focus of the gene hunt PR machinery. “A hot spot of genetic instability in autism”, was discovered on 16p11.2, proclaimed an accompanying NEJM editorial that also advertised the attractions of the new theory. “These examples highlight a different paradigm for the genetic basis of autism. Rather than being an inherited disease, autism may be the result of many independent loci that rarely delete or duplicate during gamete [sperm or egg] production.”
Publishing these findings in the prestigious New England Journal of Medicine required the authors to meet some unusually high standards. More so than the Cold Spring Harbor paper, the 16p11.2 finding was made with modesty in its claims. In the original sample of 751 families (genes provided again by AGRE), only four (barely half a percent) showed the most durable CNV result, a deletion of DNA on 16p11.2. Normally, this trivial a result wouldn’t merit publication in a journal like NEJM. But the difference here was that the Autism Consortium could claim that they had replicated the specific CNV finding, not once but twice, in two different samples. There would be more details to come in the supplementary material, but the fact that three entirely unrelated samples (the AGRE original and then separate replications from Boston Children’s Hospital and from Iceland) had shown the same result, <1% deletion in “cases” and close to 0% in controls, was a landmark finding.
I spoke to one of the Autism Consortium researchers at the time (she was trying to recruit our family to donate samples). They were very pleased with themselves that they had “explained” half a percent of the cause of autism.
And then along comes today’s CHOP CNV publication. It’s good news for them (word is they’re pleased with themselves too). Like the NEJM paper, the CHOP paper claimed to have replicated their findings in two separate samples: the AGRE and the CHOP cohorts. Unfortunately for the Autism Consortium, the news for replication of the “hot spot” on 16p11.2 was not so good. “We observed a similar frequency of deletions and duplications of the 16p11.2 locus in the ASD cases (~0.3%) as previously reported; however the CNV frequency in the control subjects at this locus was also comparable to that of the cases.”
So much for the hot spot.
But the CHOP CNV paper also departs from the Boston and NY papers in a different and more important way. They report their own set of “enriched” areas of CNVs. But unlike the other papers that reported de novo variations, the CHOP CNVs were almost entirely inherited (in fact, they reported surprisingly low levels of de novo variation). With that finding, not only did the 16p11.2 hot spot go down the drain, so did the “new paradigm for the genetic basis of disease.”
So much for an environmentally-based theory of genetic causation.
And so it goes. What we’re seeing now won’t be reported widely, but it’s an important silent finding. Effectively, the New Church of the Immaculate Mutation is stumbling from enthusiasm into a new crisis of faith. And the news gets worse, because not only did some landmark findings fail their first major replication test, but in the retrospective analysis they’re going to find themselves vulnerable to critical flaws in their previously acclaimed analysis of de novo CNVs. Here are a few that I’ve found interesting.
• The 16p11.2 “hot spot” was developed based on an original finding of both deletions and duplications. The duplication CNV wasn’t found in the Iceland group and was questionable in the Boston Children’s Hospital group.
• The second sample from Boston Children’s Hospital was made up of “case subjects that had received the diagnosis of developmental delay, mental retardation or autism spectrum disorder.” That’s right, it said “OR autism spectrum disorder.” The second replication sample wasn’t really an autistic sample at all, a limitation that should place a serious cloud on their second claim to replication for the CNVs they found in the initial group.
• In fact, in reviewing the “supplementary material”, the (scantily) reported history of the “autistic” cases makes one highly suspicious that these are really cases of autism at all. There is no clear diagnosis listed and several symptom areas that would be markers of autism were reported in which the entry was merely “NA.”
So it appears the much hyped replications were far less robust than the study authors claimed and perhaps the CHOP study has now revealed the weakness in the Autism Consortium’s methods.
But even in the CHOP study, it’s not clear that the latest set of CNVs is any more reliable than the Autism Consortium’s. The CHOP study took a full genome scan of both the AGRE and CHOP samples and reported on the handful that showed up “significant” (meaning they had low P values) in both. Notably, the paper itself doesn’t describe the overall results as “significant”, but rather “positive” and “enriched.” That’s because the standard for separating meaningful results from noise in a full genome scan is higher than a typical analysis: there is so much data in the full genome that one would expect low P values in a random analysis.
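The multiple-testing point above can be made concrete with a small simulation. This is only a sketch under stated assumptions: the marker count of 1,000,000 is a hypothetical round number (not the number of markers in the CHOP scan), the tests are treated as independent, and no marker has any real effect, so every p-value is drawn uniformly at random.

```python
import random


def count_null_hits(num_tests, alpha, seed=0):
    """Count p-values below alpha when every test is pure noise.

    Under the null hypothesis a p-value is uniform on [0, 1), so each
    test is simulated with a single uniform random draw.
    """
    rng = random.Random(seed)
    return sum(1 for _ in range(num_tests) if rng.random() < alpha)


# Hypothetical scan size, chosen for illustration only.
NUM_TESTS = 1_000_000
NOMINAL_ALPHA = 0.001  # a "low P value" by everyday standards

expected_hits = NUM_TESTS * NOMINAL_ALPHA
observed_hits = count_null_hits(NUM_TESTS, NOMINAL_ALPHA)

# Bonferroni-style correction: divide the usual 0.05 by the number of tests.
corrected_alpha = 0.05 / NUM_TESTS

print(f"expected false positives at {NOMINAL_ALPHA}: {expected_hits:.0f}")
print(f"simulated false positives: {observed_hits}")
print(f"corrected per-test threshold: {corrected_alpha:.1e}")
```

With a million independent null tests, roughly a thousand clear a threshold of 0.001 by chance alone. That is why a scan-wide threshold has to be far stricter than the one used for a single test: 0.05 divided by one million is 5 × 10⁻⁸.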
For all these reasons (it rejects the 16p11.2 “hot spot”; it failed to find de novo variation; it doesn’t meet a high standard of statistical validity), the CHOP CNV paper is getting far less attention than the more traditional study: a straightforward sequel to past analysis of inherited genes. Let’s turn to that one now.
3. The Orthodox Church of the Common Variant
In order to understand the importance of the second Nature article, you need to understand a critical bit of background about a raging debate among geneticists right now. There has been a crisis brewing for a while among the genetic research faithful. The New York Times reported on this crisis in a September 16th, 2008 article. In the opening paragraph, the reporter summarized the large stakes involved.
“The principal rationale for the $3 billion spent to decode the human genome was that it would enable the discovery of the variant genes that predispose people to common diseases like cancer and Alzheimer’s. A major expectation was that these variants had not been eliminated by natural selection because they harm people only later in life after their reproductive years are over, and hence that they would be common. This idea, called the common disease/common variant hypothesis, drove major developments in biology over the last five years.”
But after many years of investigation, the search for common variants has been widely judged a failure. And that failure has precipitated a remarkable crisis of faith among geneticists. The New York Times article quoted David B. Goldstein of Duke University, a population geneticist who basically said to the faithful that God was dead. “It’s an astounding thing,” Dr. Goldstein said, “that we have cracked open the human genome and can look at the entire complement of common genetic variants, and what do we find? Almost nothing. That is absolutely beyond belief.” That wasn’t good news for the practical application of genomics, said Goldstein. “There is absolutely no question,” he said, “that for the whole hope of personalized medicine, the news has been just about as bleak as it could be.”
Most of you probably missed a recent series on genetic analysis of disease in the April 23rd issue of the New England Journal of Medicine. Goldstein and several other authors contributed editorial commentaries, all of which focused on the implications of the newly recognized limits of common variant analysis in deciphering human disease. They each tried to find the silver lining in the failure. Goldstein argued to search for rare variants rather than common ones. Peter Kraft and David Hunter of Harvard argued doctors could save time and money, claiming that “the positive predictive value of a genetic test [for human disease] will almost always be low.” Joel Hirschhorn of MIT tried to cheer the faithful up with the argument that genetic analysis was not really about genes after all. “The success of genomewide studies is not tied to prediction”, argued Hirschhorn. “If we identify only new pathways underlying disease, these studies will have a tremendous impact.”
In the midst of this crisis of the orthodox faith in the grand human genome project, the big news in the CHOP autism papers becomes more evident, especially the second one, “Common genetic variants on 5p14.1 associate with autism spectrum disorder.” Here is where the real focus of the press release just issued by Nature has been placed.
“The first robust evidence of common genetic variation having a role in autism is presented online in Nature this week. The research pinpoints significant single-nucleotide polymorphisms (SNPs) that appear to have a strong genetic association with autism, and marks the first identification of this kind for the disease. The discovery of these SNPs suggests that, in contrast to what was previously thought, common genetic variation can contribute to autism spectrum disorders (ASDs).”
The search for common gene variants in autism requires another kind of technique, not a mutation search but a linkage analysis. There have been millions of dollars spent on linkage studies in autism, by over 10 different groups carrying out even more individual genome scans.
(I reviewed these studies HERE.) The bottom line: like so much of the genetic analysis in autism, none of these studies can sustain a reproducible result. Why? Since the genome is so large, these linkage studies are like massive random number generators: any individual study will turn up significant findings somewhere in the genome, all of which end up being false signals of disease risk. This multiple-comparisons problem was famously demonstrated by leading geneticists Eric Lander and Leonid Kruglyak in a simulation model.
“To illustrate the random fluctuations expected in a whole genome scan, we generated simulated genotypes assuming independent assortment throughout the genome—that is that there are no trait causing loci. All positive scores in such data necessarily represent random fluctuations, not true linkage…[In the simulation, a] single region on chromosome 14 reached the status of suggestive linkage, as expected, while no region showed significant linkage. If these results had occurred in a real dataset, an investigator would likely call attention to the possibility of linked genes on chromosome 14…The example thus illustrates that false positives can cluster in candidate regions and otherwise mimic true loci.”
In scan after autism scan, researchers have fallen into this trap. Some region shows up with a significant result, the authors hope against hope (they’ve all read Lander and Kruglyak) that their “suggestive” finding is really a true risk factor, the next wave of studies comes out and lo and behold the old suggested regions are rejected and new “suggestive” (and sometimes even “significant”) findings emerge. This has gone on so frequently, it’s become a real embarrassment to the genetic community.
So what’s different about this Nature paper? After all, Nature sets higher standards, right? Well judge for yourself. Remember, for the linkage study, the authors were using two data samples, one from Autism Speaks (AGRE) and one from CHOP (ACC). When they did their linkage analysis, here’s what Hakonarson and colleagues report.
• “We did not observe genome-wide significant association (P < 5 × 10⁻⁸) to ASDs in the AGRE cohort, but we proposed that meaningful associations were contained within the lowest P values.”
• “We did not detect genome-wide significant association (P < 5 × 10⁻⁸) to ASDs in the ACC cohort either.”
• “Therefore we subsequently performed a combined analysis of these two independent data sets using recommended meta-analysis approaches…[and] one SNP located on 5p14.1 reached genome-wide significance…and five further SNPs at the same locus had P values below 1 × 10⁻⁴.”
If they had stopped after that, there would be no Nature article, since other research groups, most recently the Autism Genome Project Consortium (AGPC), have pooled similarly large samples and claimed the large sample revealed significant findings too (Lander and Kruglyak point out that if you run enough genome scans, the random number generator will generate a “significant” finding about 10% of the time). The AGPC study, published in 2007 with only a slightly smaller sample, reported a “significant” finding on chromosome 11p.11-12. Not surprisingly, however, the CHOP study doesn’t call attention to that previous effort. Also not surprising is the fact that the CHOP region of interest, 5p14.1, didn’t come up as significant at all in the AGPC analysis.
So if we’re just watching competing random number generators at work, why did Nature publish this new study? Because the CHOP group reported not one, but two additional replications of the initial significant finding.
• “To replicate our genome-wide significant association signals at the 5p14.1 locus, we examined the association statistics for these markers in a third independently generated and analyzed cohort…the association signals for all the aforementioned SNPs were replicated in this cohort with the same direction of association with P values ranging from 0.01 to 2.8 × 10⁻⁵.”
• “To seek further evidence of replication, we examined association statistics from a fourth independent cohort…[and] most of the SNPs were replicated (P < 0.05)…with the same direction of association.”
So that’s the deal. Never mind the fact that this is a completely new area, never reported by ANY previous genome scan, or that the largest alternative study, the AGPC, found nothing there, or that the “replication” P values were relaxed relative to the “discovery” P values. But give the group some credit for jumping over a hurdle that others have struggled to clear.
Only time will tell whether this new “common genetic variant” falls by the wayside as other purported autism genes have before. For now, let’s accept the idea that the CHOP group found something of interest.
Accepting that as a premise, the first question you might ask is simple: So what’s the gene? Well there’s not exactly a gene at the 5p14.1 “allele.” The significant region is not on a known part of any gene. Rather, it’s in between two genes, specifically the region between two genes previously unexplored in autism known as cadherin 9 and cadherin 10.
The next question you might ask is more practical: so what do the two genes (that we’re not quite looking at but simply looking between) do? Well here’s where the CHOP group gets excited, because the cadherin 10 gene (tests on #9 were “uninformative") shows up in a test of fetal brain tissue. And more to the point, it looks like it could even be a brain development gene. “The cadherins represent a large group of transmembrane proteins that are involved in cell adhesion and the generation of synaptic complexity in the developing brain.” Okay, so maybe there’s something there, but it’s hard to know exactly what.
But the most important question might slip right past you: indeed it took me several readings to figure this out. The crucial question is this: how important is this “common variant”? The importance ought to be high if we’re going to get excited. Interestingly, the CHOP authors don’t feature the answer to this question; they place their measure of “MAF” (for “minor allele frequency”) in acronym form right next to their more featured P value findings. But the results are straightforward to interpret if you read the fine print and do some simple arithmetic. The essence of the answer is this: normal children had the common variant 61% (1 - 0.39, the MAF) of the time, while the autistic children had the common variant 65% of the time.
That’s it. 65% vs 61%. Do the math. Even if CHOP is right and these results hold up, we’re talking about less than a 10% difference. That’s all.
As Kraft and Hunter pointed out in their NEJM essay, “a striking fact about these first findings is that they collectively explain only a very small proportion of the underlying genetic contribution to most studied diseases.” How small is small for them? “The great majority of the newly identified risk marker alleles confer very small relative risks, ranging from 1.1 to 1.5.”
So the CHOP finding, if it’s not a statistical fluke, is at the low end of the range for risk prediction.
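The arithmetic behind the 65% versus 61% comparison can be sketched as follows. The two frequencies are the rounded figures quoted above; the allelic odds ratio computed here is an illustration derived from those rounded values, not the effect estimate reported in the paper.

```python
def odds(p):
    """Convert a frequency p (0 < p < 1) into odds p / (1 - p)."""
    return p / (1.0 - p)


# Rounded frequencies of the common variant, as quoted in the text.
CASE_FREQ = 0.65
CONTROL_FREQ = 0.61  # 1 - 0.39, where 0.39 is the minor allele frequency

absolute_diff = CASE_FREQ - CONTROL_FREQ
relative_diff = CASE_FREQ / CONTROL_FREQ - 1.0
odds_ratio = odds(CASE_FREQ) / odds(CONTROL_FREQ)

print(f"absolute difference: {absolute_diff:.2%}")
print(f"relative difference: {relative_diff:.1%}")
print(f"allelic odds ratio:  {odds_ratio:.2f}")
```

On these rounded inputs the gap is four percentage points, under 7% in relative terms, with an odds ratio of roughly 1.19. That sits near the bottom of the 1.1 to 1.5 range that Kraft and Hunter describe.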
So much for common variants in autism.
4. Moving Beyond Faith to Real Biological Insight
A closer reading of the recent autism genetic literature reveals a kind of strategic retreat under way among geneticists. Rather than focusing on individual genes or gene regions, most authors are beginning to follow Joel Hirschhorn’s advice and focus on extracting insights from the long list of genes that have emerged as promising, if eventually failed, candidates for strong autism causation. The insight, these scientists argue, might not lie in the specific gene but rather in the specific biological processes in which a larger number of genes cluster. There’s a certain wisdom in that. I know that in my reading of the autism gene literature, these weak signals are the findings I store away personally.
One source I know has argued that what’s interesting in all these findings is that the relevant genes “cluster around the synapse.” That’s clearly what the CHOP authors argue and it’s certainly possible.
But it’s also possibly wishful thinking. There are lots of genes that get bandied about in autism gene papers, and some of them are active in the nervous system. They even have names (neurexins and neuroligins) that make them sound like they’re “brain genes.” And since the orthodox view of autism requires that the disorder is clearly genetic and placed neatly within the brain, it’s not at all surprising that these scientists would be inclined to that interpretation.
But there are also any number of alternative clustering hypotheses. Ubiquitin, one of the genes involved in the CNV paper, is important for glutathione metabolism and could affect detoxification capacity. The supposedly “neuronal” genes show up all over the body: the neuroligins and neurexins are expressed in the gut and kidneys too. The cadherins keep cells together and one scientist I know points out that “Individuals with comparatively weak physical connectivity between neurons will be most vulnerable to environmental factors.” The cadherins are also involved in maintaining the blood brain barrier, another vector by which genes and environment could interact to injure susceptible children.
In short, who’s to say that these findings “cluster around the synapse” or for that matter exclusively affect the brain? There are as many interpretations of these findings as there are biases. The frustrating thing is that, in the aggregate, the findings don’t cluster very much. And as for any near term benefit for today’s autism families and children, the key point is this:
a) the chances are overwhelming (over 99%) that none of these mutations (inherited or de novo) really affects your child, and
b) there’s very little chance that there’s much treatment insight coming any time soon in any common variant risk that might result from further study of the DNA between the cadherins.
In the end, we’ll all need to accept the coming hoopla on this study as yet another fleeting flail of a failing disease model, a “big hungry lie” that needs new studies to stay alive. It’s not surprising to see CHOP in the middle of it all, nor is it at all surprising to see larger doctrinal debates in scientific research playing themselves out with our children as guinea pigs. It’s just a shame to see all this money and attention placed on such flimsy foundations. There are so many more productive ways to spend that money and so little time for the children.
Mark Blaxill is Editor at Large for Age of Autism.
1. Glessner JT, et al. Autism genome-wide copy number variation reveals ubiquitin and neuronal genes. Nature. 2009 (in press).
2. Wang K, et al. Common genetic variants on 5p14.1 associate with autism spectrum disorders. Nature. 2009 (in press).
3. Sebat J, et al. Strong association of de novo copy number mutations with autism. Science. 2007;316(5823):445-9.
4. Weiss LA, et al. Association between microdeletion and microduplication at 16p11.2 and autism. N Engl J Med. 2008;358(7):667-75.
5. Goldstein DB. Common genetic variation and human traits. N Engl J Med. 2009;360(17):1696-8.
6. Kraft P, Hunter DJ. Genetic risk prediction--are we there yet? N Engl J Med. 2009;360(17):1701-3.
7. Hirschhorn JN. Genomewide association studies--illuminating biologic pathways. N Engl J Med. 2009;360(17):1699-701.
8. Lander E, Kruglyak L. Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat Genet. 1995;11(3):241-7.