Some of the tests look for missing snippets of chromosomes. For every 15 times they correctly find a problem, they are wrong 85 times.
By Sarah Kliff and Aatish Bhatia
After a year of fertility treatments, Yael Geller was thrilled when she found out she was pregnant in November 2020. Following a normal ultrasound, she was confident enough to tell her 3-year-old son his “brother or sister” was in her belly.
But a few weeks later, as she was driving her son home from school, her doctor’s office called. A prenatal blood test indicated her fetus might be missing part of a chromosome, which could lead to serious ailments and mental illness.
Sitting on the couch that evening with her husband, she cried as she explained they might be facing a decision on terminating the pregnancy. He sat quietly with the news. “How is this happening to me?” Ms. Geller, 32, recalled thinking.
The next day, doctors used a long, painful needle to retrieve a small piece of her placenta. It was tested and showed the initial result was wrong. She now has a 6-month-old, Emmanuel, who shows no signs of the condition he screened positive for.
Ms. Geller had been misled by a wondrous promise that Silicon Valley technology has made to expectant mothers: that a few vials of their blood, drawn in the first trimester, can allow companies to detect serious developmental problems in the DNA of the fetus with remarkable accuracy.
In just over a decade, the tests have gone from laboratory experiments to an industry that serves more than a third of the pregnant women in America, luring major companies like Labcorp and Quest Diagnostics into the business, alongside many start-ups.
The tests initially looked for Down syndrome and worked very well. But as manufacturers tried to outsell each other, they began offering additional screenings for increasingly rare conditions.
The grave predictions made by those newer tests are usually wrong, an examination by The New York Times has found.
That includes the screening that came back positive for Ms. Geller, which looks for Prader-Willi syndrome, a condition that offers little chance of living independently as an adult. Studies have found its positive results are incorrect more than 90 percent of the time.
Nonetheless, on product brochures and test result sheets, companies describe the tests to pregnant women and their doctors as near certain. They advertise their findings as “reliable” and “highly accurate,” offering “total confidence” and “peace of mind” for patients who want to know as much as possible.
Some of the companies offer tests without publishing any data on how well they perform, or point to numbers for their best screenings while leaving out weaker ones. Others base their claims on studies in which only one or two pregnancies actually had the condition in question.
This isn’t the first time Silicon Valley technology has been used to build a business around blood tests. Years before the first prenatal testing company opened, another start-up, Theranos, made claims that it could run more than a thousand tests on a tiny blood sample, before it collapsed amid allegations of fraud.
In contrast with Theranos, the science behind these companies’ ability to test blood for common disorders is not in question. Experts say it has revolutionized Down syndrome screening, significantly reducing the need for riskier tests.
However, the same technology — known as noninvasive prenatal testing, or NIPT — performs much worse when it looks for less common conditions. Most are caused by small missing pieces of chromosomes called microdeletions. Others stem from missing or extra copies of entire chromosomes. They can have a wide range of symptoms, including intellectual disability, heart defects, a shortened life span or a high infant mortality rate.
Not every patient is screened for every condition; doctors decide what to order, and most companies sell microdeletion testing as an optional add-on to the Down screening. Most test makers don’t say how often their microdeletion tests are being performed.
But it is clear some of the tests are in widespread use. One large test maker, Natera, said that in 2020 it performed more than 400,000 screenings for one microdeletion — the equivalent of testing roughly 10 percent of pregnant women in America.
To evaluate the newer tests, The Times interviewed researchers and then combined data from multiple studies to produce the best estimates available of how well the five most common microdeletion tests perform.
The analysis showed that positive results on those tests are incorrect about 85 percent of the time.
[Chart: how often positive results are wrong for the five most common microdeletion screenings, listed by how common each condition is]
Affects 1 in 4,000 births: can cause heart defects and delayed language acquisition. (May appear on lab reports as “22q.”)
Affects 1 in 5,000 births: can cause seizures, low muscle tone and intellectual disability.
Affects 1 in 15,000 births: can cause difficulty walking and delayed speech development.
Affects 1 in 20,000 births: can cause seizures, growth delays and intellectual disability.
Affects 1 in 20,000 births: can cause seizures and an inability to control food consumption.
Testing companies currently offer seven microdeletion screenings. But two syndromes — Langer-Giedion and Jacobsen — are so rare that there is not enough data to understand how well the tests work. A few other tests for conditions that are not caused by microdeletions are also widely offered, with varying degrees of reliability. The screenings for Patau syndrome (which often appears on lab reports as “trisomy 13”) and Turner syndrome (“monosomy X”) also generate a large percentage of incorrect positives, while the screenings for Down syndrome (“trisomy 21”) and Edwards syndrome (“trisomy 18”) work well, according to experts.
Sources: Figures are pooled from multiple studies by diagnostic labs (Labcorp, Baylor Genetics, Combimatrix) and by Natera (2021, 2017, 2017, 2014). The estimate for Wolf-Hirschhorn syndrome is based on limited data (one true positive and six false positives).
Experts say there is no single threshold for how often a test needs to get positive results right to be worth offering. They note that when the tests do accurately identify an abnormality, it can give expectant parents time to learn about and prepare for challenges to come. Some said one common microdeletion screening, for a condition called DiGeorge syndrome, has the most potential to do good.
But there are hundreds of microdeletion syndromes, and the most expansive tests look for between five and seven, meaning women shouldn’t take a negative result as proof their baby doesn’t have a genetic disorder. For patients who are especially worried, obstetricians who study these screenings currently recommend other types of testing, which come with a small risk of miscarriage but are more reliable.
Some said the blood screenings that look for the rarest conditions are good for little more than bolstering testing companies’ bottom lines.
“It’s a little like running mammograms on kids,” said Mary Norton, an obstetrician and geneticist at the University of California, San Francisco. “The chance of breast cancer is so low, so why are you doing it? I think it’s purely a marketing thing.”
There are few restrictions on what test makers can offer. The Food and Drug Administration often requires evaluations of how frequently other consequential medical tests are right and whether shortfalls are clearly explained to patients and doctors. But the F.D.A. does not regulate this type of test.
Alberto Gutierrez, the former director of the F.D.A. office that oversees many medical tests, reviewed marketing materials from three testing companies and described them as “problematic.”
“I think the information they provide is misleading,” he said.
Patients who receive a positive result are supposed to pursue follow-up testing, which often requires a drawing of amniotic fluid or a sample of placental tissue. Those tests can cost thousands of dollars, come with a small risk of miscarriage and can’t be performed until later in pregnancy — in some states, past the point where abortions are legal.
The companies have known for years that the follow-up testing doesn’t always happen. A 2014 study found that 6 percent of patients who screened positive obtained an abortion without getting another test to confirm the result. That same year The Boston Globe quoted a doctor describing three terminations following unconfirmed positive results.
Three geneticists recounted more recent examples in interviews with The Times. One described a case in which the follow-up testing revealed the fetus was healthy. But by the time the results came, the patient had already ended her pregnancy.
After being presented with some of The Times’s reporting, half a dozen of the largest prenatal testing companies declined interview requests. They issued written statements that said patients should always review results with a doctor, and cautioned that the tests are meant not to diagnose a condition but rather to identify high-risk patients in need of additional testing.
In interviews, 14 patients who got false positives said the experience was agonizing. They recalled frantically researching conditions they’d never heard of, followed by sleepless nights and days hiding their bulging bellies from friends. Eight said they never received any information about the possibility of a false positive, and five recalled that their doctor treated the test results as definitive.
When Meredith Bannon’s pregnancy tested positive for DiGeorge syndrome, a nurse called and told her she and her husband would soon face “tough decisions” related to their child’s “quality of life,” which Ms. Bannon took to mean a choice about whether to end the pregnancy.
The call came as Ms. Bannon was driving to her parents’ house, with her son in the back seat wearing a “big brother” T-shirt. “I was coming home to tell them that I was pregnant, but instead I had to tell them the news I got this horrible result back,” Ms. Bannon recalled.
Further testing revealed that the result was wrong. Her baby is due in April.
Some women began tentatively planning abortions after receiving positive screenings.
“I couldn’t help but have termination on my mind,” said Allison Mihalich, 33, whose screening incorrectly indicated her baby might have Turner syndrome, which can cause infertility and heart defects. (Studies show that the test’s positive results are wrong 74 percent of the time.) She lived in Indiana at the time and recalled scrambling to arrange follow-up testing before the state’s 22-week abortion ban.
A big market for rare conditions
Between 2011 and 2013, a small California-based biotech company, Sequenom, tripled in size. The key to its success: MaterniT21, a new prenatal screening test that did remarkably well at detecting Down syndrome.
Older screening tests took months and required multiple blood tests. This new one generated fewer false positives with a single blood draw.
The test could also determine the sex of a fetus. It quickly became a hit. “You had people walking in saying, ‘I want this sex test,’” recalled Dr. Anjali Kaimal, a maternal-fetal medicine specialist at Massachusetts General Hospital.
Competitors began launching their own tests. Today, analyst estimates of the market’s size range from $600 million into the billions, and the number of women taking these tests is expected to double by 2025.
As companies began looking for ways to differentiate their products, many decided to start screening for more and rarer disorders. All the screenings could run on the same blood draw, and doctors already order many tests during short prenatal care visits, meaning some probably thought little of tacking on a few more.
For the testing company, however, adding microdeletions can double what an insurer pays — from an average of $695 for the basic tests to $1,349 for the expanded panel, according to the health data company Concert Genetics. (Patients whose insurance didn’t fully cover the tests describe being billed wildly different figures, ranging from a few hundred to thousands of dollars.)
But these conditions were so rare that there were few instances for the tests to find.
Take Natera, which ran 400,000 tests in 2020 for DiGeorge syndrome, a disorder associated with heart defects and intellectual disability.
The 400,000 tests would be expected to identify about 200 actual cases of the disorder.
In a recent study, Natera said that its latest algorithm would identify about an equal number of false positives.
But that same study also included the results from when the tests were actually taken. Those numbers suggest there would be three times as many false positives as actual cases.
At least six percent of the tests include the full panel of microdeletions. Those would be expected to find about eight true positives and between 17 and 134 false ones.
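The scenarios above translate into very different odds for a patient holding a positive result. A rough sketch in Python, using only the counts given in the text: the helper name `ppv` is hypothetical, and "about an equal number" of false positives is taken to mean exactly 200 for illustration.

```python
def ppv(true_pos, false_pos):
    """Positive predictive value: share of positive results that are correct."""
    return true_pos / (true_pos + false_pos)

# Expected real DiGeorge cases identified among 400,000 tests (from the text)
true_cases = 200

# Natera's latest algorithm: about as many false positives as real cases
# (assumption: "about an equal number" is treated as exactly 200)
print(f"latest algorithm: {ppv(true_cases, true_cases):.0%} of positives correct")

# Results from when the tests were actually taken: three times as many false positives
print(f"original results: {ppv(true_cases, 3 * true_cases):.0%} of positives correct")

# Full microdeletion panel: about 8 true positives and 17 to 134 false ones
low, high = ppv(8, 134), ppv(8, 17)
print(f"full panel: {low:.0%} to {high:.0%} of positives correct")
```

Under these assumptions, a positive DiGeorge result would be correct about half the time with the latest algorithm and about a quarter of the time on the original results, while a positive on the rest of the panel would be correct somewhere between roughly 6 percent and 32 percent of the time.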
Natera declined an interview request after The Times presented its reporting. In statements, it said that the early detection of DiGeorge syndrome can “profoundly improve” patient outcomes and stressed how infrequently it identifies some of the other conditions. (It said the screening that gave a false positive for Prader-Willi syndrome in Ms. Geller’s pregnancy, for example, had returned positive results only 113 times since 2015.) It pointed to its recent study of 20,000 pregnant women that found DiGeorge syndrome occurs in 1 in 1,600 births — twice as common as other estimates.
The company offers free genetic counseling to patients who screen positive. Natera also publishes data on how often its positive results are right and includes that information on patient results sheets.
Other companies release little information about how many tests they sell, and far less research on how well their screenings work.
Myriad Genetics’s prenatal test, Prequel, offers five microdeletion screenings, even though its study on the test includes just two confirmed cases of microdeletions.
In a statement, Myriad estimated that only one in 9,000 of its patients screens positive for a microdeletion. It said its data showed a “very small fraction” of those are wrong, but declined to provide specific figures.
Some companies test for conditions so rare that there are few known examples for comparison.
Both Labcorp, which purchased Sequenom, and Myriad Genetics offer screenings for one disorder that is so rare its prevalence is unknown, and another, called Jacobsen syndrome, that affects 1 in 100,000 births.
Dr. Diana Bianchi runs a National Institutes of Health laboratory studying prenatal blood screenings. She said of Jacobsen syndrome, “I’ve never seen a case of that in my 20-plus years of practicing genetics.”
Here’s why a test that works well for Down syndrome can be much less useful for rarer conditions.
If 20,000 women take a test of the same quality as the better prenatal blood screenings, there would be about 20 false positives.
And if the test is screening pregnant women in their late 30s for Down syndrome, it would identify about 100 real cases.
DiGeorge syndrome is 20 times as rare. An equally good test would get a similar number of false positives. But it would find only five actual cases.
And Prader-Willi syndrome is even more rare. That test would be expected to find one case.
The positive results would be wrong around 95 percent of the time.
Estimates are based on a test with 99.9% sensitivity and specificity.
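The arithmetic behind these figures can be sketched in a few lines of Python. This is a minimal illustration, not the method The Times used: the function name `screen` is hypothetical, and the prevalences (1 in 200, 1 in 4,000 and 1 in 20,000) are assumptions back-calculated from the case counts above, with 99.9 percent sensitivity and specificity as stated.

```python
def screen(n_tested, prevalence, sensitivity=0.999, specificity=0.999):
    """Return expected (true positives, false positives, share of positives wrong)."""
    affected = n_tested * prevalence
    unaffected = n_tested - affected
    true_pos = affected * sensitivity            # real cases the test flags
    false_pos = unaffected * (1 - specificity)   # healthy pregnancies wrongly flagged
    wrong_share = false_pos / (true_pos + false_pos)
    return true_pos, false_pos, wrong_share

# Prevalences are assumptions inferred from the case counts in the text
conditions = {
    "Down syndrome (late 30s)": 1 / 200,
    "DiGeorge syndrome": 1 / 4_000,
    "Prader-Willi syndrome": 1 / 20_000,
}

for name, prevalence in conditions.items():
    tp, fp, wrong = screen(20_000, prevalence)
    print(f"{name}: {tp:.0f} true, {fp:.0f} false, {wrong:.0%} of positives wrong")
```

The number of false positives stays near 20 in every case because it depends almost entirely on how many unaffected pregnancies are tested. Only the number of true positives shrinks as the condition gets rarer, so under these assumptions the share of wrong positives climbs from about 17 percent for Down syndrome to about 80 percent for DiGeorge syndrome and about 95 percent for Prader-Willi syndrome.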
‘Total confidence in every result’
Those shortfalls are rarely referenced when companies explain the tests to doctors and patients.
A Labcorp MaterniT21 lab report tells patients the test “detected” a problem, even though most studies show positives on that screening are usually wrong. Myriad Genetics advertised “total confidence in every result” on its prenatal testing website but said nothing about how often false positives can occur.
After The Times inquired about these tests, Myriad took down that language.
The Times reviewed 17 patient and doctor brochures from eight of the testing companies, including Natera, Labcorp, Quest and smaller competitors. Ten of the brochures never mention that a false positive can happen. Only one mentioned how often each test gets positive results wrong.
[Image: examples of positive result sheets. Tests for these conditions usually get positives wrong. A footnote defines “high probability” as “1% or greater.”]
Genetic counselors who have dealt with false positives say some doctors may not understand how poorly the tests work. And even when caregivers do correctly interpret the information, patients may still be inclined to believe the confident-sounding results sheets.
When Cloey Canida, 25, got a positive result from Roche’s Harmony test in September, the result sheet seemed clear: It said her daughter had a “greater than 99/100” probability of being born with Patau syndrome, a condition that babies often do not survive beyond a week.
Her obstetrician tried to reassure her, citing independent data showing that for a woman her age, 93 percent of positives turn out to be wrong.
But Ms. Canida couldn’t stop thinking about the result sheet. She recalls crying during an ultrasound, thinking it was one of the few times she’d see her child moving.
After spending $1,200 on follow-up tests, she learned that her pregnancy was healthy, and that her daughter would not be born with Patau syndrome. She is now in her third trimester.
“I wish that we would have been informed of the false positive rate before I agreed to the test,” she said. “I was given zero information about that.”
Roche, which recently sold the Harmony test to another company, said in a statement that “all women should discuss their results with their health care provider” before making any decisions based on screening results.
Three experts reviewed marketing materials and results sheets for The Times and identified obvious reasons a patient would be confused.
“These numbers are meaningless,” said Mr. Gutierrez, the former F.D.A. official, after reviewing an advertisement for the Quest Diagnostics QNatal Advanced Test.
The test is advertised as getting positive microdeletion results right 75 percent of the time. But that figure comes from a single study that included nine confirmed cases of microdeletions, for a test that screens for seven such disorders. The company doesn’t specify how the tests perform individually, and declined to provide that data. (In a statement, Quest said its test has “excellent performance.”)
The F.D.A. considered regulating these tests a decade ago, but backed away. If the agency had oversight, Mr. Gutierrez said, Quest would be required to publish a brochure, but “it would not look like this.”
Nonetheless, companies are charging ahead, viewing microdeletions as a major business opportunity — especially if they can persuade more doctors to order them and more insurers to cover them.
Myriad Genetics, which owns the Prequel test, has told investors that it plans to start a “next-generation” microdeletion screening in 2022, and that it will lobby the professional society for obstetricians to begin recommending the test to its members.
Natera has performed more than two million screenings for Down syndrome since 2013. It went public in 2015, and the value of its stock has grown to $8.8 billion.
With its expanded panel of screenings, the company sees more growth ahead. “This is a really significant moment for the microdeletions business,” the company’s chief executive, Steve Chapman, said at an investor conference last January.
The company’s 2020 revenues were $391 million, and it projected its 2021 revenues to exceed $615 million. But if more insurers begin paying for microdeletion tests, Mr. Chapman said, the potential is “enormous” — it could bring in up to another $300 million every year.
Kitty Bennett contributed research.
About the analysis
To estimate the performance of microdeletion screening tests, The Times interviewed genetic counselors and experts on medical testing and prenatal care, then searched for peer-reviewed studies of screenings by U.S.-based labs that included follow-up diagnostic testing. Six studies met these criteria: three from diagnostic testing labs, and three studies funded by one of the test makers, Natera. An additional 2021 report by Natera was added as it included results from a recent clinical trial of its microdeletion test. (An eighth study, published in 2015, was excluded because experts identified multiple problems with its methodology.) Reporters then combined the data from these studies and estimated the tests’ overall positive predictive value to be 15 percent. Two researchers reviewed the resulting analysis.
Three of the four Natera studies include projected performance numbers that are based on re-analyzing the blood samples they collected with a modified version of the original test, a practice that can help improve results. At times, the company could not replicate those projections in subsequent studies. To be conservative, The Times used Natera’s higher projected numbers in its estimates; using the initial data instead would decrease the calculated positive predictive value from 15 percent to 12 percent.