By now you may have heard of the study out of Stanford which estimates that infection rates are much higher than previously believed and, consequently, that the mortality rate is much lower. This is the type of study that we need to see more of, and I don’t want to denigrate the work done. However, there are a number of reasons to be cautious about taking its conclusions at face value.
- This study has not yet been peer reviewed. We’re seeing a lot of these given the urgent circumstances, but studies often go through revisions during the process of peer review. As a result, it’s best to take the findings as provisional.
- The test that was used is not FDA-approved. Here again, urgency is at play. Tests are being allowed without the usual strict guidelines to ensure that they are accurate and consistent. I don’t know of any issues with this specific test, but until it has been thoroughly vetted, we should be cautious about using it to draw broad conclusions.
- The sample was not random. This is noted in the paper. The authors did make an effort to apply weights to adjust for demographic bias based on zip code, race, and sex, but other biases would remain and are unknown.
- As the authors note, the conclusions apply only to Santa Clara County and may not generalize to areas that differ significantly from it. The fact that they estimate that infections are 50 to 80 times higher than reported cases in Santa Clara is not a good reason to conclude that they are 50 to 80 times higher across the country.
These are pretty standard weaknesses in studies and represent reasons why we should generally be cautious of single studies with surprising results. The authors bring up most of them in their discussion. But media reports tend to ignore the caveats and focus on the attention-grabbing bits that drive headlines.
Beyond these, there are two reasons to be specifically cautious of this study which are not addressed in the study itself.
- In estimating mortality rates, the authors start with the number and growth rate of reported deaths and then project the growth out three weeks to account for the resolution of current infections. This technique tacitly assumes that the reported deaths are accurate and that deaths will continue to grow at the same rate. Both of these assumptions are problematic. First, it is almost certain that many deaths were unreported. This is a widespread problem and has a large impact on the fatality rate. Second, the growth rate of deaths over the three-week period after the study depends not on the prior growth in deaths but on the prior growth in infections. Given that the central conclusion of the study is that infections have grown far faster than reported, it is unreasonable to project a fatality rate based solely on past growth of reported deaths. The latter worry can be mitigated by simply waiting three weeks to see how many deaths end up being reported. However, this leaves us fully dependent on the assumption that reported deaths are the same as actual deaths. Given the low number of deaths (the authors base their fatality rate estimate on 100 deaths), even small numbers of unreported deaths can have large impacts on the mortality rate.
- The biggest error in the study, and this is an error, not merely a worry, is the inference from positive tests in a sample to actual infections in a population. The study measured the percentage of positive tests in a sample. From this the authors generalize to the population of the county. The methods they use entitle them to draw conclusions about the number of people who would test positive in the population. However, they actually draw conclusions about the number of infections in the population, which is not the same thing. When overall infection rates are low, as they are even in this study, even a test with high specificity can show a lot of false positives. In order to correct for this, we apply Bayes’ theorem to determine the chance that someone who has tested positive has actually been infected. Bayes’ theorem calculates the probability that someone who tests positive actually has the antibodies, given information about the general prevalence of antibodies and the specificity and sensitivity of the test. Of course we have to estimate these, and the answer will depend on our choices. The authors recognize the need to estimate sensitivity and specificity and provide analyses based on different estimates. However, they don’t take into account the need to adjust their conclusions based on the general prevalence of the antibodies in the population. Here’s an example of how this might be done, simplified because I’m not writing an academic paper here.
Let’s start with the background assumption (called a “prior”) that 1.5% of the population has developed antibodies and that the test has a sensitivity of 80.3% and a specificity of 99.5% (these are the estimates from the authors’ third scenario). Inputting these numbers into Bayes’ theorem tells us that the chance that someone has the antibodies, given that they test positive, is about 71%. So, if the sample indicates that we should expect a positive test rate in the population of 2.75%, then we should conclude that about 1.95% of the population (71% of 2.75%) actually has antibodies.
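For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation above. The function and variable names are my own, chosen for illustration, and the inputs are just the numbers from the example (1.5% prior, 80.3% sensitivity, 99.5% specificity, 2.75% positive rate). It is the same simplified approach, not the authors’ method.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Bayes' theorem: P(has antibodies | tests positive)."""
    # Share of the population that has antibodies AND tests positive.
    true_positives = sensitivity * prevalence
    # Share of the population that lacks antibodies but tests positive anyway.
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Numbers from the example in the text (the prior is an assumption).
prior = 0.015
sensitivity = 0.803
specificity = 0.995
positive_rate = 0.0275

ppv = positive_predictive_value(prior, sensitivity, specificity)
adjusted_prevalence = positive_rate * ppv

print(f"Chance a positive test is a true positive: {ppv:.1%}")
print(f"Adjusted share with antibodies: {adjusted_prevalence:.2%}")

# The result depends heavily on the prior, so try a few alternatives.
for alt_prior in (0.005, 0.015, 0.03):
    alt_ppv = positive_predictive_value(alt_prior, sensitivity, specificity)
    print(f"prior {alt_prior:.1%} -> true-positive chance {alt_ppv:.1%}")
```

The loop at the end is there to show how sensitive the result is to the prior: the lower the assumed prevalence, the larger the share of positive tests that are false positives.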
The combined effect of the two problems above is that the infection and mortality estimates are likely to be considerably wrong. If the infection rate is only 71% of the estimate, as in the example I gave, and the real number of deaths is twice what was estimated, then the infection fatality rate would be nearly 3 times as high as the study estimated (2 divided by 0.71 is about 2.8). These are big differences.
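The way the two corrections compound can be sketched in a couple of lines of Python. The helper name is hypothetical, and both inputs are the illustrative figures from this post (infections at 71% of the estimate, deaths at twice the estimate), not measured values.

```python
def ifr_multiplier(infection_fraction, death_multiplier):
    """Factor by which the infection fatality rate (deaths / infections) changes.

    infection_fraction: true infections as a fraction of the estimated infections.
    death_multiplier: true deaths as a multiple of the estimated deaths.
    """
    return death_multiplier / infection_fraction


# Illustrative inputs from the text, not measured values.
multiplier = ifr_multiplier(infection_fraction=0.71, death_multiplier=2.0)
print(f"Fatality rate would be about {multiplier:.1f} times the study's estimate")
```

Because the fatality rate is a ratio, an overestimate in the denominator and an underestimate in the numerator multiply together rather than add.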
So, what’s the upshot?
Infection rates are undoubtedly many times higher than the reported case rates. This study adds to the weight of evidence for this. The number of deaths is also undoubtedly a lot higher than reported. It seems likely that infections are underreported by a larger percentage than deaths are, so that the infection fatality rate is much lower than the 5.3% fatality rate among reported cases that the tracker currently shows for the US. However, until we actually have widespread testing using tests that are of known quality, we won’t actually have a clear idea of what the infection rate, or the infection fatality rate, really is. This study is suggestive, but not much more. At least that’s my read of it.