CORONAVIRUS: COMPUTING THE DEATH TOLL

On March 13, the New York Times reported that the U.S. Centers for Disease Control and Prevention (CDC) had prepared four scenarios of the potential medical impact of COVID-19 (details of which have not been fully published), indicating that anywhere from 200,000 to 1.7 million U.S. residents could die over the duration of the pandemic. This could also involve hospitalization of anywhere from 2.4 million to 21 million people. The hospitalization scenario is particularly concerning to the CDC and the public, considering that the U.S. has only a little more than 924,000 staffed beds.

What the CDC has released for publication are age-specific hospitalization, intensive care (ICU) and mortality rates of the U.S. population based on experience from February 12 to March 16 of this year. While this initial sample is relatively small (at 2,449 cases), it begins to provide a window into potential case-fatality, estimated to range from 1.8% to 3.4% of all persons infected.

The combination of these two studies makes it possible to assess potential mortality scenarios relative to more normalized patterns of age-specific deaths in the U.S. That is the purpose of this blog.

We walk through this analysis in several steps:

  • First, considering age-specific mortality for the U.S. in a typical year — in this case 2018 as the most recent year with data available from the National Center for Health Statistics (NCHS).

  • Second, looking at what is publicly known, so far, about the CDC scenarios of potential U.S. deaths from the pandemic.

  • Third, combining the CDC scenarios with recent case-fatality data as generated by the CDC to estimate the percentage distribution of deaths by age.

  • Fourth, bringing all of these datasets together to see how COVID-19 mortality projections compare with underlying existing (or normalized) death rates for the U.S.

As will be evident from the data presented, there is still a considerable range of estimates for many of the variables discussed. With more case experience, it should also become possible to refine and tighten the range of estimates.

Consequently, this blog may be updated to reflect new information as it becomes available. Questions and comments about the methodology used with this somewhat simplified but potentially informative review are also appreciated.

Age-Specific Mortality

This discussion begins with a review of age-specific mortality rates for the U.S. population (age 15+) as of 2018. As illustrated by the following chart, the annual death rate per 100,000 population ranges from just over 70 deaths per 100,000 persons age 15-24 to about 13,450 per 100,000 who are age 85 and over.

Age-Specific Mortality (2018 Table).png

This data serves as a baseline for the normalized pattern of mortality - independent of the effects of a major pandemic such as coronavirus.
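For readers who want to reproduce this kind of rate, a death rate per 100,000 is simply deaths divided by population, scaled by 100,000. The following is a minimal Python sketch; the function name and the sample inputs are hypothetical and are not NCHS figures.

```python
def death_rate_per_100k(deaths: float, population: float) -> float:
    """Annual deaths per 100,000 persons in one age group."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * 100_000


# Hypothetical inputs, chosen only to illustrate the arithmetic.
example_rate = death_rate_per_100k(deaths=700, population=1_000_000)
print(f"{example_rate:.1f} deaths per 100,000")
```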

CDC Mortality Scenarios

As noted, the New York Times on March 13 headlined an article as: “Worst-Case Estimates for U.S. Coronavirus Deaths.” While details of this CDC-sponsored teleconference with 50 experts from around the world do not appear to have been publicly released, the Times reports that it obtained screenshots of the CDC presentation from someone not involved in the meetings. The newspaper then verified the data with scientists who did participate.

The discussion aimed to address the question of how many people might be infected, need hospitalization, and/or die as the virus takes hold in the U.S. As reported by the Times:

One of the agency’s top disease modelers, Matthew Biggerstaff, presented the group on the phone call with four possible scenarios — A, B, C and D — based on characteristics of the virus, including estimates of how transmissible it is and the severity of the illness it can cause. The assumptions, reviewed by The New York Times, were shared with about 50 expert teams to model how the virus could tear through the population — and what might stop it.

While details were not provided for all four scenarios, the March 13 article brackets the range of scenarios with low and high estimates for an epidemic lasting for months or even over a year:

  • The low estimate indicates that 160 million could be infected with 200,000 deaths.

  • The high estimate involves up to 214 million U.S. residents infected with as many as 1.7 million deaths (essentially with a much higher mortality rate for those infected).
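The mortality rate implied by each bracket can be checked by dividing deaths by infections. The sketch below uses only the four totals reported above; the dictionary layout and function name are illustrative choices, not part of the CDC material.

```python
# Totals as reported in the March 13 New York Times article.
scenarios = {
    "low": {"infected": 160_000_000, "deaths": 200_000},
    "high": {"infected": 214_000_000, "deaths": 1_700_000},
}


def implied_fatality_ratio(deaths: int, infected: int) -> float:
    """Deaths as a fraction of all persons infected in a scenario."""
    return deaths / infected


ratios = {
    name: implied_fatality_ratio(s["deaths"], s["infected"])
    for name, s in scenarios.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3%} of those infected")
```

On these totals, the high estimate implies a fatality ratio roughly six times that of the low estimate.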

It would be useful to have more detail regarding all four scenarios and the assumptions that stand behind each alternative projection. However, even with these summary numbers, it is possible, on a preliminary basis, to model the age-specific implications of these low-to-high mortality estimates.

CDC Age-Specific Case-Fatality Analysis

Subsequent to CDC’s base mortality scenarios, the agency has released age-specific hospital, ICU admission and case-fatality information — covering U.S. cases over the period of February 12 to March 16. As detailed by the following chart, the resulting database comprises 2,449 cases, disaggregated by age group and providing range estimates for each of these three indicators of medical need and result:

  • The lower bound of the range is estimated by CDC using all cases within each age group as the denominator for calculating each incidence rate.

  • The upper bound is estimated by using only cases with known information on each outcome as the denominator.
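The two denominators described above can be sketched in a few lines of Python. The function name and the sample counts are hypothetical and are not taken from the CDC database; the point is only that a smaller denominator (cases with known outcome) yields the higher rate.

```python
def rate_bounds(outcome_count: int, total_cases: int, known_cases: int):
    """Return (lower, upper) rate bounds for one outcome in one age group.

    lower: outcome count divided by all cases in the age group
    upper: outcome count divided by cases with known outcome information
    """
    if not 0 < known_cases <= total_cases:
        raise ValueError("known_cases must be between 1 and total_cases")
    lower = outcome_count / total_cases
    upper = outcome_count / known_cases
    return lower, upper


# Hypothetical counts, for illustration only.
lower, upper = rate_bounds(outcome_count=10, total_cases=500, known_cases=250)
print(f"range: {lower:.1%} to {upper:.1%}")
```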

CDC Case-Fatality Rates for COVID-19.png

The columns under the blue banners are directly from the CDC database. The last three columns under the red banner involve supplemental calculations by E. D. Hovee:

  • The first red column rate by age reflects the mid-point between the low and high estimates by CDC.

  • The second column provides an imputed estimate of deaths in the CDC database (not directly stated by CDC but calculated as the number of cases multiplied by the mid-range case-fatality rate).

  • The third column provides a distribution of the number of age-specific deaths as a percent of the total (and is applied in the next and final step of the analysis, which now follows).
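The three supplemental columns can be sketched as follows. This assumes that imputed deaths are the case count multiplied by the mid-range case-fatality rate; the field names and sample values are hypothetical placeholders, not the CDC figures.

```python
def supplemental_columns(groups):
    """Add mid-point rate, imputed deaths and share of deaths per age group.

    groups: list of dicts with keys "age", "cases", "cfr_low", "cfr_high"
    (rates expressed as fractions, e.g. 0.03 for 3%).
    """
    rows = []
    for g in groups:
        cfr_mid = (g["cfr_low"] + g["cfr_high"]) / 2
        rows.append({
            "age": g["age"],
            "cfr_mid": cfr_mid,
            # Assumption: imputed deaths = cases x mid-range rate.
            "imputed_deaths": g["cases"] * cfr_mid,
        })
    total_deaths = sum(r["imputed_deaths"] for r in rows)
    for r in rows:
        r["share_of_deaths"] = (
            r["imputed_deaths"] / total_deaths if total_deaths else 0.0
        )
    return rows


# Hypothetical placeholder values, for illustration only.
sample = [
    {"age": "younger", "cases": 1000, "cfr_low": 0.001, "cfr_high": 0.003},
    {"age": "older", "cases": 500, "cfr_low": 0.03, "cfr_high": 0.05},
]
table = supplemental_columns(sample)
for row in table:
    print(row["age"], f"{row['cfr_mid']:.2%}", f"{row['share_of_deaths']:.1%}")
```

The shares sum to 100% by construction, which is what allows them to be applied to any scenario total in the next step.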

U.S. Deaths by Age & Coronavirus Scenario

This final step combines the baseline historical mortality data with the death rate scenarios as consistent with the CDC datasets. The following table provides a summary of the results of these calculations:

  • The first column provides the age group categories by which the data has been compiled - with the 0-14 age group excluded because it is not shown as part of the NCHS dataset and because no fatalities are indicated for it in the CDC coronavirus dataset available to date.

  • The second column shows the number of deaths as of 2018 from the NCHS dataset, as the baseline mortality that might be expected in the absence of COVID-19.

  • The third column adds in the low estimate of coronavirus deaths totaling the control total of 200,000 deaths, assuming that there is no overlap of coronavirus deaths with baseline mortality (a topic described further below).

  • The fourth column adds the further increment of high estimate coronavirus deaths, assuming a hypothetical 50% overlap between baseline mortality and coronavirus deaths (also described below).

  • The fifth column indicates the cumulative total of existing baseline + low estimate + high estimate added coronavirus related deaths.

  • The final column calculates the ratio of cumulative deaths divided by existing baseline mortality.

EDH Mortality by Age & CDC Scenario.png
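The column logic described above can be sketched in Python for a single age group. This is one reading that appears consistent with the percentages reported in this post, not a confirmed reproduction of the spreadsheet: it assumes the high estimate is reduced by the overlap percentage and that the fourth column is the net high estimate minus the low estimate already added. The function name and sample inputs are hypothetical.

```python
LOW_TOTAL = 200_000      # low-estimate deaths (New York Times, March 13)
HIGH_TOTAL = 1_700_000   # high-estimate deaths (New York Times, March 13)
HIGH_OVERLAP = 0.5       # hypothetical overlap with baseline mortality


def scenario_row(baseline_deaths, share_of_covid_deaths, overlap=HIGH_OVERLAP):
    """Build the table columns for one age group.

    share_of_covid_deaths: the age group's fraction of all coronavirus deaths.
    """
    low_added = share_of_covid_deaths * LOW_TOTAL            # no overlap assumed
    high_net = share_of_covid_deaths * HIGH_TOTAL * (1 - overlap)
    high_increment = high_net - low_added                    # added beyond low
    cumulative = baseline_deaths + low_added + high_increment
    return {
        "baseline": baseline_deaths,
        "low_added": low_added,
        "high_increment": high_increment,
        "cumulative": cumulative,
        "ratio_to_baseline": cumulative / baseline_deaths,
    }


# Hypothetical age group: 100,000 baseline deaths and 5% of coronavirus deaths.
row = scenario_row(baseline_deaths=100_000, share_of_covid_deaths=0.05)
print(f"ratio to baseline: {row['ratio_to_baseline']:.3f}")
```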

As indicated by the above chart, the high estimate scenario is associated with an overall mortality rate for persons 15+ that is 130% of (or 30% greater than) the baseline of existing 2018 U.S. deaths:

  • For persons age 15-44, the high estimate of death is only 7% greater than existing baseline conditions.

  • Conversely, the mortality rate is 36% greater for those over age 85 than with baseline conditions.

  • Generally speaking and consistent with press reports, the mortality rate effect of COVID-19 is greater for older age cohorts than for younger ones. The exception is at the 75-84 year age bracket, perhaps a statistical anomaly due to the as-yet relatively small sample size of the CDC database.

These comparisons can also be made in graphic terms, as illustrated by the following graph.

Graph - Deaths by Age & Coronavirus Scenarios.png

With the low scenario, the addition to existing death rates is much smaller, adding only 1.6% to the death rate for 15-44 year olds and 8.5% to mortality for 85+ year old seniors. Over all 15+ age groups, mortality increases by just over 7%.

A critical (and not yet known) variable included with these hypothetical scenarios is the degree of overlap between existing deaths (that would happen anyway) versus deaths that might be attributable solely to COVID-19. In between is a middle category of deaths that might have happened this year without coronavirus but for which the virus was another contributing factor. In some cases, coronavirus would accelerate the time of death, in others it might be a minor factor due to the seriousness of other underlying conditions.

No data has been made available from CDC that provides guidance as to how this attribution can best be made. And to a large extent, this may be unknowable. Consider, for example, trying to ascertain to what extent alcoholism or heart disease or diabetes may each have contributed to a person’s demise.

As noted above, for sake of illustration and discussion, this analysis hypothesizes:

  • No overlap with the low coronavirus estimate - assuming that coronavirus is the primary death factor due to its relatively low incidence relative to baseline mortality.

  • 50% overlap with the high coronavirus estimate - assuming that a higher rate of infection and serious complication will inevitably involve a higher proportion of people with existing underlying issues.

If the 50% overlap estimate is off the mark, so that there is little or no overlap with the high coronavirus death estimate, then we would be faced with a worst-worst scenario. In effect, this could increase the total annual U.S. death toll with coronavirus from 3.7 million to as many as 4.5 million. This means that instead of COVID-19 accounting for a death rate increase of 30%, the increase would be 60%. And for persons 85+, the death rate increase attributable to coronavirus could be as much as 72% rather than 36%.
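The sensitivity of the totals to the overlap assumption can be sketched as below. The baseline of about 2.8 million deaths (age 15+) is an assumption backed out from the figures in this post (4.5 million minus 1.7 million), not a number quoted from NCHS, and the function names are illustrative.

```python
BASELINE_DEATHS = 2_800_000  # assumption: implied by 4.5 million - 1.7 million
HIGH_TOTAL = 1_700_000       # high-estimate coronavirus deaths


def total_deaths(overlap: float) -> float:
    """Annual deaths when a fraction `overlap` of coronavirus deaths
    would have occurred anyway as part of baseline mortality."""
    if not 0 <= overlap <= 1:
        raise ValueError("overlap must be between 0 and 1")
    return BASELINE_DEATHS + HIGH_TOTAL * (1 - overlap)


def pct_increase(overlap: float) -> float:
    """Percentage increase in deaths over the baseline."""
    return (total_deaths(overlap) / BASELINE_DEATHS - 1) * 100


for overlap in (0.0, 0.25, 0.5, 0.75):
    print(
        f"overlap {overlap:.0%}: {total_deaths(overlap):,.0f} deaths, "
        f"+{pct_increase(overlap):.0f}% over baseline"
    )
```

Under these assumptions, the 50% and 0% overlap cases land close to the 3.7 million and 4.5 million totals discussed above.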

Conversely, it may be that the overlap is even greater than 50%. This possibility is supported by the observation from cases to date that most fatalities have involved individuals with existing (generally multiple) underlying health issues.

In effect, it may be that COVID-19 is not the sole cause of death but a contributing factor: in some cases the final tipping point, in others perhaps not. A more granular examination of mortality data might be useful to attempt to quantify how much a person’s life span, on average, is shortened as a result of this virus. Shortening a life by one month is much different than reducing the life span of a senior citizen by, say, five years. The appropriate policy choices may also differ substantially based on this type of determination.

IMPLICATIONS

The question of how much the coronavirus will increase deaths in America will have an obvious influence on the inherently no-win task of determining the appropriate trade-off between saving lives and salvaging the country’s economic and social well-being.

This blog intentionally avoids the question of what set of policies provides the best balance in countering or mitigating two pressing and conflicting catastrophes. Rather, the implications of most immediate interest involve improvements to the base data so that the decisions made will be more informed. From a data perspective, the following suggestions are noted:

  1. CDC should be more transparent by fully and publicly disclosing the research methodology and conclusions regarding the four scenarios of potential U.S. death toll already prepared. This information is essential to better understand what is required to achieve each scenario from medical, economic and societal perspectives.

  2. Over the next 1-2 months as the virus reaches exponentially more people, it will be important for CDC to continuously and publicly update its databases, each time providing a larger and clearer window into who will be most affected and how. And as more testing resources become available, CDC should conduct random sample monitoring to better benchmark infection and mortality rates.

  3. Finally, to the extent possible, it would be extremely useful for more robust CDC datasets to parse out the degree to which death rates at specific age levels overlap with deaths likely due to existing age and underlying health conditions. This will be essential to gaining a better understanding of the true net effect of coronavirus on mortality going forward.

In the weeks ahead, E. D. Hovee will continue to monitor progress on the data side of the coronavirus challenge. And given a clear economic and development bent, I may offer observations aimed at contributing to ongoing public policy discussion. So, look for updates and feel free to question or critique as the occasion arises.

Take care and be well,
Eric Hovee - Principal