Expected Death Analysis

Covid-19 is causing a higher death rate than normal. There has been some controversy as to whether covid-19 was the cause of death in cases where it was present in the deceased. To avoid this controversy, we can simply look at the number of deaths in a particular week and compare it with the number for the same week over the past few years. (See, for example, Mortality surveillance during the COVID-19 pandemic in the Bulletin of the World Health Organization.) The graph below uses data from the CDC (JSON). In particular, the week of reported deaths and the percentage of expected deaths fields are used. The percent of expected deaths compares the death count in a particular week to the average for that week in 2017 through 2019. Comparing against the same week in previous years removes seasonal effects on death rates.
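The percent of expected deaths comparison described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the CDC's actual procedure; the function name and the weekly counts are made up for the example.

```python
from statistics import mean


def percent_of_expected(deaths_this_week, same_week_prior_years):
    """Return this week's deaths as a percentage of the prior-year average.

    same_week_prior_years holds the death counts for the same calendar
    week in each baseline year (2017 through 2019 in the text).
    """
    expected = mean(same_week_prior_years)
    return 100.0 * deaths_this_week / expected


# Hypothetical counts for one calendar week (NOT real CDC figures).
baseline = [55000, 56000, 57000]  # same week in 2017, 2018, 2019
print(percent_of_expected(61600, baseline))
```

A result of 100 means the week matched the baseline average; values above 100 indicate more deaths than expected for that time of year.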

There is also an interesting analysis in the New York Times. The NYT analysis shows the death rate (deaths per 100,000 population) improving consistently until 2020. Analyzing death rates is interesting because so many variables are involved. The number of deaths per 100,000 population is considerably less than half of what it was in 1920, probably due to medical advances. Other influences on the death rate include the age of the population, its expected lifespan, and many other factors. The NYT article also has a chart comparing the death rate to the previous year, showing a substantial increase in 2020. Because 2020 sets a high reference for 2021, we are likely to see a negative "percentage increase in death rate from previous year" in 2021.

As can be seen in the data (blue line), the percentage of expected deaths is fairly stable through March 14, 2020. After March 14, however, it increases dramatically. Towards the end of the dataset, the percentage of expected deaths is very low. This is because the CDC records the actual date of the death, but may not receive that report for several weeks. We can make predictions based on early reports (see below).

According to this document at medium.com, the timeline for infections ending in death averages as below:

If we take the middle of each of these periods, we might estimate that death occurs about 6 weeks after exposure. We see the original upward trend in deaths at about March 21, 2020. Deaths peaked the week of April 11, 2020, then slowly declined until the week of June 6, 2020. Deaths held steady at about 110% of normal for about 4 weeks, then climbed to about 115% the week of July 11, 2020. Using the predicted results (red curve below), deaths continued to increase until stabilizing at around 120% of normal. The first increase (in March) was due to introduction of the virus to the U.S. What caused the second increase (starting the week of July 11, 2020)? There's a detailed timeline at the Project On Government Oversight, but it does not go to 6 weeks before July 11 (which would roughly indicate an increase in exposures around the week of May 30, 2020). PERHAPS the exposure increase and following death increase was due to large gatherings Memorial Day weekend (May 23 through May 25). What other conclusions can we draw from the data?
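The six-week lag arithmetic above is easy to check with a short date calculation. This is a minimal sketch; the function name is hypothetical, and the six-week default is simply the rough estimate given in the text.

```python
from datetime import date, timedelta


def estimated_exposure_week(death_week_ending, lag_weeks=6):
    """Step back from a week of deaths by the assumed exposure-to-death lag."""
    return death_week_ending - timedelta(weeks=lag_weeks)


# Second increase in deaths: week ending July 11, 2020.
print(estimated_exposure_week(date(2020, 7, 11)))  # 2020-05-30
```

Changing lag_weeks shows how sensitive the suspected exposure week is to the assumed lag.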

To Date Summary

The table below shows the actual number of reported deaths since 1/1/20, the expected number of deaths, and the excess number of deaths. As discussed above, the expected number is based on the same period for the years 2017 through 2019.
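The three summary figures are related by simple addition and subtraction, which the following sketch illustrates. The function name and the sample weekly counts are hypothetical and are not taken from the CDC data.

```python
def summarize(weekly):
    """Total a list of (actual, expected) weekly death counts.

    Excess deaths are the actual total minus the expected total.
    """
    actual = sum(a for a, _ in weekly)
    expected = sum(e for _, e in weekly)
    return {"actual": actual, "expected": expected, "excess": actual - expected}


# Two made-up weeks (NOT real CDC figures).
print(summarize([(60000, 56000), (62000, 57000)]))
```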

Predictions Based On Early Reports

As discussed above, it takes several weeks for the CDC to receive all the death reports for a specific week. In an attempt to get more current data, a "correction factor" has been developed based on how the percent of expected deaths increases after the week of death as more reports are received. Data on report delays has been gathered starting June 15, 2020. This data is here. The summary data used for the predictions is here.

The "Average Percent of Max" (not really shown as percent; 1.00 = 100%) is the portion of the final value that shows up the specified number of days after the end of the week of the death. The "final value" is the maximum value over time for that week as additional reports arrive at the CDC. The reported CDC percent of expected deaths is divided by the Average Percent of Max for the number of days between the end of the week of death and the CDC update date to yield the predicted percent of expected deaths. Based on this data, a predicted percentage of expected deaths is determined immediately following the week of death. As the CDC updates their numbers (every weekday), the correction factors are updated and the predicted percentage of expected deaths is updated.

Along with a prediction based on the average correction factor, predictions are made based on the minimum and maximum prediction factors. These are the minimum and the maximum percentage of the final value seen the specified number of days after the end of the week of the reported deaths. Data is included in the prediction factor calculations once the percentage of expected deaths as reported by the CDC is unchanged for a week (which suggests it is close to the final value). NOTE that the predicted values depend upon consistent reporting rates. If data arrives at the CDC faster than it has in the past, the predicted values will be too high. A graph of the reporting delay is below.
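The correction described above amounts to dividing the reported figure by the fraction of the final value that has typically arrived by that point. The sketch below assumes a lookup table keyed by days after the end of the week, holding (minimum, average, maximum) fractions. The function name, the table layout, and the sample numbers are all assumptions for illustration, not the actual data used for the graph.

```python
def predict_final_percent(reported_percent, days_after_week_end, factors):
    """Scale an early CDC figure up to an estimate of its final value.

    factors maps days-after-week-end to (min, average, max) fractions of
    the final value, where 1.00 means all reports have arrived.
    """
    f_min, f_avg, f_max = factors[days_after_week_end]
    return {
        # Largest fraction already reported gives the lowest prediction.
        "low": reported_percent / f_max,
        "average": reported_percent / f_avg,
        # Smallest fraction already reported gives the highest prediction.
        "high": reported_percent / f_min,
    }


# Hypothetical factors: 10 days out, 40% to 60% of the final value is in.
sample_factors = {10: (0.40, 0.50, 0.60)}
print(predict_final_percent(60.0, 10, sample_factors))
```

With these made-up factors, a reported 60% of expected deaths ten days after the week ends would be predicted to finish at about 120%, with a range of roughly 100% to 150%.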

Average Age at Time of Death

The CDC does not publish data revealing the average age at death each week. They do, however, make data available showing the total number of deaths in approximate 10 year age groups each week. This data is here (CSV).

In the graph below, this data is used to compute an average age at death for each week. Records are used where the following conditions are satisfied:

The number of deaths in each age group is multiplied by the average age of that group (for example, the group 25-34 years uses an age of 30 years). These products are then added and divided by the total number of deaths in all age groups to yield an average age at death. Note that the age used for the 85 and older group is 90 years, the average in an 85 to 94 age group. Unlike the Percent of Expected Deaths chart above, this chart does not compare age at time of death to historic values, so seasonal effects may influence it.
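The weighted average described above can be sketched as follows. The ages for the 25-34 and 85-and-older groups come from the text; the other group labels, their representative ages, and the death counts are assumptions chosen to follow the same pattern.

```python
# Representative age for each group. 30 and 90 are from the text;
# the remaining entries are assumed to follow the same pattern.
GROUP_AGE = {
    "25-34": 30,
    "65-74": 70,  # assumed
    "75-84": 80,  # assumed
    "85+": 90,
}


def average_age_at_death(deaths_by_group, group_age=GROUP_AGE):
    """Weight each group's representative age by its number of deaths."""
    total = sum(deaths_by_group.values())
    weighted = sum(group_age[g] * n for g, n in deaths_by_group.items())
    return weighted / total


# Made-up weekly counts (NOT real CDC figures).
print(average_age_at_death({"25-34": 100, "75-84": 300, "85+": 600}))  # 81.0
```

Shifting deaths toward the older groups raises the result, which is the effect discussed for the weeks with high percentages of expected deaths.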


At first, I was expecting the age at death to decrease due to covid-19. We could then say covid-19 has resulted in so many years of life lost. However, it appears that covid-19 has, instead, just upset the balance in the number of deaths between different age groups. It has caused more older people to die, causing the average age at time of death to increase. When the percentage of expected deaths peaked in April 2020, the average age at death also peaked. When the percentage of expected deaths decreased in June 2020, fewer elderly were dying, so the average age at death also decreased. Then, as the percentage of expected deaths increased in July 2020, the average age at death also increased.

It's interesting to compare this with other data. For example, the life expectancy for someone born today in the US is about 78.539 years (as of 2018). However, people dying today were not born today, so today's life expectancy is not appropriate. We might look at the median age of the US population (38.2 years according to Wikipedia). We could then look at the life expectancy for those born in 1982. The life expectancy chart shows a life expectancy of 74.361 years for those born in 1982. That is still higher than the average age at death shown in the graph above. Looking at the graph above, we are seeing people die at about age 73. Thus they were born around 1947. Life expectancy for those born in 1950 is about 68.2 years according to this CDC document. The current 73 years is between the expected 68.2 years and the expected 74.361 years. As mentioned before, there are likely to be seasonal effects, but there is a fairly dramatic variation in age at death that tracks the percentage of expected deaths.

The image below demonstrates the average age at death calculation for February 1, 2020. It can be used to verify the average age at death calculations used to create the graph. The calculation below was done several months ago, and more data has arrived since then. This results in a slight difference between the calculation below and the data in the graph above. The graph above always uses the latest data.

Other Resources

Raw CDC Data

On 1/19/21, the CDC revised their website, making it difficult to see the complete set of expected death data. The table below pulls the latest data from the CDC JSON data (https://data.cdc.gov/resource/r8kw-7aab.json). Note that the values for the most recent weeks are very low because it takes several weeks for the CDC to receive the death reports. The data below is based on the date of death, not the date the report is received. The percent of expected deaths will increase as more reports are received. The graph near the top of the page applies prediction factors to this data to yield a predicted percentage of expected deaths based on typical reporting delays.
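Pulling the two columns of the table out of JSON text might look like the sketch below. The field names week_ending_date and percent_of_expected_deaths are assumptions about the CDC feed and should be checked against the live data; the sample records are made up. The sketch parses a string rather than fetching the URL so that it runs without a network connection.

```python
import json


def percent_by_week(json_text,
                    week_field="week_ending_date",             # assumed name
                    percent_field="percent_of_expected_deaths"):  # assumed name
    """Return sorted (week ending, percent of expected deaths) pairs.

    Records missing either field are skipped.
    """
    rows = []
    for record in json.loads(json_text):
        if week_field in record and percent_field in record:
            week = record[week_field][:10]  # keep the YYYY-MM-DD part
            rows.append((week, float(record[percent_field])))
    return sorted(rows)  # ISO dates sort correctly as text


# Made-up sample in the assumed layout (NOT real CDC figures).
sample = """[
  {"week_ending_date": "2020-02-08T00:00:00.000", "percent_of_expected_deaths": "98"},
  {"week_ending_date": "2020-02-01T00:00:00.000", "percent_of_expected_deaths": "97"},
  {"week_ending_date": "2020-02-15T00:00:00.000"}
]"""
for week, percent in percent_by_week(sample):
    print(week, percent)
```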

CDC Data as of 07/30/2021

Week Ending    Percent of Expected Deaths

Comments to harold@hallikainen.org