The Math of the Macabre (or The Equation of Death)

Given the ongoing coronavirus pandemic, I have found myself staring at the Worldometers website for the past few weeks to get a better sense of the magnitude of the havoc that COVID-19 has wreaked on the world. Initially, I noticed the COVID-19 cases and deaths creeping up gradually, but the creeping soon turned into accelerating and, of late, galloping.

Around the same time, I came across a tweet that my dad had posted on the 23rd of March:


Picture 1: A snapshot of my father’s tweet

Thereafter, I kept an eye on the Worldometers curves (a snapshot of one can be seen in the tweet itself) and was stunned when the deaths reached approximately 27,000 four days after his tweet!

This, along with the countless discussions I have had with my friends after our Zoom classes, finally motivated me to analyze the COVID-19 data diligently, not for any particular purpose but just to satisfy my intellectual curiosity. My first step was to identify the right indicator. At first, I thought of choosing the number of Cases, but the testing process differs from country to country. Additionally, many cases go unreported in countries that do not have adequate testing facilities, so the reported figures may not reflect the actual number of cases. For this reason, I decided to track the number of Deaths per day and Cumulative Deaths.

Getting the data in place:

I made a simple table in MS-Excel to track these two figures.

Date      Cumulative deaths    Deaths for the day
Feb 16    1,775
Feb 17    1,873                98
Feb 18    2,009                136
Feb 19    2,126                117
Feb 20    2,247                121
Feb 21    2,360                113
Feb 22    2,460                100
Feb 23    2,618                158
Feb 24    2,699                81
Feb 25    2,763                64
Feb 26    2,800                37
Feb 27    2,858                58
Feb 28    2,923                65
Feb 29    2,977                54
Mar 1     3,050                73
Mar 2     3,117                67
Mar 3     3,202                85
Mar 4     3,285                83
Mar 5     3,387                102
Mar 6     3,494                107
Mar 7     3,599                105
Mar 8     3,827                228
Mar 9     4,025                198
Mar 10    4,296                271
Mar 11    4,628                332
Mar 12    4,981                353
Mar 13    5,428                447
Mar 14    5,833                405
Mar 15    6,520                687
Mar 16    7,162                642
Mar 17    7,979                817
Mar 18    8,951                972
Mar 19    10,030               1,079
Mar 20    11,386               1,356
Mar 21    13,011               1,625
Mar 22    14,640               1,629
Mar 23    16,513               1,873
Mar 24    18,894               2,381
Mar 25    21,282               2,388
Mar 26    24,073               2,791
Mar 27    27,343               3,270
Mar 28    30,861               3,518
Mar 29    34,065               3,204
Mar 30    37,774               3,709
Mar 31    42,309               4,535
Apr 1     47,192               4,883
Apr 2     53,166               5,974
Apr 3     59,145               5,979
Apr 4
Apr 5
Apr 6
Apr 7
Apr 8

Table 1: Data on the cumulative deaths and daily deaths
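The "Deaths for the day" column is just the difference between consecutive cumulative totals. For anyone who prefers code to a spreadsheet, here is a minimal Python sketch of that step; the function and variable names are illustrative only, and the numbers are the last eight dated rows of Table 1.

```python
# Cumulative deaths for Mar 27 to Apr 3, copied from Table 1.
cumulative = [27343, 30861, 34065, 37774, 42309, 47192, 53166, 59145]

def daily_from_cumulative(totals):
    """Return the day-over-day increases of a cumulative series."""
    return [today - yesterday for yesterday, today in zip(totals, totals[1:])]

daily = daily_from_cumulative(cumulative)
print(daily)  # [3518, 3204, 3709, 4535, 4883, 5974, 5979]
```

The first date in a series has no daily figure, which is why the Feb 16 row of the table leaves that column blank.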

Then, I plotted the charts for “cumulative deaths” and “deaths per day”:


Graph 1: Cumulative deaths from 16th February till 3rd April


Graph 2: Number of daily deaths from 16th February till 3rd April

Modeling the equation:

I tried to model an equation for the data using Excel's built-in best-fit (trendline) function. Excel offers five types of equations (standard curves) for modeling a best-fit equation. These are as follows:

  1. Exponential
  2. Logarithmic
  3. Linear
  4. Power
  5. Polynomial

From a quick look at the charts, I intuitively rejected the linear, logarithmic, and power curves. However, just to be sure, I tried each one. For all three, the gap between the line of best fit and the data was too large, which confirmed my intuition.

Ultimately, a polynomial curve suited the data. At first, I used a degree-2 polynomial (parabolic in shape), which looked like this:


Graph 3: A degree-2 polynomial of best fit

However, as I increased the degree of the polynomial, the coefficient of determination (R²) moved closer to 1.
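The same comparison can be tried outside Excel. The sketch below is not Excel's routine: it is an ordinary least-squares polynomial fit written in plain Python, with a hypothetical helper name (`r_squared_for_degree`), applied to the cumulative deaths in Table 1. One caveat is worth keeping in mind: a higher-degree polynomial can always imitate a lower-degree one by setting its extra coefficients to zero, so R² on the fitted data cannot fall as the degree rises. A value near 1 shows a close fit to past data, not necessarily a reliable forecast.

```python
# Cumulative deaths, Feb 16 to Apr 3 (48 values, copied from Table 1).
cumulative = [
    1775, 1873, 2009, 2126, 2247, 2360, 2460, 2618, 2699, 2763, 2800, 2858,
    2923, 2977, 3050, 3117, 3202, 3285, 3387, 3494, 3599, 3827, 4025, 4296,
    4628, 4981, 5428, 5833, 6520, 7162, 7979, 8951, 10030, 11386, 13011,
    14640, 16513, 18894, 21282, 24073, 27343, 30861, 34065, 37774, 42309,
    47192, 53166, 59145,
]

def r_squared_for_degree(ys, degree):
    """Least-squares polynomial fit of the given degree; returns R^2."""
    n = len(ys)
    # Rescale the day index to [-1, 1] so the normal equations stay well conditioned.
    ts = [2 * i / (n - 1) - 1 for i in range(n)]
    m = degree + 1
    # Normal equations A c = b for the basis 1, t, t^2, ...
    A = [[sum(t ** (r + c) for t in ts) for c in range(m)] for r in range(m)]
    b = [sum(y * t ** r for t, y in zip(ts, ys)) for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coef = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(A[r][c] * coef[c] for c in range(r + 1, m))
        coef[r] = (b[r] - tail) / A[r][r]
    fitted = [sum(c * t ** k for k, c in enumerate(coef)) for t in ts]
    mean = sum(ys) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

for degree in (1, 2, 3, 4):
    print(degree, round(r_squared_for_degree(cumulative, degree), 5))
```

Rescaling the day index changes the coefficients but not the fitted curve, so the R² values are comparable with those of a fit on the raw day numbers.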

Here’s a degree-4 polynomial, which models the data with great precision:


Graph 4: A degree-4 polynomial of best fit (beginning Feb 16, 2020)

The equation of the degree-4 polynomial is

y = 0.0315x^4 - 1.4216x^3 + 20.471x^2 - 10.49x + 1881.3

where y is the number of deaths to date and x is the number of days since February 16th.

To get a more tangible feel for the precision of my model, I substituted a few values of x into the equation and computed the percentage error between the predicted and actual values (predicted minus actual, expressed as a percentage of the predicted value).

Start date: 16 February 2020

Date             Days elapsed (x)    Predicted    Actual    Error
25 March 2020    38                  18719        21282     -13.7%
01 April 2020    45                  42489        47192     -11.1%
05 April 2020    49                  64860
10 April 2020    54                  105004
15 April 2020    59                  162252
30 April 2020    74                  481715
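The Predicted and Error columns can be recomputed in a few lines of Python. This is only a sketch under stated assumptions: the function names are illustrative, the coefficients are the rounded ones quoted for the Feb 16 equation, and the linear term is entered as -10.49x, which is the sign that reproduces the Predicted column. The error is measured the way the table measures it, as a share of the predicted value.

```python
# Rounded coefficients of the Feb 16 equation, highest power first.
# Assumption: the linear term is -10.49x, the sign that reproduces
# the "Predicted" column of the table.
FEB16_COEFFS = [0.0315, -1.4216, 20.471, -10.49, 1881.3]

def evaluate(coeffs, x):
    """Evaluate a polynomial with Horner's rule (coeffs highest power first)."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

def percent_error(predicted, actual):
    """Error as used in the table: (predicted - actual) / predicted, in percent."""
    return 100 * (predicted - actual) / predicted

# (date, days elapsed since Feb 16, actual cumulative deaths from Table 1)
rows = [("25 March 2020", 38, 21282), ("01 April 2020", 45, 47192)]
for date, x, actual in rows:
    predicted = evaluate(FEB16_COEFFS, x)
    print(date, round(predicted), f"{percent_error(predicted, actual):.1f}%")
# 25 March 2020 18719 -13.7%
# 01 April 2020 42489 -11.1%
```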

I wondered whether an error of this size was too large for the equation to be used to forecast future deaths.

I then realized that the data being analyzed comprised two main sections: the slow ‘creeping’ rise and the ‘galloping’ rise. This could be the reason for the inaccuracy of the predictions. To find an equation that better describes the current phase, I restricted the fit to the more recent data points, from 5th March onwards.

The chart and the equation are shown below:

Graph 5: A degree-4 polynomial of best fit (beginning Mar 5, 2020)

The equation this time around is

y = 0.0337x^4 + 0.7715x^3 + 5.0925x^2 + 94.975x + 3296.3

where y is the number of deaths to date and x is the number of days since March 5th.

I repeated the task of finding the values of “y” for some selected values of “x”:

Start date: 05 March 2020

Date             Days elapsed (x)    Predicted    Actual    Error
25 March 2020    20                  18797        21282     -13.2%
01 April 2020    27                  42668        47192     -10.6%
05 April 2020    31                  65241
10 April 2020    36                  105913
15 April 2020    41                  164151
30 April 2020    56                  491495
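A useful sanity check on any trendline equation read off a chart is to confirm which x-values the fit was made against. On an Excel line chart, the trendline is fitted against point numbers 1, 2, 3, ... rather than 0, 1, 2, ..., so the start date may correspond to x = 1. Whether the chart above was built that way is an assumption, not something stated here. The Python sketch below (illustrative names, rounded coefficients from the March 5 equation, actual values from Table 1) simply evaluates the equation under both conventions.

```python
# Rounded coefficients of the March 5 equation, highest power first.
MAR5_COEFFS = [0.0337, 0.7715, 5.0925, 94.975, 3296.3]

def evaluate(coeffs, x):
    """Evaluate a polynomial with Horner's rule (coeffs highest power first)."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

# (date, days elapsed since March 5, actual cumulative deaths from Table 1)
checks = [("25 March", 20, 21282), ("01 April", 27, 47192)]

for date, elapsed, actual in checks:
    conventions = (("x = days elapsed", elapsed),
                   ("x = days elapsed + 1", elapsed + 1))
    for label, x in conventions:
        predicted = evaluate(MAR5_COEFFS, x)
        # Same error definition as the table: relative to the predicted value.
        error = 100 * (predicted - actual) / predicted
        print(f"{date:<9} {label:<21} predicted {predicted:>7.0f}  error {error:+.1f}%")
```

With these inputs, the first convention reproduces the Predicted column (18797 and 42668), while the second lands within about 1% of the actual figures for both dates. If the chart did use point numbers starting at 1, part of the error in the table would come from the day numbering rather than from the curve itself.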

The error % has reduced only marginally compared with the equation starting Feb 16th. This suggests that the MS-Excel best-fit routine works well and already gives enough weight to the more recent data, even when the curve has two or three distinct parts. Still, I am happy to have reduced the error by about 0.5 percentage points because, in a situation like this, every digit counts.

I think the above equation (since March 5th) will be the best-fit equation for COVID-19 deaths for some time at least. I have modeled this equation using the data available. The model does not take into account unforeseen events such as the discovery of a vaccine or a community flare-up in Africa or India.

I am sure that a more accurate model using more rigorous mathematics can be developed; however, my model, based on publicly available data, only serves to help an individual grasp quantitatively the havoc that COVID-19 has wreaked across the world.

The ultimate driving force behind my writing this is to make you realize that it’s only an upward curve from here unless we ALL take action. Let’s be responsible citizens not only of our own countries but also of the world. If the coronavirus hits the slums in the Indian subcontinent or Africa, the situation will worsen beyond control. Most poor people are following the lockdown, because of which they have lost their daily source of income and are struggling to acquire basic necessities. On the other hand, there are relatively privileged people sitting at home who have the means to help them indirectly. Whenever you see a “please donate” link on Instagram or any other social media platform, please do not scroll past it and dismiss it with “Oh, someone else will do it” or “Oh, I do not have the time”. Make an impact. Something that may seem small and inconsequential to you can make a big difference to a family.

Through our smart actions, we alone can stop this Equation of mine from becoming the “Equation of Death”!


One thought on “The Math of the Macabre (or The Equation of Death)”

  1. Great analysis! This made me so and think. Maybe you should also share your Excel sheet so that others can also model and benefit from your work.
