# The Math of the Macabre (or The Equation of Death)

Given the ongoing coronavirus pandemic, I have spent the past few weeks staring at the Worldometers website (https://www.worldometers.info/coronavirus/) to get a better understanding of the magnitude of the havoc that COVID-19 has wreaked on the world. Initially, I noticed the COVID-19 cases and deaths creeping up gradually, but the creeping soon turned into accelerating and, of late, the numbers have started galloping.

Around the same time, I came across a tweet that my dad had posted on the 23rd of March:

Picture 1: A snapshot of my father’s tweet

Thereafter, I kept an eye on the Worldometers curves (a snapshot of a curve can be seen in the tweet itself) and was stunned when the deaths reached approximately 27,000 four days after his tweet!

This, along with the countless discussions I have had with my friends after our Zoom classes, finally motivated me to analyze the COVID-19 data diligently, not for any particular purpose but just to satisfy my intellectual curiosity. My first step was to choose the right indicator. At first, I thought of using the number of cases, but testing practices differ from country to country. Additionally, many cases go unreported in countries that do not have adequate testing facilities, so the reported figures may not reflect the actual number of cases. For this reason, I decided to track the number of deaths per day and the cumulative deaths.

Getting the data in place:

I made a simple table in MS Excel to track these figures.

| Date | Cumulative deaths | Deaths for the day |
| --- | --- | --- |
| Feb 16 | 1,775 | |
| Feb 17 | 1,873 | 98 |
| Feb 18 | 2,009 | 136 |
| Feb 19 | 2,126 | 117 |
| Feb 20 | 2,247 | 121 |
| Feb 21 | 2,360 | 113 |
| Feb 22 | 2,460 | 100 |
| Feb 23 | 2,618 | 158 |
| Feb 24 | 2,699 | 81 |
| Feb 25 | 2,763 | 64 |
| Feb 26 | 2,800 | 37 |
| Feb 27 | 2,858 | 58 |
| Feb 28 | 2,923 | 65 |
| Feb 29 | 2,977 | 54 |
| Mar 1 | 3,050 | 73 |
| Mar 2 | 3,117 | 67 |
| Mar 3 | 3,202 | 85 |
| Mar 4 | 3,285 | 83 |
| Mar 5 | 3,387 | 102 |
| Mar 6 | 3,494 | 107 |
| Mar 7 | 3,599 | 105 |
| Mar 8 | 3,827 | 228 |
| Mar 9 | 4,025 | 198 |
| Mar 10 | 4,296 | 271 |
| Mar 11 | 4,628 | 332 |
| Mar 12 | 4,981 | 353 |
| Mar 13 | 5,428 | 447 |
| Mar 14 | 5,833 | 405 |
| Mar 15 | 6,520 | 687 |
| Mar 16 | 7,162 | 642 |
| Mar 17 | 7,979 | 817 |
| Mar 18 | 8,951 | 972 |
| Mar 19 | 10,030 | 1,079 |
| Mar 20 | 11,386 | 1,356 |
| Mar 21 | 13,011 | 1,625 |
| Mar 22 | 14,640 | 1,629 |
| Mar 23 | 16,513 | 1,873 |
| Mar 24 | 18,894 | 2,381 |
| Mar 25 | 21,282 | 2,388 |
| Mar 26 | 24,073 | 2,791 |
| Mar 27 | 27,343 | 3,270 |
| Mar 28 | 30,861 | 3,518 |
| Mar 29 | 34,065 | 3,204 |
| Mar 30 | 37,774 | 3,709 |
| Mar 31 | 42,309 | 4,535 |
| Apr 1 | 47,192 | 4,883 |
| Apr 2 | 53,166 | 5,974 |
| Apr 3 | 59,145 | 5,979 |
| Apr 4 | | |
| Apr 5 | | |
| Apr 6 | | |
| Apr 7 | | |
| Apr 8 | | |

Table 1: Data on the cumulative deaths and daily deaths
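The "Deaths for the day" column is simply the difference between consecutive cumulative totals. For anyone who prefers code to a spreadsheet, here is a minimal Python sketch of that step. It uses only the first five rows of Table 1, and the variable names are my own choices for illustration, not part of the original Excel sheet.

```python
# First five rows of Table 1: date -> cumulative deaths.
cumulative = {
    "Feb 16": 1775,
    "Feb 17": 1873,
    "Feb 18": 2009,
    "Feb 19": 2126,
    "Feb 20": 2247,
}

dates = list(cumulative)
totals = list(cumulative.values())

# Daily deaths = today's cumulative total minus yesterday's.
# The first date has no previous day, so it gets no daily value.
daily = {dates[i]: totals[i] - totals[i - 1] for i in range(1, len(totals))}

for day, deaths in daily.items():
    print(day, deaths)
```

The first date is left without a daily figure, which matches the empty cell for Feb 16 in the table.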

Then, I plotted the charts for “cumulative deaths” and “deaths per day”:

Graph 1: Cumulative deaths from 16th February till 3rd April

Graph 2: Number of daily deaths from 16th February till 3rd April

Modeling the equation:

I tried to model an equation for the data using Excel's built-in trendline (equation of best fit) feature. Excel offers five types of equations (standard curves) for modeling a best-fit equation. These are as follows:

1. Exponential
2. Logarithmic
3. Linear
4. Power
5. Polynomial

Intuitively, from a quick eyeballing, I rejected the linear, logarithmic, and power curves. However, just to be sure, I tried each of them out. The gap between the best-fit curve and the data was too large for all three, which confirmed my intuition.

Ultimately, a polynomial curve suited the data. At first, I used a degree-2 polynomial (parabolic in shape), which looked like this:

Graph 3: A degree-2 polynomial of best fit

However, on increasing the degree of the polynomial, my coefficient of determination (R²) moved closer to 1.
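The effect of the degree on the coefficient of determination can also be checked outside Excel. The sketch below is a rough, standard-library-only illustration and not the routine Excel uses internally: it fits polynomials of degree 1 to 4 to the cumulative deaths from Table 1 by ordinary least squares (normal equations solved with Gaussian elimination) and reports R² for each. The function names are hypothetical, and the day index is rescaled to the range 0 to 1 purely for numerical stability, which does not change R². Keep in mind that on the same data R² cannot go down when a higher-degree term is added, so a value closer to 1 does not by itself say the forecast will be better.

```python
# Cumulative deaths from Table 1, Feb 16 (day 0) to Apr 3 (day 47).
cumulative = [1775, 1873, 2009, 2126, 2247, 2360, 2460, 2618, 2699, 2763,
              2800, 2858, 2923, 2977, 3050, 3117, 3202, 3285, 3387, 3494,
              3599, 3827, 4025, 4296, 4628, 4981, 5428, 5833, 6520, 7162,
              7979, 8951, 10030, 11386, 13011, 14640, 16513, 18894, 21282,
              24073, 27343, 30861, 34065, 37774, 42309, 47192, 53166, 59145]


def fit_polynomial(xs, ys, degree):
    """Least-squares fit; returns c[0..degree] for c[0] + c[1]*x + ..."""
    n = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    rows = [[sum(x ** (i + j) for x in xs) for j in range(n)]
            + [sum(y * x ** i for x, y in zip(xs, ys))]
            for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for k in range(col, n + 1):
                rows[r][k] -= factor * rows[col][k]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(rows[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rows[i][n] - tail) / rows[i][i]
    return coeffs


def r_squared(xs, ys, coeffs):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum(
        (y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )
    return 1 - ss_residual / ss_total


# Day index rescaled to [0, 1] (assumption: done only for numerical stability).
days = [d / (len(cumulative) - 1) for d in range(len(cumulative))]

r2_by_degree = {
    degree: r_squared(days, cumulative, fit_polynomial(days, cumulative, degree))
    for degree in (1, 2, 3, 4)
}

for degree, r2 in r2_by_degree.items():
    print(f"degree {degree}: R^2 = {r2:.4f}")
```

With NumPy installed, `numpy.polyfit` would be the usual shortcut for the fitting step; the hand-written solver is used here only to keep the example free of dependencies.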

Here’s a degree-4 polynomial, which models the data with great precision:

Graph 4: A degree-4 polynomial of best fit (beginning Feb 16, 2020)

The equation of the degree-4 polynomial is

$$y = 0.0315x^4 - 1.4216x^3 + 20.471x^2 + 10.49x + 1881.3$$

where y is the number of deaths to date and x is the number of days since February 16th.

To get a more tangible feel for the precision of my model, I substituted a few values of x (days elapsed) into the equation and computed the percentage error between the predicted value and the actual value. The error shown is (predicted − actual) / predicted.

Start date: 16 February 2020

| Date | Days elapsed (x) | Predicted | Actual | Error |
| --- | --- | --- | --- | --- |
| 25 March 2020 | 38 | 18719 | 21282 | -13.7% |
| 01 April 2020 | 45 | 42489 | 47192 | -11.1% |
| 05 April 2020 | 49 | 64860 | | |
| 10 April 2020 | 54 | 105004 | | |
| 15 April 2020 | 59 | 162252 | | |
| 30 April 2020 | 74 | 481715 | | |
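The error column above can be reproduced from the predicted and actual values alone. The short sketch below is my reading of the arithmetic rather than the original spreadsheet formula: the figures in the table match (predicted − actual) divided by the predicted value. For comparison, it also prints the more common convention of dividing by the actual value. The function names are hypothetical.

```python
def error_vs_predicted(predicted, actual):
    """Percentage error relative to the predicted value (matches the table)."""
    return (predicted - actual) / predicted * 100


def error_vs_actual(predicted, actual):
    """Percentage error relative to the actual value (the more common convention)."""
    return (predicted - actual) / actual * 100


# Rows of the table that have both a predicted and an actual value.
rows = [
    ("25 March 2020", 18719, 21282),
    ("01 April 2020", 42489, 47192),
]

for label, predicted, actual in rows:
    print(
        label,
        f"vs predicted: {error_vs_predicted(predicted, actual):.1f}%",
        f"vs actual: {error_vs_actual(predicted, actual):.1f}%",
    )
```

Either way, the model underestimates the actual deaths on both dates by roughly 10 to 14 percent.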

I wondered whether an error of this size was too large for the equation to be used to forecast future deaths.

I then realized that the data being analyzed comprised two main sections: the slow ‘creeping’ rise and the ‘galloping’ rise. This could be the reason for the inaccuracy of the predictions. To find an equation that better describes the current scenario, I then used only the recent data points, from 5th March onwards.

The chart and the equation are shown below:

Graph 5: A degree-4 polynomial of best fit (beginning Mar 5, 2020)

The equation this time around is

$$y = 0.0337x^4 + 0.7715x^3 + 5.0925x^2 + 94.975x + 3296.3$$

where y is the number of deaths to date and x is the number of days since March 5th.

I again computed y for a few selected values of x:

Start date: 05 March 2020

| Date | Days elapsed (x) | Predicted | Actual | Error |
| --- | --- | --- | --- | --- |
| 25 March 2020 | 20 | 18797 | 21282 | -13.2% |
| 01 April 2020 | 27 | 42668 | 47192 | -10.6% |
| 05 April 2020 | 31 | 65241 | | |
| 10 April 2020 | 36 | 105913 | | |
| 15 April 2020 | 41 | 164151 | | |
| 30 April 2020 | 56 | 491495 | | |
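The predicted column in this table follows directly from the March 5th equation. As a minimal sketch, the Python below counts the days elapsed since 5 March 2020 and evaluates the polynomial using the coefficients as printed above. The names `START`, `COEFFS`, `days_since_start` and `predict` are my own, chosen for illustration.

```python
from datetime import date

START = date(2020, 3, 5)

# Coefficients of the March 5th equation as printed, highest power first.
COEFFS = [0.0337, 0.7715, 5.0925, 94.975, 3296.3]


def days_since_start(day, start=START):
    """Number of days elapsed between the start date and the given day (x)."""
    return (day - start).days


def predict(x, coeffs=COEFFS):
    """Evaluate the polynomial at x using Horner's method."""
    result = 0.0
    for coefficient in coeffs:
        result = result * x + coefficient
    return result


for day in (date(2020, 3, 25), date(2020, 4, 1), date(2020, 4, 5),
            date(2020, 4, 10), date(2020, 4, 15), date(2020, 4, 30)):
    x = days_since_start(day)
    print(day.isoformat(), x, round(predict(x)))
```

Rounded to whole numbers, these values agree with the predicted column of the table, for example 18797 for x = 20 and 491495 for x = 56.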

The error % is only marginally lower than for the equation starting Feb 16th. This suggests that Excel's curve fit works well and is already driven mainly by the more recent data when the curve has two or three distinct parts. Still, I am happy to have reduced the error by 0.5 percentage points because, in a situation like this, every digit counts.

I think the above equation (starting March 5th) will remain the best-fit equation for COVID-19 deaths for some time at least. I have modeled it using the data available. The model does not take into consideration unforeseen events such as the discovery of a vaccine or a community flare-up in Africa or India.

I am sure that a more accurate model using more rigorous mathematics can be developed; however, my model, based on publicly available data, only serves to help the reader grasp quantitatively the havoc that COVID-19 has wreaked across the world.

The ultimate driving force behind my writing this is to make you realize that it’s only an upward curve from here unless we ALL take action. Let’s be responsible citizens, not only of our own countries but also of the world. If the coronavirus hits the slums of the Indian subcontinent or Africa, the situation will worsen beyond control. Most poor people are following the lockdown, because of which they have lost their daily source of income and are struggling to afford basic necessities. On the other hand, there are relatively privileged people sitting at home who have the means to help them indirectly. Whenever you see a “please donate” link on Instagram or any other social media platform, please do not scroll past it and dismiss it with “Oh, someone else will do it” or “Oh, I do not have the time”. Make an impact. Something that may seem small and inconsequential to you can make a big difference to a family.

Through our smart actions, we alone can stop this Equation of mine from becoming the “Equation of Death”!

## One thought on “The Math of the Macabre (or The Equation of Death)”

1. Great analysis! This made me stop and think. Maybe you should also share your Excel sheet so that others can model the data and benefit from your work.