Model simulations of the trajectory of COVID-19 in Maharashtra

(Murad Banaji, 24/05/2020)

Some updated estimates for COVID-19 in Maharashtra are presented. We know that the situation, particularly in Mumbai, is bad – worse than can be inferred directly from the numbers. Stories of overloaded hospitals and of COVID patients being unable to access care are frequent. One question is whether the epidemic in Maharashtra has peaked in terms of total infectious cases. The current data doesn’t tell us definitively whether this has happened, but suggests it is unlikely. We find that various evolutions are still possible, all compatible with the data so far.

We use the stochastic, agent-based model previously described (description at, code at) to make some remarks about the mechanisms by which lockdown can slow the spread of COVID-19, focussing on the Indian data. First, some general remarks.

If we assume an IFR of 0.5% and run model simulations (below) taking into account systematic undercounting of COVID-19 deaths, we find that there have been about 4000 to 5000 deaths in Maharashtra and 2 to 2.5 million COVID-19 infections cumulatively to date. If two thirds are in Mumbai, then around 7-9% of Mumbai’s 18M population have had COVID-19. By comparison, the corresponding figure for London is about 17% and the figure for New York was 21% in late April.
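The arithmetic behind the Mumbai figure can be checked with a short sketch. The infection totals, the two-thirds share and the 18M population are the values quoted above; the function name is made up for illustration, and none of this is part of the simulation code.

```python
def attack_rate_percent(total_infections, share_in_city, city_population):
    """Percentage of the city's population infected to date."""
    return 100.0 * total_infections * share_in_city / city_population


MUMBAI_POPULATION = 18_000_000  # population figure quoted in the text
MUMBAI_SHARE = 2 / 3            # assumed share of Maharashtra's infections in Mumbai

# Lower and upper ends of the 2 to 2.5 million range of cumulative infections.
low = attack_rate_percent(2_000_000, MUMBAI_SHARE, MUMBAI_POPULATION)
high = attack_rate_percent(2_500_000, MUMBAI_SHARE, MUMBAI_POPULATION)

print(f"{low:.1f}% to {high:.1f}%")  # prints "7.4% to 9.3%"
```

The result is consistent with the 7-9% range given above; the estimate moves in direct proportion to the assumed Mumbai share.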

Coming to the question of whether we are nearing or have even passed the peak in Maharashtra, this is not so clear. To see why, note that we can fit the confirmed infections data to date reasonably well using a variety of different model parameter values. Figure 1 shows plots from four simulations, all of which match existing confirmed infections data reasonably well. Comments on the mismatch between model predictions and confirmed fatality data follow.

Figure 1. Four possible scenarios for how disease progresses in Maharashtra. Details in the text.

In all cases, the model estimates more than the recorded deaths, predicting about 4000 deaths to date where only 1576 have so far been recorded. This is likely a consequence of systematic undercounting which followed a change on April 15th in the protocol for how COVID-19 deaths would be treated. This protocol change led to a precipitous fall in Maharashtra’s case fatality rate, which remains today at about half of its peak value. We should remark that this misguided protocol change has interfered with the ability to track the course of the pandemic in Maharashtra reliably and systematically, by making recorded fatality data, a key ingredient in monitoring the disease, effectively useless.
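To make the size of the gap concrete, here is a minimal sketch of the implied undercount, using only the two numbers quoted above (about 4000 modelled deaths and 1576 recorded deaths). The variable names are illustrative, and the modelled figure is itself an estimate that depends on the assumed IFR.

```python
MODELLED_DEATHS = 4000   # approximate model estimate quoted in the text
RECORDED_DEATHS = 1576   # recorded deaths at the time of writing

# Fraction of modelled deaths that appear in the official record.
recorded_fraction = RECORDED_DEATHS / MODELLED_DEATHS

# Factor by which modelled deaths exceed recorded deaths.
undercount_factor = MODELLED_DEATHS / RECORDED_DEATHS

print(f"recorded fraction: {recorded_fraction:.0%}")
print(f"undercount factor: {undercount_factor:.1f}")
```

On these numbers, roughly two in five modelled deaths are recorded, that is, a gap of about a factor of 2.5.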

Returning to the main theme, we see that in the most optimistic case, infectious cases (purple curves) are peaking around the time of writing at about 0.76M cases, and are beginning a slow descent. Note that infectious cases count only those people who are within the “infective window” (day 3 to day 14 of infection in these simulations), and not those who may be pre-infectious or post-infectious. At the parameters used in this simulation, the descent after the peak is so slow that it takes about one month for infectious cases to halve.
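To give a feel for what a one-month halving time means, the sketch below converts it into a daily rate of decline. It assumes, purely for illustration, that the descent is exponential and that "about one month" means 30 days; the agent-based model itself makes no such assumption, and the function names are hypothetical.

```python
import math


def daily_decay_rate(halving_days):
    """Exponential decay rate per day implied by a given halving time."""
    return math.log(2) / halving_days


def cases_after(initial_cases, days, halving_days):
    """Infectious cases remaining after `days`, under exponential decay."""
    return initial_cases * 0.5 ** (days / halving_days)


HALVING_DAYS = 30      # assumption: "about one month" taken as 30 days
PEAK_CASES = 760_000   # peak in the most optimistic scenario (0.76M)

rate = daily_decay_rate(HALVING_DAYS)
print(f"daily decline: {100 * rate:.1f}%")

for day in (30, 60):
    print(day, round(cases_after(PEAK_CASES, day, HALVING_DAYS)))
```

Under these assumptions the decline is a little over 2% per day, so even two months after the peak there would still be about 0.19M infectious cases.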

Somewhat less optimistic is a scenario with a peak at the end of May with about 0.92M infectious cases. A third scenario, more pessimistic still, has a peak in the first half of June with about 1.4M infectious cases. Finally, in the most pessimistic case there is no foreseeable peak, with infectious cases reaching 4M shortly after mid-June. The parameter values used to obtain these plots are given in the Appendix.

The main difference between the simulations in Figure 1 is the extent to which control of the disease is being achieved through physical distancing and a consequent drop in transmission (parameter pdeff_lockdown), versus disease “localisation” (parameters infectible_proportion and popleak), namely, the extent to which disease has been stopped from reaching new areas. The moment at which lockdown occurs (parameter lockdown_at_test) is also changed slightly to obtain better matching with the measured data. The optimistic scenario represents a high degree of disease localisation – as high as is consistent with the data – while at the other end we have no disease localisation and only physical distancing taking place.

Of course there are other parameter choices which could be made and which would differ in their details. For example, if we change the IFR, then the absolute numbers of infectious cases change, but not the trends. We could also imagine scenarios where physical distancing weakens, or leak into the infectible population rises, causing the slope of the curve to increase again.

The main conclusion is that the current data from Maharashtra is not sufficient to tell us with any confidence what is likely to happen in the weeks to come. However, if we return to the data periodically we may have better insight into which of these scenarios might unfold.


Appendix

Here are the parameter values used to generate the four plots in Figure 1. Note that a scaling was applied to speed up computations – all populations appear at one tenth of their value, and death rates are multiplied by 10.

Figure 1 (lexicographically ordered)

number_of_runs 10
death_rate 5.0
geometric 1
R0 4.2
totdays 150
inf_start 3
inf_end 14
time_to_death 16
dist_on_death 6
time_to_recovery 20
dist_on_recovery 6
initial_infections 10
percentage_quarantined 3.7
percentage_tested 100
testdate 12
dist_on_testdate 6
herd 1
population 4000000
physical_distancing 0
pd_at_test N/A
pdeff1 N/A
haslockdown 1
lockdownlen 150
infectible_proportion 0.05, 0.1, 0.2, 1
lockdown_at_test 12, 12, 14, 20
pdeff_lockdown 53.5, 58, 60, 62
popleak 4000, 3000, 3000, 3000
popleak_start_day 10
sync_at_test 100
sync_at_time 24
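The comma-separated entries above give one value per plot. As a reading aid, the sketch below expands them into four parameter sets and undoes the scaling for the two quantities the text names. The dictionary layout and function names are hypothetical and are not the simulator's input format; only a subset of the shared parameters is included; the ordering from most to least optimistic is inferred from the infectible_proportion values; and reading death_rate as a percentage is an assumption (5.0 divided by 10 gives the 0.5% IFR used in the text). Whether popleak should also be rescaled is not stated, so it is left unchanged.

```python
# Subset of the parameters shared by all four plots (values from the list above).
SHARED = {
    "R0": 4.2,
    "death_rate": 5.0,
    "population": 4_000_000,
    "popleak_start_day": 10,
}

# Parameters that differ between plots, one value per plot, in the listed order.
VARYING = {
    "infectible_proportion": [0.05, 0.1, 0.2, 1],
    "lockdown_at_test": [12, 12, 14, 20],
    "pdeff_lockdown": [53.5, 58, 60, 62],
    "popleak": [4000, 3000, 3000, 3000],
}

SCALE = 10  # populations at one tenth, death rates multiplied by 10


def scenarios():
    """Return one parameter dictionary per plot."""
    count = len(next(iter(VARYING.values())))
    return [
        {**SHARED, **{name: values[i] for name, values in VARYING.items()}}
        for i in range(count)
    ]


def unscale(params):
    """Undo the computational scaling for population and death rate only."""
    real = dict(params)
    real["population"] = params["population"] * SCALE
    real["death_rate"] = params["death_rate"] / SCALE  # assumed to be a percentage
    return real


for number, params in enumerate(scenarios(), start=1):
    real = unscale(params)
    print(number, real["infectible_proportion"], real["pdeff_lockdown"],
          real["population"], real["death_rate"])
```

Reading the rows this way makes the trade-off described in the text visible: as infectible_proportion rises from 0.05 to 1 (less localisation), pdeff_lockdown rises from 53.5 to 62 (stronger physical distancing) to keep the fit to the confirmed infections data.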