Assessing and predicting epidemiological dynamics of the COVID-19 pandemic in Poland

Rafał J. Mostowy

Malopolska Centre of Biotechnology, Jagiellonian University in Krakow, Poland

Publishing date: 30/03/2020

Date of last data entry: 27/03/2020

Correspondence:

Summary

Poland, like other countries, is currently undergoing the COVID-19 pandemic, caused by the SARS-CoV-2 virus. However, to date few epidemiological predictions of the course of the Polish epidemic have been made. Here, I use a mathematical model which mimics a SARS-like epidemic and fit it to the number of deaths in Poland, as reported by the Ministry of Health, to quantify and predict the epidemiological dynamics in Poland. Specifically, I estimate the true start of the epidemic in Poland and the degree of case under-reporting, and attempt to project the epidemic curve in the coming months depending on the efficiency of governmental social distancing measures. The main findings are summarised below.

  • I estimate that the Polish epidemic began sometime around the second half of January. This suggests that the first documented COVID-19 case in Poland, reported on 04/03/2020, is unlikely to have been the first infected patient in Poland.

  • We currently have a large degree of case under-reporting. I estimate that we are probably missing between 50% and 75% of actual cases, a common phenomenon in many other countries. This number coincides with the previous estimates of the proportion of the population for whom the SARS-CoV-2 infection is asymptomatic or mild.

  • The degree of case under-reporting has been declining in recent days, presumably thanks to the increasing number of viral infection tests performed daily. I find that the proportion of cases reported will continue to increase only if social distancing measures substantially reduce viral transmission.

  • The effectiveness of governmental social distancing measures will be crucial for curbing the epidemic in the coming months, helping to prevent tens of thousands of unnecessary deaths.

  • Even if social distancing measures reduce viral transmission so strongly that the epidemic fades away in the coming weeks, the majority of the population will remain susceptible, leaving Poland vulnerable to future COVID-19 outbreaks. Given how little we know about the impact of seasonal variation on SARS-CoV-2 transmission, Poland needs a long-term strategy for dealing with the pandemic by preventing future outbreaks until mass vaccination becomes available.

Introduction

As of today, I have seen few attempts to predict the epidemiological dynamics of the COVID-19 pandemic in Poland. The ones I have seen rely on fitting exponential functions to the number of reported cases. This approach has at least two major flaws. The first flaw is that exponential growth only holds at the beginning of the epidemic, when the majority of the population is still susceptible to infection, and hence it cannot be used to assess when the epidemic curve will slow down and reach a maximum. The second flaw is that, as COVID-19 is thought to be asymptomatic in a large proportion of the population, the number of reported cases will depend not only on the epidemiological dynamics but also on the extent to which people are being tested for the presence of the virus. Hence, the ability to predict the outcome of the epidemic in Poland, as in any other country, requires an approach that at the very least can account for both of these factors.

Here I use an approach similar to the one developed by my colleagues at the University of Bern, which is based on the idea of fitting an SEIR epidemiological model to the number of reported deaths in COVID-19 patients, arguing that such cases are unlikely to be missed. I fit this model to the official data provided by the Polish Ministry of Health, kindly collected and made publicly available by Michał Rogalski. These data show that the first reported and confirmed case of SARS-CoV-2 infection in Poland was on March 4th. The Polish government introduced the first restrictions 10 days later, on March 14th. Using these tools, I attempt to provide insight into the seeding time of the epidemic, estimate the proportion of undetected cases as of 26/03, and predict different epidemic scenarios depending on the efficacy of the governmental restrictions. I also consider caveats of this approach and comment on the limitations of mathematical modelling in epidemiology. Finally, I provide the code written using R Markdown, which anyone is welcome to download and use.

Methodology

The underlying epidemiological model is based on the classical SEIR approach, with the important difference that it explicitly considers people who are admitted to the hospital, those admitted to the ICU and those who die from infection. The model is summarised in Figure 1.

Figure 1. Summary of the epidemiological model used. Susceptibles (S) become exposed (E) when they come into contact with an infectious person (I) at a rate \(\beta I\), and the exposed become infectious at a rate \(\sigma\). Infectious individuals recover (R) at a rate \((1-\epsilon_1)\gamma\) and become hospitalised (H) at a rate \(\epsilon_1\gamma\). The hospitalised recover at a rate \((1-\epsilon_2)\omega_1\) and get admitted to the ICU (V) at a rate \(\epsilon_2\omega_1\). The ICU patients recover at a rate \((1-\epsilon_3)\omega_2\) and die (D) at a rate \(\epsilon_3\omega_2\). In this model, \(\beta=R_0\gamma/N\), where \(R_0\) is the basic reproduction number and \(N\) is the population size. The epidemic begins at time \(t_{case}-T_\text{seed}\), where \(t_{case}\) is the date of the first reported case (04/03) and \(T_\text{seed}\) is the period of time between the first infection and the first reported case. At time \(t_{case}+10\), namely 14/03, governmental restrictions are introduced and the reproduction number is assumed to be reduced to \(\kappa R_0\), where \(\kappa\in[0,1]\).
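Written out as ordinary differential equations, the flows described in the Figure 1 caption correspond to the system below. This is a reconstruction from the caption, not a formula quoted from the accompanying code; it assumes mass-action transmission and that each compartment is left at the stated per-capita rate.

\[
\begin{aligned}
\frac{dS}{dt} &= -\beta(t)\,S\,I, \\
\frac{dE}{dt} &= \beta(t)\,S\,I - \sigma E, \\
\frac{dI}{dt} &= \sigma E - \gamma I, \\
\frac{dH}{dt} &= \epsilon_1\gamma I - \omega_1 H, \\
\frac{dV}{dt} &= \epsilon_2\omega_1 H - \omega_2 V, \\
\frac{dR}{dt} &= (1-\epsilon_1)\gamma I + (1-\epsilon_2)\omega_1 H + (1-\epsilon_3)\omega_2 V, \\
\frac{dD}{dt} &= \epsilon_3\omega_2 V,
\end{aligned}
\qquad
\beta(t) =
\begin{cases}
R_0\gamma/N, & t < t_{case}+10, \\
\kappa R_0\gamma/N, & t \ge t_{case}+10.
\end{cases}
\]

The right-hand sides sum to zero, so the total \(S+E+I+H+V+R+D=N\) is conserved.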

I used the existing literature to make assumptions about the parameters used in the model above.

| Parameter | Value | Source |
|---|---|---|
| Population size of Poland | 38 386 000 | Statistics Poland (GUS) |
| Serial interval (latent + infectious period) | 7.5 days | Li et al. |
| Duration of hospitalisation for mild and severe cases | 8 days | Imperial College COVID-19 Response Team: Report 9 |
| Additional duration of hospitalisation for critical cases | 8 days | Imperial College COVID-19 Response Team: Report 9 |
| Proportion of hospitalised cases | 5% | Adapted from Imperial College COVID-19 Response Team: Report 9 |
| Proportion of critical cases | 2.5% | Adapted from Imperial College COVID-19 Response Team: Report 9 |
| Overall case fatality ratio | 1.25% | Adapted from Imperial College COVID-19 Response Team: Report 9 |
| Basic reproduction number \(R_0\) | 2.0-2.6 | Adapted from Imperial College COVID-19 Response Team: Report 9 |

Table 1. Parameters of the COVID-19 transmission model.

The model is fitted to the data using a maximum-likelihood approach by comparing the predicted with the reported number of deaths. The daily number of deaths is assumed to be Poisson distributed. Using the parameter values assumed above, I estimate the values of \(T_\text{seed}\) and \(\kappa\) that best explain the observed data given the assumed model. I consider four different values of \(R_0=\{2.0, 2.2, 2.4, 2.6\}\).
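The fitting procedure described above can be sketched in a few lines of Python. This is an illustration only and not the R Markdown code that accompanies the report. Several choices are assumptions made for the sketch: the 7.5-day serial interval is split evenly between the latent and infectious periods, the proportions in Table 1 are read as shares of all infections (so that \(\epsilon_2=\epsilon_3=0.5\)), the epidemic is seeded with a single infectious person, the equations are integrated with a simple Euler scheme, and a grid search stands in for a proper optimiser. All function names are hypothetical.

```python
import math

N = 38_386_000            # population size (Table 1)
OMEGA1 = 1 / 8            # 1 / duration of hospitalisation (Table 1)
OMEGA2 = 1 / 8            # 1 / additional duration for critical cases (Table 1)
EPS1 = 0.05               # share of infected who are hospitalised (Table 1)
EPS2 = 0.025 / 0.05       # ASSUMPTION: 2.5% critical is a share of all infections
EPS3 = 0.0125 / 0.025     # ASSUMPTION: 1.25% fatality is a share of all infections
SIGMA = GAMMA = 1 / 3.75  # ASSUMPTION: 7.5-day serial interval split evenly


def cumulative_deaths(r0, kappa, t_seed, days_after_case, dt=0.1):
    """Euler-integrate the model starting t_seed days before the first reported case.

    Returns (deaths, final_state), where deaths[k] is the cumulative number of
    deaths k days after the first reported case (k = 0..days_after_case).
    """
    # ASSUMPTION: the epidemic is seeded by a single infectious person.
    s, e, i, h, v, r, d = N - 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0
    steps = round(1 / dt)
    deaths = []
    for day in range(-t_seed, days_after_case + 1):
        if day >= 0:
            deaths.append(d)
        # Restrictions start 10 days after the first reported case.
        beta = (kappa if day >= 10 else 1.0) * r0 * GAMMA / N
        for _ in range(steps):
            new_e = beta * s * i * dt
            new_i = SIGMA * e * dt
            out_i = GAMMA * i * dt
            out_h = OMEGA1 * h * dt
            out_v = OMEGA2 * v * dt
            s -= new_e
            e += new_e - new_i
            i += new_i - out_i
            h += EPS1 * out_i - out_h
            v += EPS2 * out_h - out_v
            r += (1 - EPS1) * out_i + (1 - EPS2) * out_h + (1 - EPS3) * out_v
            d += EPS3 * out_v
    return deaths, (s, e, i, h, v, r, d)


def expected_daily_deaths(r0, kappa, t_seed, n_days):
    """Predicted new deaths on each of the n_days following the first reported case."""
    deaths, _ = cumulative_deaths(r0, kappa, t_seed, n_days)
    return [deaths[k + 1] - deaths[k] for k in range(n_days)]


def poisson_nll(observed, expected):
    """Negative log-likelihood of daily death counts under a Poisson model."""
    nll = 0.0
    for k, mu in zip(observed, expected):
        mu = max(mu, 1e-12)  # guard against log(0)
        nll -= k * math.log(mu) - mu - math.lgamma(k + 1)
    return nll


def fit(observed, r0, t_seed_grid, kappa_grid):
    """Grid search for the (t_seed, kappa) pair that minimises the Poisson NLL."""
    best = None
    for t_seed in t_seed_grid:
        for kappa in kappa_grid:
            mu = expected_daily_deaths(r0, kappa, t_seed, len(observed))
            score = poisson_nll(observed, mu)
            if best is None or score < best[0]:
                best = (score, t_seed, kappa)
    return best  # (negative log-likelihood, t_seed, kappa)
```

Given a list of reported daily deaths starting on the day of the first reported case, `fit(observed, 2.6, range(20, 61), [k / 10 for k in range(11)])` would return the best-fitting seeding time and \(\kappa\) on that grid for \(R_0=2.6\). Confidence intervals would require profiling the likelihood, which this sketch omits.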

Results

0. Importance of the parameter \(\kappa\) (kappa)

As far as the predictions of the model go, \(\kappa\) is the most important parameter in determining the impact of the COVID-19 pandemic in Poland. It represents the impact of the governmental social distancing interventions (so-called “lock-downs”) on the viral basic reproduction number \(R_0\) (number of secondary infections from an infectious person in a fully susceptible population; the epidemic grows when \(R_0>1\) and fades away when \(R_0<1\)). Specifically, \(\kappa=1\) corresponds to the situation where the interventions have no impact on viral transmission (\(R_0\) does not change), whereas \(\kappa=0\) corresponds to the situation where the interventions completely stop viral transmission (\(R_0=0\)). Neither of these extremes is likely to be the case, and in reality \(\kappa\) will lie somewhere between 0 and 1. The purpose of this approach is to estimate \(\kappa\). However, as of 26/03/2020 there is not enough signal in the data to do so: for each of the assumed values of \(R_0\), the estimated \(\kappa\) was close to 1, but this result was not statistically significant (the 95% confidence intervals spanned [0,1]).
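Because the post-intervention reproduction number is \(\kappa R_0\), the epidemic starts to decline (while nearly everyone is still susceptible) only when \(\kappa<1/R_0\). A minimal Python sketch, with an illustrative function name that is not part of the report's code, tabulates this threshold for the four assumed values of \(R_0\):

```python
def critical_kappa(r0):
    """Value of kappa at which the post-intervention reproduction number kappa * R0 equals 1."""
    return 1.0 / r0


for r0 in (2.0, 2.2, 2.4, 2.6):
    # The epidemic declines only if interventions push kappa below this value.
    print(f"R0 = {r0}: epidemic declines if kappa < {critical_kappa(r0):.2f}")
```

The higher the assumed \(R_0\), the stronger the reduction in transmission needed to stop the epidemic from growing.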

1. When did the Polish epidemic really start?

On the other hand, the estimates of \(T_\text{seed}\) were statistically significant, so I used them to estimate the time when the Polish epidemic started (i.e., was “seeded”). The results, shown in Figure 2, suggest that the epidemic we are seeing in Poland began in the second half of January. Please note that this does not mean that the virus physically appeared in Poland then, as the true “patient zero” likely contracted the virus abroad and subsequently brought it to Poland. However, this suggests that the virus was most likely circulating in Poland much earlier than 04/03, when the first case was reported in the media.

Figure 2. Estimated time of seeding infection in Poland. Bars show the 95% confidence intervals of the estimated time, \(T_\text{seed}\), translated to actual dates for an assumed value of \(R_0\). Dashed red bar shows the start of February.

2. How many infected people are there in Poland right now?

A major challenge with the COVID-19 pandemic is that many people are likely asymptomatic, at least for a certain period of time, which makes it difficult to detect all cases. A recent report from the London School of Hygiene and Tropical Medicine (LSHTM) suggests that in many countries there is a substantial degree of under-reporting of SARS-CoV-2 infected patients. The number of daily RT-PCR tests carried out in Poland has been increasing each day, but it is unclear whether we have been getting better at countering such under-reporting or not. Using the approach proposed here, I tried to assess the scale of and the trend in under-reporting, assuming that the model can predict the actual number of cases well. The results are shown in Figure 3. They show that in Poland there is a considerable degree of case under-reporting, but that its magnitude strongly depends on the efficiency of the social distancing measures introduced by the government. I estimate that the reported cases constituted around 6%-8% of all infected people in Poland on 14/03, when such measures were introduced. However, my approach suggests that Poland has been getting better at countering under-reporting, as the proportion of detected cases has been increasing. As of 26/03, I find that we have been capturing somewhere between 15% and 57% of all SARS-CoV-2-infected people, largely depending on the value of \(\kappa\). This degree of under-reporting is not very surprising given the epidemiology of the disease and the testing capacity in many countries. For example, the LSHTM report mentioned above suggests that as of March 23rd, the proportion of cases reported varied between countries from under 20% (e.g., Italy, Spain, Turkey, UK, France) to somewhere between 50% and 95% (Germany, South Korea). However, such comparisons should be made with caution, as different countries are at different stages of the epidemic and the scale of under-reporting is expected to change (hopefully decrease) over time.

If we assume that in reality the value of \(\kappa\) lies somewhere between 0.2 and 0.8, then in Poland we should currently be reporting somewhere around 18%-45% of all infected cases. Importantly, I found that the increasing trend in the number of tests performed daily (29700 tests performed on 26/03) may not be enough if transmission has not been substantially curbed. As demonstrated in Figure 3 (middle panel), for values of \(\kappa\) close to 1, I found that the growth in the number of tests per infected person is slowing down or reversing, depending on the value of \(R_0\). By contrast, for values of \(\kappa\) closer to 0 this number has been steadily increasing. Given that we are somewhere in between, it is likely that we have been detecting an increasing proportion of all cases in Poland. For example, if we assume that \(R_0=2.6\) and \(\kappa=0.5\) (the basic reproduction number has been halved to 1.3 due to lockdowns), then on 04/03 we detected 0.4% of all cases, on 14/03 we detected 7.7% of all cases and on 26/03 we detected 28.8% of all cases. Nevertheless, estimating the efficiency of governmental interventions is necessary to estimate more accurately how case under-reporting is changing over time.

Figure 3. Trends in case reporting and testing in Poland. The plots show the trends in case reporting and testing over time (X-axis), with each row showing the results for a different assumed value of \(R_0\). The left plot shows the number of reported cases (black) and the predicted number of cases (infected individuals) for different values of \(\kappa\) (shades of blue), where \(\kappa=1\) reflects unchanged transmission post-intervention and \(\kappa=0\) reflects no transmission post-intervention. The predicted scale of under-reporting over time is estimated in the middle plot, which shows the ratio of confirmed cases to the predicted number of cases for different values of \(\kappa\). The right plot shows the ratio of the number of tests performed to the predicted number of cases for different values of \(\kappa\). Note that a number of tests greater than the predicted number of true cases does not imply that enough tests are being conducted, as the vast majority of the people tested test negative for the virus. The red dashed line shows the time of introduction of the governmental restrictions on 14/03.
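The two ratios plotted in the middle and right panels of Figure 3 are simple quotients of observed counts and model predictions. The Python sketch below shows the arithmetic; the function name is hypothetical, the use of cumulative counts is an assumption, and the input numbers are made up for illustration and are not data from this report.

```python
def reporting_ratios(confirmed_cases, tests_performed, predicted_cases):
    """Return (share of predicted cases that were confirmed, tests per predicted case)."""
    if predicted_cases <= 0:
        raise ValueError("predicted_cases must be positive")
    return confirmed_cases / predicted_cases, tests_performed / predicted_cases


# Made-up illustration values, NOT data from the report.
share, tests_per_case = reporting_ratios(
    confirmed_cases=200, tests_performed=5_000, predicted_cases=1_000
)
print(f"{share:.1%} of predicted cases confirmed, {tests_per_case:.1f} tests per predicted case")
```

Because the denominator comes from the model, both ratios inherit its dependence on \(\kappa\) and \(R_0\).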

3. How will social distancing impact the COVID-19 pandemic in Poland?

As the efficiency of the governmental social distancing measures, \(\kappa\), cannot be reliably estimated for now, I decided to project the impact of the COVID-19 pandemic in Poland in the coming months depending on this efficiency (see Figure 4), assuming the most infectious scenario of those considered, namely \(R_0=2.6\). The results show that the outcome of the epidemic will depend hugely on the post-intervention viral transmission. Even if the interventions have been so effective that the post-intervention reproduction number, \(\kappa R_0\), has been reduced to below \(1\) (the epidemic stops growing and starts fading away), the time for the epidemic wave to end will still depend on its value. For example, if it has been reduced to below \(0.5\), the epidemic curve should fade away during early summer; however, if it ends up being between 0.5 and 1, it could take much longer. Conversely, if we cannot bring it down below 1, the effect on population health and on the health-care system could be considerable, with the risk of tens of thousands of deaths or even more by summer. Hence, these simulations show that the efficiency of governmental social distancing measures will play an important role in preventing thousands of unnecessary deaths.

However, efficient social distancing and a quickly fading epidemic wave have a flip side, as most of the population would then remain susceptible to the infection. As shown in Figure 4, a large number of susceptibles means that we would not reach the level of herd immunity required to prevent future COVID-19 outbreaks in Poland, and such outbreaks could occur once the governmental social distancing measures are abandoned. Hence, until vaccination becomes widely available to help artificially generate such herd immunity, a long-term strategy for preventing such outbreaks in Poland should be considered (for example via alternative measures like serological tests, contact tracing, proactive isolation etc.).

Figure 4. Projections of the pandemic in Poland. Predictions for the impact of the COVID-19 pandemic in Poland depending on the value of \(\kappa\), represented by different shades (darkest to lightest represent the least to the most effective intervention, respectively). Top (blue) panel shows the proportion of the population who are susceptible to infection at a given time. Dashed blue line shows the maximum proportion of the population, \(1/R_0\) (38% assuming \(R_0=2.6\)), that can remain susceptible while herd immunity is maintained (a susceptible proportion above the dashed line could lead to new outbreaks). The second (red) panel shows the projected number of hospitalised people. The third (purple) panel shows the projected number of people in intensive care units. The fourth, bottom (black) panel shows the projected cumulative number of deaths. The red dashed line shows the time of introduction of the governmental restrictions on 14/03, and the black dashed line shows the most recent data entry.
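The \(1/R_0\) threshold in the top panel follows from a standard argument, sketched here under the model's assumption of homogeneous mixing. The symbol \(R_\text{eff}\) is introduced only for this illustration and denotes the average number of secondary infections per infectious person once restrictions are lifted (\(\kappa=1\)) and a fraction \(S/N\) of the population is still susceptible:

\[
R_\text{eff} = R_0\,\frac{S}{N} < 1
\quad\Longleftrightarrow\quad
\frac{S}{N} < \frac{1}{R_0} = \frac{1}{2.6} \approx 0.38 .
\]

Equivalently, at least \(1-1/R_0\approx 62\%\) of the population would need to be immune for a newly introduced infection to fail to cause an outbreak.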

Conclusions

This analysis is based on an approach that has several important limitations. First, any predictions of this model are constrained by the properties of the model as well as by the parameters it assumes. I based parameter values on those reported in the scientific literature; however, their magnitude will affect the quantitative predictions presented here. One important example is the case-fatality rate, here assumed to be 1.25%, which is expected to strongly affect the estimated scale of under-reporting. It remains to be examined how the results differ when the assumed case-fatality rate is lower or higher. In general, the proposed framework should not be used to make quantitative claims about the future impact of the epidemic, but rather to assess the impact of certain assumptions (e.g., the reduction in transmission due to governmental social distancing measures, \(\kappa\)) on the epidemiological dynamics.

Nevertheless, the qualitative predictions from this analysis are not particularly surprising to anyone studying infectious disease dynamics. First, the Polish epidemic very likely started a few weeks before the first case was reported. Second, the currently reported numbers of infected patients are probably a considerable under-estimate of the true numbers, given the still relatively low number of RT-PCR tests carried out in Poland compared to other countries like South Korea (although a low percentage of cases reported is common in many countries at the moment). Third, the efficacy of the governmental social distancing measures (“lockdowns”) will strongly impact the course of the epidemic. Finally, until a COVID-19 vaccine becomes available, complete abandonment of social distancing measures will carry a risk of epidemic re-emergence in Poland. Given how unlikely it is that SARS-CoV-2 will go away in the coming months, there is a need for complementary strategies for preventing such outbreaks.

While this analysis is my first attempt to predict the epidemiological dynamics of COVID-19 and I will be getting feedback from other researchers in the field, anyone is welcome to download the code, modify it and implement their own improvements.

Acknowledgements

I thank Christian Althaus for kindly sharing his approach publicly, Krzysztof Słomczyński for help with R Markdown issues and the Spokesmen for Science Society for the help in communicating the results of the report.