  • Esha Noor
  • 10 Jul
  • 6 min read

Predicting Disease Outbreaks Before They Spread Using AI

Since 1980, the number of global outbreaks has increased significantly. The recent COVID-19 outbreak alone is estimated to have taken the lives of over 7 million individuals. If this pandemic taught us anything, it's that early detection can make the difference between containment and catastrophe. In an increasingly digitized world, the first signs of a pandemic may not appear as physical symptoms at all, but rather as a spike in Google searches for the word “fever,” or a flurry of news reports from across the globe about a mysterious illness. Potential outbreaks can also be signalled by biological markers, such as an unexpected increase in viral fragments detected in wastewater, which often surfaces days or weeks before clinical cases are first reported.

“Exploring the ways in which AI and ML predictive tools are being applied in the field of epidemiology.”

This is where AI comes in. By applying machine learning (ML) and natural language processing (NLP) techniques, AI-driven tools can sift through massive streams of online and public health data, detecting patterns and anomalies that may indicate the onset of a disease outbreak. In this article, we'll use real-world examples to examine how AI-powered systems aim to transform outbreak monitoring, and highlight both the promises and the challenges of integrating AI into global public health.

Case Study: BlueDot, or How AI Predicted a Pandemic

BlueDot is a digital-health company based in Toronto. Nine days before the World Health Organization (WHO) first reported on the novel coronavirus, BlueDot detected an unusual rise in cases of pneumonia in Wuhan and promptly alerted its clients. BlueDot would then go on to correctly identify 12 other major cities that would subsequently be impacted by the COVID-19 pandemic. This raises the question: how did a smaller, Canada-based digital health company predict a major disease epidemic before the WHO, the leading global authority on disease outbreaks and public health emergencies? The answer lies in data analytics, AI, and ML techniques. BlueDot performed data ingestion at a massive scale, scanning news articles, health reports, airline data, and more on a daily basis, in over 65 different languages. The company also used NLP and ML to parse relevant and critical disease information, with the end goal of organizing the unstructured data into structured, spatiotemporal disease data, i.e., data that lets users see when and where a particular pathogen may be spreading.
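
To make the idea of turning unstructured text into structured, spatiotemporal records more concrete, here is a minimal sketch in Python. It is not BlueDot's method: the keyword tables, the `DiseaseSignal` record, the `extract_signal` function, and the sample headlines are all hypothetical and invented for illustration, and a real system would rely on trained NLP models rather than simple keyword matching.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical lookup tables. A production system would use trained NLP models
# and a full place-name gazetteer instead of hand-written keyword lists.
DISEASE_TERMS = {"pneumonia": "pneumonia", "fever": "fever", "cholera": "cholera"}
KNOWN_PLACES = {"wuhan": "Wuhan", "toronto": "Toronto", "bangkok": "Bangkok"}


@dataclass(frozen=True)
class DiseaseSignal:
    """One structured record: what was mentioned, where, and when."""
    disease: str
    place: str
    reported_on: date


def extract_signal(headline: str, published: date) -> Optional[DiseaseSignal]:
    """Return a record if the headline names a known disease and a known place."""
    # Lowercase each word and strip surrounding punctuation before matching.
    words = [w.strip(".,;:!?\"'()").lower() for w in headline.split()]
    disease = next((DISEASE_TERMS[w] for w in words if w in DISEASE_TERMS), None)
    place = next((KNOWN_PLACES[w] for w in words if w in KNOWN_PLACES), None)
    if disease is None or place is None:
        return None
    return DiseaseSignal(disease, place, published)


# Made-up headlines and dates, for illustration only.
headlines = [
    ("Cluster of unexplained pneumonia cases reported in Wuhan", date(2020, 1, 1)),
    ("Local team wins championship in Toronto", date(2020, 1, 1)),
]
signals = [s for text, day in headlines if (s := extract_signal(text, day)) is not None]
print(signals)
```

Only the first headline yields a record, because the second mentions a place but no disease term. Each record carries a place and a date, which is what makes the data spatiotemporal.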

ML was again leveraged to identify clusters of pneumonia cases in Wuhan, which were compared against historical patterns in order to label them as anomalies. Integrating real-time airline ticketing and flight data allowed BlueDot to accurately predict potential next destinations for viral spread. Through this multi-faceted approach, BlueDot was able not only to predict an impending outbreak, but also to warn its clients in a timely manner. Warning the company's client base alone was, of course, not enough to prevent the pandemic altogether, but it was a step in the right direction. Imagine this type of solution integrated on a larger scale: while increasing the reach and accessibility of such AI-powered tools may not halt a disease outbreak in its tracks, it would at least give users the information they need to make informed decisions in the wake of a pandemic.
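
The two ideas in the paragraph above, comparing current case counts against a historical baseline and using travel volume to guess where a pathogen might go next, can be sketched in a few lines of Python. This is a deliberately simple stand-in (a z-score threshold and a sort by passenger volume), not a description of BlueDot's models. The function names, the threshold of 3 standard deviations, the case counts, and the city names are all assumptions made up for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, current, threshold=3.0):
    """Flag `current` if it lies more than `threshold` sample standard
    deviations above the mean of the historical counts."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any rise above it counts as unusual.
        return current > baseline
    return (current - baseline) / spread > threshold


def rank_destinations(passengers_by_city, top_n=3):
    """Return the `top_n` cities receiving the most outbound passengers."""
    ranked = sorted(passengers_by_city, key=passengers_by_city.get, reverse=True)
    return ranked[:top_n]


# Made-up weekly pneumonia counts for one city (illustrative only).
historical_weekly_cases = [12, 15, 11, 14, 13, 12, 16, 14]
this_week = 41

# Made-up outbound passenger volumes from that city (illustrative only).
outbound_passengers = {"City A": 1200, "City B": 300, "City C": 2500, "City D": 800}

if is_anomalous(historical_weekly_cases, this_week):
    print("Unusual rise detected; likely next destinations:",
          rank_destinations(outbound_passengers))
```

In this toy data the current week sits far above the historical range, so the alert fires and the three busiest destinations are listed in order of passenger volume.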

Wastewater-Based Epidemiology: Tracking Outbreaks Below the Surface

Pathogens like SARS-CoV-2 are excreted in human waste. Wastewater-based epidemiology (WBE) leverages this fact by sampling sewage from treatment plants or sewer mains and combining those samples with ML to build models that better quantify the amount of viral genetic material circulating in a particular community, offering insight into how widely a pathogen is spreading. In a 2023 study published in Nature Communications, Li et al. demonstrated the utility of ML-based wastewater analysis. The researchers collected 20 months' worth of COVID-19 wastewater data from many US counties, covering a population of close to 100 million people, and used it to build a random forest model capable of predicting weekly COVID-related hospital admissions. Notably, the model anticipated increases in admissions one to four weeks before they happened, giving hospitals and public health officials time to prepare. The model's predictions were also highly accurate, with a mean absolute error of 4 to 6 new admissions for every 100,000 people in a county.
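
To show what an error figure expressed per 100,000 people means in practice, the short Python sketch below computes a mean absolute error (MAE) on admission rates. The county population and the predicted and observed admission counts are made-up numbers chosen for easy arithmetic, not data from the study, and the helper names are hypothetical.

```python
def admissions_per_100k(admissions, population):
    """Convert a raw weekly admission count into a rate per 100,000 residents."""
    return admissions * 100_000 / population


def mean_absolute_error(predicted, observed):
    """Average of the absolute differences between paired predictions and observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty series of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Made-up example: one county of 250,000 people over four weeks.
county_population = 250_000
predicted_admissions = [30, 45, 60, 40]
observed_admissions = [25, 55, 50, 45]

predicted_rates = [admissions_per_100k(a, county_population) for a in predicted_admissions]
observed_rates = [admissions_per_100k(a, county_population) for a in observed_admissions]

mae = mean_absolute_error(predicted_rates, observed_rates)
print(f"MAE: {mae:.1f} admissions per 100,000 people")
```

Normalizing by population before computing the error is what lets counties of very different sizes be compared on the same scale.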

Even as testing rates and case reporting declined and became less consistent, the model remained accurate, highlighting the value of wastewater-based models relative to predictive approaches that depend on clinical reporting. The accuracy and early-warning capability are both significant, as they allow the healthcare system to prepare before clinical cases surge, helping to prevent it from becoming overwhelmed (as was often the case during the pandemic). This study exemplifies how wastewater-based epidemiology may bridge the gap left by diminishing clinical surveillance, offering accurate and actionable foresight for protecting public health.
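
One common way to give a model an early-warning horizon is to train it on pairs in which this week's wastewater reading is matched with admissions from a later week. The Python sketch below shows that pairing step only. It is an assumption about how a lead time could be set up in general, not a reproduction of the study's pipeline, and the function name and all values are invented for illustration.

```python
def make_lead_time_pairs(wastewater, admissions, lead_weeks):
    """Pair the wastewater reading from week t with admissions from week t + lead_weeks."""
    if len(wastewater) != len(admissions):
        raise ValueError("both series must cover the same weeks")
    if lead_weeks < 1:
        raise ValueError("lead_weeks must be at least 1")
    usable_weeks = len(wastewater) - lead_weeks
    return [(wastewater[t], admissions[t + lead_weeks]) for t in range(usable_weeks)]


# Made-up weekly series (illustrative only): a relative viral load and admission counts.
weekly_viral_load = [1.0, 1.4, 2.1, 3.0, 2.6, 1.9]
weekly_admissions = [8, 9, 12, 17, 24, 21]

# Each pair is (viral load now, admissions two weeks later).
pairs = make_lead_time_pairs(weekly_viral_load, weekly_admissions, lead_weeks=2)
print(pairs)
```

A regression model, such as a random forest, could then be fitted on pairs like these, so that a new wastewater reading produces a forecast for a future week rather than the current one.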

Challenges of AI in Disease Outbreak Management

The integration of AI into epidemiology and the management of disease outbreaks offers tremendous potential, but it's also important to acknowledge the challenges it may bring. As discussed above, AI is adept at quickly and efficiently processing data from a diverse range of sources, including news articles, social media feeds, electronic health records, and even wastewater epidemiological data, all with the end goal of detecting the early warning signs of a pandemic before it can take hold.

This end goal, though ambitious, would enable public health officials to act swiftly, deploying the resources and interventions needed to limit the scale of an emerging epidemic. Despite the successes of BlueDot and the WBE researchers in predicting disease, challenges persist in applying predictive modeling to this area of medicine, particularly when it comes to global integration. Data types that depend on internet access (e.g., social media timelines, news reports) may be scarce in more remote or impoverished regions, which often have limited connectivity. Disease spread is often more rampant in such areas, which means they should be monitored more closely, but a lack of available data can make this difficult. Reliance on digital signals alone can inadvertently exclude and marginalize vulnerable populations who are not well represented in online data, raising equity concerns.

In such cases, it may be more advantageous to examine other, more tangible markers of disease, like wastewater data or clinical admission records, which may offer a broader and less biased view of community health irrespective of internet access. Addressing these gaps and limitations is crucial if we seek to integrate AI into public health on a truly global scale.

The Path to Building a Proactive and Equitable Future

Looking ahead, it's apparent that the fusion of artificial intelligence with traditional epidemiological tools holds immense promise for safeguarding global health. BlueDot and the WBE study both illustrate the utility of AI in delivering earlier warnings and sharper insights, allowing the public health sector to respond in a timely fashion. Realizing the full potential of such innovations, however, requires us to confront issues of equity and data access. By combining AI and analytics with on-the-ground surveillance and ethical stewardship, we can work toward a future in which outbreaks are not just responded to, but anticipated and effectively contained.
