Malaria transmission can be affected by multiple or even hidden factors, making it difficult to predict, in a timely and accurate manner, the impact of elimination and eradication programs that have been undertaken and the potential resurgence and spread that may continue to emerge. One current approach is to develop and deploy surveillance systems that identify such resurgence and spread as early as possible, thereby enabling policy makers to modify and implement strategies for further preventing transmission. Most surveillance data are temporal and spatial in nature. From an interdisciplinary point of view, this raises an important and challenging question: based on the available surveillance data in temporal and spatial forms, how can we build a more effective surveillance mechanism for monitoring, and detecting early, the relative prevalence and transmission patterns of malaria? Existing clustering-based surveillance software systems do not infer the underlying transmission networks of malaria. However, such networks can be quite informative and insightful, as they characterize how malaria transmits from one place to another. They can in turn allow public health policy makers and researchers to uncover hidden and interacting factors, such as environment, genetics, and ecology, and to discover or predict malaria transmission patterns and trends. The network perspective further extends the present approaches to modelling malaria transmission based on a set of chosen factors. In this article, we survey the related work on transmission network inference, discuss how such an approach can be utilized in developing an effective computational means for inferring malaria transmission networks based on partial surveillance data, and outline what methodological steps and issues may be involved in its formulation and validation.
Technically, the problem of computationally inferring malaria transmission networks is both interesting and challenging because, during the process of disease spread, the reported infection cases reflect neither the full extent of transmission nor the underlying transmission patterns. It would therefore be desirable to detect such networks from the partially available surveillance data. In doing so, we may incorporate a malaria infection model, e.g., the Ross-Macdonald model.
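To make the role of such an infection model concrete, the following minimal sketch integrates the classical two-equation form of the Ross-Macdonald model, in which x and y denote the infected proportions of humans and mosquitoes. It is an illustration under stated assumptions, not the formulation used by any particular inference method: the parameter names (m, a, b, c, r, g), the function names, and the forward-Euler integrator are all choices made here for exposition.

```python
def ross_macdonald_r0(m, a, b, c, r, g):
    """Basic reproduction number of the simple Ross-Macdonald model.

    m: mosquitoes per human, a: bites per mosquito per day,
    b: mosquito-to-human transmission probability per bite,
    c: human-to-mosquito transmission probability per bite,
    r: human recovery rate (per day), g: mosquito death rate (per day).
    """
    return m * a * a * b * c / (r * g)


def simulate_ross_macdonald(m, a, b, c, r, g,
                            x0=0.01, y0=0.0, days=2000.0, dt=0.05):
    """Forward-Euler integration (an assumed, deliberately simple scheme) of

        dx/dt = m*a*b*y*(1 - x) - r*x   (infected proportion of humans)
        dy/dt = a*c*x*(1 - y) - g*y     (infected proportion of mosquitoes)

    Returns the final (x, y) after the given number of days.
    """
    x, y = x0, y0
    for _ in range(int(days / dt)):
        dx = m * a * b * y * (1.0 - x) - r * x
        dy = a * c * x * (1.0 - y) - g * y
        x += dt * dx
        y += dt * dy
    return x, y


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    params = dict(m=10.0, a=0.3, b=0.5, c=0.5, r=0.01, g=0.1)
    print("R0 =", ross_macdonald_r0(**params))
    print("final (x, y) =", simulate_ross_macdonald(**params))
```

In this simple form, transmission persists when the basic reproduction number exceeds one and dies out otherwise. A network inference method could, in principle, use such a model to describe within-location dynamics while treating between-location links as the unknowns.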
In this article, we discuss how such a computational method differs from existing methods of network inference, in light of the unique nature of malaria transmission dynamics. In computer science, related studies have been carried out to solve the problem of inferring information diffusion networks from Web data[7-10]. These studies consider only temporal information and cannot readily be extended to incorporate additional information, such as spatial, environmental, climatic, and clinical information. Also, most of the methods are based on independent cascade models, which assume that a node is infected independently by each of its neighbours with a respective probability, and cannot readily integrate more complicated infection/propagation models.
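The independent cascade assumption mentioned above can be sketched as a forward simulation. The example below is a minimal illustration, not the procedure of any cited study: the function name, the edge-probability dictionary, and the toy graph are hypothetical. Each newly infected node gets a single, independent chance to infect each of its uninfected out-neighbours, and the recorded infection times are the kind of temporal observations from which diffusion-network inference methods attempt to recover the links.

```python
import random


def independent_cascade(edge_probs, seeds, rng):
    """Simulate one independent cascade.

    edge_probs: dict mapping a directed edge (u, v) to the probability
                that u, once infected, infects v.
    seeds:      iterable of initially infected nodes (time 0).
    rng:        a random.Random instance, passed in for reproducibility.

    Returns a dict mapping each infected node to its infection time step.
    """
    # Build an adjacency list of outgoing edges.
    out_edges = {}
    for (u, v), p in edge_probs.items():
        out_edges.setdefault(u, []).append((v, p))

    infected_at = {s: 0 for s in seeds}
    frontier = list(infected_at)
    step = 0
    while frontier:
        step += 1
        newly_infected = []
        for u in frontier:
            # Each newly infected node tries each out-neighbour exactly once.
            for v, p in out_edges.get(u, []):
                if v not in infected_at and rng.random() < p:
                    infected_at[v] = step
                    newly_infected.append(v)
        frontier = newly_infected
    return infected_at


if __name__ == "__main__":
    # Hypothetical four-location network with assumed probabilities.
    edges = {("A", "B"): 0.8, ("A", "C"): 0.3, ("B", "D"): 0.5, ("C", "D"): 0.5}
    print(independent_cascade(edges, ["A"], random.Random(0)))
```

Only infection times are produced here; spatial, environmental, or clinical covariates have no place in this model, which is the limitation noted above.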
We searched and reviewed the related research papers in (1) bibliographic databases including Web of Science and PubMed, (2) international conferences including ACM-SIGKDD, ICDM, ICML, SIAM-SDM, WWW, etc., and (3) World Health Organization (WHO) reports. The aim of our survey is to find and study (1) existing methods for modelling disease infections and epidemic-like transmission processes based on structural representations such as transmission paths or networks, and furthermore (2) those for inferring the underlying transmission networks based on temporal and/or spatial surveillance data.
We first examined existing studies on modelling temporal-spatial patterns of epidemic dynamics. We started by evaluating the scan statistics-based clustering methods and their related software tools for modelling malaria transmission (which are also useful for detecting active foci or hotspots over time and space). In doing so, we aimed to identify the need for a more explicitly structured representation of disease spread, e.g., the interrelationships among different locations due to the heterogeneous temporal-spatial factors affecting hosts, vectors, and parasites at various scales. Such a representation would be particularly desirable in planning cost-effective intervention strategies. We then surveyed the related studies in both epidemiology and other disciplines that had demonstrated how disease spread and/or information dynamics could be revealed based on network representations, i.e., disease spread dynamics on networks. In doing so, we focused on how dynamics may vary with respect to the characteristics of networks, e.g., regular, small-world, or scale-free networks, as well as human behaviour, e.g., mobility.
Once we had confirmed the role of networks in understanding disease spread dynamics, our next logical step was to investigate how existing studies had attempted to predict the structures of underlying transmission networks, whether indirectly, e.g., through information on human mobility and social contact activities, or directly, e.g., based on observed/reported cases of infection. We paid special attention to methods that had been implemented for the purpose of inferring an underlying transmission network (links) from its observed (node) activities. The surveyed work, although some of it has appeared in fields other than epidemiology, provides a good understanding of the general methodology for detecting interaction networks from observations over time and space.
Malaria transmission can be affected by multiple factors, such as biology, environment, and socio-economy, that directly impinge on the interactions among hosts, vectors, and parasites to varying degrees and at varying scales[17-19]. A feasible means to model malaria transmission is to rely on passive case reporting surveillance systems or surveys through local, national, or regional public health and medical organizations. Most of the data collected from such systems or surveys may contain temporal, spatial, clinical, and/or demographic information, and may cover only arbitrary locations and age-groups. At the moment, temporal-spatial scan statistics-based clustering techniques have been applied to the analysis and characterization of temporal-spatial patterns of malaria. In doing so, software tools[12-15] have been used to manage and geographically map reported malaria cases, and to test whether the cases are randomly distributed or form statistically significant spatial or space-time disease clusters. Sometimes, for a more accurate malaria map, additional information may be incorporated, e.g., by means of modelling entomological inoculation rates, vectorial capacity, or force of infection, or by using a combination of epidemiological, geographical, and demographic data. Generally speaking, such temporal-spatial techniques do not infer the underlying transmission patterns (modelled as networks) of malaria. Such transmission networks can be informative and insightful, as they characterize how malaria is diffused or transmitted from one location to another (e.g., across villages) over time, providing a new way of detecting the active foci or hotspots of malaria transmission other than directly estimating epidemiological factors or identifying clusters with above-average transmission intensity.
Therefore, by integrating both temporal-spatial clusters of cases of infection and temporal-spatial transmission networks of malaria, existing local, national, or regional surveillance systems can be further enhanced in their capacity to predict and analyze the impact of malaria transmission and its underlying factors, as well as to evaluate existing intervention or eradication strategies and guide new control efforts.
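For readers unfamiliar with the scan-statistic clustering discussed above, the following sketch shows the core computation in one widely used variant, a Poisson log-likelihood ratio evaluated over candidate zones. It is a simplified illustration under stated assumptions: the surveyed software tools are not named in the text, so nothing here should be read as their exact procedure; the village names, counts, and candidate zones are hypothetical; and the Monte Carlo significance test that normally accompanies such a scan is omitted.

```python
import math


def poisson_scan_llr(cases_in, expected_in, total_cases):
    """Poisson log-likelihood ratio for one candidate zone.

    Returns 0.0 unless the zone has more cases than expected, so that
    only excess-risk zones can score. Assumes expected_in > 0.
    """
    if cases_in <= expected_in:
        return 0.0
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:  # treat 0 * log(0) as 0
        llr += cases_out * math.log(cases_out / expected_out)
    return llr


def most_likely_cluster(cases, population, zones):
    """Return (zone name, LLR) of the highest-scoring candidate zone.

    cases, population: dicts keyed by location.
    zones: dict mapping a zone name to the list of locations it covers.
    Expected counts assume risk proportional to population (an assumption).
    """
    total_cases = sum(cases.values())
    total_pop = sum(population.values())
    best_zone, best_llr = None, 0.0
    for name, members in zones.items():
        c = sum(cases[v] for v in members)
        e = total_cases * sum(population[v] for v in members) / total_pop
        llr = poisson_scan_llr(c, e, total_cases)
        if llr > best_llr:
            best_zone, best_llr = name, llr
    return best_zone, best_llr


if __name__ == "__main__":
    # Hypothetical villages with equal populations and unequal case counts.
    cases = {"A": 30, "B": 5, "C": 5}
    population = {"A": 1000, "B": 1000, "C": 1000}
    zones = {"A": ["A"], "B": ["B"], "A+B": ["A", "B"], "all": ["A", "B", "C"]}
    print(most_likely_cluster(cases, population, zones))
```

The output of such a scan is a set of high-intensity zones; it says nothing about which zone seeded infections in which other zone, which is the gap that transmission network inference is intended to fill.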
Next, we survey related studies on modelling the dynamics of epidemics on networks. Although the subjects involved in different epidemics can vary considerably, many can be modelled by SIR/SIS models[68-70], cascading models, or threshold models[71,72]. The assumption behind the basic SIR model is that an individual in a population will be in one of three states: susceptible (S), infected (I), or recovered (R); in the SIS model there is no recovered state, and infected individuals return to the susceptible state. If individuals are viewed as nodes, and the contacts between them as links, a network can be obtained that describes who will infect whom with what probabilities based on the SIR or SIS model. Grassberger first studied the dynamics of infectious diseases that propagate on regular networks using percolation theory. Studies have revealed that many real-world networks, including social networks in which infectious diseases propagate[26,27], are either small-world or scale-free[24,25], rather than regular or random, as thought previously.
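The node-and-link view of the SIR model described above can be illustrated with a short discrete-time stochastic simulation. This is a minimal sketch rather than a reproduction of any cited model: the function name, the per-step infection probability beta, the per-step recovery probability gamma, and the synchronous update rule are all assumptions made for illustration.

```python
import random


def sir_on_network(neighbors, initially_infected, beta, gamma, rng,
                   max_steps=1000):
    """Discrete-time stochastic SIR process on a contact network.

    neighbors: dict mapping each node to an iterable of its neighbours.
    beta:      probability that an infected node infects a given
               susceptible neighbour in one time step.
    gamma:     probability that an infected node recovers in one time step.
    rng:       a random.Random instance, passed in for reproducibility.

    Returns a list of state dictionaries (node -> "S", "I" or "R"),
    one per time step, starting with the initial state.
    """
    state = {node: "S" for node in neighbors}
    for node in initially_infected:
        state[node] = "I"
    history = [dict(state)]

    for _ in range(max_steps):
        infected = [n for n, s in state.items() if s == "I"]
        if not infected:
            break  # the epidemic has died out
        next_state = dict(state)
        for u in infected:
            # Infection attempts use the state at the start of the step.
            for v in neighbors[u]:
                if state[v] == "S" and rng.random() < beta:
                    next_state[v] = "I"
            if rng.random() < gamma:
                next_state[u] = "R"
        state = next_state
        history.append(dict(state))
    return history


if __name__ == "__main__":
    # Hypothetical ring of six individuals, one initially infected.
    ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
    final = sir_on_network(ring, [0], beta=0.5, gamma=0.3,
                           rng=random.Random(42))[-1]
    print(final)
```

Replacing the ring with a small-world or scale-free contact structure changes only the `neighbors` dictionary, which makes this kind of sketch a convenient way to explore how network topology shapes an outbreak.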
As the underlying structure of a network influences the dynamics of epidemics that unfold on it, researchers such as Pastor-Satorras and Vespignani have made several contributions to the analysis of critical values for typical epidemics on different types of complex networks[28-33]. Based on mean-field theory, they found that, compared with homogeneous networks, scale-free networks are far more vulnerable to the invasion of infectious diseases, computer viruses, or any other type of epidemic. In addition, researchers have also considered different spatial representations in modelling directly transmitted infectious diseases and the effects of human mobility on the dynamics of disease transmission on networks.
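One commonly quoted form of this mean-field result, given here as background rather than taken from the text above, is that for an SIS process on an uncorrelated network the critical spreading rate is approximately the mean degree divided by the mean squared degree, whereas for a homogeneous network it reduces to the reciprocal of the mean degree. The sketch below computes both quantities for a degree sequence; the function names and the example degree sequences are hypothetical.

```python
def homogeneous_threshold(degrees):
    """Mean-field SIS threshold for a homogeneous network: 1 / <k>."""
    mean_k = sum(degrees) / len(degrees)
    return 1.0 / mean_k


def heterogeneous_threshold(degrees):
    """Heterogeneous mean-field SIS threshold: <k> / <k^2>.

    Assumes an uncorrelated network described only by its degree sequence.
    """
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    return mean_k / mean_k2


if __name__ == "__main__":
    # Hypothetical degree sequences with similar mean degree.
    regular = [4] * 100            # every node has four contacts
    with_hub = [2] * 99 + [200]    # one highly connected hub
    print("regular :", heterogeneous_threshold(regular))
    print("with hub:", heterogeneous_threshold(with_hub))
```

Because the mean squared degree grows with degree heterogeneity, a few highly connected hubs can push the threshold towards zero, which is one way to read the vulnerability of scale-free networks noted above.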
Epidemics on networks have also been studied in various disciplines. Sociologists are concerned with the diffusion of rumours or innovation on social networks[34-36]; economists have studied viral marketing and recommendation strategies by considering cascading dynamics as well as the network effects of vital nodes[37-39]; computer scientists are interested in how some topics can quickly cascade in virtual blog spaces, and their propagation trends[10,40,42,43].