INTRODUCTION
Lassa fever is a febrile illness endemic to West Africa. Severe infections are associated with bleeding, central nervous system manifestations, and renal failure (
1). The case fatality rate in the current hospital setting in Africa is around 30% (
2,
3). The causative agent—Lassa virus—is endemic in the West African countries of Guinea, Sierra Leone, Liberia, Mali, Côte d’Ivoire, and Nigeria (
4–6). Individual cases of Lassa fever have also been observed in Benin and Togo, though endemicity of the virus in these countries needs to be confirmed (
7).
Lassa virus is genetically diverse, and several lineages may be distinguished. Four lineages were initially described by Bowen et al. (
4) as follows: lineages I, II, and III circulating in Nigeria and lineage IV circulating in Guinea, Sierra Leone, and Liberia. Strains from Mali and Côte d’Ivoire were proposed to represent lineage V (
8). The main natural reservoir appears to be
Mastomys natalensis, although
Mastomys erythroleucus has recently been described as an alternative host for lineages III and IV (
9–13). The newly discovered Lassa virus strain Kako from a
Hylomyscus pamfi rodent that was trapped in Nigeria and the virus from the nosocomial outbreak in Togo may be considered lineages VI and VII, respectively (
7,
12). Preliminary evidence for a putative further lineage has been provided by two short virus sequences obtained from patients in the south of Nigeria (
14).
Since 2008, a laboratory for the molecular diagnostics of Lassa fever has been in operation at the Irrua Specialist Teaching Hospital (ISTH), Irrua, in Edo State, Nigeria (
2). As it is one of a few centers in West Africa providing this service, it has become a reference center for Lassa fever in the country. About 1,000 to 4,000 suspected cases are tested annually with an average confirmation rate of about 10%. While the vast majority of confirmed Lassa fever cases originated from Edo State, a significant number of specimens originated from patients attending hospitals in other parts of the country.
Previous studies have provided a comprehensive overview of the sequence variability of the viruses circulating in the catchment area of ISTH (
11). Complete sequence information for Lassa virus strains from other parts of the country is scarce (
4,
14). Therefore, this study aimed to describe the sequence variability of Lassa virus across Nigeria and infer the temporal and spatial evolution of the virus in the country. Virus strains from patients who were diagnosed with Lassa fever at ISTH and presumably acquired the infection outside of Edo State were sequenced using next-generation sequencing (NGS) technology, and the sequences were subjected to phylogeographic analysis.
RESULTS
Samples from PCR-confirmed Lassa fever patients who had attended hospitals in 16 states of Nigeria were selected for virus isolation and sequencing. Seventy-two Lassa virus strains were isolated in cell culture. Complete S and L segment coding sequences were obtained for 72 and 55 isolates, respectively. If virus isolation was not successful, the virus was sequenced directly from clinical material generating another 2 and 12 complete and partial S segment coding sequences, respectively. In the phylogeographic reconstruction, we included partial or complete sequences from this study (NCBI BioProject accession number
PRJNA482054), our 2018 sequencing project at ISTH (NCBI BioProject accession number
PRJNA482058) (
15), and GenBank, provided that the residence of the patient, the hospital where the patient had been treated, or, in the case of animals, the trapping site was on record at least at the level of the state in Nigeria. The final data set comprised sequences and metadata for 219 unique Lassa virus strains sampled between 1969 and 2018 in 22 Nigerian states (216 sequences for the S segment and 157 sequences for the L segment) (see Table S1 in the supplemental material).
Root-to-tip regression analysis confirmed the presence of a significant temporal signal in the sequences of both segments: R² = 0.41 for the L segment (P < 0.001) and R² = 0.21 for the S segment (P < 0.001). These results are in line with previous analyses (
11,
15). As a next step in the analysis, we performed a time-scaled phylogenetic reconstruction using BEAST. The majority of the sequences clustered with the main Nigerian lineages II and III (
Fig. 1). None of the new sequences clustered with lineage I, the prototype Lassa virus strain from the town of Lassa. However, several sequences formed a new monophyletic cluster that is well supported (posterior values of 1 in both the S and L trees) but shows an ambiguous relationship to lineages I, II, and III (different tree topologies and poor posterior values in the S and L trees) (see Fig. S1 and S2 in the supplemental material). This cluster comprises the following two subclusters: (i) the Kako strains detected in
Hylomyscus pamfi (
12) and (ii) a novel full-length S and L segment sequence (LASV/H.sapiens-tc/NGA/2016/IRR_006 reported in this study) from Ekiti State as well as the short (0.3 kb) S segment sequences previously identified in patients from Edo State (Nig05-A08 and Nig09-045) (
14) (clades a, b, and c in
Fig. 1 and
2; see also Fig. S1 and S2).
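The root-to-tip regression referred to above can be illustrated with a minimal sketch: the genetic distance of each tip from the root is regressed on its sampling date, and R² indicates how much of the variation in divergence is explained by sampling time. The function name and the toy values below are hypothetical and serve only as illustration; they are not the data or the software used in this study, and the sketch does not compute a P value.

```python
from statistics import mean


def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance on sampling date.

    dates: sampling dates in decimal years.
    distances: root-to-tip distances in substitutions per site.
    Returns (slope, intercept, r_squared). The slope is a crude
    estimate of the substitution rate per site per year.
    """
    if len(dates) != len(distances) or len(dates) < 3:
        raise ValueError("need at least three (date, distance) pairs")
    mx, my = mean(dates), mean(distances)
    sxx = sum((x - mx) ** 2 for x in dates)
    syy = sum((y - my) ** 2 for y in distances)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dates, distances))
    if sxx == 0 or syy == 0:
        raise ValueError("dates and distances must both vary")
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Invented toy values, not taken from the study.
    toy_dates = [1969.0, 2008.5, 2012.0, 2016.3, 2018.1]
    toy_distances = [0.310, 0.352, 0.349, 0.361, 0.358]
    slope, intercept, r2 = root_to_tip_regression(toy_dates, toy_distances)
    print(f"slope={slope:.2e} subs/site/year, R^2={r2:.2f}")
```

A positive slope together with a clearly nonzero R² is what is usually taken as evidence of temporal signal before a time-scaled reconstruction is attempted.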
Previous sequencing studies revealed that lineage III strains circulate in the north of Nigeria and lineage II strains are prevalent in the south (
4,
14). While this geographical pattern still holds true, the new sequences allow a much higher spatial resolution of the circulation of sublineages within the two main lineages. We applied the following set of criteria to define sublineages: they (i) form a monophyletic clade with high posterior support in the S and/or L tree, (ii) originate from deeper nodes, and (iii) have been sampled in a confined geographic area (see Fig. S3, S4, S5, and S6 in the supplemental material for classification of the sublineages in the trees and Fig. S7 and S8 in the supplemental material for the respective sampling maps). These criteria were established solely for the purpose of describing the currently available data; they are not meant as formal criteria for subclassification of lineages. Within lineage III, five sublineages (3a to 3e) were distinguishable (
Fig. 1 and
2). Sublineage 3a was found in Plateau (
n = 8), Kaduna (
n = 4), Bauchi, Federal Capital Territory, and Benue (states are listed according to the number of sequences;
n ≤ 2 is not indicated); 3b in Benue (
n = 4) and Nasarawa; 3c in Nasarawa (
n = 12), Federal Capital Territory (
n = 3), and Plateau; 3d in Taraba; and 3e in Bauchi (
n = 19) and Plateau (
n = 3). Within lineage II, seven sublineages were distinguishable (2a to 2g) (
Fig. 1 and
2). Sublineage 2a was found in Ebonyi (
n = 3) and Abia; 2b in Imo (
n = 3), Rivers (
n = 3), Enugu, and Akwa Ibom; 2c in Ebonyi (
n = 20) and Imo; 2d in Taraba (
n = 5), Anambra, and Adamawa; 2e in Delta; 2f in Kogi (
n = 3) and Anambra; and 2g in Edo (
n = 46), Ondo (
n = 41), Kogi (
n = 3), Delta, Anambra, and Ekiti. Owing to the large number of sequences in sublineage 2g, it has even been possible to distinguish minor phylogenetic clusters in specific localities (see Fig. S5 and S6): 2g1 along the Niger River in Delta and Anambra; 2g2 around Uromi town in Edo; 2g3 around Owo and Ifon towns in Ondo; and 2g4 along the Niger River in Kogi. However, the matching of phylogenetic clusters with specific locations has to be interpreted with caution, as most viruses stem from patients and the precise location where the infection was acquired is not known. Indeed, the phylogenetic origin of a few sequences (mostly incomplete sequences marked with asterisks in Fig. S3, S4, S5, and S6) was implausible in view of the recorded origin of the patient. These sequences were excluded from the above mapping.
The geographic clustering of lineages and sublineages led us to attempt to reconstruct the spatial and temporal evolution of Lassa virus in Nigeria. The phylogeographic analysis estimated a weighted dispersal velocity of about 1 km per year and a spread into an area of about 50 km² per year for both lineages II and III (
Table 1). The dispersal of lineages II and III may have started approximately 300 and 800 years ago, respectively, though the estimations slightly differ for the S and L segment (
Fig. 3). The snapshots of the spatial distribution of the virus over time as estimated using S and L segment-based phylogenies suggest an origin of lineage II in the southeastern part of the country around Ebonyi and a main vector of distribution toward the west across the Niger River, through Anambra, Kogi, Delta, and Edo into Ondo State (
Fig. 4). Sublineages 2a and 2c still circulate in Ebonyi, while the movement toward the west has led to the evolution of sublineage 2f in Anambra and Kogi, 2e in Delta, and 2g in Edo and Ondo State. The frontline of virus movement appears to be in Ondo. However, a single sequence in 2g is from Ekiti, north of Ondo, and might be a first indication of virus movement into this state. Minor vectors of distribution are directed northeast toward Taraba and Adamawa (sublineage 2d) and south toward Imo, Rivers, and Akwa Ibom (sublineage 2b).
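To make the notion of a weighted dispersal velocity concrete, the following minimal sketch computes it for a set of phylogenetic branches with inferred start and end coordinates. It assumes one common definition, namely the total great-circle distance covered by all branches divided by their total duration; the text above does not state the exact formula, so this definition, the function names, and the input layout are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def weighted_dispersal_velocity(branches):
    """Total distance over total time, in km per year.

    branches: iterable of ((lat, lon) start, (lat, lon) end, years).
    Assumed definition: sum of branch distances / sum of branch durations,
    so that long branches carry more weight than short ones.
    """
    total_km = 0.0
    total_years = 0.0
    for start, end, years in branches:
        total_km += haversine_km(start[0], start[1], end[0], end[1])
        total_years += years
    if total_years <= 0:
        raise ValueError("total branch duration must be positive")
    return total_km / total_years
```

Dividing the summed distances by the summed durations, rather than averaging per-branch velocities, keeps very short branches with noisy location estimates from dominating the result.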
The origin of lineage III might be in northern Plateau State (
Fig. 5). The model suggests a centrifugal spread of the virus into the neighboring states of Kaduna (sublineage 3a), Nasarawa/Federal Capital Territory (sublineage 3c), and Bauchi (sublineage 3e). While movement of these three sublineages apparently was limited to the territory north of the Niger and Benue rivers, sublineage 3b has moved south and crossed the Benue River into Benue State (
Fig. 2). This movement is well documented, as the rodents from which the sequences ONM-299, ONM-314, and ONM-700 were obtained had been trapped south of the river (
12). Sublineage 3d also seems to have crossed the Benue River and moved into Taraba State, although the evidence is weak because only a single sequence is available for this cluster (
Fig. 2).
DISCUSSION
A geographical pattern in the distribution of Lassa virus was disclosed by the first comprehensive sequencing study on Lassa virus published in 2000 (
4). It demonstrated a westward movement of the virus across West Africa and classified Lassa virus into lineages. Our study provides a geographic mapping of lineages and phylogenetic clusters in Nigeria at a higher resolution. In addition, we estimated the direction and time frame of virus dispersal in the country. Essentially, the data confirm the distribution of lineage III in the northern part of the country and lineage II in the southern part of Nigeria.
As the majority of human infections result from spillover of the virus from the natural reservoir rather than human-to-human transmission (
11,
15), the geographical pattern of Lassa virus is primarily a consequence of the temporal and spatial evolution of the virus in the rodent population. The observation that sublineages and clusters are confined to separated geographical areas and are not dispersed and mixed over the Nigerian territory is consistent with the limited home range of
M. natalensis (
16,
17). On the other hand,
M. natalensis populations also include mobile animals that may have driven virus dispersal on an evolutionary scale (
16,
18). Landscape structures that facilitate or restrict rodent movement might have contributed to shaping the current distribution of lineages and sublineages (
19). Specifically, the large Niger and Benue rivers may represent geographical barriers to the movement of rodents between the northern and southern parts of the country. However, previous ecological studies and our data from human cases indicate that infected rodents carrying lineage III viruses have crossed the Benue River from the north, at least into Benue State (
12). Lineage II, which appears to originate from the southeast, has clearly crossed the Niger River and moved into the southwestern part of the country. Our model estimated an average speed of virus dispersal of about 1 km per year. The estimates of 7.8 × 10⁻⁴ substitutions/(site × year) for the molecular clock and of about 800 years for the time to the most recent common ancestor of currently circulating strains in Nigeria are in agreement with previous studies (
11,
14). However, estimates for temporal and spatial dispersal have to be interpreted with increasing caution the deeper we look back in time. Our estimates are based on sampling over the past 50 years; however, recent phylogenetic studies that included sequences of ancient DNA and RNA viruses estimated substitution rates approximately 1 order of magnitude lower than rates inferred solely from modern sequences (
20–22). Thus, it is conceivable that the velocity of virus dispersal is slower than that estimated here. It will be interesting to follow the dispersal of the virus at the frontline of lineage II, which currently seems to be in Ondo State, and experimentally challenge the estimates.
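The reasoning that a lower long-term substitution rate implies slower dispersal can be followed with a small worked calculation. The sketch below assumes a strict molecular clock, under which divergence equals rate multiplied by time, and holds both the accumulated divergence and the distance covered fixed. This is a deliberate simplification for illustration, not the model used in the analysis, and the function names are hypothetical.

```python
def rescaled_time(rate, years, new_rate):
    """Time needed to accumulate the same divergence at a different rate.

    Assumes a strict clock: divergence (subs/site) = rate * years.
    """
    if rate <= 0 or new_rate <= 0 or years < 0:
        raise ValueError("rates must be positive and years non-negative")
    divergence = rate * years
    return divergence / new_rate


def rescaled_velocity(velocity, years, new_years):
    """Velocity over the same distance when the time span changes."""
    if new_years <= 0:
        raise ValueError("new_years must be positive")
    distance = velocity * years
    return distance / new_years


if __name__ == "__main__":
    rate = 7.8e-4      # substitutions/(site x year), estimate quoted in the text
    tmrca = 800.0      # years, approximate estimate quoted in the text
    slower = rate / 10  # hypothetical rate one order of magnitude lower
    new_tmrca = rescaled_time(rate, tmrca, slower)
    new_velocity = rescaled_velocity(1.0, tmrca, new_tmrca)
    print(f"rescaled time depth: {new_tmrca:.0f} years")
    print(f"rescaled velocity: {new_velocity:.2f} km/year")
```

Under these assumptions, a tenfold lower rate stretches the time depth tenfold and reduces the implied dispersal velocity by the same factor.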
A main limitation and confounding factor of our study is that it is based on human cases attending hospitals rather than systematic sampling in the rodent population. Thus, there might be a sampling bias owing to the risk of rodent-to-human transmission in an area, the distance between villages with Lassa fever and the nearest secondary or tertiary hospital, the awareness of Lassa fever among doctors in the local hospitals, the availability of diagnostic and treatment facilities, and the knowledge about the available options to test for Lassa fever at national reference laboratories. In addition, the mobility of patients may lead to a substantial distance between the site of infection and the site of presentation to a doctor, which often was the only available spatial information. Mix-ups, mislabeling, or contamination of samples in the hospital or laboratory cannot be excluded either and might have confounded the analysis. Phylogeographic inference remains conditioned on the sampling, and heterogeneous sampling density can result in more or less notable differences between the actual viral spread and the inferred dispersal history of lineages (
23). The phylogeographic inference performed here aimed at reconstructing the ancestral history of viral lineages in a continuous space but is not necessarily a detailed picture of the dispersal history of the entire viral population, even if sampling reflected the true density. Influencing factors such as landscape structures or adaptation to new hosts during evolution have not been considered in this model.
Besides the known lineages I, II, and III, we found a well-supported monophyletic cluster of sequences that does not fall within these lineages. This clade comprises the strains from a
H. pamfi rodent trapped in Kako, southwestern Nigeria (
12), as well as human sequences from Ekiti State (this study) and Edo State (
14). Unfortunately, the sequences from Edo State are too short for an in-depth analysis of the relationship between the latter sequences. Formal analysis on whether the Kako and Ekiti sequences represent one or even two new Lassa virus lineages is pending. The relationship of Lassa virus in
M. erythroleucus and
H. pamfi with lineage III and the new cluster, respectively, raises the question of whether specific lineages or sublineages of Lassa virus are associated with specific host species. Although
M. natalensis is considered the principal host of Lassa virus, this association has been mainly confirmed for lineages II and IV and may not hold true for other lineages or sublineages (
9–11,
13). Besides the rare outlier strains from patients from Ekiti (
n = 1) and Edo (
n = 2), lineage I has been observed only once during a nosocomial outbreak in 1969 in the town of Lassa in the north of the country and never again. It is conceivable that the natural hosts of these rare strains or lineages are not commensal rodents living close to humans, such as
M. natalensis, but wild rodents, which usually do not have contact with humans, such as
H. pamfi. If host switching were involved in the evolution of Lassa virus lineages, this would increase the complexity and timelines of the spatiotemporal evolution of the virus.
This project aimed not only at generating Lassa virus sequences from across Nigeria but also at isolating and conserving the respective viruses for future research. This work was performed by a Nigerian fellow (D. U. Ehichioya) in a biosafety level 4 facility in Hamburg. Besides their use in studies on virus evolution, the sequences and isolates are relevant to public health research. For example, they have been important for the development and evaluation of the RealStar Lassa virus reverse transcriptase PCR (RT-PCR) kits (altona Diagnostics, Hamburg, Germany), which are now a diagnostic reference assay in Nigeria. In addition, they served as background information for recent sequencing studies in Nigeria, demonstrating that the high incidences of Lassa fever cases in 2018 and 2019 are not associated with the circulation of newly emerging strains or lineages or increased human-to-human transmission (
15) (
http://virological.org/t/2019-lassa-virus-sequencing-in-nigeria-final-field-report-75-samples/291).
ACKNOWLEDGMENTS
This work was supported by a grant from the German Research Foundation (DFG) (GU 883/4-1) to S. Günther, D.U. Ehichioya, and D.A. Asogun and the European Union’s Horizon 2020 research and innovation program (grant agreement number 653316; EVAg). D.U. Ehichioya received fellowships from the “Dr. Ing. Wilhelm und Maria Kirmser-Stiftung” and the Alexander-von-Humboldt Foundation. S. Dellicour was supported by the Fonds National de la Recherche Scientifique (FNRS, Belgium) and the Fonds Wetenschappelijk Onderzoek (FWO, Belgium). P. Lemey acknowledges support by the Research Foundation—Flanders (Fonds voor Wetenschappelijk Onderzoek—Vlaanderen; G0D5117N), the Wellcome Trust through project number 206298/Z/17/Z, and the European Research Council under the European Union’s Horizon 2020 research and innovation program (grant agreement number 725422; ReservoirDOCS).
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
We thank Jürgen Sievertsen, Birgit Muntau, and Christa Ehmen for assistance with Illumina sequencing, Balázs Horváth for bioinformatics analysis of NGS data, and Stephan Lorenzen for helpful discussions.
We declare no conflict of interest.