TESTING FOR LATENT TUBERCULOSIS INFECTION
The goal of testing for LTBI is to identify individuals who are at increased risk for the development of active TB; these individuals would benefit most from treatment of LTBI (also termed preventive therapy or prophylaxis). Thus, only those who would benefit from treatment should be tested; a decision to test should presuppose a decision to treat if the test is positive (
6).
In general, testing for LTBI is indicated when the risk of development of disease from latent infection (if present) is increased; examples include likely recent infection (e.g., close contact of a person with TB) or a decreased capacity to contain latent infection (e.g., because of immunosuppression, as in the case of young children in contact with those with active TB, people living with human immunodeficiency virus [HIV] infection, or otherwise immunosuppressed persons because of medications or conditions such as uncontrolled diabetes). In contrast, screening for LTBI in persons or groups who are healthy and have a low risk of progressing to active disease is not appropriate, since the positive predictive value of LTBI testing is low and the risks of treatment can outweigh the potential benefits (
4). The balance of risk and benefit is also different in high-burden settings, where the risk of reinfection may be high and screening for LTBI will have a low negative predictive value. For children, the risk-to-benefit ratio is more favorable than for adults.
There is no diagnostic gold standard for LTBI, and all existing tests are indirect approaches which provide immunological evidence of host sensitization to TB antigens (
5). There are two accepted but imperfect tests for identification of LTBI: the tuberculin skin test (TST) and the gamma interferon (IFN-γ) release assay (IGRA). Both tests depend on cell-mediated immunity (memory T-cell response), and neither test can accurately distinguish between LTBI and active TB disease (
7,
8).
TUBERCULIN SKIN TESTING: OVERVIEW AND LIMITATIONS
The TST, performed using the Mantoux technique (
9), consists of the intradermal injection of 5 tuberculin units (TU) of purified protein derivative (PPD) standard (PPD-S) or 2 TU of PPD RT23 (these are considered equivalent [
6]). In a person who has cell-mediated immunity to these tuberculin antigens, a delayed-type hypersensitivity reaction will occur within 48 to 72 h. The reaction will cause localized induration of the skin at the injection site, and the transverse diameter should be measured (as millimeters of induration) by a trained individual and interpreted using risk-stratified cutoffs (
5). It is important to note that cell-mediated immunity to tuberculin antigens can sometimes reflect exposure to similar antigens from environmental mycobacteria or
Mycobacterium bovis bacillus Calmette-Guérin (BCG) vaccination or a previous infection that has been cleared (through immunological mechanisms or treatment).
In interpreting a positive TST, it is important to consider much more than only the size of the induration (
10). Rather, the TST should be considered according to three dimensions: size of induration (for the current test as well as in relation to the induration on a previous test, if done), pretest probability of infection, and risk of disease if the person were truly infected (
10). Menzies and colleagues developed a simple, Web-based, interactive algorithm—the Online TST/IGRA Interpreter (version 3.0;
www.tstin3d.com)—that incorporates all these dimensions (
10) and also computes the risk of serious adverse events due to treatment.
The TST has several known limitations. False-positive and false-negative results can occur. There are two important causes of false-positive results: nontuberculous mycobacterium (NTM) infection and prior BCG vaccination (
11). NTMs are not a clinically important cause of false-positive TST results, except in populations with a high prevalence of NTM sensitization and a very low prevalence of TB infection (
11). The impact of BCG on TST specificity depends on when BCG is given and on how many doses are administered (
11). If BCG is administered at birth (or during infancy) and not repeated, then its impact on TST specificity is minimal and can be ignored while interpreting the results. In contrast, if BCG is given after infancy (e.g., school entry) and/or given multiple times (i.e., booster shots), then TST specificity is compromised (
11).
The BCG World Atlas (
www.bcgatlas.org) provides detailed information on BCG policies and practices in many countries (
12). While most developing countries have a policy of a single BCG vaccine administered at birth, some countries (
Fig. 2) give the vaccine later in life and also give booster shots.
False-negative TST results may occur because of limited sensitivity in particular patient subgroups (e.g., immunosuppressed individuals [due to medical conditions such as HIV infection or malnutrition] or those taking immunosuppressive medications) or because of preanalytical or analytical sources of test variability (e.g., improper tuberculin handling or placement or incorrect interpretation of test results) (
6). Unfortunately, individuals for whom the TST has limited sensitivity are often the very individuals who are at increased risk of progression to active disease if infected. Anergy induced by active TB itself can also cause false-negative TST results (
6).
The TST is also known to have problems with reproducibility, with inter- and intrareader variability in measurements of induration (
13). Nonspecific variability is expected, and interpretation of repeat testing is complicated by immunologic recall of preexisting hypersensitivity to TB (i.e., boosting), conversions (i.e., new infection), and reversions (of positive results to negative) (
13). Cutoffs used for TST conversions are different from the cutoffs used for diagnosis of LTBI (
5).
Measurement of the long-term ability of a positive TST to predict development of active TB is difficult, requiring prolonged follow-up of unselected populations. Based on historical studies, there is a modest positive association between tuberculin reactivity and the risk of active TB (
14). However, a majority of individuals with positive TST results do not progress to active disease. As a result, many TST-positive individuals need to be treated in order to prevent one disease event (
4). Thus, targeted testing of high-risk groups is the common practice.
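The arithmetic behind this point can be sketched as follows. This is a minimal illustration rather than a published estimate: the function name and the input values (a 5% or 20% cumulative risk of progression and a 60% treatment efficacy) are assumptions chosen only to show how the number needed to treat falls as the underlying risk rises.

```python
def number_needed_to_treat(risk_untreated: float, treatment_efficacy: float) -> float:
    """Number of test-positive people treated to prevent one case of active TB.

    risk_untreated: cumulative probability of progression without treatment.
    treatment_efficacy: proportional reduction in that risk with treatment.
    Both inputs are hypothetical here, not values from the cited studies.
    """
    if not 0 < risk_untreated <= 1 or not 0 < treatment_efficacy <= 1:
        raise ValueError("inputs must lie in (0, 1]")
    absolute_risk_reduction = risk_untreated * treatment_efficacy
    return 1.0 / absolute_risk_reduction


# Illustrative inputs only: 5% risk of progression, 60% efficacy.
nnt_low_risk = number_needed_to_treat(0.05, 0.60)
# An assumed higher-risk group (20% risk) with the same assumed efficacy.
nnt_high_risk = number_needed_to_treat(0.20, 0.60)
print(round(nnt_low_risk, 1), round(nnt_high_risk, 1))
```

Under these assumed inputs, fewer people need to be treated per case prevented in the higher-risk group, which is the rationale for targeted testing.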
IGRA: ASSAY PRINCIPLES
IGRAs are
in vitro blood tests of cell-mediated immune response; they measure T-cell release of IFN-γ following stimulation by antigens specific to the
M. tuberculosis complex (with the exception of BCG substrains), i.e., early secreted antigenic target 6 (ESAT-6) and culture filtrate protein 10 (CFP-10). These antigens are encoded by genes located within the region of difference 1 (RD1) locus of the
M. tuberculosis genome (
15,
16). They are more specific than PPD for
M. tuberculosis because they are not encoded in the genomes of any BCG vaccine strains or most species of NTM, other than
M. marinum,
M. kansasii,
M. szulgai, and
M. flavescens (
17). However, not all NTMs have been studied for cross-reactivity. There is some evidence of cross-reactivity between ESAT-6 and CFP-10 of
M. tuberculosis and
M. leprae (
18,
19), but the clinical significance of this in settings where leprosy and TB are endemic (e.g., India and Brazil) is poorly characterized.
Two commercial IGRAs are available in many countries: the QuantiFERON-TB Gold In-Tube (QFT) assay (Cellestis/Qiagen, Carnegie, Australia) and the T-SPOT.TB assay (Oxford Immunotec, Abingdon, United Kingdom). Both tests are approved by the U.S. Food and Drug Administration (FDA) and Health Canada and are CE (Conformité Européenne) marked for use in Europe.
The QFT assay is an enzyme-linked immunosorbent assay (ELISA)-based, whole-blood test that uses peptides from the RD1 antigens ESAT-6 and CFP-10 as well as peptides from one additional antigen (TB7.7 [Rv2654c], which is not an RD1 antigen) in an in-tube format. The result is reported as quantification of IFN-γ in international units (IU) per milliliter. An individual is considered positive for M. tuberculosis infection if the IFN-γ response to TB antigens is above the test cutoff (after subtracting the background IFN-γ response of the negative control).
The T-SPOT.TB assay is an enzyme-linked immunosorbent spot (ELISPOT) assay performed on separated and counted peripheral blood mononuclear cells (PBMCs) that are incubated with ESAT-6 and CFP-10 peptides. The result is reported as the number of IFN-γ-producing T cells (spot-forming cells). An individual is considered positive for M. tuberculosis infection if the spot counts in the TB antigen wells exceed a specific threshold relative to the negative-control wells. Indeterminate IGRA results can occur due to a low IFN-γ response to the positive (mitogen) control or a high background response to the negative control.
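The decision logic described above for the QFT assay can be sketched in code. This is a simplified illustration, not the manufacturer's algorithm: the function and class names are hypothetical, and the numeric thresholds are assumed defaults that should be checked against the current package insert before any use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QftThresholds:
    # Assumed illustrative values (IU/ml); verify against the package insert.
    tb_cutoff: float = 0.35           # minimum TB antigen minus nil
    tb_fraction_of_nil: float = 0.25  # TB response must also exceed this share of nil
    min_mitogen: float = 0.5          # minimum mitogen minus nil
    max_nil: float = 8.0              # maximum acceptable background


def interpret_qft(nil: float, tb_antigen: float, mitogen: float,
                  t: QftThresholds = QftThresholds()) -> str:
    """Classify one QFT result as positive, negative, or indeterminate."""
    tb_response = tb_antigen - nil        # subtract background (negative control)
    mitogen_response = mitogen - nil
    if nil > t.max_nil:
        return "indeterminate"            # high background response
    if tb_response >= t.tb_cutoff and tb_response >= t.tb_fraction_of_nil * nil:
        return "positive"
    if mitogen_response < t.min_mitogen:
        return "indeterminate"            # low positive-control response
    return "negative"


print(interpret_qft(nil=0.10, tb_antigen=1.00, mitogen=5.00))
print(interpret_qft(nil=0.10, tb_antigen=0.20, mitogen=0.30))
```

Keeping the thresholds in a separate object makes it explicit that they are inputs to the interpretation rather than fixed properties of the assay.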
TEST CHARACTERISTICS: SENSITIVITY AND SPECIFICITY FOR LTBI
Since there is no gold standard for LTBI, sensitivity and specificity are typically estimated using surrogate reference standards. Sensitivity is estimated among culture-confirmed TB cases, while specificity is estimated among low-risk individuals with no known TB exposure in low-incidence settings (
20).
Based on published meta-analyses (
7,
8,
21), IGRAs have a specificity for LTBI diagnosis of >95% in settings with a low TB incidence, and specificity is not affected by BCG vaccination. TST specificity is similarly high in populations not vaccinated with BCG (97%). Among populations where BCG is administered, the specificity is much lower (approximately 60%) and variable, depending on when and how often BCG is given. The sensitivity for the T-SPOT.TB assay appears to be higher than that for the QFT assay or TST (approximately 90%, 80%, and 80%, respectively). IGRA sensitivity is diminished by HIV infection and in children (see later discussion) (
22,
23).
Because IGRAs are not affected by BCG vaccination status, IGRAs are useful for evaluation of LTBI in BCG-vaccinated individuals, particularly in countries where BCG vaccination is administered after infancy or multiple (booster) BCG vaccinations are given (
12,
24). In such countries (
Fig. 2), the TST is unlikely to have high specificity.
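The effect of reduced specificity on the positive predictive value can be illustrated with Bayes' rule. The sketch below uses the approximate TST sensitivity (80%) and specificities (97% and 60%) quoted above; the 5% prevalence of infection is an assumption chosen for illustration, and the sensitivity is itself a surrogate estimate derived from patients with active TB.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied to a population with a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


prevalence = 0.05  # assumed prevalence of infection, for illustration only

# TST in a population not vaccinated with BCG (specificity about 97%).
ppv_unvaccinated, _ = predictive_values(0.80, 0.97, prevalence)
# TST where BCG is given after infancy or repeatedly (specificity about 60%).
ppv_bcg, _ = predictive_values(0.80, 0.60, prevalence)

print(f"PPV without BCG: {ppv_unvaccinated:.2f}, with BCG: {ppv_bcg:.2f}")
```

At this assumed prevalence, most positive TST results in the lower-specificity setting would be false positives, which is why a more specific test is attractive in such populations.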
Although this is based on limited evidence, IGRAs appear to be unaffected by most infections with NTMs which can cause false-positive TSTs (
17). However, infection with
M. marinum or
M. kansasii, which express ESAT-6 or CFP-10, has been shown to produce positive results in IGRAs, as with the TST (
25,
26).
TEST CHARACTERISTICS: REPRODUCIBILITY
While IGRAs have improved specificity over that of TST, concerns have been raised about issues with reproducibility of the test in settings where repeat testing is necessary (
27,
28). By nature, functional T-cell assays are highly susceptible to variability by numerous factors at multiple levels, including assay manufacturing, preanalytical processing, analytical testing, and immunomodulation. Therefore, reproducibility is an important consideration (
29) that makes it challenging to use a single cutoff value to distinguish between positive and negative results with one-time testing and to define conversion and reversion in individuals undergoing serial testing.
A systematic review on IGRA reproducibility in 2009, based on a small number of studies, showed that variability was substantial, with magnitudes of within-subject IFN-γ responses varying by up to 80% (
28). Since then, more research has emerged, providing a better understanding of the sources of variability in IGRAs. A list of potential sources of IGRA variability and their impacts is shown in
Table 1.
Figure 3 graphically illustrates the sources of variation, with the QFT assay as an example. Although each source can have a positive or negative effect on the assay response, the “total variability” is the net sum of the individual effects.
Variability Due to Manufacturing Issues
Like all diagnostic tests, IGRAs may be susceptible to manufacturing quality issues, with some lots or reagents affected by issues such as temperature during shipping. This was described for the QFT assay by Slater and colleagues, who investigated a sudden increase in the rate of positive QFT results, from 10% to 31%, at an academic institution in the United States (
30). The reason for the sudden increase in the false-positive rate during this incident could not be identified, although a similar issue, attributed to contamination of a specific lot of tubes, led to its withdrawal from the market by the manufacturer in 2012 (
31). By monitoring positivity and indeterminate rates, clinical laboratories can rapidly detect and halt utilization of potentially faulty lots, alert the manufacturer to investigate, and prevent reporting of inaccurate test results.
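One simple way a laboratory might monitor positivity rates is a control-limit check against its historical baseline. The following is a minimal sketch under stated assumptions: the function name, the z-score threshold of 3, and the counts (31 positives among 100 tests) are hypothetical; only the 10% and 31% rates come from the incident described above.

```python
import math


def positivity_alert(n_positive: int, n_tested: int, baseline_rate: float,
                     z_threshold: float = 3.0):
    """Flag a batch whose positivity rate is well above the historical baseline.

    Uses a normal approximation to the binomial; z_threshold is an assumed
    control limit, not a recommended standard.
    """
    observed = n_positive / n_tested
    standard_error = math.sqrt(baseline_rate * (1 - baseline_rate) / n_tested)
    z = (observed - baseline_rate) / standard_error
    return observed, z, z > z_threshold


# Hypothetical batch: 31 of 100 results positive against a 10% baseline.
observed, z, alert = positivity_alert(31, 100, baseline_rate=0.10)
print(f"observed={observed:.2f}, z={z:.1f}, alert={alert}")
```

The same check could be applied to indeterminate rates, with a separate baseline for each.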
Preanalytical Sources of Variability
There are several preanalytical sources of variability, and together they likely represent a large component of “total variability.” Among the potential sources listed in
Table 1, delay between blood collection and incubation of cells at 37°C has been studied extensively. The manufacturer of the QFT assay allows a 0- to 16-h range of delay before tubes can be incubated. However, Doberne and colleagues showed a significant decline in TB response with a delay in incubation within the recommended range (
32). Compared to immediate incubation, 6- and 12-h delays resulted in positive-to-negative reversion rates of 19% (5/26 samples) and 22% (5/23 samples), respectively, for individuals with a high risk for LTBI (
32). Individuals with reversion had a lower TB response, closer to the assay cutoff, than individuals whose results remained positive with incubation delay.
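Because these reversion rates rest on small samples, it helps to look at the uncertainty around them. The sketch below computes a Wilson score interval for the two reported proportions (5/26 and 5/23). The choice of the Wilson method is ours for illustration and is not necessarily the approach used in the cited study.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Return (proportion, lower, upper) using the Wilson score interval."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, centre - half_width, centre + half_width


# Reversions after a 6-h delay (5 of 26) and a 12-h delay (5 of 23).
rate_6h, low_6h, high_6h = wilson_interval(5, 26)
rate_12h, low_12h, high_12h = wilson_interval(5, 23)

print(f"6 h:  {rate_6h:.0%} ({low_6h:.0%} to {high_6h:.0%})")
print(f"12 h: {rate_12h:.0%} ({low_12h:.0%} to {high_12h:.0%})")
```

The intervals are wide, so the two delays cannot be distinguished from each other on these data alone, although both differ from the zero reversions expected with no delay.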
Incubation delay also has a negative effect on test results through reducing the mitogen response in the QFT assay and increasing the rate of indeterminate results (
29,
32,
33). Other preanalytical variables shown to impact QFT results include blood volume and tube shaking. Gaur and colleagues showed an inverse relationship between blood volume in the TB antigen tube, within the recommended range, and IFN-γ response (
34). Compared to 0.8 ml blood, 1.0 and 1.2 ml blood resulted in significant declines in TB-specific IFN-γ responses in the infected subjects, and 1.2 ml resulted in a significant decrease in the proportion of positive results. Vigorous shaking also caused a significant increase in IFN-γ response in the nil and TB antigen tubes and caused a significant elevation in TB response when vigorously shaken TB antigen tubes were paired with gently shaken nil tubes (
34). In the same study, duration of incubation within the recommended range was not shown to be a source of variability in the infected group (
34), whereas a similar study found that 24 h of incubation led to a higher TB-specific IFN-γ response than 16 h of incubation (
35). Variation in the timing of blood collection (evening versus morning) may also introduce variability into test results, but the mechanism is unclear (
36). While the reproducibility of the QFT assay is reasonably well studied, many of the above considerations also apply to the T-SPOT.TB assay.
Analytical Sources of Variability
The analytical sources of variability refer to fluctuations in measurements due to random errors caused by interference of uncontrolled factors in biological fluids (matrix effects), imprecision of pipetting, manipulation errors in centrifugation, decantation, and washing, and the imprecision of measurement of the final signal. Unlike preanalytical sources of variability, which are mostly systematic and therefore predictable, the analytical sources are mostly random and persist despite extensive efforts to improve the analytical reproducibility.
Indeed, studies such as that of Metcalfe and colleagues have shown considerable within-run and between-run variability in the quantitative results, producing discordant results when TB responses are close to the assay cutoff (
37). Whitworth and colleagues showed variability in QFT results from the same subjects when ELISAs were performed in different laboratories (
38). Analytical error originating from interreader variability has also been investigated, though it appears to be a problem primarily with the T-SPOT.TB assay, not the QFT assay (
39).
Immunological Sources of Variability
The two immunological sources of variability described to date include immune boosting and immunomodulation. Van Zyl-Smit and colleagues showed that a significant increase in TB response occurs when QFT and T-SPOT.TB testing is performed more than 3 days after PPD placement, through immunological recall of preexisting memory T cells to TB antigens (
40). Similar findings have been reported in other studies (
41–44), although it is not clear how long the IGRA boosting persists and whether the PPD formulation and amount used in TST contribute to boosting. The underlying mechanism of TST-induced boosting of IGRA responses is thought to be an anamnestic response of preexisting memory T cells to RD1 antigens, which are contained within PPD (
45,
46). In contrast, a previous IGRA will not boost a subsequent IGRA result, as the test itself is performed
ex vivo.
Another source of immunological variability is caused by immunomodulation through conserved microbial products known as pathogen-associated molecular patterns (PAMPs), such as lipopolysaccharide and peptidoglycan (
47). PAMPs are recognized by the innate immune cells via several families of pathogen recognition receptors (PRRs), of which the Toll-like receptor (TLR) family is best characterized. Activation of PRRs triggers intracellular signaling pathways culminating in the expression of inflammatory mediators which stimulate the maturation of antigen-presenting cells and initiation of adaptive immune responses, such as the development and proliferation of antigen-specific effector T-cell subsets (
47).
Gaur and colleagues showed that
in vitro immunomodulation in the QFT assay may occur with Toll-like receptor agonists, including at low concentrations, and that this may enhance antigen-specific IFN-γ responses in individuals with presumed LTBI (
48). PAMPs in an IGRA may, for example, be derived from endogenous microbiota (which can be influenced by diet, antibiotics, and personal hygiene) or from exogenous contaminants (mostly from skin during blood draw) and may account for a fraction of the reported within-subject variability.
Overall, IGRA results can be affected by many sources of variation, not all of which are understood at present. While systematic sources of variability can be eliminated or minimized through standardization by the assay manufacturers and users of the test, random sources of variability are unavoidable and must be accounted for when interpreting results. Once total variability in IGRA responses is determined, appropriate cutoffs and borderline zones can be derived for interpreting serial testing results in light of a patient's TB risk factors and local laboratory practices (
49).
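The idea of a borderline zone for serial testing can be sketched as follows. This is an illustration only: the function names and the zone bounds (0.20 and 0.70 IU/ml) are hypothetical placeholders, since appropriate bounds would need to be derived from the measured total variability in a given setting.

```python
def classify_with_borderline(tb_response: float,
                             lower: float = 0.20, upper: float = 0.70) -> str:
    """Classify a TB-specific IFN-gamma response (IU/ml) using a borderline zone.

    The bounds are assumed values for illustration, not validated cutoffs.
    """
    if tb_response < lower:
        return "negative"
    if tb_response <= upper:
        return "borderline"
    return "positive"


def serial_change(previous: float, current: float,
                  lower: float = 0.20, upper: float = 0.70) -> str:
    """Label the change between two serial results."""
    before = classify_with_borderline(previous, lower, upper)
    after = classify_with_borderline(current, lower, upper)
    if before == "negative" and after == "positive":
        return "conversion"
    if before == "positive" and after == "negative":
        return "reversion"
    if "borderline" in (before, after):
        return "uncertain"  # change may reflect assay variability alone
    return "no change"


print(serial_change(0.05, 1.50))
print(serial_change(0.30, 0.40))
```

Under this scheme, a conversion or reversion is called only when a result crosses the whole zone, so small fluctuations around a single cutoff are not over-interpreted.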
PREDICTIVE VALUE FOR PROGRESSION TO TB DISEASE
Diel and colleagues assessed the positive and negative predictive values of the commercial IGRAs relative to those of the TST for the future development of active TB in untreated individuals (
105). Their review suggested that the positive and negative predictive values of commercial IGRAs might be higher than those of the TST, in particular among high-risk populations. A limitation, however, was that the analytic approach did not take the different durations of follow-up into consideration, and therefore, the estimated predictive values were not adjusted for the number of person-years of follow-up.
In a meta-analysis by Rangaka and colleagues (
106), the prognostic ability of the IGRAs was summarized in the form of incidence rates and risk ratios for the longitudinal studies included in the review. Fifteen studies with a combined sample size of 26,680 participants were included in this analysis (
107–121). The incidence of active TB during a median follow-up of 3 years was 2 to 24 per 1,000 person-years for IGRA-negative individuals (
Fig. 5). For IGRA-positive individuals, the TB incidence was 4 to 48 cases per 1,000 person-years (
106), suggesting that a majority of IGRA-positive individuals did not progress to TB disease during follow-up. This is similar to the historic data on TST (
14).
Compared with negative test results, IGRA-positive and TST-positive results were much the same with regard to risk of TB development (the pooled incidence rate ratio [IRR] in the five studies that used both was 2.11 [95% CI, 1.29 to 3.46] for IGRA versus 1.60 [95% CI, 0.94 to 2.72] for TST at the 10-mm cutoff). However, the proportion of IGRA-positive individuals in 7 of 11 studies that assessed both IGRAs and TST was generally lower than that of TST-positive individuals (
106). The authors concluded that neither IGRAs nor the TST have high accuracy for the prediction of active TB, although the use of IGRAs in some populations might reduce the number of people considered for preventive treatment (
106).
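For readers less familiar with these measures, the sketch below shows how an incidence rate per 1,000 person-years and an incidence rate ratio with an approximate 95% confidence interval are computed for a single cohort. The counts are hypothetical, and this single-study calculation on the log scale is not the pooled estimate reported in the meta-analysis.

```python
import math


def incidence_rate_ratio(cases_pos: int, person_years_pos: float,
                         cases_neg: int, person_years_neg: float,
                         z: float = 1.96):
    """Return rates per 1,000 person-years, the IRR, and an approximate 95% CI."""
    rate_pos = cases_pos / person_years_pos
    rate_neg = cases_neg / person_years_neg
    irr = rate_pos / rate_neg
    # Standard error of log(IRR) for two independent Poisson counts.
    se_log_irr = math.sqrt(1 / cases_pos + 1 / cases_neg)
    lower = math.exp(math.log(irr) - z * se_log_irr)
    upper = math.exp(math.log(irr) + z * se_log_irr)
    return rate_pos * 1000, rate_neg * 1000, irr, lower, upper


# Hypothetical cohort: 20 cases in 1,000 person-years among test-positive
# individuals and 30 cases in 3,000 person-years among test-negative individuals.
rate_pos, rate_neg, irr, lower, upper = incidence_rate_ratio(20, 1000, 30, 3000)
print(f"{rate_pos:.0f} vs {rate_neg:.0f} per 1,000 person-years; "
      f"IRR {irr:.2f} (95% CI, {lower:.2f} to {upper:.2f})")
```

Note that even with a rate ratio of 2 in this hypothetical cohort, the absolute rate among test-positive individuals remains low, which mirrors the point made above.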
Since the publication of the aforementioned review, five new longitudinal studies have been published (
122–126).
Table 3 presents the characteristics of all 20 longitudinal studies. Among IGRA-positive individuals, incidence rates ranged from 3.7 to 84.5 per 1,000 person-years of follow-up, while they ranged from 2.0 to 32.0 per 1,000 person-years for IGRA-negative individuals. The highest incidence rates, among both IGRA-positive and IGRA-negative individuals, were found in studies that followed immunocompromised subjects, such as HIV-infected mothers, HIV-exposed infants, or men with silicosis.
Three studies specifically assessed the prognostic value of the IGRAs in an exclusively HIV-positive cohort (
110,
118,
125). Two studies without possible incorporation bias (where IGRAs were not used to make a final diagnosis of active TB) and differential work-up bias (where IGRA-positive individuals were not investigated more intensively for active TB than IGRA-negative individuals) (
118,
125) found risk ratios of 2.69 (95% CI, 0.69 to 10.52) and 3.32 (95% CI, 1.09 to 10.08), respectively, meaning that individuals with a positive IGRA result had around a 3-fold-increased risk of progression to TB disease during the follow-up period of the study compared to individuals with a negative IGRA result. Although the rate of disease progression after a presumed TB infection is increased in HIV-infected individuals, there are currently no data that suggest that the predictive value of the IGRAs is better or worse in this subpopulation than in others.
While most longitudinal studies have assessed the predictive value of a single, cross-sectional IGRA result, only a single study has evaluated the predictive value of an IGRA conversion (
127). This study found that recent QFT conversion was indicative of an approximately 8-fold higher risk of progression to TB disease (compared to nonconverters) within 2 years of conversion in a cohort of adolescents in South Africa. However, even among QFT converters, the overall risk of TB disease was low (1.46 cases per 100 person-years) (
127). Although evidence is limited, this study does suggest that an IGRA conversion (which may indicate recent infection) may be more predictive than a single positive IGRA result.
Overall, the currently available data show that the predictive value of IGRAs for progression to TB disease is low and slightly but not significantly higher than that of the TST. The data suggest that a majority (>95%) of those with positive IGRA or TST results do not progress to TB disease during follow-up.
Why do existing LTBI tests have poor predictive value for active TB? There may be several reasons. First of all, the overall risk of progression from LTBI to active TB—in the absence of recent infection or severe immunosuppression—is low (<5% lifetime risk in healthy populations); thus, even a perfectly accurate test for LTBI would have a low predictive value for progression to active TB. Second, while IGRAs (and TST) are generally evaluated according to their ability to predict future active TB, their true aim is to identify individuals who would benefit from preventive therapy. Since future active TB is a combination of both reinfection events (arguably not amenable to preventive therapy) and reactivation events, and since LTBI may confer some protective immunity against repeat infection (
128), the ability of IGRAs to predict future active TB may misrepresent their ability to identify those who would benefit from preventive therapy. Third, IGRAs are immune-mediated tests, and the same immune system is responsible for yielding a positive IGRA result as well as preventing progression to active TB disease; as such, individuals with false-negative IGRAs may be the very individuals (e.g., highly immunosuppressed) at greatest risk of reactivation. Fourth, the sensitivity and specificity of IGRAs are imperfect and dependent on only a few antigens, and antigens expressed by
M. tuberculosis during latency may not be those expressed during active replication (
2,
129).
As a consequence of all the above factors, the IFN-γ response, although important, is probably insufficient to resolve the various phases of the latent TB “spectrum” as illustrated in the framework proposed by Barry and colleagues (reproduced in
Fig. 1) (
2). Among the stages shown in the figure, both TST and IGRAs are likely to be positive in all stages, with the possible exception of the innate immune response stage (i.e., exposed to TB but negative on both tests) (
3,
88).
For all these reasons, both TST and IGRAs are generally unable to select out the phenotypes that are most likely to benefit from LTBI treatment (
88,
130). This is underscored by the observed low rates of progression to disease even in IGRA- and TST-positive individuals (
106). A more predictive LTBI test or strategy will greatly help to target only those who will benefit from LTBI treatment.
COST-EFFECTIVENESS
A systematic review of cost-effectiveness analyses (CEA) was conducted by Nienhaus and colleagues (
131). Cost and cost differences between studies were not fully investigated, as the authors did not adjust or inflate to a common currency to permit comparisons. The study conclusions regarding cost-effectiveness were, however, compared for 7 available CEA studies. The authors concluded that in 6/7 studies, IGRA (as a dual-step strategy following TST or IGRA only) was reported as more cost-effective than TST only. However, the authors also state that comparison of the studies was hampered by several methodologic problems, including differences in assumed costs, test parameters, strategies modeled, and outcomes evaluated. They concluded that until some of these issues are addressed, recommendations regarding the cost-effectiveness of IGRAs should be interpreted with caution (
131).
Oxlade and colleagues also systematically reviewed the CEA literature (
132). They too reported substantial variability in the choice of test characteristics, parameters, and cost estimates used in models. When the IGRA and TST strategies were compared by using a common decision analysis model created by Oxlade and colleagues, predicted costs and effectiveness largely overlapped, emphasizing the difficulty in drawing conclusions about the cost-effectiveness of IGRAs (
132). Both systematic reviews ended with recommendations for conducting cost-effectiveness analyses on IGRAs that should improve economic studies to evaluate diagnostic strategies for LTBI and increase their value for informing individual and public health decisions (
131,
132).
CONCLUSIONS
Both TST and IGRAs are acceptable but imperfect LTBI tests, with advantages and disadvantages (
Table 4). IGRAs offer some improvements over the TST, but the improvement, as noted by others, is incremental rather than transformational (
136). There are situations where neither test is appropriate (e.g., active TB diagnosis in adults) and situations where both tests may be necessary to detect
M. tuberculosis infection (e.g., immunocompromised populations), and there are situations where one test may be preferable to another. For example, IGRAs may be preferable to the TST in populations where BCG is given after infancy or given multiple times. In contrast, TST may be preferable to the IGRAs for serial testing of health care workers. Both TST and IGRAs have reproducibility challenges, and dichotomous cutoffs are inadequate for interpretation.
The primary goal of IGRAs is to identify those who will benefit from LTBI therapy. Unfortunately, IGRAs (and TST) are limited in this regard, for reasons including the low absolute risk of progression to disease, inability to distinguish reactivation from reinfection, reduced accuracy in immunocompromised patients, and inability to discriminate the various stages within the spectrum of LTBI (
2,
88). To maximize the positive predictive value of existing LTBI tests, LTBI screening should be reserved only for those who are at sufficiently high risk of progressing to disease. Such high-risk individuals may be identifiable by using multivariable risk prediction models that incorporate risk factors (e.g., the Online TST/IGRA Interpreter [
www.tstin3d.com]) (
10) and by using serial testing to resolve underlying phenotypes. In the longer term, highly predictive biomarkers need to be identified. This is an active area of research (
93,
129), and future generations of LTBI tests should overcome the limitations of current assays.