Carol M. Preissner, Dennis J. O’Kane, Ravinder J. Singh, John C. Morris, Stefan K. G. Grebe, Phantoms in the Assay Tube: Heterophile Antibody Interferences in Serum Thyroglobulin Assays, The Journal of Clinical Endocrinology & Metabolism, Volume 88, Issue 7, 1 July 2003, Pages 3069–3074, https://doi.org/10.1210/jc.2003-030122
Serum thyroglobulin (Tg) measurement is a major means of detecting thyroid cancer recurrence. Unlike anti-Tg autoantibody interferences, heterophile antibody (HAB) immunoassay interferences are not well recognized by laboratorians or clinicians as a Tg assay problem. When HAB interferences occur, they usually result in false positive test results. With the current trend to treat some thyroid cancer patients with radioiodine on the basis of an elevated serum Tg result alone, this has the potential to result in unwarranted therapy.
We evaluated the prevalence of HAB interference in a commonly used automated immunoassay in 1106 consecutive specimens with Tg values greater than 1 ng/ml. All Tg measurements were repeated after sample incubation in heterophile-blocking tubes (HBT). Results whose percentage difference from the original result deviated by more than 3 sd from the mean percentage difference were considered to suffer from HAB interference. All possible interferences were confirmed by dilution testing.
After HBT treatment, Tg levels dropped to less than 1 ng/ml in 32 specimens (P < 0.0000001), 20 of which fell to less than 0.1 ng/ml (P < 0.00002). Of these 20, 17 were anti-Tg autoantibody negative, and all 32 showed a fall of greater than 3 sd percentage (>56.91%) compared with the original result. There were also two samples that showed a significant increase of greater than 56.91% after HBT treatment.
HAB interference is relatively prevalent (1.5–3%) in a commonly used automated Tg assay and can lead to clinically significant artifacts. It is currently unknown, but possible, that other immunometric Tg assays suffer from similar problems. Unless a Tg assay is confirmed to be free of HAB interference or uses additional blocking steps, as ours now does, HAB interference should be suspected if Tg results do not fit the clinical picture.
THE FOLLOW-UP of thyroid cancer patients who have undergone near total or total thyroidectomy has traditionally rested on two major pillars: periodic clinical assessment and diagnostic radioiodine scanning. Over the last 10–20 yr these approaches have been increasingly supplemented, if not eclipsed, by the use of regular serum thyroglobulin (Tg) measurements (1, 2). There continues to be a vigorous debate about the exact sensitivity and specificity of different serum Tg assays, both in comparison with the former gold standard, diagnostic radioiodine scanning, and with regard to measurements during T4 therapy vs. testing after T4 withdrawal or recombinant human TSH stimulation (1, 2). Despite these controversies, Tg testing has become one of the modern endocrinologist’s most important tools in thyroid carcinoma follow-up.
With increasing experience in the use of Tg assays, it has become apparent that even very low levels of detectable Tg might signify disease recurrence in athyrotic patients (2, 3). In response to these findings, manual competitive immunoassays have been gradually replaced by successive generations of immunometric assays, allowing reproducible detection of serum Tg concentrations down to 0.1 ng/ml. In addition, some of the most recent immunometric assays have been designed with the goal of minimizing interferences by anti-Tg autoantibodies (4, 5). Although it remains doubtful whether this latest generation of Tg assays will indeed improve result reliability in patients with anti-Tg autoantibodies, their automated nature and improved sensitivity, precision, and linearity have led to their widespread adoption throughout the United States.
We replaced our previous immunometric assay with one of the latest generation of automated immunometric assays in August 2001. This allowed us to improve our diagnostic sensitivity from 0.5 to 0.1 ng/ml while at the same time also significantly improving assay linearity. With regard to clinical validation, the assay also performed admirably with a sensitivity of 83% and a specificity of 95.5% for the detection of persistent or recurrent disease (6). However, recently we came across a case where serum Tg had previously been undetectable, but was elevated with our current assay, and anti-Tg autoantibodies were absent. It was subsequently reported to us that the patient was given a therapeutic dose of radioiodine based on this result, but no metastatic deposits were seen on posttherapy scanning. As a consequence, we considered the possibility that the elevated Tg result might have been due to heterophile antibody (HAB) interference.
HAB are antibodies that can bind to animal antigens. In immunometric assays they can form a bridge between capture and detection antibody, leading to a false positive result in the absence of analyte or, if analyte is also present, to a false elevation in measured levels. Rarely, HAB can also lead to false negative or false low results (Fig. 1). Modern immunometric assays contain blocking reagents that are supposed to prevent these problems, but there are very few studies to support these claims (7, 8), and none of these has examined Tg assays. In the face of the clinical trend to sometimes treat thyroid cancer solely on the basis of elevated Tg (9–12), the uncertainty about HAB interference prevalence in modern Tg assays is disturbing, particularly as only a few clinically practicing physicians are aware of these potential problems. We therefore decided to examine the prevalence of significant HAB interferences in our current assay systematically by evaluating a large cohort of samples for HAB interferences.
Materials and Methods
Samples
Our laboratory performs 16,000–18,000 serum Tg assays/yr. Of these, approximately 10% are performed on Mayo Clinic Rochester patients, and 90% are performed on samples referred to us by outside clients. We market the assay primarily as a tumor marker. Although we do not know whether the usage patterns by outside clients conform to this recommendation, 50–60% of the intraclinic Tg assays are indeed performed in thyroid cancer patients.
We included all samples in our study that had been referred for Tg assays to our laboratory by Mayo Clinic physicians or outside customers during mid-December 2002 through to the first week of January 2003. After routine Tg measurement, we selected those samples with Tg levels equal to or greater than 1 ng/ml and tested them immediately for possible HAB interference as described below. The cut-off of 1 ng/ml was chosen for several reasons. First, the majority of samples analyzed in our laboratory fall above this threshold. Second, most clinicians would consider this the lowest decision making level in thyroid cancer follow-up. Finally, HAB interferences are known to cause false elevated/positive results in the majority of cases and are only rarely responsible for false negative or false low results (7, 8).
For all samples from Mayo Clinic Rochester patients we determined the clinical indication for testing from their medical records.
Identification of HAB interferences
For all study samples we repeated the initial Tg measurements after incubating 500 μl of each serum sample in heterophile-blocking tubes (HBT; Scantibodies, Santee, CA) at room temperature for 1 h. These tubes contain a proprietary mix of lyophilized mouse antihuman immunoglobulin M (IgM) with high affinity for human antianimal antibodies and are regarded as an effective means of blocking HAB interferences (7, 8). The heterogeneous nature of HAB interferences makes it impossible to arrive at conclusive figures for the sensitivity and specificity of different blocking regimens. However, specific blockers, i.e. IgG or IgM directed against human antianimal immunoglobulins, such as immunoglobulin inhibiting reagent (IIR) and heterophile blocking reagent (HBR)/HBT (used in our study), are generally regarded as superior to nonspecific blockers (general mixtures of animal Igs). For example, the most commonly used nonspecific blocker, MAK33, may have very little blocking efficiency unless it is heat-treated (13). By contrast, IIR and HBR/HBT, as summarized in two recent comprehensive review articles (7, 8), have been shown to block 75–100% of heterophile interferences.
We compared the results before and after HBT treatment, noting the number of samples in which HBT treatment resulted in a fall in Tg levels to less than 1 ng/ml and the number of samples in which Tg fell to less than 0.1 ng/ml. For each sample we also calculated the difference between the original Tg value and the measurement obtained after HBT treatment, expressed as a percentage of the original result. We plotted the distribution of the difference percentages and considered every sample showing a difference percentage more than 3 sd from the mean difference percentage as possibly affected by heterophile interference. The ±3 sd cut-off was chosen because statistically only 0.2% of measurements would be expected to fall outside this range, making it a highly specific cut-off not likely to be affected by random experimental error. We then reassayed all the samples falling into this category, with and without HBT treatment, and diluted the samples 1:2 and, if sufficient serum was available, 1:4 or 1:5. All of our assays are validated to show linear dilution to at least a 1:8 dilution. Linear is defined as a recovery of 80–120% of the expected value after dilution. Any sample that showed recoveries outside this range after dilution was deemed to exhibit nonlinear dilution. Confirmed original-HBT differences together with nonlinear dilution were considered confirmatory evidence of interference. We also subjected 20 randomly selected samples showing less than 10% difference between the original Tg result and the post-HBT treatment result to similar dilution series testing.
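The screening rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, and the sign convention (post-HBT result minus original result, as a percentage of the original) is assumed because the text does not state it.

```python
from statistics import mean, stdev


def percent_differences(original, post_hbt):
    """Change after HBT treatment as a percentage of the original result.

    Assumed sign convention: (post - original) / original.
    """
    return [100.0 * (post - orig) / orig for orig, post in zip(original, post_hbt)]


def flag_possible_interference(original, post_hbt, n_sd=3.0):
    """Return indices of samples whose percentage difference lies more than
    n_sd standard deviations from the mean percentage difference."""
    diffs = percent_differences(original, post_hbt)
    centre = mean(diffs)
    spread = stdev(diffs)
    return [i for i, d in enumerate(diffs) if abs(d - centre) > n_sd * spread]
```

Note that the mean and sd are computed from the whole cohort, including the outliers themselves, so the rule only works when the cohort is large relative to the number of affected samples, as it was here.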
Tg assay
All Tg levels were measured on the Access or Access 2 immunoassay system using the manufacturer’s (Beckman-Coulter, Fullerton, CA) standard reagent packs and procedures. The assay’s limit of detection (2.5 sd above background noise) is 0.012 ng/ml, whereas our reporting limit of sensitivity (manufacturer’s recommendation) is 0.1 ng/ml. The upper limit for undiluted samples is 480 ng/ml. All samples exceeding this were diluted and reassayed. The between-assay coefficients of variation (CV) are 6.9% for low control pools (mean, 0.073 ng/ml), 5.55% for intermediate control pools (mean, 39.2 ng/ml), and 7.05% for high control pools (mean, 167.24 ng/ml). Dilution linearity (to dilutions of at least 1:8) ranged from 90–100% for anti-Tg autoantibody negative samples to 80–100% for anti-Tg-positive samples. Separate validation for HBT-treated samples showed comparable linearity values, while the average CV across the entire analytical range of paired original and HBT-treated Tg assays was 8.3%, very close to the regular interassay CV of the Tg assay listed above.
Results
A total of 1106 samples fulfilled the inclusion criteria. This represented 55% of all serum Tg measurements performed during the study period. The remainder of samples had Tg values less than 1 ng/ml. Of the 1106 samples 918 (83%) were anti-Tg autoantibody negative, and 188 (17%) were anti-Tg autoantibody positive. Eighty Tg assays were performed on Mayo Clinic Rochester patients, 48 for thyroid cancer follow-up and 32 for a total of 17 other reasons, ranging from thyroiditis/hypothyroidism (13 samples) to metastatic carcinoma with unknown primary (1 sample). No data are available for the case-mix among the non-Mayo Clinic samples.
The Tg values of the original samples ranged from 1–15,600 ng/ml, with a mean of 123.3 ng/ml, a median of 14 ng/ml, and a mode of 11 ng/ml. This was not significantly different from the corresponding values after HBT treatment, which resulted in a Tg measurement range of less than 0.1 to 14,021 ng/ml, with a mean of 127.3 ng/ml, a median of 15 ng/ml, and a mode of 13 ng/ml. However, although all of the original samples had a Tg value of more than 1 ng/ml, 32 samples had Tg values of less than 1 ng/ml after HBT treatment (Yates corrected χ2 = 36.47; P < 0.0000001), and of these 20 had values below 0.1 ng/ml (Yates corrected χ2 = 18.21; P < 0.00002).
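As a hedged illustration of the Yates-corrected χ2 comparison, the sketch below treats the data as a 2×2 table with rows for "before HBT" and "after HBT" and columns for "below threshold" and "at or above threshold". That layout is an assumption; the text does not say how the authors constructed the test. The function names are hypothetical.

```python
from math import erfc, sqrt


def yates_chi2(a, b, c, d):
    """Yates continuity-corrected chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))


# Assumed layout: before HBT, 0 of 1106 samples were below 0.1 ng/ml;
# after HBT, 20 of 1106 were below 0.1 ng/ml.
chi2_below_0_1 = yates_chi2(0, 1106, 20, 1086)
print(round(chi2_below_0_1, 2))  # 18.21
```

Under this layout the 20-sample comparison reproduces the reported χ2 of 18.21, and `p_value_1df` gives a value just under 0.00002. The same call with (0, 1106, 32, 1074) gives about 30.47 rather than the reported 36.47, so the authors' table for that comparison may have been set up differently.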
Figure 2 depicts the distribution of differences between the original Tg measurements and the repeats after HBT treatment, all expressed as a percentage of the original value. The difference percentages followed a near-normal distribution, centered near a 0% change and with about equal numbers of samples showing a decrease and an increase in Tg values after HBT treatment, as would be expected for any repeat measurement. This indicates that HBT treatment has no significant effect on samples that do not contain blockable interferences.
The mean difference between original Tg values and Tg values after HBT treatment was −1.45% (median, 0%; sd, 18.97%). Two samples showed percentage increases of more than 3 sd percentages (>56.91% in excess of mean difference percentage) in Tg values after HBT treatment. Both samples were anti-Tg autoantibody negative. The pre-HBT treatment serum Tg values for these samples were 10 and 374 ng/ml, rising to 16 and 995 ng/ml, respectively. In 32 samples the drop in Tg values after HBT treatment exceeded 3 sd percentages (>56.91% less than mean difference percentage), with the mean percentage drop being 87.1% (median, 95.1%; range, 60–99.7%). Twenty-eight of these samples were anti-Tg autoantibody negative, and 4 were anti-Tg autoantibody positive. In 26 cases, of which 23 were anti-Tg negative, serum Tg levels dropped to less than 1 ng/ml after HBT treatment, with 20 cases (17 anti-Tg negative) dropping to less than 0.1 ng/ml. As shown in Fig. 3, HBT treatment did not introduce a significant systematic bias, nor was there any evidence that HBT treatment selectively reduced or increased Tg values over certain parts of the analytical range. Figure 3 also illustrates that there is no relationship between analyte level and the likelihood of difference percentages after HBT treatment that lie outside the ±3 sd boundaries.
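The 56.91% figure is simply three times the reported sd of 18.97%. The short sketch below works through that arithmetic and checks the two rising samples against the resulting band. The variable and function names are hypothetical, and the sign convention (post-HBT minus original, as a percentage of the original) is an assumption.

```python
MEAN_DIFF_PCT = -1.45  # reported mean percentage difference
SD_DIFF_PCT = 18.97    # reported sd of the percentage differences

cutoff = 3 * SD_DIFF_PCT                # about 56.91 percentage points
lower_bound = MEAN_DIFF_PCT - cutoff    # about -58.36%
upper_bound = MEAN_DIFF_PCT + cutoff    # about +55.46%


def percent_change(original, post_hbt):
    """Assumed convention: change relative to the original result."""
    return 100.0 * (post_hbt - original) / original


def outside_band(original, post_hbt):
    """True if the change lies outside the mean +/- 3 sd band."""
    change = percent_change(original, post_hbt)
    return change < lower_bound or change > upper_bound


# The two samples reported to rise after HBT treatment (ng/ml).
rising_samples = [(10, 16), (374, 995)]
for original, post in rising_samples:
    print(original, post, round(percent_change(original, post), 1),
          outside_band(original, post))
```

Both rising samples (changes of roughly +60% and +166%) fall outside the band, consistent with their being flagged.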
All differences between untreated and HBT-treated results were reproducible, and dilution was nonlinear for 31 of the 34 cases that had shown upward or (mostly) downward changes in Tg results of more than 3 sd percentages (>56.91%) after HBT treatment. None of the 20 samples with percentage changes of less than 10% showed evidence of nonlinear dilution.
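The dilution check relies on the 80–120% recovery criterion given in Materials and Methods. A minimal sketch of that criterion follows; the function names and the example values are hypothetical and serve only to show the calculation.

```python
def dilution_recovery(neat_value, diluted_measured, dilution_factor):
    """Recovery (%) of a diluted measurement relative to its expected value.

    The expected value is the neat result divided by the dilution factor,
    so a 1:2 dilution corresponds to dilution_factor=2.
    """
    expected = neat_value / dilution_factor
    return 100.0 * diluted_measured / expected


def is_linear(neat_value, dilutions, low=80.0, high=120.0):
    """dilutions is a list of (dilution_factor, measured_value) pairs."""
    return all(
        low <= dilution_recovery(neat_value, measured, factor) <= high
        for factor, measured in dilutions
    )


# Hypothetical sample that dilutes linearly (recoveries of 95%).
print(is_linear(40.0, [(2, 19.0), (4, 9.5)]))

# Hypothetical sample whose signal collapses on dilution.
print(is_linear(13.0, [(2, 2.0), (4, 0.4)]))
```

A sample is treated as nonlinear as soon as any single dilution step falls outside the recovery window, which matches the wording "any sample that showed recoveries outside this range".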
Among the 80 Tg assays performed on Mayo Clinic Rochester patients there were 8 showing possible heterophile interference (all Tg auto-antibody negative), 6 of which occurred in thyroid cancer patients. In 1 of these cases, this could have had therapeutic consequences, with Tg levels falling from 13 ng/ml before HBT treatment to undetectable levels after HBT treatment. Two potential heterophile interferences occurred in patients who did not suffer from thyroid cancer, 1 each in a hypothyroid patient and in a patient with metastatic breast cancer.
Discussion
Studying 1106 serum Tg samples, we detected likely HAB interferences in approximately 3% of the specimens tested. We identified samples possibly suffering from HAB interferences by reassaying all samples after treatment in HBT tubes. Specimens that displayed a percentage difference between the original result and the post-HBT treatment result of greater than 3 sd percentages (>56.91% more or less than the mean difference percentage) were considered to potentially suffer from HAB interference. Statistically, one would expect no more than 2 or 3 of 1106 specimens to exhibit a result change of this magnitude. However, we observed 34 samples that fulfilled these criteria, most showing very substantial downward changes in Tg results after HBT treatment. In most of these cases, nonlinear dilution supported the conclusion that the samples were indeed affected by HAB interferences. It therefore appears that the Beckman automated immunometric Tg assay suffers from much higher rates of false high and false positive results due to HAB interferences than would generally be regarded as acceptable. Even if one assumes that the 45% of specimens with original values less than 1 ng/ml that were not included in this study are all free of interferences, the resulting overall interference rate still exceeds 1.5%.
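The rates quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes normally distributed percentage differences for the chance expectation and uses the reported 55% inclusion fraction to back-calculate the total number of measurements; the variable names are hypothetical.

```python
from statistics import NormalDist

n_samples = 1106
flagged = 34

# Two-sided tail probability beyond +/-3 sd for a normal distribution.
tail = 2 * (1 - NormalDist().cdf(3.0))
expected_by_chance = n_samples * tail          # roughly 3 samples

observed_rate = flagged / n_samples            # roughly 3.1%

# The 1106 samples represented 55% of all Tg measurements in the study period.
total_measured = n_samples / 0.55              # roughly 2011 samples
# Worst case for the estimate: none of the excluded samples had interference.
conservative_rate = flagged / total_measured   # roughly 1.7%

print(round(expected_by_chance, 1), round(100 * observed_rate, 1),
      round(100 * conservative_rate, 1))
```

The observed count of 34 is about ten times the chance expectation, and the conservative rate still sits above 1.5%.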
Most laboratorians and clinicians would generally assume that HAB interferences occur at much lower frequencies. However, there is little objective evidence to support this belief. It is based largely on the fact that since the late 1980s all assay manufacturers have added blocking reagents to immunometric assays, as it had become apparent that unblocked immunometric assays suffered from an HAB interference rate of between 2% and 5% (7, 8). Adding polyclonal IgG from one or many species to the assay buffer seemed to eliminate this problem under controlled in vitro conditions and was subsequently widely adopted, but unfortunately rarely verified as to its actual efficacy under clinical testing conditions in patient populations. Most studies performed were small, often based on a case-control design (8). Widely quoted low interference rates of less than 0.1% are in the main based on the results of one particular study involving the identification of discrepant peripheral thyroid hormone levels and TSH levels as a surrogate marker of possible HAB interferences. This study was one of the very few large studies ever performed to assess systematically the prevalence of HAB interference in modern immunometric assays (14). It showed a heterophile interference rate of less than 0.03%. Based largely on these results and a limited number of much smaller studies, as summarized in two recent reviews (7, 8), it has been assumed that all modern immunometric assays exhibit similar low heterophile interference rates. However, even a superficial reading of the literature reveals that this is not the case. In particular, many recent studies, some of substantial size, have hinted at much higher rates, at least 1–2%, of heterophile interferences for a large range of assays from many manufacturers, including such common analytes as TSH, troponin, CA-125, creatine kinase, and prostate-specific antigen (13, 15–23). Two possible conclusions can be drawn from this.
First, it could be that the rates of HAB interferences in immunometric assays never improved as substantially after routine addition of blockers as had been assumed. Second, changing assay configurations and medical practices may have, over time, conspired to defeat the efficacy of standard blocking regimens. There is considerable circumstantial support for this second idea. Common, low specificity, low affinity polyspecific antibodies, which are capable of limited cross-reactivity with animal antigens, may be found in up to 40% of the normal population (8). They may cause transient interferences, but, given sufficient incubation time, they would eventually be bound by blocking reagents. However, most assays are now automated, meaning that reactions are rarely allowed to reach equilibrium, and there may be insufficient time to achieve complete blocking. Modern assays are also often configured with two or more mouse monoclonal antibodies for capture and detection. With the increasing use of monoclonal mouse antibodies in diagnostic imaging and medical therapy, and the resultant immunization of the recipients, the potential for heterophile interference increases significantly in this setting (7, 22).
It is difficult to say whether any host factors might have contributed to the HAB interferences observed in our study. For most patients we have no clinical data. If one extrapolates from the Mayo Clinic Rochester patients, then one might expect that about 60% of the serum Tg measurements were performed for thyroid cancer. Thyroid cancer is not a condition with any known predisposition to the development of HAB. The next most common indications for Tg measurement were hypothyroidism and thyroiditis. Both involve often vigorous host-immune responses. Autoimmune responses and regular immune responses to pathogens, both bacterial and viral, are well known to boost titers of polyspecific antibodies, which may cause heterophile interferences in immunoassays (8, 24, 25). However, only 1 of 13 patients within this group showed a possible interference, a rate no higher than the 6 of 48 observed in thyroid cancer patients. This latter interference rate exceeds 10%, significantly higher than the just under 3% observed in the overall cohort. This could be random variability or perhaps at least some thyroid cancer patients have a propensity to develop HAB, although this has not been previously described. In this case one might speculate that either a lesser proportion of outside referral tests were performed for thyroid cancer or that outside patients with thyroid cancer differed in some unknown way from Mayo Clinic Rochester patients with thyroid cancer.
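Whether 6 interferences among 48 thyroid cancer samples could plausibly be random variability can be gauged with a rough binomial check. This is only an illustrative sketch, not an analysis the authors report: it assumes independent samples, takes the overall cohort rate (34 of 1106) as the reference probability, and ignores the fact that this subgroup was singled out after the data were seen. The function name is hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p_overall = 34 / 1106                     # overall rate of flagged samples
p_tail = binomial_tail(6, 48, p_overall)  # chance of 6 or more among 48

print(p_tail)
```

Under these assumptions the tail probability is well below 0.05, which is in line with the statement that the subgroup rate is higher than the overall rate, although a post hoc comparison on 48 samples leaves the explanation open.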
For patients with thyroid cancer the clinical consequences of an artifactual elevation in serum Tg levels can be considerable. Besides creating patient and physician anxiety, in most cases additional investigations will ensue, which may involve radiation exposure or even invasive procedures. Moreover, during the last decade there has been a trend in thyroid cancer treatment of using radioiodine treatment for elevated serum Tg levels, even without performing prior diagnostic scanning or if diagnostic radioiodine scanning is negative (9–12). In a perfect world this may offer marginal benefits for some patients compared with the more conventional approach of expectant observation. However, as graphically demonstrated by our results, radioiodine treatment, which is given based solely on an elevated serum Tg result, can also result in unnecessary therapy of patients without actual recurrence. Potentially, all 17 anti-Tg autoantibody-negative cases in our study whose Tg levels became undetectable after HBT treatment fall into this category. Similar problems with unwarranted therapeutic interventions as a consequence of HAB assay interferences have received considerable publicity in recent years with regard to therapy of healthy women for trophoblastic disease as a consequence of false positive human chorionic gonadotropin measurements (26, 27). More recently, similar issues have also surfaced with regard to unnecessary adjuvant therapy for apparent prostate cancer recurrence, administered based on artifactual elevations in serum prostate-specific antigen levels (21).
Avoiding the trap of artifactual results due to HAB interference is not easy. Constant vigilance on the part of the clinician and close dialog with the laboratory are crucial. Just because an assay has not been shown to suffer from frequent heterophile interferences does not mean it does not have this problem. More likely, no one has looked. In addition, HAB interferences can on occasion be transient (28). A fall in serum Tg levels after radioiodine therapy therefore cannot necessarily be interpreted as evidence for successful thyroid cancer treatment, at least if the posttherapy scan was negative. One may simply be observing a spontaneous fall in HAB levels rather than a true decline in serum Tg as a result of treatment. In this context it is interesting to note that in one of the published studies of radioiodine therapy of Tg-positive, scan-negative patients, serum Tg levels also fell in a historical control group that had not received any radioiodine therapy (11). The lack of a further rise in an elevated Tg level after T4 withdrawal or recombinant TSH stimulation can indicate possible heterophile interference. However, this approach has its pitfalls. For example, a thyroid remnant capable of producing small amounts of Tg may be completely dormant during T4 therapy, but will produce some Tg when stimulated by TSH. If the initial, false positive Tg level was relatively low (low titers of HAB or partial blocking), e.g. 4–6 ng/ml, even small amounts of Tg secreted by the remnant after TSH stimulation may be sufficient to result in a 30–100% rise in Tg levels. This may then be interpreted wrongly as evidence of persistent or recurrent disease. Conversely, some tumor metastases that produce large amounts of Tg may show little further rise in Tg after TSH stimulation. In this case the Tg assay results may be dismissed falsely as a heterophile interference.
Based on our findings we have since reassayed several months of stored samples after HBT treatment, issued amended reports when indicated, and notified referring clinicians and laboratories. In this additional group of 1751 samples, which is separate from this study, we found a similar heterophile interference rate of just under 2.9%. We now routinely treat all of our samples in HBT tubes before serum Tg measurement. Based on our data, this has brought the heterophile interference rate down to excellent, and in this case known and verified, levels of less than 0.1%. However, we are under no illusions that this approach will suffice in avoiding all such problems in the future. Even good assays will throw up the occasional case of HAB interference, and even the most elaborate blocking scheme will sometimes fail. If results do not fit the clinical picture they must still be questioned and confirmed.
To confirm questionable Tg results, repeating the measurements with the same assay is insufficient. Elevated results caused by HAB interferences are usually, as also shown in our study, reproducible. Similarly, testing the sample for the presence of HAB is unlikely to be helpful. Whereas it is possible to measure some types of HAB, the results show very poor correlation with the presence or absence of clinically significant assay interferences. First, the available HAB assays are all designed to measure only one particular subgroup of HAB, human antimouse antibodies (HAMA). Transient or permanent polyspecific antibodies, rheumatoid factor-like antibodies, and many other HAB are not measured. Second, within the group of HAMA assays, different HAMA assays correlate poorly with each other (7, 29). This is reflected by the fact that, depending on the assay used, the published estimates of HAMA prevalence in the normal population vary from 1% to 80% (7). Consequently, these assays have limited value in excluding or confirming suspected clinically relevant HAB interference.
The simplest approach to suspected heterophile interferences is to repeat testing with a different assay, because a sample that shows interference in one particular assay may not show any problem in an assay from another manufacturer and vice versa (27). Alternatively, as we have done, samples can be treated with additional blocking reagents or assessed as to whether they behave linearly in a dilution series. If these measures fail to resolve the issue, chromatographic separation of Igs can be attempted before reassaying. One or several of these measures should allow the resolution of almost all questionable results and, in combination with good clinical judgement, prevent unwarranted investigations or therapy.
Acknowledgements
We thank William Reilly and Larry Dodge for shouldering much of the burden of organizing the workflow for assaying stored samples and organizing the routine HBT pretreatment of all current Tg samples. We also thank the people at Mayo Medical Laboratories for assisting us with alerting our customers to the problem and in fielding subsequent client inquiries. Finally, we thank all of the laboratory technicians who have provided and continue to provide the labor to enable the reassaying of stored samples and routine HBT treatment of all new samples.
This work was supported by the Mayo Clinic Department of Laboratory Medicine and Pathology and the Mayo Foundation.
Abbreviations: CV, Coefficient(s) of variation; HAB, heterophile antibody; HAMA, human antimouse antibodies; HBT, heterophile-blocking tubes; Ig, immunoglobulin; Tg, thyroglobulin.
Mizrahi I, Bray K, Kapsner K, Nunnelly P, Parson R, Smith T, Preissner C, O’Kane D 2001 Clinical performance of a chemiluminescent thyroglobulin assay on the Beckman Coulter’s Access Immunoassay System. Clin Chem 47(Suppl):A134
Committee on Gynecologic Practice American College of Obstetricians and Gynecologists
HAMA survey group