© 2003 American Thoracic Society
A Meta-analysis Derivation of Continuous Likelihood Ratios for Diagnosing Pleural Fluid Exudates
Medical University of South Carolina, Charleston, South Carolina; and University of New Mexico School of Medicine, Division of Pulmonary and Critical Care Medicine, Lovelace Health Systems, Albuquerque, New Mexico
Correspondence and requests for reprints should be addressed to John E. Heffner, M.D., Medical University of South Carolina, Division of Pulmonary and Critical Care Medicine, 812CSB, 96 Jonathan Lucas Street, P.O. Box 250623, Charleston, SC 29425. E-mail: heffnerj@musc.edu
In evaluating pleural effusions, clinicians perform thoracentesis and pleural fluid analysis, most often by Light's criteria, to establish the exudative or transudative nature of the effusion (1). Light's criteria (2) dichotomize effusions into exudative or transudative categories with the use of three pleural test criteria: pleural fluid-to-serum protein ratio, pleural fluid lactate dehydrogenase (LDH) concentration, and pleural fluid-to-serum LDH ratio. Several other pleural fluid test criteria have diagnostic accuracies similar to those of the individual tests within Light's criteria (3). These criteria include pleural fluid cholesterol, pleural fluid-to-serum cholesterol ratio, pleural fluid protein, and pleural fluid-to-serum albumin gradient. All of these tests dichotomize effusions into exudative or transudative categories by determining whether the test results are above or below a single cutoff point.
Problems exist, however, with these pleural fluid test criteria as commonly used in clinical practice. First, dichotomizing effusions into exudates and transudates by using a single cutoff point loses much of the information contained in pleural fluid tests, which generate continuous numeric results (4). Test results just beyond a cutoff point and those far beyond it are treated the same in that both establish the presence of an exudative effusion. 
This binary diagnostic strategy explains why Light's criteria frequently misclassify as exudates the pleural effusions associated with congestive heart failure (5–7), which usually have borderline pleural fluid test results when in the exudative range. It also contributes to the misclassification that occurs in 1 to 10% of patients with malignant pleural effusions who appear to have transudative effusions by Light's criteria (8). Second, combining two or more tests using a single cutoff point for each test, as done by Light's criteria, increases sensitivity but decreases specificity because only one of the tests needs to be positive to define an exudative effusion (9). Finally, the lower specificity caused by combining several criteria encourages physicians to order additional, and usually unnecessary, pleural fluid tests when initial test results do not fit clinical circumstances. Experts, for instance, recommend measuring pleural fluid-to-serum albumin gradients when patients with congestive heart failure have an exudative effusion by Light's criteria (7).
A Bayesian approach addresses these limitations of binary testing strategies (10). Rather than diagnosing the presence or absence of a condition, a Bayesian strategy uses test results to generate likelihood ratios that increase or decrease a clinician's pretest estimate of the probability of disease. Likelihood ratios represent the likelihood that a given test result would be found in a patient with, as opposed to without, disease (10). Likelihood ratios are usually calculated using a single test result cutoff point. These binary likelihood ratios have values above 1 for test results that increase the likelihood of an exudative effusion and values below 1 for test results that decrease the likelihood. We previously published multilevel likelihood ratios for pleural fluid test criteria commonly used to diagnose exudative pleural effusions (11). 
Multilevel likelihood ratios are calculated by using two or more cutoff points for the range of possible test results. These multiple cutoff points demarcate test result intervals that are each associated with a different likelihood that the patient has an exudative effusion. Although breaking up continuous test result values into ordinal intervals with multiple cutoff points improves diagnostic precision as compared with using a single cutoff point, the resulting multilevel likelihood ratios provide only an average value for the range of likelihood ratios that exist within each of the test result intervals (4). A more precise diagnostic approach would calculate an exact likelihood ratio for each possible discrete pleural fluid test result. Such discrete likelihood ratios are called continuous likelihood ratios (12). Methods for calculating continuous likelihood ratios have been reported for very few conditions.
In this study, we analyze with logistic regression a multicenter registry of pleural fluid test values from patients with established diagnoses to derive equations that calculate continuous likelihood ratios for pleural fluid test criteria for exudative effusions. We also compare continuous likelihood ratios with multilevel and binary likelihood ratios to determine whether continuous likelihood ratios provide statistically significant and clinically important advantages.
METHODS
Since 1995, we have maintained a multicenter registry of pleural fluid test results for patients categorized with transudative or exudative pleural effusions. Monthly Medline searches (using the terms "pleural effusion," "diagnosis," and "exudates and transudates") identify articles published since 1976 that contain pleural fluid test information. We contact primary investigators and invite them to submit their data. 
Data are accepted if each discrete pleural fluid test result can be linked to an individual patient and the patient's underlying diagnosis (termed individual patient data), studies enrolled consecutive patients with pleural effusions of unknown etiology, and the clinical evaluation of patients used a battery of diagnostic tests and patient follow-up to establish the underlying cause of the effusion (e.g., congestive heart failure, pleural malignancy). Patients categorized with exudative effusions required confirmation of a specific diagnosis (e.g., positive pleural fluid cytology or culture). Patients categorized with transudative effusions could not have a confirmed condition associated with exudates (e.g., malignant effusion) and required clinical evidence of a condition associated with transudates (e.g., clinical, echocardiographic, and radiographic evidence of congestive heart failure). Studies were excluded if more than 15% of patients were excluded from the reports because of an inability to diagnose an underlying condition, which suggested an inadequate diagnostic evaluation. The following criteria for diagnosing exudative effusions were examined in this meta-analysis: pleural fluid protein, pleural fluid-to-serum protein ratio, pleural fluid LDH, pleural fluid-to-serum LDH ratio, pleural fluid cholesterol, pleural fluid-to-serum cholesterol ratio, and pleural fluid-to-serum albumin gradient.
Multilevel Likelihood Ratios
Binary Likelihood Ratios
Continuous Likelihood Ratios
Statistical and Clinical Comparison of Likelihood Ratios
Diagnostic accuracies (proportion of effusions correctly categorized) were calculated for 0.5-U test result intervals for each of Light's criteria to determine the effect on the diagnostic accuracy of Light's criteria as each of the individual criteria approached its binary cutoff point. The study received an exemption from institutional review board approval.
RESULTS
Primary Studies
Patient demographics and etiologies of the pleural effusions are shown in Table 1. The overall prevalence of exudative effusions in the registry is 74.6%.
The patient registry database contained a different number of patients for each of the measured pleural fluid test criteria because not all tests were performed in all of the primary investigations (Table 2). The logistic regression model for each pleural fluid test analyzed only the registry subgroup of patients that had results for the test criterion under analysis. The prevalence of exudative effusions within each of these subgroups was similar (Table 2).
Table 2 shows the results of the logistic regression analyses for each pleural fluid test criterion assessed as an explanatory variable for the response variable of the presence or absence of an exudative pleural effusion. The p values in Table 2 show that each of the pleural fluid test criteria is statistically significantly associated with the response variable. The calculated value of x for each pleural fluid test criterion is shown in Table 2. The coefficients from each logistic model were used to derive an exponential equation for calculating continuous likelihood ratios for each pleural fluid test criterion. These derived equations are shown in Table 3.
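The step from logistic coefficients to an exponential likelihood ratio equation can be sketched as follows. This is a minimal illustration that assumes the standard construction, in which a one-variable logistic model gives the post-test odds as exp(intercept + slope × result) and dividing by the pretest odds in the modeled sample yields the likelihood ratio. The intercept and slope below are hypothetical placeholders and are not the coefficients of Table 2 or the equations of Table 3; only the 74.6% prevalence is taken from the registry description.

```python
import math


def continuous_lr(result, intercept, slope, prevalence):
    """Likelihood ratio for one test result from a one-variable logistic model.

    Assumes logit(P(exudate | result)) = intercept + slope * result, fitted on a
    sample whose prevalence of exudates is `prevalence`.
    """
    post_test_odds = math.exp(intercept + slope * result)
    pretest_odds = prevalence / (1.0 - prevalence)
    return post_test_odds / pretest_odds


# Hypothetical coefficients, for illustration only.
INTERCEPT = -3.0
SLOPE = 1.2
PREVALENCE = 0.746  # overall registry prevalence reported in the text

for value in (2.0, 3.0, 4.0, 5.0):
    lr = continuous_lr(value, INTERCEPT, SLOPE, PREVALENCE)
    print(f"test result {value:.1f} -> likelihood ratio {lr:.2f}")
```

Under this assumption the likelihood ratio is a single exponential function of the test result, so every discrete result receives its own value rather than an interval average.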
Binary (the use of a single cutoff point as recommended by Light's criteria) and multilevel (multiple cutoff points that demarcate test result intervals) likelihood ratios were also calculated. In Tables 4–10, the multilevel likelihood ratios with 95% confidence intervals are shown for each of the test result intervals. The distributions of exudative and transudative effusions within each test result interval are also shown. Because some test result intervals had small numbers of patients with either transudative or exudative effusions, the 95% confidence intervals were wide for the multilevel likelihood ratios for these intervals. Consequently, the point estimates for some multilevel likelihood ratios did not continuously increase or decrease as would be expected as the test criteria increased. For instance, the point estimate for the multilevel likelihood ratio corresponding to a pleural fluid protein value of 5.1 or more was lower than the corresponding value for the test result interval of 4.6 to 5.0 (Table 4 and Figure 1).
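The effect of small interval counts on precision can be illustrated with a short sketch. It assumes the common log-scale approximation for the confidence interval of a ratio of two proportions; the method actually used for the confidence intervals in Tables 4–10 is not stated in this text, and all counts below are hypothetical.

```python
import math


def interval_lr(exudates_in, exudates_total, transudates_in, transudates_total, z=1.96):
    """Multilevel likelihood ratio for one test result interval with an approximate CI.

    LR = (fraction of exudates in the interval) / (fraction of transudates in the interval).
    The CI uses the log-scale standard error for a ratio of proportions and
    requires nonzero counts in the interval.
    """
    lr = (exudates_in / exudates_total) / (transudates_in / transudates_total)
    se_log = math.sqrt(
        1 / exudates_in - 1 / exudates_total
        + 1 / transudates_in - 1 / transudates_total
    )
    lower = math.exp(math.log(lr) - z * se_log)
    upper = math.exp(math.log(lr) + z * se_log)
    return lr, lower, upper


# Hypothetical counts: same point estimate, ten times more patients in the second case.
sparse = interval_lr(40, 1000, 2, 300)
dense = interval_lr(400, 10000, 20, 3000)
print("sparse interval: LR %.2f (95%% CI %.2f to %.2f)" % sparse)
print("dense interval:  LR %.2f (95%% CI %.2f to %.2f)" % dense)
```

With only two transudates in the interval, the confidence interval spans more than an order of magnitude, which is the kind of imprecision that lets adjacent interval estimates fall out of order.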
Binary likelihood ratios above and below each pleural fluid test criterion's single cutoff point are shown in Tables 4–10. These binary values are repeated in the tables for each of the test result intervals shown to allow a comparison with the corresponding multilevel likelihood ratios. Continuous likelihood ratios for each pleural fluid test criterion calculated with the use of the equations listed in Table 3 are also shown in Tables 4–10 to allow comparison with binary and multilevel likelihood ratios. A discrete continuous likelihood ratio was calculated for the boundary test criterion result that defined the result intervals shown in Tables 4–10. In Table 4, for instance, the continuous likelihood ratio value of 22.00 corresponds to the pleural fluid protein test result of 5.2 g/dl. The value of 9.49 corresponds to the test result of 4.6 g/dl, and the value of 18.59 corresponds to the test result of 5.0 g/dl. Consequently, a pleural fluid protein test result of 4.6 g/dl results in binary, multilevel, and continuous likelihood ratios of 4.70, 40.12, and 9.49, respectively, as shown in Table 4. Because continuous likelihood ratios are derived from logistic regression, which analyzes all of the data entered into the model rather than subgroups within an interval of test results, they have narrower confidence intervals as compared with multilevel likelihood ratios (Tables 4–10). Figure 1 and Figures E1–E6 in the online data supplement display graphically the relationship between binary, multilevel, and continuous likelihood ratios to assess the clinical importance of differences between likelihood values. Results are displayed with two different scales of likelihood ratios on the ordinate axis to display curves at the upper and lower likelihood ratio ranges. Two authors (J.E.H. and K.H.) 
visually inspected each of these curves independently and agreed that sufficient differences existed between likelihood ratio values calculated by the three techniques to cause frequent misclassification of pleural effusions. For instance, a pleural fluid-to-serum LDH ratio result of 0.93 is associated with binary, multilevel, and continuous likelihood ratios of 5.32, 7.19, and 1.65, respectively, as shown in Table 7. A patient with congestive heart failure and a 20% pretest probability of an exudative effusion would have a post-test probability for an exudate of 57%, 64%, and 29% when evaluated by the binary, multilevel, and continuous likelihood ratio methods, respectively. Only the continuous likelihood ratio would correctly classify the patient as having a relatively low likelihood (29% post-test probability) of having an exudative as opposed to a transudative effusion. Light's criteria used in a dichotomous manner would define the effusion as an exudate. Two-way comparisons of binary, multilevel, and continuous likelihood ratios determined that both the multilevel and continuous likelihood ratios provided more information to a clinically important degree than the binary strategy for all criteria examined. Continuous likelihood ratios provided more information to a clinically important degree than the multilevel likelihood ratios for pleural fluid LDH, pleural fluid-to-serum LDH ratio, and pleural fluid-to-serum albumin gradient. Two-way comparisons of binary, multilevel, and continuous likelihood ratios for statistically significant differences for each pleural fluid test criterion determined that both the multilevel and continuous likelihood ratios provided clinicians with more diagnostic information than the binary testing strategy (p < 0.05). 
Continuous likelihood ratios provided statistically significantly more information as compared with multilevel likelihood ratios for the following criteria: pleural fluid LDH (p < 0.01), pleural fluid-to-serum LDH ratio (p < 0.01), pleural fluid cholesterol (p < 0.001), and pleural fluid-to-serum albumin gradient (p < 0.01). A correlation matrix of the pleural fluid test criteria is shown in the online repository (see Table E1 in the online supplement). The overall diagnostic accuracy of the binary Light's criteria strategy for patients in the database was 92%. Diagnostic accuracies for 0.5-U interval results of the Light's criterion pleural fluid-to-serum LDH ratio are shown in Table 11, which demonstrates that the overall diagnostic accuracy of the three-test Light's criteria decreases as the pleural fluid-to-serum LDH ratio approaches its cutoff point. This relationship of decreasing diagnostic accuracy of Light's criteria as test results near their cutoff points also occurs for pleural fluid LDH and pleural fluid-to-serum protein ratios (see Tables E2 and E3 in the online supplement).
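The post-test probabilities quoted above for the patient with congestive heart failure follow from the odds form of Bayes' theorem: pretest odds multiplied by the likelihood ratio give post-test odds. The sketch below reproduces that arithmetic with the pretest probability of 20% and the three likelihood ratios reported for a pleural fluid-to-serum LDH ratio of 0.93; the function name is an illustrative choice.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Likelihood ratios reported in the text for an LDH ratio result of 0.93.
reported_lrs = {"binary": 5.32, "multilevel": 7.19, "continuous": 1.65}

for method, lr in reported_lrs.items():
    probability = post_test_probability(0.20, lr)
    print(f"{method}: {probability:.0%}")
# prints binary: 57%, multilevel: 64%, continuous: 29%
```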
DISCUSSION
This meta-analysis of individual patient data used logistic regression to derive equations for calculating continuous likelihood ratios for discrete results of pleural fluid test criteria that diagnose exudative pleural effusions. Continuous likelihood ratios calculated from these equations, as compared with binary likelihood ratios using a single cutoff point, provided more information to a statistically significant and clinically important degree because they incorporated the full range of data available in pleural fluid testing, which generates continuous numeric results. Dichotomizing patients with a single cutoff point in a binary testing strategy, as recommended by Light's criteria, loses much of this information and misclassifies many patients who have test results near binary cutoff points (7). Diagnostic accuracy for Light's criteria decreased to as low as 65% to 86% as any one of the criteria approached its binary cutoff point (Tables 11, E2, and E3). For some but not all of the pleural fluid test criteria evaluated in this study, continuous likelihood ratios provided more information to a statistically significant and clinically important degree as compared with multilevel likelihood ratios. Readers must make their own assessment, however, of relative degrees of clinical importance between different testing strategies considering the subjective nature of this evaluation.
Although multilevel likelihood ratios with the use of multiple cutoff points provide more diagnostic precision as compared with binary strategies (11), several limitations impede their widespread use. First, calculation of multilevel likelihood ratios requires stratification of pleural fluid test results into multiple ordinal intervals demarcated by test result boundaries. As the number of stratifications increases in an effort to increase diagnostic precision, the number of available database test results within each stratified interval decreases. 
Decreasing the number of test result values within an interval lowers the precision of point estimates of multilevel likelihood ratios for that interval. Even though our patient registry represents the largest pleural fluid test result database in existence to our knowledge, the 95% confidence intervals around the multilevel likelihood ratios reported in this study were wide. This decreased precision of point estimates explains why multilevel likelihood ratios did not progressively increase or decrease across ranges of test criteria results (example shown in Figure 1), as would be expected. Equations for calculating continuous likelihood ratios derive from logistic regression modeling, which uses all of the available data in the patient registry rather than data stratified to test result intervals. Consequently, this meta-analysis provides more precise point estimates for continuous likelihood ratios as compared with multilevel likelihood ratios, as shown by the narrower 95% confidence intervals for continuous likelihood ratios (Tables 4–10).
A second limitation of multilevel likelihood ratios using multiple cutoff points relates to their routine clinical use, which requires clinicians to recall multiple likelihood ratios and multiple cutoff points. Also, programming multilevel likelihood ratios into computerized decision support tools (electronic medical records, computerized physician order entry, dynamic websites, and personal digital assistants) requires complex conditional "if-then" statements. In contrast, the continuous likelihood ratio equations derived by this study can be easily programmed into computerized decision support platforms or personal digital assistants for bedside use (27), which are increasingly using likelihood ratios for bedside patient care (28). 
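The programming contrast described above can be sketched briefly. The multilevel approach needs a table of cutoff points and one likelihood ratio per interval, whereas the continuous approach is a single expression. All cutoffs, interval values, and equation constants below are hypothetical and chosen only to show the shape of each approach; they are not taken from Tables 3–10.

```python
import bisect
import math

# Hypothetical interval boundaries and one likelihood ratio per interval.
CUTOFFS = [0.5, 0.7, 0.9]
INTERVAL_LRS = [0.1, 0.5, 2.0, 8.0]


def multilevel_lr(result):
    """Look up the interval containing `result` (the "if-then" logic as a table)."""
    return INTERVAL_LRS[bisect.bisect_right(CUTOFFS, result)]


def exponential_lr(result, scale=0.05, rate=5.0):
    """Hypothetical exponential equation of the form scale * exp(rate * result)."""
    return scale * math.exp(rate * result)


# Two nearly identical results on either side of the hypothetical 0.7 cutoff.
for result in (0.69, 0.71):
    print(result, multilevel_lr(result), round(exponential_lr(result), 2))
```

In this sketch the two nearly identical results receive interval values that differ fourfold, while the exponential equation changes by roughly 10%, which mirrors the concern about results that fall just on either side of a cutoff point.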
We demonstrated the statistical and clinical advantages of continuous likelihood ratios in providing clinicians with more test result information than the other strategies, although our assessment of clinical importance was by necessity subjective. Nevertheless, continuous likelihood ratios prevent clinicians from classifying two patients differently (exudate versus transudate), as occurs with Light's criteria, when their test results are nearly identical but fall just barely on either side of a cutoff point. Continuous likelihood ratio equations can be used in sequence when clinicians apply combinations of pleural fluid test criteria to pleural fluid analysis, as in Light's criteria (see Appendices E1 and E2 in the online supplement). In contrast to Light's criteria, however, pleural effusions are not dichotomized with continuous likelihood ratios into "exudative" or "transudative" groups but are characterized as to the probability that an effusion has one of the underlying disorders that is associated with an exudative effusion. This Bayesian approach decreases the relevancy of the conceptual model of exudative and transudative effusions and brings the clinician closer to the goal of laboratory testing, which is to estimate the relative probability of disorders within a differential diagnosis. The examples shown in the online supplement demonstrate how the use of continuous likelihood ratios to establish the probability of an exudative effusion (i.e., conditions associated with exudative characteristics) can prevent the misclassification of pleural effusions that occurs when effusions are dichotomized by Light's criteria in an overly simplistic manner.
As with all meta-analyses, the primary limitation of the present meta-analysis relates to the quality of the primary investigations. 
We have previously reviewed the challenges of designing rigorous studies to assess the discriminative properties of pleural fluid testing for identifying exudative effusions (9) and the design limitations of existing studies (3). In the absence of a single gold standard test, high-quality investigations use variable combinations of diagnostic studies, clinical evaluations, and patient follow-up to categorize pleural effusions. We limited the impact of study design flaws on our findings by establishing inclusion criteria for our patient registry that excluded primary studies that did not conform to the major quality criteria of diagnostic test evaluation (29). Regardless of the limitations of the primary studies, however, their findings represent the best available data, which have been used during the last several decades to entrench binary pleural fluid testing criteria into routine clinical practice (30). This meta-analysis improves the precision of each of these primary studies and extends their recommendations by deriving continuous likelihood ratios for use in a Bayesian diagnostic strategy. A strength of this meta-analysis relates to its use of individual patient data and an ongoing multicenter registry that updates available data for analysis. Although meta-analyses of individual patient data require extensive effort and resources, they allow better standardization of case definitions, outcomes, and covariates as compared with meta-analyses using summarized data (31). The limited familiarity among many clinicians with Bayesian techniques for using diagnostic test results might be considered a drawback to the use of continuous likelihood ratios. Recent applications of Bayesian techniques to the evaluation of solitary pulmonary nodules (32, 33), interpretation of methacholine tests (34), and diagnosis of pulmonary embolism (35) and community-acquired pneumonia (36), however, are widening the awareness of these techniques among pulmonary clinicians. 
When using likelihood ratios in diagnostic strategies that use multiple test criteria, such as Light's criteria, each of the criteria in the combination should be independent of the others to avoid test convergence and an overestimation of post-test probabilities (37). We previously reported an analysis of an earlier version of the patient registry used in this study that demonstrated colinearity (Pearson product-moment correlation coefficient of 0.84) of pleural fluid LDH and the pleural fluid-to-serum LDH ratio (3). The present analysis of the updated patient registry confirms a high degree of colinearity (coefficient of correlation of 0.92), which indicates these two criteria should not be used together. When using continuous likelihood ratios in series, therefore, the "abbreviated Light's criteria" consisting only of pleural fluid-to-serum protein ratio and pleural fluid LDH should be used. We have previously demonstrated that other test combinations have discriminative properties equal to those of Light's criteria and can avoid the need for serum tests (e.g., the combination of pleural fluid LDH, pleural fluid cholesterol, and pleural fluid protein) (3).
In conclusion, this study derives simple exponential equations for determining the post-test probability of exudative pleural effusions by calculating continuous likelihood ratios, which have improved precision compared with other diagnostic strategies in common use. Binary testing using a single cutoff point for each test, as done by Light's criteria, misclassifies a sizeable number of patients who have pleural fluid test results near the single cutoff points. We recommend that continuous likelihood ratios be used in clinical practice when clinicians apply Bayesian methods to pleural diagnosis.
Acknowledgments
The authors thank the primary authors for providing their primary, patient-level data. 
FOOTNOTES
Supported by the Medical University of South Carolina Center for Clinical Effectiveness and Patient Safety. This article has an online supplement, which is accessible from this issue's table of contents online at www.atsjournals.org. Received in original form January 11, 2003; accepted in final form April 8, 2003.
REFERENCES