Published ahead of print on January 7, 2005, doi:10.1164/rccm.200405-578OC
© 2005 American Thoracic Society doi: 10.1164/rccm.200405-578OC
Six-Minute Walk, Maximal Exercise Tests: Reproducibility in Fibrotic Interstitial Pneumonia
Department of Respiratory Services, Green Lane Hospital, Auckland, New Zealand; and Interstitial Lung Disease Unit, Royal Brompton Hospital, Chelsea, London, United Kingdom
Correspondence and requests for reprints should be addressed to Athol U. Wells, M.D., Consultant Respiratory Physician, Interstitial Lung Disease Unit, Emmanuel Kaye Building, Manresa Road, Chelsea, London SW3 6LR, UK. E-mail: a.wells{at}rbh.nthames.nhs.uk
Resting pulmonary function and exercise variables are widely used to stage and monitor idiopathic interstitial pneumonia (IIP). However, the variability of exercise data (maximal exercise and the 6-minute walk test) has not been evaluated definitively. We have prospectively quantified the reproducibility of resting and exercise functional data in fibrotic IIP (idiopathic pulmonary fibrosis, fibrotic nonspecific interstitial pneumonia) and have evaluated interrelationships between variables. Thirty consecutive patients with fibrotic IIP underwent serial resting pulmonary function tests, 6-minute walk (n = 29), and maximal exercise (n = 24) at an interval of 1 week, with all testing performed in accordance with American Thoracic Society standards. Within-subject reproducibility was excellent for 6-minute walk distance (SD/mean = 4.2%) and clinically acceptable for resting pulmonary function indices and VO2max on maximal exercise testing. However, the amplitude of oxygen desaturation at the end of exercise was poorly reproducible in both 6-minute walk and maximal exercise testing (SD/mean > 25%). There was a highly significant relationship between VO2max on maximal exercise testing and 6-minute walk distance (rs = 0.78, p < 0.0001). In fibrotic IIP, the excellent reproducibility of the 6-minute walk distance is a major advantage in routine staging and monitoring, whereas maximal exercise variables are poorly reproducible.
Key Words: interstitial lung disease; exercise tests; variability
Pulmonary function indices are central to the staging and monitoring of the two categories of idiopathic interstitial pneumonia (IIP) in which fibrosis predominates, hereafter termed fibrotic IIP: idiopathic pulmonary fibrosis (IPF) and fibrotic nonspecific interstitial pneumonia (NSIP). Resting pulmonary function tests provide invaluable prognostic information in fibrotic IIP, both at presentation (1–4) and with the evaluation of serial trends (5–8). The severity of oxygen desaturation at the end of maximal exercise testing is a major component of the old and new clinical-radiologic-physiologic (CRP) indices (1, 9). However, maximal treadmill testing is not always readily available and may be impracticable when there is advanced lung disease or concurrent cardiac disease. Thus, there is increasing interest in less aggressive field exercise testing in diffuse lung disease. The 6-minute walk test (6MWT), widely acknowledged as a valuable clinical tool in chronic obstructive pulmonary disease (COPD), is currently under evaluation in IPF and provides more accurate prognostic information than resting pulmonary function tests in that disease (10).
There is a surprising paucity of data on the reproducibility of exercise testing in fibrotic IIP. Without knowledge of reproducibility, it is difficult to determine, in individual cases, whether an apparent change in severity is significant or merely a reflection of measurement variation. Therefore, we have (1) compared the within-subject reproducibility of resting pulmonary function indices, incremental treadmill testing, and the 6MWT and (2) quantified interrelationships between these variables in 30 patients with fibrotic IIP. Some of the results of this study have been previously presented in the form of an abstract (11).
We recruited consecutive patients presenting to our unit, May 1998 to July 2000, with the clinical features of IPF, as defined by the American Thoracic Society (ATS)/European Respiratory Society consensus committee (12), and a clinical/high-resolution computed tomography (HRCT) diagnosis of fibrotic IIP: Clinical criteria were as follows:
For HRCT criteria, appearances compatible with fibrotic IIP were required, as in recent cohorts (5, 13), based on HRCT observations in patients with a biopsy diagnosis of usual interstitial pneumonia or NSIP and the clinical features of IPF (14). HRCT abnormalities were predominantly basal/subpleural in distribution and comprised a mixture of reticular and ground-glass abnormalities, with traction bronchiectasis when ground-glass attenuation was prominent and no consolidation or nodules. Appearances were subcategorized (13, 14) as follows: (1) typical of IPF; (2) indeterminate, but suggestive of IPF; and (3) suggestive of fibrotic NSIP. Exclusion criteria were as follows: clinical instability, a resting PaO2 of less than 7 kPa on air, and major comorbidity (e.g., ischemic heart disease, malignancy). Local ethics committee approval was obtained, and signed, informed consent was received from all patients.
HRCT
Pulmonary Function Testing
Exercise Testing
FVC and DLCO were the primary resting variables. The primary exercise variables were VO2max and 6MWT distance.
Statistical Analyses
As shown in Table 1, patients were predominantly male with a mean age of 73 years. The mean duration of dyspnea was 41 months, and 77% of patients were current or previous smokers. The HRCT findings are outlined in Table 1, with the majority assigned as typical of IPF or indeterminate but suggestive of IPF. Resting pulmonary function indices are summarized in Table 2; reproducibility was clinically acceptable (SD of differences/mean value [SD/mean] < 10%).
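The reproducibility index quoted throughout (SD of differences/mean value, expressed as a percentage) can be illustrated with a minimal sketch. The function name `sd_over_mean` is hypothetical, and the choice of denominator (the mean of all measurements pooled across both visits) is an assumption, because the text does not state exactly which mean was used.

```python
from statistics import mean, stdev


def sd_over_mean(visit1, visit2):
    """Return SD of paired differences divided by the pooled mean, in percent.

    Assumption: the "mean value" is the mean of all measurements from both
    visits; the source does not specify the denominator in more detail.
    """
    if len(visit1) != len(visit2) or len(visit1) < 2:
        raise ValueError("need two equal-length series from at least two patients")
    # Within-subject difference between the two visits for each patient
    diffs = [second - first for first, second in zip(visit1, visit2)]
    pooled_mean = mean(list(visit1) + list(visit2))
    return 100.0 * stdev(diffs) / pooled_mean


if __name__ == "__main__":
    # Made-up walk distances in meters, for illustration only (not study data)
    week1 = [400, 420, 380, 450]
    week2 = [415, 410, 395, 455]
    print(f"SD/mean = {sd_over_mean(week1, week2):.1f}%")
```

On this definition, a value below 10% corresponds to the threshold described above as clinically acceptable.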
Reproducibility of Exercise Testing
Twenty-nine patients were available for 6MWT reproducibility analyses (one patient was not available for the second test). No patients declined the test; all patients completed it as per standard protocol. Within-subject reproducibility for 6MWT distance, illustrated in Figure 1, was higher than for any other variable in the present study (r = 0.98; SD/mean = 4.2%). By contrast, both measures of 6MWT desaturation were very poorly reproducible (Table 2). Measurement variation did not correlate significantly with disease severity (as judged by DLCO levels and the extent of disease on computed tomography).
Twenty-four patients completed the maximal exercise test protocol on two occasions. Four patients declined to perform maximal exercise testing at recruitment (nonspecific aversion to the test) and a further patient declined the second exercise test for the same reason. One patient developed transient ventricular ectopy during the first test, and a repeat test was not performed. No other adverse effects were observed; in all cases, patients reached symptom limitation with the test terminated by dyspnea. Within-subject reproducibility for VO2max verged on clinical acceptability (r = 0.88, SD/mean = 10.5%; Figure 2), although it was slightly lower than the reproducibility of resting variables. However, oxygen desaturation on exertion was associated with very major measurement variation, illustrated in Figure 3, and this increased further when corrected for measured VO2max as a percentage of predicted VO2 (r = 0.61, SD/mean = 50.7%).
Exercise variables were reevaluated in 17 patients with HRCT features typical of IPF. As shown in Table 3, reproducibility (good for 6MWT distance and desaturation to 88% or lower at the end of the 6MWT; poor for oxygen desaturation, for both 6MWT and maximal exertion) was virtually identical to reproducibility findings in the whole study population.
Reproducibility of the Composite Scores
Within-subject reproducibility of the composite physiologic index was clinically acceptable (SD/mean = 7.8%). By contrast, the physiologic component of the CRP score was associated with considerable measurement variation (SD/mean = 25.9%), reflecting the major contribution made by oxygen desaturation on maximal exercise testing to the CRP score.
We examined all variables for an order effect. Mean values for all resting variables declined slightly at Visit 2, but on paired t testing, this was statistically significant only for TLC (p < 0.05). For exercise variables, both VO2max and 6MWT distance increased at the second visit, although only the 6MWT distance reached significance on paired t testing, 416.9 (141.1) m increasing to 434.6 (146.9) m (p < 0.005).
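The order-effect check rests on a paired t test between Visit 1 and Visit 2. The following is a minimal sketch of the test statistic only; the function name `paired_t_statistic` is hypothetical, and converting the statistic to a p value (which needs the t distribution and is not in the Python standard library) is deliberately left out.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(visit1, visit2):
    """Return (t, degrees_of_freedom) for a paired t test of visit2 vs visit1."""
    if len(visit1) != len(visit2) or len(visit1) < 2:
        raise ValueError("need two equal-length series from at least two patients")
    # Per-patient change from the first to the second visit
    diffs = [second - first for first, second in zip(visit1, visit2)]
    sd_diff = stdev(diffs)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    # t = mean difference divided by its standard error
    t = mean(diffs) / (sd_diff / sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Made-up distances in meters, for illustration only (not study data)
    t, df = paired_t_statistic([400, 420, 380, 450], [415, 410, 395, 455])
    print(f"t = {t:.2f} on {df} degrees of freedom")
```

A positive t indicates higher values at the second visit, the direction reported above for 6MWT distance.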
Correlations between Exercise Variables and Other Data
Bivariate equations were constructed with the 6MWT distance as the dependent variable and VO2max as one covariate; the other variables were examined as the second covariate in separate equations. This showed, for all analyses, that VO2max was a very strong determinant of the 6MWT distance, with no other variable having a significant relationship with 6MWT distance after adjustment for VO2max.
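The structure of these bivariate equations (6MWT distance as the dependent variable, VO2max as one covariate, and one further variable as the second) can be sketched as an ordinary least-squares fit. This is an illustrative sketch under the assumption of a standard linear model; the function name `fit_two_covariates` is hypothetical, and the significance testing used in the study is not reproduced here.

```python
def fit_two_covariates(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns (b0, b1, b2).

    Solves the 3x3 normal equations (X'X) b = X'y by Gauss-Jordan
    elimination with partial pivoting.
    """
    n = len(y)
    if len(x1) != n or len(x2) != n or n < 3:
        raise ValueError("need three equal-length series with at least 3 rows")
    rows = [(1.0, float(a), float(b)) for a, b in zip(x1, x2)]
    # Augmented matrix [X'X | X'y]
    aug = []
    for i in range(3):
        xtx_row = [sum(row[i] * row[j] for row in rows) for j in range(3)]
        xty = sum(row[i] * float(yi) for row, yi in zip(rows, y))
        aug.append(xtx_row + [xty])
    for col in range(3):
        # Pick the largest remaining pivot for numerical stability
        pivot = max(range(col, 3), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariates are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(3):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [v - factor * p for v, p in zip(aug[k], aug[col])]
    return tuple(aug[i][3] / aug[i][i] for i in range(3))


if __name__ == "__main__":
    # Made-up values for illustration only (not study data):
    # distance in m, VO2max in mL/kg/min, and a second covariate
    distance = [300, 360, 420, 480, 510]
    vo2max = [12, 14, 17, 19, 21]
    second = [45, 50, 40, 55, 48]
    print(fit_two_covariates(distance, vo2max, second))
```

Repeating the fit with each candidate second covariate in turn mirrors the "separate equations" described above; a coefficient near zero for the second covariate would be consistent with VO2max carrying the relationship.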
Survival in Relation to Resting and Exercise Variables
Despite the cardinal role of serial pulmonary function tests in monitoring fibrotic IIP, no prospective evaluation has been performed of the reproducibility of exercise indices in a sizeable cohort of patients. We report major intertest variation in indices of maximal exercise used in routine evaluation in many centers. By contrast, VO2max was acceptably reproducible, and the 6MWT distance exhibited minimal intertest variation. The measurement of pulmonary function and exercise indices is integral to the assessment and monitoring of patients with IPF and may also provide powerful prognostic information and facilitate treatment decisions, including timing of lung transplantation (3, 5, 26). Thus, within-subject reproducibility is a crucial consideration. Serial trends in indices with low "measurement noise" can be interpreted with greater confidence. However, data on the reproducibility of exercise testing in IPF are surprisingly limited. In a study of six patients with a variety of restrictive lung diseases, "maximal" incremental cycle ergometry was reported as reproducible, but measurement variation was evaluated at preselected VO2 levels (i.e., at 40 and 70% of predicted maximum) and not at the end of maximum exercise (27). The prognostic value of 6MWT data, notably oxygen desaturation at end exercise rather than the 6MWT distance, was an important additional finding, confirming the observations of Lama and coworkers (10). In both studies, the total amplitude of desaturation had a slightly higher prognostic value than observed desaturation to 88% or less. However, as shown in Table 2, the latter variable was strikingly reproducible in the present study, whereas the total amplitude of desaturation was not. For the same reason, the presence or absence of desaturation to 88% or lower at the end of the 6MWT may be a preferable staging variable, and this applies equally to less reproducible maximal exercise variables, which were similarly predictive of mortality. 
In COPD, studies of maximal cycle ergometry suggest variable reproducibility (3.5–29%), although intermeasurement variation may be minimized with familiarity (28–31). We noted acceptable reproducibility of both treadmill VO2max and post-Borg dyspnea scores in our study population. These results, in particular the acceptable reproducibility of VO2max, indicate that the level of effort and respiratory work in the second test was truly comparable to that of the first test. Thus, the often large differences in desaturation variables are not spurious (because of major variations in patient tolerance) but are likely to reflect the fact that exercise becomes maximal when the patient enters the steep part of the oxygen desaturation curve. It appears that minor differences in patient tolerance and apparently trivial delays in stopping the test may have a disproportionate effect on final saturation. Furthermore, similar variation in oxygen desaturation was evident at the end of the 6MWT, despite the striking reproducibility of the 6MWT distance. Measurement variation is likely to be further exacerbated by the inherent noise of cutaneous oximetry, for which the 95% confidence limits may be as wide as ± 4 to 5% (32). Exercise arterial gas measurements, advocated by some authors, are often unattractive to both patients and clinicians.
To our knowledge, the reproducibility of the 6MWT has not been systematically evaluated in IPF. The 6MWT is widely acknowledged as an objective measure of functional capacity in COPD (23), with established credibility as an interventional outcome measure, in pulmonary rehabilitation, prescription of ambulatory oxygen, and as a predictor of morbidity and mortality. However, mechanisms of dyspnea differ between COPD and restrictive lung disease, and therefore extrapolation from COPD to IPF is unwarranted. Our results show that the 6MWT distance is highly reproducible in fibrotic IIP.
The small learning (training) effect that we observed was less than that generally reported for COPD (33). It seems likely that with IPF, as has been shown in severe COPD (34), exercise intensity achieved during a 6MWT may approach maximal exercise. VO2max had a particularly strong positive correlation with 6MWT distance. Therefore, the 6MWT distance may be regarded as a surrogate marker for VO2 in IPF, as previously demonstrated in COPD, cardiac failure, and end-stage lung disease (35–37).
The 6MWT offers a number of advantages over maximal exercise testing. It does not require sophisticated equipment. It may correlate better with quality-of-life indices than maximal exercise testing (38) and is seldom associated with patient aversion. Five of 30 study participants declined to undergo maximal treadmill testing, whereas none expressed reservations about the 6MWT, despite a lack of familiarity with either test. Furthermore, in fibrotic IIP, maximal exercise testing is often contraindicated by the severity of disease or by concurrent ischemic heart disease. In a study of 68 patients with IPF, maximal exercise testing was performed as part of protocol, but was contraindicated in more than 30% of cases (39). By contrast, the 6MWT is widely performed with little morbidity in elderly patients with significant cardiac failure (40).
The high reproducibility of spirometric volumes and lower but acceptable reproducibility of gas transfer in the present study is in keeping with previous reports (27, 41). It has recently been reported that short-term trends in gas transfer (5) or FVC (6, 7) are the most accurate determinants of survival in fibrotic IIP, with FVC trends sometimes easier to interpret, because of lower variability. In this regard, the high reproducibility of the composite physiologic index is reassuring; this composite index corrects for concurrent emphysema, which may mask disease progression by causing FVC levels to be spuriously preserved (1).
By contrast, the physiologic component of the old CRP score (9) was unacceptably variable, even though resting arterial gases (a small component of the CRP score) were not repeated but analyzed as identical results.
Our study, a prospective evaluation of consecutive patients meeting clinical criteria for IPF, was designed specifically to capture the spectrum of disease encountered in routine clinical practice. HRCT appearances were either typical of IPF or, in a large minority of cases, compatible with either IPF or fibrotic NSIP. In the latter scenario, usual interstitial pneumonia is found at biopsy in the majority of cases (13). On the basis of this finding, and the poor survival in our own population, we estimate that the proportion of patients with underlying histologic appearances of fibrotic NSIP was unlikely to have exceeded 20%, the proportion of NSIP in the population in which the prognostic value of the 6MWT was reported (10). We were reluctant to exclude patients with possible fibrotic NSIP as the outcome in this disorder has significant overlap with IPF (5, 42). More important, expert radiologic evaluation in distinguishing between usual interstitial pneumonia and NSIP on HRCT is not always available in routine clinical practice. However, the findings in the present study changed minimally when 13 patients with "intermediate" HRCT appearances were excluded from analysis.
Our study population differed from patients with IPF who underwent biopsy in another respect: concurrent emphysema was evident on HRCT in more than 40% of cases. This finding may reflect the relatively advanced age of our patient group, another aspect that is representative of routine clinical practice. The high prevalence of coexisting emphysema is likely to have contributed to a relative preservation of lung volumes in our patients (1); as a result, mean FVC levels are higher than in many studies of younger patients with fibrotic IIP.
However, the reproducibility of resting and exercise functional variables was not linked to the functional severity of disease at baseline. Because a wide range of disease severity was studied, our findings can be applied in routine practice. A more important limitation was the logistic need to separate maximal exercise testing by 1 week. Confounding factors that might, in theory, reduce reproducibility were minimized. Patients exercised at the same time of day, were clinically stable, with no changes in medication, and there was careful standardization of patient instructions, protocol, operator, equipment, and calibration. Despite this, it is theoretically possible that even at an interval of 1 week, progression of disease might have occurred in some cases. Furthermore, in two cases, a decline in spirometric volumes in excess of 10% was noted, despite the absence of symptomatic deterioration. However, removal of these cases did not materially improve the reproducibility of exercise variables and, in any case, progression of disease might be expected to influence all variables equally. In designing this study, we aimed to reproduce the application of exercise testing in the routine evaluation of diffuse lung disease. Patient characteristics were typical of populations with fibrotic IIP managed outside major referral centers. The use of oximetry rather than arterial gases reflects clinical practicability. Similarly, the 6MWT and maximal exercise testing were performed on only two occasions. The performance of two practice walks has been advocated by some (43); however, our findings support the recent ATS guidelines for the 6MWT, which state clearly that only one practice walk is required (23). The reproducibility of the 6MWT distance appears to be exceedingly high in IPF, and highly unlikely to improve materially with repeated testing.
Conclusions
The authors gratefully acknowledge clinical research assistance from Ms. S. Rudkin and W. Fergusson.
Supported in full by the Health Research Council of New Zealand.
Conflict of Interest Statement: T.E. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript; P.Y. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript; D.M. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript; A.U.W. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript.
Received in original form May 2, 2004; accepted in final form December 29, 2004