Published ahead of print on May 13, 2003, doi:10.1164/rccm.200209-1093OC
© 2003 American Thoracic Society
Composite Spirometric–Computed Tomography Outcome Measure in Early Cystic Fibrosis Lung Disease
Divisions of Pediatric Pulmonology and Pediatric Radiology, Department of Radiology, and the Division of Biostatistics HRP, Stanford University Medical Center, Palo Alto, California
Correspondence and requests for reprints should be addressed to Terry E. Robinson, M.D., Pediatric Pulmonary Division, Stanford University Medical Center, 701 Welch Road, Whelan Building #3328, Palo Alto, CA 94304-5786. E-mail: ter{at}stanford.edu
With the advent of therapies aimed at young patients with cystic fibrosis who have mildly reduced pulmonary function, the need for improved outcome measures that discriminate treatment effects has become important. Pulmonary function measurements and chest high-resolution computed tomography (HRCT) scores have each been used separately to assess interventions. We evaluated these modalities separately and together during a treatment study to develop a more sensitive outcome measure. In a 1-year trial, 25 children randomized either to daily Pulmozyme or to normal saline aerosol were evaluated at randomization and at 3 and 12 months. Outcome variables were pulmonary function test (PFT) results, a global HRCT score, and a composite score incorporating PFTs and HRCT scoring. Regression analyses with generalized estimating equations permitted estimation of the difference in treatment effect between groups over time for each outcome. The largest differences in treatment effect observed at 12 months, measured by the percentage change from baseline, were with the composite total and maximal CT/PFT scores (35.4 and 30.4%), compared with mean forced expiratory flow during the middle half of the FVC (FEF25–75%) (13.0%) and total and maximal global HRCT scores (6.2 and 7.2%). The composite total and maximal CT/PFT scores were the most sensitive outcome measures for discriminating a treatment effect in children with cystic fibrosis with normal or mildly reduced pulmonary function during a 1-year trial of Pulmozyme.
Key Words: cystic fibrosis; composite computed tomography/pulmonary function test score; high-resolution computed tomography; pulmonary function
Cystic fibrosis (CF) is the most common lethal autosomal recessive disease in white populations, affecting nearly 60,000 people worldwide (1, 2). The predominant cause of morbidity and mortality is progressive obstructive lung disease, resulting from reduced mucociliary clearance, bronchiolar and bronchial obstruction, recurrent endobronchial infections, and persistent inflammation and destruction of the airways. Progressive decline in pulmonary function, particularly in the slope of the decline of percent predicted FEV1 per year, has been noted in younger children, who for the most part have milder lung function abnormalities and fewer respiratory tract exacerbations compared with older patients with moderate to severe obstructive CF lung disease (3–6). Despite often normal or only mildly reduced pulmonary function in young children, there is an exuberant inflammatory response and increased deoxyribonucleic acid levels in bronchial airways, as noted in bronchoalveolar lavage studies of infants and young children (6–9), which lead to worsening obstruction and progressive abnormalities in the CF airway. Additional structural changes in the lung involving bronchiectasis, bronchial wall thickening, and air trapping have been demonstrated in infants and young children with CF (10–16). To address these concerns, studies have been conducted in younger children with CF to prevent progressive decline in lung function and disease (6, 15, 17). With the advent of therapies aimed at the young patient with CF, the need for improved outcome measures that can discriminate treatment effects in subjects with mild CF lung disease is becoming ever more important. Previous trials have used percent predicted FEV1 as the best documented outcome surrogate for CF disease severity and progression (18).
Unfortunately, FEV1 cannot discriminate early or mild CF lung disease, and indices of small airway changes, such as mean forced expiratory flow during the middle half of the FVC (FEF25–75%), have inherently large variances, requiring large sample sizes (3, 5, 6, 19). High-resolution computed tomography (HRCT) imaging has shown potential to detect both early and transient pulmonary structural changes (10–16, 20, 21). As a result, there is increasing interest in using HRCT to evaluate early or mild CF lung disease. Because chest HRCT, unlike pulmonary function measures, can discriminate regional variation in CF lung disease severity, we were interested in determining whether HRCT scoring would further supplement pulmonary function outcomes during a drug intervention trial. Furthermore, pulmonary function measurements (FEV1 and FEF25–75%) (2, 6, 22–25) and chest HRCT scores (13, 15, 20, 21) have been separately used to assess the effect of an intervention. During the course of this study we became interested in evaluating whether combining these two modalities could increase the ability to detect changes from a therapeutic intervention in children with mild CF lung disease. To this end, we evaluated changes in pulmonary function measurements, HRCT scores, and a composite score, which combined pulmonary function and HRCT imaging modalities, during a 1-year prospective, randomized, double-blind, placebo-controlled trial of dornase alfa (Pulmozyme; Genentech, Inc., South San Francisco, CA) in children with mild CF lung disease.
Study Population
Children with a diagnosis of CF, confirmed by pilocarpine iontophoresis sweat chloride test and/or CF gene mutation analysis, were enrolled from four CF centers in northern California. Inclusion criteria included routine medical care in a CF clinic, age 6–18 years, FVC greater than or equal to 85% and percent predicted FEV1 of about 70%, and ability to perform reproducible pulmonary function tests (PFTs). Exclusion criteria were inability to perform reproducible upright and supine spirometry, inability to take the trial medication, acute asthmatic attack, recent lower respiratory tract infection before enrollment requiring a change in antibiotic, bronchodilator, or anti-inflammatory therapy, or use of dornase alfa within a previous 3-week period before enrollment and testing. Before enrollment into the study, informed consent was obtained from the patients and their parents. The study protocol was approved by the Stanford University Administrative Panel in Human Subjects.
Definition of Terms
Study Design and Procedures
Each subject with CF and his/her family were given diary sheets to record treatment medications taken, and all subjects with CF were asked to return all used and unused vials of study drug at 3 and 12 months after initial baseline testing. All subjects with CF were assessed after completing 3 months and 1 year of the intervention. During each testing session, a brief medical history was obtained, a physical exam was performed (yielding an acute change clinical score) (26), height and weight were measured, and PFTs and spirometer-triggered CT imaging were performed. Lung function was assessed by standard spirometry using a portable pulmonary function system (Vmax model 229; SensorMedics Corporation, Yorba Linda, CA) and by body plethysmography (model 1085; Medical Graphics Corp., Minneapolis, MN). Spirometry and lung volumes were obtained in the sitting position. Supine pulmonary function measurements were also obtained for purposes of spirometer triggering of the CT scanner (27). Pulmonary function measurements (FVC, FEV1, and FEF25–75%) were expressed as percentages of predicted based on normal prediction equations derived from the data of the Harvard Six Cities Study (28). In addition, a chest radiograph at the time of initial testing was obtained in all subjects except one, who was unable to get authorization for this part of the testing. Additional airway bacterial cultures were obtained in all subjects at test 1, either by expectorated sputum samples or by oropharyngeal swabs.
Chest HRCT images were obtained using a previously described protocol (21). Contiguous 1.5-mm collimation images were obtained at near full inspiration (
HRCT Scoring
For both total and maximal scores, a subject's global score was computed as the sum of all seven HRCT component scores. One radiologist also evaluated initial chest radiographs using the Brasfield scoring system (29); the scorer was blinded to subject identity and results of the alternate scoring system. All chest radiographs were assessed more than 2 months after the scoring of the chest CT scans.
Statistical Analysis
Post Hoc Development of a Composite Score
We explored the possibility that a composite score incorporating PFT and HRCT variables would increase the ability to detect differences in rate of change between placebo and treatment groups. To develop composite scores, six separate principal components analyses were conducted on the nine component outcomes (seven HRCT and two PFT) separately for total and maximal CT scores, one analysis for each combination of treatment arm and elapsed months. Each principal components analysis used the correlation matrix of the nine outcomes to produce nine separate scores, which were uncorrelated among themselves, so that each score provided information independent of that provided by the other scores. The first of these scores (first principal component) had the greatest differences among individual observations; the second principal component had the second greatest differences; and so on, such that the ninth principal component contained the smallest differences.
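As a rough illustration of the kind of analysis described above (principal components extracted from a correlation matrix, giving mutually uncorrelated scores ordered from largest to smallest variance), the following minimal Python sketch may help. It is not the study's software: the toy two-variable data, the function names, and the use of power iteration with deflation are all assumptions made for illustration, and the method as written assumes the eigenvalues are distinct.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(k)]
    return [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, k = len(z), len(z[0])
    return [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(k)]
            for a in range(k)]

def principal_components(corr, iters=200):
    """Return (variance, loading vector) pairs, largest variance first.

    Power iteration finds the leading eigenvector; deflation then removes
    it so that the next pass finds the following one.
    """
    k = len(corr)
    m = [row[:] for row in corr]
    components = []
    for _component in range(k):
        v = [1.0 + 0.1 * i for i in range(k)]  # arbitrary, non-symmetric start
        for _step in range(iters):
            w = [sum(m[a][b] * v[b] for b in range(k)) for a in range(k)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        variance = sum(v[a] * sum(m[a][b] * v[b] for b in range(k))
                       for a in range(k))
        components.append((variance, v))
        m = [[m[a][b] - variance * v[a] * v[b] for b in range(k)]
             for a in range(k)]
    return components

# Hypothetical toy data: five observations of two correlated outcomes.
rows = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
z = standardize(rows)
corr = correlation_matrix(z)
comps = principal_components(corr)

# Component scores: projection of each standardized observation on each loading.
scores = [[sum(zr[j] * v[j] for j in range(len(zr))) for _, v in comps]
          for zr in z]
cross = sum(s[0] * s[1] for s in scores)  # near zero: scores are uncorrelated
```

In this toy case the two variables have correlation 0.8, so the first component carries variance 1.8 and the second 0.2 of the total of 2. The study applied the same idea to nine outcomes, which would give nine components.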
Principal components analysis revealed that the first two components accounted for most (
To create S, PFT and HRCT outcomes were separately averaged and then standardized to a common scale (from 0 to 1). Also, the scale for PFT outcomes was reversed (by subtraction from 1) so that high values represent poor lung condition, as with HRCT outcomes. Let P be the set {FEF25–75% percent predicted, FEV1 percent predicted}, and let H- be the set of HRCT component outcomes excluding atelectasis/consolidation (and global). The proposed composite score S is as follows:
Minima and maxima are meant to represent the respective lower and upper bounds on the possible range that outcomes of that type (e.g., HRCT total scores) can assume. Analysis used H-min = 6 and H-max = 24 for total HRCT outcomes, H-min = 1 and H-max = 4 for maximal HRCT outcomes, and Pmin = 35 and Pmax = 160 for PFT outcomes. HRCT and PFT terms are multiplied in the index so that only a lung in perfect condition on both measures will receive the highest value of S = 1.
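The published equation for S is not reproduced in this text, so the following Python sketch is only one reading of the verbal description: average each outcome type, rescale to 0–1 using the stated bounds, reverse the PFT scale, and multiply the two terms. The function names are hypothetical. The choice to multiply the complements of the two "poor condition" values is an assumption, made so that S = 1 occurs only when both measures are at their best, as the text states.

```python
def rescale(value, lower, upper):
    """Map a value from the range [lower, upper] onto [0, 1]."""
    return (value - lower) / (upper - lower)

def composite_score(hrct_components, pft_percent_predicted,
                    h_min, h_max, p_min=35.0, p_max=160.0):
    """Sketch of a composite CT/PFT index under the assumptions stated above.

    hrct_components: HRCT component outcomes in H- (higher = worse).
    pft_percent_predicted: FEF25-75% and FEV1 percent predicted (higher = better).
    """
    h_mean = sum(hrct_components) / len(hrct_components)
    p_mean = sum(pft_percent_predicted) / len(pft_percent_predicted)

    # Both terms on a common 0-1 scale where high values mean poor condition.
    h_poor = rescale(h_mean, h_min, h_max)
    p_poor = 1.0 - rescale(p_mean, p_min, p_max)  # PFT scale reversed

    # ASSUMPTION: the product uses the complements, so only a lung at the
    # best bound on both measures reaches S = 1.
    return (1.0 - h_poor) * (1.0 - p_poor)

# Bounds quoted in the text: 6-24 for total HRCT outcomes, 1-4 for maximal.
best = composite_score([6] * 6, [160.0, 160.0], h_min=6, h_max=24)
middle = composite_score([15] * 6, [97.5, 97.5], h_min=6, h_max=24)
worst_ct = composite_score([4] * 6, [100.0, 100.0], h_min=1, h_max=4)
```

Under this reading, a subject at the best bound on both measures scores 1, a subject halfway along both scales scores 0.25, and a subject at the worst HRCT bound scores 0 regardless of pulmonary function. That shows how the product penalizes abnormality on either measure.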
Modeling
Interpreting the Estimated Treatment Effect
A total of 25 subjects with CF were enrolled and randomized. Of the 25 subjects randomized, 21 completed the 1-year trial, whereas 4 subjects completed 3 months on study drug and had follow-up testing. All four noncompleters withdrew for reasons unrelated to the study drug. (Three of the subjects were administratively censored due to late enrollment and were only able to complete 3 months of the study period. A fourth subject withdrew from the study after 3 months for nonmedical reasons. Two subjects in each group did not complete test 3.) There was no statistical difference between these 4 subjects and the other 21 subjects for baseline characteristics (data not shown). As none of these reasons for not completing the study appear to have been outcome dependent, the data for these four subjects were included in the analysis per intent-to-treat.
Adherence (average proportion of medication taken on the basis of returned used and unused vials and diary sheets) over 12 months was 86.9% for the placebo group and 85.6% for the dornase
Baseline Statistics
Treatment Effects
Estimates of the difference in treatment effects between groups and their confidence intervals in terms of percentage change from baseline were derived from GEE estimates of the regression models' coefficients for total and maximal HRCT scores as well as for percent predicted FEV1 and FEF25–75% (Table 2). For all outcome measures except air trapping, regression fits suggested improvement over time in the dornase alfa treatment group as compared with the placebo group. From baseline to 12 months, differences between treatments in estimated percent change from baseline for percent predicted FEV1 and FEF25–75% were -4.1 and -13.0%, respectively. The differences between treatments for estimated percent change from baseline for the Total Global HRCT score and Maximal HRCT score were 6.2 and 7.2%, respectively. The largest difference between treatments in estimated percent change from baseline to 12 months using total HRCT scores was in the composite score S (35.4%) followed by FEF25–75% (-13.0%), mucus plugging (12.1%), and others (Table 2). The composite score based on total HRCT scores was at least 2.7 times more sensitive at detecting differences in percent change from baseline between groups than any other outcome measure examined, including pulmonary function measurements, global HRCT scores, and individual HRCT component scores (Table 2). When maximal regional component scores were used, the largest difference in estimated percent change from baseline after 12 months of therapy was again seen in the composite score S (30.4%), followed by mucus plugging (26.9%) and extent of bronchiectasis (15.7%) (Table 2). Table 2 also summarizes tests of the null hypothesis of no difference between treatment and normal saline placebo groups in change over time on outcome measures (i.e., H0: β2 = 0).
A statistically significant effect (defined as two-sided p value < 0.05) is indicated where a confidence interval does not include zero (i.e., for total and maximal composite scores, Table 2). No correction for multiplicity of comparisons was made in the construction of the tables. Overall, the composite score S was the most sensitive measure of difference between treatments using either total or maximal HRCT outcome scores. Furthermore, except for extent of bronchial wall thickening and severity of bronchiectasis, estimates of treatment effect were greater for maximal than total HRCT scores (Table 2). As a result, the maximal composite score appeared to be more sensitive for detecting differences in change than the total composite score.
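The "at least 2.7 times" comparison for total HRCT scoring can be checked against the figures quoted in the text. The short Python sketch below uses only the 12-month differences stated above, taken as absolute values; the remaining Table 2 entries are not reproduced in this text and are therefore omitted. The dictionary and variable names are illustrative.

```python
# 12-month between-group differences in percent change from baseline
# (absolute values) for total HRCT scoring, as quoted in the text.
reported_total = {
    "composite S": 35.4,
    "FEF25-75% predicted": 13.0,
    "mucus plugging": 12.1,
    "total global HRCT": 6.2,
    "FEV1 % predicted": 4.1,
}

composite = reported_total["composite S"]
others = {name: value for name, value in reported_total.items()
          if name != "composite S"}

# The smallest ratio is against the next-largest effect.
runner_up = max(others, key=others.get)
ratio = composite / others[runner_up]
print(f"composite S vs {runner_up}: {ratio:.2f}x")
```

The next-largest quoted effect is FEF25–75%, and 35.4 / 13.0 is about 2.72, consistent with the stated factor. For maximal scoring the margin over the next-largest quoted effect is narrower (30.4 versus 26.9% for mucus plugging).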
Adjustment for Influential Covariates
In this 1-year study evaluating the effect of dornase alfa in children with mild CF lung disease, we found that a composite CT/PFT score was the most sensitive outcome measure for detection of a treatment effect. When pulmonary function measurements (percent predicted FEV1 and FEF25–75%) were used alone to assess the effect of dornase alfa, the treatment effect differences at 1 year were 4.1% (p = 0.33) and 13.0% (p = 0.07), respectively. Quan and coworkers (6), in a large (n = 410) 2-year study in children with mild CF lung disease, found a mean treatment effect difference in percent predicted FEV1 of about 3% between groups (approximately +2% for dornase alfa and -1% for placebo at 48 weeks). For percent predicted FEF25–75% at 48 weeks, they also found a mean treatment effect difference of 7.7% between groups (approximately +8% for dornase alfa and +0.3% for placebo). Quan and coworkers (6) fit regression models using generalized least-squares, adjusted for baseline percent predicted FEV1, whereas our current study used GEE, adjusting for weight in kilograms and age in years. Our estimates of dornase alfa spirometric treatment effects were similar to but slightly higher than those of Quan and coworkers (6). In this study, for all outcome measures except the HRCT air trapping score, regression fits suggested improvement over time in the dornase alfa treatment group compared with the placebo group. This finding can likely be explained by the fact that our air trapping scoring (1, absent; 2, < 25% lobar surface area; 3, 25–50% lobar surface area; 4, > 50% lobar surface area) was not sensitive enough to pick up the milder extent of air trapping seen in patients with CF in this study.
This has recently been confirmed by the use of an automated approach to quantitative air trapping in subjects with CF who participated in this study, which found that the extent of air trapping in our population of patients with mild CF at baseline using a stringent definition was 7–21% (33). After 1 year of treatment, the Pulmozyme intervention group had a decline in air trapping, whereas the normal saline placebo group had an increase in air trapping (34). In this study, the composite CT/PFT score, which combined aspects of functional assessment (PFT measures) and HRCT structural analysis, had greater ability to detect changes from a therapeutic intervention than PFT outcome measures or HRCT scores alone. When PFT measures, global HRCT scores, or individual HRCT scoring components were used to assess differences between the dornase alfa and placebo treatment groups, there was a 0.4 to 26.9% difference between groups, as measured by percent change from baseline at 1 year. Although there were no significant differences demonstrated between groups for PFT measures, global HRCT scores, and individual component scores, there was a trend toward differences observed between groups that may have reached statistical significance if a larger sample size had been studied. These differences in treatment effects, however, all ranked below those of their corresponding composite CT/PFT score, which was the only outcome measure that demonstrated significant differences between groups using either Total or Maximal HRCT outcomes (Table 2). These findings suggest the potential advantage of using combined CT/pulmonary function composite scoring for future clinical trials. However, these results were obtained in a small sample of 25 subjects with CF, who have normal or minimally decreased pulmonary function and minimal lung disease, as determined by HRCT scoring. These results should therefore not be inferred beyond the particular subjects within the sample.
Subsequent work will need to be done to validate this composite scoring system, using a larger sampling of children and adults with CF lung disease obtained using a probability-sampling design, before it can be recommended as an acceptable outcome measure for future clinical trials. The maximal composite score was the most sensitive outcome measure for detecting differences in rate of change between placebo and dornase alfa treatment groups (p < 0.01, Table 2). This finding likely results from the fact that this measure combines global pulmonary function with information from those lung zones that show the greatest disease severity. Use of regional HRCT scoring has a distinct advantage over conventional pulmonary function measurements in that detection of focal disease severity is possible, thus maximizing the sensitivity to discern a treatment effect. In conclusion, we found that a composite CT/PFT score was a more sensitive outcome measure than PFT or CT alone for discriminating a treatment effect in young patients with CF who have normal or mildly reduced pulmonary function. The most sensitive measure for determining a treatment effect was a maximal composite CT/PFT score, which combined global pulmonary function measurements with regional assessment of the greatest disease severity. We propose the maximal composite CT/PFT score as a potential new outcome measure for intervention trials in early and/or mild CF lung disease. In light of the need to develop more sensitive outcome measures to study interventions in subjects with mild CF lung disease, we have presented a new composite index score, combining two different modalities of assessment of subjects with CF during a specific intervention. The composite index offers a promising new assessment tool that could advance the field of outcomes research in patients with CF, but it clearly needs further study and validation. Applicability to other lung diseases should also be explored.
The authors thank Malayattil Vijayalakshmi, Anne S. Bonnel, Krishnaveni Kesavaraju, Prachi Bhise, and John Oehlert for their participation in the study and also thank Glenn Hodge (pediatric pulmonary function laboratory) and Lisa McClennan and Diane Holmes (Pediatric Radiology Section–Ultrafast CT imaging) for their participation in the study.
Supported by a research grant from the Cystic Fibrosis Foundation (Moss98AO) and Genentech, Inc. (Z1970n). Conflict of Interest Statement: T.E.R. received a research grant from Genentech, Inc. of $3,700.00 representing the cost of pharmacy set-up and inventory activities associated with Pulmozyme and placebo medications for the study; A.N.L. has no declared conflict of interest; W.H.N. has no declared conflict of interest; F.G.B. has no declared conflict of interest; F.P.C. has no declared conflict of interest; D.A.B. has no declared conflict of interest; T.H.H. has no declared conflict of interest; R.B.M. has received research grants from Genentech, Inc. for studies related to Pulmozyme since 1993. Received in original form September 24, 2002; accepted in final form May 12, 2003