Am. J. Respir. Crit. Care Med., Volume 160, Number 6, December 1999, 2006-2011

Assessment of Reliability of Lung Function Screening Programs or Longitudinal Studies

EVA HNIZDO, GAVIN CHURCHYARD, DAVE BARNES, and ROB DOWDESWELL

National Center for Occupational Health, Johannesburg; Arum Health Research, Welkom; Anglogold Health Services, Welkom; and Precious Metal Refiners (Pty) Ltd., Kroonsdal, South Africa

    ABSTRACT

The aim was to determine reliability of lung function measurements performed according to recommendations of the American Thoracic Society (ATS) at a screening program in a large South African gold mine and to determine the usefulness of the reliability coefficient G for monitoring the reliability of lung function measurements in a mass screening program. The reliability coefficient G estimates the amount of random error of measurement, relative to the total variation in a measurement. The coefficient G was calculated as a correlation coefficient between two consecutive lung function tests performed within 6 mo, over a period of 43 mo on 3,378 miners. There was significant temporal variability in the reliability. For FEV1, the coefficient G showed increased variability over the first 5 mo and stabilized at a value of 0.93 for the next 23 mo, after which it systematically declined over the next 15 mo. We estimated that in a large screening program, an optimal sample size of around 900 miners, examined randomly throughout the year, on a yearly basis, would provide a sufficient sample to examine monthly or quarterly fluctuation in the reliability. The value of the reliability coefficient G did not change when the time between two consecutive tests increased up to 15 mo. In conclusion, monitoring of lung function reliability in a screening program by the reliability coefficient G should improve data quality, and provide a measure on which the confidence in a decision-making process could be based when examining temporal changes in lung function for individual subjects.

    INTRODUCTION

Screening for lung function impairment in subjects exposed to respiratory hazards should be able to identify those individuals whose lung function falls below predicted values and those who demonstrate accelerated loss of lung function (1). The accuracy with which subjects with "true" undue loss of lung function are identified depends on the reliability of the measurements. In longitudinal lung function testing, continuous monitoring of reliability, in addition to the quality control measures recommended by the American Thoracic Society (ATS) (2), should improve the data quality and also provide an index of reliability on which decisions can be based.

Generally, the reliability of lung function measurements reflects both systematic errors (e.g., procedural differences) and random errors of measurement (e.g., due to temporary restriction) (3, 4). The amount of variation caused by random error of measurement, i.e., an error that cannot be explained by known systematic effects, can be measured by the reliability coefficient G (5). The reliability coefficient G was previously used to correct for the effect of lung function fluctuation within individual subjects when predicting "true" loss. Application of the reliability coefficient to monitoring of the reliability of lung function measurements in a screening program has not been described previously in the literature.

Exposure to silica dust is a known risk factor for chronic obstructive pulmonary disease (COPD) (6). Thus, an effective lung function screening program in silica-exposed workers could lead to prevention of COPD. In the present study we had the following objectives: (1) to evaluate the reliability of a lung function screening program in a South African gold mine and to determine the usefulness of the data for epidemiologic research; (2) to evaluate the applicability of the reliability coefficient G for the assessment of a temporal pattern in reliability; and (3) to develop a method of monitoring reliability of lung function measurements in such a screening program.

    METHODS

Study Population

Miners at a large South African gold mining company have spirometry done routinely at an initial examination, periodically every 3 yr, and on leaving the company (exit). The population of miners decreased from 71,515 in 1994 to 43,359 in 1997, either because entire mine shafts were closed for economic reasons, in which case all miners were discharged, or because mine shafts were downsized, in which case miners mostly took voluntary retirement. From May 1994, when screening started, to March 1998, data from 113,120 tests were computerized (14,267 in 1994, 28,402 in 1995, 25,288 in 1996, 32,381 in 1997, and 12,782 in 1998).

All black males who had two lung function tests within 6 mo between May 1994 and March 1998 were used to investigate objectives (1) and (2). Six months was the shortest period that provided a sufficient number of subjects to evaluate a trend in the reliability coefficient G while ensuring no aging effect. The most frequent reason for a second examination was an exit examination, or miners returning from extended leave. Miners with medical reasons for a test were excluded. In total, there were 3,513 miners who qualified. Of these, we excluded 80 (2.3%) in whom FVC or FEV1 were outside the 99% confidence intervals (CI), and 55 (1.6%) in whom the within-person difference was outside the 99% CI, leaving 3,378 miners. To study objective (3), we used the same selection criteria, except that there was no limit on the time interval between the two tests, and identified 16,249 miners.

Lung Function Measurements

Maximal forced expiratory maneuvers are recorded in a computerized database using a Hans Rudolph pneumotachograph (Flowscan; Electromedical Systems Inc., South Africa). The system software requires and validates a calibration with a 3-L syringe. Calibration is done 3 to 4 times per day. Barometric pressure and temperature are entered via the keyboard for correction of volumes to BTPS. During testing, flow versus volume tracings are displayed. A minimum of three acceptable and reproducible forced expiratory maneuvers are obtained according to the standards recommended by the ATS. The miners only perform an exhalation maneuver. All testing is done by nursing personnel with a college diploma in spirometry testing, and trained in the techniques of performing spirometry to ATS standards. Height is measured to the nearest centimeter in stocking feet. Data recorded for each test includes the date of test, date of birth, height, weight, the highest FVC, the highest FEV1, and forced expiratory flow at 25 to 75% of forced vital capacity (FEF25-75%).

Statistical Methods

Reliability coefficient G: background. The lung function tests (FVC, FEV1, FEF25-75%, and the ratio of FEV1/FVC [FEV1%]) are continuous normally distributed variables with the mean, µ, and the variance, $\sigma^2$. It is well recognized that lung function tests are prone to measurement errors (3, 4). The errors can be broadly categorized as systematic errors of measurement and random errors of measurement. Theoretically, a systematic error could be removed from the data, provided we have information on its origin (e.g., procedural changes, a technician effect, seasonal variability). A systematic error changes mainly the mean, µ, i.e., it shifts the distribution. Generally, by a random error of measurement we understand not only the random error in the measurement procedure itself but also, and more importantly, random fluctuation in the measured quantity that reflects the variability in lung function within an individual subject. This fluctuation can be due to factors such as a subject's fatigue, bronchoconstriction, diurnal or seasonal variation, and acute response to allergens. By definition, the random error of measurement does not change the mean, but can change the size of the variance, $\sigma^2$. When testing the reliability of a lung function measurement, we estimate the size of the random error of measurement, relative to the total variation in the measurement across subjects, i.e., we compare the amount of the within-person variability relative to the between-person variability (5). The statistic that measures the relative size of the random error of measurement is the reliability coefficient G (5, 9, 10). The Appendix provides statistical details on the coefficient G.

The Appendix shows that the simplest method of estimating the reliability coefficient G is to repeat lung function tests for a set of subjects after a few weeks or months and to calculate the correlation coefficient $\rho_{M_a M_b}$ between the first and second measurements. The time interval T between the two tests should be long enough to include all the potential short-term effects involved in the random measurement error, but short enough to avoid systematic changes, e.g., due to age.
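The estimate described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the function name `reliability_coefficient` and the five pairs of FEV1 values are hypothetical and are not study data.

```python
import math

def reliability_coefficient(first, second):
    """Pearson correlation between paired first and second test values."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sums of cross-products and squares about the means
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical FEV1 values in litres for five subjects (not study data)
fev1_first = [3.10, 2.85, 3.60, 2.40, 3.95]
fev1_second = [3.05, 2.90, 3.50, 2.45, 3.90]
g_estimate = reliability_coefficient(fev1_first, fev1_second)
print(round(g_estimate, 3))
```

In practice the two lists would hold the first and second test results of every subject retested within the chosen interval T.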

Temporal changes in reliability. To determine temporal changes, we calculated the overall average within-person difference in lung function and the coefficient of reliability G. Then, we examined the temporal changes in the average monthly within-person differences and in the reliability coefficient G. We used the analysis of covariance (SAS PROC GLM) to adjust for the effect of age, the month of testing, and the time interval T between the two tests, on the monthly within-person difference in FEV1 (second test - first test) (11). Next, we identified a reliable period of testing and recalculated the reliability statistics for the reliable period only. To demonstrate usefulness of the data for epidemiologic purposes, we examined the loss of lung function within age strata for the reliable period.

Method of monitoring of reliability in a large screening program. We first examined the relationship between the reliability coefficient G and the time interval between two tests T, to establish whether the reliability coefficient G changed with time T. We used the model (5)
$$\rho_{t,t+T} = G \exp(-\lambda T), \tag{1}$$

where the correlation coefficient $\rho_{t,t+T}$, calculated for increasing time interval T, is related to T. Then G and $\lambda$ are estimated from a simple linear regression obtained by taking the natural logarithm of Equation 1, i.e., $\log(\rho_{t,t+T}) = g - \lambda T$. The estimated value of G, $G = \exp(g)$, provides the best estimate of G at time T = 0. The estimated slope $\lambda$ provides information on the change in $\rho$ with increasing time T.
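The log-linear fit of Equation 1 can be sketched as an ordinary least-squares regression. The function name `fit_reliability_decay` and the input values are hypothetical, chosen only so that the fit can be checked against known parameters. Under Equation 1 as written, a correlation that declines with T gives a negative regression slope and therefore a positive $\lambda$.

```python
import math

def fit_reliability_decay(intervals, correlations):
    """Least-squares fit of log(rho) = g - lam * T; returns (G, lam)."""
    n = len(intervals)
    logs = [math.log(r) for r in correlations]
    mean_t = sum(intervals) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in intervals)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(intervals, logs))
    slope = sxy / sxx                    # slope of the fitted line, equal to -lam
    intercept = mean_y - slope * mean_t  # g, the log of G at T = 0
    return math.exp(intercept), -slope

# Hypothetical correlations generated from G = 0.93 and lam = 0.001 per month
months = [1, 3, 6, 9, 12, 15]
rhos = [0.93 * math.exp(-0.001 * t) for t in months]
G_hat, lam_hat = fit_reliability_decay(months, rhos)
print(G_hat, lam_hat)
```

With real data, each correlation would be computed from the subjects whose two tests were T months apart.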

Next, we estimated the optimal sample size required for regular yearly testing, so that monthly monitoring of the reliability coefficient G could be done. We drew the required sample size by random sampling of subjects from a period of 1 yr; calculated the monthly reliability coefficients $G_i$, i = 1 to 12; and plotted the average reliability coefficient $\bar{G} = \frac{1}{12}\sum_{i=1}^{12} G_i$ and the coefficient of variation CV (calculated from $G_i$, i = 1 to 12), against the changing sample size. The sample size at which the CV became constant is considered optimal.
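The sampling procedure can be illustrated on synthetic data. The sketch below is an assumption-laden example, not the authors' code: the helper names, the record layout `(month, first_test, second_test)`, the candidate sample sizes, and the simulated FEV1 values (true value plus random error, with spreads chosen so that the underlying G is about 0.93) are all hypothetical.

```python
import math
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def monthly_reliability(records, sample_size, rng):
    """Draw a random sample and return (mean of monthly G, CV of monthly G).

    records is a list of (month, first_test, second_test) tuples.
    """
    sample = rng.sample(records, sample_size)
    by_month = {}
    for month, first, second in sample:
        by_month.setdefault(month, []).append((first, second))
    monthly_g = []
    for month in sorted(by_month):
        pairs = by_month[month]
        if len(pairs) >= 3:  # skip months with too few pairs for a correlation
            firsts, seconds = zip(*pairs)
            monthly_g.append(pearson(firsts, seconds))
    mean_g = statistics.mean(monthly_g)
    return mean_g, statistics.stdev(monthly_g) / mean_g

# Synthetic one-year data set (NOT study data)
rng = random.Random(1999)
records = []
for _ in range(6000):
    true_fev1 = rng.gauss(3.2, 0.62)
    records.append((rng.randint(1, 12),
                    true_fev1 + rng.gauss(0, 0.17),
                    true_fev1 + rng.gauss(0, 0.17)))

results = {n: monthly_reliability(records, n, rng)
           for n in (120, 300, 600, 900, 1500)}
for n, (mean_g, cv) in results.items():
    print(f"n={n:5d}  mean G={mean_g:.3f}  CV={cv:.3f}")
```

Plotting the mean and CV against the sample size, as described above, then shows where the CV levels off.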

    RESULTS

Table 1 (A) shows the characteristics at the first and second lung function tests, for the 3,378 miners who had two tests within 6 mo. The average age at the first test was 41.8 yr (SD, 10.1) and the average period between the two tests was 3.73 mo. The average within-person differences for the lung function tests were negative and statistically significant. The reliability coefficient G was highest for FEV1.

                              

TABLE 1

CHARACTERISTICS OF SUBJECTS WITH TWO TESTS: (A) FROM MAY 1994 TO MARCH 1998; (B) FROM OCTOBER 1994 TO AUGUST 1996

Figure 1a shows the average difference between second and first FEV1, according to the month of the first test, adjusted for age and the interval T. There is a period of larger variability up to September 1994, and decreased variability up to August 1996, and a period of large negative changes from September 1996 to June 1997, followed by large positive changes. Figure 1b shows that the reliability coefficient G for FEV1 also declines from September 1996. 


Figure 1.   Temporal changes in (a) the difference in FEV1 (2nd FEV1 - 1st FEV1), and (b) the reliability coefficient G.

To obtain the "best" estimate of the random error of measurement for the individual lung function, we selected the period when the screening program was most reliable, i.e., from October 1994 to August 1996. Subjects whose first or second tests were done outside the "reliable" period were excluded. Table 1 (B) shows the improvement in the reliability statistics for the 1,001 subjects who had both tests done within the reliable period. To demonstrate the usefulness of the data for epidemiologic purposes, we present the reliability statistics by age categories for the reliable period only (Table 2).

                              

TABLE 2

RELIABILITY STATISTICS BY AGE STRATA, FOR FEV1, FVC FOR "RELIABLE" PERIOD FROM OCTOBER 1994 TO AUGUST 1996

Figure 2 shows fitted regression curves for the relation between the reliability coefficient G and time T between two tests, for all miners who had two tests regardless of the time interval. Figure 2a includes data on 2,802 miners who had two tests within the reliable period (October 1994 to August 1996). The maximum time interval T was 22 mo, but the number of subjects with tests more than 15 mo apart was small, and the coefficient G became unreliable after T = 15. The value of G was consistently above 0.90 up to T = 15, the estimated value of G at T = 0 was 0.93 (95% CI, 0.91-0.99), and the value of the slope $\lambda$ = -0.0010 (p = 0.20). When the regression curve was fitted up to 22 mo, the estimated value of G at T = 0 was 0.95 (95% CI, 0.92-0.99). Figure 2b includes the whole screening period for 16,249 subjects. The maximum time interval T was 36 mo. The reliability coefficient G declined steeply within 15 mo: the estimated value of G at T = 0 was 0.87 (95% CI, 0.86-0.89), and the value of the slope $\lambda$ = -0.0014 (p = 0.002). The number of subjects was large for all points (ranging from 206 to 790).


Figure 2.   Relationship between the reliability coefficient G for FEV1, and the time interval T for: (a) the reliable period October 1994 to August 1996, and (b) for the total period October 1994 to September 1997.

Figure 3 shows the relationship between the average reliability coefficient G (calculated from the monthly coefficients), the CV for the monthly reliability coefficients G, and the sample size required per year to monitor the lung function reliability on a monthly basis. The optimal sample size is approximately 900 subjects.


Figure 3.   The average reliability coefficient G, and the coefficient of variation CV, plotted against the total sample size per year.

    DISCUSSION

The reliability of lung function tests was evaluated in a screening program where the testing was designed and intended to be done according to ATS recommendations. A temporal trend was established for within-person difference in FEV1 and for the coefficient of reliability G calculated on 3,378 miners who had two consecutive tests within 6 mo, over a period of 43 mo. Figure 1 shows that the reliability stabilized after 5 mo of testing, remained consistently "reliable" over a period of 23 mo (G = 0.93) after which it systematically declined. The negative difference in FEV1 (2nd - 1st test) after September 1996 is a result of much lower second tests. When subjects with any measurements done outside of the reliable period were excluded, the reliability coefficient increased and became more consistent over time. During the initial period of 4 mo (May to September 1994), the variability may have been higher because of a learning process. The decrease in reliability from September 1996 may have been caused by increased layoffs in the mine and increased workload for the lung function technicians. In a retrospective evaluation of the screening program practices, after the reliability analysis was completed, it was also found that there were lapses in the auditing of the calibration log to ensure that calibration is done accurately.

With regard to the usefulness of the data for epidemiologic purposes, only the data from the "reliable" period (average reliability coefficient for FEV1 G = 0.93) appear to be useful. The within-person differences and the coefficient of reliability G improved substantially during the reliable period (see Table 1). The data also show consistency when stratified by age (see Table 2). The within-person differences show a decline in FEV1 from 35 to 44 yr of age. Although the within-person differences were not significantly different from zero, the pattern is consistent with a published longitudinal study in which the onset of decline in FEV1 for males was from 36 yr of age (12). In contrast, cross-sectional studies report the decline in FEV1 per year to be constant at 20 to 30 ml/yr (13). The cross-sectional means for FEV1 (first measurement means) in Table 2 also show a decline starting from 25 yr of age. The cross-sectional decline in younger ages in our study could be the result of a strong cohort effect, as height declined systematically with age and the young miners were much taller than the 50-yr-olds. Thus, the availability of the reliability coefficient G could provide a measure of usefulness of longitudinal studies, or screening program data for epidemiologic research.

How reliable should a lung function screening program be, so that it is able to identify accelerated loss of airflow in specific groups of subjects, for example, in smokers? Even if the coefficient of reliability G is approximately 0.93, a large sample size is required for small losses to be identified as statistically significant. For example, the observed change in FEV1 for the age category 35 to 44 yr of -22.5 ml per 3.78 mo (see Table 2) is much higher than expected. However, the minimal sample size required for this difference to be statistically significant is approximately 463 subjects. The literature suggests that at least 4 yr of follow-up are required to detect the effect of smoking in a group of subjects in a longitudinal study (16). Whether this is so depends on the data reliability. The higher the reliability coefficient G, the more likely undue loss of lung function caused by disease, smoking, or occupational exposure can be detected. The reliability coefficient G provides a measure of confidence that can be assigned to a change in lung function observed in groups of subjects or in individual subjects. For example, if either of the two measurements is from a period with low reliability, then the confidence in the observed change in lung function is lower than if both tests were done during a reliable period.

The results demonstrate that monitoring of data reliability in screening programs, or longitudinal studies, could help to identify lapses in the reliability at an early stage, and provide a measure of confidence in the data. In a large screening program, as in our study, when the lung function testing is not done on a yearly basis, a "small" dynamic cohort of subjects tested on a yearly basis could provide a basis of a reliability program. How regularly should the subjects be tested? According to the data from the "reliable" period, the reliability coefficient G for FEV1 declined little up to the time interval T = 15 mo and the best estimate of G at T = 0 was 0.93 (or 0.95 when the model was fitted to T = 22) (Figure 2a). Thus, in a reliable program, the subjects could be tested every year and this should not have an effect on the reliability coefficient G. However, if the program is not reliable, then the reliability coefficient G would decline rapidly with time T (Figure 2b).

How large should the dynamic cohort be? According to Figure 3, the optimal sample size required to monitor reliability on a monthly basis using the reliability coefficient G is approximately 900 subjects. (At those sample sizes the variability [CV] in the monthly reliability coefficient G stabilizes.) For quarterly monitoring, the sample would be smaller. If the subjects have lung function tests done yearly, and the testing is evenly distributed throughout the year, then a trend in the monthly or quarterly reliability coefficients G can be monitored. For the first year of the program, the second tests could be done after 3 mo to get early feedback from the data, and to obtain good baseline data on each subject. The reliability coefficient is simply calculated as the correlation coefficient between the first and second tests, across all the subjects who had two tests within a year. The reliability can also be evaluated for individual technicians and spirometers.

A limitation of the present study is that the random error of measurement included systematic effects (e.g., technician, instruments) that, because of lack of recorded data on these effects, could not be excluded. Despite this, the "best" estimate of the random error of measurement, i.e., within-subject variation, was estimated as 5 to 7% of the total variation in the FEV1 for the reliable period. Another major limitation of the program is lack of records on the acceptability and reproducibility of each lung function. However, a longitudinal study of white South African gold miners who had a 1-yr interval between two lung function tests (10), and who were tested in one main lung function laboratory, reported similar values of the reliability coefficient G for FVC, FEV1, FEF25-75%, and FEV1% of 0.899, 0.929, 0.836, and 0.786, respectively.
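One way to read the 5 to 7% figure, assuming it is the complement of G from Equation 1 of the Appendix evaluated at the two reliable-period estimates of G quoted in the Results (0.93 and 0.95), is:

```latex
\frac{\sigma_\delta^2}{\sigma_M^2} = 1 - G, \qquad 1 - 0.93 = 0.07, \qquad 1 - 0.95 = 0.05
```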

In conclusion, the study shows that despite the fact that the lung function testing was designed and intended to be done according to the standards recommended by the ATS, there were lapses in reliability over time. In response to the reliability analysis, a retrospective evaluation of the program identified various limitations. Thus, continuous monitoring of data reliability should help to maintain good data quality. The coefficient of reliability G appears to be a simple tool for monitoring the data reliability that can also provide a measure of confidence on which the assessment of changes in lung function in groups of subjects or individual subjects can be based. The study also demonstrates that it is possible to have a reliable screening program that generates data for epidemiologic research, and that the availability of a reliability coefficient G provides a measure of usefulness of the data for epidemiologic research.

    Footnotes

Correspondence and requests for reprints should be addressed to Eva Hnizdo, National Center for Occupational Safety and Health, 1095 Willowdale Road, MS PB 163, Morgantown, WV 26505. E-mail: EXH6{at}cdc.gov

(Received in original form February 5, 1999 and in revised form June 23, 1999).

Acknowledgments: The authors thank the Anglogold Health Services from the Freegold mines in Welkom, South Africa, for allowing us to use lung function data, Ms. Tanusha Singh who helped with computer programming, and Dr. Jill Murray from NCOH for her valuable comments.

Supported by the Safety in Mining Research Advisory Committee.

    References

1. American Thoracic Society. 1982. Surveillance for respiratory hazards in the occupational setting. Am. Rev. Respir. Dis. 126: 952-956.

2. American Thoracic Society. 1995. Standardization of spirometry: 1994 update. Am. J. Respir. Crit. Care Med. 152: 1107-1136.

3. American Thoracic Society. 1991. Lung function testing: selection of reference values and interpretative strategies. Am. Rev. Respir. Dis. 144: 1202-1218.

4. Becklake, R. 1986. Concepts of normality applied to the measurement of lung function. Am. J. Med. 80: 1158-1164.

5. Shepard, D. S. 1981. Reliability of blood pressure measurements: implications for designing and evaluating programs to control hypertension. J. Chron. Dis. 34: 191-209.

6. Wiles, F. J., and M. H. Faure. 1977. Chronic obstructive lung disease in gold miners. In W. H. Walton, editor. Inhaled Particles IV, Part 2. Pergamon Press, Oxford. 727-735.

7. Cowie, R. L., and S. K. Mabena. 1991. Silicosis, chronic airflow limitation, and chronic bronchitis in South African gold miners. Am. Rev. Respir. Dis. 143: 80-84.

8. Hnizdo, E. 1990. Combined effect of silica dust and tobacco smoking on mortality from chronic obstructive lung disease in gold miners. Br. J. Ind. Med. 47: 656-664.

9. Gardner, M. J., and J. A. Heady. 1973. Some effects of within-person variability in epidemiological studies. J. Chron. Dis. 26: 781-795.

10. Irwig, L., H. Groeneveld, and M. Becklake. 1988. Relationship of lung function loss to level of initial function: correcting for measurement error using the reliability coefficient. J. Epidemiol. Community Health 42: 383-389.

11. Snedecor, G. W., and W. G. Cochran. 1967. Statistical Methods. Iowa State University Press, Ames, IA.

12. Burrows, B., M. D. Lebowitz, A. E. Camilli, and R. J. Knudson. 1986. Longitudinal changes in forced expiratory volume in one second in adults. Am. Rev. Respir. Dis. 133: 974-980.

13. Crapo, R. O., A. H. Morris, and R. M. Gardner. 1981. Reference spirometric values using techniques and equipment that meet ATS recommendations. Am. Rev. Respir. Dis. 123: 659-664.

14. Knudson, R. J., M. D. Lebowitz, C. J. Holberg, and B. Burrows. 1983. Changes in the normal maximal expiratory flow-volume curve with growth and aging. Am. Rev. Respir. Dis. 127: 725-734.

15. Quanjer, P. H., editor. 1983. Standardized lung function testing: report of the working party. Bull. Eur. Physiopathol. Respir. 19(Suppl. 5): 1-95.

16. Dales, R. E., J. A. Hanley, P. Ernst, and M. R. Becklake. 1987. Computer modelling of measurement error in longitudinal lung function data. J. Chron. Dis. 40: 769-773.

    APPENDIX

To describe the statistical theory for the reliability coefficient (4, 8, 9), let us assume that for an individual subject there is a "true" value of a lung function, L. This true value L is observed with a random error $\delta$, resulting in measurement M, where $M = L + \delta$. The observed value M is distributed normally with a mean µ and variance $\sigma_M^2$. Assuming that L and $\delta$ are normally distributed, then the variance of M is $\sigma_M^2 = \sigma_L^2 + \sigma_\delta^2$. The ratio of the true value variance $\sigma_L^2$ to that of the variance $\sigma_M^2$ of the observed value is referred to as the reliability coefficient G and can be expressed as
$$G = \frac{\sigma_L^2}{\sigma_M^2} = \frac{\sigma_M^2 - \sigma_\delta^2}{\sigma_M^2} = 1 - \frac{\sigma_\delta^2}{\sigma_M^2}. \tag{1}$$

The variance $\sigma_\delta^2$, required for calculation of G, can be estimated from repeated independent measurements ($M_a$ and $M_b$) of the same true value L on the same subject over a period of time T (weeks or months). It can be shown that the within-person variance of the difference of the two measurements ($\sigma_{M_a - M_b}^2$) is twice the variance of the random error of measurement, $2\sigma_\delta^2$. This follows (10) because
$$\sigma_{M_a - M_b}^2 = \sigma_{M_a}^2 + \sigma_{M_b}^2 - 2\sigma_{M_a M_b}. \tag{2}$$

Further, we can assume that $\sigma_{M_a}^2 = \sigma_{M_b}^2 = \sigma_M^2$ and that $\delta$ is independent of L. Then the covariance term is the same whether derived from M or L, i.e.:
$$\sigma_{M_a M_b} = \sigma_{L_a L_b} = \sigma_L^2. \tag{3}$$

It follows then that
$$\sigma_{M_a - M_b}^2 = 2(\sigma_M^2 - \sigma_L^2) = 2\sigma_\delta^2. \tag{4}$$
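Equation 4 can be checked numerically with a small simulation. The sketch below assumes normally distributed true values and errors. The standard deviations, the mean of 3.2, and the number of subjects are hypothetical choices, not study values.

```python
import random
import statistics

rng = random.Random(42)
sigma_l = 0.62       # hypothetical SD of the true values L
sigma_delta = 0.17   # hypothetical SD of the random error delta
n_subjects = 20000

# Two independent measurements of the same true value for each subject
true_values = [rng.gauss(3.2, sigma_l) for _ in range(n_subjects)]
m_a = [true_l + rng.gauss(0.0, sigma_delta) for true_l in true_values]
m_b = [true_l + rng.gauss(0.0, sigma_delta) for true_l in true_values]

# The variance of the within-person difference should be close to 2 * sigma_delta^2
var_diff = statistics.variance([a - b for a, b in zip(m_a, m_b)])
expected = 2 * sigma_delta ** 2
print(f"simulated var(Ma - Mb) = {var_diff:.4f}, 2 * sigma_delta^2 = {expected:.4f}")
```

The true value L cancels in the difference, which is why only the error variance remains.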

Finally, using Equation 3 and the fact that $\sigma_M^2 = \sigma_{M_a} \cdot \sigma_{M_b}$, one gets for the reliability coefficient G
$$G = \frac{\sigma_L^2}{\sigma_M^2} = \frac{\sigma_{M_a M_b}}{\sigma_{M_a}\sigma_{M_b}} = \rho_{M_a M_b}. \tag{5}$$

The above shows that the simplest method of estimating the reliability coefficient G is from a reexamination of lung function in a series of cases over a few weeks or months and calculation of the correlation coefficient $\rho_{M_a M_b}$. The 95% confidence interval (CI) for the observed correlation coefficient r is estimated by $\mathrm{CI}(\rho) = r \pm Z_\alpha \cdot (1/\sqrt{N-3})$.
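The interval formula can be sketched as follows. The function names and the inputs r = 0.93 and N = 1,001 are illustrative only. The second function is not from the article: it shows the common alternative of applying the same standard error on Fisher's z scale, which keeps the bounds inside (-1, 1).

```python
import math

def correlation_ci(r, n, z=1.96):
    """Interval r +/- z / sqrt(n - 3), as given in the Appendix."""
    half_width = z / math.sqrt(n - 3)
    return r - half_width, r + half_width

def fisher_ci(r, n, z=1.96):
    """Alternative (not from the article): interval built on Fisher's z scale."""
    centre = math.atanh(r)
    half_width = z / math.sqrt(n - 3)
    return math.tanh(centre - half_width), math.tanh(centre + half_width)

# Illustrative inputs, not a result reported in the study
low, high = correlation_ci(0.93, 1001)
f_low, f_high = fisher_ci(0.93, 1001)
print(f"direct: ({low:.3f}, {high:.3f})  Fisher z: ({f_low:.3f}, {f_high:.3f})")
```

For correlations close to 1 the two intervals can differ noticeably, so it is worth stating which one is used.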

The value of G and the size of the random error of measurement can also be calculated directly from Equations 2 and 5 using variances from Table 1. For example, if we substitute the variances for FEV1 in Table 1 into Equation 2, then $\sigma_{M_a M_b} = \frac{1}{2}(0.4223 + 0.4318 - 0.0938) = 0.3802$. Then from Equation 5, $G = 0.3802/(\sqrt{0.4223} \cdot \sqrt{0.4318}) = 0.8903$. The variance of the observed FEV1 is defined as $\sigma_M^2 = \sigma_L^2 + \sigma_\delta^2$, where the variance of the true values L is $\sigma_L^2 = 0.3803$, and the variance of the random error of measurement is $\sigma_\delta^2 = \frac{1}{2}(0.0938) = 0.0469$.
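The arithmetic in this example can be reproduced directly. The sketch uses the three variances quoted in the paragraph and assumes that the second square root in the Equation 5 substitution is the second-test variance, 0.4318. The variable names are hypothetical.

```python
import math

var_ma = 0.4223    # variance of the first FEV1 test (quoted above)
var_mb = 0.4318    # variance of the second FEV1 test (quoted above)
var_diff = 0.0938  # variance of the within-person difference (quoted above)

# Equation 2 rearranged for the covariance of the two measurements
cov_mamb = 0.5 * (var_ma + var_mb - var_diff)

# Equation 5: G is the covariance divided by the product of the two SDs
g = cov_mamb / (math.sqrt(var_ma) * math.sqrt(var_mb))

# Equation 4: the random-error variance is half the difference variance
var_delta = 0.5 * var_diff

print(f"covariance = {cov_mamb:.4f}, G = {g:.4f}, error variance = {var_delta:.4f}")
```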





Copyright © 1999 American Thoracic Society