Published ahead of print on October 14, 2004, doi:10.1164/rccm.200310-1360OC
© 2005 American Thoracic Society doi: 10.1164/rccm.200310-1360OC
Clinical Usefulness of Home Oximetry Compared with Polysomnography for Assessment of Sleep Apnea
Departments of Medicine and Community Health Sciences, University of Calgary, Calgary, Alberta, Canada
Correspondence and requests for reprints should be addressed to W. A. Whitelaw, M.D., Ph.D., Department of Medicine, University of Calgary, Heritage Medical Research Building, 3330 Hospital Drive NW, Calgary, AB, T2N 4N1 Canada. E-mail: wwhitela@ucalgary.ca
The practical purpose of diagnostic assessment in most cases of obstructive sleep apnea is to predict which patients have symptoms that will improve on treatment. We measured the accuracy with which clinicians make this prediction using polysomnography compared with oximeter-based home monitoring. Patients referred to a sleep center with suspicion of symptomatic obstructive sleep apnea were randomized to have polysomnography or home monitoring. Patients with comorbidity or physiologic consequences of sleep apnea were excluded. Sleep specialists estimated the likelihood of success of treatment as greater than 50% (predicted success) or less than 50% (predicted failure) on the basis of clinical data and test results. All patients were treated for 4 weeks with autoadjusting continuous positive airway pressure. Success was defined as an increase greater than 1.0 in Sleep Apnea Quality of Life Index. Correct prediction rates were compared. Two hundred eighty-eight patients were enrolled. Initial patient characteristics, compliance, and improvement in quality of life at 4 weeks were not different in the two groups. The correct prediction rate was 0.61 with polysomnography and 0.64 with home monitoring (not significant). We conclude that the ability of physicians to predict the outcome of continuous positive airway pressure treatment in individual patients is not significantly better with polysomnography than with home oximeter-based monitoring.
Key Words: diagnosis; home monitoring; outcomes; treatment
Polysomnography, which typically records sleep state, apneas, hypopneas, cardiac rhythm, and leg movements, is recommended for diagnosing obstructive sleep apnea (OSA) (1). It is relatively expensive, however, and capacity for testing is limited relative to the large number of people suspected of having the disorder (2). Approximately 90% of people with the condition remain undiagnosed (3). Portable monitors record primarily oxygen saturation, can be used at home without supervision, and are cheaper. Their validity has been assessed by comparing them with polysomnography using recordings made either in the laboratory on the same night or at home on a different night. The apnea-hypopnea index (AHI) (number of events/hour of sleep) obtained from polysomnography is compared with the respiratory disturbance index (RDI) (number of events/hour of recording time) obtained from portable monitors, which do not record sleep state. Using a cutoff value of AHI as the definition of a positive diagnosis, portable monitors recorded with polysomnography on the same night show sensitivities from 97 to 84% and specificities from 97 to 67% (4). Recorded at home on a different night they show sensitivities from 98 to 84% and specificities from 98 to 48% (4), depending on the device and the apnea-hypopnea cutoff value. It can be argued that accuracy at classifying patients as having or not having sleep apnea, defined by a cutoff value of AHI determined by polysomnography, is not particularly relevant for clinical practice. Both polysomnography and portable monitors have considerable night-to-night variation, which accounts for some loss of agreement between single-night observations of the two tests. As far as we know at present, AHI by itself has limited clinical significance, correlating poorly with symptoms or with outcome of treatment (5–10). 
The population distribution of AHI is unimodal, offering no rationale for selecting a particular cutoff value. There is no statistical or practical difference between patients a few index points apart, one above and one below a threshold. In this study we therefore propose a different method for evaluating portable monitors and apply it to a population of patients suspected of having symptomatic OSA. Most cases seen in family practices or sleep centers with suspected symptomatic OSA have neither major physiologic complications nor important comorbidity. For them the only benefit of treatment for OSA so far proven by randomized placebo-controlled trials is improvement in quality of life (11–14). If it is accepted that for these cases the main purpose of treating OSA is to improve quality of life, the purpose of doing overnight testing should be to determine whether the patient will improve on treatment, rather than to measure the exact value of AHI. A perfect test would lead physicians to offer treatment to all patients who will benefit and avoid trying it on those who will not. The accuracy with which clinicians make that choice should, therefore, be the best measure of the clinical utility of a test for sleep apnea. The primary objective of this study was, thus, to measure the accuracy with which sleep physicians can predict which patients will benefit from treatment of OSA and to compare their accuracy using home monitoring with their accuracy using laboratory polysomnography. A preliminary report of the results has been presented in abstract form (15).
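The distinction drawn in the introduction between AHI (events per hour of sleep) and RDI (events per hour of recording time) can be illustrated with a minimal sketch. The function name and the numbers are hypothetical and are not taken from the study; the sketch also assumes, for simplicity, that both devices count the same events, which need not hold in practice.

```python
def events_per_hour(n_events: int, hours: float) -> float:
    """Return an event rate per hour for a positive denominator."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return n_events / hours

# Hypothetical night (illustration only): 60 respiratory events,
# 8.0 h of recording, of which 6.0 h were actually spent asleep.
ahi = events_per_hour(60, 6.0)  # polysomnography: denominator is sleep time
rdi = events_per_hour(60, 8.0)  # portable monitor: denominator is recording time

print(ahi, rdi)
```

Because recording time is at least as long as sleep time, the same event count gives an RDI no larger than the AHI, which is one reason a single cutoff value does not transfer cleanly between the two indices.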
Study Design
Patients were seen by experienced sleep physicians and randomized to have either polysomnography or home monitoring. Using those results, physicians predicted how likely patients were to improve on treatment. All patients had agreed to use nasal continuous positive airway pressure (CPAP) for 4 weeks; change in sleep apnea-specific quality of life after treatment defined whether patients improved. The rate of correct predictions using portable monitoring was compared with the rate using polysomnography. Patients were a randomly selected subset of consecutive eligible patients referred by family doctors to a sleep center. The inclusion criterion was a history suggesting OSA in association with somnolence or fatigue. Exclusion criteria were: a nonrespiratory sleep disorder as primary reason for referral, absence of significant daytime symptoms, important comorbidity, and significant physiologic consequences of OSA.
Protocol
We also made a random selection of 45 study patients and distributed charts containing all recorded clinical and laboratory data among six experienced sleep physicians in other Canadian centers who studied them and made their own predictions. Polysomnography was a standard full-night diagnostic study (18). Treatment trials during polysomnography were avoided so that patients in both arms of the study would have equivalent initial experience with CPAP. The home monitor was Snoresat (Sagatech, Calgary, AB, Canada) (19, 20). Patients completed the shortened, self-administered version (21) of the SAQLI just before and after treatment, as well as the SF-36 Health Survey (Medical Outcomes Trust, Inc., Boston, MA), a general health status instrument (22), and the Epworth Sleepiness Scale (23). Analysis was based on intent to treat. Treatment was considered successful if the patient showed an improvement of the minimal clinically important change of 1.0 in SAQLI (6). Patients who elected not to complete the 4-week regimen but had an improvement in SAQLI after a shorter course of treatment were considered successes. Predictions of less than 50% chance of success were called predicted failure and of more than 50%, predicted success. Correct predictions were those that matched degree of improvement. Compliance was defined as greater than 4 hours/night of use (24) over the last 2 weeks of treatment. See the online supplement for a more detailed account of the methods.
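The scoring rule described above can be sketched as follows. This is an illustration only, not the authors' analysis code; the function names are hypothetical. One detail is an assumption: the Protocol speaks of an improvement "of" 1.0 while the abstract says "greater than 1.0", so the sketch treats a change of exactly 1.0 as a success.

```python
SAQLI_MIN_IMPORTANT_CHANGE = 1.0  # minimal clinically important change (from the text)

def treatment_succeeded(saqli_before: float, saqli_after: float,
                        threshold: float = SAQLI_MIN_IMPORTANT_CHANGE) -> bool:
    """Success if SAQLI rose by at least the threshold (>= is an assumption)."""
    return (saqli_after - saqli_before) >= threshold

def predicted_success(probability: float) -> bool:
    """More than 50% is predicted success, less than 50% is predicted failure."""
    if probability == 0.5:
        # The text does not say how an exact 50% estimate would be classified.
        raise ValueError("a probability of exactly 0.5 is not classified in the text")
    return probability > 0.5

def prediction_correct(probability: float, saqli_before: float,
                       saqli_after: float) -> bool:
    """A prediction is correct when it matches the observed outcome."""
    return predicted_success(probability) == treatment_succeeded(saqli_before, saqli_after)

# Hypothetical patients (illustration only).
print(prediction_correct(0.8, 3.0, 4.5))  # predicted success, improved by 1.5
print(prediction_correct(0.6, 4.0, 4.5))  # predicted success, improved by only 0.5
```

The correct prediction rate for a group is then the fraction of patients for whom `prediction_correct` is true.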
Patients
A total of 4,767 consecutive referral letters were reviewed and 2,088 (44%) patients were determined to be provisionally eligible. Of these, 439 were randomly selected, flagged as study prospects, and given appointments. Of the 414 who attended, 85 had exclusion criteria and 22 refused. The remaining 307 were randomized. Of these, 8 (5 in the polysomnography group and 3 in the home monitoring group) decided not to have their overnight test and 11 (8 in the polysomnography group and 3 in the home monitoring group) had a test but withdrew from the trial without trying CPAP after learning the result of their test. The total rate of refusal to participate was 12%. Two hundred eighty-eight subjects went on to begin a trial of CPAP, but 51 quit before the end of 4 weeks. Of those who quit, 15 (8 polysomnography and 7 home monitor) completed an exit SAQLI but 36 (11 polysomnography and 25 home monitor) failed to return for follow-up. Thirty-one of these were noted to have had trouble sleeping with CPAP, three kept removing the mask in their sleep, and two had severe nasal congestion or sinus infections. They had lower mean AHI or RDI values than the rest of the patients, but there were no significant differences in pretreatment mean values for SAQLI or Epworth Sleepiness Scale. Characteristics at baseline of the two groups of patients were similar (Table 1): 32% of polysomnography patients had an AHI of less than 10 and 44% of home monitor patients had an RDI of less than 10.
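The patient flow reported above is internally consistent, as the following arithmetic check shows. All counts are taken from the text; the variable names are ours. The denominator used for the 12% refusal rate is not stated in the text, so the last calculation is only one reading that happens to reproduce the reported figure.

```python
referrals, eligible = 4767, 2088
attended, excluded, refused = 414, 85, 22

randomized = attended - excluded - refused          # reported as 307
no_overnight_test = 5 + 3                           # polysomnography + home monitoring
withdrew_after_test = 8 + 3
began_cpap = randomized - no_overnight_test - withdrew_after_test  # reported as 288

quit_early = 51
exit_saqli_done = 8 + 7                             # reported as 15
lost_to_follow_up = 11 + 25                         # reported as 36
reasons_for_loss = 31 + 3 + 2                       # sleep trouble, mask removal, nasal

eligible_pct = round(100 * eligible / referrals)    # reported as 44%

# Assumption: refusals at any stage divided by attendees without exclusion criteria.
refusal_pct = round(100 * (refused + no_overnight_test + withdrew_after_test)
                    / (attended - excluded))

print(randomized, began_cpap, eligible_pct, refusal_pct)
```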
Predictions of Success of Treatment
The proportion of polysomnography patients to total patients assigned to individual physicians ranged from 0.42 to 0.53, with no statistically significant differences. Patients were evenly distributed among prediction categories, with 27%, 23%, 24%, and 25%, respectively, predicted to have a likelihood of improving less than 25%, 25 to 50%, 50 to 75%, and greater than 75%. The proportion of patients in each category who improved was similar to the proportion predicted by the physicians except for the 50 to 75% category, where the rate of improvement was lower than predicted (Table 2). In each prediction category, however, there was a large scatter in actual increase in SAQLI with treatment, as shown in Figure 1. By two-way analysis of variance, there was a significant relationship between increase in SAQLI and prediction category (p < 0.001) but not between increase in SAQLI and type of overnight test (p = 0.55). Overall, physicians predicted success in 50% of patients, but only 42% met the criterion for improvement. Correct prediction rates were worse for patients predicted to have a 50 to 75% chance of success than in other categories (Table 3).
The correct prediction rate was 0.606 for patients who had polysomnography and 0.635 (p = 0.72) for those who had home monitoring. With 95% confidence, the correct prediction rate using polysomnography was no more than 0.073 better than the rate using home monitoring. Cohen's Kappa (± SE) comparing prediction with outcome was 0.23 ± 0.08 for polysomnography and 0.25 ± 0.08 for home monitoring (not significant). Individual clinicians had correct prediction rates ranging from 0.58 to 0.68. The combined correct prediction rate for the six sleep physicians from other centers evaluating a total of 45 patient records was 0.60. The SAQLI score was unknown to the physicians at the time they made their estimates of the probability of improvement. Some of the patients had high initial SAQLI scores and could have failed to improve because of a ceiling effect. Correct prediction rates on the subset of patients with initial SAQLI scores less than 5.0, the maximum SAQLI being 7.0, were 0.67 for polysomnography and 0.63 for home monitoring (not significant). The low rate of successful predictions might have been due to the choice of cutoff value for a clinically significant increase in SAQLI if the clinicians' idea of a significant improvement did not correspond to the minimal clinically significant increase of 1.0 used as a criterion in the study. Changing the cutoff for an increase in SAQLI over the range from 0.6 to 1.4 at increments of 0.2 made no significant difference to the mean correct prediction rate or to the difference in correct prediction rates between polysomnography and home monitoring, however (see Table E1 in the online supplement). Although clinicians probably took into account some estimate of the likelihood that patients would tolerate CPAP when they made their predictions, it was hard to evaluate how much this might have affected the results. 
We therefore calculated correct prediction rates using only the patients who were compliant at the end of the 4-week trial. They were 0.59 for polysomnography and 0.59 for home monitoring. The method used to compare predictions with treatment success used only two categories for prediction and left out information available in the four-category prediction. Correct prediction rates were slightly higher for predictions in the lowest and highest categories (Table 3). For home monitoring and polysomnography, respectively, they were 0.76 and 0.69 for patients with a predicted less-than-25% chance of improving and 0.75 and 0.63 for patients with a predicted greater-than-75% chance of improving, not significantly different between the tests. A way of looking at accuracy using all four categories is to compare polysomnography with home monitoring using Receiver Operating Characteristic (ROC) curves for the categories against treatment success or failure. These show no difference in area under the ROC curve between the types of overnight test for cutoffs ranging from 0.4 to 1.6 in SAQLI increase, and no effect of the choice of cutoff on area under the curve. Figure 2 shows the curves for SAQLI increases of 1.0. (See Table E2 in the online supplement for results for other SAQLI cutoff values.) A likelihood ratio Chi square test was used to determine if there was a significant difference between types of test in the pattern of prediction across the four categories. It was negative (p = 0.5).
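For a four-category prediction, the area under the ROC curve equals the probability that a randomly chosen patient who improved was placed in a higher category than a randomly chosen patient who did not, with ties counted as one half. A minimal sketch of that calculation follows; the category codes and patient data are hypothetical and are not the study's data.

```python
def auc_ordinal(improved, not_improved):
    """AUC by pairwise comparison: P(higher category) + 0.5 * P(same category)."""
    if not improved or not not_improved:
        raise ValueError("both groups must contain at least one patient")
    score = 0.0
    for pos in improved:
        for neg in not_improved:
            if pos > neg:
                score += 1.0
            elif pos == neg:
                score += 0.5
    return score / (len(improved) * len(not_improved))

# Prediction categories coded 1..4:
# 1 = <25%, 2 = 25 to 50%, 3 = 50 to 75%, 4 = >75% likelihood of improving.
improved = [4, 4, 3, 2, 1]        # hypothetical patients who met the SAQLI criterion
not_improved = [1, 1, 2, 3, 4]    # hypothetical patients who did not

print(auc_ordinal(improved, not_improved))
```

An area of 0.5 corresponds to predictions no better than chance and 1.0 to perfect separation; repeating the calculation with outcomes redefined at each SAQLI cutoff gives the kind of sensitivity analysis described in the text.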
An alternate criterion of improvement is compliance. Assuming that those who were using CPAP more than 4 hours a night had a clinically significant improvement and the rest had not, correct prediction rates with greater than 4 hours of CPAP use per night as the criterion for success of treatment were 0.63 for polysomnography and 0.62 for home monitoring patients (not significant). Using compliance at 3 months after the end of the trial as the criterion of improvement, correct prediction rates were 0.69 for polysomnography and 0.67 for home monitoring (not significant). The corresponding ROC curves are shown in Figure 3.
For comparison, Figure 4A shows ROC curves for AHI and RDI alone as predictors of improvement in SAQLI at 4 weeks. Areas under the curves are not significantly different between the two tests. The same applies when the cutoff value for change in SAQLI is varied from 0.6 to 1.4 (Table E3). Figure 4B shows ROC curves for AHI and RDI alone as predictors of compliance at 3 months. They look better than those for prediction of improvement in SAQLI, but the 95% confidence intervals for areas under the curves overlap. Figure 4C shows the ROC curve for improvement in SAQLI as a predictor of compliance at 3 months. It is not significantly different from the AHI or RDI curves.
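The correct prediction rate and Cohen's kappa reported in this section are both derived from a 2 × 2 table of prediction against outcome. The sketch below shows the calculation; the counts are hypothetical (the text does not give the underlying table) and the function name is ours.

```python
def rate_and_kappa(pred_pos_improved, pred_pos_not, pred_neg_improved, pred_neg_not):
    """Return (correct prediction rate, Cohen's kappa) for a 2 x 2 table."""
    n = pred_pos_improved + pred_pos_not + pred_neg_improved + pred_neg_not
    observed = (pred_pos_improved + pred_neg_not) / n
    # Agreement expected by chance from the row and column totals.
    predicted_pos = pred_pos_improved + pred_pos_not
    predicted_neg = pred_neg_improved + pred_neg_not
    actual_pos = pred_pos_improved + pred_neg_improved
    actual_neg = pred_pos_not + pred_neg_not
    expected = (predicted_pos * actual_pos + predicted_neg * actual_neg) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return observed, (observed - expected) / (1.0 - expected)

# Hypothetical counts (illustration only): 100 patients, 50 predicted to succeed,
# 45 actually improved.
rate, kappa = rate_and_kappa(30, 20, 15, 35)
print(rate, kappa)
```

Kappa discounts the agreement that would arise by chance alone, which is why a correct prediction rate above 0.6 can coexist with a kappa near 0.25.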
Clinical Outcomes
Measured outcomes of treatment in the two groups of patients were not statistically significantly different. Mean increases in SAQLI were 0.92 in the polysomnography group and 0.82 in the home monitoring group (p = 0.50; 95% confidence interval for the difference in SAQLI: -0.40 to +0.20). Forty-four percent of the polysomnography and 40% of home monitor patients improved (p = 0.47; 95% confidence interval for the difference: -4 to +17%). Mean decreases in Epworth scale were 4.0 for polysomnography and 3.4 for home monitoring (p = 0.27). The mean RDI on treatment was 5.7 for polysomnography and 4.2 for home monitoring (p = 0.06). Mean scores and improvements in scores in domains of SF-36 were not significantly different. (See the online supplement for details of the methods.) Compliance rates in the second week of treatment were 0.62 for polysomnography patients and 0.56 for home monitoring patients (p = 0.27; 95% confidence interval for the difference: -5 to +17%). Mean hours of use of positive pressure/day were 3.8 in polysomnography patients and 3.3 in home monitoring patients (p = 0.4; 95% confidence interval for the difference: -1.1 to +0.05 hours/day). Twenty-one (29%) of the 79 patients classified as having a less than 25% chance of improving showed an increase of 1.0 or more in their SAQLI score and 3 were compliant with CPAP for greater than 3 months. In the polysomnography group, 12 of the 32 patients with an AHI of less than 10 were improved at 4 weeks and 4 were compliant for greater than 3 months. In the home monitoring group, 18 of the 69 patients with an RDI of less than 10 were improved at 4 weeks and 3 were compliant for greater than 3 months. 
Eventually nonrespiratory sleep disorders causing fatigue or somnolence (11 depression, 1 periodic limb movements, 1 narcolepsy, 3 idiopathic hypersomnolence, and 2 chronic fatigue) were diagnosed in 18 patients (8 home monitoring and 10 polysomnography), of which 5 cases or 2% (those with narcolepsy, periodic leg movements, and idiopathic hypersomnolence) needed polysomnography for diagnosis.
The study failed to show a significant advantage of polysomnography over home monitoring for the purpose of identifying patients whose quality of life would improve with treatment specific for sleep apnea. The rate of correct predictions using either test was low at 63%, with a very low Cohen's Kappa (0.24) for agreement between outcome and prediction. Several reasons for this may be considered. (1) In predicting that CPAP will improve quality of life, physicians are saying that the patient has symptoms severe enough to reduce quality of life, that the symptoms are due to OSA, that the patient will tolerate treatment that is technically adequate, and that disadvantages and side effects will not outweigh benefits of treatment. In the context of this trial they had to suppose, as well, that SAQLI would provide an accurate assessment of symptoms most troublesome to each patient. Given all these unknowns, the observed correct prediction rate is perhaps not surprising. (2) If physicians make correct evaluations of all these factors, discrepancies between prediction and outcome can still arise due to a placebo effect or regression to the mean (11–13, 25, 26). (3) Design of the trial required physicians to categorize patients above or below a 50% probability of successful treatment. Patients who were judged to have nearly an even chance of improving would have fallen into the above or below 50% category mainly by chance, perhaps making the prediction rate artificially low. (4) The SAQLI difference may misrepresent actual improvement in quality of life through test-retest variability. In a post hoc analysis using compliance 3 months after beginning long-term CPAP as the definition of improvement, correct prediction rates were again not significantly different between the two overnight tests, supporting the hypothesis that polysomnography is no better than home monitoring for deciding which patients are likely to benefit from treatment. 
The failure of polysomnography to improve correct prediction rates implies that accuracy in measuring the AHI obtained from polysomnography is unimportant for predicting outcome. This agrees with previous work showing only weak correlations between improvement in quality of life and AHI (7–9, 27, 28). In fact, the index obtained from portable monitoring agrees quite closely with AHI in most cases (20), even though it is not perfectly sensitive or perfectly specific for a particular cutoff value. As well, an index determined in the laboratory may not reflect conditions of sleep at home as well as an index determined at home. The information about sleep stages, arousals, and periodic limb movements from bioelectric signals obtained by polysomnography also seems to be of limited value for predicting improvement in quality of life (7–9, 28, 29). This observation fits with studies of factors that correlate with long-term use of positive airway pressure, which is assumed to correlate with benefit from treatment. They show only a weak correlation with information derived from the bioelectric signals (30–32). This could be because bioelectric data are not relevant or because we do not know how to interpret them. The 30% of patients with low indices (either RDI or AHI < 10) who showed improved SAQLI scores at 4 weeks and the 8% who were compliant with CPAP at 3 months might be explained by a placebo effect, or by underestimation of OSA due to night-to-night variation in AHI, or to differences between sleep at home and in the laboratory, or by unusual sensitivity to respiratory events that do not meet criteria for hypopneas, or indirectly through reduction in snoring. The portable monitor used for this study has been tested most thoroughly in Calgary at an altitude of approximately 1,000 m. 
However, its specificity and sensitivity for diagnosis of sleep apnea when tested against polysomnography at this altitude are not substantially different from the best of other oximeter-based monitors evaluated near sea level (4). It is reasonable to suppose that the conclusions of the study would apply to similar populations of patients when using monitors with comparable performance. Polysomnography may be of value not only in the diagnosis but also in the treatment of sleep apnea. One study has shown better compliance with CPAP in patients who have had polysomnograms for both diagnosis and titration than in patients who had home titration (33). The possible beneficial effect of CPAP titration during polysomnography was not addressed in this study, because all patients had autotitration. However, the experience of a diagnostic polysomnogram and the accrual of more complete information might help to persuade patients to continue indefinitely with treatment. Our data (not statistically significant), showing slightly higher compliance at 4 weeks with polysomnography, do not provide strong support for this. Technical advances in auto-CPAP, masks, or education of patients may in future minimize the possible importance of laboratory titration for ensuring compliance. Some patients suspected of having symptomatic OSA turn out to have narcolepsy or some other disorder that might be diagnosed with polysomnography but not by portable monitoring. However, the prevalence of those conditions was only 2% in our patients. Patients referred to most sleep clinics who pass through a filter similar to the entry criteria for this study could be expected to have similar rates of non-OSA diagnoses. Failure to do polysomnography at the outset might delay diagnosis in a few such patients, but they should be identified quickly if they have daytime symptoms and their home test is negative or if they continue to have symptoms in spite of treatment for sleep apnea. 
For most of the population at risk, delay from time of referral to diagnosis might in fact be reduced by wider use of home monitoring. It is important to say that the data do not support omitting sleep tests in the assessment of patients suspected of having symptomatic OSA. The subjective outcome of a therapeutic trial would be unreliable as the criterion for diagnosis in individual patients.
Limitations of the Study
Conclusions drawn from the study will be invalid if and when there is evidence for a cause and effect relationship between asymptomatic OSA and cardiovascular or other diseases that can be rectified by treatment of OSA. Many physicians and patients even now may argue that with current evidence treatment should be recommended for people with diabetes or for those at risk for cardiovascular diseases.
Conclusions
The conclusion applies only to cases without substantial comorbidity and without suspected major physiologic complications of sleep apnea. It will also be invalid in any circumstance where improvement in quality of life is not the principal health benefit to be realized from treating OSA.
The authors thank clinical colleagues in Calgary: K. Fraser, W. Tsai, T. Pederson, J. Remmers, and S. Clark who participated in the trial; colleagues in other Canadian centers: K. Ferguson, J. Road, D. Sin, D. Morrison, F. Series, and J. Kimoff who reviewed cases; and especially our research coordinator, Sharon Tanguay.
Supported by Alberta Heritage Foundation for Medical Research, the Calgary Health Region, and Respironics Inc.
This article has an online supplement, which is accessible from this issue's table of contents at www.atsjournals.org.
Conflict of Interest Statement: W.A.W. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript; R.F.B. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript; W.W.F. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript.
Received in original form October 3, 2003; accepted in final form October 7, 2004.