Published ahead of print on May 3, 2007, doi:10.1164/rccm.200703-462PP
© 2007 American Thoracic Society
The Opportunities and Challenges of Developing Imaging Biomarkers to Study Lung Function and Disease
Department of Internal Medicine and The Mallinckrodt Institute of Radiology, Washington University School of Medicine, St. Louis, Missouri
Correspondence and requests for reprints should be addressed to Daniel P. Schuster, M.D., Washington University School of Medicine, 660 South Euclid Avenue, St. Louis, MO 63110. E-mail: daniel.schuster{at}wustl.edu
ABSTRACT
Recent advances in imaging offer exciting opportunities to develop and validate lung-specific biomarkers as valuable adjuncts to diagnosis, tests of treatment efficacy, and/or treatment monitoring. State-of-the-art structural, functional, and molecular imaging methods allow the lungs to be visualized noninvasively in vivo at submillimeter and subsecond spatial and temporal scales. However, the development and validation of imaging biomarkers present some special challenges, including the following: equipment evaluation, procedure standardization, data regarding reproducibility and replication, interrater variability, the production and measurement of reference standards, sensitivity to interventions or disease progression, intersubject variance, choice of image reconstruction and segmentation algorithms, automated versus observer-dependent image analysis, data acquisition during conditions of standardized lung volume, whether a reliable association can be demonstrated between the imaging biomarker and a clinical endpoint, and whether its use will have a favorable cost-effective impact on drug development or disease management. Establishing such performance characteristics, especially for single investigators at single institutions, can be daunting if not impossible for costly biomarkers such as imaging.
Therefore, to take full advantage of the opportunities presented by state-of-the-art imaging methods, new approaches to analytic and clinical validation must be developed in collaboration with industry, foundation, and federal funding agencies.
Key Words: drug discovery; drug development; positron emission tomography; magnetic resonance imaging; X-ray computed tomography
America's investment in biomedical research justifies a timely translation of basic science discovery into clinical medical practice. Thus, it is both disappointing and surprising that, despite tremendous advances in discovery, the number of new medical therapies making it to clinical practice has actually slowed recently (1). One strategy to accelerate progress is to use newly developed biomarkers to phenotype patients more accurately, to establish a diagnosis more definitively, to manage disease, and to predict prognosis. Biomarkers that support the efficacy or identify the toxicity of new therapeutic interventions can also have a dramatic effect on "go–no go" decisions for development, on the costs associated with development, and on the time to complete development. Until recently, most biomarkers have been measures of physiology or assays of key biochemicals in blood or other bodily fluids. Recent advances in imaging, however, offer exciting opportunities to develop and validate new organ-specific biomarkers. Interestingly, most reviews of pulmonary biomarkers have failed to even mention imaging as a strategy for biomarker development (2–4). Indeed, the development and validation of imaging biomarkers present some special challenges. Although imaging also plays an increasingly important role in animal studies and preclinical drug development, the primary focus of this perspective is the opportunities and challenges associated with developing and validating in vivo imaging biomarkers to measure biological phenomena in the lungs of humans.
DEFINITIONS AND CONCEPTS
The most widely accepted definition of a biomarker, the result of a National Institutes of Health/U.S. Food and Drug Administration working group, is that it is any "characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention" (5, 6). Biomarkers may be classified in different ways, but one useful scheme (particularly relevant to drug development) is to divide biomarkers into those that show "proof of mechanism," "proof of principle," or "proof of efficacy," depending on whether they demonstrate an interaction with an intended target, a modification to a pathway downstream from the target, or an effect on the intended disease, respectively (Figure 1). Biomarkers of the last type are sometimes also called "surrogate endpoints" or "clinical correlates" (7).
Surrogate endpoints are meant to replace clinical endpoints (Figure 2). Clinical endpoints are specifically those biomarkers that "reflect how a patient feels, functions, or survives" (5). Clinical endpoints (e.g., mortality, length of intensive care unit stay, number of disease exacerbations per year) have intrinsic value to patients. However, clinical endpoints can be difficult to quantify and may not develop until long after exposure to a risk factor or drug. For these reasons, surrogate endpoints can be valuable when the markers change in some predictable way earlier in the disease process than can clinical endpoints (allowing clinical trials of shorter duration) and/or they can be measured with greater precision than clinical endpoints (allowing clinical trials to be planned with a smaller "n"). Although surrogates such as cholesterol reduction, glucose control, viral DNA levels, and so forth are well known (8), imaging biomarkers have also been used as surrogates, in studies of rheumatoid arthritis and of cancer, among others (9). They have not, however, been used in registration trials for new pulmonary drugs.
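The point that a more precisely measured endpoint allows a trial with a smaller "n" can be illustrated with the standard normal-approximation formula for comparing two group means. This is only a sketch and is not taken from the article: the function name `n_per_group`, the effect size, and the standard deviations are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate subjects per arm for a two-sided, two-sample comparison of means.

    Uses n = 2 * (z_{1-alpha/2} + z_{power})^2 * sigma^2 / delta^2.
    delta: smallest treatment effect worth detecting (hypothetical units)
    sigma: between-subject standard deviation of the endpoint (same units)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


# Hypothetical comparison: the same treatment effect, but the second
# endpoint is measured with half the variability of the first.
noisy_endpoint = n_per_group(delta=1.0, sigma=2.0)
precise_endpoint = n_per_group(delta=1.0, sigma=1.0)
print(noisy_endpoint, precise_endpoint)
```

Because the required n scales with the square of sigma/delta, halving the measurement variability cuts the required enrollment roughly fourfold under these assumptions.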
In some cases, "surrogate endpoint clusters" (10) may be better predictors of clinical outcome than single surrogates, if for no other reason than disease is rarely the result of a single factor that can be entirely encapsulated by one biomarker. These clusters might include both imaging and nonimaging surrogates (see below).
No funding or regulatory agency has specifically defined what constitutes an "imaging biomarker." Smith and colleagues defined these to be "anatomic, physiologic, biochemical, or molecular parameters detectable with imaging methods [be they microscopic, ex vivo, or in vivo] used to establish the presence or severity of disease [i.e., either diagnosis or prognosis]" (11). In reality, imaging biomarkers are simply biomarkers for which imaging is the quantitative instrument of measurement.
DEVELOPING AND VALIDATING BIOMARKERS
Imaging biomarkers, like biomarkers in general, can be developed and validated by a process not unlike that used to develop and test new drugs (Figure 1). Eventually, the biomarker may be qualified by the U.S. Food and Drug Administration (FDA) as a surrogate endpoint if it is "reasonably likely, based on epidemiologic, therapeutic, pathophysiologic, or other evidence," that such an effect predicts clinical benefit, or on the basis of "an effect on a clinical endpoint other than survival or irreversible morbidity" (cited from Subpart H, Sec. 314.510, of the Code of Federal Regulations, accessible via the FDA website, www.fda.gov). (Other useful definitions regarding biomarker validation are given in Reference 6.) Prentice proposed a reasonable set of criteria to establish surrogacy status for a new biomarker (12): (1) the treatment must modify the surrogate, (2) the treatment must modify the clinical endpoint, (3) the surrogate and clinical endpoint must be significantly correlated, and (4) the effect of treatment on the clinical endpoint should disappear when statistically adjusting for its effect on the surrogate.
These criteria, however, are not universally accepted (13).
Both analytic and clinical validation (the latter referred to as "qualification" by the FDA; see http://www.fda.gov/ohrms/dockets/ac/04/slides/2004–4079s2.htm [accessed February 17, 2007]) (14) are critical steps in biomarker development (Table 1, Figure 3). Issues to be resolved during analytic validation include the following: equipment evaluation, procedure standardization, quality assurance protocols, reproducibility and replication, interrater variability, the production and measurement of reference standards (which, in the case of imaging, would include the use of standardized "phantoms" and reference imaging sets), sensitivity to interventions or disease progression, and intersubject variance. Analytic validation of imaging biomarkers also requires attention to the choice of image reconstruction algorithm, image segmentation algorithm (to isolate classes of lung structures, such as airways, vessels, and parenchyma, from one another), automated versus observer-dependent image analysis, and, specifically for the lungs, data acquisition during conditions of standardized lung volume (15). The principal issue to be resolved during clinical validation is whether a reliable (preferably causal) association can be demonstrated between the biomarker and the clinical endpoint (criterion 3 above), and whether its use as a surrogate will have a favorable cost-effective impact on drug development or disease management.
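The four surrogacy criteria attributed to Prentice above can be made concrete with a small simulation. The sketch below is an illustration only, under loud assumptions: the data are simulated (not from any trial), the treatment is constructed to act on the outcome solely through the surrogate, and the helper name `ols_coefficients` is hypothetical. It inspects point estimates rather than performing the formal hypothesis tests that a real analysis would require.

```python
import numpy as np


def ols_coefficients(y, *predictors):
    """Least-squares coefficients for y ~ intercept + predictors."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefficients


# Simulated (not real) trial in which treatment acts ONLY through the surrogate.
rng = np.random.default_rng(0)
n = 2000
treatment = rng.integers(0, 2, n).astype(float)   # 0 = control, 1 = treated
surrogate = 1.0 * treatment + rng.normal(0.0, 1.0, n)
outcome = 0.8 * surrogate + rng.normal(0.0, 1.0, n)

# Criterion 1: treatment modifies the surrogate.
effect_on_surrogate = ols_coefficients(surrogate, treatment)[1]
# Criterion 2: treatment modifies the clinical endpoint.
effect_on_outcome = ols_coefficients(outcome, treatment)[1]
# Criterion 3: surrogate and clinical endpoint are correlated.
surrogate_outcome_r = np.corrcoef(surrogate, outcome)[0, 1]
# Criterion 4: the treatment effect shrinks toward zero after adjusting for the surrogate.
adjusted_effect = ols_coefficients(outcome, treatment, surrogate)[1]

print(effect_on_surrogate, effect_on_outcome, surrogate_outcome_r, adjusted_effect)
```

If the simulated treatment were also given a direct path to the outcome, the adjusted effect would no longer approach zero, which corresponds to one of the ways a candidate surrogate can fail.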
Relatively few imaging biomarkers have achieved surrogacy status. Reasons include lack of method standardization, inadequate information about performance (sensitivity, specificity, reproducibility, etc.), and inadequate or nonexistent validation against clinically meaningful endpoints. In general, bioassays that are performed on human tissue specimens for diagnostic or disease management reasons are performed by laboratories certified under the Clinical Laboratory Improvement Amendments (CLIA) using standards developed by various nationally recognized institutes or committees (6). Similarly, pulmonary function testing, a lung-specific biomarker, has been standardized by the American Thoracic Society and European Respiratory Society (information available at http://www.thoracic.org/sections/publications/statements/index.html [last accessed March 17, 2007]). No comparable set of standards exists for imaging biomarkers, however. Standardizing imaging methodology is crucial, and institutes such as the National Cancer Institute are beginning to address this issue for potential imaging surrogates in cancer (16). Although similar issues have been raised regarding the use of X-ray computed tomography (CT) to obtain imaging biomarkers for cystic fibrosis (CF) and chronic obstructive pulmonary disease (COPD) (15), a set of agreed-upon standards does not yet exist. Despite this, characterization of the issues and agreement about how to resolve them are much farther along in the case of CT than is the case with other forms of imaging (e.g., magnetic resonance [MR] or radionuclide imaging).
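The reproducibility and interrater information called for above is often summarized with a Bland-Altman analysis of paired measurements, which is one common choice rather than a method prescribed by this article. In the minimal sketch below, the function name `bland_altman` and the paired values are hypothetical and exist only to show the arithmetic.

```python
import numpy as np


def bland_altman(first, second):
    """Bias and 95% limits of agreement for paired measurements.

    first, second: the same quantity measured twice on each subject
    (for example, two scans, or two observers reading the same scan).
    Returns (bias, lower_limit, upper_limit, sd_of_differences).
    """
    differences = np.asarray(second, dtype=float) - np.asarray(first, dtype=float)
    bias = differences.mean()
    sd_diff = differences.std(ddof=1)   # sample standard deviation
    return bias, bias - 1.96 * sd_diff, bias + 1.96 * sd_diff, sd_diff


# Made-up paired readings in arbitrary units, for illustration only.
scan_1 = [10.0, 12.0, 9.0, 11.0, 13.0]
scan_2 = [11.0, 11.0, 10.0, 11.0, 12.0]
bias, lower, upper, sd_diff = bland_altman(scan_1, scan_2)
print(bias, lower, upper)
```

A change measured in an individual patient that falls inside the limits of agreement cannot be distinguished from measurement noise, which is why such figures are needed before a biomarker is used to track treatment response.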
Dynamic imaging (the acquisition of repeat images over brief periods of time) can be used to derive a variety of functional measurements relevant to lung physiology or pathophysiology (e.g., ventilation, blood flow, or transfer rate of some tracer from one tissue compartment to another) by using a mathematical model to analyze changes in the regional concentration of some source of contrast (e.g., a contrast agent in the case of CT or radioactivity in the case of radionuclide imaging), ultimately changing the set of images into a single map representing the regional distribution of the physiologic process being studied. It is this transformed parametric image that is then analyzed and interpreted biologically. These mathematical models must incorporate such factors as delivery of the contrast agent to tissue, blood concentration of contrast agent, tissue uptake, metabolism, recirculation of metabolized and unmetabolized tracer, and the heterogeneity of tissues within the resolution volume of the image (the "voxel") (17). The physiologic data finally analyzed from such an "imaging" study will only be as accurate as the mathematical model used to calculate them: the same imaging data may yield different results when analyzed mathematically with different models. Assessing the sensitivity and accuracy of these models is an important, but often overlooked, aspect of the technical validation of an imaging method.
Obtaining imaging data for validation, especially against clinical endpoints, is not easy. For nonimaging biomarkers, one common strategy is to obtain tissue/blood samples during the course of a clinical trial that is already being conducted with clinical endpoints. These tissue banks then allow associations between a potential new biomarker and a clinical endpoint to be determined in either case-control (retrospective) or cohort (prospective) studies. Later, promising surrogates can be tested during interim evaluations in subsequent clinical trials to determine whether the candidate biomarker indeed reliably predicts clinical outcome. This strategy, however, is not possible for candidate imaging biomarkers because the imaging studies have to be obtained prospectively, often leading to difficult choices about technique, platform, and cost before the performance characteristics of the various imaging alternatives are known.
As with any other biomarker (Figure 1), a candidate imaging biomarker may fail clinical validation for the following reasons: it does not measure a biological phenomenon that is actually in the causal pathway of the disease process, there are several causal pathways but the intervention only affects the pathway represented by the biomarker, the biomarker is not in the pathway of the intervention's effect or is insensitive to the intervention's effect, or the intervention has mechanisms of action independent of the disease process (7, 18, 19).
IN VIVO IMAGING PLATFORMS AND MODALITIES
Imaging is performed for different reasons. Anatomic imaging is used to display structure (e.g., airway diameter) or to make measurements related to structure (e.g., lung volumes, such as functional residual capacity). Functional imaging usually depends on data obtained over finite time periods to measure dynamic physiologic processes such as ventilation, perfusion, or pulmonary vascular permeability. Molecular imaging represents a relatively new set of techniques used to "directly or indirectly monitor and record the spatiotemporal distribution of molecular or cellular processes [e.g., enzyme activity] for biochemical, biologic, diagnostic, or therapeutic applications" (20). Different imaging platforms (CT, MR, and radionuclide imaging [e.g., positron emission tomography (PET)]) can be used for these different purposes (anatomic, functional, or molecular imaging) (Table 2).
Increasingly, images from more than one modality are superimposed on one another, allowing structure–function and function–function relationships to be studied on a regional basis. This new capability is useful to determine with certainty the tissue compartment from which functional or molecular imaging signals originate. In addition, however, multimodality imaging raises the possibility of developing imaging "signatures"—that is, the combination of imaging biomarkers that characterize a particular disease. Such signatures, with or without additional clinical information, may be a novel way to phenotype patients for genetic association studies, for pharmacogenetic studies, or to identify subsets of patients in whom to test new drugs.
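The conversion of a dynamic image series into a single parametric map, described earlier for functional imaging, can be sketched with Patlak graphical analysis. This is one widely used model for tracers that are effectively trapped in tissue; the article does not prescribe it, and other models applied to the same data could give different values. The function names, the input function, and the 2 x 2 "image" below are all hypothetical and synthetic.

```python
import numpy as np


def cumulative_trapezoid(y, t):
    """Running integral of y(t) from t[0], by the trapezoid rule."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(t)
    return np.concatenate([[0.0], np.cumsum(steps)])


def patlak_ki(tissue, plasma, t, t_start):
    """Estimate the net influx constant Ki for one voxel or region.

    Fits tissue/plasma = Ki * (integral of plasma)/plasma + intercept
    by least squares, using only frames at or after t_start.
    """
    late = t >= t_start
    x = cumulative_trapezoid(plasma, t)[late] / plasma[late]
    y = tissue[late] / plasma[late]
    ki, intercept = np.polyfit(x, y, 1)
    return ki, intercept


# Synthetic data: every number here is made up for illustration.
t = np.linspace(0.0, 60.0, 61)                     # frame times, minutes
plasma = 10.0 * np.exp(-0.1 * t) + 1.0             # plasma input function
true_ki = np.array([[0.01, 0.02], [0.03, 0.04]])   # 2 x 2 "image" of Ki
blood_volume = 0.3                                 # intercept term
dynamic = (true_ki[..., None] * cumulative_trapezoid(plasma, t)
           + blood_volume * plasma)                # shape (2, 2, 61)

# Collapse the time series in each voxel into one parametric value.
ki_map = np.empty_like(true_ki)
for index in np.ndindex(true_ki.shape):
    ki_map[index], _ = patlak_ki(dynamic[index], plasma, t, t_start=10.0)
print(ki_map)
```

The synthetic curves are generated from the same model that is then fitted, so the recovered map matches the input; with real data, the quality of the fit depends on how well the model's assumptions hold in the lung.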
SOME EXAMPLES OF LUNG IMAGING BIOMARKERS IN DEVELOPMENT
Proof-of-Mechanism Imaging Biomarkers
Because drugs may be delivered to the lungs by inhalation, deposition of the drug along the airways or its distribution throughout the lung parenchyma will depend greatly on the method of delivery (i.e., the issue is not just whether the drug interacts with its target but whether it ever even reaches its target). Here, imaging can again be of special value, as demonstrated, for instance, in asthma (24, 25).
Proof-of-Principle Imaging Biomarkers
Structural imaging. CT has also been used to measure lung parenchymal density as an imaging biomarker of tissue injury (e.g., decreased density in emphysema and increased density in the acute respiratory distress syndrome) (31). Diffusion hyperpolarized 3He-MR imaging provides a new strategy for evaluating lung microstructure noninvasively, in this case without the need for ionizing radiation (32). In normal lungs, the apparent diffusivity of 3He gas (i.e., the Brownian motion of 3He within small, distal, primarily acinar, airways and airspaces) is restricted relative to its behavior in large airways such as the trachea. With lung destruction (as in emphysema), this apparent diffusivity (quantified as the apparent diffusivity coefficient) increases, providing a quantitative measure of airspace enlargement.
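The apparent diffusivity measurement described above is commonly computed voxel by voxel from two images, one acquired without and one with diffusion weighting, assuming a mono-exponential signal decay. The sketch below shows that calculation under this assumption; the function name `adc_map`, the b value, and the signal and diffusivity values are hypothetical illustrations, not measurements.

```python
import numpy as np


def adc_map(s0, sb, b_value):
    """Per-voxel apparent diffusion coefficient from two images.

    Assumes mono-exponential decay: sb = s0 * exp(-b_value * ADC),
    so ADC = ln(s0 / sb) / b_value.
    Voxels with non-positive signal are returned as NaN.
    """
    s0 = np.asarray(s0, dtype=float)
    sb = np.asarray(sb, dtype=float)
    valid = (s0 > 0) & (sb > 0)
    adc = np.full(s0.shape, np.nan)
    adc[valid] = np.log(s0[valid] / sb[valid]) / b_value
    return adc


# Synthetic 2 x 2 example; all values are made up for illustration.
b = 1.6                                          # diffusion weighting, s/cm^2
true_adc = np.array([[0.2, 0.2], [0.5, 0.5]])    # cm^2/s
unweighted = np.full((2, 2), 100.0)              # signal without diffusion weighting
weighted = unweighted * np.exp(-b * true_adc)    # signal with diffusion weighting
estimated = adc_map(unweighted, weighted, b)
print(estimated)
```

In this toy image the bottom row has the larger value, which is the direction of change the text associates with airspace enlargement.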
Functional imaging. Likewise, each of the major imaging platforms can be used to measure pulmonary perfusion (35–37). (Pulmonary perfusion here refers to microvascular blood flow and is distinct from angiography, which shows blood flow through large conducting vessels). These methods have been used to study such phenomena as hypoxic pulmonary vasoconstriction, and the effects of positive end-expiratory pressure, posture during mechanical ventilation, and bronchoconstriction on regional pulmonary perfusion.
Molecular imaging of inflammation. Primarily as a result of the latter, numerous studies document that FDG-PET imaging can detect various inflammatory processes in humans, including those involving the lungs (38, 39). In general, acute increases in whole lung uptake of [18F]FDG reflect tissue invasion by activated neutrophils (although acute diseases such as eosinophilic pneumonia can be an exception [40]). Despite the correlation with neutrophil influx, parenchymal cells may also contribute to the imaging signal (41). FDG-PET imaging can detect and quantify the focal inflammatory response induced by either segmental allergen challenge (a model of atopic asthma) (42) or segmental instillation of low-dose endotoxin (a model of neutrophilic airway inflammation) (43). These studies may provide a basis for early testing of antiinflammatory drugs (39). In this regard, the cellular nonspecificity driving the FDG-PET imaging signal may or may not be valuable. For instance, FDG-PET imaging of a new antiinflammatory agent might not be adequately sensitive if a study was designed to specifically test the drug's impact on a particular component driving the imaging signal (e.g., neutrophil influx). On the other hand, the drug's mechanism of action might simultaneously cause favorable decreases in both neutrophil influx and activation of resident parenchymal cells. In this case, the nonspecificity of the FDG-PET imaging signal would increase test sensitivity. Other PET tracers in development may provide additional information about the inflammatory response. For instance, [11C]PK11195, a compound that binds to peripheral benzodiazepine receptors that are expressed on activated macrophages, might be useful for imaging macrophage responses in lung disease (44).
Proof-of-Efficacy Imaging Biomarkers: The Challenges of Establishing Surrogacy
Indeed, for the goal of establishing surrogacy, one might imagine a kind of hierarchy of imaging biomarkers: in the natural history of any disease, changes in molecular imaging (indicating changes in the expression of a specific molecule) should precede changes in functional imaging, which in turn should precede anatomic changes detected by structural imaging, all before expression of or changes in the clinical disease become evident. As a consequence, it may be difficult to validate changes in molecular imaging as a surrogate endpoint, being "farther" from the downstream expression of clinical disease (Figure 1). The value of a molecular imaging biomarker could still be highly significant in understanding disease pathogenesis or in early drug development (proof of principle, Figure 1). For instance, to develop new drugs for inflammatory lung diseases such as CF, COPD, or asthma, one might envision using FDG-PET imaging to first test the potential antiinflammatory effects of a new drug in normal volunteers after inducing focal inflammation with segmental instillation of low-dose endotoxin or allergen (42, 43). A favorable result could be followed by testing the drug in a small group of actual patients using FDG-PET imaging as the proof-of-principle biomarker. Such a study could be combined with functional imaging of regional ventilation to determine if antiinflammatory effects (as judged by the results with FDG-PET) could be correlated with improvements in airway function. Alternatively, functional imaging might be used to identify ("phenotype") subsets of patients for such studies.
Finally, a study of moderate duration could be designed to determine whether the new drug had important effects on structural imaging endpoints because these, either alone or in combination with other clinical biomarkers, might be most likely to correlate with, and thus be surrogates for, important clinical endpoints (45–47).
RECOMMENDATIONS
None of the promising imaging biomarkers just discussed is yet routinely used in proof-of-principle or proof-of-efficacy studies because the required performance data (Table 1) have not been acquired. Reasons include cost, complexity, the need to obtain the raw imaging data prospectively, and a low priority for funding such work. But without such information, the promise of these technological breakthroughs will not be fully realized. To meet this challenge, new strategies to achieve analytic and clinical validation are needed, preferably involving a multidisciplinary effort that includes imaging scientists representing each of the different imaging platforms, biologists and physiologists who can apply the new imaging tools to answer relevant biological and clinical questions, and representatives from both industry and federal funding agencies.
Protocol and method standardization for both image acquisition and image analysis must be a top priority. Such standardization would include the following: methods manuals; calibration against common imaging phantoms; the use of standardized reference imaging sets (48); observer-independent image analysis tools (49); and mechanisms for archiving, sharing, and transporting imaging data to external sites for later or independent review (50). Professional societies and/or federal governmental agencies could play a valuable role by organizing symposia or workshops for the purpose of defining and implementing these standards, preferably with industry input.
Another priority should be to acquire sufficient patient imaging data to adequately characterize test performance (Table 1).
This goal cannot be achieved by individual investigators at single institutions. It will only be possible in the context of networks or consortia that agree to work together. To fund such an effort, one could imagine a mechanism not unlike the contracts that currently fund groups such as the ARDS Network. Thus, groups would be invited to submit proposals for the development and validation of lung imaging biomarkers (perhaps with enough lead time to encourage pilot data to be gathered through an NIH R21 grant mechanism), the best would be chosen by peer review, and those groups that were chosen would meet to subsequently decide which proposals should be implemented first.
Finally, journals should consider demanding that the details of image acquisition and analysis be described in manuscripts they accept for publication (these details, of course, could be archived online rather than in the main text).
An FDA whitepaper, in discussing how to accelerate the development of useful new therapies (1), notes that "how soon these new [imaging] tools will be available for use will depend on the effort invested in developing them specifically for this purpose." That is indeed the challenge; but the opportunity is a substantial improvement in time and proof of efficacy, and in the translation of discovery to clinical practice.
Acknowledgments
The author thanks Dr. David Gierada for his review of the manuscript.
FOOTNOTES
Originally Published in Press as DOI: 10.1164/rccm.200703-462PP on May 3, 2007
Conflict of Interest Statement: D.P.S. currently has a grant from Pfizer, Inc., to study FDG-PET imaging as a biomarker of lung inflammation.
Received in original form March 21, 2007; accepted in final form May 2, 2007
REFERENCES