Generic Scheme in CADe Systems for Lung Cancer

A generic scheme in CADe systems of LDCT screening for lung cancer usually consists of the following major steps: LDCT acquisition, preprocessing, segmentation of pulmonary structures, initial candidate detection, reduction of false-positive (FP) detections, and lung nodule detection.

  • (a) LDCT acquisition: The scan and reconstruction parameters of CT are important factors in the performance of CADe. Narrow collimation with reconstructions of thin sections is recommended to improve the detection of nodules [26-28]. Radiation dose is a key concern [29, 30]. With the advance of iterative reconstruction techniques, ultralow-dose chest CT with a radiation dose comparable to that of chest radiography might be considered when designing future screening protocols [30]. Public databases are available for developing, training, and validating CADe systems. The Lung Image Database Consortium (LIDC) offers annotated chest CT scans [31]. For comparing CADe systems for nodule detection, the ANODE09 dataset is another publicly available database; all of its data were provided by the University Medical Center Utrecht and originate from the NELSON study, the largest CT lung cancer screening trial in Europe [32]. Publicly available CT databases are becoming increasingly important for benchmarking the performance of developed CAD systems [33, 34].
  • (b) Preprocessing: Before beginning with detection steps, some initial processing is performed on the original CT images to remove defects caused by the image acquisition process such as noise and to enhance the characteristics of lung nodule candidates [25, 35-37].
  • (c) Segmentation of pulmonary structures: Figure 4.1 shows anatomical structures on an axial chest CT section. The segmentation of the left and right lungs from chest CT images is performed to restrict the nodule detection to the lung volumes. The major portion of the lungs comprises lung parenchyma that is involved in gas exchange. Because the lung parenchyma has a lower density (around -900 HU) than the surrounding tissue in chest CT images of healthy subjects, many lung segmentation algorithms are based on a thresholding approach. The threshold-based methods consist of three major steps: (1) extraction of the preliminary lung regions using thresholding, (2) identification of lungs and separation between left and right lungs, and (3) refinement of the lung shapes to smooth the borders and include vessels in the segmentation result [18, 23, 24]. When the lungs contain abnormalities with higher densities than normal lung parenchyma, conventional threshold-based lung segmentation methods produce segmentation errors [24, 38]. To handle this situation, several methods have been proposed: atlas-based segmentation of pathological lungs [39], hybrid lung segmentation in which a conventional threshold-based method and a multi-atlas-based algorithm using nonrigid registration are combined [40], and graph cut-based segmentation in which multiple possible shapes of lungs can be taken into account [41]. Segmentation methods of vessels, airways, pleurae, lobes, segments, and ribs have been studied [18, 24, 42]. Figure 4.2 presents some segmentation results of pulmonary structures [42-45].

Fig. 4.1 Pulmonary structures on CT images. Within the thorax, the ribs enclose the lungs, and the diaphragm lies beneath the bases of the lungs, separating the thoracic and abdominal cavities. The mediastinum between the two lungs consists of the heart, major blood vessels, the esophagus, and the trachea. The pulmonary arteries enter the lungs; the pulmonary veins exit the lungs. The blood vessels, airways, and lymphatics at the root of each lung collect in the hilum and enter the mediastinum. The lungs consist of airways, vessels, and the lung parenchyma. The left and right lungs are usually subdivided into two lobes (the upper and lower lobes) and three lobes (the upper, middle, and lower lobes), respectively. The five lobes are separated by fissures of varying completeness, which are potential spaces lined by the visceral pleura. The visceral pleura covers all the lung parenchyma, and a second layer of parietal pleura is attached to the chest wall and the mediastinum. The lobes are further subdivided into segments, which are defined by bronchial supply.
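
The three threshold-based steps described under (c) can be illustrated on a toy axial slice. The following is a minimal sketch and not the method of any cited study: the 2D grid, the HU values, the threshold of -400 HU, and the function names (`components`, `touches_border`, `segment_lungs`) are all assumptions made for illustration, and the refinement step is reduced to filling enclosed holes such as vessels.

```python
from collections import deque

AIR, TISSUE, LUNG, VESSEL = -1000, 40, -900, 30  # illustrative HU values


def components(mask):
    """Return the 4-connected components of a 2D boolean grid as pixel lists."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    found = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            found.append(pixels)
    return found


def touches_border(pixels, rows, cols):
    return any(y in (0, rows - 1) or x in (0, cols - 1) for y, x in pixels)


def segment_lungs(hu, threshold=-400):
    """Label one slice: 0 = other, 1 = image-left lung, 2 = image-right lung."""
    rows, cols = len(hu), len(hu[0])
    # Step 1: preliminary lung regions by thresholding (threshold is an assumption).
    low_density = [[value < threshold for value in row] for row in hu]
    # Step 2: discard air outside the body (it touches the image border), keep
    # the two largest remaining regions, and order them by mean column.
    inside = [p for p in components(low_density) if not touches_border(p, rows, cols)]
    lungs = sorted(inside, key=len, reverse=True)[:2]
    lungs.sort(key=lambda p: sum(x for _, x in p) / len(p))
    labels = [[0] * cols for _ in range(rows)]
    for label, pixels in enumerate(lungs, start=1):
        for y, x in pixels:
            labels[y][x] = label
    # Step 3: refinement, here only filling enclosed holes such as vessels.
    background = [[labels[r][c] == 0 for c in range(cols)] for r in range(rows)]
    for hole in components(background):
        if touches_border(hole, rows, cols):
            continue
        neighbours = [labels[ny][nx] for y, x in hole
                      for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                      if labels[ny][nx] != 0]
        for y, x in hole:
            labels[y][x] = neighbours[0]
    return labels


A, T, L, V = AIR, TISSUE, LUNG, VESSEL
image = [
    [A, A, A, A, A, A, A, A, A, A, A],
    [A, T, T, T, T, T, T, T, T, T, A],
    [A, T, L, L, L, T, L, L, L, T, A],
    [A, T, L, V, L, T, L, L, L, T, A],
    [A, T, L, L, L, T, L, L, L, T, A],
    [A, T, T, T, T, T, T, T, T, T, A],
    [A, A, A, A, A, A, A, A, A, A, A],
]
labels = segment_lungs(image)
for row in labels:
    print("".join(str(v) for v in row))
```

The sketch also shows why pure thresholding fails on pathological lungs: any dense abnormality that reaches the lung border is excluded in step 1 and is not recovered by hole filling, which is the situation the atlas-based and graph cut-based methods address.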

  • (d) Initial candidate detection: After preprocessing, initial candidate detection is employed to locate potential lung nodules. There are many strategies to detect nodule candidates [18-25, 32]: multiple gray-level thresholding [36, 46, 47], fuzzy clustering and surface curvature [45], template matching [48], a model-based image understanding technique [49], a mathematic model of anatomic structures [50, 51], mathematical morphology [52, 53], a convergence index filter [54], Gaussian curve fitting [55], shape-based genetic algorithm [56], geometric model based on the analysis of the signed distance field [57], shape index [35, 58], intensity structure enhancement [37, 59, 60], and gradient analysis [37, 61, 62].
  • (e) Reduction of FP detections and lung nodule detection: The pattern features of lesion candidates, such as gray-level-based features, texture features, and morphological features, are extracted. Once a candidate's characteristics are obtained, this step tries to remove FPs while retaining potential lung nodules. In this procedure, classifiers are widely used [18-25, 32]. The role of the classifiers is to determine optimal boundaries between lung nodules and non-lesions in the multidimensional feature space, which is generated by the input features of the candidates. There are a number of classification techniques: linear discriminant analysis [36, 46, 54, 60-62], rule-based classifier [45, 47, 48, 53, 56, 59], 3D MRF models [50], neural network [37, 52, 55, 63], Bayesian classifier [51], fuzzy logic [49], shape similarity [57], support vector machine [58], and k-nearest neighbor [35].
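
To make the FP-reduction step in (e) concrete, here is a minimal sketch of one of the listed classifier types, a k-nearest-neighbor classifier, applied to candidate features. The two features (a sphericity score and a normalized contrast, both scaled to the range 0 to 1), the training examples, and the candidate names are hypothetical; no cited system is claimed to use these values.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train is a list of (feature_vector, label) pairs; distances are Euclidean,
    so the features are assumed to be scaled to comparable ranges.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled candidates: (sphericity, normalized contrast).
train = [
    ((0.90, 0.80), "nodule"),
    ((0.85, 0.70), "nodule"),
    ((0.80, 0.90), "nodule"),
    ((0.30, 0.40), "non-lesion"),
    ((0.20, 0.60), "non-lesion"),
    ((0.35, 0.30), "non-lesion"),
]

# Candidates from the initial detection step, to be kept or rejected.
candidates = {"c1": (0.88, 0.75), "c2": (0.25, 0.50), "c3": (0.70, 0.85)}
kept = [name for name, features in candidates.items()
        if knn_classify(train, features) == "nodule"]
print(kept)
```

The decision boundary here is implicit in the training samples; a linear discriminant or support vector machine would instead fit an explicit boundary in the same feature space.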

Fig. 4.2 Examples of the segmentation of pulmonary structures. (a) Bones including ribs and vertebrae. Color codes represent the classifications of the ribs and vertebrae. (b) Lung lobes. Green, pink, and red represent the upper, middle, and lower lobes of the right lung. Blue and yellow represent the upper and lower lobes of the left lung. (c) Lung segments. Color codes represent the separation of each lobe into lung segments. (d) Pulmonary vessels, trachea, and bronchi. White: trachea and bronchi. Blue and pink represent the segmentation results of the pulmonary vein and artery.

Table 4.1 Summary of the reported best performance of CADe systems for detecting lung nodules

| Author | Year | Detection scheme (initial candidate detection/FP reduction and lung nodule detection) | No. of nodules | CAD sensitivity (%) | FP rate (per section or per scan) | Data (no. of patients, section thickness, radiation dose) | Nodule size |
|---|---|---|---|---|---|---|---|
| Armato et al. [46] | 1999 | Multiple gray-level thresholding/linear discriminant analysis | 187 | 70 | 3/section | 17 patients, 10 mm | 3.1-27.8 mm |
| Kanazawa et al. [45] | 1998 | Fuzzy clustering and surface curvature/rule-based classifier | 230 | 90 | 2.8/scan | 450 patients, 10 mm, low dose | - |
| Lee et al. [48] | 2001 | Template matching/rule-based classifier | 98 | 72 | 1.1/section | 20 patients, 10 mm, low dose | 5-30 mm |
| Brown et al. [49] | 2001 | Model-based image understanding technique/fuzzy logic | 36 | 86 | 11/scan | 17 patients, 5-10 mm, normal dose | 5-30 mm |
| Armato et al. [47] | 2002 | Multiple gray-level thresholding/rule-based classifier | 50 | 80 | 1/section | 31 patients, 10 mm, low dose | 5-25 mm |
| Suzuki et al. [63] | 2003 | Supervised lesion enhancement filter based on a massive-training ANN (MTANN) | 71 | 80.3 | 4.8/scan | 71 patients, 10 mm, low dose | Mean 13.5 mm |
| McCulloch et al. [51] | 2004 | Mathematic model of anatomic structures/Bayesian classifier | 43 | 69.8 | 8.3/scan | 50 patients, 2.5 mm, low dose | 5-17.1 mm |
| Awai et al. [52] | 2004 | Mathematical morphology/neural network | 78 | 80 | 0.87/section | 82 patients, 7.5 mm, normal dose | 3-30 mm |
| Ge et al. [61] | 2005 | Gradient analysis/linear discriminant analysis | 116 | 87.9 | 0.5/section | 56 patients, 1.0-2.5 mm | 3-30.6 mm |
| Bae et al. [53] | 2005 | Mathematical morphology/rule-based classifier | 107 | 97.2 | 4/scan | 20 patients, 1 mm, normal dose | >3 mm |
| Kim et al. [55] | 2005 | Gaussian curve fitting/neural network | 297 | 94.3 | 0.89/section | 14 patients, 1.0-5.0 mm, normal dose | 5-28 mm |
| Roy et al. [62] | 2006 | Gradient analysis/linear discriminant analysis | 82 | 70 | 0.28/section | 38 patients, 7 mm, normal dose | 3-30 mm |
| Yuan et al. [69] | 2006 | ImageChecker CT (R2 Technology) | 628 | 73 | 3.19/scan | 150 patients, 1.25 mm, normal dose | >4 mm |
| Matsumoto et al. [54] | 2006 | Convergence index filter/linear discriminant analysis | 50 | 90 | 1.67/section | 5 patients, 5.0-7.0 mm, normal dose | 3-12 mm |
| Das et al. [64] | 2006 | ImageChecker CT (R2 Technology) | 116 | 73 | 6/scan | 25 patients, 2 mm | Mean 3.4 mm |
| Das et al. [64] | 2006 | Nodule enhanced viewing (NEV) (Siemens Medical Solutions) | 116 | 75 | 8/scan | 25 patients, 2 mm | Mean 3.4 mm |
| Dehmeshki et al. [56] | 2007 | Shape-based genetic algorithm/rule-based classifier | 178 | 90 | 14.6/scan | 70 patients, 0.5-1.25 mm, normal dose | 3-20 mm |
| Li et al. [59] | 2008 | Intensity structure enhancement/rule-based classifier | 153 | 86 | 6.6/scan | 117 patients, 1.25-5.0 mm, low/normal dose | 4-28 mm |
| Pu et al. [57] | 2008 | Geometric model based on the analysis of the signed distance field/scoring based on shape similarity | 184 | 81.5 | 6.5/scan | 52 patients, 2.5 mm, low dose | 3-28.9 mm |
| Ye et al. [58] | 2009 | Shape index and dot features/rule-based filtering and support vector machine | 220 | 90.2 | 8.2/scan | 108 patients, 0.5-2.0 mm, low/normal dose | |
| Murphy et al. [35] | 2009 | Shape index and curvedness/k-nearest-neighbor | 1525 | 80 | 4.2/scan | 813 patients, 1 mm, low dose | >3 mm |
| Yanagawa et al. [65] | 2009 | Lung VCAR (GE Healthcare) | 229 | 40 | 5.7/scan | 48 patients, 0.625 mm | >4 mm |
| Messay et al. [36] | 2010 | Multiple gray-level thresholding/linear discriminant analysis | 143 | 82.7 | 3/scan | 84 patients, 1.3-3.0 mm | 3-30 mm |
| Tan et al. [37] | 2011 | Intensity structure enhancement/neural network | 80 | 87.5 | 4/scan | 125 patients, 0.75-3.0 mm | 3-30 mm |
| Guo et al. [60] | 2012 | Intensity structure enhancement/linear discriminant analysis | 111 | 85 | 17.3/scan | 85 patients, 1.25-3.0 mm, low/normal dose | 3-115 mm |
| Zhao et al. [66] | 2012 | LungCAD VB10A (Siemens AG Healthcare) | 151 | 96.7 | 3.7/scan | 400 patients, 1 mm, low dose | 2.3-6.9 mm |
| Godoy et al. [75] | 2013 | VD10A (Siemens Healthcare) | 155 | 79 | 3/scan | 46 patients, 0.67-1.0 mm, normal dose | 4-27.5 mm |

Table 4.1 summarizes the principal methods of lung nodule detection that are covered in this section, looking at sensitivity, FP rate, number of nodules used in validation, and size of nodules. When a study reports performance under several conditions, the best performance is given. Comparing the results of CADe performance obtained using different datasets, nodules of different natures and characteristics, and various evaluation methods [20] is of limited value. Furthermore, although the reference standard for positive cases is regarded as a gold standard (ground truth), determining a perfect gold standard is not an easy task. Because the reference standard is usually determined by an expert panel, substantial variability has been reported in the definition of a gold standard for identifying nodules on CT images [31]. In the ANODE09 study, a comparative study on the same dataset was performed [32]. The performances of six CADe systems (five from academic groups and one commercially available system) for lung nodule detection were evaluated using a database of 55 scans, and the study demonstrated that combining the outputs of the CADe systems led to performance improvement. In the system combination, the results of multiple nodule CAD systems were combined without using their internals, such as the feature values of candidates [32]. The combination method used only the findings (coordinates and degree of suspicion for each finding) and performance information of each system.
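
The idea of combining systems from their findings alone can be sketched as follows. This is not the ANODE09 combination method, whose details are not given here: the merge radius of 5 mm, the averaging rule, and the function name `combine_findings` are assumptions, and the sketch ignores the per-system performance information that the published method also used.

```python
import math


def combine_findings(systems, radius_mm=5.0):
    """Merge findings from several CADe systems using only position and score.

    systems is a list (one entry per system) of findings (x, y, z, suspicion),
    with suspicion assumed to lie in [0, 1]. Findings closer than radius_mm to
    an existing group are merged into it. The combined score is the mean
    suspicion over all systems, counting a system that did not mark the
    location as 0.
    """
    groups = []  # each group: {"pos": (x, y, z), "scores": {system_index: score}}
    for index, findings in enumerate(systems):
        for x, y, z, suspicion in findings:
            for group in groups:
                if math.dist(group["pos"], (x, y, z)) <= radius_mm:
                    previous = group["scores"].get(index, 0.0)
                    group["scores"][index] = max(previous, suspicion)
                    break
            else:
                # No nearby group: start a new one at this finding's position.
                groups.append({"pos": (x, y, z), "scores": {index: suspicion}})
    combined = [(g["pos"], sum(g["scores"].values()) / len(systems)) for g in groups]
    return sorted(combined, key=lambda item: item[1], reverse=True)


# Hypothetical outputs of three systems for one scan (coordinates in mm).
system_a = [(10, 10, 10, 0.9), (50, 50, 20, 0.4)]
system_b = [(11, 10, 10, 0.7), (80, 20, 30, 0.6)]
system_c = [(10, 11, 10, 0.8)]

ranked = combine_findings([system_a, system_b, system_c])
for position, score in ranked:
    print(position, round(score, 3))
```

A location marked by all three systems ends up ranked above locations marked by a single system, even when that single system was fairly confident, which is the intuition behind combining outputs.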

Figure 4.3 shows a snapshot of CADe output for a lung nodule referred to as a pure ground-glass nodule (GGN) [42]. According to the Fleischner Society glossary of terms for thoracic imaging, a GGN is defined as “a hazy increased attenuation in the lung that does not obliterate the bronchial and vascular margins” [67]. The term “pure GGN” represents nodules of only ground-glass attenuation on CT, whereas the term “part-solid GGN” refers to nodules comprising both ground-glass and solid attenuation areas [16]. The term “subsolid” nodules includes both pure GGN and part-solid GGN [16, 68]. Subsolid nodules are increasingly being detected on LDCT screening and have a high likelihood of representing adenocarcinomas [16, 68]. For example, Henschke and colleagues reported that 34% of detected subsolid nodules proved malignant, while only 7% of solid nodules proved malignant [68]. However, most CADe schemes have focused on detecting solid nodules because detecting subsolid nodules with low attenuation is not an easy task. The currently commercially available CADe systems are designed and optimized for detecting solid nodules and have a low sensitivity for subsolid nodules [65, 69]. Improving the detection accuracy for subsolid nodules is one of the important research areas in CADe [20, 42, 55, 75].

Several studies have investigated the effect of CADe on clinicians’ interpretation of CT examinations for the detection of lung nodules [52, 64, 65, 70-75]. These studies compared the clinicians’ performances without and with CADe systems. The assessment approaches included evaluation using sensitivity and FP rate, ROC analysis, localization ROC (LROC) analysis, and free-response ROC (FROC) analysis. The LROC and FROC analyses are categorized as location-specific ROC analyses, which require data such as the identified location of suspected regions and a rating for each region [77]. The LROC analysis is restricted to a single mark-rating pair per image, for the most suspicious region, whereas the FROC analysis can deal with an arbitrary number of mark-rating pairs for each image [76-78]. For FROC data, either an alternative FROC (AFROC) method or a jackknife FROC (JAFROC) method is used for evaluating clinicians’ performance in the detection of lung nodules without and with CADe systems [79, 80]. The AFROC method reduces FROC data to pseudo-ROC data that can be analyzed by tools developed for ROC analysis [79], so it can be used to analyze the data acquired from FROC studies. The AFROC curve is a plot of lesion localization fraction against false-positive fraction, and the area under the AFROC curve is used to define lesion detectability [81]. The JAFROC method provides a figure of merit (FOM) to summarize FROC data and statistically compares the FOMs of clinician/system performances [80, 82].
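
The AFROC curve described above can be computed from mark-rating data as in the sketch below. It rests on stated assumptions rather than on a specific published implementation: the false-positive fraction is taken as the fraction of lesion-free cases whose highest-rated false mark reaches the threshold (other conventions use all cases), the area is obtained by the trapezoidal rule after joining the last operating point to (1, 1), and the case data and function names are hypothetical.

```python
def afroc_points(cases, thresholds):
    """Return (false-positive fraction, lesion localization fraction) pairs.

    Each case is a dict with
      "lesions":     one rating per true lesion, or None if the lesion was missed
      "false_marks": ratings of marks that did not localize any lesion
    A case with an empty "lesions" list is treated as lesion-free.
    """
    total_lesions = sum(len(case["lesions"]) for case in cases)
    normal_cases = [case for case in cases if not case["lesions"]]
    points = []
    for t in sorted(thresholds, reverse=True):
        localized = sum(1 for case in cases for rating in case["lesions"]
                        if rating is not None and rating >= t)
        flagged = sum(1 for case in normal_cases
                      if case["false_marks"] and max(case["false_marks"]) >= t)
        points.append((flagged / len(normal_cases), localized / total_lesions))
    return points


def area_under(points):
    """Trapezoidal area under the curve from (0, 0) through points to (1, 1)."""
    curve = [(0.0, 0.0)] + sorted(points) + [(1.0, 1.0)]
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(curve, curve[1:]))


# Hypothetical reader data with confidence ratings from 1 (low) to 4 (high).
cases = [
    {"lesions": [4, 2], "false_marks": [1]},
    {"lesions": [3], "false_marks": []},
    {"lesions": [None], "false_marks": [2]},
    {"lesions": [], "false_marks": [3]},
    {"lesions": [], "false_marks": []},
    {"lesions": [], "false_marks": [1, 2]},
    {"lesions": [], "false_marks": []},
]

points = afroc_points(cases, thresholds=[1, 2, 3, 4])
afroc_area = area_under(points)
print(points, afroc_area)
```

Comparing this area for readings without and with CADe gives a single-number summary of the kind reported in the observer studies; the JAFROC method adds the resampling needed to test whether the difference is statistically significant, which this sketch does not attempt.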

Fig. 4.3 Snapshot of CADe output for a lung nodule. The detected pure ground-glass nodule is marked with a circle. (a) Scout image (a non-tomographic image). (b) Transverse image. (c) Sagittal image. (d) Coronal image.

Another growing research area of interest in CADe is nodule follow-up [49, 83-86]. Kubo et al. described a comparative reading system using baseline and follow-up 10-mm section scans of the same patients [85]. Their approach comprised section matching, nodule matching, and quantitative evaluation steps for evaluating the growth and shrinkage of lung nodules. A section matching rate of 99.8% was obtained in a performance evaluation using CT scans of 85 patients with 198 nodules. Brown et al. described a patient-specific model, which was derived from the segmentation results of lung architecture and lung nodules using baseline 10-mm section scans [49]. Their model guided the segmentation of subsequent scans for relocalization of previously detected nodules. Their pilot study reported that the correct relocalization rate was 81% using the follow-up scans of 27 nodules. Ko and Betke described a method to automatically detect lung nodules on a CT scan with 5-10-mm section thicknesses and then relocalize them on follow-up scans [83]. Their method involved global registration of baseline and follow-up scans by translation and rotation to align the centroids of lung architecture landmarks; the same transformation was then applied to the nodules detected on the baseline scan to relocalize them on the follow-up scan. In preliminary testing, the assessment of nodule change over time by the computer correlated with that of the radiologist (Spearman rank correlation coefficient, 0.932). Lee et al. evaluated the performance of automated matching software (LungCARE VB20, Siemens Medical Solutions, Forchheim, Germany) in patients imaged with two serial CT scans with a 5-mm section thickness [84]. This study included 30 consecutively enrolled patients with metastatic pulmonary nodules from a pulmonary primary tumor (n = 9) or a non-pulmonary primary tumor (n = 21). The overall matching rate for a total of 210 nodules was 66.7%. In a recent study, Aoki et al. introduced a temporal subtraction (TS) method to enhance interval changes in their CADe scheme and assessed the effects of the TS method on radiologist performance in nodule detection on thin-section CT images with 2-mm section thickness [86]. Their observer study reported that the average sensitivity of the eight participants improved from 73.5% to 83.4% with an FP rate of 0.15 per case in the detection of nodules using 30 nodules ranging in size from 5 to 19 mm.
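
Relocalization by global registration, as in the translation-and-rotation approach described above, can be sketched in two dimensions. This is an illustrative least-squares rigid fit under stated assumptions, not the published method: the landmarks are assumed to be already paired between baseline and follow-up, the problem is reduced to 2D, and the coordinates and function names are hypothetical.

```python
import math


def fit_rigid_2d(baseline_pts, followup_pts):
    """Fit a rotation about the centroids that maps baseline onto follow-up.

    Returns (theta, baseline_centroid, followup_centroid), where theta is the
    least-squares rotation angle for the paired, centroid-aligned landmarks.
    """
    n = len(baseline_pts)
    bcx = sum(x for x, _ in baseline_pts) / n
    bcy = sum(y for _, y in baseline_pts) / n
    fcx = sum(x for x, _ in followup_pts) / n
    fcy = sum(y for _, y in followup_pts) / n
    dot = cross = 0.0
    for (bx, by), (fx, fy) in zip(baseline_pts, followup_pts):
        ax, ay = bx - bcx, by - bcy      # baseline landmark relative to centroid
        cx, cy = fx - fcx, fy - fcy      # follow-up landmark relative to centroid
        dot += ax * cx + ay * cy
        cross += ax * cy - ay * cx
    return math.atan2(cross, dot), (bcx, bcy), (fcx, fcy)


def apply_rigid_2d(point, theta, baseline_centroid, followup_centroid):
    """Map a baseline point into follow-up coordinates with the fitted transform."""
    x = point[0] - baseline_centroid[0]
    y = point[1] - baseline_centroid[1]
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + followup_centroid[0],
            s * x + c * y + followup_centroid[1])


# Hypothetical landmarks: the follow-up scan is rotated by 90 degrees and
# shifted by (20, 5) relative to the baseline scan.
baseline_landmarks = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
followup_landmarks = [(20.0, 5.0), (20.0, 15.0), (10.0, 5.0)]

theta, b_centroid, f_centroid = fit_rigid_2d(baseline_landmarks, followup_landmarks)
nodule_followup = apply_rigid_2d((4.0, 2.0), theta, b_centroid, f_centroid)
print(math.degrees(theta), nodule_followup)
```

A global rigid transform cannot account for differences in inspiration level or local deformation between scans, so the predicted position is best treated as the center of a search region on the follow-up scan rather than as an exact location.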

Impact of CADe Systems on Clinicians’ Performance for Detecting Lung Nodules on CT Images

The Imaging Physics Committee of the American Association of Physicists in Medicine (AAPM) formed a Computer-Aided Detection in Diagnostic Imaging Subcommittee (CADSC) to develop recommendations on approaches for assessing computer-aided detection and diagnosis (CADe/CADx) system performance [87, 88]. The CADSC identified the following major areas for assessing CADe/CADx systems: training and test datasets, reference standards, mark-labeling criteria, stand-alone performance assessment metrics and methodologies, reader performance assessment metrics and methodologies, and study sample size estimation [87]. In the development of CADe systems for detecting lung nodules on CT images, it is indispensable to evaluate not only the stand-alone CADe system but also the effects of the CADe system on clinician accuracy, because the fundamental role of CADe systems is to support a clinician who has the final responsibility for the decision in each case.
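
Two of the areas listed above, mark-labeling criteria and stand-alone performance metrics, interact directly: the sensitivity and FP rate of a CADe system depend on the rule used to decide whether a mark hits a nodule. The sketch below assumes one simple rule, that a mark is a true positive if it lies within the nodule radius of the reference nodule center; this criterion, the data, and the function name `score_cade` are assumptions for illustration, not a recommendation from the CADSC.

```python
import math


def score_cade(scans):
    """Return (sensitivity, false positives per scan) for stand-alone CADe output.

    Each scan is a dict with
      "nodules": reference nodules as (x, y, z, diameter_mm)
      "marks":   CADe marks as (x, y, z)
    Assumed labeling rule: a mark is a hit if it lies within diameter / 2 of a
    nodule center; several marks on one nodule count as a single detection.
    """
    total_nodules = detected = false_positives = 0
    for scan in scans:
        hit_nodules = set()
        for mark in scan["marks"]:
            matched = [i for i, (x, y, z, d) in enumerate(scan["nodules"])
                       if math.dist(mark, (x, y, z)) <= d / 2]
            if matched:
                hit_nodules.update(matched)
            else:
                false_positives += 1
        total_nodules += len(scan["nodules"])
        detected += len(hit_nodules)
    return detected / total_nodules, false_positives / len(scans)


# Hypothetical reference standard and CADe marks for two scans (mm).
scans = [
    {"nodules": [(10, 10, 10, 6), (40, 40, 20, 8)],
     "marks": [(11, 10, 10), (70, 70, 30)]},
    {"nodules": [(25, 30, 15, 10)],
     "marks": [(26, 31, 15), (24, 29, 15), (5, 5, 5)]},
]

sensitivity, fp_per_scan = score_cade(scans)
print(round(sensitivity, 3), fp_per_scan)
```

Loosening the distance rule would raise the measured sensitivity and lower the FP rate for the same marks, which is one reason results obtained under different labeling criteria are hard to compare.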

The results of the investigations of the effect of CADe systems on clinicians’ interpretation of CT examinations that are covered in this section are briefly summarized as follows. Awai et al. reported that the average areas under the AFROC curves without and with their CADe system were 0.64 and 0.67, respectively, using a dataset of 50 CT examinations with ten observers [52]. The difference was statistically significant. Li et al. reported that the average area under the ROC curve improved from 0.76 to 0.85 for 14 radiologists with their CADe system. The LROC curve also showed improvement [70]. Brown et al. evaluated the incremental effects of their CADe system using a dataset of eight CT examinations with 202 observers at a national radiology meeting. The JAFROC analysis involving 13 observers who read all cases indicated an improvement of 22% in FOM with the CADe system, which was not statistically significant [71]. Das et al. compared the effectiveness of two commercial CADe systems (ImageChecker CT, R2 Technology, and Nodule Enhanced Viewing (NEV), Siemens Medical Solutions) using a dataset of 116 small nodules (mean diameter of 3.4 mm) with three radiologists [64]. The sensitivities with the CADe systems were improved for all radiologists. Hirose et al. reported that the average sensitivity improved from 39.5% to 81.0%, with an increase in the average number of FPs from 0.14 to 0.89 per case, using a dataset of 21 patients with six radiologists [72]. The average FOM values without and with a commercial CADe system (ZIOCAD LE version 1.15, Ziosoft Inc., Tokyo, Japan) were 0.390 and 0.845, respectively, and the difference was statistically significant. White et al. reported a multicenter study involving 109 patients from four sites to evaluate the performance of CADe as a second reader. The average increase in area under the ROC curve for ten radiologists with a CADe system (Philips Extended Brilliance Workspace, Philips Healthcare, Eindhoven, the Netherlands; not commercially available in the USA) was 1.9% (95% confidence interval, 0.8-8.0%) [73]. Sahiner et al. reported the impact of their CADe system on six radiologists’ performance using 85 CT examinations stratified by nodule size [74]. Their study results, which evaluated the sensitivity and FOMs without and with the CADe system, indicated that the CADe system improved radiologists’ performance for detecting lung nodules smaller than 5 mm. Yanagawa et al. evaluated the impact of a commercial CADe system (Lung VCAR, GE Healthcare, Milwaukee, WI, USA) on radiologists’ performance in the detection of lung nodules with or without ground-glass opacity (GGO), using a dataset of 48 patients with three radiologists [65]. The reference standard used in the evaluation was established by a consensus panel of radiologists. The pulmonary nodules in their dataset were categorized into three patterns (GGO, solid, or part solid). The FOM values without and with the CADe system were significantly different for all patterns combined and for the solid pattern. Godoy et al. evaluated the effect of a commercial prototype CADe system (VD10A, Siemens Healthcare) on the detection of subsolid and solid nodules on thin- and thick-section CT images using a dataset of 46 CT examinations with four radiologists [75]. Their study included 155 nodules (74 solid nodules, 22 part-solid GGNs, and 59 pure GGNs) with a 5.5-mm average diameter and used various combinations of thick (5 mm) and thin (0.67 or 1 mm) sections. Sensitivities for both subsolid and solid nodules were significantly improved with the CADe system.

On the basis of these observations, there are a number of initiatives underway to improve CADe systems for implementation and utilization in clinical practice as a second reader for increasing the detection of lung nodules in LDCT screening.

 