A Novel Approach to Nodule Feature Optimization on Thin Section ...

A Novel Approach to Nodule Feature Optimization on Thin Section ... A Novel Approach to Nodule Feature Optimization on Thin Section ...

engineering.utep.edu
from engineering.utep.edu More from this publisher
12.07.2015 Views

A ong>Novelong> ong>Approachong> ong>toong> ong>Noduleong> ong>Featureong> ong>Optimizationong>on Thin Section Thoracic CT 1Ravi Samala, MS, Wilfrido Moreno, PhD, Yuncheng You, PhD, Wei Qian, PhDRationale and Objectives. An analysis for the optimum selection of image features in feature domain ong>toong> represent lung noduleswas performed, with implementation inong>toong> a classification module of a computer-aided diagnosis system.Materials and Methods. Forty-two regions of interest obtained from 38 cases with effective diameters of 3 ong>toong> 8.5 mm were used.On the basis of image characteristics and dimensionality, 11 features were computed. Nonparametric correlation coefficients,multiple regression analysis, and principal-component analysis were used ong>toong> map the relation between the represented featuresfrom four radiologists and the computed features. An artificial neural network was used for the classification of benign andmalignant nodules ong>toong> test the hypothesis obtained from the mapping analysis.Results. Correlation coefficients ranging from 0.2693 ong>toong> 0.5178 were obtained between the radiologists’ annotations and thecomputed features. Of the 11 features used, three were found ong>toong> be redundant when both nodule and non-nodule cases were used,and five were found redundant when nodule or non-nodule cases were used. Combination of analysis from correlation coefficients,regression analysis, principal-component analysis, and the artificial neural network resulted in the selection of optimumfeatures ong>toong> achieve F-test values of 0.821 and 0.643 for malignant and benign nodules, respectively.Conclusion. This study demonstrates that for the optimum selection of features, each feature should be analyzed individually andcollectively ong>toong> evaluate the impact on the computer-aided diagnosis system on the basis of its class representation. This methodologywill ultimately aid in improving the generalization capability of a classification module for early lung cancer diagnosis.Key Words. Correlation; multiple regression; principal-component analysis; artificial neural network; nodule features.ª AUR, 2009Computer-aided diagnosis (CAD) for thoracic computed ong>toong>mographicimaging ong>toong> detect lung nodules plays a vital role inearly cancer diagnosis and thus aids in reducing the mortalityrate significantly (1). The most crucial stage in any CADsystem is the final classification module, which differentiatesmalignant lesions from benign lesions using their inherentimage characteristics. The input ong>toong> such a classificationmodule is a set of image features that represent the nodulecharacteristics. These characteristic features are extractedAcad Radiol 2009; 16:418–4271 From the College of Engineering (R.S., W.M.) and the College of Arts andSciences (Y.Y.), University of South Florida, Tampa, FL; and the College ofEngineering, University of Texas at El Paso, 500 West University Avenue, ElPaso, TX 79968 (W.Q.). Received June 18, 2008; accepted Ocong>toong>ber 15, 2008.Address correspondence ong>toong>: W.Q. e-mail: wqian@utep.eduª AUR, 2009doi:10.1016/j.acra.2008.10.009using a mathematical approach that simulates the humanrepresentation of nodule properties.The Fleischner Society (2) defines a lung nodule fromtwo perspectives. As seen by a pathologist, it is a ‘‘small,approximately spherical, circumscribed focus of abnormaltissue,’’ and for a radiologist, it is ‘‘a round opacity, at leastmoderately well marginated and no greater than 3cm inmaximum diameter.’’ Non-nodules have a visual presentationthat is very similar ong>toong> nodules but are not cancerous. Theassessments of nodule characteristics done by radiologistsand pathologists differ from that of the image-processingperspective (3). However, because the image features calculatedwithin a CAD system are motivated largely by a radiologist’sperspective, there exists an inherent correlationbetween human visualization and the mathematical approachong>toong> feature characterization.Mathematically, the features used in a CAD system arewell defined for a nodule, but they are not accurate enough418

A <str<strong>on</strong>g>Novel</str<strong>on</strong>g> <str<strong>on</strong>g>Approach</str<strong>on</strong>g> <str<strong>on</strong>g>to</str<strong>on</strong>g> <str<strong>on</strong>g>Nodule</str<strong>on</strong>g> <str<strong>on</strong>g>Feature</str<strong>on</strong>g> <str<strong>on</strong>g>Optimizati<strong>on</strong></str<strong>on</strong>g><strong>on</strong> <strong>Thin</strong> Secti<strong>on</strong> Thoracic CT 1Ravi Samala, MS, Wilfrido Moreno, PhD, Yuncheng You, PhD, Wei Qian, PhDRati<strong>on</strong>ale and Objectives. An analysis for the optimum selecti<strong>on</strong> of image features in feature domain <str<strong>on</strong>g>to</str<strong>on</strong>g> represent lung noduleswas performed, with implementati<strong>on</strong> in<str<strong>on</strong>g>to</str<strong>on</strong>g> a classificati<strong>on</strong> module of a computer-aided diagnosis system.Materials and Methods. Forty-two regi<strong>on</strong>s of interest obtained from 38 cases with effective diameters of 3 <str<strong>on</strong>g>to</str<strong>on</strong>g> 8.5 mm were used.On the basis of image characteristics and dimensi<strong>on</strong>ality, 11 features were computed. N<strong>on</strong>parametric correlati<strong>on</strong> coefficients,multiple regressi<strong>on</strong> analysis, and principal-comp<strong>on</strong>ent analysis were used <str<strong>on</strong>g>to</str<strong>on</strong>g> map the relati<strong>on</strong> between the represented featuresfrom four radiologists and the computed features. An artificial neural network was used for the classificati<strong>on</strong> of benign andmalignant nodules <str<strong>on</strong>g>to</str<strong>on</strong>g> test the hypothesis obtained from the mapping analysis.Results. Correlati<strong>on</strong> coefficients ranging from 0.2693 <str<strong>on</strong>g>to</str<strong>on</strong>g> 0.5178 were obtained between the radiologists’ annotati<strong>on</strong>s and thecomputed features. Of the 11 features used, three were found <str<strong>on</strong>g>to</str<strong>on</strong>g> be redundant when both nodule and n<strong>on</strong>-nodule cases were used,and five were found redundant when nodule or n<strong>on</strong>-nodule cases were used. Combinati<strong>on</strong> of analysis from correlati<strong>on</strong> coefficients,regressi<strong>on</strong> analysis, principal-comp<strong>on</strong>ent analysis, and the artificial neural network resulted in the selecti<strong>on</strong> of optimumfeatures <str<strong>on</strong>g>to</str<strong>on</strong>g> achieve F-test values of 0.821 and 0.643 for malignant and benign nodules, respectively.C<strong>on</strong>clusi<strong>on</strong>. This study dem<strong>on</strong>strates that for the optimum selecti<strong>on</strong> of features, each feature should be analyzed individually andcollectively <str<strong>on</strong>g>to</str<strong>on</strong>g> evaluate the impact <strong>on</strong> the computer-aided diagnosis system <strong>on</strong> the basis of its class representati<strong>on</strong>. This methodologywill ultimately aid in improving the generalizati<strong>on</strong> capability of a classificati<strong>on</strong> module for early lung cancer diagnosis.Key Words. Correlati<strong>on</strong>; multiple regressi<strong>on</strong>; principal-comp<strong>on</strong>ent analysis; artificial neural network; nodule features.ª AUR, 2009Computer-aided diagnosis (CAD) for thoracic computed <str<strong>on</strong>g>to</str<strong>on</strong>g>mographicimaging <str<strong>on</strong>g>to</str<strong>on</strong>g> detect lung nodules plays a vital role inearly cancer diagnosis and thus aids in reducing the mortalityrate significantly (1). The most crucial stage in any CADsystem is the final classificati<strong>on</strong> module, which differentiatesmalignant lesi<strong>on</strong>s from benign lesi<strong>on</strong>s using their inherentimage characteristics. The input <str<strong>on</strong>g>to</str<strong>on</strong>g> such a classificati<strong>on</strong>module is a set of image features that represent the nodulecharacteristics. These characteristic features are extractedAcad Radiol 2009; 16:418–4271 From the College of Engineering (R.S., W.M.) and the College of Arts andSciences (Y.Y.), University of South Florida, Tampa, FL; and the College ofEngineering, University of Texas at El Paso, 500 West University Avenue, ElPaso, TX 79968 (W.Q.). Received June 18, 2008; accepted Oc<str<strong>on</strong>g>to</str<strong>on</strong>g>ber 15, 2008.Address corresp<strong>on</strong>dence <str<strong>on</strong>g>to</str<strong>on</strong>g>: W.Q. e-mail: wqian@utep.eduª AUR, 2009doi:10.1016/j.acra.2008.10.009using a mathematical approach that simulates the humanrepresentati<strong>on</strong> of nodule properties.The Fleischner Society (2) defines a lung nodule fromtwo perspectives. As seen by a pathologist, it is a ‘‘small,approximately spherical, circumscribed focus of abnormaltissue,’’ and for a radiologist, it is ‘‘a round opacity, at leastmoderately well marginated and no greater than 3cm inmaximum diameter.’’ N<strong>on</strong>-nodules have a visual presentati<strong>on</strong>that is very similar <str<strong>on</strong>g>to</str<strong>on</strong>g> nodules but are not cancerous. Theassessments of nodule characteristics d<strong>on</strong>e by radiologistsand pathologists differ from that of the image-processingperspective (3). However, because the image features calculatedwithin a CAD system are motivated largely by a radiologist’sperspective, there exists an inherent correlati<strong>on</strong>between human visualizati<strong>on</strong> and the mathematical approach<str<strong>on</strong>g>to</str<strong>on</strong>g> feature characterizati<strong>on</strong>.Mathematically, the features used in a CAD system arewell defined for a nodule, but they are not accurate enough418


Academic Radiology, Vol 16, No 4, April 2009NOVEL APPROACH TO CT FEATURE OPTIMIZATIONFigure 2.His<str<strong>on</strong>g>to</str<strong>on</strong>g>gram of nodule size.Figure 1. Proposed identificati<strong>on</strong> process flowchart. CT, computed<str<strong>on</strong>g>to</str<strong>on</strong>g>mographic; ROI, regi<strong>on</strong> of interest.<str<strong>on</strong>g>to</str<strong>on</strong>g> distinguish between a benign and a malignant nodule,which results in a high false-positive detecti<strong>on</strong> rate. Aspointed out by Sluimer et al (1), currently, much of theresearch in lung cancer detecti<strong>on</strong> is focused <strong>on</strong> false-positivereducti<strong>on</strong> (ie, <str<strong>on</strong>g>to</str<strong>on</strong>g> avoid classifying a benign nodule asa malignant nodule). As many as 50 features can be used <str<strong>on</strong>g>to</str<strong>on</strong>g>detect lung nodules (4–9), but there are disadvantages ofusing a large number of features <str<strong>on</strong>g>to</str<strong>on</strong>g> train and test any CADsystem, making optimal feature selecti<strong>on</strong> an essential process.So far, Aoyama et al (10) have used Wilks’s lambda,which is based <strong>on</strong> in-class variati<strong>on</strong>, <str<strong>on</strong>g>to</str<strong>on</strong>g> select seven featuresfrom a set of 43. Raicu et al (3) used a correlati<strong>on</strong> approach<str<strong>on</strong>g>to</str<strong>on</strong>g> analyze the mapping between a radiologist’s descripti<strong>on</strong>and computed image features of nodules. However, thesemethods may not necessarily prove that the mappings canbe directly applied <str<strong>on</strong>g>to</str<strong>on</strong>g> a CAD system. Hence, in additi<strong>on</strong> <str<strong>on</strong>g>to</str<strong>on</strong>g>correlati<strong>on</strong> analysis, our method exploited the n<strong>on</strong>-nodulecharacteristics and their inter-relati<strong>on</strong>ships and the overalleffect <strong>on</strong> a simple CAD system.The goal was <str<strong>on</strong>g>to</str<strong>on</strong>g> find the correlati<strong>on</strong> between image featuresof a nodule and a n<strong>on</strong>-nodule using the proposed CADsystem, as well as the correlati<strong>on</strong> between human- andmachine-interpreted features. In this work, we analyze andreport the impact of this study <strong>on</strong> the CAD system’s performance.Figure 1 shows the flowchart for this process. Differentanalyses were performed <strong>on</strong> the feature relati<strong>on</strong>ships,and a rule-based knowledge system was used <str<strong>on</strong>g>to</str<strong>on</strong>g> select theoptimum feature set giving the highest classificati<strong>on</strong> accuracy.Briefly, the methodology we present here gives thec<strong>on</strong>siderati<strong>on</strong>s for selecting optimum features, thus suggestingits general applicability in other CAD-based imagingmodalities.MATERIALS AND METHODSDatabaseThoracic computed <str<strong>on</strong>g>to</str<strong>on</strong>g>mographic images were obtainedfrom the Nati<strong>on</strong>al Cancer Imaging Archive of the Nati<strong>on</strong>alCancer Institute (Bethesda, MD). The database is a collecti<strong>on</strong>of clinical informati<strong>on</strong> initiated by the Lung Imaging DatabaseC<strong>on</strong>sortium (LIDC) <str<strong>on</strong>g>to</str<strong>on</strong>g> screen patients with lung cancer(11,12). Each patient data set is provided with up <str<strong>on</strong>g>to</str<strong>on</strong>g> fourradiologist annotati<strong>on</strong>s (13). The database used for this projectc<strong>on</strong>sisted of 29 data sets with an in-depth (al<strong>on</strong>g the zaxis) resoluti<strong>on</strong> of


SAMALA ET AL Academic Radiology, Vol 16, No 4, April 2009Table 1Radiologist’s Annotati<strong>on</strong>s Used by the Lung Imaging Database C<strong>on</strong>sortium<str<strong>on</strong>g>Feature</str<strong>on</strong>g> Descripti<strong>on</strong> Scale1 Subtlety Radiologist-assessed subtlety of nodule 1 = extremely subtle, 5 = obvious2 Internal structure Radiologist-assessed internal structure score of 1 = soft tissue, 2 = fluid, 3 = fat, 4 = airnodule3 Calcificati<strong>on</strong> Radiologist-assessed of internal calcificati<strong>on</strong> of nodule 1 = popcorn, 2 = laminated, 3 = solid, 4 = n<strong>on</strong>central,5 = central, 6 = absent4 Sphericity Radiologist-assessed shape of nodule in terms of its 1 = linear, 3 = ovoid, 5 = roundroundness/sphericity, with <strong>on</strong>ly three terms defined5 Margin Radiologist-assessed margin of nodule, with <strong>on</strong>ly the 1 = poorly defined, 5 = sharpextreme values explicitly defined6 Lobulati<strong>on</strong> Radiologist-assessed nodule lobulati<strong>on</strong>, with <strong>on</strong>ly the 1 = marked, 5 = no lobulati<strong>on</strong>extreme values explicitly defined7 Spiculati<strong>on</strong> Radiologist-assessed nodule spiculati<strong>on</strong>, with <strong>on</strong>ly theextreme values explicitly defined1 = marked (marked spiculati<strong>on</strong>), 5 = no spiculati<strong>on</strong>(no spiculati<strong>on</strong>)8 Texture Radiologist-assessed nodule internal texture, with <strong>on</strong>lythree terms defined1 = n<strong>on</strong>solid/ground class opacity, 3 = part solid/mixed, 5 = solid texture9 Malignancy Radiologist subjective assessment of likelihood ofmalignancy of this nodule (assuming 60-year-oldmale smoker)1 = highly unlikely for cancer, 2 = moderately unlikelyfor cancer, 3 = indeterminate likelihood,4 = moderately suspicious for cancer,5 = highly suspicious for cancertraining, and testing phases. The effective diameters of thenodules ranged from 3 <str<strong>on</strong>g>to</str<strong>on</strong>g> 8.5 mm, as shown in Figure 2.<str<strong>on</strong>g>Nodule</str<strong>on</strong>g> Segmentati<strong>on</strong>The LIDC database provided the regi<strong>on</strong> of interest (ROI)of each nodule and the approximate centroid of each n<strong>on</strong>nodule.Because the n<strong>on</strong>-nodules were not provided withROIs, we developed a nodule extracti<strong>on</strong> methodology forboth nodules and n<strong>on</strong>-nodules, so that the method used <str<strong>on</strong>g>to</str<strong>on</strong>g>segment ROIs is identical.The steps in the demarcati<strong>on</strong> of ROIs include calculatingthe approximate centroid of each nodule, followed by usingthese centroids as well as the centroids of the n<strong>on</strong>-nodules <str<strong>on</strong>g>to</str<strong>on</strong>g>segment the three-dimensi<strong>on</strong>al regi<strong>on</strong>. Adaptive clustering<strong>on</strong> the basis of directi<strong>on</strong>al wavelets (14–16) was adapted <str<strong>on</strong>g>to</str<strong>on</strong>g>the computed <str<strong>on</strong>g>to</str<strong>on</strong>g>mographic volume <str<strong>on</strong>g>to</str<strong>on</strong>g> extract the ROIsaround the centroids. The delineati<strong>on</strong> of each nodule alwaysfell somewhere in between the ROIs of the four radiologists,indicating that the accuracy of the segmentati<strong>on</strong> process waswithin the acceptable standards.<str<strong>on</strong>g>Nodule</str<strong>on</strong>g> CharacteristicsThe definiti<strong>on</strong> of a nodule, as defined earlier, is ambiguous,thus leading <str<strong>on</strong>g>to</str<strong>on</strong>g> the use of multiple feature descrip<str<strong>on</strong>g>to</str<strong>on</strong>g>rs.Table 1 lists all the assessment features of the nodule characteristicsd<strong>on</strong>e by the radiologists in the LIDC collecti<strong>on</strong>process.Table 2Classificati<strong>on</strong> of Image <str<strong>on</strong>g>Feature</str<strong>on</strong>g>s Used in the ProjectSphericity (3D)Shape Size Intensity TextureMaximumcompactness (2D)Maximumeccentricity (2D)Volume(3D)Area(2D)Mean pixelvalue (3D)SD (3D)Maximum meanpixel value (2D)In-plane SD (2D)SD, standard deviati<strong>on</strong>; 3D, three-dimensi<strong>on</strong>al;2D, two-dimensi<strong>on</strong>al.C<strong>on</strong>trast (3D)In-planec<strong>on</strong>trast (2D)Some of the features are by definiti<strong>on</strong> identical between thetwo perspectives, such as the sphericity and malignancy ofradiologists’ annotati<strong>on</strong>s with the sphericity and volume of thecomputed features, respectively. For other features, the goal is<str<strong>on</strong>g>to</str<strong>on</strong>g> find the extent of correlati<strong>on</strong>. The applicati<strong>on</strong> of this analysiscan be used in efficient CAD design, because the optimumselecti<strong>on</strong> of nodule features plays an important role in fast andaccurate cancer detecti<strong>on</strong>. This process will also help reducethe number of input features <str<strong>on</strong>g>to</str<strong>on</strong>g> the CAD system, <str<strong>on</strong>g>to</str<strong>on</strong>g> decreasethe computati<strong>on</strong>al need and decrease classificati<strong>on</strong> errorscaused by noise. It is also difficult <str<strong>on</strong>g>to</str<strong>on</strong>g> define accurate decisi<strong>on</strong>boundaries in a large dimensi<strong>on</strong>al space (‘‘the curse of dimensi<strong>on</strong>ality’’)(17). As noted by Suzuki et al (18), an increasein the number of features increases the requirement for trainingcases of an artificial neural network (ANN) classifier420


Academic Radiology, Vol 16, No 4, April 2009NOVEL APPROACH TO CT FEATURE OPTIMIZATIONTable 3Shapiro-Wilk Hypothesis Test for Normality of the DataTable 5Redundant <str<strong>on</strong>g>Feature</str<strong>on</strong>g>s From Principal-Comp<strong>on</strong>ent Analysis<str<strong>on</strong>g>Feature</str<strong>on</strong>g>s P Test (P < .05)Shapiro-Wilk TestStatistic (W)CategoryVariati<strong>on</strong>ThresholdNumber ofRedundant <str<strong>on</strong>g>Feature</str<strong>on</strong>g>sVolume .6103 0 0.9575Sphericity .1006 0 0.9266Mean .0000 1 0.6723SD .1985 0 0.9381C<strong>on</strong>trast .0000 1 0.6687Area .0000 1 0.6471Compactness .6468 0 0.9586Maximum mean .1422 0 0.9324Maximum c<strong>on</strong>trast .0859 0 0.9235Maximum SD .0000 1 0.6608Eccentricity .0000 1 0.7093SD, standard deviati<strong>on</strong>.Table 4Sphericity Measurements for Figure 5SphericityCase Radiologist Calculated1 3 0.17592 5 0.1797exp<strong>on</strong>entially. Thus, the additi<strong>on</strong> of new input features willdecrease the accuracy of a classifier. Also, there could befeatures highly correlated with each other, causing no changein the final accuracy but increasing the computati<strong>on</strong> time.On the basis of previous experience (13,19), a <str<strong>on</strong>g>to</str<strong>on</strong>g>tal of 11features were used; the careful selecti<strong>on</strong> was based <strong>on</strong> twocategories. The first category involved the image characteristics(shape, size, intensity, and texture), and the sec<strong>on</strong>dcategory was based <strong>on</strong> dimensi<strong>on</strong>ality (two and three dimensi<strong>on</strong>s).These features, as menti<strong>on</strong>ed in Table 2, are themost fundamental of all the features that have been used inthe literature so far.Correlati<strong>on</strong> AnalysisThe purpose of this step is <str<strong>on</strong>g>to</str<strong>on</strong>g> find if each of the radiologists’annotati<strong>on</strong>s is mapped <str<strong>on</strong>g>to</str<strong>on</strong>g> at least <strong>on</strong>e image feature, sothat an efficient initial selecti<strong>on</strong> of parameters is d<strong>on</strong>e. Becauseit is assumed that the radiologists’ annotated featuresare c<strong>on</strong>sidered the ‘‘best’’ features, high correlati<strong>on</strong> of eachcomputed feature with at least <strong>on</strong>e radiologist’s annotatedfeature improves the c<strong>on</strong>fidence level of that particularfeature.Parametric Versus N<strong>on</strong>parametricThis step is <str<strong>on</strong>g>to</str<strong>on</strong>g> decide between the use of parametric andn<strong>on</strong>parametric correlati<strong>on</strong> coefficients. Am<strong>on</strong>g the 11 image<str<strong>on</strong>g>Nodule</str<strong>on</strong>g> 2% 5N<strong>on</strong>-nodule 2% 5Both nodule and n<strong>on</strong>-nodule 2% 3Radiologist 2% 4Table 6Classificati<strong>on</strong> Results of Three and Two Dimensi<strong>on</strong>s andCombined <str<strong>on</strong>g>Feature</str<strong>on</strong>g>sCorrectlyClassifiedIncorrectlyClassifiedF(nodule)F(n<strong>on</strong>-nodule)3D 76.1905 23.8095 0.828 0.6152D 78.5714 21.4286 0.847 0.643D and 2D 69.0476 30.9524 0.787 0.4353D, three-dimensi<strong>on</strong>al; 2D, two-dimensi<strong>on</strong>al.features, six followed a normal distributi<strong>on</strong> (parametric), andthe remaining five deviated from a normal distributi<strong>on</strong>(n<strong>on</strong>parametric). Hence, for correlati<strong>on</strong> analysis, instead ofthe parametric (Pears<strong>on</strong>’s product-moment) correlati<strong>on</strong> coefficient,a n<strong>on</strong>parametric (Spearman’s) correlati<strong>on</strong> coefficientwas used. For a database of this size, the Shapiro-Wilktest was chosen <str<strong>on</strong>g>to</str<strong>on</strong>g> test for n<strong>on</strong>-normality. A significance levelof .05 (ie, the deduced hypothesis is true 95% of the time) wasused. From Table 3, it is evident that the mean, c<strong>on</strong>trast, area,maximum standard deviati<strong>on</strong>, and eccentricity features deviatedfrom normal distributi<strong>on</strong> (for test = 1). Even thoughthe Shapiro-Wilk test does not guarantee normality, it definitelyindicates n<strong>on</strong>-normality.Multiple Regressi<strong>on</strong> AnalysisThis analysis is used <str<strong>on</strong>g>to</str<strong>on</strong>g> determine the influence of a subse<str<strong>on</strong>g>to</str<strong>on</strong>g>f two- and three-dimensi<strong>on</strong>al parameters as a whole <strong>on</strong> theradiologists’ annotati<strong>on</strong>s. The general purpose of multipleregressi<strong>on</strong> is <str<strong>on</strong>g>to</str<strong>on</strong>g> learn more about the relati<strong>on</strong>ship betweenseveral independent (predic<str<strong>on</strong>g>to</str<strong>on</strong>g>r) variables and a dependent(criteri<strong>on</strong>) variable. In this case, it is used <str<strong>on</strong>g>to</str<strong>on</strong>g> understandwhich computed image feature has the greatest effect <strong>on</strong>a particular radiologist’s annotati<strong>on</strong> when modeled with allthe image features.Classificati<strong>on</strong>For any classificati<strong>on</strong> problem, a given image featureis c<strong>on</strong>sidered <str<strong>on</strong>g>to</str<strong>on</strong>g> be good <strong>on</strong>ly if it has enough informati<strong>on</strong><str<strong>on</strong>g>to</str<strong>on</strong>g> distinguish between classes. A single feature by itselfis insufficient for classificati<strong>on</strong>; several features are used421


SAMALA ET AL Academic Radiology, Vol 16, No 4, April 2009Table 7Classificati<strong>on</strong> Using All <str<strong>on</strong>g>Feature</str<strong>on</strong>g>s Except Maximum EccentricityCorrectly Classified Incorrectly Classified F (nodule) F (n<strong>on</strong>-nodule) Correlati<strong>on</strong> with RadiologistAll but maximum eccentricity 69.0476 30.9524 0.787 0.435 0.2693Table 8F-Test Results for Various <str<strong>on</strong>g>Feature</str<strong>on</strong>g>s<str<strong>on</strong>g>Feature</str<strong>on</strong>g>sF(nodule)F(n<strong>on</strong>-nodule)Correlati<strong>on</strong>with RadiologistAll computed features 0.8 0.5 N/AVolume 0.8 0 0.5027C<strong>on</strong>trast 0.8 0 0.3739Compactness 0.765 0 0.5178Sphericity 0.75 0.5 0.4586Area 0.831 0.421 0.4739Maximum c<strong>on</strong>trast 0.767 0.417 0.3397Maximum eccentricity 0.733 0.333 0.2693Mean 0.698 0.095 0.4947SD 0.698 0.095 0.4123Maximum mean 0.698 0.095 0.4675Maximum SD 0.738 0.105 0.3961Sphericity, area,0.821 0.643 N/Amaximum c<strong>on</strong>trast,maximum eccentricityVolume, compactness,c<strong>on</strong>trast0.781 0.3 N/AN/A, not available; SD, standard deviati<strong>on</strong>.Table 9F-Test Results Using All <str<strong>on</strong>g>Feature</str<strong>on</strong>g>s Except Those Menti<strong>on</strong>edin the TableAll <str<strong>on</strong>g>Feature</str<strong>on</strong>g>s Excluding F (nodule) F (n<strong>on</strong>-nodule)Volume 0.8 0.5Sphericity 0.814 0.56Mean 0.822 0.522SD 0.759 0.462C<strong>on</strong>trast 0.793 0.538Area 0.833 0.583Compactness 0.8 0.5Maximum mean 0.8 0.5Maximum c<strong>on</strong>trast 0.833 0.583Maximum eccentricity 0.787 0.435Maximum SD 0.759 0.462SD, standard deviati<strong>on</strong>.by various classificati<strong>on</strong> algorithms. Ideally, thecorrelati<strong>on</strong> analysis between the image features and theannotati<strong>on</strong>s is c<strong>on</strong>sidered <str<strong>on</strong>g>to</str<strong>on</strong>g> improve CAD performance.This analysis was used in the selecti<strong>on</strong> of features thatbest define the distincti<strong>on</strong> between a nodule and an<strong>on</strong>-nodule.However, for a classificati<strong>on</strong> algorithm used in a CADsystem, the representati<strong>on</strong> of a nodule as well as a n<strong>on</strong>noduleneeds <str<strong>on</strong>g>to</str<strong>on</strong>g> be used for training. Finding the correlati<strong>on</strong>between annotated and calculated descrip<str<strong>on</strong>g>to</str<strong>on</strong>g>rs of a noduleal<strong>on</strong>e is insufficient <str<strong>on</strong>g>to</str<strong>on</strong>g> classify benign and malignantnodules.To test the hypothesis from correlati<strong>on</strong> analysis, a threelayerfeed-forward neural network with a n<strong>on</strong>linear sigmoidactivati<strong>on</strong> functi<strong>on</strong> using a back-propagati<strong>on</strong> learning algorithmwas used <str<strong>on</strong>g>to</str<strong>on</strong>g> classify the abnormal and normal lungnodules. The objective of the classificati<strong>on</strong> step was <str<strong>on</strong>g>to</str<strong>on</strong>g>verify which combinati<strong>on</strong> of features resulted in the bestclass representati<strong>on</strong> but not <str<strong>on</strong>g>to</str<strong>on</strong>g> improve the overall CADperformance at this point. This helps in rating the inputimage features <str<strong>on</strong>g>to</str<strong>on</strong>g>ward efficient classificati<strong>on</strong>. A k-fold crossvalidati<strong>on</strong> with 500 iterati<strong>on</strong>s was used for all the combinati<strong>on</strong>sof input features. In this method, the data set isdivided in<str<strong>on</strong>g>to</str<strong>on</strong>g> k parts, with k 1 parts used for training andthe remaining part used as a testing set. This process isrepeated k times, with the c<strong>on</strong>straint that each part is usedexactly <strong>on</strong>ce for testing. This ensures that all the data areeffectively used for both training and testing purposes incase of a small data set. A k value of 4 was chosen <str<strong>on</strong>g>to</str<strong>on</strong>g> dividethe training and testing sets in<str<strong>on</strong>g>to</str<strong>on</strong>g> 75% and 25%, respectively.The parameter F test was used for this purpose <str<strong>on</strong>g>to</str<strong>on</strong>g>grade the quality of each class representati<strong>on</strong> after testingprocess. The closer the value of F is <str<strong>on</strong>g>to</str<strong>on</strong>g> 1, the better is therepresentati<strong>on</strong> of the class.Knowledge BaseKnowledge base is an intermediate process <str<strong>on</strong>g>to</str<strong>on</strong>g> decidewhether the results from correlati<strong>on</strong> analysis can be used <str<strong>on</strong>g>to</str<strong>on</strong>g>select the optimum features. All the results of correlati<strong>on</strong>analysis, multiple regressi<strong>on</strong>, and principal-comp<strong>on</strong>entanalysis (PCA) are tested <str<strong>on</strong>g>to</str<strong>on</strong>g> determine the effectiveness <str<strong>on</strong>g>to</str<strong>on</strong>g>wardclassificati<strong>on</strong>. Hence, as a first step, all the features areindividually used <str<strong>on</strong>g>to</str<strong>on</strong>g> train and test the classificati<strong>on</strong> algorithm.This will result in effectiveness of each feature <str<strong>on</strong>g>to</str<strong>on</strong>g>ward individualoutput class representati<strong>on</strong>. Next, all the features withthe highest correlati<strong>on</strong> coefficients are used as input features.This is the case where it is accepted that all the highly correlatedfeatures will in fact result in good classificati<strong>on</strong>. All422


Academic Radiology, Vol 16, No 4, April 2009NOVEL APPROACH TO CT FEATURE OPTIMIZATIONFigure 3. Correlati<strong>on</strong> analysis between computed features and radiologists’ annotati<strong>on</strong>s.Parametric correlati<strong>on</strong> coefficient values are used <str<strong>on</strong>g>to</str<strong>on</strong>g> show the strength ofmapping. SD, standard deviati<strong>on</strong>; 3D, three-dimensi<strong>on</strong>al; 2D, two-dimensi<strong>on</strong>al.features except for <strong>on</strong>e feature as an input are used, repeatingthis for the number of features, eliminating <strong>on</strong>e feature eachtime. It is a way <str<strong>on</strong>g>to</str<strong>on</strong>g> analyze the impact of <strong>on</strong>e feature when allthe features are used. Then, combinati<strong>on</strong>s of three- and twodimensi<strong>on</strong>alfeatures are used, s<str<strong>on</strong>g>to</str<strong>on</strong>g>ring the F values everytime. This is <str<strong>on</strong>g>to</str<strong>on</strong>g> analyze how multiple regressi<strong>on</strong> analysis canaffect the selecti<strong>on</strong> of features.One combinati<strong>on</strong> of features that the knowledge baseselects is based <strong>on</strong> the feedback from the ANN. The selecti<strong>on</strong>of these features is based <strong>on</strong> the F values when anindividual feature is used for training and testing the ANN.The feature that represents the nodule and the n<strong>on</strong>-noduleclass by >50% of the maximum attained F value for anyfeature is selected. Then, the results of PCA are used <str<strong>on</strong>g>to</str<strong>on</strong>g>cross-check for any redundant features in the selected subset.If two features are found redundant, then the featurewith highest average of F values for both classes is usedinstead.RESULTSThe interobserver variati<strong>on</strong> am<strong>on</strong>g radiologist assessmentswhen characterizing a nodule is apparent (20–22).This results in data that are partially random and withoutc<strong>on</strong>formity, thus producing low values of correlati<strong>on</strong> coefficients,as seen in Figure 3. These findings correlate withprevious results (2,11). The variati<strong>on</strong> is evident in the datawe selected; <strong>on</strong>e such example is shown in Figure 4, inwhich all four radiologists marked the ROIs differently.Sphericity and volume calculated from the ROIs in Figure 4and the sphericity as indicated by the radiologists inFigure 5 show the severity of the extent of variati<strong>on</strong>. InFigure 6 and Table 4, the calculated sphericity is comparedwith the annotated sphericity for two different nodules.Although the calculated sphericities were equal, the annotatedsphericities differed by two levels. Figure 7 shows thevariati<strong>on</strong> between observed and calculated sphericities overthe entire data set. For each level of observed sphericity, thecalculated sphericity varied from a minimum <str<strong>on</strong>g>to</str<strong>on</strong>g> a maximumvalue. This suggests that the direct use of the radiologistannotati<strong>on</strong>s for a classificati<strong>on</strong> algorithm is likely <str<strong>on</strong>g>to</str<strong>on</strong>g> resultin an err<strong>on</strong>eous diagnosis.PCA was used <str<strong>on</strong>g>to</str<strong>on</strong>g> find the presence of any redundantfeatures (Table 5) am<strong>on</strong>g the nodule and n<strong>on</strong>-nodules. Avariati<strong>on</strong> threshold of 2% was used <str<strong>on</strong>g>to</str<strong>on</strong>g> estimate the number ofredundant features.Multiple regressi<strong>on</strong> analysis of subsets of five three-dimensi<strong>on</strong>al,five two-dimensi<strong>on</strong>al, 10 three-dimensi<strong>on</strong>al andtwo-dimensi<strong>on</strong>al, and 11 features can be visualized usingsquared multiple correlati<strong>on</strong> coefficients (R 2 ). If R 2 =0,a model has no predictability, and if R 2 = 1, a model hasperfect predictability. Looking at the graph (Fig 8), it can bec<strong>on</strong>cluded that the higher the number of features, the betterthe predictability of the model. Although this is the case, it ispossible that with more features, the classificati<strong>on</strong> algorithmmay not be generalized enough, and the accuracy of correctclassificati<strong>on</strong> could be lower.The feedback from the ANN classificati<strong>on</strong> algorithm inc<strong>on</strong>juncti<strong>on</strong> with the knowledge base leads <str<strong>on</strong>g>to</str<strong>on</strong>g> the resultspresented in Tables 6 <str<strong>on</strong>g>to</str<strong>on</strong>g> 9. Table 6 gives the impact of twoandthree-dimensi<strong>on</strong>al features <strong>on</strong> the classificati<strong>on</strong> module.The effect of the maximum eccentricity feature that had thelowest correlati<strong>on</strong> coefficient is shown in Table 7, and the423


SAMALA ET AL Academic Radiology, Vol 16, No 4, April 2009Figure 4. Regi<strong>on</strong>s of interest as marked by four different radiologists (blue, green, red,and cyan lines) for a nodule <strong>on</strong> c<strong>on</strong>secutive slices shown al<strong>on</strong>g the in-depth (z) directi<strong>on</strong>.effect of each feature is shown in Table 8. Table 9 gives the Fvalues of the all features used collectively, excluding thefeature menti<strong>on</strong>ed in the table.DISCUSSIONThe level of impact of human visualizati<strong>on</strong> <strong>on</strong> the developmen<str<strong>on</strong>g>to</str<strong>on</strong>g>f CAD systems is an area that needs <str<strong>on</strong>g>to</str<strong>on</strong>g> be explored.This will help determine <str<strong>on</strong>g>to</str<strong>on</strong>g> what extent current CAD methodsare dependent <strong>on</strong> the human approach of analyzing radiographicimages. By using more image features, the probabilityof identifying a nodule is higher. However, given thenature of a classificati<strong>on</strong> methodology, the use of a largenumber of features will increase the false-positive rate andcomputati<strong>on</strong> time and would probably result in overtraining.Figure 5. Sphericity measured by different radiologists andcalculated al<strong>on</strong>g with the volume of the nodule.424


Academic Radiology, Vol 16, No 4, April 2009NOVEL APPROACH TO CT FEATURE OPTIMIZATIONFigure 6. Two different nodule cases with equal calculated sphericities but unequalannotated sphericity values as indicated in Table 6. C<strong>on</strong>secutive slices al<strong>on</strong>g thein-depth (z) directi<strong>on</strong> are shown.Figure 7. Radiologist’s assessment of sphericity versusmachine-calculated sphericity.Figure 8. Multiple regressi<strong>on</strong> analysis results for two-dimensi<strong>on</strong>al(2D) and three-dimensi<strong>on</strong>al (3D) parameters. The x axis isthe radiologist’s annotati<strong>on</strong>s, and the y axis represents the squareof the correlati<strong>on</strong> coefficient (R 2 ).Three different analyses—correlati<strong>on</strong>, multiple regressi<strong>on</strong>,and PCA—were performed <str<strong>on</strong>g>to</str<strong>on</strong>g> observe the type andstrength of the mapping between the annotati<strong>on</strong>s and thecalculated image features. Each analysis resulted in framingthe rules and c<strong>on</strong>straints for the selecti<strong>on</strong> of the best imagefeatures that were optimum descrip<str<strong>on</strong>g>to</str<strong>on</strong>g>rs for a classificati<strong>on</strong>method. These rules were tested with ANN classificati<strong>on</strong>,whose feedback was used <str<strong>on</strong>g>to</str<strong>on</strong>g> obtain the best optimum featuresfrom the CAD system’s perspective. The effect of thestrength of mappings <strong>on</strong> the CAD system was analyzed indetail.Correlati<strong>on</strong> analysis can be used <str<strong>on</strong>g>to</str<strong>on</strong>g> see if the computedfeatures have any correlati<strong>on</strong> with the radiologists’ annotati<strong>on</strong>s.This will help determine if the selected machine425


SAMALA ET AL Academic Radiology, Vol 16, No 4, April 2009features are as good as the human visual descrip<str<strong>on</strong>g>to</str<strong>on</strong>g>rs inidentifying the cancerous tissue. At this step, care must betaken <str<strong>on</strong>g>to</str<strong>on</strong>g> be certain that at least <strong>on</strong>e computed feature is highlycorrelated with each of the radiologists’ annotati<strong>on</strong>s. Thisstep will generally help increase the in-class representati<strong>on</strong> ofthe nodule.From multiple regressi<strong>on</strong> analysis, it can be c<strong>on</strong>cludedthat the subset of two-dimensi<strong>on</strong>al features influenced thefinal predicti<strong>on</strong> model better than that of the three-dimensi<strong>on</strong>alfeatures. In additi<strong>on</strong>, any increase in the number ofparameters resulted in an increase in the correlati<strong>on</strong> coefficientfor each linear predicti<strong>on</strong> model. Hence, it is recommended<str<strong>on</strong>g>to</str<strong>on</strong>g> use more two-dimensi<strong>on</strong>al computed features thanthree-dimensi<strong>on</strong>al features for n<strong>on</strong>isometric volume. FromPCA, up <str<strong>on</strong>g>to</str<strong>on</strong>g> three redundant features were observed, indicatingthe presence of highly correlated features. Reducing theredundant features principally improves the computati<strong>on</strong>time for the training and testing phases of the classificati<strong>on</strong>module.A classificati<strong>on</strong> algorithm is used <str<strong>on</strong>g>to</str<strong>on</strong>g> test the weights ofthe hypotheses from correlati<strong>on</strong>, multiple regressi<strong>on</strong>, andPCA. From Figure 3, maximum eccentricity was observed<str<strong>on</strong>g>to</str<strong>on</strong>g> have a high impact <strong>on</strong> the F test for a n<strong>on</strong>-nodule class(Tables 7 and 9) when the feature was eliminated from theclassificati<strong>on</strong>, even though it had the lowest correlati<strong>on</strong>coefficient. Hence, it can be c<strong>on</strong>cluded that maximum eccentricityis <strong>on</strong>e image feature that is a good representati<strong>on</strong>of a n<strong>on</strong>-nodule am<strong>on</strong>g all the features that were used.<str<strong>on</strong>g>Feature</str<strong>on</strong>g>s such as volume, compactness, and mean had thehighest correlati<strong>on</strong> coefficients, but when used individuallyfor classificati<strong>on</strong> (Table 8), the F value for in-class representati<strong>on</strong>of n<strong>on</strong>-nodules was found <str<strong>on</strong>g>to</str<strong>on</strong>g> be zero. Similarly,when these features were removed from the classificati<strong>on</strong>stage, the F values were unaffected (Table 9). Thus, correlati<strong>on</strong>analysis cannot be used entirely as a precursor forfeature selecti<strong>on</strong>.As shown in Table 6, two-dimensi<strong>on</strong>al features are betterdescrip<str<strong>on</strong>g>to</str<strong>on</strong>g>rs <str<strong>on</strong>g>to</str<strong>on</strong>g> distinguish a nodule and a n<strong>on</strong>-nodule comparedwith three-dimensi<strong>on</strong>al features. The F values for eachclass are high in case of the two-dimensi<strong>on</strong>al features. Thiscould be attributed <str<strong>on</strong>g>to</str<strong>on</strong>g> the n<strong>on</strong>isometric resoluti<strong>on</strong> of the data.It was also observed that combining the two- and three-dimensi<strong>on</strong>alfeatures resulted in lower F values for each class,indicating the presence of noise. From Table 9, it was observedthat by removing each feature from the pool of all thefeatures, the in-class representati<strong>on</strong> sometimes improved orworsened or did not change at all, indicating that multipleregressi<strong>on</strong> analysis cannot be entirely relied <strong>on</strong> <str<strong>on</strong>g>to</str<strong>on</strong>g> select theoptimum features.The knowledge base module is programmed <str<strong>on</strong>g>to</str<strong>on</strong>g> selectfeatures with high F values for both classes, <str<strong>on</strong>g>to</str<strong>on</strong>g> include moretwo-dimensi<strong>on</strong>al features compared with three-dimensi<strong>on</strong>alfeatures, and <str<strong>on</strong>g>to</str<strong>on</strong>g> eliminate any features with high correlati<strong>on</strong>am<strong>on</strong>g themselves. The highest F values were achieved whensphericity, area, maximum c<strong>on</strong>trast, and maximum eccentricityfeatures were used (Table 8). These features had neitherthe highest nor the lowest correlati<strong>on</strong> coefficients; rather,75% bel<strong>on</strong>ged <str<strong>on</strong>g>to</str<strong>on</strong>g> the two-dimensi<strong>on</strong>al category and 25%bel<strong>on</strong>ged <str<strong>on</strong>g>to</str<strong>on</strong>g> the three-dimensi<strong>on</strong>al category. To test thispremise, volume, compactness, and c<strong>on</strong>trast features that didnot represent the n<strong>on</strong>-nodule class were used, resulting in lowF values.Beside the ANN classifier and the three analyses used forthe selecti<strong>on</strong> of features in thoracic computed <str<strong>on</strong>g>to</str<strong>on</strong>g>mographicdata, we shall explore some other analyses <strong>on</strong> the basis ofmathematical programming. Classificati<strong>on</strong> of biopsy lungtissue images (23) will be tested for the general applicati<strong>on</strong> ofthis methodology. An extensi<strong>on</strong> of robust linear programming(24,25) can be used for this kind of optimal featureselecti<strong>on</strong>.In c<strong>on</strong>clusi<strong>on</strong>, CAD developers should not include featuresdepending <strong>on</strong> various analyses oriented <str<strong>on</strong>g>to</str<strong>on</strong>g>ward radiologistannotati<strong>on</strong>s al<strong>on</strong>e. Each feature must be analyzed <str<strong>on</strong>g>to</str<strong>on</strong>g>evaluate the impact it has <strong>on</strong> the CAD system <strong>on</strong> the basis ofits class representati<strong>on</strong>. The generalizati<strong>on</strong> capability ofa classificati<strong>on</strong> methodology will be limited if the selecti<strong>on</strong>of features is based solely <strong>on</strong> radiologists’ nature of analyzinglung nodules. <str<strong>on</strong>g>Feature</str<strong>on</strong>g>s from different dimensi<strong>on</strong>alityand domains should be c<strong>on</strong>sidered as an initial feature set, <str<strong>on</strong>g>to</str<strong>on</strong>g>describe vaguely defined nodules. The mappings betweenradiologists and CAD developers will create a comm<strong>on</strong>platform that can be used <str<strong>on</strong>g>to</str<strong>on</strong>g> enhance meaningful communicati<strong>on</strong>.This will motivate radiologists <str<strong>on</strong>g>to</str<strong>on</strong>g> assess pulm<strong>on</strong>arynodules from a different perspective. The present analysesare generalized <str<strong>on</strong>g>to</str<strong>on</strong>g> be used in any CAD system, providedthat training and testing are involved in the classificati<strong>on</strong>process.REFERENCES1. Sluimer I, Schilham A, Prokop M, et al. Computer analysis of computed<str<strong>on</strong>g>to</str<strong>on</strong>g>mography scans of the lung: a survey. IEEE Trans Med Imaging 2006;25:385–405.2. Austin JH, Müller N, Friedman PJ, et al. Glossary of terms for CT of thelungs: recommendati<strong>on</strong>s of the nomenclature committee of the FleischnerSociety. Radiology 1996; 200:327–331.3. Raicu DS, Varutbangkul E, Cisneros JG, et al. Semantics and imagec<strong>on</strong>tent integrati<strong>on</strong> for pulm<strong>on</strong>ary nodule interpretati<strong>on</strong> in thoraciccomputed <str<strong>on</strong>g>to</str<strong>on</strong>g>mography. Proc SPIE Int Soc Opt Eng 2007; 8:65120S.1–65120S.12.4. Arma<str<strong>on</strong>g>to</str<strong>on</strong>g> SG III, Li F, Giger ML, et al. Lung cancer: performance of au<str<strong>on</strong>g>to</str<strong>on</strong>g>matedlung nodule detecti<strong>on</strong> applied <str<strong>on</strong>g>to</str<strong>on</strong>g> cancers missed in a CT screeningprogram. Radiology 2002; 225:685–692.5. Brown MS, Goldin JG, Suh RD, et al. Lung micr<strong>on</strong>odules: au<str<strong>on</strong>g>to</str<strong>on</strong>g>matedmethod for detecti<strong>on</strong> at thin-secti<strong>on</strong> CT—initial experience. Radiology2003; 226:256–262.6. Lee Y, Hara T, Fujita H, et al. Au<str<strong>on</strong>g>to</str<strong>on</strong>g>mated detecti<strong>on</strong> of pulm<strong>on</strong>ary nodulesin helical CT images based <strong>on</strong> an improved template-matching technique.IEEE Trans Med Imaging 2001; 20:595–604.7. T<strong>on</strong>g J, Da-Zhe Z, Jin-Zhu Y, et al. Au<str<strong>on</strong>g>to</str<strong>on</strong>g>mated detecti<strong>on</strong> of pulm<strong>on</strong>arynodules in HRCT images. In: Proceedings of the 1st Internati<strong>on</strong>al426


Academic Radiology, Vol 16, No 4, April 2009NOVEL APPROACH TO CT FEATURE OPTIMIZATIONC<strong>on</strong>ference <strong>on</strong> Bioinformatics and Biomedical Engineering. New York:IEEE, 2007:833–836.8. Arma<str<strong>on</strong>g>to</str<strong>on</strong>g> SG III, Giger ML, Moran CJ, et al. Computerized detecti<strong>on</strong> ofpulm<strong>on</strong>ary nodules <strong>on</strong> CT scans. RadioGraphics 1999; 19:1303–1311.9. McNitt-Gray MF, Wyckoff N, Sayre JW, et al. The effects of co-occurrencematrix based texture parameters <strong>on</strong> the classificati<strong>on</strong> of solitarypulm<strong>on</strong>ary nodules imaged <strong>on</strong> computed <str<strong>on</strong>g>to</str<strong>on</strong>g>mography. Comput MedImag Graphics 1999; 23:339–348.10. Aoyama M, Li Q, Katsuragawa S, et al. Computerized scheme for determinati<strong>on</strong>of the likelihood measure of malignancy for pulm<strong>on</strong>ary nodules<strong>on</strong> low-dose CT images. Med Phys 2003; 30:387–394.11. Clarke LP, Croft BY, Staab E, et al. Nati<strong>on</strong>al Cancer Institute initiative:lung image database resource for imaging research. Acad Radiol 2001;8:447–450.12. Arma<str<strong>on</strong>g>to</str<strong>on</strong>g> SG III, McLennan G, McNitt-Gray MF. Lung Image DatabaseC<strong>on</strong>sortium: developing a resource for the medical imaging researchcommunity. Radiology 2004; 232:739–748.13. McNitt-Gray MF, Arma<str<strong>on</strong>g>to</str<strong>on</strong>g> SG III, Meyer CR. The Lung Image DatabaseC<strong>on</strong>sortium (LIDC) data collecti<strong>on</strong> process for nodule detecti<strong>on</strong> andannotati<strong>on</strong>. Acad Radiol 2007; 14:1464–1474.14. Qian W, Li LH, Clarke LP. Image feature extracti<strong>on</strong> for mass detecti<strong>on</strong>using digital mammography: influence of wavelet analysis. Med Phys1999; 26:402–408.15. Qian W, Sun XJ, S<strong>on</strong>g DS, et al. Digital mammography: wavelet transformand Kalman filtering neural network in mass segmentati<strong>on</strong> and detecti<strong>on</strong>.Acad Radiol 2001; 11:1074–1082.16. Qian W, Lei M, S<strong>on</strong>g DS, et al. Computer aided mass detecti<strong>on</strong> based<strong>on</strong> ipsilateral multi-view mammograms. Acad Radiol 2007; 14:530–538.17. Fukunaga K. Introducti<strong>on</strong> <str<strong>on</strong>g>to</str<strong>on</strong>g> statistical pattern recogniti<strong>on</strong>. L<strong>on</strong>d<strong>on</strong>: AcademicPress, 1990.18. Suzuki K, Feng Li, S<strong>on</strong>e S, et al. Computer-aided diagnostic scheme fordistincti<strong>on</strong> between benign and malignant nodules in thoracic low-doseCT by use of massive training artificial neural network. IEEE Trans MedImaging 2005; 24:1138–1150.19. Qian W, Zhukov TA, S<strong>on</strong>g DS, et al. Computerized analysis of cellularfeatures and biomarkers for cy<str<strong>on</strong>g>to</str<strong>on</strong>g>logic diagnosis of early lung cancer. AnalQuant Cy<str<strong>on</strong>g>to</str<strong>on</strong>g>l His<str<strong>on</strong>g>to</str<strong>on</strong>g>l 2007; 10:18–28.20. Dodd LE, Wagner RF. Arma<str<strong>on</strong>g>to</str<strong>on</strong>g> SG III. Assessment methodologiesand statistical issues for computer-aided diagnosis of lung nodulesin computed <str<strong>on</strong>g>to</str<strong>on</strong>g>mography: c<strong>on</strong>temporary research <str<strong>on</strong>g>to</str<strong>on</strong>g>pics relevant<str<strong>on</strong>g>to</str<strong>on</strong>g> the Lung Image Database C<strong>on</strong>sortium. Acad Radiol 2004; 11:462–475.21. Meyer CR, Johns<strong>on</strong> TD, McLennan G. Evaluati<strong>on</strong> of lung MDCT noduleannotati<strong>on</strong> across radiologists and methods. Acad Radiol 2006; 13:1254–1265.22. Arma<str<strong>on</strong>g>to</str<strong>on</strong>g> SG III, McNitt-Gray MF, Reeves AP. The Lung Image DatabaseC<strong>on</strong>sortium (LIDC): an evaluati<strong>on</strong> of radiologist variability in theidentificati<strong>on</strong> of lung nodules <strong>on</strong> CT scans. Acad Radiol 2007; 14:1409–1421.23. Land WL, McKee DW, Zhukov T, Qian W. A kernelised fuzzy-supportvec<str<strong>on</strong>g>to</str<strong>on</strong>g>r machine CAD system for the diagnosis of lung cancer from tissueimages. Int J Funct Informat Pers Med 2008; 1:26–42.24. Bennett KP, Mangasarian OL. Robust linear programming discriminati<strong>on</strong>of two linearly inseparable sets. Optim Methods Softw 1992; 1:23–34.25. Mangasarian OL. Mathematical programming in data mining. Data MinKnowl Discov 1997; 42:183–201.Erratum‘‘Asymmetry Analysis in Rodent Cerebral Ischemia Models.’’ Acad Radiol 2008; 15:1181–1197.In the secti<strong>on</strong> ‘‘Materials and Methods’’ (pp 1183–1187), the included mathematics closely parallels the development in theunpublished manuscript ‘‘Further Techniques for Enhancing Asymmetry in Brain Imagery,’’ by X. Liu, C. Imielinska,and J. Rosiene (2005), without reference <str<strong>on</strong>g>to</str<strong>on</strong>g> the prior work. The authors regret the omissi<strong>on</strong>.427

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!