njit-etd2003-081 - New Jersey Institute of Technology
These ideas are easily extended to the case of P variables x1, x2, ..., xP. Each principal component is a linear combination of the x variables. The coefficients of these linear combinations are chosen to satisfy the following three requirements:
1. Var C1 ≥ Var C2 ≥ ... ≥ Var CP.
2. The values of any two principal components are uncorrelated.
3. For any principal component, the sum of the squares of the coefficients is one.
In other words, C1 is the linear combination with the largest variance. Subject to the condition that it is uncorrelated with C1, C2 is the linear combination with the largest variance. Similarly, C3 has the largest variance subject to the condition that it is uncorrelated with C1 and C2, and so on. The variances Var Ci are the eigenvalues of the covariance matrix of the x variables, and these P variances add up to the original total variance. In some literature the set of coefficients of the linear combination for the ith principal component is called the ith eigenvector (also known as the characteristic or latent vector).
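The three requirements can be checked numerically. The following is a minimal sketch, not the procedure used in this work: it assumes only two variables (P = 2), so that the eigenvalues of the 2 x 2 sample covariance matrix have a closed form, and the data values and the function names pca_2d, scores, and sample_cov are made up for illustration.

```python
import math

def sample_cov(u, v):
    """Sample covariance of two equal-length sequences (divisor n - 1)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / (n - 1)

def pca_2d(xs, ys):
    """Eigenvalues and unit eigenvectors of the 2x2 sample covariance matrix."""
    a = sample_cov(xs, xs)  # variance of x1
    c = sample_cov(ys, ys)  # variance of x2
    b = sample_cov(xs, ys)  # covariance of x1 and x2
    # Closed-form eigenvalues of the symmetric matrix [[a, b], [b, c]]
    mid = (a + c) / 2
    rad = math.hypot((a - c) / 2, b)
    lam1, lam2 = mid + rad, mid - rad
    # Eigenvector belonging to the larger eigenvalue
    if b != 0:
        v1 = (b, lam1 - a)
    else:
        v1 = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(v1[0], v1[1])
    v1 = (v1[0] / norm, v1[1] / norm)  # sum of squared coefficients is one
    v2 = (-v1[1], v1[0])               # orthogonal unit vector
    return (lam1, lam2), (v1, v2), (a, c)

def scores(xs, ys, vec):
    """Values of one principal component for mean-centred data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return [vec[0] * (x - mx) + vec[1] * (y - my) for x, y in zip(xs, ys)]

# Hypothetical data, for illustration only
xs = [2.0, 4.0, 6.0, 8.0, 10.0]
ys = [1.0, 3.0, 2.0, 5.0, 4.0]

(lam1, lam2), (v1, v2), (var_x, var_y) = pca_2d(xs, ys)
c1 = scores(xs, ys, v1)
c2 = scores(xs, ys, v2)

print("Var C1, Var C2:", lam1, lam2)
print("total variance:", var_x + var_y)
print("Cov(C1, C2):", sample_cov(c1, c2))
```

In this sketch the two eigenvalues come out in decreasing order, their sum equals the total variance of the two original variables, and the covariance between the two sets of component values is zero up to rounding error. For P > 2 the same quantities would normally be obtained from a general eigenvalue routine rather than a closed form.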
3.15 Cluster Analysis
The term cluster analysis (first used by Tryon, 1939) encompasses a number of different classification algorithms. A general question facing researchers in many areas of inquiry is how to organize observed data into meaningful structures, that is, how to develop taxonomies. For example, biologists have to organize the different species of animals before a meaningful description of the differences between animals is possible. According to the modern system employed in biology, man belongs to the primates, the mammals, the amniotes, the vertebrates, and the animals. Note that in this classification, the higher the level of aggregation, the less similar are the members of the respective class. Man has more in common with all other primates (e.g., apes) than he does with the more "distant" members of the mammals (e.g., dogs), etc.
Note that this discussion refers to clustering algorithms and makes no mention of statistical significance testing. In fact, cluster analysis is not so much a typical statistical test as it is a "collection" of different algorithms that "put objects into clusters." The point here is that, unlike many other statistical procedures, cluster analysis methods are mostly used when no a priori hypotheses are available and the research is still in its exploratory phase. In a sense, cluster analysis finds the "most significant solution possible." Therefore, statistical significance testing is really not appropriate here, even in cases when p-levels are reported (as in ANOVA).
Clustering techniques have been applied to a wide variety of research problems. Hartigan (1975) provides an excellent summary of the many published studies reporting the results of cluster analyses [59]. For example, in the field of medicine, clustering diseases, cures for diseases, or symptoms of diseases can lead to very useful taxonomies.