Human Detection in Video over Large Viewpoint Changes


memory requirements impractical. With the distance of two granules def<strong>in</strong>ed <strong>in</strong><br />

Sec. 3.1, two effective constra<strong>in</strong>ts are <strong>in</strong>troduced <strong>in</strong>to I 2 CF : 1)Motivated by [5],<br />

the first pair of granules <strong>in</strong> I 2 CF is constra<strong>in</strong>ed as d(g i 1, g j 1 ) ≤ T 1. 2) Consider<strong>in</strong>g<br />

of the consistency <strong>in</strong> one frame or two near video frames, we constra<strong>in</strong> that the<br />

second pair of granules <strong>in</strong> I 2 CF is <strong>in</strong> the neighborhood of the first pair as shown<br />

<strong>in</strong> Fig. 2 (d):<br />

d(g i 1, g i 2) ≤ T 2 , d(g j 1 , gj 2 ) ≤ T 2. (7)<br />

We set T 1 = 8, T 2 = 4 <strong>in</strong> our experiments.<br />
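To make the two constraints concrete, the following is a minimal sketch of a candidate filter, assuming each granule is represented by its centre coordinates so that the distance of Sec. 3.1 can be approximated by a Euclidean distance; the granule representation and the helper names are illustrative assumptions, not the paper's implementation.

```python
import math

T1, T2 = 8, 4  # thresholds used in the experiments


def granule_distance(ga, gb):
    # Stand-in for the granule distance of Sec. 3.1 (assumption):
    # Euclidean distance between granule centres (x, y).
    return math.hypot(ga[0] - gb[0], ga[1] - gb[1])


def valid_i2cf_cell(g_i1, g_j1, g_i2, g_j2):
    """Check both constraints on a candidate I2CF cell.

    1) first pair close together:            d(g_i1, g_j1) <= T1
    2) second pair in the neighborhood of
       the first pair (Eq. 7):               d(g_i1, g_i2) <= T2 and d(g_j1, g_j2) <= T2
    """
    return (granule_distance(g_i1, g_j1) <= T1
            and granule_distance(g_i1, g_i2) <= T2
            and granule_distance(g_j1, g_j2) <= T2)
```

Filtering candidate cells this way keeps the search space, and the memory it requires, tractable, which is the purpose of the two constraints.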

Table 1: Learning algorithm of I²CF.

Input: Sample set S = {(x_i, y_i) | 1 ≤ i ≤ m}, where y_i = ±1.
Initialize: Cell space (CS) with all possible cells and an empty I²CF.
Output: The learned I²CF.
Loop:
– Learn the first pair of granules as in [5]. Denote the best f pairs as a set F.
– Construct a new set CS′: in each cell of CS′, the first pair of granules is taken from F, the second pair of granules is generated according to Eq. 7, and its mode is A-mode, D-mode or C-mode. Calculate the Z value of the I²CF obtained by adding each cell of CS′.
– Select the cell with the lowest Z value, denoted c*. Add c* to I²CF.
– Refine I²CF by replacing one or two granules in it without changing the mode.

Heuristic learning of I²CF starts with an empty I²CF. At each step, the most discriminative cell is selected and added to I²CF. The discriminability of a weak feature is measured by its Z value, which reflects the classification power of the corresponding weak classifier as in [17]:

Z = 2 ∑_j √(W_+^j W_-^j),  (8)

where W_+^j is the weight of the positive samples that fall into the j-th bin and W_-^j is that of the negatives. The smaller the Z value, the more discriminative the weak feature. The learning algorithm of I²CF is summarized in Table 1. (See [5] and [16] for more details.)
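As an illustration of how Eq. 8 drives the greedy selection in Table 1, the sketch below computes Z values for pre-binned weak-feature responses and keeps the candidate with the smallest one; the array layout, bin count, and toy data are assumptions made only for this example.

```python
import numpy as np


def z_value(bin_idx, labels, weights, num_bins):
    """Z = 2 * sum_j sqrt(W+_j * W-_j), as in Eq. 8.

    bin_idx: bin index assigned to each sample by the weak feature
    labels:  +1 / -1 sample labels
    weights: current boosting weights of the samples
    """
    w_pos = np.bincount(bin_idx[labels > 0], weights=weights[labels > 0], minlength=num_bins)
    w_neg = np.bincount(bin_idx[labels < 0], weights=weights[labels < 0], minlength=num_bins)
    return 2.0 * np.sqrt(w_pos * w_neg).sum()


# Toy usage: among several candidate cells (pre-binned responses),
# select the one with the lowest Z value, i.e. the most discriminative.
rng = np.random.default_rng(0)
labels = rng.choice([-1, 1], size=200)
weights = np.full(200, 1.0 / 200)
candidates = [rng.integers(0, 8, size=200) for _ in range(5)]
best = min(range(len(candidates)),
           key=lambda k: z_value(candidates[k], labels, weights, num_bins=8))
```

The same scoring function would be reused after every refinement step, since replacing a granule changes the bin assignments and therefore the Z value of the cell.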

4 EMC-Boost

We propose EMC-Boost to automatically co-cluster the sample space and the discriminative features. A perceptual clustering problem is shown in Fig. 3 (a)-(c). EMC-Boost consists of three components: a Cascade Component (CC), a Mixed Component (MC) and a Separated Component (SC), which are combined to form EMC-Boost. In fact, SC is similar to MC-Boost [13], which is why our boosting algorithm is named EMC-Boost. In the following, we first formulate the three components explicitly, then present their learning algorithms, and finally summarize EMC-Boost at the end of this section.
