Human Detection in Video over Large Viewpoint Changes

G. Duan, H. Ai, and S. Lao

Fig. 1: Human detection in video over large viewpoint changes. Samples of three typical viewpoints and corresponding scenes are given.

a large variation in human appearance and motion over wide viewpoint changes? 3) How to deal with changes in video frame rate or abrupt motion when using motion features on several consecutive frames? Viola and Jones [1] first made use of appearance and motion information in object detection, training AdaBoosted classifiers with Haar features on two consecutive frames. Later, Jones and Snow [2] extended this work by proposing an appearance filter, a difference filter and a shifted difference filter on 10 consecutive frames, and by using several predefined categories of samples. The approaches in [1] [2] can solve the 1st problem, but still face the challenge of the 3rd problem. The approach in [2] can handle the 2nd problem to some extent, but its application is limited because even a human sometimes cannot tell which predefined category a moving object belongs to, while the approach in [1] trains detectors by mixing all positives together. Dalal et al. [3] combined HOG descriptors with some motion-based descriptors to detect humans with possibly moving cameras and backgrounds. Wojek et al. [4] proposed to combine multiple complementary feature types and incorporate motion information for human detection, which coped well with moving cameras and cluttered backgrounds and achieved promising results on humans with a common frontal viewpoint. In this paper, our aim is to design a novel feature that takes advantage of both appearance and motion information, and to propose an efficient learning algorithm that learns a practical detector with a reasonable structure even when the samples are tremendously diverse, so that the difficulties mentioned above are handled in one framework.
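
To make the motion cues discussed above concrete, the following is a minimal sketch of a frame difference and a shifted frame difference computed on two consecutive grayscale frames. It is an illustration only, not the exact filters of [1] or [2] and not the feature proposed in this paper; the function names, the border-replication rule and the toy frames are assumptions introduced here.

```python
from typing import List

Frame = List[List[int]]  # grayscale image stored as rows of pixel intensities


def abs_difference(prev: Frame, curr: Frame) -> Frame:
    """Per-pixel absolute difference between two equally sized frames."""
    return [[abs(c - p) for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def shift(frame: Frame, dx: int, dy: int) -> Frame:
    """Shift a frame by (dx, dy) pixels, replicating border pixels (assumed rule)."""
    h, w = len(frame), len(frame[0])
    return [[frame[min(max(y - dy, 0), h - 1)][min(max(x - dx, 0), w - 1)]
             for x in range(w)]
            for y in range(h)]


def shifted_difference(prev: Frame, curr: Frame, dx: int, dy: int) -> Frame:
    """Absolute difference between the current frame and a shifted previous frame."""
    return abs_difference(shift(prev, dx, dy), curr)


def region_sum(img: Frame, top: int, left: int, height: int, width: int) -> int:
    """Sum of the values inside a rectangle, the basic unit of a rectangle filter."""
    return sum(sum(row[left:left + width]) for row in img[top:top + height])


# Toy frames: a bright 1x2 blob moves one pixel to the right.
prev_frame = [[0, 9, 9, 0],
              [0, 0, 0, 0]]
curr_frame = [[0, 0, 9, 9],
              [0, 0, 0, 0]]

plain = abs_difference(prev_frame, curr_frame)
right = shifted_difference(prev_frame, curr_frame, dx=1, dy=0)
left = shifted_difference(prev_frame, curr_frame, dx=-1, dy=0)

# The shift direction that matches the true motion gives the smallest response.
responses = {
    "plain": region_sum(plain, 0, 0, 2, 4),
    "right": region_sum(right, 0, 0, 2, 4),
    "left": region_sum(left, 0, 0, 2, 4),
}
print(responses)
```

In this toy case the response is smallest for the shift that matches the blob's motion, which is the intuition behind direction-sensitive difference filters. It also hints at the 3rd problem: if the frame rate changes, the same motion produces a different pixel displacement between frames, so a fixed shift no longer matches.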

The rest of this paper is organized as follows. Related work is introduced in Sec. 2. The proposed feature (I²CF), the co-cluster algorithm (EMC-Boost) and the sampling strategy (MVS) are given in Sec. 3, Sec. 4 and Sec. 5 respectively, and they are integrated to handle human detection in video in Sec. 6. Experiments and conclusions are given in Sec. 7 and the last section respectively.

2 Related work<br />

In the literature, human detection in video can be divided roughly into four categories. 1) Detection in static images, as in [5] [6] [7]. APCF [5], HOG [6] and Edgelet [7] are defined on appearance only. APCF compares colors or gradient orientations of two squares in images that can describe the invariance of color
