Deliverable 5.2 - the School of Engineering and Design - Brunel ...
ICT Project Contract no.: 248420
3D VIVANT – Deliverable 5.2 Search & Retrieval Mechanisms & Tools

3.3 CODEBOOK GENERATION AND BAG-OF-WORDS

Since most of the well-performing descriptor extraction algorithms extract local features, many feature vectors correspond to a single multimedia object. Each multimedia object may have hundreds or thousands of local features, which makes them very difficult to use efficiently in the matching step. A common approach to this problem is the bag-of-words (BoW) method. With BoW, a codebook of the most dominant code words (quantized descriptor vectors) is generated. The codebook is generated by sampling the local features available in the database and clustering them with a k-means algorithm, with a typical k of 1000 to 5000 code words.

Then each set of local features is checked against the codebook to generate a histogram of occurrences of code words for each features file (which corresponds to one multimedia object). The final BoW histogram is in effect a new descriptor that represents the multimedia object with only one descriptor vector. These final BoW descriptor vectors are stored in the database under the “words” directories.

3.4 MANIFOLD LEARNING AND UNIFIED SEARCH SPACE

The next step after the generation of the bag-of-words is the creation of the unified search space, as described in detail in Deliverable D5.1.
For this, the manifold learning process starts and combines the different modalities (viewpoints, depth, curvature) to build a unified vector space. The final outcome of the process is a single descriptor vector per multimedia object (see Figure 10) that corresponds to the unified multimodal search space.

Figure 10: The manifold learning process produces a single multimodal descriptor per object

This final single descriptor vector per multimedia object is then indexed to prepare the database for similarity search queries.

4/03/2013
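
As an illustration of the codebook generation and histogram steps of Section 3.3, the following is a minimal Python sketch, not the project's actual implementation. The function names (`build_codebook`, `bow_histogram`, `nearest_word`), the plain Lloyd's k-means with Euclidean distance, the fixed iteration count, and the toy 2-D features with k = 2 are all assumptions made for brevity; a real codebook would be built from high-dimensional local descriptors with k in the thousands.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(feature, codebook):
    """Index of the code word (centroid) closest to `feature`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(feature, codebook[i]))


def build_codebook(samples, k, iterations=20, seed=0):
    """Cluster sampled local features with Lloyd's k-means.

    Returns a list of k centroids; each centroid is one code word.
    """
    rng = random.Random(seed)
    # Initialise the centroids with k distinct sampled features.
    codebook = [list(s) for s in rng.sample(samples, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for feature in samples:
            clusters[nearest_word(feature, codebook)].append(feature)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                codebook[i] = [sum(col) / len(members)
                               for col in zip(*members)]
    return codebook


def bow_histogram(features, codebook):
    """Count how often each code word is the nearest one to a local feature.

    The result has one bin per code word and serves as the single
    BoW descriptor vector of the object that `features` belongs to.
    """
    histogram = [0] * len(codebook)
    for feature in features:
        histogram[nearest_word(feature, codebook)] += 1
    return histogram


# Toy data: two "multimedia objects", each with its own set of local features.
rng = random.Random(1)
object_a = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(30)]
object_b = [(rng.gauss(5, 0.1), rng.gauss(5, 0.1)) for _ in range(40)]

# The codebook is built from features sampled across the whole database.
codebook = build_codebook(object_a + object_b, k=2)

# Each object is then reduced to one histogram over the code words.
print(bow_histogram(object_a, codebook))
print(bow_histogram(object_b, codebook))
```

Whatever the number of local features an object has, its histogram always has exactly k bins, which is what allows each multimedia object to be represented by one fixed-length descriptor vector.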