Discriminative Learning of Local Image Descriptors
Discriminative Learning of<br />
Local Image Descriptors<br />
Authors: Matthew Brown, Gang Hua, Simon Winder<br />
Presenter: Fan Bin
Outline<br />
Author information<br />
Paper information<br />
Problem addressed and approach<br />
The proposed method<br />
Experiments<br />
Conclusions
About the authors<br />
Matthew Brown<br />
Postdoctoral Fellow at the Ecole Polytechnique Fédérale de Lausanne (EPFL)<br />
PhD in Computer Science (UBC, 2005)<br />
MEng in Electrical and Information Sciences (Cambridge, 2000)<br />
Known for his work on automatic 2D image stitching<br />
http://cvlab.epfl.ch/~brown/research/research.html<br />
Gang Hua<br />
Senior Researcher at Nokia Research Center, Hollywood<br />
Scientist at Microsoft Live Labs Research from 2006 to 2009<br />
PhD in Electrical and Computer Engineering (Northwestern University, 2006)<br />
M.S. and B.S. in Electrical Engineering (Xi’an Jiaotong University, 2002 and 1999)<br />
http://www.eecs.northwestern.edu/~ganghua/<br />
Simon Winder<br />
Senior Developer at Microsoft Research<br />
http://research.microsoft.com/en-us/people/swinder/
Paper information<br />
Venue<br />
PAMI 2010, to appear<br />
Related papers<br />
Learning Local Image Descriptors. S. Winder and M. Brown. (CVPR 2007)<br />
Discriminant Embedding for Local Image Descriptors. G. Hua, M. Brown and S. Winder. (ICCV 2007)<br />
Picking the Best DAISY. S. Winder, G. Hua and M. Brown. (CVPR 2009)
Abstract<br />
A realistic ground-truth dataset of matched patches based on multi-view stereo data<br />
A set of building blocks for constructing descriptors<br />
Parametric learning of local image descriptors<br />
Non-parametric learning of local image descriptors<br />
Dimensionality reduction<br />
Descriptors that exceed state-of-the-art performance at lower dimensionality
Problem addressed and approach<br />
On the one hand, although local descriptors receive great attention and see wide use in computer vision, most existing local descriptors are hand-designed feature transforms.<br />
On the other hand, while learning-based methods are widely used for high-level vision tasks, they are rarely applied in low-level vision.<br />
This paper proposes an automatic, learning-based approach to local descriptor design: from training samples, optimal non-parametric and parametric local descriptors are learned via linear discriminant analysis and Powell minimization, respectively.
The proposed method<br />
The Framework
G-block → T-block → S-block/E-block → N-block<br />
G-block: Gaussian smoothing.<br />
T-block: non-linear transformation applied to each sample in the smoothed patch (the “simple-cell” stage).<br />
S-block/E-block: spatial pooling of the T-block responses (the “complex-cell” operations). The S-block uses parameterized pooling regions; the E-block is non-parametric.<br />
N-block: SIFT-style normalization.
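The four stages can be sketched end-to-end as follows (assuming numpy/scipy; the filter, grid, and threshold choices are illustrative, not the paper's learned configuration):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def describe_patch(patch, sigma=1.8, n_orient=4, grid=4, r=0.2):
    """Illustrative G -> T -> S -> N pipeline producing one descriptor."""
    # G-block: Gaussian smoothing.
    g = gaussian_filter(patch.astype(float), sigma)

    # T-block (T1-style): angle-quantized gradients; each pixel's magnitude
    # is split linearly between the two adjacent orientation bins.
    gy, gx = np.gradient(g)
    mag = np.hypot(gx, gy)
    pos = np.mod(np.arctan2(gy, gx), 2 * np.pi) / (2 * np.pi) * n_orient
    lo = np.floor(pos).astype(int) % n_orient
    hi, frac = (lo + 1) % n_orient, pos - np.floor(pos)
    t = np.zeros(g.shape + (n_orient,))
    for k in range(n_orient):
        t[..., k] = mag * ((lo == k) * (1 - frac) + (hi == k) * frac)

    # S-block: sum the T responses over a grid x grid array of square cells.
    h, w = g.shape
    d = np.concatenate([
        t[i*h//grid:(i+1)*h//grid, j*w//grid:(j+1)*w//grid].sum(axis=(0, 1))
        for i in range(grid) for j in range(grid)])

    # N-block: SIFT-style normalize, clip, renormalize.
    d = d / (np.linalg.norm(d) + 1e-12)
    d = np.minimum(d, r)
    return d / (np.linalg.norm(d) + 1e-12)

desc = describe_patch(np.random.rand(64, 64))
print(desc.shape)   # (64,) = 4x4 pooling cells x 4 orientation bins
```

The paper's contribution is precisely that quantities fixed here by hand (sigma, the pooling layout, the clipping threshold) are instead learned from labeled patch pairs.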
(Figure: the T-block maps the smoothed input patch, a grid of samples I11 ... I44, to an output grid of the same layout with one k-length vector f11 ... f44 per sample.)
T-block variants<br />
T1: angle-quantized gradients. Each pixel's gradient magnitude is linearly assigned to the two adjacent orientation bins.<br />
T1a: 4 quantized directions; T1b: 8 quantized directions.<br />
T2: rectified gradients. The gradient vector is separated into positive and negative parts.<br />
T2a: { |∇x| − ∇x; |∇x| + ∇x; |∇y| − ∇y; |∇y| + ∇y }<br />
T2b: extends T2a with the same rectification of the 45°-rotated gradients, giving 8 elements.
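The T2a rectification can be read as splitting each gradient component into its positive and negative parts (up to a factor of 2, since |x| + x = 2·max(x, 0)); a small sketch:

```python
import numpy as np

def t2a(gx, gy):
    """T2a-style rectified gradients: split each gradient component into
    positive and negative parts, giving a 4-vector per pixel."""
    return np.stack([
        np.maximum(gx, 0),   # positive part of d/dx
        np.maximum(-gx, 0),  # negative part of d/dx
        np.maximum(gy, 0),   # positive part of d/dy
        np.maximum(-gy, 0),  # negative part of d/dy
    ], axis=-1)

gx = np.array([[1.0, -2.0]])
gy = np.array([[3.0, -0.5]])
print(t2a(gx, gy)[0, 0])  # [1. 0. 3. 0.]
print(t2a(gx, gy)[0, 1])  # [0. 2. 0. 0.5]
```

Exactly one of each positive/negative pair is nonzero per pixel, so spatial pooling of these maps preserves gradient sign information that plain magnitudes would lose.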
T3: steerable filters.<br />
T3g: 2nd order, 4 orientations; T3h: 4th order, 4 orientations;<br />
T3i: 2nd order, 8 orientations; T3j: 4th order, 8 orientations.<br />
T4: difference-of-Gaussians (DoG) responses.<br />
D_1 = I(\sigma_1) - I(\sigma_2), \quad D_2 = I(\sigma_3) - I(\sigma_2)<br />
T4: { |D_1| − D_1; |D_1| + D_1; |D_2| − D_2; |D_2| + D_2 }
S-block<br />
(Figure: parameterized spatial pooling configurations.)
E-block<br />
E1: PCA (Principal Component Analysis)<br />
E2: LPP (Locality Preserving Projections)<br />
E4: LDE (Local Discriminant Embedding)<br />
E6: GLDE (Generalized Local Discriminant Embedding)<br />
E3, E5, E7: orthogonal versions of E2, E4, E6
Learning Parametric Descriptors<br />
Parameters: the parameters of the G-, T-, S- and N-blocks<br />
Objective: maximize the area under the ROC curve (true positive rate vs. false positive rate)<br />
Optimization method: Powell's multidimensional direction-set method
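The optimization loop can be illustrated with SciPy's Powell method maximizing ROC AUC over a single hypothetical smoothing parameter on synthetic pairs (the paper tunes many block parameters this way; the data and parameterization here are stand-ins):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic pairs: matches are noisy copies of a signal, non-matches are
# independent signals (stand-ins for descriptor pairs).
base = rng.normal(size=(200, 32))
noise = rng.normal(scale=0.8, size=(100, 32))
pairs = [(base[i], base[i] + noise[i], 1) for i in range(100)]   # matches
pairs += [(base[i], base[i + 100], 0) for i in range(100)]       # non-matches

def auc(params):
    """ROC area for the pair distances produced by these parameters."""
    sigma = float(np.clip(abs(params[0]), 1e-3, 8.0))  # one "G-block" parameter
    d = np.array([np.linalg.norm(gaussian_filter1d(a, sigma)
                                 - gaussian_filter1d(b, sigma))
                  for a, b, _ in pairs])
    y = np.array([l for _, _, l in pairs])
    # Mann-Whitney form of AUC: probability that a match pair has a
    # smaller distance than a non-match pair.
    order = np.argsort(d)
    ranks = np.empty(len(d))
    ranks[order] = np.arange(len(d))
    n1, n0 = y.sum(), len(y) - y.sum()
    return (ranks[y == 0].sum() - n0 * (n0 - 1) / 2) / (n0 * n1)

# Powell's direction-set method needs no gradients, matching the paper's choice.
res = minimize(lambda p: -auc(p), x0=[1.0], method="Powell")
print(f"AUC at optimum: {-res.fun:.3f}")
```

Powell's method is a natural fit here because the AUC of a descriptor pipeline is not differentiable with respect to its block parameters.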
Learning Non-Parametric Descriptors (E-block)<br />
Input: S = { x_i = T(p_i), x_j = T(p_j), l_ij }, where l_ij = 1 for a match pair and l_ij = 0 for a non-match pair.<br />
Output: the optimized projections w.<br />
E2: minimize the distance between match pairs while keeping the overall variance of all vectors in the match-pair set as large as possible in the projected space:<br />
J_1(w) = \frac{\sum_{l_{ij}=1} \left[ (w^T x_i)^2 + (w^T x_j)^2 \right]}{\sum_{l_{ij}=1} \left( w^T (x_i - x_j) \right)^2}
Learning Non-Parametric Descriptors (E-block)<br />
E4: seek the embedding space in which distances between match pairs are minimized and distances between non-match pairs are maximized:<br />
J_2(w) = \frac{\sum_{l_{ij}=0} \left( w^T (x_i - x_j) \right)^2}{\sum_{l_{ij}=1} \left( w^T (x_i - x_j) \right)^2}<br />
E6: find projections that maximize the ratio of total data variance to the in-class variance of the match pairs (equivalently, minimize its inverse):<br />
J_3(w) = \frac{\sum_{x_i \in S} (w^T x_i)^2}{\sum_{l_{ij}=1} \left( w^T (x_i - x_j) \right)^2}
Learning Non-Parametric Descriptors (E-block)<br />
Each objective is a generalized Rayleigh quotient:<br />
J_i(w) = \frac{w^T A_i w}{w^T B w}, \quad i = 1, 2, 3<br />
with<br />
A_1 = \sum_{l_{ij}=1} \left( x_i x_i^T + x_j x_j^T \right),<br />
A_2 = \sum_{l_{ij}=0} (x_i - x_j)(x_i - x_j)^T,<br />
A_3 = \sum_{x_i \in S} x_i x_i^T,<br />
B = \sum_{l_{ij}=1} (x_i - x_j)(x_i - x_j)^T.<br />
The optimal projections are the top generalized eigenvectors of A_i w = \lambda B w.
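A minimal sketch of solving such a Rayleigh-quotient objective (here LDE-style scatter matrices built from synthetic match/non-match pairs; SciPy's `eigh` handles the generalized eigenproblem directly):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
dim, n = 10, 500

# Synthetic match pairs: x_j is x_i plus noise that grows with coordinate
# index, so a discriminative projection should favor low-noise coordinates.
xi = rng.normal(size=(n, dim))
xj_match = xi + rng.normal(size=(n, dim)) * np.linspace(0.1, 2.0, dim)
xj_non = rng.normal(size=(n, dim))              # non-matches: independent draws

# LDE-style scatter matrices for the quotient w^T A w / w^T B w:
A = sum(np.outer(d, d) for d in xi - xj_non)    # non-match differences
B = sum(np.outer(d, d) for d in xi - xj_match)  # match differences

# eigh solves the generalized problem A w = lambda * B w; eigenvalues come
# back in ascending order, so the last column maximizes w^T A w / w^T B w.
lam, W = eigh(A, B)
w_best = W[:, -1]
print(f"largest generalized eigenvalue: {lam[-1]:.2f}")
```

Taking the top few eigenvectors instead of just one yields a low-dimensional discriminative embedding of the pooled descriptor.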
Learning Non-Parametric Descriptors (E-block)<br />
Orthogonality constraint on the projections: given previously found w_1, w_2, \ldots, w_{k-1},<br />
w_k = \arg\max_w \frac{w^T A_i w}{w^T B w} \quad \text{s.t.} \quad w^T w_j = 0, \; j = 1, 2, \ldots, k-1.
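One standard way to enforce this constraint, not necessarily the authors' exact solver, is to re-solve the reduced eigenproblem in the null space of the projections found so far:

```python
import numpy as np
from scipy.linalg import eigh, null_space

def orthogonal_projections(A, B, k):
    """Greedy maximization of w^T A w / w^T B w, with each new projection
    constrained to be orthogonal to all previously found ones."""
    W = []
    for _ in range(k):
        # Orthonormal basis Z of the subspace orthogonal to the found w's.
        Z = null_space(np.array(W)) if W else np.eye(A.shape[0])
        # Solve the generalized eigenproblem restricted to that subspace.
        lam, V = eigh(Z.T @ A @ Z, Z.T @ B @ Z)
        W.append(Z @ V[:, -1])   # lift the top eigenvector back to R^d
    return np.array(W)

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 6))
A = X.T @ X                      # stand-in "numerator" scatter matrix
B = np.eye(6) + 0.1 * A          # stand-in positive-definite denominator
W = orthogonal_projections(A, B, 3)
print(np.round(W @ W.T, 6))      # off-diagonal entries are 0: orthogonal
```

Without the constraint, generalized eigenvectors are B-orthogonal but generally not Euclidean-orthogonal, which is why the constrained formulation above is needed.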
Experiments<br />
Data: about 2.5 million labeled match and non-match pairs, drawn from 3D reconstructions of three real scenes: Yosemite, Notre Dame and Liberty.<br />
Evaluation<br />
ROC curves: correct-match fraction vs. incorrect-match fraction<br />
95% error rate: the fraction of incorrect matches accepted at the threshold where 95% of the correct matches are found
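The 95% error-rate metric can be sketched directly from pair distances (computed here as the false positive rate at the threshold giving 95% true positive rate; `dists` and `labels` are synthetic stand-ins):

```python
import numpy as np

def error_at_95(dists, labels):
    """False positive rate at the distance threshold where 95% of the
    correct matches are accepted (labels: 1 = match, 0 = non-match)."""
    dists, labels = np.asarray(dists, float), np.asarray(labels)
    thr = np.quantile(dists[labels == 1], 0.95)   # passes 95% of matches
    accepted = dists <= thr
    return (accepted & (labels == 0)).sum() / (labels == 0).sum()

rng = np.random.default_rng(3)
dists = np.concatenate([rng.normal(1.0, 0.3, 1000),    # match distances
                        rng.normal(2.0, 0.5, 1000)])   # non-match distances
labels = np.concatenate([np.ones(1000), np.zeros(1000)]).astype(int)
print(f"95% error rate: {error_at_95(dists, labels):.3f}")
```

A single operating point like this makes it easy to compare many descriptor variants in one table, whereas full ROC curves are better for inspecting individual descriptors.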
Parametric Descriptors
Parametric Descriptors<br />
Findings:<br />
1. The learned pooling arrangements have a concave shape.<br />
2. The farther from the center, the larger the summation region.<br />
3. Performance exceeds SIFT, but the dimensionality is higher.
Non-Parametric Descriptors<br />
Trained on Yosemite, tested on Notre Dame
Dimension-Reduced Parametric Descriptors
Effects of Normalization<br />
Clipping threshold parameterized as r/sqrt(D), with D the descriptor dimension
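A sketch of the SIFT-style N-block with the clipping threshold written as r/sqrt(D) (the value of r here is illustrative, not the learned one):

```python
import numpy as np

def n_block(v, r=1.6):
    """Normalize to unit length, clip each element at r / sqrt(D), then
    renormalize (r = 1.6 is an illustrative value, not the learned one)."""
    v = np.asarray(v, float)
    v = v / (np.linalg.norm(v) + 1e-12)
    v = np.minimum(v, r / np.sqrt(v.size))
    return v / (np.linalg.norm(v) + 1e-12)

d = n_block(np.array([10.0, 1.0, 1.0, 1.0]))
print(np.round(d, 3))   # the dominant element is damped relative to the rest
```

Expressing the threshold as r/sqrt(D) makes the same r value comparable across descriptors of different dimensionality, since a unit-norm D-vector with equal entries has elements of size 1/sqrt(D).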
Conclusions<br />
The techniques have been used in Photosynth and ICE (Image Compositing Editor)<br />
Photosynth: www.photosynth.com<br />
ICE: http://research.microsoft.com/ivm/ice.html
Conclusions<br />
Recommendations by the authors<br />
1. Learn parameters from training data.<br />
2. Use foveated summation regions.<br />
3. Use non-linear filter responses.<br />
4. Use LDA for discriminative dimensionality reduction.<br />
5. Normalize the descriptor.
Thanks!<br />
Questions?