Tutorial on ranking methods in machine learning - Computer ...
Tutorial on ranking methods in machine learning - Computer ...
Tutorial on ranking methods in machine learning - Computer ...
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
Rank<strong>in</strong>g Methods <strong>in</strong> Mach<strong>in</strong>e Learn<strong>in</strong>g<br />
A <str<strong>on</strong>g>Tutorial</str<strong>on</strong>g> Introducti<strong>on</strong><br />
Shivani Agarwal<br />
<strong>Computer</strong> Science & Artificial Intelligence Laboratory<br />
Massachusetts Institute of Technology
Example 1: Recommendati<strong>on</strong> Systems
Example 2: Informati<strong>on</strong> Retrieval
Example 2: Informati<strong>on</strong> Retrieval<br />
<strong>in</strong>formati<strong>on</strong>
Example 2: Informati<strong>on</strong> Retrieval<br />
<strong>in</strong>formati<strong>on</strong>
Example 3: Drug Discovery<br />
Problem: Milli<strong>on</strong>s of structures <strong>in</strong> a chemical library.<br />
How do we identify the most promis<strong>in</strong>g <strong>on</strong>es?
Example 4: Bio<strong>in</strong>formatics<br />
Human genetics is now at a critical juncture.<br />
The molecular <strong>methods</strong> used successfully to<br />
identify the genes underly<strong>in</strong>g rare mendelian<br />
syndromes are fail<strong>in</strong>g to f<strong>in</strong>d the numerous<br />
genes caus<strong>in</strong>g more comm<strong>on</strong>, familial, n<strong>on</strong>mendelian<br />
diseases . . .<br />
Nature 405:847–856, 2000
Example 4: Bio<strong>in</strong>formatics<br />
With the human genome sequence near<strong>in</strong>g<br />
completi<strong>on</strong>, new opportunities are be<strong>in</strong>g<br />
presented for unravell<strong>in</strong>g the complex genetic<br />
basis of n<strong>on</strong>mendelian disorders based <strong>on</strong><br />
large-scale genomewide studies . . .<br />
Nature 405:847–856, 2000
Types of Rank<strong>in</strong>g Problems<br />
Instance Rank<strong>in</strong>g<br />
Label Rank<strong>in</strong>g<br />
Subset Rank<strong>in</strong>g<br />
Rank Aggregati<strong>on</strong><br />
?
Instance Rank<strong>in</strong>g<br />
> , 10<br />
doc1 doc2<br />
> , 20<br />
doc3 doc5<br />
…
doc1<br />
doc2<br />
…<br />
Label Rank<strong>in</strong>g<br />
sports > politics<br />
health > m<strong>on</strong>ey<br />
…<br />
science > sports<br />
m<strong>on</strong>ey > politics<br />
…
Subset Rank<strong>in</strong>g<br />
query 1 doc1 > doc2 , doc3 doc5<br />
…<br />
> …<br />
query 2 doc2 > doc4 , doc11 ><br />
doc3 …
…<br />
Rank Aggregati<strong>on</strong><br />
query 1 of search of search …<br />
query 2<br />
results<br />
eng<strong>in</strong>e 1<br />
results<br />
of search<br />
eng<strong>in</strong>e 1<br />
results<br />
eng<strong>in</strong>e 2<br />
results<br />
of search<br />
eng<strong>in</strong>e 2<br />
…<br />
desired<br />
<strong>rank<strong>in</strong>g</strong><br />
desired<br />
<strong>rank<strong>in</strong>g</strong>
Types of Rank<strong>in</strong>g Problems<br />
Instance Rank<strong>in</strong>g<br />
Label Rank<strong>in</strong>g<br />
Subset Rank<strong>in</strong>g<br />
Rank Aggregati<strong>on</strong><br />
?<br />
This tutorial
<str<strong>on</strong>g>Tutorial</str<strong>on</strong>g> Road Map<br />
Part I: Theory & Algorithms<br />
Bipartite Rank<strong>in</strong>g<br />
k-partite Rank<strong>in</strong>g<br />
Rank<strong>in</strong>g with Real-Valued Labels<br />
General Instance Rank<strong>in</strong>g<br />
Part II: Applicati<strong>on</strong>s<br />
Further Read<strong>in</strong>g & Resources<br />
RankSVM<br />
RankBoost<br />
RankNet<br />
Applicati<strong>on</strong>s to Bio<strong>in</strong>formatics<br />
Applicati<strong>on</strong>s to Drug Discovery<br />
Subset Rank<strong>in</strong>g and Applicati<strong>on</strong>s to Informati<strong>on</strong> Retrieval
Part I<br />
Theory & Algorithms<br />
[for Instance Rank<strong>in</strong>g]
Bipartite Rank<strong>in</strong>g<br />
Relevant (+) …<br />
doc1+ doc2+ doc3+<br />
Irrelevant (-) …<br />
doc1- doc2- doc3- doc4-
Bipartite Rank<strong>in</strong>g
Bipartite Rank<strong>in</strong>g
Is Bipartite Rank<strong>in</strong>g Different from<br />
B<strong>in</strong>ary Classificati<strong>on</strong>?
Is Bipartite Rank<strong>in</strong>g Different from<br />
B<strong>in</strong>ary Classificati<strong>on</strong>?
Is Bipartite Rank<strong>in</strong>g Different from<br />
B<strong>in</strong>ary Classificati<strong>on</strong>?
Is Bipartite Rank<strong>in</strong>g Different from<br />
B<strong>in</strong>ary Classificati<strong>on</strong>?
Is Bipartite Rank<strong>in</strong>g Different from<br />
B<strong>in</strong>ary Classificati<strong>on</strong>?
Is Bipartite Rank<strong>in</strong>g Different from<br />
B<strong>in</strong>ary Classificati<strong>on</strong>?
Bipartite Rank<strong>in</strong>g:<br />
Basic Algorithmic Framework
Bipartite RankSVM Algorithm<br />
[Herbrich et al, 2000; Joachims, 2002; Rakotomam<strong>on</strong>jy, 2004]
Bipartite RankSVM Algorithm
Bipartite RankBoost Algorithm<br />
[Freund et al, 2003]
Bipartite RankBoost Algorithm
Bipartite RankNet Algorithm<br />
[Burges et al, 2005]
k-partite Rank<strong>in</strong>g<br />
Rat<strong>in</strong>g k …<br />
doc1k doc2k doc3k …<br />
Rat<strong>in</strong>g 2 …<br />
doc12 doc22 doc32 doc42 Rat<strong>in</strong>g 1 …<br />
doc11 doc21 doc31 doc41
k-partite Rank<strong>in</strong>g
k-partite Rank<strong>in</strong>g:<br />
Basic Algorithmic Framework
Rank<strong>in</strong>g with Real-Valued Labels<br />
doc1<br />
doc2<br />
doc3<br />
…<br />
y1<br />
y2<br />
y3
Rank<strong>in</strong>g with Real-Valued Labels
Rank<strong>in</strong>g with Real-Valued Labels:<br />
Basic Algorithmic Framework
General Instance Rank<strong>in</strong>g<br />
><br />
doc1 doc1’<br />
, r1<br />
> , r2<br />
doc2 doc2’<br />
…
General Instance Rank<strong>in</strong>g<br />
r i
General Instance Rank<strong>in</strong>g
General Instance Rank<strong>in</strong>g:<br />
Basic Algorithmic Framework
General RankSVM Algorithm<br />
[Herbrich et al, 2000; Joachims, 2002]
General RankBoost Algorithm<br />
[Freund et al, 2003]
General RankNet Algorithm<br />
[Burges et al, 2005]
<str<strong>on</strong>g>Tutorial</str<strong>on</strong>g> Road Map<br />
Part I: Theory & Algorithms<br />
Bipartite Rank<strong>in</strong>g<br />
k-partite Rank<strong>in</strong>g<br />
Rank<strong>in</strong>g with Real-Valued Labels<br />
General Instance Rank<strong>in</strong>g<br />
Part II: Applicati<strong>on</strong>s<br />
Further Read<strong>in</strong>g & Resources<br />
RankSVM<br />
RankBoost<br />
RankNet<br />
Applicati<strong>on</strong>s to Bio<strong>in</strong>formatics<br />
Applicati<strong>on</strong>s to Drug Discovery<br />
Subset Rank<strong>in</strong>g and Applicati<strong>on</strong>s to Informati<strong>on</strong> Retrieval
Part II<br />
Applicati<strong>on</strong>s<br />
[and Subset Rank<strong>in</strong>g]
Applicati<strong>on</strong> to Bio<strong>in</strong>formatics<br />
Human genetics is now at a critical juncture.<br />
The molecular <strong>methods</strong> used successfully to<br />
identify the genes underly<strong>in</strong>g rare mendelian<br />
syndromes are fail<strong>in</strong>g to f<strong>in</strong>d the numerous<br />
genes caus<strong>in</strong>g more comm<strong>on</strong>, familial, n<strong>on</strong>mendelian<br />
diseases . . .<br />
Nature 405:847–856, 2000
Applicati<strong>on</strong> to Bio<strong>in</strong>formatics<br />
With the human genome sequence near<strong>in</strong>g<br />
completi<strong>on</strong>, new opportunities are be<strong>in</strong>g<br />
presented for unravell<strong>in</strong>g the complex genetic<br />
basis of n<strong>on</strong>mendelian disorders based <strong>on</strong><br />
large-scale genomewide studies . . .<br />
Nature 405:847–856, 2000
Identify<strong>in</strong>g Genes Relevant to a Disease<br />
Us<strong>in</strong>g Microarrray Gene Expressi<strong>on</strong> Data
Identify<strong>in</strong>g Genes Relevant to a Disease<br />
Us<strong>in</strong>g Microarrray Gene Expressi<strong>on</strong> Data
Identify<strong>in</strong>g Genes Relevant to a Disease<br />
Us<strong>in</strong>g Microarrray Gene Expressi<strong>on</strong> Data
Identify<strong>in</strong>g Genes Relevant to a Disease<br />
Us<strong>in</strong>g Microarrray Gene Expressi<strong>on</strong> Data
Identify<strong>in</strong>g Genes Relevant to a Disease<br />
Us<strong>in</strong>g Microarrray Gene Expressi<strong>on</strong> Data
Identify<strong>in</strong>g Genes Relevant to a Disease<br />
Us<strong>in</strong>g Microarrray Gene Expressi<strong>on</strong> Data
Formulati<strong>on</strong> as a Bipartite<br />
Relevant<br />
Not relevant<br />
Rank<strong>in</strong>g Problem<br />
…<br />
…
Microarray Gene Expressi<strong>on</strong> Data Sets<br />
[Golub et al, 1999; Al<strong>on</strong> et al, 1999]
Selecti<strong>on</strong> of Tra<strong>in</strong><strong>in</strong>g Genes
Top-Rank<strong>in</strong>g Genes for Leukemia<br />
Returned by RankBoost<br />
[Agarwal & Sengupta, 2009]
Biological Validati<strong>on</strong><br />
[Agarwal et al, 2010]
Applicati<strong>on</strong> to Drug Discovery<br />
Problem: Milli<strong>on</strong>s of structures <strong>in</strong> a chemical library.<br />
How do we identify the most promis<strong>in</strong>g <strong>on</strong>es?
Formulati<strong>on</strong> as a Rank<strong>in</strong>g Problem<br />
with Real-Valued Labels
Chem<strong>in</strong>formatics Data Sets<br />
[Sutherland et al, 2004]
DHFR Results Us<strong>in</strong>g RankSVM<br />
[Agarwal et al, 2010]
Applicati<strong>on</strong> to Informati<strong>on</strong> Retrieval (IR)
Learn<strong>in</strong>g to Rank <strong>in</strong> IR
Learn<strong>in</strong>g to Rank <strong>in</strong> IR
Learn<strong>in</strong>g to Rank <strong>in</strong> IR
Learn<strong>in</strong>g to Rank <strong>in</strong> IR
Learn<strong>in</strong>g to Rank <strong>in</strong> IR
Learn<strong>in</strong>g to Rank <strong>in</strong> IR
Learn<strong>in</strong>g to Rank <strong>in</strong> IR
Learn<strong>in</strong>g to Rank <strong>in</strong> IR
Learn<strong>in</strong>g to Rank <strong>in</strong> IR
Learn<strong>in</strong>g to Rank <strong>in</strong> IR
General Subset Rank<strong>in</strong>g<br />
query 1 doc1 > doc2 , doc3 doc5<br />
…<br />
> …<br />
query 2 doc2 > doc4 , doc11 ><br />
doc3 …
General Subset Rank<strong>in</strong>g
Subset Rank<strong>in</strong>g with Real-Valued<br />
Relevance Labels<br />
query 1 doc1 y1 , doc2 y2 , doc3 y3<br />
…<br />
query 2<br />
…<br />
doc1<br />
y1 , y2 , y3<br />
doc2 doc3<br />
…
Subset Rank<strong>in</strong>g with Real-Valued<br />
Relevance Labels
RankSVM Applied to IR/Subset Rank<strong>in</strong>g<br />
Standard RankSVM<br />
[Joachims, 2002]
RankSVM Applied to IR/Subset Rank<strong>in</strong>g<br />
RankSVM with Query Normalizati<strong>on</strong><br />
& Relevance Weight<strong>in</strong>g<br />
[Agarwal & Coll<strong>in</strong>s, 2010; also Cao et al, 2006]
Rank<strong>in</strong>g Performance Measures <strong>in</strong> IR<br />
Mean Average Precisi<strong>on</strong> (MAP)
Rank<strong>in</strong>g Performance Measures <strong>in</strong> IR<br />
Normalized Discounted Cumulative Ga<strong>in</strong> (NDCG)
Rank<strong>in</strong>g Algorithms for Optimiz<strong>in</strong>g<br />
MAP/NDCG
LETOR 3.0/OHSUMED Data Set<br />
[Liu et al, 2007]
OHSUMED Results – NDCG
OHSUMED Results – MAP
Further Read<strong>in</strong>g & Resources<br />
[Incomplete!]
Early Papers <strong>on</strong> Rank<strong>in</strong>g<br />
W. W. Cohen, R. E. Schapire, and Y. S<strong>in</strong>ger, Learn<strong>in</strong>g to order th<strong>in</strong>gs,<br />
Journal of Artificial Intelligence Research, 10:243–270, 1999.<br />
R. Herbrich, T. Graepel, and K. Obermayer, Large marg<strong>in</strong> rank boundaries<br />
for ord<strong>in</strong>al regressi<strong>on</strong>. Advances <strong>in</strong> Large Marg<strong>in</strong> Classifiers, 2000.<br />
T. Joachims, Optimiz<strong>in</strong>g search eng<strong>in</strong>es us<strong>in</strong>g clickthrough data, KDD 2002.<br />
Y. Freund, R. Iyer, R. E. Schapire, and Y. S<strong>in</strong>ger, An efficient boost<strong>in</strong>g<br />
algorithm for comb<strong>in</strong><strong>in</strong>g preferences. Journal of Mach<strong>in</strong>e Learn<strong>in</strong>g<br />
Research, 4:933–969, 2003.<br />
C.J.C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilt<strong>on</strong>,<br />
G. Hullender, Learn<strong>in</strong>g to rank us<strong>in</strong>g gradient descent, ICML 2005.
Generalizati<strong>on</strong> Bounds for Rank<strong>in</strong>g<br />
S. Agarwal, T. Graepel, R. Herbrich, S. Har-Peled, D. Roth, Generalizati<strong>on</strong><br />
bounds for the area under the ROC curve, Journal of Mach<strong>in</strong>e Learn<strong>in</strong>g<br />
Research, 6:393—425, 2005.<br />
S. Agarwal, P. Niyogi, Generalizati<strong>on</strong> bounds for <strong>rank<strong>in</strong>g</strong> algorithms via<br />
algorithmic stability, Journal of Mach<strong>in</strong>e Learn<strong>in</strong>g Research, 10:441—474,<br />
2009.<br />
C. Rud<strong>in</strong>, R. Schapire, Marg<strong>in</strong>-based <strong>rank<strong>in</strong>g</strong> and an equivalence between<br />
AdaBoost and RankBoost, Journal of Mach<strong>in</strong>e Learn<strong>in</strong>g Research,<br />
10: 2193—2232, 2009
Bio<strong>in</strong>formatics/Drug Discovery<br />
Applicati<strong>on</strong>s<br />
S. Agarwal and S. Sengupta, Rank<strong>in</strong>g genes by relevance to a disease,<br />
CSB 2009.<br />
S. Agarwal, D. Dugar, and S. Sengupta, Rank<strong>in</strong>g chemical structures for<br />
drug discovery: A new mach<strong>in</strong>e learn<strong>in</strong>g approach. Journal of Chemical<br />
Informati<strong>on</strong> and Model<strong>in</strong>g, DOI 10.1021/ci9003865, 2010.
Natural Language Process<strong>in</strong>g<br />
Other Applicati<strong>on</strong>s<br />
M. Coll<strong>in</strong>s and T. Koo, Discrim<strong>in</strong>ative re<strong>rank<strong>in</strong>g</strong> for natural language<br />
pars<strong>in</strong>g, Computati<strong>on</strong>al L<strong>in</strong>guistics, 31:25—69, 2005.<br />
Collaborative Filter<strong>in</strong>g<br />
M. Weimer, A. Karatzoglou, Q. V. Le, and A. Smola, CofiRank - Maximum<br />
marg<strong>in</strong> matrix factorizati<strong>on</strong> for collaborative <strong>rank<strong>in</strong>g</strong>, NIPS 2007.<br />
Manhole Event Predicti<strong>on</strong><br />
C. Rud<strong>in</strong>, R. Pass<strong>on</strong>neau, A. Radeva, H. Dutta, S. Ierome, and D. Isaac , A<br />
process for predict<strong>in</strong>g manhole events <strong>in</strong> Manhattan, Mach<strong>in</strong>e Learn<strong>in</strong>g,<br />
DOI 10.1007/s10994-009-5166, 2010.
IR Rank<strong>in</strong>g Algorithms<br />
Y. Cao, J. Xu, T.-Y. Liu, H. Li, Y. Hunag, and H.W. H<strong>on</strong>, Adapt<strong>in</strong>g <strong>rank<strong>in</strong>g</strong><br />
SVM to document retrieval, SIGIR 2006.<br />
C.J.C. Burges, R. Ragno, and Q.V. Le, Learn<strong>in</strong>g to rank with n<strong>on</strong>-smooth<br />
cost functi<strong>on</strong>s. NIPS 2006.<br />
J. Xu and H. Li, AdaRank: A boost<strong>in</strong>g algorithm for <strong>in</strong>formati<strong>on</strong> retrieval.<br />
SIGIR 2007.<br />
Y. Yue, T. F<strong>in</strong>ley, F. Radl<strong>in</strong>ski, and T. Joachims, A support vector method<br />
for optimiz<strong>in</strong>g average precisi<strong>on</strong>. SIGIR 2007.<br />
M. Taylor, J. Guiver, S. Roberts<strong>on</strong>, T. M<strong>in</strong>ka, Softrank: optimiz<strong>in</strong>g<br />
n<strong>on</strong>-smooth rank metrics. WSDM 2008.
IR Rank<strong>in</strong>g Algorithms<br />
S. Chakrabarti, R. Khanna, U. Sawant, and C. Bhattacharyya, Structured<br />
learn<strong>in</strong>g for n<strong>on</strong>smooth <strong>rank<strong>in</strong>g</strong> losses. KDD 2008.<br />
D. Cossock and T. Zhang, Statistical analysis of Bayes optimal subset<br />
<strong>rank<strong>in</strong>g</strong>, IEEE Transacti<strong>on</strong>s <strong>on</strong> Informati<strong>on</strong> Theory, 54:5140–5154, 2008.<br />
T. Q<strong>in</strong>, X.D. Zhang, M.F. Tsai, D.S. Wang, T.Y. Liu, and H. Li. Query-level<br />
loss functi<strong>on</strong>s for <strong>in</strong>formati<strong>on</strong> retrieval. Informati<strong>on</strong> Process<strong>in</strong>g and<br />
Management, 44:838–855, 2008.<br />
O. Chapelle and M. Wu, Gradient descent optimizati<strong>on</strong> of smoothed<br />
<strong>in</strong>formati<strong>on</strong> retrieval metrics. Informati<strong>on</strong> Retrieval (To appear), 2010.<br />
S. Agarwal and M. Coll<strong>in</strong>s, Maximum marg<strong>in</strong> <strong>rank<strong>in</strong>g</strong> algorithms for<br />
<strong>in</strong>formati<strong>on</strong> retrieval, ECIR 2010.
NIPS Workshop 2005<br />
Learn<strong>in</strong>g to Rank<br />
SIGIR Workshops 2007-2009<br />
Learn<strong>in</strong>g to Rank for Informati<strong>on</strong> Retrieval<br />
NIPS Workshop 2009<br />
Advances <strong>in</strong> Rank<strong>in</strong>g<br />
American Institute of Mathematics<br />
Workshop <strong>in</strong> Summer 2010<br />
The Mathematics of Rank<strong>in</strong>g
<str<strong>on</strong>g>Tutorial</str<strong>on</strong>g> Articles & Books<br />
Tie-Yan Liu, Learn<strong>in</strong>g to Rank for Informati<strong>on</strong> Retrieval,<br />
Foundati<strong>on</strong>s & Trends <strong>in</strong> Informati<strong>on</strong> Retrieval, 2009.<br />
Shivani Agarwal, A <str<strong>on</strong>g>Tutorial</str<strong>on</strong>g> Introducti<strong>on</strong> to Rank<strong>in</strong>g Methods<br />
<strong>in</strong> Mach<strong>in</strong>e Learn<strong>in</strong>g, In preparati<strong>on</strong>.<br />
Shivani Agarwal (Ed.), Advances <strong>in</strong> Rank<strong>in</strong>g Methods <strong>in</strong> Mach<strong>in</strong>e<br />
Learn<strong>in</strong>g, Spr<strong>in</strong>ger-Verlag, In preparati<strong>on</strong>.