
Machine learning in complex networks


Muzeeker
• Wikipedia-based common sense
• Wikipedia used as a proxy for the music user's mental model
• Implementation: filter retrieval using Wikipedia's articles/categories
• Muzeeker.com
• Link prediction to complete the ontological quality of Wikipedia


Network models
• Nodes/vertices and links/edges
  – Directed / undirected
  – Weighted / un-weighted
• Link distributions
  – Random
  – Long tail
  – Hubs and authorities
• Link-induced correlations
  – The rich club
• Communities
  – Link prediction


Motivation for community detection
• Community structure may mark a non-stationary link distribution with "high and low density" sub-networks, hence summarizing with a single "model" could be misleading


Modularity can be predictive for dynamics

M.E.J. Newman and M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E 69, 026113 (2004).


Modularity objective function
The modularity is expressed as a sum over links, such that we penalize missing links in communities - missing is measured relative to a null distribution P⁰_ij.

Q = Σ_ij [ A_ij/(2m) − P_ij ] δ(c_i, c_j)

c_i is the community assignment of node i, 2m = Σ_ij A_ij, and k_i = Σ_j A_ij.

The null is a baseline distribution P_ij = k_i k_j / (2m)²

The value of the modularity lies in the range [−1, 1]. It is positive if the number of edges within groups exceeds the number expected on the basis of chance.

M.E.J. Newman and M. Girvan. Finding and evaluating community structure in networks. Physical Review E, 69:026113, 2004, cond-mat/0308217.
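As a sanity check, the objective can be sketched in a few lines of NumPy. This is an illustrative toy, not code from the slides; the two-clique example graph is an assumption:

```python
import numpy as np

def modularity(A, c):
    """Newman-Girvan modularity Q for an undirected, unweighted network.

    A : (n, n) symmetric adjacency matrix
    c : length-n array of community labels
    """
    k = A.sum(axis=1)                   # degrees, k_i = sum_j A_ij
    two_m = A.sum()                     # 2m = sum_ij A_ij
    P = np.outer(k, k) / two_m**2       # null model P_ij = k_i k_j / (2m)^2
    delta = (c[:, None] == c[None, :])  # delta(c_i, c_j)
    return ((A / two_m - P) * delta).sum()

# toy example: two 3-cliques joined by a single link
A = np.kron(np.eye(2), np.ones((3, 3))) - np.eye(6)
A[2, 3] = A[3, 2] = 1
c = np.array([0, 0, 0, 1, 1, 1])
print(round(modularity(A, c), 3))  # → 0.357
```

Splitting along the two cliques gives Q = 5/14 > 0, i.e. more within-group edges than the degree-matched null expects.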


Potts representation
Introduce 0/1 binary variables S_kj coding the community assignment: "node j is member of community k"

δ(c_i, c_j) = Σ_k S_ki S_kj

P(j, i) = A_ij / (2m)

Q = Σ_ij [ A_ij/(2m) − P_ij ] δ(c_i, c_j) = Σ_ij [ A_ij/(2m) − P_ij ] Σ_k S_ki S_kj

Q = (1/2m) Σ_ijk B_ij S_ki S_kj = Tr(S B S′) / (2m)
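A quick numerical check that the Potts trace form reproduces the δ-form of Q. The small random graph is an assumption made for the check, not part of the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
# random symmetric 0/1 adjacency matrix, no self-links
A = rng.integers(0, 2, (8, 8))
A = np.triu(A, 1)
A = A + A.T
k = A.sum(axis=1)
two_m = A.sum()
B = A - np.outer(k, k) / two_m      # modularity matrix B_ij = A_ij - k_i k_j/(2m)

c = rng.integers(0, 3, 8)           # arbitrary assignment into K = 3 communities
S = np.zeros((3, 8))
S[c, np.arange(8)] = 1              # Potts indicators: S_kj = 1 iff node j in community k

Q_delta = ((A / two_m - np.outer(k, k) / two_m**2)
           * (c[:, None] == c[None, :])).sum()
Q_trace = np.trace(S @ B @ S.T) / two_m
print(np.isclose(Q_delta, Q_trace))  # → True
```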


Spectral optimization
• Newman relaxes the optimization problem to the simplex

Q = (1/2m) Σ_ijk B_ij S_ki S_kj = Tr(S B S′) / (2m)

L = Tr(S B S′)/(2m) + Tr(Λ S)

B S = S Λ
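The relaxed problem is an eigenproblem for the modularity matrix B, so a two-way split can be read off the leading eigenvector. A minimal sketch of this spectral bisection, again on an assumed two-clique toy graph:

```python
import numpy as np

def spectral_bisection(A):
    """Two-way community split from the leading eigenvector of the
    modularity matrix B = A - k k^T / (2m) (Newman's spectral relaxation)."""
    k = A.sum(axis=1)
    two_m = A.sum()
    B = A - np.outer(k, k) / two_m
    w, V = np.linalg.eigh(B)
    leading = V[:, np.argmax(w)]      # eigenvector of the largest eigenvalue
    return (leading > 0).astype(int)  # its sign pattern gives the bisection

# two 3-cliques joined by a single link: the cliques get opposite labels
A = np.kron(np.eye(2), np.ones((3, 3))) - np.eye(6)
A[2, 3] = A[3, 2] = 1
c = spectral_bisection(A)
print(c)
```

The overall sign of the eigenvector is arbitrary, so only the grouping (not which label is 0 or 1) is meaningful.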


Combinatorial optimization
• We can use a physics analogy: Simulated Annealing (Kirkpatrick et al. 1983)

P(S | A, T) ∝ exp( Q(S)/T ) = exp( Tr(S B S′) / (2mT) )

• Gibbs sampling is a Monte Carlo realization of a Markov process in which each variable is randomly assigned according to its marginal distribution

P(S_j | S_−j, A, T) = P(S | A, T) / Σ_{S_j} P(S | A, T)

S. Geman, D. Geman, "Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images". IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (6): 721–741 (1984)


Potts model 1-node
• Discrete probability distribution on states k = 1, …, K

P(S | A, T) ∝ Π_k exp( S_k φ_k / T ),  with Σ_{k=1}^K S_k = 1

r_k = P(S_k = 1 | A, T) = exp(φ_k/T) / Σ_{k′} exp(φ_{k′}/T)


Gibbs sampling

φ_ki = Σ_j (B_ij/(2m)) S_kj = Σ_j (A_ij/(2m)) S_kj − Σ_j (k_i k_j/(2m)²) S_kj

r_ki = exp(φ_ki/T) / Σ_{k′} exp(φ_{k′i}/T)

S_i = potts(r_i)
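This update can be sketched as a naive Gibbs sweep over the nodes: compute the modularity fields φ_ki, softmax them at temperature T, and redraw the node's label. An illustrative toy implementation, not the authors' code; the temperature, sweep count, and example graph are assumptions:

```python
import numpy as np

def gibbs_modularity(A, K, T=0.05, sweeps=100, seed=1):
    """Gibbs sampling of community labels under P(S | A, T) ∝ exp(Q(S)/T).

    Each node i is redrawn from the softmax of its fields phi_ki;
    the symmetric double-count factor is absorbed into T."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    k = A.sum(axis=1)
    two_m = A.sum()
    c = rng.integers(0, K, n)                  # random initial assignment
    for _ in range(sweeps):
        for i in range(n):
            phi = np.empty(K)
            for kk in range(K):
                mem = (c == kk)
                mem[i] = False                 # node i's own term is constant in k
                # phi_ki = sum_j A_ij S_kj/(2m) - sum_j k_i k_j S_kj/(2m)^2
                phi[kk] = A[i, mem].sum() / two_m - k[i] * k[mem].sum() / two_m**2
            r = np.exp((phi - phi.max()) / T)  # r_ki ∝ exp(phi_ki / T)
            r /= r.sum()
            c[i] = rng.choice(K, p=r)          # S_i = potts(r_i)
    return c

# toy example: two 3-cliques joined by a single link
A = np.kron(np.eye(2), np.ones((3, 3))) - np.eye(6)
A[2, 3] = A[3, 2] = 1
labels = gibbs_modularity(A, K=2)
print(labels)
```

On this toy graph the sampler typically settles on the two cliques; at low T it behaves nearly greedily, at high T the labels stay random.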


Deterministic annealing
• Instead of drawing Gibbs samples according to the marginals we can average instead; this provides a set of self-consistent equations for the means (for 0/1 Bernoulli variables the mean is the probability μ_ki = P(S_ki))

r_ki = exp(φ_ki/T) / Σ_{k′} exp(φ_{k′i}/T)

φ_ki = Σ_j (B_ij/(2m)) r_kj = Σ_j (A_ij/(2m)) r_kj − Σ_j P_ij r_kj

S. Lehmann, L.K. Hansen: Deterministic modularity optimization. European Physical Journal B 60(1) 83-88 (2007).
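The self-consistent equations lend themselves to a compact fixed-point iteration over the mean assignments r. This is a sketch under assumed defaults (temperature, iteration count, toy graph), not the paper's implementation:

```python
import numpy as np

def mean_field_modularity(A, K, T=0.05, iters=200, seed=1):
    """Deterministic annealing: iterate the mean-field equations
    r_ki = softmax_k(phi_ki / T), phi_ki = sum_j B_ij r_kj / (2m)."""
    rng = np.random.default_rng(seed)
    k = A.sum(axis=1)
    two_m = A.sum()
    B = A - np.outer(k, k) / two_m            # B_ij = A_ij - k_i k_j/(2m)
    n = A.shape[0]
    r = rng.dirichlet(np.ones(K), size=n).T   # (K, n) soft assignments, columns sum to 1
    for _ in range(iters):
        phi = (r @ B) / two_m                 # phi_ki = sum_j B_ij r_kj / (2m)
        e = np.exp((phi - phi.max(axis=0)) / T)
        r = e / e.sum(axis=0)
    return r.argmax(axis=0)                   # hard assignment from the means

A = np.kron(np.eye(2), np.ones((3, 3))) - np.eye(6)
A[2, 3] = A[3, 2] = 1
out = mean_field_modularity(A, K=2)
print(out)
```

The uniform r is always a fixed point (B has zero row sums), so the random initialization is what breaks the symmetry.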


Experimental evaluation
• Create a simple testbed with link probability and "noise"

S. Lehmann, L.K. Hansen: Deterministic modularity optimization. European Physical Journal B 60(1) 83-88 (2007).


S. Lehmann, L.K. Hansen: Determ<strong>in</strong>istic modularity optimizationEuropean Physical Journal B 60(1) 83-88 (2007).


Generative community model (Hofman & Wiggins, 2008)

P(A | S, p, q) = p^c (1−p)^d q^e (1−q)^f

c = ½ Σ_{i, j≠i} Σ_k A_ij S_kj S_ki
d = ½ Σ_{i, j≠i} Σ_k (1 − A_ij) S_kj S_ki
e = ½ Σ_{i, j≠i} A_ij ( 1 − Σ_k S_kj S_ki )
f = ½ Σ_{i, j≠i} (1 − A_ij) ( 1 − Σ_k S_kj S_ki )
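The four counts (within-community links c, within non-links d, between links e, between non-links f) and the resulting likelihood can be written directly. A toy sketch; the example graph and the values of p and q are assumptions:

```python
import numpy as np

def log_likelihood(A, c, p, q):
    """log P(A | S, p, q) for the planted-partition model:
    within-community pairs link with prob. p, between-community with q."""
    same = (c[:, None] == c[None, :])        # delta(c_i, c_j)
    off = ~np.eye(len(c), dtype=bool)        # exclude self-pairs (j != i)
    half = 0.5                               # each unordered pair appears twice
    c_ = half * (A * same * off).sum()       # within-community links
    d_ = half * ((1 - A) * same * off).sum() # within-community non-links
    e_ = half * (A * ~same * off).sum()      # between-community links
    f_ = half * ((1 - A) * ~same * off).sum()# between-community non-links
    return (c_ * np.log(p) + d_ * np.log(1 - p)
            + e_ * np.log(q) + f_ * np.log(1 - q))

A = np.kron(np.eye(2), np.ones((3, 3))) - np.eye(6)
A[2, 3] = A[3, 2] = 1
c = np.array([0, 0, 0, 1, 1, 1])
print(round(log_likelihood(A, c, p=0.9, q=0.1), 3))  # → -3.778
```

For this graph the counts are c = 6, d = 0, e = 1, f = 8, so the value is 14 log 0.9 + log 0.1.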


Learning parameters of the generative model
• Hofman & Wiggins (2008)
  – "Variational Bayes"
  – Dirichlet/beta prior and posterior distributions for the probabilities
  – Very well determined (overkill)
  – Independent binomials for the assignment variables (misses correlation)
• Here
  – Maximum likelihood for the parameters
  – Gibbs sampling for the assignments

Jake M. Hofman and Chris H. Wiggins, Bayesian Approach to Network Modularity, Phys. Rev. Lett. 100, 258701 (2008).


The community detection threshold
How many links are needed to detect the structure?

SNR = p_in / q = p / ( (C−1) q )

Jörg Reichardt and Michele Leone, (Un)detectable Cluster Structure in Sparse Networks, Phys. Rev. Lett. 101, 078701 (2008).


Experimental design
• Planted solution
  – N = 1000 nodes
  – C_true = 5
  – Quality: mutual information between the planted assignments and the best identified
• Gibbs sampling
  – No annealing
  – Burn-in 200 iterations
  – Averaging 800 iterations
• Parameter learning
  – Q = 10 iterations
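The quality measure, mutual information between the planted and identified assignments, can be computed from the joint label histogram. A sketch in nats (the slides do not state the logarithm base); note it is invariant to permuting the labels:

```python
import numpy as np

def mutual_information(a, b):
    """Mutual information (in nats) between two label vectors:
    I = sum_xy p(x,y) log[ p(x,y) / (p(x) p(y)) ]."""
    I = 0.0
    for x in np.unique(a):
        for y in np.unique(b):
            pxy = np.mean((a == x) & (b == y))   # joint label frequency
            px, py = np.mean(a == x), np.mean(b == y)
            if pxy > 0:
                I += pxy * np.log(pxy / (px * py))
    return I

planted = np.array([0, 0, 0, 1, 1, 1])
perfect = np.array([1, 1, 1, 0, 0, 0])  # label permutation is still a perfect recovery
print(round(mutual_information(planted, perfect), 3))  # → 0.693 (= ln 2)
```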


Community detection – fully informed on number of communities and probabilities

[Figure: mutual information between the planted and identified communities vs. intra-community link probability p, four panels: (N = 1000, C = 10, SNR = 50), (N = 1000, C = 5, SNR = 50), (N = 1000, C = 5, SNR = 5), (N = 1000, C = 5, SNR = 10)]


Now what happens to the phase transition if we learn the parameters … with a too complex model (C > C_true = 5)?

[Figure: mutual information between the planted and identified communities vs. intra-community link probability p, panels (N = 1000, C = 10, SNR = 10) and (N = 1000, C = 10, SNR = 5); bar chart of membership counts across the 10 communities]


Conclusions
• Community detection can be formulated as an inference problem (Hofman & Wiggins, 2008)
• The sampling process for fixed SNR has a phase-transition-like detection threshold (Reichardt & Leone, 2008)
• The phase transition remains (sharpens?) if you learn the parameters of a generative model with unknown complexity
