Chapter 4 Networks in Their Surrounding Contexts - Cornell University
4.4. TRACKING LINK FORMATION IN ON-LINE DATA
averaged all the curves they obtained. In particular, the observations in each snapshot were
one day apart, so their computation gives the average probability that two people form a
link per day, as a function of the number of common friends they have.
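As a rough illustration of this kind of estimate, the sketch below computes, for a single pair of snapshots, the fraction of unlinked pairs with k common friends that have formed a link by the second snapshot. The function names, the edge-list input format, and the toy data are assumptions made for illustration, not the procedure used in the study; in particular, people with no links in the first snapshot are ignored, and no averaging over multiple snapshot pairs is done.

```python
from collections import defaultdict
from itertools import combinations


def adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def link_formation_rate(first_edges, second_edges):
    """Return {k: fraction of pairs that were unlinked with k common friends
    in the first snapshot and are linked in the second snapshot}."""
    first = adjacency(first_edges)
    second = adjacency(second_edges)
    pairs = defaultdict(int)   # k -> number of unlinked pairs in the first snapshot
    formed = defaultdict(int)  # k -> how many of those pairs are linked later
    for u, v in combinations(sorted(first), 2):
        if v in first[u]:
            continue  # already linked, so not a candidate for link formation
        k = len(first[u] & first[v])  # common friends in the first snapshot
        pairs[k] += 1
        if v in second.get(u, ()):
            formed[k] += 1
    return {k: formed[k] / pairs[k] for k in sorted(pairs)}


# Toy snapshots (hypothetical data): two new links appear in the second one.
day1 = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("a", "e")]
day2 = day1 + [("a", "d"), ("b", "e")]
rates = link_formation_rate(day1, day2)
```

Repeating this over many consecutive snapshot pairs and averaging the resulting values for each k would give a curve T(k) of the kind discussed here.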
Figure 4.9 shows a plot of this curve (in the solid black line). The first thing one notices
is the clear evidence for triadic closure: $T(0)$ is very close to 0, after which the probability
of link formation increases steadily as the number of common friends increases. Moreover,
for much of the plot, this probability increases in a roughly linear fashion as a function
of the number of common friends, with an upward bend away from a straight-line shape.
The curve turns upward in a particularly pronounced way from 0 to 1 to 2 friends: having
two common friends produces significantly more than twice the effect on link formation
compared to having a single common friend. (The upward effect from 8 to 9 to 10 friends is
also significant, but it occurs on a much smaller sub-population, since many fewer people in
the data have this many friends in common without having already formed a link.)
To interpret this plot more deeply, it helps to compare it to an intentionally simplified
baseline model, describing what one might have expected the data to look like in the presence
of triadic closure. Suppose that for some small probability $p$, each common friend that two
people have gives them an independent probability $p$ of forming a link each day. So if two
people have $k$ friends in common, the probability they fail to form a link on any given day is
$(1 - p)^k$: this is because each common friend fails to cause the link to form with probability
$1 - p$, and these $k$ trials are independent. Since $(1 - p)^k$ is the probability the link fails
to form on a given day, the probability that it does form, according to our simple baseline
model, is
$$T_{\text{baseline}}(k) = 1 - (1 - p)^k.$$
We plot this curve in Figure 4.9 as the upper dotted line. Given the small absolute effect of
the first common friend in the data, we also show a comparison to the curve $1 - (1 - p)^{k-1}$,
which just shifts the simple baseline curve one unit to the right. Again, the point is not to
propose this baseline as an explanatory mechanism for triadic closure, but rather to look at
how the real data compares to it. Both the real curve and the baseline curve are close to
linear, and hence qualitatively similar; but the fact that the real data turns upward while the
baseline curve turns slightly downward indicates that the assumption of independent effects
from common friends is too simple to be fully supported by the data.
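The downward bend of the baseline can be checked numerically. The sketch below evaluates the baseline and its shifted version for k = 0, ..., 10 and confirms that each additional common friend adds a little less than the previous one. The value of p is a hypothetical choice for illustration only; it is not taken from the data.

```python
def baseline(k, p):
    """Probability that at least one of k independent common friends causes a link."""
    return 1 - (1 - p) ** k


def shifted_baseline(k, p):
    """Baseline shifted one unit to the right: 1 - (1 - p)^(k - 1), for k >= 1."""
    return 1 - (1 - p) ** (k - 1)


p = 0.001  # hypothetical per-friend, per-day probability (assumption)

values = [baseline(k, p) for k in range(11)]
# Going from k to k + 1 common friends adds p * (1 - p)^k to the probability.
increments = [b - a for a, b in zip(values, values[1:])]
# The increments shrink with k, so the baseline curve is concave.
shrinking = all(later < earlier
                for earlier, later in zip(increments, increments[1:]))

# Under the baseline, two common friends have 2 - p times the effect of one,
# which is slightly less than twice.
ratio = baseline(2, p) / baseline(1, p)
```

Under this model the ratio of the two-friend effect to the one-friend effect is 2 - p, just below 2, whereas the real data show significantly more than twice the effect. That mismatch is one concrete way to see why independent effects are too simple an assumption.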
A still larger and more detailed study of these effects was conducted by Leskovec et
al. [272], who analyzed properties of triadic closure in the on-line social networks of LinkedIn,
Flickr, Del.icio.us, and Yahoo! Answers. It remains an interesting question to understand
the similarities and variations in triadic closure effects across social interaction in
a range of different settings.