automatically exploiting cross-invocation parallelism using runtime ...


dataspace.princeton.edu
13.07.2015

Algorithm 2: Pseudo-code for worker

    <depTid, depIterNum> ← consume()
    while depTid ≠ NO SYNC do
        while latestFinished[depTid] < depIterNum do
            sleep()
        <depTid, depIterNum> ← consume()
    doWork(depIterNum)
    latestFinished[getTid()] ← depIterNum

... for worker thread T2 to finish iteration I2 (wait until latestFinished[T2] ≥ I2). Worker thread T1 then consumes the (NO SYNC, I3) and begins execution of iteration I3.

Using this synchronization scheme, instead of stalling both threads to wait for the first invocation to finish, only thread T1 needs to synchronize, while thread T2 can move on to execute iterations from the second invocation.
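
The worker loop above can be tried out with a small Python sketch. It is only an illustration under stated assumptions, not the thesis implementation: the per-thread queues, the `DONE` end-of-stream marker, the outer loop over iterations, and the `log` list standing in for doWork are all hypothetical, and threads T1 and T2 are encoded as ids 0 and 1. The message streams are one reading of the running example: T1 receives a sync condition on (T2, I2) before (NO SYNC, I3).

```python
import queue
import threading
import time

NO_SYNC = -1   # sentinel thread id: no dependence to wait for
DONE = None    # hypothetical end-of-stream marker, not part of Algorithm 2


def worker(tid, inbox, latest_finished, log, lock):
    """Run iterations from inbox, honoring sync conditions first."""
    while True:
        msg = inbox.get()                      # consume()
        if msg is DONE:
            return
        dep_tid, dep_iter = msg
        # A sync condition names a thread and the iteration it must finish.
        while dep_tid != NO_SYNC:
            while latest_finished[dep_tid] < dep_iter:
                time.sleep(0.001)              # sleep()
            dep_tid, dep_iter = inbox.get()    # consume() the next message
        # (NO_SYNC, n): iteration n is now safe to execute.
        with lock:
            log.append((tid, dep_iter))        # stands in for doWork(n)
        latest_finished[tid] = dep_iter


def run_example():
    T1, T2 = 0, 1
    inboxes = [queue.Queue(), queue.Queue()]
    latest_finished = [0, 0]                   # 0 = nothing finished yet
    log, lock = [], threading.Lock()
    # Assumed message streams for the running example.
    for m in [(NO_SYNC, 1), (T2, 2), (NO_SYNC, 3), DONE]:
        inboxes[T1].put(m)
    for m in [(NO_SYNC, 2), (NO_SYNC, 4), DONE]:
        inboxes[T2].put(m)
    threads = [
        threading.Thread(
            target=worker, args=(t, inboxes[t], latest_finished, log, lock)
        )
        for t in (T1, T2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log, latest_finished


if __name__ == "__main__":
    print(run_example())
```

In this sketch only T1 ever waits, and only for I2; T2 never blocks, so it is free to run I4 from the second invocation as soon as I2 is done.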

Original                   Generated
Invoc.  Iter.  Access      Sched.  Combined Iter.  shadow
-       -      -           -       initialize      〈⊥, ⊥〉, 〈⊥, ⊥〉, 〈⊥, ⊥〉, 〈⊥, ⊥〉
1       1      A1          T1      I1              〈⊥, ⊥〉, 〈T1, I1〉, 〈⊥, ⊥〉, 〈⊥, ⊥〉
1       2      A3          T2      I2              〈⊥, ⊥〉, 〈T1, I1〉, 〈⊥, ⊥〉, 〈T2, I2〉
2       1      A3          T1      I3              〈⊥, ⊥〉, 〈T1, I1〉, 〈⊥, ⊥〉, 〈T1, I3〉
2       2      A2          T2      I4              〈⊥, ⊥〉, 〈T1, I1〉, 〈T2, I4〉, 〈T1, I3〉

(a)

[Figure panel (b): execution of the example]

Figure 3.5: Scheduler scheme running example: (a) Table showing original invocation/iteration, array element accessed in iteration, thread the iteration is scheduled to, combined iteration number, and helper data structure values (b) Execution of the example.
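
The shadow values in the table can be reproduced with a short Python sketch. The scheduler's own pseudo-code is not part of this excerpt, so everything here is an assumption inferred from the table: iterations are dealt to threads round-robin, element Ak maps to shadow index k (index 0 unused), each shadow entry records the last thread and combined iteration to touch the element, and a sync condition is emitted only when that last thread differs from the current one. The name `schedule` and the message format are hypothetical; T1 and T2 are encoded as ids 0 and 1.

```python
NO_SYNC = -1    # sentinel thread id: no dependence to wait for
BOTTOM = None   # stands in for the ⊥ entries in the table


def schedule(accesses, num_threads, num_elems):
    """Assign iterations to threads and derive sync conditions.

    accesses[i] is the array index touched by combined iteration i + 1.
    Returns per-thread message lists and the shadow array after each step.
    """
    shadow = [(BOTTOM, BOTTOM)] * num_elems
    messages = {t: [] for t in range(num_threads)}
    history = []
    for i, elem in enumerate(accesses):
        iter_num = i + 1
        tid = i % num_threads                  # assumed round-robin policy
        last_tid, last_iter = shadow[elem]
        # A different thread touched this element last: wait for it.
        if last_tid is not BOTTOM and last_tid != tid:
            messages[tid].append((last_tid, last_iter))
        messages[tid].append((NO_SYNC, iter_num))
        shadow[elem] = (tid, iter_num)         # record the latest accessor
        history.append(list(shadow))
    return messages, history


if __name__ == "__main__":
    # Accesses from the table: A1, A3, A3, A2.
    msgs, hist = schedule([1, 3, 3, 2], num_threads=2, num_elems=4)
    print(msgs)
    print(hist[-1])
```

Under these assumptions the only sync condition produced is the one for I3, which accesses A3 after T2 touched it in I2.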

