Tasking in OpenMP 3.0
Tasking in OpenMP 3.0 Tasking in OpenMP 3.0
Tasking in OpenMP Alejandro Duran Barcelona Supercomputing Center
- Page 2 and 3: Outline 1 Why task parallelism? 2 T
- Page 4 and 5: Why task parallelism? Why task para
- Page 6 and 7: Task parallelism Why task paralleli
- Page 8 and 9: Outline 1 Why task parallelism? 2 T
- Page 10 and 11: Task directive The OpenMP tasking m
- Page 12 and 13: List traversal with tasks Example T
- Page 14 and 15: Task data scoping Data scoping clau
- Page 16 and 17: Task data scoping In practice... Ex
- Page 18 and 19: Task data scoping In practice... Ex
- Page 20 and 21: Task data scoping In practice... Ex
- Page 22 and 23: Task data scoping In practice... Ex
- Page 24 and 25: List traversal Example The OpenMP t
- Page 26 and 27: Task synchronization The OpenMP tas
- Page 28 and 29: Outline 1 Why task parallelism? 2 T
- Page 30 and 31: List traversal Example L i s t l #p
- Page 32 and 33: List traversal Single traversal Exa
- Page 34 and 35: Task scheduling How it works? The O
- Page 36 and 37: Outline 1 Why task parallelism? Pit
- Page 38 and 39: Search problem Example Pitfalls & P
- Page 40 and 41: Search problem Example Pitfalls & P
- Page 42 and 43: Pitfalls & Performance issues Pitfa
- Page 44 and 45: Search problem Example Pitfalls & P
- Page 46 and 47: Pitfall #2 Out-of-scope data Proble
- Page 48 and 49: Search problem Example Pitfalls & P
- Page 50 and 51: Pitfalls & Performance issues Reduc
<strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong><br />
Alejandro Duran<br />
Barcelona Supercomput<strong>in</strong>g Center
Outl<strong>in</strong>e<br />
1 Why task parallelism?<br />
2 The <strong>OpenMP</strong> task<strong>in</strong>g model<br />
Creat<strong>in</strong>g tasks<br />
Data scop<strong>in</strong>g<br />
Syncroniz<strong>in</strong>g tasks<br />
Execution model<br />
Outl<strong>in</strong>e<br />
3 Pitfalls & Performance issues<br />
4 Conclusions<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 2 / 48
Outl<strong>in</strong>e<br />
1 Why task parallelism?<br />
Why task parallelism?<br />
2 The <strong>OpenMP</strong> task<strong>in</strong>g model<br />
Creat<strong>in</strong>g tasks<br />
Data scop<strong>in</strong>g<br />
Syncroniz<strong>in</strong>g tasks<br />
Execution model<br />
3 Pitfalls & Performance issues<br />
4 Conclusions<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 3 / 48
Why task parallelism?<br />
Why task parallelism?<br />
List traversal<br />
Example<br />
void t r a v e r s e _ l i s t ( L i s t l )<br />
{<br />
Element e ;<br />
#pragma omp p a r a l l e l private ( e )<br />
for ( e = l −> f i r s t ; e ; e = e−>next )<br />
#pragma omp s<strong>in</strong>gle nowait<br />
process ( e ) ;<br />
}<br />
Without tasks<br />
Ackward<br />
Very poor<br />
performance<br />
Not composable<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 4 / 48
Why task parallelism?<br />
Why task parallelism?<br />
Tree traversal<br />
Example<br />
void t r a v e r s e ( Tree ∗ t r e e )<br />
{<br />
#pragma omp p a r a l l e l sections<br />
{<br />
#pragma omp section<br />
i f ( tree −> l e f t )<br />
t r a v e r s e ( tree −> l e f t ) ;<br />
#pragma omp section<br />
i f ( tree −>r i g h t )<br />
t r a v e r s e ( tree −>r i g h t ) ;<br />
}<br />
process ( t r e e ) ;<br />
}<br />
Without tasks<br />
Too many parallel regions<br />
Extra overheads<br />
Extra synchronizations<br />
Not always well<br />
supported<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 5 / 48
Task parallelism<br />
Why task parallelism?<br />
Better solution for those problems<br />
Ma<strong>in</strong> addition to <strong>OpenMP</strong> <strong>3.0</strong> a<br />
Allows to parallelize irregular problems<br />
unbounded loops<br />
recursive algorithms<br />
producer/consumer schemes<br />
...<br />
a Ayguadé et al., The Design of <strong>OpenMP</strong> Tasks, IEEE TPDS March 2009<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 6 / 48
Outl<strong>in</strong>e<br />
1 Why task parallelism?<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model<br />
2 The <strong>OpenMP</strong> task<strong>in</strong>g model<br />
Creat<strong>in</strong>g tasks<br />
Data scop<strong>in</strong>g<br />
Syncroniz<strong>in</strong>g tasks<br />
Execution model<br />
3 Pitfalls & Performance issues<br />
4 Conclusions<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 7 / 48
Outl<strong>in</strong>e<br />
1 Why task parallelism?<br />
2 The <strong>OpenMP</strong> task<strong>in</strong>g model<br />
Creat<strong>in</strong>g tasks<br />
Data scop<strong>in</strong>g<br />
Syncroniz<strong>in</strong>g tasks<br />
Execution model<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Creat<strong>in</strong>g tasks<br />
3 Pitfalls & Performance issues<br />
4 Conclusions<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 8 / 48
The <strong>OpenMP</strong> task<strong>in</strong>g model Creat<strong>in</strong>g tasks<br />
What is an <strong>OpenMP</strong> task?<br />
Tasks are work units which execution may be deferred<br />
they can also be executed immediately!<br />
Tasks are composed of:<br />
code to execute<br />
data environment<br />
Initialized at creation time<br />
<strong>in</strong>ternal control variables (ICVs)<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 9 / 48
Task directive<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Creat<strong>in</strong>g tasks<br />
#pragma omp task [ clauses ]<br />
s t r u c t u r e d block<br />
Each encounter<strong>in</strong>g thread creates a task<br />
Packages code and data environment<br />
Highly composable. Can be nested<br />
<strong>in</strong>side parallel regions<br />
<strong>in</strong>side other tasks<br />
<strong>in</strong>side workshar<strong>in</strong>gs<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 10 / 48
List traversal<br />
with tasks<br />
Example<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Creat<strong>in</strong>g tasks<br />
void t r a v e r s e _ l i s t ( L i s t l )<br />
{<br />
Element e ;<br />
for ( e = l −> f i r s t ; e ; e = e−>next )<br />
#pragma omp task<br />
process ( e ) ;<br />
}<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 11 / 48
List traversal<br />
with tasks<br />
Example<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Creat<strong>in</strong>g tasks<br />
void<br />
{<br />
t r a v e r s e _ l i s t ( L i s t l )<br />
Element e ;<br />
for ( e = l −> f i r s t ; e ; e = e−>next )<br />
#pragma omp task<br />
process ( e ) ;<br />
}<br />
What is the scope of e?<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 11 / 48
Outl<strong>in</strong>e<br />
1 Why task parallelism?<br />
2 The <strong>OpenMP</strong> task<strong>in</strong>g model<br />
Creat<strong>in</strong>g tasks<br />
Data scop<strong>in</strong>g<br />
Syncroniz<strong>in</strong>g tasks<br />
Execution model<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Data scop<strong>in</strong>g<br />
3 Pitfalls & Performance issues<br />
4 Conclusions<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 12 / 48
Task data scop<strong>in</strong>g<br />
Data scop<strong>in</strong>g clauses<br />
shared(list)<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Data scop<strong>in</strong>g<br />
private(list)<br />
firstprivate(list)<br />
data is captured at creation<br />
default(shared|none)<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 13 / 48
Task data scop<strong>in</strong>g<br />
When there are no clauses ...<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Data scop<strong>in</strong>g<br />
If no clause<br />
Implicit rules apply<br />
e.g., global variables are shared<br />
Otherwise...<br />
firstprivate<br />
shared attributed is lexically <strong>in</strong>herited<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 14 / 48
Task data scop<strong>in</strong>g<br />
In practice...<br />
Example<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Data scop<strong>in</strong>g<br />
i n t a ;<br />
void foo ( ) {<br />
i n t b , c ;<br />
#pragma omp p a r a l l e l shared ( b )<br />
#pragma omp p a r a l l e l private ( b )<br />
{<br />
i n t d ;<br />
#pragma omp task<br />
{<br />
i n t e ;<br />
} } }<br />
a =<br />
b =<br />
c =<br />
d =<br />
e =<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 15 / 48
Task data scop<strong>in</strong>g<br />
In practice...<br />
Example<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Data scop<strong>in</strong>g<br />
i n t a ;<br />
void foo ( ) {<br />
i n t b , c ;<br />
#pragma omp p a r a l l e l shared ( b )<br />
#pragma omp p a r a l l e l private ( b )<br />
{<br />
i n t d ;<br />
#pragma omp task<br />
{<br />
i n t e ;<br />
} } }<br />
a = shared<br />
b =<br />
c =<br />
d =<br />
e =<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 15 / 48
Task data scop<strong>in</strong>g<br />
In practice...<br />
Example<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Data scop<strong>in</strong>g<br />
i n t a ;<br />
void foo ( ) {<br />
i n t b , c ;<br />
#pragma omp p a r a l l e l shared ( b )<br />
#pragma omp p a r a l l e l private ( b )<br />
{<br />
i n t d ;<br />
#pragma omp task<br />
{<br />
i n t e ;<br />
} } }<br />
a = shared<br />
b = firstprivate<br />
c =<br />
d =<br />
e =<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 15 / 48
Task data scop<strong>in</strong>g<br />
In practice...<br />
Example<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Data scop<strong>in</strong>g<br />
i n t a ;<br />
void foo ( ) {<br />
i n t b , c ;<br />
#pragma omp p a r a l l e l shared ( b )<br />
#pragma omp p a r a l l e l private ( b )<br />
{<br />
i n t d ;<br />
#pragma omp task<br />
{<br />
i n t e ;<br />
} } }<br />
a = shared<br />
b = firstprivate<br />
c = shared<br />
d =<br />
e =<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 15 / 48
Task data scop<strong>in</strong>g<br />
In practice...<br />
Example<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Data scop<strong>in</strong>g<br />
i n t a ;<br />
void foo ( ) {<br />
i n t b , c ;<br />
#pragma omp p a r a l l e l shared ( b )<br />
#pragma omp p a r a l l e l private ( b )<br />
{<br />
i n t d ;<br />
#pragma omp task<br />
{<br />
i n t e ;<br />
} } }<br />
a = shared<br />
b = firstprivate<br />
c = shared<br />
d = firstprivate<br />
e =<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 15 / 48
Task data scop<strong>in</strong>g<br />
In practice...<br />
Example<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Data scop<strong>in</strong>g<br />
i n t a ;<br />
void foo ( ) {<br />
i n t b , c ;<br />
#pragma omp p a r a l l e l shared ( b )<br />
#pragma omp p a r a l l e l private ( b )<br />
{<br />
i n t d ;<br />
#pragma omp task<br />
{<br />
i n t e ;<br />
} } }<br />
a = shared<br />
b = firstprivate<br />
c = shared<br />
d = firstprivate<br />
e = private<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 15 / 48
Task data scop<strong>in</strong>g<br />
In practice...<br />
Example<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Data scop<strong>in</strong>g<br />
i n t a ;<br />
void foo ( ) {<br />
i n t b , c ;<br />
#pragma omp p a r a l l e l shared ( b )<br />
#pragma omp p a r a l l e l private ( b )<br />
{<br />
i n t d ;<br />
#pragma omp task<br />
{<br />
i n t e ;<br />
} } }<br />
a = shared<br />
b = firstprivate<br />
c = shared<br />
d = firstprivate<br />
e = private<br />
Tip<br />
default(none) is your friend<br />
Use it if you do not see<br />
it clear<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 15 / 48
List traversal<br />
Example<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Data scop<strong>in</strong>g<br />
void<br />
{<br />
t r a v e r s e _ l i s t ( L i s t l )<br />
Element e ;<br />
for ( e = l −> f i r s t ; e ; e = e−>next )<br />
#pragma omp task<br />
process ( e ) ;<br />
}<br />
e is firstprivate<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 16 / 48
List traversal<br />
Example<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Data scop<strong>in</strong>g<br />
void t r a v e r s e _ l i s t ( L i s t l )<br />
{<br />
Element e ;<br />
for ( e = l −> f i r s t ; e ; e = e−>next )<br />
#pragma omp task<br />
process ( e ) ;<br />
}<br />
how we can guarantee here that the traversal is f<strong>in</strong>ished?<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 16 / 48
Outl<strong>in</strong>e<br />
1 Why task parallelism?<br />
2 The <strong>OpenMP</strong> task<strong>in</strong>g model<br />
Creat<strong>in</strong>g tasks<br />
Data scop<strong>in</strong>g<br />
Syncroniz<strong>in</strong>g tasks<br />
Execution model<br />
3 Pitfalls & Performance issues<br />
4 Conclusions<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Syncroniz<strong>in</strong>g tasks<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 17 / 48
Task synchronization<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Syncroniz<strong>in</strong>g tasks<br />
Barriers (implicit or explicit)<br />
All tasks created by any thread of the current team are guaranteed<br />
to be completed at barrier exit<br />
Task barrier<br />
#pragma omp taskwait<br />
Encounter<strong>in</strong>g task suspends until child tasks complete<br />
Only direct childs not descendants!<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 18 / 48
List traversal<br />
Example<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Syncroniz<strong>in</strong>g tasks<br />
void t r a v e r s e _ l i s t ( L i s t l )<br />
{<br />
Element e ;<br />
for ( e = l −> f i r s t ; e ; e = e−>next )<br />
#pragma omp task<br />
process ( e ) ;<br />
#pragma omp taskwait<br />
}<br />
All tasks guaranteed to be completed here<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 19 / 48
Outl<strong>in</strong>e<br />
1 Why task parallelism?<br />
2 The <strong>OpenMP</strong> task<strong>in</strong>g model<br />
Creat<strong>in</strong>g tasks<br />
Data scop<strong>in</strong>g<br />
Syncroniz<strong>in</strong>g tasks<br />
Execution model<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Execution model<br />
3 Pitfalls & Performance issues<br />
4 Conclusions<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 20 / 48
Task execution model<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Execution model<br />
Task are executed by a thread of the team that generated it<br />
Can be executed immediately by the same thread that creates it<br />
Parallel regions <strong>in</strong> <strong>3.0</strong> create tasks!<br />
One implicit task is created for each thread<br />
So all task-concepts have sense <strong>in</strong>side the parallel region<br />
Threads can suspend the execution of a task and start/resume<br />
another<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 21 / 48
List traversal<br />
Example<br />
L i s t l<br />
#pragma omp p a r a l l e l<br />
t r a v e r s e _ l i s t ( l ) ;<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Execution model<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 22 / 48
List traversal<br />
Example<br />
L i s t l<br />
#pragma omp p a r a l l e l<br />
t r a v e r s e _ l i s t ( l ) ;<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Execution model<br />
Careful!<br />
Multiple traversals of the same<br />
list<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 22 / 48
List traversal<br />
S<strong>in</strong>gle traversal<br />
Example<br />
L i s t l<br />
#pragma omp p a r a l l e l<br />
#pragma omp s<strong>in</strong>gle<br />
t r a v e r s e _ l i s t ( l ) ;<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Execution model<br />
S<strong>in</strong>gle traversal<br />
One thread enters s<strong>in</strong>gle<br />
and creates all tasks<br />
All the team cooperates<br />
execut<strong>in</strong>g them<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 23 / 48
List traversal<br />
Multiple traversals<br />
Example<br />
L i s t l [N]<br />
#pragma omp p a r a l l e l<br />
#pragma omp for<br />
for ( i = 0; i < N; i ++)<br />
t r a v e r s e _ l i s t ( l [ i ] ) ;<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Execution model<br />
Multiple traversals<br />
Multiple threads create<br />
tasks<br />
All the team cooperates<br />
execut<strong>in</strong>g them<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 24 / 48
Task schedul<strong>in</strong>g<br />
How it works?<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Execution model<br />
Tasks are tied by default<br />
Tied tasks are executed always by the same thread<br />
Tied tasks have schedul<strong>in</strong>g restrictions<br />
Determ<strong>in</strong>istic schedul<strong>in</strong>g po<strong>in</strong>ts (creation, synchronization, ... )<br />
Another constra<strong>in</strong>t to avoid deadlock problems<br />
Tied tasks may run <strong>in</strong>to performance problems<br />
Programmer can use untied clause to lift all restrictions<br />
Note: Mix very carefully with threadprivate, critical and thread-ids<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 25 / 48
And last...<br />
The IF clause<br />
The <strong>OpenMP</strong> task<strong>in</strong>g model Execution model<br />
If the the expression of a if clause evaluates to false<br />
The encounter<strong>in</strong>g task is suspended<br />
The new task is executed immediately<br />
with its own data environment<br />
different task with respect to synchronization<br />
The parent task resumes when the task f<strong>in</strong>ishes<br />
Allows implementations to optimize task creation<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 26 / 48
Outl<strong>in</strong>e<br />
1 Why task parallelism?<br />
Pitfalls & Performance issues<br />
2 The <strong>OpenMP</strong> task<strong>in</strong>g model<br />
Creat<strong>in</strong>g tasks<br />
Data scop<strong>in</strong>g<br />
Syncroniz<strong>in</strong>g tasks<br />
Execution model<br />
3 Pitfalls & Performance issues<br />
4 Conclusions<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 27 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
{<br />
}<br />
s t a t e [ j ] = i ;<br />
i f ( ok ( j +1 , s t a t e ) ) {<br />
search ( n , j +1 , s t a t e ) ;<br />
}<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 28 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task<br />
{<br />
s t a t e [ j ] = i ;<br />
i f ( ok ( j +1 , s t a t e ) ) {<br />
search ( n , j +1 , s t a t e ) ;<br />
}<br />
}<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 28 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task<br />
{<br />
s t a t e [ j ] = i ;<br />
i f ( ok ( j +1 , s t a t e ) ) {<br />
search ( n , j +1 , s t a t e ) ;<br />
}<br />
}<br />
Data scop<strong>in</strong>g<br />
Because it’s an orphaned<br />
task all variables are<br />
firstprivate<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 28 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task<br />
{<br />
s t a t e [ j ] = i ;<br />
i f ( ok ( j +1 , s t a t e ) ) {<br />
search ( n , j +1 , s t a t e ) ;<br />
}<br />
}<br />
Data scop<strong>in</strong>g<br />
Because it’s an orphaned<br />
task all variables are<br />
firstprivate<br />
State is not captured<br />
Just the po<strong>in</strong>ter is captured<br />
not the po<strong>in</strong>ted data<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 28 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task<br />
{<br />
s t a t e [ j ] = i ;<br />
i f ( ok ( j +1 , s t a t e ) ) {<br />
search ( n , j +1 , s t a t e ) ;<br />
}<br />
}<br />
Pitfall #1<br />
Incorrectly captur<strong>in</strong>g<br />
po<strong>in</strong>ted data<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 28 / 48
Pitfalls & Performance issues<br />
Pitfall #1<br />
Incorrectly captur<strong>in</strong>g po<strong>in</strong>ted data<br />
Problem<br />
firstprivate does not allow to capture data through po<strong>in</strong>ters<br />
Solutions<br />
1 Capture it manually<br />
2 Copy it to an array and capture the array with firstprivate<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 29 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task<br />
{<br />
bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;<br />
memcpy( new_state , state , sizeof ( bool )∗n ) ;<br />
new_state [ j ] = i ;<br />
i f ( ok ( j +1 , new_state ) ) {<br />
search ( n , j +1 , new_state ) ;<br />
}<br />
}<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 30 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task<br />
{<br />
bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;<br />
memcpy( new_state , state , sizeof ( bool )∗n ) ;<br />
new_state [ j ] = i ;<br />
i f ( ok ( j +1 , new_state ) ) {<br />
search ( n , j +1 , new_state ) ;<br />
}<br />
}<br />
Caution!<br />
Will new_state still be valid<br />
by the time memcpy is<br />
executed?<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 30 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task<br />
{<br />
bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;<br />
memcpy( new_state , state , sizeof ( bool )∗n ) ;<br />
new_state [ j ] = i ;<br />
i f ( ok ( j +1 , new_state ) ) {<br />
search ( n , j +1 , new_state ) ;<br />
}<br />
}<br />
Pitfall #2<br />
Data can go out of scope!<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 30 / 48
Pitfall #2<br />
Out-of-scope data<br />
Problem<br />
Pitfalls & Performance issues<br />
Stack-allocated parent data can become <strong>in</strong>valid before be<strong>in</strong>g used by<br />
child tasks<br />
Only if not captured with firstprivate<br />
Solutions<br />
1 Use firstprivate when possible<br />
2 Allocate it <strong>in</strong> the heap<br />
Not always easy (we also need to free it)<br />
3 Put additional synchronizations<br />
May reduce the available parallelism<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 31 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++ ;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task<br />
{<br />
bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;<br />
memcpy( new_state , state , sizeof ( bool )∗n ) ;<br />
new_state [ j ] = i ;<br />
i f ( ok ( j +1 , new_state ) ) {<br />
search ( n , j +1 , new_state ) ;<br />
}<br />
}<br />
#pragma omp taskwait<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 32 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++ ;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task<br />
{<br />
bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;<br />
memcpy( new_state , state , sizeof ( bool )∗n ) ;<br />
new_state [ j ] = i ;<br />
i f ( ok ( j +1 , new_state ) ) {<br />
search ( n , j +1 , new_state ) ;<br />
}<br />
}<br />
#pragma omp taskwait<br />
Shared variable needs protected access<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 32 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++ ;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task<br />
{<br />
bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;<br />
memcpy( new_state , state , sizeof ( bool )∗n ) ;<br />
new_state [ j ] = i ;<br />
i f ( ok ( j +1 , new_state ) ) {<br />
search ( n , j +1 , new_state ) ;<br />
}<br />
}<br />
#pragma omp taskwait<br />
Solutions<br />
Use omp critical<br />
Use omp atomic<br />
Use a reduction<br />
operation<br />
Not available for <strong>3.0</strong><br />
Can be work out<br />
manually<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 32 / 48
Pitfalls & Performance issues<br />
Reductions for tasks<br />
Example<br />
i n t s o l u t i o n s =0;<br />
i n t mysolutions =0;<br />
#pragma omp t h r e a d p r i v a t e ( mysolutions )<br />
void s t a r t _ s e a r c h ( )<br />
{<br />
#pragma omp p a r a l l e l<br />
{<br />
#pragma omp s<strong>in</strong>gle<br />
{<br />
bool i n i t i a l _ s t a t e [ n ] ;<br />
search ( n , 0 , i n i t i a l _ s t a t e ) ;<br />
}<br />
#pragma omp c r i t i c a l<br />
s o l u t i o n s += mysolutions ;<br />
}<br />
}<br />
Use a separate counter for each thread<br />
Accumulate them at the end<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 33 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task<br />
{<br />
bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;<br />
memcpy( new_state , state , sizeof ( bool )∗n ) ;<br />
}<br />
new_state [ j ] = i ;<br />
i f ( ok ( j +1 , new_state ) ) {<br />
search ( n , j +1 , new_state ) ;<br />
}<br />
#pragma omp taskwait<br />
Prun<strong>in</strong>g mechanism potentially<br />
<strong>in</strong>troduces imbalance <strong>in</strong> the tree<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 34 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task untied<br />
{<br />
bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;<br />
memcpy( new_state , state , sizeof ( bool )∗n ) ;<br />
new_state [ j ] = i ;<br />
i f ( ok ( j +1 , new_state ) ) {<br />
search ( n , j +1 , new_state ) ;<br />
}<br />
}<br />
#pragma omp taskwait<br />
Untied clause<br />
Allows the<br />
implementation to<br />
easier load balance<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 34 / 48
Benefit of untied?<br />
Speed-up<br />
10<br />
9<br />
8<br />
7<br />
6<br />
5<br />
4<br />
3<br />
2<br />
1<br />
0<br />
tied tasks<br />
untied tasks<br />
with Intel’s icc v11.0<br />
Pitfalls & Performance issues<br />
1 2 4 8 16 32<br />
# of threads<br />
Don’t expect much<br />
today.<br />
But, as<br />
implementations are<br />
optimized differences<br />
may arise<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 35 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
s o l u t i o n s ++ ;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task untied<br />
{<br />
bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;<br />
memcpy( new_state , state , sizeof ( bool )∗n ) ;<br />
new_state [ j ] = i ;<br />
i f ( ok ( j +1 , new_state ) ) {<br />
search ( n , j +1 , new_state ) ;<br />
}<br />
}<br />
#pragma omp taskwait<br />
Because of untied this is not safe!<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 36 / 48
Pitfall #3<br />
Unsafe use of untied tasks<br />
Pitfalls & Performance issues<br />
Problem<br />
Because tasks can migrate between threads at any po<strong>in</strong>t<br />
thread-centric constructs can yield unexpected results<br />
Remember<br />
When us<strong>in</strong>g untied tasks avoid:<br />
Threadprivate variables<br />
Any thread-id uses<br />
And be very careful with:<br />
Critical regions (and locks)<br />
Simple solution<br />
Create a task tied region with #pragma omp task if(0)<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 37 / 48
Search problem<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗s t a t e )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
#pragma omp task i f ( 0 )<br />
s o l u t i o n s ++ ;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task untied<br />
{<br />
bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;<br />
memcpy( new_state , state , sizeof ( bool )∗n ) ;<br />
new_state [ j ] = i ;<br />
i f ( ok ( j +1 , new_state ) ) {<br />
search ( n , j +1 , new_state ) ;<br />
}<br />
}<br />
#pragma omp taskwait<br />
Now this statement is tied and safe<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 38 / 48
Task granularity<br />
Pitfalls & Performance issues<br />
Granularity is a key performance factor<br />
Tasks tend to be f<strong>in</strong>e-gra<strong>in</strong>ed<br />
Try to “group“ tasks together<br />
Use if clause or manual transformations<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 39 / 48
Us<strong>in</strong>g the if clause<br />
Example<br />
Pitfalls & Performance issues<br />
void search ( i n t n , i n t j , bool ∗state , <strong>in</strong>t depth )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
#pragma omp task i f ( 0 )<br />
mysolutions ++;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task untied if(depth < MAX_DEPTH)<br />
{<br />
bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;<br />
memcpy( new_state , state , sizeof ( bool )∗n ) ;<br />
new_state [ j ] = i ;<br />
i f ( ok ( j +1 , new_state ) ) {<br />
search ( n , j +1 , new_state,depth+1 ) ;<br />
}<br />
}<br />
#pragma omp taskwait<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 40 / 48
Pitfalls & Performance issues<br />
Us<strong>in</strong>g an if statement<br />
Example<br />
void search ( i n t n , i n t j , bool ∗state , <strong>in</strong>t depth )<br />
{<br />
i n t i , res ;<br />
}<br />
i f ( n == j ) {<br />
/∗ good s o l u t i o n , count i t ∗/<br />
#pragma omp task i f ( 0 )<br />
mysolutions ++;<br />
return ;<br />
}<br />
/∗ t r y each p o s s i b l e s o l u t i o n ∗/<br />
for ( i = 0; i < n ; i ++)<br />
#pragma omp task untied<br />
{<br />
bool ∗new_state = a l l o c a ( sizeof ( bool )∗n ) ;<br />
memcpy( new_state , state , sizeof ( bool )∗n ) ;<br />
new_state [ j ] = i ;<br />
i f ( ok ( j +1 , new_state ) ) {<br />
if ( depth < MAX_DEPTH )<br />
search ( n , j +1 , new_state,depth+1 ) ;<br />
else<br />
search_serial(n,j+1,new_state);<br />
}<br />
}<br />
#pragma omp taskwait<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 41 / 48
Pitfalls & Performance issues<br />
If clause vs If statement<br />
Speed-up<br />
30<br />
25<br />
20<br />
15<br />
10<br />
5<br />
0<br />
with Intel’s icc v11.0<br />
noth<strong>in</strong>g<br />
if clause<br />
manual if<br />
1 2 4 8 16 32<br />
# of threads<br />
If clause<br />
reduces<br />
overheads<br />
without<br />
modify<strong>in</strong>g<br />
the code<br />
but if granularity<br />
is very small is<br />
not enough<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 42 / 48
Don’t abuse tasks<br />
Pitfalls & Performance issues<br />
Tasks are nice but...<br />
They are not the answer to<br />
everyth<strong>in</strong>g<br />
They are more costly than<br />
other <strong>OpenMP</strong> mechanisms<br />
Use other <strong>OpenMP</strong> constructs<br />
when appropriate<br />
Particularly for/do<br />
workshar<strong>in</strong>g and sections<br />
Courtesy of Christian Terboven<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 43 / 48
Outl<strong>in</strong>e<br />
1 Why task parallelism?<br />
Conclusions<br />
2 The <strong>OpenMP</strong> task<strong>in</strong>g model<br />
Creat<strong>in</strong>g tasks<br />
Data scop<strong>in</strong>g<br />
Syncroniz<strong>in</strong>g tasks<br />
Execution model<br />
3 Pitfalls & Performance issues<br />
4 Conclusions<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 44 / 48
Summary<br />
Conclusions<br />
Tasks <strong>in</strong> <strong>3.0</strong><br />
#pragma omp task [clauses] creates a task<br />
shared,firstprivate,private data clauses<br />
firstprivate is usually the default (but shared <strong>in</strong>herited)<br />
untied allows tasks to move between threads<br />
if allows to dynamically control task creation<br />
#pragma omp taskwait waits for children completation<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 45 / 48
Summary<br />
Pitfalls & tips<br />
Conclusions<br />
1 Use default(none) if unsure of data scop<strong>in</strong>g<br />
2 Careful when us<strong>in</strong>g firstprivate on po<strong>in</strong>ters<br />
3 Careful with Out-of-scope data<br />
4 Use untied tasks carefully<br />
5 Control granularity<br />
6 Do not abuse of tasks<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 46 / 48
Tasks after <strong>3.0</strong><br />
In the works<br />
Conclusions<br />
Support for task reductions<br />
Task dependences<br />
Schedul<strong>in</strong>g h<strong>in</strong>ts<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 47 / 48
The End<br />
Thanks for your attention!<br />
Conclusions<br />
Alejandro Duran (BSC) <strong>Task<strong>in</strong>g</strong> <strong>in</strong> <strong>OpenMP</strong> June 3rd 2009 48 / 48