03.12.2012

C++ for Scientists - Technische Universität Dresden


CHAPTER 5. META-PROGRAMMING

        for (unsigned i= 0; i < sb; i+= BSize)
            assign<0, BSize>()(ref, that, i);   // template arguments assumed: offsets 0 to BSize-1
        for (unsigned i= sb; i < s; i++)        // remaining entries after the last full block
            ref[i]= that[i];
        return ref;
    }
  private:
    V& ref;
};
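
The fragment above is only the tail of the assignment operator of the unrolling wrapper. As a rough sketch of how the pieces might fit together, the following self-contained version fills in the parts that are not shown here. The class name `unroll_vector`, the `assign` functor, the constructor, the size check via `assert`, and the `unroll` helper function are assumptions made for illustration, and `std::vector` stands in for the vector and expression types of the text.

```cpp
#include <cassert>
#include <vector>

// Hypothetical functor: copies the entries i+Offs, ..., i+Max-1 by
// compile-time recursion, so that no loop remains inside a block.
template <unsigned Offs, unsigned Max>
struct assign
{
    template <typename U, typename Src>
    void operator()(U& u, const Src& v, unsigned i) const
    {
        u[i + Offs]= v[i + Offs];
        assign<Offs + 1, Max>()(u, v, i);
    }
};

// End of the recursion: the block is complete, nothing left to copy.
template <unsigned Max>
struct assign<Max, Max>
{
    template <typename U, typename Src>
    void operator()(U&, const Src&, unsigned) const {}
};

// Assumed wrapper class: holds a reference to the target vector and
// performs the assignment in blocks of BSize entries.
template <unsigned BSize, typename V>
class unroll_vector
{
  public:
    explicit unroll_vector(V& ref) : ref(ref) {}

    template <typename Src>
    V& operator=(const Src& that)
    {
        assert(ref.size() == that.size());
        unsigned s= static_cast<unsigned>(ref.size()), sb= s / BSize * BSize;

        for (unsigned i= 0; i < sb; i+= BSize)   // full blocks, unrolled
            assign<0, BSize>()(ref, that, i);
        for (unsigned i= sb; i < s; i++)         // remaining entries
            ref[i]= that[i];
        return ref;
    }
  private:
    V& ref;
};

// Assumed helper so that the block size can be given as unroll<4>(u).
template <unsigned BSize, typename V>
unroll_vector<BSize, V> unroll(V& v)
{
    return unroll_vector<BSize, V>(v);
}
```

With these definitions, a statement such as `unroll<4>(u)= v;` copies the entries of `v` into `u` four at a time and handles the remaining entries in the second loop.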

Evaluating the considered vector expressions for some block sizes yields:

Compute time unroll(u)= v + v + w is 1.72 µs.<br />

Compute time unroll(u)= v + v + w is 1.52 µs.<br />

Compute time unroll(u)= v + v + w is 1.36 µs.<br />

Compute time unroll(u)= v + v + w is 1.37 µs.<br />

Compute time unroll(u)= v + v + w is 1.4 µs.<br />

These few benchmarks are consistent with the previous results: unroll with block size 1 performs like the canonical implementation, and unroll with a larger block size is as fast as the hard-wired unrolling.

5.4.6 Tuning Reduction Operations

Reducing on a Single Variable

⇒ reduction_unroll_example.cpp

In the preceding vector operations, the i-th entry of each vector was handled independently of any other entry. In reduction operations, the entries are related by one or more temporary variables, and these temporary variables can become a serious bottleneck.
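
To make the difference concrete, here is a minimal sketch that contrasts an element-wise operation with a reduction. The function names `add` and `one_norm_plain` and the use of `std::vector<double>` are assumptions chosen for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Element-wise operation: iteration i only touches u[i], v[i], and w[i],
// so the iterations do not depend on each other.
inline void add(std::vector<double>& u, const std::vector<double>& v,
                const std::vector<double>& w)
{
    assert(u.size() == v.size() && v.size() == w.size());
    for (std::size_t i= 0; i < u.size(); i++)
        u[i]= v[i] + w[i];
}

// Reduction: every iteration reads and writes the same temporary sum,
// so each addition has to wait for the result of the previous one.
inline double one_norm_plain(const std::vector<double>& v)
{
    double sum= 0.0;
    for (std::size_t i= 0; i < v.size(); i++)
        sum+= std::abs(v[i]);
    return sum;
}
```

In `add`, the loop body could be executed for several values of `i` at once, whereas in `one_norm_plain` all iterations are chained through `sum`.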

First, we test whether a reduction operation, say the discrete L1 norm (also known as the Manhattan norm), can be accelerated by the techniques from Section 5.4.4. We implement the one_norm function in terms of a functor for the iteration block:

template <unsigned BSize, typename Vector>
typename Vector::value_type
inline one_norm(const Vector& v)
{
    using std::abs;
    typename Vector::value_type sum(0);
    unsigned s= size(v), sb= s / BSize * BSize;   // sb: largest multiple of BSize not exceeding s

    for (unsigned i= 0; i < sb; i+= BSize)
        one_norm_ftor<0, BSize>()(sum, v, i);     // template arguments assumed: offsets 0 to BSize-1
    for (unsigned i= sb; i < s; i++)              // remaining entries after the last full block
        sum+= abs(v[i]);
    return sum;
}
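
The functor `one_norm_ftor` is used above but not shown. One way it could look is sketched below, assuming that it takes a compile-time offset and the block size as template arguments and recurses until the offset reaches the block size. The parameter names `Offs` and `Max` are hypothetical, and `v.size()` replaces the free `size` function so that the example compiles with `std::vector`.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical functor: adds |v[i+Offs]|, ..., |v[i+Max-1]| to sum
// by compile-time recursion over the offset.
template <unsigned Offs, unsigned Max>
struct one_norm_ftor
{
    template <typename S, typename V>
    void operator()(S& sum, const V& v, unsigned i) const
    {
        using std::abs;
        sum+= abs(v[i + Offs]);
        one_norm_ftor<Offs + 1, Max>()(sum, v, i);
    }
};

// End of the recursion: the whole block has been accumulated.
template <unsigned Max>
struct one_norm_ftor<Max, Max>
{
    template <typename S, typename V>
    void operator()(S&, const V&, unsigned) const {}
};

template <unsigned BSize, typename Vector>
inline typename Vector::value_type one_norm(const Vector& v)
{
    using std::abs;
    typename Vector::value_type sum(0);
    unsigned s= static_cast<unsigned>(v.size()), sb= s / BSize * BSize;

    for (unsigned i= 0; i < sb; i+= BSize)   // full blocks, unrolled
        one_norm_ftor<0, BSize>()(sum, v, i);
    for (unsigned i= sb; i < s; i++)         // remaining entries
        sum+= abs(v[i]);
    return sum;
}
```

For a vector with seven entries, `one_norm<4>(v)` handles the first four entries in one unrolled block and the last three in the second loop. Every statement of the unrolled block still updates the same variable `sum`.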
