03.12.2012 Views

C++ for Scientists - Technische Universität Dresden


CHAPTER 5. META-PROGRAMMING

}

vector<float> u(s), v(s), w(s);
for (unsigned i= 0; i < s; i++) {
    v[i]= float(i);
    w[i]= float(2 * i + 15);
}

for (unsigned j= 0; j < 3; j++)
    for (unsigned i= 0; i < s; i++)
        u[i]= 3.0f * v[i] + w[i];

const unsigned rep= 200000;
boost::timer native;
for (unsigned j= 0; j < rep; j++)
    for (unsigned i= 0; i < s; i++)
        u[i]= 3.0f * v[i] + w[i];
std::cout << "Compute time native loop is " << 1000000.0 * native.elapsed() / double(rep) << " µs.\n";

return 0;

Alternatively, we can compute this with the loop unrolled by a factor of 4:

for (unsigned j= 0; j < rep; j++)
    for (unsigned i= 0; i < s; i+= 4) {
        u[i]= 3.0f * v[i] + w[i];
        u[i+1]= 3.0f * v[i+1] + w[i+1];
        u[i+2]= 3.0f * v[i+2] + w[i+2];
        u[i+3]= 3.0f * v[i+3] + w[i+3];
    }

This code obviously works only if the vector size is divisible by 4. To avoid errors, we could add an assertion on the vector size, but this is not really satisfying. Instead, we generalize the implementation to arbitrary vector sizes:

boost::timer unrolled;
for (unsigned j= 0; j < rep; j++) {
    unsigned sb= s / 4 * 4;  // largest multiple of 4 not exceeding s
    for (unsigned i= 0; i < sb; i+= 4) {
        u[i]= 3.0f * v[i] + w[i];
        u[i+1]= 3.0f * v[i+1] + w[i+1];
        u[i+2]= 3.0f * v[i+2] + w[i+2];
        u[i+3]= 3.0f * v[i+3] + w[i+3];
    }
    for (unsigned i= sb; i < s; i++)  // remaining s - sb entries
        u[i]= 3.0f * v[i] + w[i];
}
std::cout << "Compute time unrolled loop is " << 1000000.0 * unrolled.elapsed() / double(rep) << " µs.\n";
std::cout << "u is " << u << '\n';

Listing 5.4: Unrolled computation of u = 3v + w<br />
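To check that the generalized version really handles sizes that are not multiples of 4, the two loop forms can be wrapped in functions and compared. The following is only a minimal sketch under stated assumptions: it uses `std::vector<float>` in place of the vector class from the listing, and the names `compute_native`, `compute_unrolled4`, and `same_result` are hypothetical helpers introduced for illustration. Timing is left out on purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference version: one entry of u = 3v + w per iteration.
void compute_native(std::vector<float>& u, const std::vector<float>& v,
                    const std::vector<float>& w)
{
    assert(u.size() == v.size() && u.size() == w.size());
    for (std::size_t i= 0; i < u.size(); i++)
        u[i]= 3.0f * v[i] + w[i];
}

// Unrolled by 4, followed by a clean-up loop for the remaining entries.
void compute_unrolled4(std::vector<float>& u, const std::vector<float>& v,
                       const std::vector<float>& w)
{
    assert(u.size() == v.size() && u.size() == w.size());
    const std::size_t s= u.size();
    const std::size_t sb= s / 4 * 4;  // largest multiple of 4 not exceeding s
    for (std::size_t i= 0; i < sb; i+= 4) {
        u[i]= 3.0f * v[i] + w[i];
        u[i+1]= 3.0f * v[i+1] + w[i+1];
        u[i+2]= 3.0f * v[i+2] + w[i+2];
        u[i+3]= 3.0f * v[i+3] + w[i+3];
    }
    for (std::size_t i= sb; i < s; i++)  // at most 3 entries left
        u[i]= 3.0f * v[i] + w[i];
}

// Hypothetical helper: fill v and w as in the listing, run both versions,
// and report whether they produce identical results.
bool same_result(std::size_t s)
{
    std::vector<float> v(s), w(s), u1(s), u2(s);
    for (std::size_t i= 0; i < s; i++) {
        v[i]= float(i);
        w[i]= float(2 * i + 15);
    }
    compute_native(u1, v, w);
    compute_unrolled4(u2, v, w);
    return u1 == u2;
}
```

Calling `same_result` with sizes such as 7 or 1001 exercises the clean-up loop, while a size below 4 skips the unrolled block entirely because `sb` is then 0.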

The little program was compiled with g++ 4.1.2 with the flags -O3 -ffast-math -DNDEBUG
