13.07.2015 Views

TOPOLOGICAL OPTIMIZATION OF THE EVALUATION ... - BearSpace



Table 4.6
Number of multiply-add pairs (MAPs) in the optimized algorithm for computing the weighted Laplacian element stiffness matrix on triangles and tetrahedra for Lagrange polynomials of degree one through three using symmetry.

triangles
degree      n      m      nm     MAPs
1           6      9      54       27
2          21     18     378      218
3          55     30    1650     1110

tetrahedra
degree      n      m      nm     MAPs
1          10     24     240      108
2          55     60    3300     1650
3         210    120   25200    14334

Table 4.7
Number of multiply-add pairs (MAPs) in the optimized algorithm for performing all of the contractions with $(G_L)_e$ in the weighted Laplacian on triangles and tetrahedra first, resulting in $\binom{|P|+1}{2}$ arrays of length $|P|$ to contract with $w_k$.

triangles
degree      n      m      nm     MAPs
1          18      3      54        9
2         126      3     378      115
3         550      3    1650      683

tetrahedra
degree      n      m      nm     MAPs
1          40      6     240       27
2         550      6    3300      693
3        4200      6   25200     7021

each slice $A^0_{i\cdot}$ of the reference tensor as an array of $|P|$ tensors of size $d \times d$ and apply the transformation to each of these. If we fully form $G_e$, this reduces the cost from $|P| d^2$ to $|P| \binom{d+1}{2}$. In all of our experiments, we made use of this.

In Table 4.6, we see the cost of computing the weighted Laplacian by the first approach (optimizing directly the tensor product $A^e_i = A^0_{i\alpha} G_e^{\alpha}$). While the optimizations are not as successful as for the constant coefficient operators, we still get reductions of 30%-50% in the operation counts.

When we perform the contraction in stages, we find more dependencies (for example, the slices of two of the tensors could be colinear although the entire tensors are not). We show the cost of performing the optimized stage for contracting with $(G_L)_e$ first in Table 4.7 and for contracting with $w_k$ first in Table 4.8.

In order to get a fair comparison between these approaches, we must factor in the additional costs of building $G_e$ or performing the second stage of contraction. Once $(G_L)_e$ is built and symmetrized, it costs an additional $|P| \binom{d+1}{2}$ multiply-add pairs to construct $G_e$.
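The sizes $n$ and $m$ in Tables 4.6 and 4.7 can be reproduced from the symmetry counts quoted above. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and it assumes that $|P|$ is the dimension of the Lagrange space of the given degree on a $d$-simplex, with the coefficient expanded in that same space. The tabulated sizes are consistent with that assumption. The MAP counts are copied from Table 4.6, not computed.

```python
from math import comb


def lagrange_dim(degree: int, d: int) -> int:
    """Dimension |P| of the degree-`degree` Lagrange space on a d-simplex (assumed)."""
    return comb(degree + d, d)


def sizes_direct(degree: int, d: int) -> tuple:
    """(n, m) as in Table 4.6: contract the reference tensor with a fully formed G_e."""
    P = lagrange_dim(degree, d)
    n = comb(P + 1, 2)       # independent entries of a symmetric |P| x |P| matrix
    m = P * comb(d + 1, 2)   # |P| symmetric d x d blocks per entry
    return n, m


def sizes_staged(degree: int, d: int) -> tuple:
    """(n, m) as in Table 4.7: contract with (G_L)_e first."""
    P = lagrange_dim(degree, d)
    n = comb(P + 1, 2) * P   # binom(|P|+1, 2) arrays of length |P|
    m = comb(d + 1, 2)       # one symmetric d x d block
    return n, m


def transform_costs(degree: int, d: int) -> tuple:
    """Cost of the transformation (equivalently, of building G_e) without and with symmetry."""
    P = lagrange_dim(degree, d)
    return P * d * d, P * comb(d + 1, 2)


# Optimized MAP counts copied from Table 4.6, keyed by (degree, d).
MAPS_TABLE_4_6 = {
    (1, 2): 27, (2, 2): 218, (3, 2): 1110,
    (1, 3): 108, (2, 3): 1650, (3, 3): 14334,
}


def reduction(degree: int, d: int) -> float:
    """Fraction of the unoptimized n*m multiply-add pairs saved in Table 4.6."""
    n, m = sizes_direct(degree, d)
    return 1.0 - MAPS_TABLE_4_6[(degree, d)] / (n * m)


if __name__ == "__main__":
    for d in (2, 3):
        for degree in (1, 2, 3):
            n, m = sizes_direct(degree, d)
            print(d, degree, n, m, n * m, f"{reduction(degree, d):.0%}")
```

Both orderings give the same unoptimized product $nm$; only the split between $n$ and $m$ changes, which is what lets the staged contraction expose additional dependencies.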
If we optimize the computation of contracting with $(G_L)_e$ first, we do not have to build $G_e$, but we must perform a dot product with $w_k$ for each entry of the matrix. This costs $|P|$ per contraction with $\binom{|P|+1}{2}$ entries in the matrix. If we optimize the contraction with each $w_k$ first, then we have an additional $\binom{|P|+1}{2}$ contractions with $(G_L)_e$ at a cost of $\binom{d+1}{2}$ each. We expect that which of these will be most effective must be determined (automatically) on a case-by-case basis.

Tables 4.10 and 4.9 show the comparisons for the first approach (labeled $G_e$), the second approach (labeled $(G_L)_e$) and the third approach (labeled $w_k$) by indicating the cost of the optimized computation plus the additional stages of computation. In most of these cases, contracting with the coefficient first leads to the lowest total cost.

5. Optimizing the optimization process. Since our graph $(Y, E)$ is completely connected, we have $|E| = O(|Y|^2)$ and our optimization process requires com-
