24.04.2013 Views

Verification of Parameterised FPGA Circuit Descriptions with Layout ...

CHAPTER 6. LAYOUT CASE STUDIES

plies for the unpipelined circuits and is not surprising because we have deliberately specified a less-dense packing for this circuit, using the lift block to align registers with the adders and thus using only one out of each two slice flip-flops in these columns. In the unpipelined circuits we are also not utilising the in-slice flip-flops of the adder columns, which could be used to implement the coefficient delay. If we had specified a denser packing for the unpipelined circuit then we would use b × n fewer slices (where b is the number of bits and n the number of stages), saving over 1000 slices for the 32-bit, 32-stage circuit.
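The saving quoted above follows directly from the b × n expression. A minimal sketch of the arithmetic, assuming only that formula (the function name `slices_saved` is hypothetical and not part of the thesis):

```python
def slices_saved(bits: int, stages: int) -> int:
    """Slices saved by the denser packing, using the b x n estimate from the text."""
    return bits * stages


# The 32-bit, 32-stage filter discussed in the text.
saving = slices_saved(32, 32)
print(saving)  # 1024
```

For b = 32 and n = 32 this gives 1024 slices, which matches the "over 1000 slices" figure.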

The manually placed designs have better maximum clock frequencies than the automatically placed ones when there is no pipelining and significantly worse performance when pipelined. For example, the placed 32-bit filter runs 14% faster than the unplaced version when unpipelined, but 30% slower when pipelined.

The clock frequency result is important, since it indicates that simulated annealing has achieved better results for the pipelined circuit regardless of the circuit size. The gap between the placed and unplaced circuits nonetheless varies with circuit size: the placed circuit is 35% slower for 24-bit data but 30% slower for the larger 32-bit data variant.

This result is consistent with previous research [77] which indicated that placed adder circuits not employing the carry chain were outperformed by circuits placed using simulated annealing. It shows that without the vertical placement constraint enforced by use of the carry chain, simulated annealing can place cells where it likes and find high-speed paths between cells which humans would probably not have considered. In the absence of other constraints, simulated annealing can therefore find irregular layouts which are better than what a human would consider reasonable.
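To illustrate how an unconstrained annealer arrives at irregular layouts, the following is a toy sketch of simulated annealing placement. It is not the placer used for the results above: the cost function (total Manhattan wire length over two-pin nets), the cooling schedule, the grid size and the netlist are all assumptions chosen for illustration, and the names `wire_length` and `anneal_placement` are hypothetical.

```python
import math
import random


def wire_length(placement, nets):
    """Sum of Manhattan distances over two-pin nets (a toy cost, not timing)."""
    return sum(
        abs(placement[a][0] - placement[b][0]) + abs(placement[a][1] - placement[b][1])
        for a, b in nets
    )


def anneal_placement(num_cells, nets, width, height,
                     steps=4000, t_start=5.0, t_end=0.01, seed=0):
    """Place cells 0..num_cells-1 on a width x height grid by simulated annealing."""
    if num_cells > width * height:
        raise ValueError("grid too small for the number of cells")
    rng = random.Random(seed)  # seeded so that runs are repeatable
    # slots[i] holds the cell at site i, or None if the site is empty.
    slots = list(range(num_cells)) + [None] * (width * height - num_cells)
    rng.shuffle(slots)

    def positions():
        # Site index i maps to (row, column) on the grid.
        return {c: divmod(i, width) for i, c in enumerate(slots) if c is not None}

    cost = wire_length(positions(), nets)
    initial_cost = cost
    best, best_cost = positions(), cost
    for step in range(steps):
        # Geometric cooling from t_start towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.randrange(len(slots)), rng.randrange(len(slots))
        slots[i], slots[j] = slots[j], slots[i]  # propose: swap two sites
        new_cost = wire_length(positions(), nets)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost  # accept the move, possibly an uphill one
            if cost < best_cost:
                best, best_cost = positions(), cost
        else:
            slots[i], slots[j] = slots[j], slots[i]  # reject: undo the swap
    return best, initial_cost, best_cost


# Hypothetical netlist: a chain of 8 cells on a 4 x 4 grid.
chain = [(k, k + 1) for k in range(7)]
best, initial_cost, best_cost = anneal_placement(8, chain, 4, 4)
```

Nothing in the move set forces cells into columns or rows, so the layout it settles on is whatever lowers the cost. A carry chain would correspond to an extra constraint restricting which swaps are legal.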

Interestingly, although the manually placed variants of the pipelined circuits have lower maximum clock frequencies, they consume between 1.5% and 13.1% less power when run at the same clock frequency. This indicates that manual placement has a role to play even in circuits such as this one, where it lowers the maximum achievable performance: by specifying a denser logic mapping it reduces circuit power consumption. If a 24-bit filter that can run at 50 MHz is desired then the placed circuit is clearly superior: it will compile more quickly, use less logic area and consume significantly less power while running at that speed.
