Verification of Parameterised FPGA Circuit Descriptions with Layout ...
CHAPTER 6. LAYOUT CASE STUDIES
One invariant we have seen across all circuit examples is that manually placing designs significantly reduces the time taken by the place and route stage of compilation, with the reduction ranging from 14% for the unpipelined matrix multiplier to 92% for the unpipelined adder tree. This result is not overly surprising, since the place and route stage has been reduced to just routing; however, it is worth confirming, since the denser placements specified by the user constraints could have increased the routing time by more than is saved by avoiding automatic placement.
Another fairly firm conclusion is that in almost all cases (see footnote 4) manually placed designs require less logic area than automatically placed ones. The logic mapping specified by the manual constraints is denser than that used by the automatic tools, and reductions in area of 40% are commonly achieved, with a maximum area reduction of 61% observed for the unpipelined adder tree. In one case the manual mapping for a butterfly circuit used less than 70% of the device resources, while the automatic mapping and placement was unable to fit the same circuit onto the device. Manual placement is clearly superior here.
How effectively manual placement improves maximum clock frequency depends on other constraints. Given a homogeneous environment, simulated annealing is able to generate circuits with equivalent or better performance by discovering high-speed routing paths between cells that humans would not consider sensible. For example, the manually placed 24-bit pipelined binomial filter circuit is 35% slower than the automatically placed version. However, when other constraints affect the layout, simulated annealing does not perform so well and the regular layout constraints can produce significant performance gains.
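To make the comparison with automatic placement more concrete, the following is a minimal sketch of simulated annealing placement in its textbook form. It is not the algorithm used by the vendor tools evaluated in this chapter. The names `wirelength` and `anneal_placement`, the two-pin net model, the Manhattan-distance cost and the cooling parameters are all assumptions made for illustration. A real placer would also account for timing, carry chains and routing resources.

```python
import math
import random


def wirelength(placement, nets):
    """Total Manhattan distance over two-pin nets (an assumed cost model)."""
    total = 0
    for a, b in nets:
        (xa, ya), (xb, yb) = placement[a], placement[b]
        total += abs(xa - xb) + abs(ya - yb)
    return total


def anneal_placement(cells, nets, width, height,
                     steps=5000, t0=5.0, cooling=0.999, seed=0):
    """Place cells on a width x height grid by simulated annealing.

    Hypothetical illustration only: sites are uniform, and the only
    objective is total wirelength.
    """
    rng = random.Random(seed)
    n_sites = width * height
    if len(cells) > n_sites:
        raise ValueError("more cells than sites")

    # One slot per site; None marks an empty site.
    slots = list(cells) + [None] * (n_sites - len(cells))
    rng.shuffle(slots)

    def placement_of(current_slots):
        return {c: (i % width, i // width)
                for i, c in enumerate(current_slots) if c is not None}

    placement = placement_of(slots)
    cost = wirelength(placement, nets)
    best, best_cost = dict(placement), cost
    t = t0
    for _ in range(steps):
        # Propose swapping the contents of two random sites.
        i, j = rng.randrange(n_sites), rng.randrange(n_sites)
        slots[i], slots[j] = slots[j], slots[i]
        candidate = placement_of(slots)
        new_cost = wirelength(candidate, nets)
        delta = new_cost - cost
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            placement, cost = candidate, new_cost
            if cost < best_cost:
                best, best_cost = dict(placement), cost
        else:
            slots[i], slots[j] = slots[j], slots[i]  # reject: undo the swap
        t *= cooling
    return best, best_cost


# Usage: a four-cell chain on a 4 x 4 grid.
cells = ["a", "b", "c", "d"]
nets = [("a", "b"), ("b", "c"), ("c", "d")]
placement, cost = anneal_placement(cells, nets, width=4, height=4)
```

Because this sketch treats every site as interchangeable, it corresponds to the homogeneous environment described above. Constraints such as vertically aligned carry chains or high device utilisation restrict which swaps are legal or useful, which is consistent with the observation that annealing performs less well in those cases.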
The constraints that affect simulated annealing appear to be use of the fast carry chain circuitry, which forces some cells to be laid out vertically, and the level of device utilisation, which reduces the ability of the placer to find high-speed routes through less densely packed logic. For pipelined bitonic merger butterfly networks, the 4-bit manually placed circuit utilises only 28% of the device and runs 14% slower than the automatically placed version; however, the 8-bit version utilises 55% of the device and runs 48% faster than the automatically placed version. Manual placement often produces better results than simulated annealing for unpipelined circuits, where the maximum clock frequency is already much lower than for pipelined ones. This is not unexpected, since wiring delays will accumulate in the same way as logic propagation delays in unpipelined circuits.
Generally, manual placement appears to lead to reduced power consumption, with reductions of up to 40% possible (for the pipelined adder trees). In general, power consumption can be reduced even if the maximum clock frequency of the manually placed design is lower than that of the automatically placed circuit. For the binomial filter, power savings of 2-13% were observed even though the manually placed circuits had lower maximum clock frequencies. In the case of the butterfly network, a correlation with device utilisation and circuit size was once again observed: the 4-bit circuit consumed more power when manually placed, whereas the 8-bit circuit consumed less.
Footnote 4: The exception to this we observed was the binomial filter circuit, where we deliberately specified a less dense logic mapping in order to achieve a better aligned layout.
6.8 Summary
We have demonstrated our layout framework with a variety of real circuits, including a matrix multiplier described with a new type of higher-dimensional combinator, a binomial filter, a butterfly network and a median filter. We have demonstrated how functional reasoning can be used to derive pipelined versions, while the layout framework can be used to verify layouts. We have found that in many, though not all, cases manually placed designs outperform automatically placed circuits, with higher maximum operating frequencies, lower device utilisation, lower power consumption and a faster place and route process.