Microsoft PowerPoint - 3.5presentation.ppt - Cadence Design Systems
Synthesis Strategies for Better QoR Using RC<br />
Nandini Chintala<br />
Cadence Design Systems
Outline<br />
Introduction<br />
Requirements for Synthesis<br />
Background on RTL Compiler<br />
Metrics – Area, Timing, Power<br />
Flow exploration<br />
Effective strategies<br />
Conclusions<br />
© 2007 Cadence Design Systems, Inc. All rights reserved worldwide.
Introduction<br />
Modern designs have aggressive requirements<br />
Chips must be smaller, cooler, and faster<br />
Implementing RTL as a netlist is the first step<br />
Global-focus synthesis results in faster chips; single-pass multi-Vt synthesis and MSV design result in cooler chips<br />
The logic netlist determines the quality of the design implementation<br />
Advanced techniques for low power and timing closure<br />
Capacity allows for top-down synthesis<br />
Synthesis Using RTL Compiler<br />
[Figure: inputs (RTL; timing and power constraints; switching activity; low-, medium-, and high-Vt libraries) feed a multi-objective optimization over timing, power, and area that produces an optimized netlist]<br />
Multidimensional synthesis: timing, area, and power are optimized concurrently<br />
Identify and map critical logic for timing<br />
Identify and map off-critical logic for power and area
Typical Flow<br />
Set target library<br />
set_attr library name /<br />
Read HDL files<br />
read_hdl ${FILE_LIST}<br />
Elaborate the design<br />
elaborate<br />
Set timing and design constraints (timing constraints, power constraints)<br />
Apply optimization directives<br />
synthesize -to_gen (generic optimizations such as MUX optimizations and datapath selections)<br />
synthesize -to_map -no_incr (global mapping)<br />
synthesize -to_map -incr (incremental mapping)<br />
Reports and interface to P&R
Mapping stages<br />
synthesize -to_generic<br />
Generic structuring: generic optimizations such as MUX optimizations and datapath selections<br />
synthesize -to_map -no_incr<br />
Target setting: targets are derived for each cost group; each clock definition creates a cost group<br />
Global mapping: global optimization for area, timing, and power, driven by the target<br />
Remaps (area_map..): area recovery stage for non-critical paths<br />
synthesize -to_map -incr<br />
Incremental: path-based optimization, DRC, critical-region resynthesis for timing, and sequential resynthesis
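The stages above can be strung together into one run script. The following is a minimal sketch under stated assumptions, not a reference flow: it uses the commands named on the slides plus `read_sdc`, `report`, and `write_hdl`, and every file name, the library name, and the top-level module name are hypothetical placeholders.

```tcl
# Minimal RC flow sketch. All file and design names below are hypothetical.
set FILE_LIST {top.v core.v}          ;# assumed RTL file list
set DESIGN    top                     ;# assumed top-level module name

set_attr library my_tech.lib /        ;# target library (placeholder file name)
read_hdl ${FILE_LIST}
elaborate ${DESIGN}
read_sdc ${DESIGN}.sdc                ;# timing and design constraints (placeholder)

synthesize -to_generic                ;# generic structuring
synthesize -to_map -no_incr           ;# target setting, global mapping, remaps
synthesize -to_map -incr              ;# incremental optimization

report timing > ${DESIGN}_timing.rpt
report area   > ${DESIGN}_area.rpt
report power  > ${DESIGN}_power.rpt
write_hdl     > ${DESIGN}_mapped.v    ;# netlist handed to P&R
```

Splitting the mapping step into `-no_incr` and `-incr` runs makes it possible to inspect the global-mapping result before spending time on incremental optimization.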
Synthesis Strategies Summary<br />
Technique: methodology impact; methodology change<br />
RTL coding: High; clock gating, FSM encoding<br />
Constraints: High; clean constraints result in a better netlist<br />
Selective ungrouping: Medium; analyze opportunity for better optimization<br />
Path groups: Low; synthesis script<br />
Multi-Vt libraries: Low; multi-Vt libraries loaded in synthesis<br />
Cell selection: Medium; limit cell selection<br />
Target manipulation: Medium; analyze targets<br />
Mux optimizations: Medium; analyze mux choices<br />
WLM selections: Medium; analyze WLM choices<br />
Clock gate analysis: High; analyze clock gating<br />
[Table columns Area, Time, and Power rate each technique as ++ or =; the per-technique ratings are not recoverable from the flattened slide]
Typical Flow<br />
Set target library<br />
set_attr library name /<br />
Read HDL files<br />
read_hdl ${FILE_LIST}<br />
Elaborate the design<br />
elaborate<br />
RTL coding (example: report power -rtl)<br />
Set timing and design constraints<br />
Apply optimization directives<br />
•Path adjust<br />
•Path groups<br />
•Mux guidance<br />
•Override target<br />
•Cell selection<br />
•Datapath selection<br />
•Ungrouping<br />
•Area constraints<br />
synthesize -to_gen<br />
synthesize -to_map -no_incr<br />
synthesize -to_map -incr<br />
Effort levels<br />
Reports and interface to P&R
Creating Cost Groups<br />
Cost groups are buckets of logic that share the same target<br />
Divide the problem and conquer<br />
One target per cost group<br />
Mapping is target based<br />
A highly negative target sets an unrealistic goal<br />
Use cost groups to isolate impossible paths from other paths<br />
Target manipulation<br />
define_cost_group -name C2C<br />
path_group -from [all::all_seqs] -to [all::all_seqs] -group C2C -name C2C<br />
define_cost_group -name I2C<br />
path_group -from [all::all_inps] -to [all::all_seqs] -group I2C -name I2C<br />
define_cost_group -name I2O<br />
path_group -from [all::all_inps] -to [all::all_outs] -group I2O -name I2O<br />
define_cost_group -name C2O<br />
path_group -from [all::all_seqs] -to [all::all_outs] -group C2O -name C2O<br />
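Once the cost groups exist, it helps to look at timing one group at a time so that each target can be judged on its own. A minimal sketch, assuming the four groups defined above and assuming that `report timing` accepts a `-cost_group` option and that `basename` is available in the RC shell; the report file names are hypothetical.

```tcl
# Write one timing report per cost group (C2C, I2C, I2O, C2O).
# Assumption: run after mapping so the slack numbers reflect mapped cells.
foreach cg [find / -cost_group *] {
    # basename strips the object path, leaving the group name for the file name
    report timing -cost_group [list $cg] > timing_[basename $cg].rpt
}
```

A group whose report shows a large negative slack that cannot realistically be met is a candidate for isolation or target manipulation, so that it does not drive the mapping of the other groups.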
Path Adjust Flow – Unique To RC<br />
Target and path adjust (PA)<br />
Timing lint report<br />
Apply target-based PA for effective mapping<br />
Important for area- and power-efficient implementation of non-critical paths<br />
Affects synthesize -to_map only<br />
path_adjust -to CPU/EBOX/iu0/idaP0ADW/idaP0PreByp2D_reg* -delay -150<br />
path_adjust -to CPU/EBOX/iu1/idaP1ADW/idaP0PreByp2D_reg* -delay -150<br />
path_adjust -from [all::all_inps] -to [all::all_outs] -delay 500 -name PA_I2O
Mux Selection And Mapping<br />
Mux optimization<br />
Binary mux selection is pragma driven<br />
Area and timing driven, but not congestion driven<br />
Attributes are available to bias mux mapping<br />
Important for congestion<br />
Identify cells with high pin density<br />
[Figure: complex-gate versus mux-gate implementation, with inputs a, b and select s]
Wire Delays – Physically Aware<br />
Wire delays become significant as process geometries shrink<br />
Use the Physical Layout Estimator (PLE) method<br />
PLE acts as a dynamic wireload model (WLM), in contrast to the static wireload models in the library<br />
Gate sizing for real wires<br />
Physical Layout Estimator (PLE)<br />
• PLE uses actual design and physical library information.<br />
• Dynamically calculates wire delays for different logic structures in the design.<br />
• Correlates better with place and route.<br />
set_attr lef_library <br />
set_attr cap_table_file <br />
set_attr interconnect_mode ple /<br />
Wire-load Models<br />
• Wire-load models are statistical.<br />
• Wire loads are calculated based on the nearest calibrated area.<br />
• Correlation is difficult even with custom wire-load models.<br />
set_attr interconnect_mode wireload /<br />
set_attr wireload_mode top /<br />
set_attr force_wireload [find /mylib -wireload S160K] /designs/*
Wireload Selection Based On the Design<br />
Timing-critical design<br />
Use PLE<br />
Scale factors introduce pessimism into the PLE-calculated wire capacitances<br />
Scale factors can be derived from the factors used in Encounter<br />
set_attr lef_library <br />
set_attr cap_table_file <br />
set_attr interconnect_mode ple /<br />
set_attribute scale_of_cap_per_unit_len 1.2 /<br />
set_attribute scale_of_res_per_unit_len 1.2 /<br />
Non-timing-critical design<br />
Zero-wireload-model synthesis<br />
Smallest design, great for area<br />
Sized and restructured later for the wires introduced by placement<br />
Slightly over-constrain using path adjust in synthesis<br />
set_attribute force_wireload none /designs/*
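For the non-timing-critical case, the zero-wireload setting and the slight over-constraint can be combined in a few lines. This is a sketch only: the 100 margin (assumed to be in the same units as the slide's other `path_adjust` values), the adjust name, and the choice of register-to-register paths are hypothetical.

```tcl
# Zero-wireload synthesis with a slight over-constraint (illustrative values).
set_attribute force_wireload none /designs/*

# Tighten register-to-register paths by a hypothetical margin of 100 so that
# the wires added later by placement have some slack to consume.
path_adjust -from [all::all_seqs] -to [all::all_seqs] -delay -100 -name PA_C2C_MARGIN

synthesize -to_map -no_incr
```

A negative `-delay` tightens the paths, mirroring the -150 adjusts shown on the path adjust slide; the margin would normally be tuned against post-placement timing.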
Clock Gating – Is it important?<br />
Clock gating<br />
Number of clock-gating elements versus root-level clock gating<br />
RC allows multi-level clock gating<br />
Clone and declone clock gates after clock-gating insertion<br />
Post-map analysis of clock-gating quality of results<br />
Area versus dynamic power savings<br />
report clock_gating -detail<br />
report power -clock_tree -buffers -leaf_max_fanout <br />
set_attr lp_clock_gating_max_flops <br />
/designs/*<br />
set_attr lp_clock_gating_min_flops <br />
/designs/*<br />
clock_gating declone/share
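To make the attribute settings above concrete, here is a minimal sketch with illustrative numbers. The values 3 and 32 are hypothetical and are not taken from the slides, and `lp_insert_clock_gating` is assumed to be the root attribute that enables insertion.

```tcl
# Clock-gating setup and post-map analysis sketch. All values are hypothetical.
set_attribute lp_insert_clock_gating true /             ;# assumed enable; set before elaborate

# ... read_hdl, elaborate, and constraints go here ...

set_attribute lp_clock_gating_min_flops 3  /designs/*   ;# skip very small register groups
set_attribute lp_clock_gating_max_flops 32 /designs/*   ;# cap the flops driven by one gate

synthesize -to_map
report clock_gating -detail                             ;# review gating quality after mapping
```

Raising the minimum reduces the number of gating cells (less area), while lowering the maximum creates more, smaller gated groups; the report is what shows whether the trade-off paid off in dynamic power.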
Optimizing Total Negative Slack and Cell Selection<br />
TNS<br />
If worst negative slack (WNS) is negative, should I care about total negative slack (TNS)?<br />
TNS optimization provides a critical range<br />
Effective for path-based incremental optimization<br />
Focus on the mapping result first<br />
Reduces the number of paths violating timing<br />
Cell Selection<br />
Review the don’t-use lists used during synthesis<br />
Avoid bad cells<br />
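Both ideas on this slide map to attribute settings ahead of the incremental pass. A minimal sketch, assuming `tns_opto` is the root attribute that turns on TNS-driven optimization and `avoid` is the library-cell attribute behind a don't-use list; the cell name pattern is hypothetical.

```tcl
# Enable TNS-driven optimization (attribute name is an assumption; check your RC version).
set_attribute tns_opto true /

# Keep unwanted cells out of mapping. The *DLY* pattern is a hypothetical example
# of a don't-use entry, not a recommendation.
set_attribute avoid true [find / -libcell *DLY*]

synthesize -to_map -incr
```

With WNS-only optimization the tool stops caring about a path once it is no longer the worst one; the TNS setting is what asks it to keep reducing the count and depth of the remaining violators.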
Optimizing For Leakage Power Using Multi Vt Libraries<br />
Leakage power optimization<br />
Load multi-Vt libraries<br />
RC uses the leakage power numbers characterized in the library<br />
Results depend on the accuracy of library characterization<br />
Useful for sprinkling in leaky cells<br />
Strategies<br />
Limit global mapping to high-Vt cells only<br />
Run incremental synthesis using standard-Vt cells<br />
Effective for leakage optimization and gives a better structure for timing<br />
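The two-step strategy can be sketched as follows. This is one possible way to express it, not necessarily the script behind the slides: the library file names, the `/libraries/svt` object path, and the use of the `avoid` attribute to hide the standard-Vt cells during global mapping are all assumptions.

```tcl
# Two-step multi-Vt sketch: global map on high-Vt cells only, then allow
# standard-Vt cells during incremental optimization.
set_attribute library {hvt.lib svt.lib} /            ;# hypothetical library files

# ... read_hdl, elaborate, and constraints go here ...

# Assumes the standard-Vt library object is named "svt".
set svt_cells [find /libraries/svt -libcell *]
set_attribute avoid true $svt_cells                  ;# hide Svt cells from global mapping
synthesize -to_map -no_incr                          ;# global mapping, Hvt only

set_attribute avoid false $svt_cells                 ;# make Svt cells available again
synthesize -to_map -incr                             ;# incremental, Svt on critical paths
```

Mapping on high-Vt cells first forces a timing-friendly structure, so the leakier standard-Vt cells are only "sprinkled in" where the incremental pass actually needs them.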
Selective Ungrouping<br />
Selective ungrouping<br />
ChipWare components<br />
Deeply nested hierarchy<br />
Threshold based or instance-name based<br />
Module naming is controlled during generic mapping<br />
Ungroup ChipWare components and very small instances<br />
set_attribute gen_module_prefix "CW_" /<br />
ungroup -threshold 5000<br />
ungroup [get_attr instance $subdesign]<br />
Effort Levels – Generic, Mapping, Incremental Mapping<br />
Effort levels<br />
High effort is not necessarily a good strategy<br />
Effort level is determined by the targets for cost groups<br />
Global synthesis<br />
Bottom-up synthesis<br />
Don’t discount it completely<br />
It works well for certain designs<br />
Bottom up Flow Implementation<br />
Bottom-up flow implementation<br />
Unit-level constraint generation<br />
Unit-level synthesis<br />
Top-level hookup of unit netlists<br />
[Flowchart: Read RTL, Load constraints, Synthesis, Derive environment, Unit-level RTL, Unit constraints/reset target, Synthesis, Read unit netlist, Stitch top level, Clock gating analysis]
Flow Explorations<br />
A small example design was used to test the RC synthesis strategies<br />
Run 1: load Hvt libs; no power constraints<br />
Run 2: load Hvt libs; map with no timing constraints; incremental with constraints<br />
Run 3: load multi-Vt libs; optimize multi-Vt<br />
Run 4: Hvt; map high -no_incr; override target; load Svt; incremental<br />
Run 5: Hvt; map with no constraints; Svt; incremental<br />
Results for Runs 1 to 5, in the order listed on the slide:<br />
Inst. count: 37312, 39109, 45728, 38611, 37228<br />
Hvt cells (%): 37312 (100), 39109 (100), 30548 (66.8), 30386 (78.7), 29423 (79)<br />
Svt cells (%): 0 (0), 0 (0), 15177 (33.2), 8222 (21.3), 7802 (21)<br />
Timing WNS (ps): -49, -5, -12, 500, 0<br />
Power (mW): 15.99, 14.93, 33.34, 17.7, 14.97<br />
Leakage (mW): 9.46, 9.07, 27, 6.02, 12.14
Conclusions<br />
Exploration of Synthesis flows and strategies<br />
Better QoR achieved by exploring different synthesis flows<br />
THANK YOU!<br />