10 Low Power Design in VLSI - LEDA
10 Low Power Design in VLSI - LEDA
10 Low Power Design in VLSI - LEDA
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
<strong>Low</strong> <strong>Power</strong> <strong>Design</strong> <strong>in</strong> <strong>VLSI</strong>
Evolution <strong>in</strong> <strong>Power</strong> Dissipation:
Why worry about power<br />
Heat Dissipation<br />
source : arpa-esto<br />
microprocessor power dissipation<br />
DEC 21164
Computers Def<strong>in</strong>ed by Watts not MIPS:<br />
µWatt Wireless<br />
Sensor Networks<br />
Base Stations<br />
MegaWatt<br />
Data Centers<br />
Wireless<br />
Internet<br />
PDAs, Cameras,<br />
Cellphones,<br />
Laptops, GPS,<br />
Set-tops,<br />
0.1-<strong>10</strong> Watt Clients<br />
Internet<br />
Routers
Why <strong>Low</strong> <strong>Power</strong> <br />
• Growth of battery-powered systems<br />
• Users need for:<br />
– Mobility<br />
– Portability<br />
– Reliability<br />
• Cost<br />
• Environmental effects
IC <strong>Design</strong> Space:
<strong>Power</strong> Impacts on System <strong>Design</strong>:<br />
• Energy consumed per task determ<strong>in</strong>es battery life<br />
– Second order effect is that higher current draws decrease<br />
effective battery energy capacity<br />
• Current draw causes IR drops <strong>in</strong> power supply voltage<br />
– Requires more power/ground p<strong>in</strong>s to reduce resistance R<br />
– Requires thick&wide on-chip metal wires or dedicated<br />
metal layers<br />
• Switch<strong>in</strong>g current (dI/dT) causes <strong>in</strong>ductive power supply<br />
voltage bounce ∝ LdI/dT<br />
– Requires more p<strong>in</strong>s/shorter p<strong>in</strong>s to reduce <strong>in</strong>ductance L<br />
– Requires on-chip/on-package decoupl<strong>in</strong>g capacitance to<br />
help bypass p<strong>in</strong>s dur<strong>in</strong>g switch<strong>in</strong>g transients<br />
• <strong>Power</strong> dissipated as heat, higher temps reduce speed and<br />
reliability<br />
– Requires more expensive packag<strong>in</strong>g and cool<strong>in</strong>g systems
Facts ...<br />
Moore´s Law - doubl<strong>in</strong>g transistors every 18 months<br />
• <strong>Power</strong> is proportional to die area and frequency!<br />
• In the same technology a new architecture has 2-3X <strong>in</strong> Die<br />
Area<br />
• Chang<strong>in</strong>g technology implies 2X frequency<br />
SCALING TECHNOLOGY ...<br />
• Decreas<strong>in</strong>g voltage ( 0.7 scal<strong>in</strong>g factor )<br />
• Decreas<strong>in</strong>g of die area ( 0.5 scal<strong>in</strong>g factor )<br />
• Increas<strong>in</strong>g C per unit area 43% !!!
This implies that the power density <strong>in</strong>crease of 40%<br />
every generation !!!<br />
Temperature is a function of power density and determ<strong>in</strong>ates the type of<br />
cool<strong>in</strong>g system needed.<br />
VARIABLES<br />
• PEAK POWER ( worst case )<br />
Today´s packages can susta<strong>in</strong> a power dissipation over <strong>10</strong>0W for up<br />
to <strong>10</strong>0msec >>> cheaper package if peaks are reduced<br />
• ENERGY SPENT ( for a workload )<br />
More correlated to battery life
<strong>Low</strong> <strong>Power</strong> Strategies:<br />
• OS level :<br />
PARTITIONING, POWER DOWN<br />
• Software level : REGULARITY, LOCALITY, CONCURRENCY<br />
( Compiler technology for low power, <strong>in</strong>struction schedul<strong>in</strong>g )<br />
• Architecture level : PIPELINING, REDUNDANCY, DATA ENCODING<br />
( ISA, architectural design, memory hierarchy, HW<br />
extensions, etc )<br />
• Circuit/logic level : LOGIC STYLES, TRANSISTOR SIZING, ENERGY<br />
RECOVERY<br />
( Logic families, conditional clock<strong>in</strong>g, adiabatic circuits,<br />
asynchronous design )<br />
• Technology level : Threshold reduction, multi-threshold devices, etc
<strong>Power</strong> Consumption Estimation:<br />
30<br />
25<br />
20<br />
Error estimation<br />
15<br />
<strong>10</strong><br />
power consumption<br />
5<br />
0<br />
Arch RTL Circuit Layout<br />
Levels of abstraction
Due to the relative high error rate <strong>in</strong> the architectural<br />
estimation ( no vision of the total area, circuit types,<br />
technology, block activity, etc )<br />
IMPORTANT DESIGN DECISIONS MUST BE DONE<br />
AT ARCHITECTURAL LEVEL!<br />
• Accurate power evaluation is done at late design phases<br />
• Needs of good feedback between all the design phases<br />
- Correlation between power estimation from low level to high<br />
level
TRY TO IMPROVE ACCURACY AT HIGH LEVEL<br />
- Critical path based power consumption analysis<br />
( CIRCUIT TYPES, TECHNOLOGY, ACTIVITY FACTOR )<br />
- Thermal images based correlation analysis<br />
( HOTTEST SPOTS LOCATION, COOLEST SPOTS<br />
LOCATION, TEMPERATURE DIFFERENCES, TEMPERATURE<br />
DISTRIBUTION )
Architectural <strong>Power</strong> Evaluation:<br />
Architectural design partition<br />
• <strong>Power</strong> consumption evaluation at block level<br />
- <strong>Power</strong> density of blocks ( SPICE simulation, statistical <strong>in</strong>put set,<br />
technology and circuit types def<strong>in</strong>ition )<br />
- Activity of blocks and sub-blocks ( runn<strong>in</strong>g benchmarks )<br />
- Area ( feedback from <strong>VLSI</strong> design, circuits and technology<br />
def<strong>in</strong>ed )<br />
• Try do def<strong>in</strong>e scal<strong>in</strong>g factors that allow to remap the<br />
architectural power simulator when technology, area and<br />
circuit types change<br />
• Try to reduce the error estimation at high level
<strong>Power</strong> Dissipation <strong>in</strong> CMOS:<br />
Short-Circuit<br />
Current<br />
Capacitor<br />
Charg<strong>in</strong>g<br />
Current<br />
C L<br />
Diode Leakage Current<br />
Subthreshold Leakage Current<br />
Primary Components:<br />
• Capacitor Charg<strong>in</strong>g (85-90% of active power)<br />
– Energy is ½ CV 2 per transition<br />
• Short-Circuit Current (<strong>10</strong>-15% of active power)<br />
– When both p and n transistors turn on dur<strong>in</strong>g signal<br />
transition<br />
• Subthreshold Leakage (dom<strong>in</strong>ates when <strong>in</strong>active)<br />
– Transistors don’t turn off completely<br />
• Diode Leakage (negligible)<br />
– Parasitic source and dra<strong>in</strong> diodes leak to substrate
Sources of <strong>Power</strong> Dissipation:<br />
• Dynamic power dissipations: whenever the logic<br />
level changes at different po<strong>in</strong>ts <strong>in</strong> the circuit because of the change <strong>in</strong><br />
the <strong>in</strong>put signals the dynamic power dissipation occurs.<br />
– Switch<strong>in</strong>g power dissipation.<br />
– Short-circuit power dissipation.<br />
• Static power dissipations: this is a type of dissipation,<br />
which does not have any effect of level change <strong>in</strong> the <strong>in</strong>put and output.<br />
– Leakage power.
Switch<strong>in</strong>g <strong>Power</strong> Dissipation:<br />
• Caused by the charg<strong>in</strong>g and discharg<strong>in</strong>g of the<br />
node capacitance.<br />
Figure 1: Switch<strong>in</strong>g power<br />
dissipation [1].
Switch<strong>in</strong>g <strong>Power</strong> Dissipation (Contd.):<br />
• P s/w<br />
= 0.5 * α * C L<br />
* V dd2<br />
*f clk<br />
–C L<br />
physical capacitance, V dd<br />
supply voltage, α switch<strong>in</strong>g activity,<br />
f clk<br />
clock frequency.<br />
–C L<br />
(i) = Σ j<br />
C INj<br />
+C wire<br />
+C par(i)<br />
–C IN<br />
the gate <strong>in</strong>put capacitance, C wire<br />
the parasitic <strong>in</strong>terconnect<br />
and C par<br />
diffusion capacitances of each gate[I].<br />
• Depends on:<br />
– Supply voltage<br />
– Physical Capacitance<br />
– Switch<strong>in</strong>g activity
Short circuit power dissipation:<br />
• Caused by simultaneous conduction of n and p blocks.<br />
Figure 2: Short circuit current
Short circuit power dissipation (contd.):<br />
where k = (k n<br />
=k p<br />
), the trans conductance of the transistor,<br />
τ = (t rise<br />
=t fall<br />
), the <strong>in</strong>put/output transition time, V DD<br />
= supply voltage,<br />
f = clock frequency, and V T<br />
= (V Tn<br />
= |V Tp<br />
|), the threshold voltage of<br />
MOSFET.<br />
• Depends on :<br />
– The <strong>in</strong>put ramp<br />
– Load<br />
– The transistor size of the gate<br />
– Supply voltage<br />
– Frequency<br />
– Threshold voltage.
Leakage power dissipation:<br />
• Six short-channel leakage mechanisms are<br />
there:<br />
–I 1<br />
Reverse-bias p-n junction leakage<br />
–I 2<br />
Sub threshold leakage<br />
–I 3<br />
Oxide tunnel<strong>in</strong>g current<br />
–I 4<br />
Gate current due to hot-carrier <strong>in</strong>jection<br />
–I 5<br />
GIDL (Gate Induced Dra<strong>in</strong> Leakage)<br />
–I 6<br />
Channel punch through current<br />
• I 1<br />
and I 2<br />
are the dom<strong>in</strong>ant leakage mechanisms
Leakage power dissipation (contd.)<br />
Figure 3: Summary of leakage current mechanism [2]
PN Junction reverse bias current:<br />
• The reverse bias<strong>in</strong>g of p-n junction cause<br />
reverse bias current<br />
– Caused by diffusion/drift of m<strong>in</strong>ority carrier near the<br />
edge of the depletion region.<br />
where V bias<br />
= the reverse bias voltage across the p-n junction, J s<br />
= the<br />
reverse saturation current density and A = the junction area.
Sub Threshold Leakage Current:<br />
• Caused when the gate voltage is below V th.<br />
Fig 4: Sub threshold current[2]<br />
Fig 5: Subthreshold leakage <strong>in</strong> a negativechannel<br />
metal–oxide–semiconductor<br />
(NMOS) transistor.[2]
Contribution of Different <strong>Power</strong><br />
Dissipation:<br />
Fig 6: Contribution of different powers[1]<br />
Fig 7:Static power <strong>in</strong>creases with<br />
shr<strong>in</strong>k<strong>in</strong>g device geometries [7].
Degrees of Freedom<br />
• The three degrees of freedom are:<br />
– Supply Voltage<br />
– Switch<strong>in</strong>g Activity<br />
– Physical capacitance
Reduc<strong>in</strong>g <strong>Power</strong>:<br />
• Switch<strong>in</strong>g power ∝ activity*½ CV 2 *frequency<br />
– (Ignor<strong>in</strong>g short-circuit and leakage currents)<br />
• Reduce activity<br />
– Clock and function gat<strong>in</strong>g<br />
– Reduce spurious logic glitches<br />
• Reduce switched capacitance C<br />
– Different logic styles (logic, pass transistor, dynamic)<br />
– Careful transistor siz<strong>in</strong>g<br />
– Tighter layout<br />
– Segmented structures<br />
• Reduce supply voltage V<br />
– Quadratic sav<strong>in</strong>gs <strong>in</strong> energy per transition – BIG effect<br />
– But circuit delay is reduced<br />
• Reduce frequency<br />
– Doesn’t save energy just reduces rate at which it is<br />
consumed<br />
– Some sav<strong>in</strong>g <strong>in</strong> battery life from reduction <strong>in</strong> current<br />
draw
Supply Voltage Scal<strong>in</strong>g<br />
• Switch<strong>in</strong>g and short circuit power are<br />
proportional to the square of the supply<br />
voltage.<br />
• But the delay is proportional to the supply<br />
voltage. So, the decrease <strong>in</strong> supply voltage will<br />
results <strong>in</strong> slower system.<br />
• Threshold voltage can be scaled down to get<br />
the same performance, but it may <strong>in</strong>crease the<br />
concern about the leakage current and noise<br />
marg<strong>in</strong>.
Supply Voltage Scal<strong>in</strong>g (contd.)<br />
Fig 8: Scal<strong>in</strong>g supply and threshold<br />
voltages [4]<br />
Fig 9: Scal<strong>in</strong>g of threshold voltage on<br />
leakage power and delay[4]
Switch<strong>in</strong>g Activity Reduction<br />
• Two components:<br />
– f: The average periodicity of data arrivals<br />
– α: how many transitions each arrival will generate.<br />
• There will be no net benefits by Reduc<strong>in</strong>g f.<br />
• α can be reduced by algorithmic optimization, by<br />
architecture optimization, by proper choice of<br />
logic topology and by logic-level optimization.
Physical capacitance reduction<br />
• Physical capacitance <strong>in</strong> a circuit consists of three<br />
components:<br />
– The output node capacitance (C L ).<br />
– The <strong>in</strong>put capacitance (C <strong>in</strong> ) of the driven gates.<br />
– The total <strong>in</strong>terconnect capacitance (C <strong>in</strong>t ).<br />
• Smaller the size of a device, smaller is C L.<br />
• The gate area of each transistor determ<strong>in</strong>es C <strong>in</strong>.<br />
• C <strong>in</strong>t is determ<strong>in</strong>e by width and thickness of the<br />
metal/oxide layers with which the <strong>in</strong>terconnect<br />
l<strong>in</strong>e is made of, and capacitances between layers<br />
around the <strong>in</strong>terconnect l<strong>in</strong>es.
Issues<br />
• Technology Scal<strong>in</strong>g<br />
– Capacitance per node reduces by 30%<br />
– Electrical nodes <strong>in</strong>crease by 2X<br />
– Die size grows by 14% (Moore’s Law)<br />
– Supply voltage reduces by 15%<br />
– And frequency <strong>in</strong>creases by 2X<br />
This will <strong>in</strong>crease the active power by 2.7X
Issues (contd.)<br />
• To meet frequency demand V t will be<br />
scaled, result<strong>in</strong>g high leakage power.<br />
*Source: Intel<br />
Fig <strong>10</strong>:Total power consumption of a microprocessor follow<strong>in</strong>g Moore’s Law
Ultra <strong>Low</strong> <strong>Power</strong> System <strong>Design</strong>:<br />
• <strong>Power</strong> m<strong>in</strong>imization approaches:<br />
• Run at m<strong>in</strong>imum allowable voltage<br />
• M<strong>in</strong>imize effective switch<strong>in</strong>g capacitance
Process<br />
• Progress <strong>in</strong> SOI and bulk silicon<br />
– (a) 0.5V operation of ICs us<strong>in</strong>g SOI technology<br />
– (b) 0.9V operation of bulk silicon memory, logic, and<br />
processors<br />
• Increas<strong>in</strong>g densities and clock frequencies have<br />
pushed the power up even with reduce power<br />
supply
Choice of Logic Style
Choice of Logic Style<br />
• <strong>Power</strong>-delay product improves as voltage decreases<br />
• The “best” logic style m<strong>in</strong>imizes power-delay for a
<strong>Power</strong> Consumption is Data<br />
Dependent<br />
• Example : Static 2 Input NOR Gate<br />
Assume :<br />
P(A=1) = ½<br />
P(B=1) = ½<br />
Then :<br />
P(Out=1) = ¼<br />
P(0→1)<br />
= P(Out=0).P(Out=1)<br />
=3/4 * 1/4 = 3/16<br />
C EFF = 3/16 * C L
Transition Probability of 2-<strong>in</strong>put<br />
NOR Gate<br />
as a function of <strong>in</strong>put probabilities
Switch<strong>in</strong>g Activity (α) : Example
Glitch<strong>in</strong>g <strong>in</strong> Static CMOS
At the Datapath Level…<br />
Irregular<br />
Reusable
Balanc<strong>in</strong>g Operations
Carry Ripple
Data Representation
<strong>Low</strong> <strong>Power</strong> <strong>Design</strong> Consideration<br />
(cont’)<br />
(B<strong>in</strong>ary v.s. Gray Encod<strong>in</strong>g)
Resource Shar<strong>in</strong>g Can Increase<br />
(Separate Bus Structure)<br />
Activity
Resource Shar<strong>in</strong>g Can Increase<br />
Activity (cont’d)
Operat<strong>in</strong>g at the <strong>Low</strong>est Possible<br />
Voltage<br />
• Desire to operate at lowest possible speeds<br />
(us<strong>in</strong>g low supply voltages)<br />
• Use Architecture optimization to compensate for<br />
slower operation<br />
Approach : Trade-off AREA for lower<br />
POWER
Reduc<strong>in</strong>g V dd
<strong>Low</strong>er<strong>in</strong>g V dd Increases Delay<br />
• Concept of Dynamic Voltage Scal<strong>in</strong>g (DVS)
Architecture Trade-offs : Reference<br />
Data Path
Parallel Data Path
Paralelna implementacija dela<br />
datapath :
Pipel<strong>in</strong>ed Data Path
Protočna implementacija:
Paralelno-protočna implementacija:
A Simple Data Path : Summary
Computational Complexity of DCT<br />
Algorithms
<strong>Power</strong> Down Techniques<br />
• Concept of Dynamic<br />
Frequency Scal<strong>in</strong>g (DFS)
Energy-efficient Software<br />
Cod<strong>in</strong>g<br />
• Potential for power reduction via software<br />
modification is relatively unexploited.<br />
• Code size and algorithmic efficiency can<br />
significantly affect energy dissipation<br />
• Pipel<strong>in</strong><strong>in</strong>g at software level- VLIW cod<strong>in</strong>g style<br />
• Examples -
<strong>Power</strong> Hunger – Clock Network<br />
(Always Tick<strong>in</strong>g)<br />
• H-Tree – design deficiencies based on Elmore delay<br />
model<br />
• PLL – every designer (digital or analog) should have<br />
the knowledge of PLL<br />
• Multiple frequencies <strong>in</strong> chips/systems – by PLL<br />
• <strong>Low</strong> ma<strong>in</strong> frequency, But<br />
• Jitter and Noise, Ga<strong>in</strong> and Bandwidth, Pull-<strong>in</strong><br />
and Lock Time, Stability …<br />
• Local time zone<br />
• Self-Timed<br />
• Asynchronous => Use Gated Clocks, Sleep Mode
<strong>Power</strong> Analysis <strong>in</strong> the <strong>Design</strong><br />
Flow
Human Wearable Comput<strong>in</strong>g -<br />
<strong>Power</strong><br />
• Wearable comput<strong>in</strong>g – embedd<strong>in</strong>g computer<br />
<strong>in</strong>to cloth<strong>in</strong>g or creat<strong>in</strong>g a form that can be used<br />
like cloth<strong>in</strong>g<br />
• Current comput<strong>in</strong>g is limited by battery capacity,<br />
output current, and electrical outlet for<br />
recharg<strong>in</strong>g
Conclusions<br />
• High-speed design is a requirement for many applications<br />
• <strong>Low</strong>-power design is also a requirement for IC designers.<br />
• A new way of THINKING to simultaneously achieve both!!!<br />
• <strong>Low</strong> power impacts <strong>in</strong> the cost, size, weight, performance, and<br />
reliability.<br />
• Variable V dd and Vt is a trend<br />
• CAD tools high level power estimation and management<br />
• Don’t just work on <strong>VLSI</strong>, pay attention to MEMS – lot of<br />
problems and potential is great.