07.01.2015 Views

10 Low Power Design in VLSI - LEDA

10 Low Power Design in VLSI - LEDA

10 Low Power Design in VLSI - LEDA

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

<strong>Low</strong> <strong>Power</strong> <strong>Design</strong> <strong>in</strong> <strong>VLSI</strong>


Evolution <strong>in</strong> <strong>Power</strong> Dissipation:


Why worry about power<br />

Heat Dissipation<br />

source : arpa-esto<br />

microprocessor power dissipation<br />

DEC 21164


Computers Def<strong>in</strong>ed by Watts not MIPS:<br />

µWatt Wireless<br />

Sensor Networks<br />

Base Stations<br />

MegaWatt<br />

Data Centers<br />

Wireless<br />

Internet<br />

PDAs, Cameras,<br />

Cellphones,<br />

Laptops, GPS,<br />

Set-tops,<br />

0.1-<strong>10</strong> Watt Clients<br />

Internet<br />

Routers


Why <strong>Low</strong> <strong>Power</strong> <br />

• Growth of battery-powered systems<br />

• Users need for:<br />

– Mobility<br />

– Portability<br />

– Reliability<br />

• Cost<br />

• Environmental effects


IC <strong>Design</strong> Space:


<strong>Power</strong> Impacts on System <strong>Design</strong>:<br />

• Energy consumed per task determ<strong>in</strong>es battery life<br />

– Second order effect is that higher current draws decrease<br />

effective battery energy capacity<br />

• Current draw causes IR drops <strong>in</strong> power supply voltage<br />

– Requires more power/ground p<strong>in</strong>s to reduce resistance R<br />

– Requires thick&wide on-chip metal wires or dedicated<br />

metal layers<br />

• Switch<strong>in</strong>g current (dI/dT) causes <strong>in</strong>ductive power supply<br />

voltage bounce ∝ LdI/dT<br />

– Requires more p<strong>in</strong>s/shorter p<strong>in</strong>s to reduce <strong>in</strong>ductance L<br />

– Requires on-chip/on-package decoupl<strong>in</strong>g capacitance to<br />

help bypass p<strong>in</strong>s dur<strong>in</strong>g switch<strong>in</strong>g transients<br />

• <strong>Power</strong> dissipated as heat, higher temps reduce speed and<br />

reliability<br />

– Requires more expensive packag<strong>in</strong>g and cool<strong>in</strong>g systems


Facts ...<br />

Moore´s Law - doubl<strong>in</strong>g transistors every 18 months<br />

• <strong>Power</strong> is proportional to die area and frequency!<br />

• In the same technology a new architecture has 2-3X <strong>in</strong> Die<br />

Area<br />

• Chang<strong>in</strong>g technology implies 2X frequency<br />

SCALING TECHNOLOGY ...<br />

• Decreas<strong>in</strong>g voltage ( 0.7 scal<strong>in</strong>g factor )<br />

• Decreas<strong>in</strong>g of die area ( 0.5 scal<strong>in</strong>g factor )<br />

• Increas<strong>in</strong>g C per unit area 43% !!!


This implies that the power density <strong>in</strong>crease of 40%<br />

every generation !!!<br />

Temperature is a function of power density and determ<strong>in</strong>ates the type of<br />

cool<strong>in</strong>g system needed.<br />

VARIABLES<br />

• PEAK POWER ( worst case )<br />

Today´s packages can susta<strong>in</strong> a power dissipation over <strong>10</strong>0W for up<br />

to <strong>10</strong>0msec >>> cheaper package if peaks are reduced<br />

• ENERGY SPENT ( for a workload )<br />

More correlated to battery life


<strong>Low</strong> <strong>Power</strong> Strategies:<br />

• OS level :<br />

PARTITIONING, POWER DOWN<br />

• Software level : REGULARITY, LOCALITY, CONCURRENCY<br />

( Compiler technology for low power, <strong>in</strong>struction schedul<strong>in</strong>g )<br />

• Architecture level : PIPELINING, REDUNDANCY, DATA ENCODING<br />

( ISA, architectural design, memory hierarchy, HW<br />

extensions, etc )<br />

• Circuit/logic level : LOGIC STYLES, TRANSISTOR SIZING, ENERGY<br />

RECOVERY<br />

( Logic families, conditional clock<strong>in</strong>g, adiabatic circuits,<br />

asynchronous design )<br />

• Technology level : Threshold reduction, multi-threshold devices, etc


<strong>Power</strong> Consumption Estimation:<br />

30<br />

25<br />

20<br />

Error estimation<br />

15<br />

<strong>10</strong><br />

power consumption<br />

5<br />

0<br />

Arch RTL Circuit Layout<br />

Levels of abstraction


Due to the relative high error rate <strong>in</strong> the architectural<br />

estimation ( no vision of the total area, circuit types,<br />

technology, block activity, etc )<br />

IMPORTANT DESIGN DECISIONS MUST BE DONE<br />

AT ARCHITECTURAL LEVEL!<br />

• Accurate power evaluation is done at late design phases<br />

• Needs of good feedback between all the design phases<br />

- Correlation between power estimation from low level to high<br />

level


TRY TO IMPROVE ACCURACY AT HIGH LEVEL<br />

- Critical path based power consumption analysis<br />

( CIRCUIT TYPES, TECHNOLOGY, ACTIVITY FACTOR )<br />

- Thermal images based correlation analysis<br />

( HOTTEST SPOTS LOCATION, COOLEST SPOTS<br />

LOCATION, TEMPERATURE DIFFERENCES, TEMPERATURE<br />

DISTRIBUTION )


Architectural <strong>Power</strong> Evaluation:<br />

Architectural design partition<br />

• <strong>Power</strong> consumption evaluation at block level<br />

- <strong>Power</strong> density of blocks ( SPICE simulation, statistical <strong>in</strong>put set,<br />

technology and circuit types def<strong>in</strong>ition )<br />

- Activity of blocks and sub-blocks ( runn<strong>in</strong>g benchmarks )<br />

- Area ( feedback from <strong>VLSI</strong> design, circuits and technology<br />

def<strong>in</strong>ed )<br />

• Try do def<strong>in</strong>e scal<strong>in</strong>g factors that allow to remap the<br />

architectural power simulator when technology, area and<br />

circuit types change<br />

• Try to reduce the error estimation at high level


<strong>Power</strong> Dissipation <strong>in</strong> CMOS:<br />

Short-Circuit<br />

Current<br />

Capacitor<br />

Charg<strong>in</strong>g<br />

Current<br />

C L<br />

Diode Leakage Current<br />

Subthreshold Leakage Current<br />

Primary Components:<br />

• Capacitor Charg<strong>in</strong>g (85-90% of active power)<br />

– Energy is ½ CV 2 per transition<br />

• Short-Circuit Current (<strong>10</strong>-15% of active power)<br />

– When both p and n transistors turn on dur<strong>in</strong>g signal<br />

transition<br />

• Subthreshold Leakage (dom<strong>in</strong>ates when <strong>in</strong>active)<br />

– Transistors don’t turn off completely<br />

• Diode Leakage (negligible)<br />

– Parasitic source and dra<strong>in</strong> diodes leak to substrate


Sources of <strong>Power</strong> Dissipation:<br />

• Dynamic power dissipations: whenever the logic<br />

level changes at different po<strong>in</strong>ts <strong>in</strong> the circuit because of the change <strong>in</strong><br />

the <strong>in</strong>put signals the dynamic power dissipation occurs.<br />

– Switch<strong>in</strong>g power dissipation.<br />

– Short-circuit power dissipation.<br />

• Static power dissipations: this is a type of dissipation,<br />

which does not have any effect of level change <strong>in</strong> the <strong>in</strong>put and output.<br />

– Leakage power.


Switch<strong>in</strong>g <strong>Power</strong> Dissipation:<br />

• Caused by the charg<strong>in</strong>g and discharg<strong>in</strong>g of the<br />

node capacitance.<br />

Figure 1: Switch<strong>in</strong>g power<br />

dissipation [1].


Switch<strong>in</strong>g <strong>Power</strong> Dissipation (Contd.):<br />

• P s/w<br />

= 0.5 * α * C L<br />

* V dd2<br />

*f clk<br />

–C L<br />

physical capacitance, V dd<br />

supply voltage, α switch<strong>in</strong>g activity,<br />

f clk<br />

clock frequency.<br />

–C L<br />

(i) = Σ j<br />

C INj<br />

+C wire<br />

+C par(i)<br />

–C IN<br />

the gate <strong>in</strong>put capacitance, C wire<br />

the parasitic <strong>in</strong>terconnect<br />

and C par<br />

diffusion capacitances of each gate[I].<br />

• Depends on:<br />

– Supply voltage<br />

– Physical Capacitance<br />

– Switch<strong>in</strong>g activity


Short circuit power dissipation:<br />

• Caused by simultaneous conduction of n and p blocks.<br />

Figure 2: Short circuit current


Short circuit power dissipation (contd.):<br />

where k = (k n<br />

=k p<br />

), the trans conductance of the transistor,<br />

τ = (t rise<br />

=t fall<br />

), the <strong>in</strong>put/output transition time, V DD<br />

= supply voltage,<br />

f = clock frequency, and V T<br />

= (V Tn<br />

= |V Tp<br />

|), the threshold voltage of<br />

MOSFET.<br />

• Depends on :<br />

– The <strong>in</strong>put ramp<br />

– Load<br />

– The transistor size of the gate<br />

– Supply voltage<br />

– Frequency<br />

– Threshold voltage.


Leakage power dissipation:<br />

• Six short-channel leakage mechanisms are<br />

there:<br />

–I 1<br />

Reverse-bias p-n junction leakage<br />

–I 2<br />

Sub threshold leakage<br />

–I 3<br />

Oxide tunnel<strong>in</strong>g current<br />

–I 4<br />

Gate current due to hot-carrier <strong>in</strong>jection<br />

–I 5<br />

GIDL (Gate Induced Dra<strong>in</strong> Leakage)<br />

–I 6<br />

Channel punch through current<br />

• I 1<br />

and I 2<br />

are the dom<strong>in</strong>ant leakage mechanisms


Leakage power dissipation (contd.)<br />

Figure 3: Summary of leakage current mechanism [2]


PN Junction reverse bias current:<br />

• The reverse bias<strong>in</strong>g of p-n junction cause<br />

reverse bias current<br />

– Caused by diffusion/drift of m<strong>in</strong>ority carrier near the<br />

edge of the depletion region.<br />

where V bias<br />

= the reverse bias voltage across the p-n junction, J s<br />

= the<br />

reverse saturation current density and A = the junction area.


Sub Threshold Leakage Current:<br />

• Caused when the gate voltage is below V th.<br />

Fig 4: Sub threshold current[2]<br />

Fig 5: Subthreshold leakage <strong>in</strong> a negativechannel<br />

metal–oxide–semiconductor<br />

(NMOS) transistor.[2]


Contribution of Different <strong>Power</strong><br />

Dissipation:<br />

Fig 6: Contribution of different powers[1]<br />

Fig 7:Static power <strong>in</strong>creases with<br />

shr<strong>in</strong>k<strong>in</strong>g device geometries [7].


Degrees of Freedom<br />

• The three degrees of freedom are:<br />

– Supply Voltage<br />

– Switch<strong>in</strong>g Activity<br />

– Physical capacitance


Reduc<strong>in</strong>g <strong>Power</strong>:<br />

• Switch<strong>in</strong>g power ∝ activity*½ CV 2 *frequency<br />

– (Ignor<strong>in</strong>g short-circuit and leakage currents)<br />

• Reduce activity<br />

– Clock and function gat<strong>in</strong>g<br />

– Reduce spurious logic glitches<br />

• Reduce switched capacitance C<br />

– Different logic styles (logic, pass transistor, dynamic)<br />

– Careful transistor siz<strong>in</strong>g<br />

– Tighter layout<br />

– Segmented structures<br />

• Reduce supply voltage V<br />

– Quadratic sav<strong>in</strong>gs <strong>in</strong> energy per transition – BIG effect<br />

– But circuit delay is reduced<br />

• Reduce frequency<br />

– Doesn’t save energy just reduces rate at which it is<br />

consumed<br />

– Some sav<strong>in</strong>g <strong>in</strong> battery life from reduction <strong>in</strong> current<br />

draw


Supply Voltage Scal<strong>in</strong>g<br />

• Switch<strong>in</strong>g and short circuit power are<br />

proportional to the square of the supply<br />

voltage.<br />

• But the delay is proportional to the supply<br />

voltage. So, the decrease <strong>in</strong> supply voltage will<br />

results <strong>in</strong> slower system.<br />

• Threshold voltage can be scaled down to get<br />

the same performance, but it may <strong>in</strong>crease the<br />

concern about the leakage current and noise<br />

marg<strong>in</strong>.


Supply Voltage Scal<strong>in</strong>g (contd.)<br />

Fig 8: Scal<strong>in</strong>g supply and threshold<br />

voltages [4]<br />

Fig 9: Scal<strong>in</strong>g of threshold voltage on<br />

leakage power and delay[4]


Switch<strong>in</strong>g Activity Reduction<br />

• Two components:<br />

– f: The average periodicity of data arrivals<br />

– α: how many transitions each arrival will generate.<br />

• There will be no net benefits by Reduc<strong>in</strong>g f.<br />

• α can be reduced by algorithmic optimization, by<br />

architecture optimization, by proper choice of<br />

logic topology and by logic-level optimization.


Physical capacitance reduction<br />

• Physical capacitance <strong>in</strong> a circuit consists of three<br />

components:<br />

– The output node capacitance (C L ).<br />

– The <strong>in</strong>put capacitance (C <strong>in</strong> ) of the driven gates.<br />

– The total <strong>in</strong>terconnect capacitance (C <strong>in</strong>t ).<br />

• Smaller the size of a device, smaller is C L.<br />

• The gate area of each transistor determ<strong>in</strong>es C <strong>in</strong>.<br />

• C <strong>in</strong>t is determ<strong>in</strong>e by width and thickness of the<br />

metal/oxide layers with which the <strong>in</strong>terconnect<br />

l<strong>in</strong>e is made of, and capacitances between layers<br />

around the <strong>in</strong>terconnect l<strong>in</strong>es.


Issues<br />

• Technology Scal<strong>in</strong>g<br />

– Capacitance per node reduces by 30%<br />

– Electrical nodes <strong>in</strong>crease by 2X<br />

– Die size grows by 14% (Moore’s Law)<br />

– Supply voltage reduces by 15%<br />

– And frequency <strong>in</strong>creases by 2X<br />

This will <strong>in</strong>crease the active power by 2.7X


Issues (contd.)<br />

• To meet frequency demand V t will be<br />

scaled, result<strong>in</strong>g high leakage power.<br />

*Source: Intel<br />

Fig <strong>10</strong>:Total power consumption of a microprocessor follow<strong>in</strong>g Moore’s Law


Ultra <strong>Low</strong> <strong>Power</strong> System <strong>Design</strong>:<br />

• <strong>Power</strong> m<strong>in</strong>imization approaches:<br />

• Run at m<strong>in</strong>imum allowable voltage<br />

• M<strong>in</strong>imize effective switch<strong>in</strong>g capacitance


Process<br />

• Progress <strong>in</strong> SOI and bulk silicon<br />

– (a) 0.5V operation of ICs us<strong>in</strong>g SOI technology<br />

– (b) 0.9V operation of bulk silicon memory, logic, and<br />

processors<br />

• Increas<strong>in</strong>g densities and clock frequencies have<br />

pushed the power up even with reduce power<br />

supply


Choice of Logic Style


Choice of Logic Style<br />

• <strong>Power</strong>-delay product improves as voltage decreases<br />

• The “best” logic style m<strong>in</strong>imizes power-delay for a


<strong>Power</strong> Consumption is Data<br />

Dependent<br />

• Example : Static 2 Input NOR Gate<br />

Assume :<br />

P(A=1) = ½<br />

P(B=1) = ½<br />

Then :<br />

P(Out=1) = ¼<br />

P(0→1)<br />

= P(Out=0).P(Out=1)<br />

=3/4 * 1/4 = 3/16<br />

C EFF = 3/16 * C L


Transition Probability of 2-<strong>in</strong>put<br />

NOR Gate<br />

as a function of <strong>in</strong>put probabilities


Switch<strong>in</strong>g Activity (α) : Example


Glitch<strong>in</strong>g <strong>in</strong> Static CMOS


At the Datapath Level…<br />

Irregular<br />

Reusable


Balanc<strong>in</strong>g Operations


Carry Ripple


Data Representation


<strong>Low</strong> <strong>Power</strong> <strong>Design</strong> Consideration<br />

(cont’)<br />

(B<strong>in</strong>ary v.s. Gray Encod<strong>in</strong>g)


Resource Shar<strong>in</strong>g Can Increase<br />

(Separate Bus Structure)<br />

Activity


Resource Shar<strong>in</strong>g Can Increase<br />

Activity (cont’d)


Operat<strong>in</strong>g at the <strong>Low</strong>est Possible<br />

Voltage<br />

• Desire to operate at lowest possible speeds<br />

(us<strong>in</strong>g low supply voltages)<br />

• Use Architecture optimization to compensate for<br />

slower operation<br />

Approach : Trade-off AREA for lower<br />

POWER


Reduc<strong>in</strong>g V dd


<strong>Low</strong>er<strong>in</strong>g V dd Increases Delay<br />

• Concept of Dynamic Voltage Scal<strong>in</strong>g (DVS)


Architecture Trade-offs : Reference<br />

Data Path


Parallel Data Path


Paralelna implementacija dela<br />

datapath :


Pipel<strong>in</strong>ed Data Path


Protočna implementacija:


Paralelno-protočna implementacija:


A Simple Data Path : Summary


Computational Complexity of DCT<br />

Algorithms


<strong>Power</strong> Down Techniques<br />

• Concept of Dynamic<br />

Frequency Scal<strong>in</strong>g (DFS)


Energy-efficient Software<br />

Cod<strong>in</strong>g<br />

• Potential for power reduction via software<br />

modification is relatively unexploited.<br />

• Code size and algorithmic efficiency can<br />

significantly affect energy dissipation<br />

• Pipel<strong>in</strong><strong>in</strong>g at software level- VLIW cod<strong>in</strong>g style<br />

• Examples -


<strong>Power</strong> Hunger – Clock Network<br />

(Always Tick<strong>in</strong>g)<br />

• H-Tree – design deficiencies based on Elmore delay<br />

model<br />

• PLL – every designer (digital or analog) should have<br />

the knowledge of PLL<br />

• Multiple frequencies <strong>in</strong> chips/systems – by PLL<br />

• <strong>Low</strong> ma<strong>in</strong> frequency, But<br />

• Jitter and Noise, Ga<strong>in</strong> and Bandwidth, Pull-<strong>in</strong><br />

and Lock Time, Stability …<br />

• Local time zone<br />

• Self-Timed<br />

• Asynchronous => Use Gated Clocks, Sleep Mode


<strong>Power</strong> Analysis <strong>in</strong> the <strong>Design</strong><br />

Flow


Human Wearable Comput<strong>in</strong>g -<br />

<strong>Power</strong><br />

• Wearable comput<strong>in</strong>g – embedd<strong>in</strong>g computer<br />

<strong>in</strong>to cloth<strong>in</strong>g or creat<strong>in</strong>g a form that can be used<br />

like cloth<strong>in</strong>g<br />

• Current comput<strong>in</strong>g is limited by battery capacity,<br />

output current, and electrical outlet for<br />

recharg<strong>in</strong>g


Conclusions<br />

• High-speed design is a requirement for many applications<br />

• <strong>Low</strong>-power design is also a requirement for IC designers.<br />

• A new way of THINKING to simultaneously achieve both!!!<br />

• <strong>Low</strong> power impacts <strong>in</strong> the cost, size, weight, performance, and<br />

reliability.<br />

• Variable V dd and Vt is a trend<br />

• CAD tools high level power estimation and management<br />

• Don’t just work on <strong>VLSI</strong>, pay attention to MEMS – lot of<br />

problems and potential is great.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!