
Chapter 4 Low-Power VLSI Design



Chapter 4. Low-Power VLSI Design
Jin-Fu Li
Advanced Reliable Systems (ARES) Lab.
Department of Electrical Engineering, National Central University, Jhongli, Taiwan
National Central University, EE4012 VLSI Design

Outline
- Introduction
- Low-Power Gate-Level Design
- Low-Power Architecture-Level Design
- Algorithmic-Level Power Reduction
- RTL Techniques for Optimizing Power

Introduction
- Most SoC design teams now regard power as one of their top design concerns.
- Why low-power design?
  - Battery lifetime (especially for portable devices)
  - Reliability
- Power consumption
  - Peak power
  - Average power

Overview of Power Consumption
- Average power consumption
  - Dynamic power consumption
  - Short-circuit power consumption
  - Leakage power consumption
  - Static power consumption
- Dynamic power dissipation during switching
  [Figure: a switching gate output loaded by C_drain, C_interconnect, and the C_input of the driven gates.]

Overview of Power Consumption
- Generic representation of a CMOS logic gate for switching power calculation
  [Figure: pMOS and nMOS networks driven by V_A and V_B; the output node V_out is loaded by C_drain, C_interconnect, and C_input.]

$$P_{avg} = \frac{1}{T}\left[\int_{0}^{T/2} V_{out}\left(-C_{load}\frac{dV_{out}}{dt}\right)dt + \int_{T/2}^{T}\left(V_{DD}-V_{out}\right)\left(C_{load}\frac{dV_{out}}{dt}\right)dt\right]$$

Overview of Power Consumption
- The average power consumption can be expressed as

$$P_{avg} = \frac{1}{T}\,C_{load}V_{DD}^{2} = C_{load}V_{DD}^{2}f_{CLK}$$

- The node transition rate can be slower than the clock rate. To better represent this behavior, a node transition factor $\alpha_T$ should be introduced:

$$P_{avg} = \alpha_T\,C_{load}V_{DD}^{2}f_{CLK}$$

- The switching power expressions above are derived by taking into account the output node load capacitance.

Overview of Power Consumption
  [Figure: a gate with inputs V_A and V_B, an internal node V_internal with capacitance C_internal, and an output node V_out with capacitance C_load.]
- The generalized expression for the average power dissipation can be rewritten as

$$P_{avg} = \left(\sum_{i=1}^{\#\,\text{of nodes}} \alpha_{Ti}\,C_{i}\,V_{i}\right)V_{DD}\,f_{CLK}$$

Gate-Level Design: Technology Mapping
- The objective of logic minimization is to reduce the Boolean function.
- For low-power design, the signal switching activity is minimized by restructuring a logic circuit.
- The power minimization is constrained by the delay; however, the area may increase.
- During this phase of logic minimization, the function to be minimized is

$$\sum_{i} P_{i}\,(1-P_{i})\,C_{i}$$

  where $P_i$ is the signal probability of node $i$ and $C_i$ is its capacitance.

Gate-Level Design: Technology Mapping
- The first step in technology mapping is to decompose each logic function into two-input gates.
- The objective of this decomposition is to minimize the total power dissipation by reducing the total switching activity.
  [Figure: two decompositions of a function of A, B, C, D into two-input gates, annotated with the value 0.5.]

Gate-Level Design: Phase Assignment
  [Figure: two circuits with inputs A, B, C; the high-activity nodes are marked.]
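As a rough illustration of the cost function $\sum_i P_i(1-P_i)C_i$ used during technology decomposition, the sketch below evaluates it for a chain and a tree decomposition of a four-input AND. The gate type, the independent inputs with signal probability 0.5, the unit node capacitances, and all function names are assumptions made for this example; they are not taken from the slides.

```python
def and_prob(p_a, p_b):
    """Probability that a 2-input AND output is 1, assuming independent inputs."""
    return p_a * p_b


def activity_cost(node_probs, caps=None):
    """Sum of P_i * (1 - P_i) * C_i over the listed nodes (unit C_i by default)."""
    if caps is None:
        caps = [1.0] * len(node_probs)
    return sum(p * (1.0 - p) * c for p, c in zip(node_probs, caps))


def chain_cost(p):
    """((A.B).C).D : each gate output feeds the next gate."""
    n1 = and_prob(p, p)
    n2 = and_prob(n1, p)
    out = and_prob(n2, p)
    return activity_cost([n1, n2, out])


def tree_cost(p):
    """(A.B).(C.D) : two first-level gates feed one output gate."""
    n1 = and_prob(p, p)
    n2 = and_prob(p, p)
    out = and_prob(n1, n2)
    return activity_cost([n1, n2, out])


if __name__ == "__main__":
    print("chain:", chain_cost(0.5))
    print("tree: ", tree_cost(0.5))
```

Under these assumptions the chain scores about 0.355 and the tree about 0.434, so the preferred decomposition depends on the signal probabilities. This is a zero-delay estimate and does not account for glitching.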

Gate-Level Design: Pin Swapping
  [Figure: a four-input gate shown with several orderings of the inputs a, b, c, d on its pins, compared by switching activity.]

Gate-Level Design: Glitching Power
- Glitches are spurious transitions due to imbalanced path delays.
- A design that has more balanced delay paths has fewer glitches, and thus has less power dissipation.
- Note that there will be no glitches in dynamic CMOS logic.
  [Figure: two implementations of a circuit with signals A, B, C, D, E.]

Gate-Level Design: Glitching Power
- A chain structure has more glitches.
- A tree structure has fewer glitches.
  [Figure: a chain structure and a tree structure, each combining inputs A, B, C, D.]

Gate-Level Design: Precomputation
  [Figure: combinational logic between registers R1 and R2, shown in two versions.]
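The chain-versus-tree glitching comparison above can be reproduced with a small unit-delay simulation. This is only a sketch: the use of two-input XOR gates, the one-unit gate delay, the all-zeros to all-ones input change, and the names `count_transitions`, `CHAIN`, and `TREE` are assumptions chosen for illustration.

```python
def count_transitions(gates, old_inputs, new_inputs, steps=8):
    """Unit-delay simulation of a network of 2-input XOR gates.

    gates maps each gate output name to its two input names. Returns the
    number of output transitions of every gate after the inputs switch.
    """
    values = dict(old_inputs)
    values.update({name: 0 for name in gates})
    # Let the network settle on the old input vector.
    for _ in range(steps):
        values.update({g: values[a] ^ values[b] for g, (a, b) in gates.items()})
    # Apply the new inputs and count every change of every gate output.
    values.update(new_inputs)
    toggles = {name: 0 for name in gates}
    for _ in range(steps):
        nxt = {g: values[a] ^ values[b] for g, (a, b) in gates.items()}
        for g in gates:
            toggles[g] += int(nxt[g] != values[g])
        values.update(nxt)
    return toggles


# Chain: ((A ^ B) ^ C) ^ D, so inputs reach the output through unequal paths.
CHAIN = {"n1": ("A", "B"), "n2": ("n1", "C"), "out": ("n2", "D")}
# Tree: (A ^ B) ^ (C ^ D), so all inputs reach the output through two gates.
TREE = {"n1": ("A", "B"), "n2": ("C", "D"), "out": ("n1", "n2")}
ZEROS = dict.fromkeys("ABCD", 0)
ONES = dict.fromkeys("ABCD", 1)

if __name__ == "__main__":
    print("chain:", count_transitions(CHAIN, ZEROS, ONES))
    print("tree: ", count_transitions(TREE, ZEROS, ONES))
```

In this setup the final output value equals the initial one, so both transitions on the chain output are spurious, while the balanced tree produces none.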

Gate-Level Design: Precomputation
  [Figure: n-bit comparator with precomputation. Labels: A<n-1> and B<n-1> held in register R1 (MSB) and fed to a 1-bit comparator; A<n-2:0> held in register R2 and B<n-2:0> held in register R3, feeding an (n-1)-bit comparator; an Enable signal from the precomputation logic; register R4; output F.]

Gate-Level Design: Gating Clock
  [Figure: a row of D flip-flops driven by a gated clk. This version fails DFT rule checking.]
  [Figure: the same row of D flip-flops with an added control pin (T). Adding the control pin solves the DFT violation problem.]

Gate-Level Design: Input Gating
  [Figure: blocks f1 and f2 with signals clk and select and an adder (+).]

Clock-Gating in Low-Power Flip-Flop
  [Figure: D flip-flop with inputs D and CK and output Q.]
Source: Prof. V. D. Agrawal

Reduced-Power Shift Register
  [Figure: two rows of D flip-flops clocked by CK(f/2); a multiplexer selects the output.]
- Flip-flops are operated at full voltage and half the clock frequency.
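The comparator precomputation scheme above can be modeled behaviorally. The sketch below assumes that the function F is the unsigned comparison A > B and that the Enable signal is the XNOR of the two MSBs; neither detail is legible in the transcript, and the function names are hypothetical.

```python
from itertools import product


def precomputed_greater_than(a, b, n):
    """Return (a > b, low_bits_loaded) for n-bit unsigned a and b.

    low_bits_loaded says whether the low-order input registers had to be
    enabled, i.e. whether the (n-1)-bit comparator did any work.
    """
    msb_a = (a >> (n - 1)) & 1
    msb_b = (b >> (n - 1)) & 1
    enable = msb_a == msb_b  # assumed precomputation logic: XNOR of the MSBs
    if not enable:
        # MSBs differ: the result is decided by the 1-bit comparator alone,
        # and the low-order registers keep their old contents.
        return msb_a > msb_b, False
    mask = (1 << (n - 1)) - 1
    return (a & mask) > (b & mask), True


def load_fraction(n):
    """Fraction of all input pairs for which the low-order registers load."""
    pairs = list(product(range(1 << n), repeat=2))
    loaded = 0
    for a, b in pairs:
        result, low_loaded = precomputed_greater_than(a, b, n)
        assert result == (a > b)  # precomputation must not change the function
        loaded += low_loaded
    return loaded / len(pairs)


if __name__ == "__main__":
    print(load_fraction(4))
```

For uniformly distributed inputs the low-order registers are enabled for half of the input pairs, which is where the switching-activity saving in this model comes from.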

Source: Prof. V. D. Agrawal

Power Consumption of Shift Register
- 16-bit shift register, 2 μm CMOS
- $P = C' V_{DD}^{2} f / n$

Degree of parallelism, n   Frequency (MHz)   Power (μW)
1                          33.0              1535
2                          16.5              887
4                          8.25              738

  [Figure: normalized power versus degree of parallelism, n.]
- C. Piguet, "Circuit and Logic Level Design," pages 103-133 in W. Nebel and J. Mermet (ed.), Low Power Design in Deep Submicron Electronics, Springer, 1997.
Source: Prof. V. D. Agrawal

Architecture-Level Design: Parallelism
  [Figure: a single 16x16 multiplier with input registers for A and B clocked at f_ref, compared with two 16x16 multipliers whose registers are clocked at f_ref/2 and whose 32-bit outputs are selected by a multiplexer (MUX).]
- Assume that, with the same 16x16 multiplier, the power supply can be reduced from V_ref to a lower voltage because each multiplier runs at f_ref/2.
  [Equation for P_parallel: the reduced voltage and the coefficients are not recoverable from the transcript.]

Architecture-Level Design: Pipelining
- When the hardware between the pipeline stages is reduced, the reference voltage V_ref can be reduced to V_new while maintaining the same worst-case delay.
- For example, let a 50 MHz multiplier be broken into two equal parts as shown below. The throughput of the pipeline stages can be kept at 50 MHz when the voltage V_new is equal to V_ref/1.83.
  [Figure: inputs (A, B), register, half multiplier, register, half multiplier, with 32-bit data paths, clocked at f_ref.]

$$P_{pipeline} = 1.2\,C_{ref}\left(\frac{V_{ref}}{1.83}\right)^{2} f_{ref} \approx 0.36\,P_{ref}$$

Architecture-Level Design: Retiming
- Retiming is a transformation technique used to change the locations of delay elements in a circuit without affecting the input/output characteristics of the circuit.
- Example: two versions of an IIR filter.
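The pipelining figure of about 0.36 P_ref follows directly from P = C V² f. The sketch below redoes that arithmetic with the capacitance factor 1.2 and the voltage divisor 1.83 quoted on the pipelining slide; the helper name `relative_power` and its parameters are hypothetical.

```python
def relative_power(cap_factor, voltage_divisor, freq_factor=1.0):
    """Power relative to P_ref = C_ref * V_ref**2 * f_ref.

    cap_factor      : switched capacitance as a multiple of C_ref
    voltage_divisor : V_ref / V_new
    freq_factor     : clock frequency as a multiple of f_ref
    """
    return cap_factor * (1.0 / voltage_divisor) ** 2 * freq_factor


if __name__ == "__main__":
    # Pipelined multiplier: 1.2 * C_ref at V_ref / 1.83 and the full f_ref.
    print(round(relative_power(1.2, 1.83), 2))
```

Because the voltage enters quadratically, the 20% capacitance overhead of the extra pipeline registers is more than offset by the lower supply.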

  [Figure: the IIR filter before and after retiming, with input x(n), output y(n), coefficients a and b, delay elements D and 2D, and internal signals w(n), w1(n), w2(n).]

Architecture-Level Design: Retiming
- Retiming for pipeline design
  - Before: REG, C1 (6 ns), C2 (2 ns), REG, C3 (4 ns), clocked at f_ref
  - After: REG, C1 (6 ns), REG, C2 (2 ns), C3 (4 ns), clocked at f_ref

Architecture-Level Design: Retiming
  [Figure: a circuit whose clock cycle is 4 gate delays, and its retimed version whose clock cycle is 2 gate delays.]

Architecture-Level Design: Power Management
  [Figure: blocks C1 and C2 with control signals C1_FREEZE and C2_FREEZE, shown in two versions.]

Architecture-Level Design: Bus Segmentation
- Avoid the sharing of resources
  - Reduce the switched capacitance
- For example: a global system bus
  - A single shared bus is connected to all modules. This structure results in a large bus capacitance due to
    - the large number of drivers and receivers sharing the same bus
    - the parasitic capacitance of the long bus line
- A segmented bus structure
  - Switched capacitance during each bus access is significantly reduced
  - Overall routing area may be increased
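The retiming-for-pipeline example above (C1 = 6 ns, C2 = 2 ns, C3 = 4 ns) can be checked with a few lines of arithmetic. This is a minimal sketch; the function names and the list-of-stages representation are assumptions made for illustration.

```python
def clock_period(stages):
    """Minimum clock period: the largest total block delay between registers."""
    return max(sum(stage) for stage in stages)


def best_split(delays):
    """Index k that places one register between delays[:k] and delays[k:]
    so that the resulting two-stage clock period is smallest."""
    candidates = range(1, len(delays))
    return min(candidates, key=lambda k: clock_period([delays[:k], delays[k:]]))


# Block delays in ns, taken from the slide.
BEFORE = [[6, 2], [4]]   # REG C1 C2 REG C3
AFTER = [[6], [2, 4]]    # REG C1 REG C2 C3

if __name__ == "__main__":
    print(clock_period(BEFORE), clock_period(AFTER), best_split([6, 2, 4]))
```

Moving the register shortens the critical stage from 8 ns to 6 ns. At an unchanged f_ref, that slack is what allows the supply voltage to be lowered.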

Architecture-Level Design: Bus Segmentation
  [Figure: a single shared bus with capacitance C_bus, compared with a segmented bus whose segments (labeled C_bus1) are joined by a bus interface.]

Algorithmic-Level Design: factivity Reduction
- Minimizing the switching activity at a high level is one way to reduce the power dissipation of digital processors.
- One method to minimize the switching of signals at the algorithmic level is to use an appropriate coding for the signals rather than straight binary code.
- The table below compares the 3-bit binary and Gray code representations.

Binary code   Gray code   Decimal equivalent
000           000         0
001           001         1
010           011         2
011           010         3
100           110         4
101           111         5
110           101         6
111           100         7

State Encoding for a Counter
- Two-bit binary counter
  - State sequence: 00 -> 01 -> 10 -> 11 -> 00
  - Six bit transitions in four clock cycles: 6/4 = 1.5 transitions per clock
- Two-bit Gray-code counter
  - State sequence: 00 -> 01 -> 11 -> 10 -> 00
  - Four bit transitions in four clock cycles: 4/4 = 1.0 transition per clock
- The Gray-code counter is more power efficient.
- G. K. Yeap, Practical Low Power Digital VLSI Design, Boston: Kluwer Academic Publishers (now Springer), 1998.
Source: Prof. V. D. Agrawal

Binary Counter: Original Encoding

Present state (a b)   Next state (A B)
0 0                   0 1
0 1                   1 0
1 0                   1 1
1 1                   0 0

A = a'b + ab'
B = a'b' + ab'
  [Figure: logic implementation with state bits a and b, next-state outputs A and B, clock CK, and clear CLR.]
Source: Prof. V. D. Agrawal

Binary Counter: Gray Encoding

Present state (a b)   Next state (A B)
0 0                   0 1
0 1                   1 1
1 0                   0 0
1 1                   1 0

A = a'b + ab
B = a'b' + a'b
  [Figure: logic implementation with state bits a and b, next-state outputs A and B, clock CK, and clear CLR.]
Source: Prof. V. D. Agrawal

Three-Bit Counters

Binary                   Gray code
State   No. of toggles   State   No. of toggles
000     -                000     -
001     1                001     1
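The toggle counts for binary and Gray-code counters can be reproduced by counting the bits that differ between consecutive states over one full count cycle. A minimal sketch follows; the function names are hypothetical, and the Gray sequence is generated with the standard reflected-code formula `i ^ (i >> 1)`, which matches the 3-bit code table above.

```python
def bit_transitions(sequence):
    """Total number of bit flips over one full cycle of a cyclic state sequence."""
    total = 0
    for cur, nxt in zip(sequence, sequence[1:] + sequence[:1]):
        total += bin(cur ^ nxt).count("1")
    return total


def binary_sequence(n_bits):
    """States of an n-bit binary up-counter."""
    return list(range(1 << n_bits))


def gray_sequence(n_bits):
    """States of an n-bit reflected Gray-code counter."""
    return [i ^ (i >> 1) for i in range(1 << n_bits)]


if __name__ == "__main__":
    for n in (2, 3):
        cycles = 1 << n
        b = bit_transitions(binary_sequence(n))
        g = bit_transitions(gray_sequence(n))
        print(f"{n}-bit: binary {b}/{cycles}, Gray {g}/{cycles} transitions per cycle")
```

For two bits this gives 6 versus 4 transitions in four clock cycles, in agreement with the counts on the state-encoding slide.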