12.07.2015 Views

Performance Comparison of Fast Multipliers Implemented on ...


International Journal of Computer Applications in Engineering Sciences
[VOL II, ISSUE II, JUNE 2012] [ISSN: 2231-4946]

Performance Comparison of Fast Multipliers Implemented on Variable Precision Floating Point Multiplication Algorithm

Neelima Koppala 1, Rohit Sreerama 2, Paidi Satish 2
1,2 Sree Vidyanikethan Engineering College, Tirupathi
koppalaneelima@gmail.com

Abstract:- Multiplication is a basic arithmetic operation in any typical processor. It requires more hardware resources and processing time than addition and subtraction. The accuracy of a multiplication relies mostly on its precision; a variable precision multiplier offers higher accuracy than single or double precision multipliers. In this paper, a variable precision floating point multiplier is considered, a complete architecture for it is proposed, and four different multipliers are implemented using the variable precision algorithm. A comparative study of performance, in terms of delay characteristics and area, is carried out for the considered multipliers. The best multiplier to be used for the variable precision algorithm is proposed.

Keywords:- Array Multiplier, Carry Save Multiplier, Modified Booth Multiplier, Vedic Multiplier, Variable Precision, Floating Point Multiplication, Speed, Accuracy.

I. INTRODUCTION

The computation speed of computers has increased dramatically during the last decade. This increase is due to the development of VLSI technology, which enabled the integration of millions of transistors on a single chip [1]. Even though computational speed has increased, the accuracy of systems has not increased to the same extent. Without accuracy, errors can easily occur in any system. The accuracy of a multiplication relies mostly on its precision; a variable precision floating point multiplier offers higher accuracy than single or double precision multipliers [1].

Multiplication is the most fundamental operation in any arithmetic logic unit. Multipliers also take much more time to execute, so a multiplier that is both fast and accurate is desirable. Many fast multipliers, such as the array multiplier and the Booth multiplier, have been proposed to increase the speed of the multiplication operation. Fast multipliers play a key role in high speed VLSI processors [2].
To design the best processor, both accuracy and speed of operation must be considered. A variable precision floating point multiplier implemented with fast multipliers therefore offers the accuracy and speed desired in any processor.

This paper is organised as follows. Section 2 recalls the variable precision floating point number representation format and the existing variable precision algorithm. Section 3 describes the functionality of the various multipliers used in the paper. Section 4 describes the proposed architecture for the variable precision floating point multiplier. Section 5 gives the results of the comparative study of the various multipliers implemented on the variable precision floating point multiplier. Section 6 concludes the paper, followed by the references.

II. EXISTING VARIABLE PRECISION FLOATING POINT MULTIPLIER ALGORITHM

This section describes the existing variable precision floating point multiplier. It is based on the variable precision floating point number representation format [1], which is shown in figure 1. The variable precision representation differs from the single and double precision formats defined by IEEE 754.
The variable precision floating point representation has a sign bit (S), a type field (T), a length field (L), a 16 bit exponent and a significand whose words run from F(0) to F(L) [1].

The sign bit indicates whether the value is positive or negative: if the sign bit is 1 the number is negative, and if it is 0 the number is positive. The type field consists of two bits and represents the type of the number. Depending on its value, the number is treated as normalized, infinite, zero or NaN. The length field is five bits long and gives the number of m bit words present in the significand. The words of the significand are stored from most significant, F(0), to least significant, F(L) [1].

The existing variable precision floating point multiplier is based on an algorithm that can be implemented easily on any hardware [1]. In this algorithm only the mantissas are considered. The algorithm reduces the memory needed to store the partial products generated during computation in the classic multiplication method by adding the partial products as soon as they are computed.

Fig 1:- Variable Precision Format.

This algorithm uses only (n × 2m) bits of memory instead of the (n² × 2m) bits used in classic multiplication. It splits the operands A and B and the result into m bit words. The size of the multiplier and of the memory are determined by the value of m [1].

III. FUNCTIONALITY OF EXISTING FAST MULTIPLIERS

In this section four fast multipliers are considered and their functionality is explained. Fast multipliers play a key role in high speed VLSI processors. The four fast multipliers considered in this paper are the Array Multiplier, the Carry Save Multiplier, the Vedic Multiplier and the Modified Booth Multiplier.

A. Array Multiplier

The array multiplier is an efficient layout of a combinational multiplier. It multiplies two binary numbers by employing an array of full adders and half adders. The adder array performs the simultaneous addition of all the product terms [3]-[4]. To generate the product terms, an array of AND gates is placed before the adder array.
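As an illustration of this structure, the following Python sketch models an array multiplier in software: a grid of AND operations forms the product terms, and the rows are then summed at their bit weights. The function names `partial_products` and `array_multiply` are hypothetical, and the row-by-row sum only stands in for the full adder and half adder array; the sketch does not model gate-level carries or timing.

```python
def partial_products(a: int, b: int, n: int) -> list[list[int]]:
    """Return the n x n grid of AND-gate outputs, pp[i][j] = a_j AND b_i."""
    a_bits = [(a >> j) & 1 for j in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    return [[a_bits[j] & b_bits[i] for j in range(n)] for i in range(n)]


def array_multiply(a: int, b: int, n: int) -> int:
    """Multiply two unsigned n-bit numbers by summing the shifted AND rows."""
    if not (0 <= a < (1 << n) and 0 <= b < (1 << n)):
        raise ValueError("operands must be unsigned n-bit values")
    product = 0
    for i, row in enumerate(partial_products(a, b, n)):
        # Row i carries weight 2**i; bit j inside the row carries weight 2**j.
        row_value = sum(bit << j for j, bit in enumerate(row))
        # In hardware this addition is done by the adder array.
        product += row_value << i
    return product
```

For two 4 bit operands the grid holds 16 product terms, one per AND gate, and `array_multiply(13, 11, 4)` returns 143.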
Figure 2 shows the array multiplier.

Fig 2:- Array Multiplier.

In an array multiplier, consider two binary numbers A and B of m and n bits. The mn partial products are produced in parallel by a set of mn AND gates. An n × n bit multiplier requires n(n-2) full adders, n half adders and n² AND gates. The worst case delay of an array multiplier is (2n+1) td [3]-[4]. Its power consumption and delay are comparatively high. It is counted among the fast multipliers, but its hardware complexity is high [4].

B. Carry Save Multiplier

The carry save multiplier is very similar to the array multiplier [5]. Its partial products are generated in parallel and carry save adders are used to sum them, which results in a faster array multiplier [5].

C. Modified Booth Algorithm

In a modification of the Booth algorithm, a triplet of bits is scanned instead of two bits. This technique, usually called the Modified Booth algorithm, can be generalized to any radix. It reduces the number of partial products by one half regardless of the inputs [6]. Recoding is performed in two steps: encoding and selection.
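The two recoding steps can be sketched in Python for the radix-4 case. This is a minimal illustration under stated assumptions, not the implementation evaluated in this paper: the multiplier is taken as an n bit two's complement value with n even, and the names `booth_encode` and `booth_multiply` are hypothetical.

```python
def booth_encode(multiplier: int, n: int) -> list[int]:
    """Encoding: scan overlapping bit triplets of an n-bit two's complement
    multiplier and return radix-4 digits in {-2, -1, 0, 1, 2}, LSB first."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even number")
    if not (-(1 << (n - 1)) <= multiplier < (1 << (n - 1))):
        raise ValueError("multiplier does not fit in n bits")
    bits = multiplier & ((1 << n) - 1)  # two's complement bit pattern
    padded = bits << 1                  # append the implicit bit b[-1] = 0
    digits = []
    for i in range(0, n, 2):
        triplet = (padded >> i) & 0b111  # holds b[i+1], b[i], b[i-1]
        b_hi = (triplet >> 2) & 1
        b_mid = (triplet >> 1) & 1
        b_lo = triplet & 1
        digits.append(-2 * b_hi + b_mid + b_lo)
    return digits


def booth_multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Selection: pick 0, +/-M or +/-2M for each digit and add it at weight 4**k."""
    product = 0
    for k, digit in enumerate(booth_encode(multiplier, n)):
        selected = digit * multiplicand  # one partial product per triplet
        product += selected << (2 * k)
    return product
```

For a 4 bit multiplier, `booth_encode(6, 4)` yields the digits `[-2, 2]`, that is 6 = -2 + 2·4, so only two partial products are formed instead of four.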
The purpose of the encoding is to scan the triplet of bits of the multiplier and define the operation to be performed on the multiplicand, as shown in figure 3. The modified Booth algorithm is fast, but it increases the hardware complexity [6].

Fig 3:- Implementation of Modified Booth Algorithm.

D. Vedic Multiplication

Vedic multiplication is one of the fastest multiplication methods followed in ancient mathematics. The Nikhilam sutra is one of the Vedic methods of multiplication [7]. Nikhilam Sutra means “all from 9 and last from 10”. When large numbers are involved, the Nikhilam sutra is the most efficient method to consider. The complement of the large number from its nearest base is calculated and the multiplication operation is performed on it. Hence, the larger the original number, the lower the complexity of the multiplication [7]. The Nikhilam sutra implementation is shown in figure 4.

Fig 4:- Example of Nikhilam Sutra.

All four of the above fast multipliers are considered in this paper, and a comparative study is made of their performance when implemented on the variable precision floating point multiplier.

IV. PROPOSED ARCHITECTURE FOR VARIABLE PRECISION FLOATING POINT MULTIPLIER

In this section an architecture for the variable precision floating point multiplier is proposed. Figure 5 shows the architecture. The complete architecture is based on the variable precision floating point representation.
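To make the representation from Section II concrete, the Python sketch below models its fields: the sign bit, the 2 bit type field, the 5 bit length field, the 16 bit exponent and the significand words F(0) to F(L). Several details are assumptions made only for illustration and are not given in the paper: the numeric codes of the four types, the word width `m`, the reading of the length field as the index L of the last word, and the bit order used when packing the header. The class and method names are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class NumberType(IntEnum):
    # Assumed 2-bit codes; the paper names the four types but not their encoding.
    NORMALIZED = 0
    ZERO = 1
    INFINITY = 2
    NAN = 3


@dataclass(frozen=True)
class VPNumber:
    sign: int               # S: 1 = negative, 0 = positive
    kind: NumberType        # T: 2-bit type field
    exponent: int           # 16-bit exponent, kept as an unsigned bit pattern
    words: tuple[int, ...]  # F(0) (most significant) ... F(L) (least significant)
    m: int = 32             # assumed width of each significand word, in bits

    def __post_init__(self) -> None:
        if self.sign not in (0, 1):
            raise ValueError("sign must be 0 or 1")
        if not 0 <= self.exponent < (1 << 16):
            raise ValueError("exponent must fit in 16 bits")
        if not 1 <= len(self.words) <= 32:
            raise ValueError("a 5-bit length field allows L = 0..31")
        if any(not 0 <= w < (1 << self.m) for w in self.words):
            raise ValueError("each significand word must fit in m bits")

    @property
    def length(self) -> int:
        """L field, assumed here to hold the index of the last word F(L)."""
        return len(self.words) - 1

    def header(self) -> int:
        """Pack S | T | L | exponent into 1 + 2 + 5 + 16 = 24 bits (assumed order)."""
        return (self.sign << 23) | (int(self.kind) << 21) | (self.length << 16) | self.exponent

    def significand(self) -> int:
        """Concatenate the words, F(0) first, into a single integer."""
        value = 0
        for word in self.words:
            value = (value << self.m) | word
        return value
```

As a usage example, `VPNumber(sign=1, kind=NumberType.NORMALIZED, exponent=5, words=(0x1, 0x2), m=8)` has `length` 1, a packed header of `0x810005` and a significand of `0x102`.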
The sign bit of the result R, S_R, is obtained by the XOR of the sign bits of the operands A and B. The type field is obtained from the control unit: the type of the result depends on the types of the input operands [8]. The exponent is obtained from the 16 bit adder/subtractor. The significand is obtained by multiplying the significands of the input operands. The length field is obtained by adding the length fields of the input operands. All additions are carried out using carry look ahead adder circuits.

Fig 5:- Proposed Architecture for Variable Precision Floating Point Multiplier.

V. RESULTS

A comparative study is made of the four fast multipliers considered in the paper, implemented on the variable precision floating point multiplier. The delay characteristics and the area are calculated and tabulated.

TABLE 1:- COMPARISON RESULTS OF FAST MULTIPLIERS IMPLEMENTED ON VARIABLE PRECISION FLOATING POINT MULTIPLIER.

Type of Multiplier          No. of Slices           No. of 4 input LUTs      No. of bonded IOBs
Array Multiplier            838 out of 4656 (17%)   1501 out of 9312 (16%)   128 out of 232 (55%)
Carry Save Multiplier       794 out of 4656 (17%)   1424 out of 9312 (15%)   128 out of 232 (55%)
Vedic Multiplier            598 out of 4656 (12%)   1139 out of 9312 (12%)   128 out of 232 (55%)
Modified Booth Multiplier   384 out of 4656 (8%)    712 out of 9312 (7%)     128 out of 232 (55%)

TABLE 2:- COMPARISON RESULTS OF FAST MULTIPLIERS IMPLEMENTED ON VARIABLE PRECISION FLOATING POINT MULTIPLIER.

Type of Multiplier          Max. combinational path delay   No. of MULT18X18SIOs
Array Multiplier            59.120 ns                       --
Carry Save Multiplier       56.854 ns                       --
Vedic Multiplier            54.963 ns                       --
Modified Booth Multiplier   55.010 ns                       20 out of 20 (100%)

Table 1 and table 2 show the comparison results. XILINX ISE 10.1 is used for simulation and synthesis. The FPGA selected is the Spartan 3E XC3S500E. The coding is done in VERILOG HDL. The simulation result for the complete architecture is shown in figure 6.


Fig 6:- Simulation Result of Total Architecture of Variable Precision Floating Point Multiplier.

VI. CONCLUSION

In this paper four different fast multipliers are implemented using the variable precision floating point algorithm, and their resource utilization and path delays are compared. The comparative results show that the Vedic multiplier has the lowest delay of the multipliers considered, and that the modified Booth multiplier occupies the least area. Depending on the requirements of the processor, either the Vedic multiplier or the modified Booth multiplier can therefore be used with the variable precision floating point algorithm. A complete architecture for a variable precision floating point multiplier unit that follows the variable precision format is proposed. The simulation and synthesis results are analysed using XILINX ISE.

REFERENCES

[1] Rohit Sreerama, Paidi Satish, K Neelima. “An Algorithm for variable precision based floating point multiplication”, Proc. International Conference on Advances in Information Technology and Mobile Communication, AIM 2012, pp. 238-242.
[2] Sumit R. Vaidya, D. R. Dandekar. “Performance Comparison of Multipliers for Power-Speed Trade-off in VLSI Design”, Recent Advances in Networking, VLSI and Signal Processing, pp. 263-266.
[3] Ravi, N.; Subbaiah, Y.; Prasad, T.J.; Rao, T.S. “A novel low power, low area array multiplier design for DSP applications”, Signal Processing, Communication, Computing and Networking Technologies (ICSCCN), 2011 International Conference, pp. 254-257.
[4] Gorgin, S.; Jaberipur, G.; Parhami, B. “Design and evaluation of decimal array multipliers”, Signals, Systems and Computers, 2009 Conference Record of the Forty-Third Asilomar Conference, pp. 1782-1786.
[5] Raghunath, R.K.J.; Farrokh, H.; Naganathan, N.; Rambaud, M.; Mondal, K.; Masci, F. Hollopeter. “A compact carry-save multiplier architecture and its applications”, Circuits and Systems, 1997. Proceedings of the 40th Midwest Symposium, pp. 794-797, vol. 2.
[6] Elguibaly, F. “A fast parallel multiplier-accumulator using the modified Booth algorithm”, Circuits and Systems II: Analog and Digital Signal Processing, IEEE Transactions, Volume 47, pp. 902-908.
[7] Kumar, A.; Raman, A. “Low power ALU design by ancient mathematics”, Computer and Automation Engineering (ICCAE), 2010 The 2nd International Conference, pp. 862-865.
[8] IEEE-754 Reference Material, http://babbage.cs.qc.cuny.edu/IEEE-754.old/References.xhtml
