Factorial in VHDL - ALSE

© 2009 A.L.S.E. - Factorial Application Note v1.4 

Introduction 

This Application Note demonstrates some advanced features of VHDL and some coding techniques that can be used to keep the description both efficient and synthesizable. While a factorial calculation in hardware is unlikely to find its place in real-world applications (besides school assignments), the techniques demonstrated here are very applicable to real-world projects and can help enhance code quality.

(In other words: yes, we could code the factorial using a loop, and yes, RTL projects usually don't need recursion...)

The function we want to implement in hardware is the well-known Factorial, noted with an exclamation mark:

0! = 1 (by convention)

1! = 1

N! = N x (N-1) x (N-2) x ... x 2 x 1

This function is easy to define using recursion : 

N ! = N x (N-1) ! 

We'll see that we can code the recursion while keeping the code synthesizable. 

The Entity 

To make it easy to display the maximum operating frequency, we implement the Factorial (a combinational function) between two banks of registers (flip-flops), so this module is fully synchronous and pipelined. Therefore the interface needs a clock, an input vector and an output vector. This is good practice for unitary synthesis, as it allows a fast and realistic estimate of complexity. Note that we don't really need a reset in this trivial case (because the system doesn't have any feedback path).

Our first attempts will implement the Factorial of numbers between 0 and 5 or between 0 and 7 (3-bit unsigned vectors). The largest result, 7! = 5040, lies between 2**12 = 4096 and 2**13 = 8192, so the output needs at most 13 bits.

Library IEEE;
  use IEEE.std_logic_1164.all;
  use IEEE.numeric_std.all;

Entity Factorial is
  port ( Clk    : in  std_logic;
         Din    : in  std_logic_vector (2 downto 0);    -- 0! .. 7! = 5040
         Result : out std_logic_vector (12 downto 0) ); -- 0 .. 8191
End Entity Factorial;

(c) 2009 A.L.S.E. - B. Cuzeau ApNote: Factorial in VHDL 1

Test bench 

Even if the code is going to be straightforward, we must create at least a simple Test Bench to verify that the code works behaviorally.

The test bench we need is extremely simple: at the input, we apply values counting up from 0 to 2**3-1 = 7 (and cycling), each value remaining stable for 4 clock cycles (so we can see the input value nicely propagate to the output). And we just eyeball the output waveform.

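--------------------------------------------------
-- Test Bench. Simulate -all, eyeball the results
--------------------------------------------------
-- synopsys translate_off
library IEEE; use IEEE.std_logic_1164.all; use IEEE.numeric_std.all;

Entity Factorial_tb is end;

Architecture TEST of Factorial_tb is
  signal Clk    : std_logic := '0';
  signal Din    : std_logic_vector (2 downto 0) := (others=>'0');
  signal Result : std_logic_vector (12 downto 0);
begin
  -- The rest of this listing is cut off in this copy. The statements below
  -- are a reconstruction from the description above; the stop time, the
  -- clock period and the instance label are assumptions.
  assert Clk='0' or now < 800 ns
    report "Simulation has ended (not an error)." severity failure;
  Clk <= not Clk after 10 ns;                              -- assumed 20 ns period
  Din <= std_logic_vector(unsigned(Din) + 1) after 80 ns;  -- 4 clock cycles per value
  UUT : entity work.Factorial port map (Clk, Din, Result);
end architecture TEST;
-- synopsys translate_on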

Simulation : 

All looks good. 

But while this simulates correctly as seen above, it fails miserably with all the FPGA synthesis tools we tried as of April 2009 (Synplify, XST, Quartus).

The reason is that they attempt to implement the function itself in hardware, and they don't discover that the required depth of recursion is limited (they don't “unroll the loop”). Even clearly limiting the bounds by declaring the input parameter as “range 0 to 5” doesn't help.
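The kind of description that triggers this behavior can be sketched as follows. This is a minimal sketch and not the original listing: the architecture name Naive and the single clocked process are assumptions, and the block relies on the 3-bit Factorial entity declared earlier being compiled into library work. The recursive function is called directly on a value that is only known at run time, so the tool has to turn the recursion itself into logic.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical architecture name; assumes the 3-bit Factorial entity
-- shown earlier has been compiled into library work.
architecture Naive of Factorial is

  -- Plain recursive factorial on naturals
  function fact (d : natural) return natural is
  begin
    if d <= 1 then
      return 1;
    else
      return d * fact(d - 1);
    end if;
  end function fact;

  signal Dinr : std_logic_vector (Din'range);

begin

  process (Clk)
  begin
    if rising_edge(Clk) then
      Dinr   <= Din;  -- input registers
      -- The argument is a signal value, unknown at elaboration time:
      Result <= std_logic_vector(
                  to_unsigned(fact(to_integer(unsigned(Dinr))), Result'length));
    end if;
  end process;

end architecture Naive;
```

In simulation the recursion simply runs for each new input value; the trouble only appears when a synthesis tool has to turn the call into hardware.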

Be careful and prepared when trying this code on a synthesis tool: some tools will issue a warning and stop gracefully, others may just eat up gigabytes of your PC's memory until it dies, or end with a stack failure...

Obviously, we need to find another (more synthesis-friendly) way to describe the same logic. 

Synthesis-Friendly Solution 

The way to address the (lack of) “synthesizability” of the previous description is applicable to many other situations (hence the interest of this Application Note):

Use the computational algorithm to build constants, and let the synthesis tool deal with this finite set of values to generate and reduce the combinational logic.

In VHDL, this is easy to do: we build a constant Table holding the set of result values, and we simply index it with the input vector, used as an unsigned number, to retrieve the output. Doing so, we more or less describe a ROM, and we may fear that our intended combinational logic would end up in a memory block. In fact, this doesn't happen here: the synthesis tools are smart enough to figure out how to implement the logic, and since the Factorial decoding logic is very simple, the Quality of Result is excellent and the design fits in just a few Logic Elements.

Let's see the code : 

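-- ---------------------------------------
Architecture RTL of Factorial is -- yes, this is perfect for synthesis !
-- ---------------------------------------
  -- The usual recursive function
  function fact (d : natural) return natural is
    variable res : natural;
  begin
    if d <= 1 then
      res := 1;
    else
      res := d * fact (d-1);
    end if;
    return res;
  end function fact;

  -- Constant table type
  type Table_t is array (0 to 2**Din'length - 1) of unsigned(Result'range);

  -- Function to initialize a table with the factorial
  impure function Init_Table return Table_t is
    variable T : Table_t;
  begin
    for I in T'range loop
      T(I) := to_unsigned(fact(I),Result'length);
    end loop;
    return T;
  end function Init_Table;

  -- The Table itself, initialized by calling Init_Table:
  constant Table : Table_t := Init_Table;
  -- note : this table will be simplified into a few LUTs

  signal Dinr : std_logic_vector (Din'range);
------\
Begin -- Architecture
------/
  -- (the statements are cut off in this copy; reconstructed from the text)
  Dinr   <= Din when rising_edge(Clk);  -- input registers
  Result <= std_logic_vector(Table(to_integer(unsigned(Dinr))))
            when rising_edge(Clk);      -- table look-up + output registers
end architecture RTL;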

Synthesis Result

[Figure: post-layout schematic. The Clk and Din[0..2] inputs feed the Dinr input registers, eight LUTs decode the value, and the outputs are registered into Result[0..12].]

We have displayed the post-layout View and revealed the equivalent functions of the internal LUTs.

Refinement : Design Re-use and Large Integers. 

The previous solution wasn't bad, but what if we want to try with larger integers?

We face two issues:

– We need to change the entity (increase the number of bits of the ports),

– We will soon have to handle more than 31 bits for the output (12! = 479,001,600 still fits in 31 bits, but 13! = 6,227,020,800 does not).

For the first issue, using generic parameters is not a great solution, because setting the generics is no easier than modifying the port widths directly. In the previous description, changing the port widths was sufficient because, inside the architecture, we used attributes to recover the size of items.

Can we do even better? Yes, we can.

The idea is to remove the port dimensions! In VHDL jargon, our ports can use unconstrained vectors.

How can this work? The vectors take their actual dimensions at elaboration time, when the entity is hooked into the design as an instance. The upper-level module (for example the Test Bench) provides the dimensions.

Wait a sec! What if the module is at the top level (as for unitary synthesis)? Are we doomed?

Thankfully, the workaround is easy: we use a “Wrapper” as the top level, which has the same ports but with dimensioned vectors, and which simply instantiates the un-dimensioned entity. Our example shows this.

For the second issue (dealing with very large integers), we need to avoid “Naturals” and use “Unsigned” instead, since unsigned vectors have no limit in size.

When we start coding the Factorial function using unsigned, we face a number of issues again:

– sizing the function parameters,

– handling the multiplication result size.

We can cure the first concern by again using unconstrained arrays as function parameters. That's another powerful feature of VHDL: the dimensions are fixed dynamically when the function is called.

The second concern comes from the fact that the multiplication operator overloaded for unsigned produces a result which is as wide as the sum of the widths of the operands. This is too large for our purpose: we want a result of the same size as the output (just like with naturals). And unfortunately, if you try to extract a slice of an expression like d * Fact (d-1), you'll be... disappointed (a slice can only be taken from a name or a function call, not from an operator expression). A neat workaround is to code the multiply operator as a function call, in which case the slice can be extracted:

res := "*" (d,Fact (d-1))(res'range); 

Another more recommended way in this case would be to simply use the resize function : 

res := resize (d * Fact (d-1),res'length); 
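The sizing rules discussed above can be checked with a small simulation-only sketch. The entity name Mult_Width_Sketch, the helper Mul_Same_Width and the 8-bit operands are assumptions chosen for illustration; they are not part of the original design. The helper takes unconstrained parameters, so its result width follows the actual argument, and both truncation styles are compared on the same operands.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

entity Mult_Width_Sketch is
end entity Mult_Width_Sketch;

architecture Demo of Mult_Width_Sketch is

  -- Hypothetical helper: a and b are unconstrained, the result
  -- takes the width of the actual vector passed as a.
  function Mul_Same_Width (a, b : unsigned) return unsigned is
    variable res : unsigned (a'length - 1 downto 0);
  begin
    res := resize(a * b, res'length);  -- keep the low-order bits
    return res;
  end function Mul_Same_Width;

begin

  process
    variable x, y   : unsigned (7 downto 0);
    variable full   : unsigned (15 downto 0);  -- 8 + 8 bits
    variable r1, r2 : unsigned (7 downto 0);
  begin
    x := to_unsigned(6, 8);
    y := to_unsigned(7, 8);

    full := x * y;                 -- "*" returns x'length + y'length bits
    r1   := "*" (x, y)(r1'range);  -- function call notation, then slice
    r2   := Mul_Same_Width(x, y);  -- resize-based version

    assert full = 42 and r1 = 42 and r2 = 42
      report "unexpected product" severity failure;
    report "All three forms give 42.";
    wait;  -- stop this process
  end process;

end architecture Demo;
```

For unsigned operands, resize keeps the low-order bits, so the slice and the resize forms agree whenever the true product fits in the target width.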

And this is the final solution, reproduced next page. 

Conclusion 

As previously mentioned, all the techniques and tricks presented here can be used in many different designs and circumstances to enhance code quality, readability, and re-usability.

Usual disclaimer: don't ask for support if you are not a customer, but I welcome ideas and suggestions. 

Happy coding in VHDL ! 

Bertrand Cuzeau – CTO A.L.S.E 

info@alse-fr.com 

-=oOo=- 


-- Fact_rtl.vhd
-- --------------------------------------------------
-- Factorial Example - Synthesizable & efficient !
-- --------------------------------------------------
-- Author : (c) Bert Cuzeau. ALSE. http://www.alse-fr.com
-- Version : 3.1, using unconstrained vectors.
--           Handles numbers larger than 2**31
--
-- Synthesis results : 8 LUTs for (0! .. 7!)
--                     32 LUTs for (0! .. 15! = 1,307,674,368,000)
-- Tested with Quartus II v 9.0.
-- Should be fine with any synthesis tool.
--
-- Make sure you synthesize "Wrapper" as the top level.

Library IEEE;
  use IEEE.std_logic_1164.all;
  use IEEE.numeric_std.all;

-- ---------------------------------------
Entity Factorial is
-- ---------------------------------------
  port ( Clk    : in  std_logic;
         Din    : in  std_logic_vector;    -- Yep, unconstrained !
         Result : out std_logic_vector );
End Entity Factorial;

-- ---------------------------------------
Architecture RTL of Factorial is -- yes, this is perfect for synthesis !
-- ---------------------------------------
  -- The (almost) usual recursive function
  function Fact (d : unsigned) return unsigned is
    variable res : unsigned (d'range);
  begin
    if d <= 1 then
      res := (0=>'1',others=>'0'); -- 1
    else
      res := "*" (d,Fact (d-1))(res'range); -- function call notation trick
      -- res := resize(d * Fact(d-1),res'length); -- recommended
    end if;
    return res;
  end function fact;

  -- Constant table type
  type Table_t is array (0 to 2**Din'length - 1) of unsigned(Result'range);

  -- Function to initialize a table with the factorial
  impure function Init_Table return Table_t is
    variable T : Table_t;
  begin
    for I in T'range loop
      T(I) := fact(to_unsigned(I,Result'length));
    end loop;
    return T;
  end function Init_Table;

  -- The Table itself, initialized at creation :
  constant Table : Table_t := Init_Table;
  -- note : this table will be simplified into a few LUTs

  signal Dinr : std_logic_vector (Din'range);
------\
Begin -- Architecture
------/
  -- (the statements are cut off in this copy; reconstructed from the text)
  Dinr   <= Din when rising_edge(Clk);  -- input registers
  Result <= std_logic_vector(Table(to_integer(unsigned(Dinr))))
            when rising_edge(Clk);      -- table look-up + output registers
end architecture RTL;

--------------------------------------------------------
-- Wrapper for Synthesis (4 -> 41 bits implementation)
--------------------------------------------------------
Library IEEE; use IEEE.std_logic_1164.all;

Entity Wrapper is -- For Synthesis of 4 bits -> 41 bits
  port ( Clk    : in  std_logic;
         Din    : in  std_logic_vector (3 downto 0);    -- 0! .. 15! = 13077775800 hex
         Result : out std_logic_vector (40 downto 0) ); -- 41 bits result
End Entity Wrapper;

Architecture Wrap of Wrapper is
begin
  Fact : entity work.Factorial port map (Clk,Din,Result);
end architecture Wrap;

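--------------------------------------------------
-- Test Bench. Simulate -all, eyeball the results
--------------------------------------------------
-- synopsys translate_off
library IEEE; use IEEE.std_logic_1164.all; use IEEE.numeric_std.all;

Entity Factorial_tb is end;

Architecture TEST of Factorial_tb is
  signal Clk    : std_logic := '0';
  signal Din    : std_logic_vector (3 downto 0) := (others=>'0'); -- 0! .. 15!
  signal Result : std_logic_vector (40 downto 0);
begin
  assert Clk='0' or now < 800 ns
    report "Simulation has ended (not an error)." severity failure;
  -- The rest of this listing is cut off in this copy. The statements below
  -- are a reconstruction from the test bench description; the clock period
  -- and the instance label are assumptions.
  Clk <= not Clk after 5 ns;                               -- assumed 10 ns period
  Din <= std_logic_vector(unsigned(Din) + 1) after 40 ns;  -- 4 clock cycles per value
  UUT : entity work.Factorial port map (Clk, Din, Result);
end architecture TEST;
-- synopsys translate_on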
