Page 187 - 2024-Vol20-Issue2
183 | Hussein & AL-Assfor
is composed of three fields: the sign bit (S_X), the biased exponent (E_X), and the mantissa (or significand, M_X = 1.F_X) [9], where F_X denotes the fraction bits of the mantissa. These three fields are packed into a word such that

$X = (-1)^{S_X} \cdot M_X \cdot 2^{E_X}$ (1)

Fig. 1. The format of a BSP floating-point number [8].

A floating-point multiplier is used to carry out the mantissa multiplication of two floating-point operands [10]. The multiplication of two n-bit numbers X and Y can be carried out in the following three steps [4]:
Step 1: Generation of the partial products.
Step 2: Reduction of these partial products using a set of adders, such as (3:2) carry-save adders (CSAs), to produce the intermediate product in the form of sum and carry vectors. The (3:2) denotes the numbers of inputs and outputs of the adder.
Step 3: Generation of the final product (the final multiplication result) using a fast adder. The inputs of this adder are the sum and carry vectors produced in Step 2.
Generally, the multiplier speed depends mainly on the accumulation of the sum and carry vectors into the final multiplication result and on the multiplication algorithm used.
The goal of this work is to design a high-speed, low-area VM based on the Urdhva-Tiryakbhyam Sutra (UT-Sutra) approach for BSP floating-point MAC modules in order to achieve high-performance digital signal processing. This goal is pursued through the steps below:
- Design an efficient adder to add the intermediate sum and carry vectors generated by the CSA and produce the final multiplication result, since the speed of the multiplier relies heavily on the speed of that adder.
- Use the improved XOR gate in [11] to design all parts of the proposed multiplier, and
- Improve the speed of the proposed multiplier further using the pipelining concept.
Based on the above steps, this work presents a distinctive design for a (6*6)-bit VM, called here the adjusted VM (simply, AVM). The design uses the conventional (3*3)-bit VM along with an enhanced design of the Brent-Kung carry-select adder (EBK-CSLA) in [11] to produce the final product from the sum and carry vectors. The (6*6)-bit AVM circuit is, in turn, used to design a (12*12)-bit and then a (24*24)-bit AVM, which can be used to perform the mantissa multiplication for BSP floating-point numbers. The proposed (24*24)-bit AVM is then optimized using the pipelining approach. The proposed AVM architectures are built using the improved XOR gate to substantially improve the multiplier performance. The proposed AVMs are coded in VHDL, simulated in the Xilinx ISE 14.7 software tool, and synthesized for different FPGA families, such as Virtex-5, Virtex-6, Virtex-7, and Zynq, and a complete analysis of their performance is then provided.

This work is arranged as follows: Sec. II reviews some of the previous works related to floating-point multipliers and MAC modules. Sec. III explains the general design of a BSP floating-point MAC module. Sec. IV gives the details of the proposed AVM using an EBK-CSLA architecture. The implementation results, simulations, and a comparison of the proposed multiplier with existing multiplier designs are presented in Sec. V, and the conclusion is given in Sec. VI.

II. LITERATURE REVIEW

In 2015, N. Jithendra et al. [12] presented two approaches to designing MAC modules, one operating on fixed-point signed numbers and the other on floating-point numbers. Their architectures were designed using a Wallace-tree multiplier and a ripple-carry adder (R-CA). Their multiplier and MAC designs showed an improvement in terms of power but incurred higher delay and area consumption, owing to the Wallace-tree structure and to the R-CA, which leads to a high carry-propagation delay during addition.

In 2016, the authors in [13] proposed a VM for floating-point operands. Their design was based on three cascaded carry-lookahead adders (CLA-As) that perform the partial-product reduction and the final addition that generates the final product. Nevertheless, their design consumed more area and had a considerable delay due to the carry propagation among the three adders.

In [14, 15], the authors designed (24*24)-bit Vedic-based multipliers to perform mantissa multiplication for floating-point inputs. Their designs comprised three cascaded levels of R-CAs to add the generated partial products and to produce the final product. Their designs achieved low speed due to using the R-CA, which is considered the slowest
adder among the adders.
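
A common hierarchical construction for Vedic multipliers of this kind assembles a (2n*2n)-bit product from four (n*n)-bit products and three additions, which is why the choice of adder weighs so heavily on speed. The Python sketch below is a behavioral illustration only, under stated assumptions: it is not the circuit of [14, 15] or of any other cited design, the function names `ut_base` and `vedic_multiply` are hypothetical, and Python's integer `+` stands in for whichever hardware adder (ripple-carry, carry-lookahead, carry-select) is placed at each of the three adder levels.

```python
def ut_base(a: int, b: int, n: int = 3) -> int:
    """n*n-bit Urdhva-Tiryakbhyam ("vertically and crosswise") product.

    Column k collects every cross product a_i AND b_j with i + j == k;
    the column sum's low bit is the result bit and the rest is carried on.
    """
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    result, carry = 0, 0
    for k in range(2 * n - 1):
        col = carry + sum(
            a_bits[i] & b_bits[k - i] for i in range(n) if 0 <= k - i < n
        )
        result |= (col & 1) << k
        carry = col >> 1
    # The last carry becomes the most significant bit of the 2n-bit product.
    return result | (carry << (2 * n - 1))


def vedic_multiply(a: int, b: int, n: int) -> int:
    """Hierarchical n*n-bit multiply for n = 3, 6, 12, 24 (illustrative)."""
    if n == 3:
        return ut_base(a, b, 3)
    if n < 3 or n % 2:
        raise ValueError("n must be 3 times a power of two")
    h = n // 2
    mask = (1 << h) - 1
    a_lo, a_hi = a & mask, a >> h
    b_lo, b_hi = b & mask, b >> h
    # Four half-width products (computed in parallel in hardware).
    p0 = vedic_multiply(a_lo, b_lo, h)
    p1 = vedic_multiply(a_hi, b_lo, h)
    p2 = vedic_multiply(a_lo, b_hi, h)
    p3 = vedic_multiply(a_hi, b_hi, h)
    # Three additions; each "+" corresponds to one adder level in hardware.
    middle = p1 + p2
    upper = (p3 << h) + middle
    return (upper << h) + p0
```

Calling `vedic_multiply(x, y, 24)` on two 24-bit mantissas follows the 3 -> 6 -> 12 -> 24 build-up described in the text. Because the three additions are chained, any carry-propagation delay in the adder is paid at every level of the hierarchy.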
G. Jha et al. [16] designed four kinds of multipliers to be used in a MAC module, namely modified-Booth, Wallace tree-reduction, add-shift, and combinational array multipliers, and analyzed the performance of the MAC with each of them. However, none of these multipliers gave good performance in terms of power, delay, and area occupation on the designed MAC. For example, the Wallace tree-reduction