Page 191 - 2024-Vol20-Issue2
187 | Hussein & AL-Assfor
any propagation for the carry during the partial product reduction step and the final addition step performed to generate the final product. Thus, the proposed (24*24)-bit AVM can be used to design an efficient floating-point MAC module to meet the requirements of cutting-edge DSP applications.

TABLE II.
PERFORMANCE COMPARISON OF UNPIPELINED FLOATING-POINT (24*24)-BIT MULTIPLIERS

Ref.       FPGA family          Delay (ns)   No. of LUTs
[26]       Virtex-7, Design1    47.33        164
           Virtex-7, Design2    28.02        2928
           Virtex-7, Design3    27.76        1121
[3]        Virtex-6             21.823       -
[28]       Virtex-7             17.33        1763
Proposed   Virtex-5             12.74        1260
           Virtex-6             12.395       1018
           Virtex-7             11.583       1015
           Zynq                 11.583       1014
TABLE III.
PERFORMANCE COMPARISON OF PIPELINED FLOATING-POINT (24*24)-BIT MULTIPLIERS

Ref.       FPGA family   Delay (ns)   No. of LUTs
[31]       Virtex-5      6.61         -
Proposed   Virtex-5      3.65         568
           Virtex-6      3.452        564
           Virtex-7      3.117        563
           Zynq          2.58         558
The internal RTL organization of the synthesized AVM is depicted in more detail in Fig. 9.

Fig. 9. Internal organization of the (24*24)-bit AVM scheme in RTL.

The (24*24)-bit AVM is simulated to validate its functionality in multiplying the mantissa parts of two floating-point operands. The functionality is verified by applying several input cases (in decimal representation) and checking the corresponding outputs, for example, case 1: (100*24) = 2400, case 2: (570*320) = 182400, and case 3: (1320*23450) = 30954000, as illustrated in Fig. 10.

Fig. 10. Simulation input/output waveforms of the (24*24)-bit AVM.
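The quoted simulation cases can be cross-checked against a software reference model. The following is a minimal sketch in Python, not the authors' test bench: the names `reference_multiply`, `CASES`, and `check_cases` are hypothetical, and the only facts taken from the text are the 24-bit operand width and the three decimal cases.

```python
MANTISSA_BITS = 24  # operand width of the (24*24)-bit multiplier


def reference_multiply(a: int, b: int) -> int:
    """Reference product of two unsigned 24-bit operands (fits in 48 bits)."""
    limit = 1 << MANTISSA_BITS
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError("operands must be unsigned 24-bit values")
    return a * b


# Decimal cases quoted in the text: (operand a, operand b, expected product)
CASES = [
    (100, 24, 2400),
    (570, 320, 182400),
    (1320, 23450, 30954000),
]


def check_cases(cases=CASES) -> bool:
    """Return True if every quoted case matches the reference model."""
    return all(reference_multiply(a, b) == expected for a, b, expected in cases)
```

A fuller test bench would presumably also exercise boundary operands such as `2**24 - 1`, whose product is the largest value that must fit in the 48-bit output.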
Tables II and III compare the proposed (24*24)-bit multiplier, without and with pipelining, with some existing multipliers. Table II shows that the proposed unpipelined (24*24)-bit AVM achieves reductions in delay and area utilization of 33.16% and 42.42%, respectively, relative to the multiplier presented in [28] on the same FPGA family (Virtex-7).
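The percentage figures follow from the usual relative-reduction formula, (baseline - proposed) / baseline. The sketch below assumes the Virtex-7 entries of Table II are 17.33 ns and 1763 LUTs for [28] and 11.583 ns and 1015 LUTs for the proposed design, and that the Virtex-5 entries of Table III are 6.61 ns for [31] and 3.65 ns for the proposed design. The function name `percent_reduction` is illustrative only.

```python
def percent_reduction(baseline: float, proposed: float) -> float:
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return (baseline - proposed) / baseline * 100.0


# Table II (unpipelined, Virtex-7): multiplier of [28] vs. proposed AVM
delay_reduction = percent_reduction(17.33, 11.583)
lut_reduction = percent_reduction(1763, 1015)

# Table III (pipelined, Virtex-5): multiplier of [31] vs. proposed AVM
pipelined_delay_reduction = percent_reduction(6.61, 3.65)
```

Under these assumptions the delay reductions agree with the quoted 33.16% and 44.78%. The LUT reduction evaluates to about 42.43%, which matches the quoted 42.42% if that figure was truncated rather than rounded.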
For the pipelined design, Table III shows that the proposed (24*24)-bit AVM achieves 44.78% less delay than the one proposed in [31] on the same FPGA family (Virtex-5), and that the lowest delay and area occupation for the pipelined (24*24)-bit AVM are obtained with the Zynq FPGA family. Tables II and III together show that the proposed (24*24)-bit AVM yields less delay and a significant reduction in area utilization compared with the mentioned multipliers. The reduction in delay is due to the use of the EBK-CSLA along with a single CSA to eliminate