Chapter 3. The FPU

] is a small double precision floating point arithmetic unit design, which is
suitable for such implementations as it focuses mainly on area optimization and cost
reduction. The design supports exception flags and overflow/underflow checks, which are
provided as outputs to be handled by the higher-level design, in our case the processor.

The implementation of the double precision FPU is our own and follows the published
work of Paschalakis et al. [ ], with some alterations. For example, the multiplier was
altered so that a multiplication completes in one cycle instead of ten, which makes it
much faster at the cost of only 20% more area.

3.1 Floating point addition-subtraction

Adding and subtracting two floating point numbers is more complicated than the
corresponding addition-subtraction of integer numbers. This is due to the
representation of floating point numbers. Suppose we wish to add two floating
point numbers, A and B. A has the form $A = \pm S_A \times 2^{E_A}$ and B has the form
$B = \pm S_B \times 2^{E_B}$. To add the two numbers, the first step is to bring the
exponents to the same value. This is done by calculating the absolute difference
$|E_A - E_B|$ of the two exponents and adjusting the significand of the operand with
the smaller exponent accordingly. This is possible because the significand is stored in
binary and can therefore be divided by 2. The following example displays this adjustment.

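$$A = 16 \times 2^{E_A}$$

$$B = 32 \times 2^{E_B}$$

The absolute difference of the two exponents is 1, with $E_A = E_B + 1$, so the operand
with the smaller exponent, B, needs to be adjusted.

$$B = 32 \times 2^{E_B} = 32 \times 2^{-1} \times 2^{E_B + 1} = 16 \times 2^{E_A}$$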
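
The alignment step can be sketched in software as follows. This is only a minimal illustration, not the hardware design: it assumes unsigned integer significands, simply drops any bits shifted out (no rounding), and the function name `align` as well as the concrete exponents 3 and 2 in the usage lines are hypothetical values chosen so that the exponents differ by 1.

```python
def align(sig_a, exp_a, sig_b, exp_b):
    """Bring two (significand, exponent) pairs to a common exponent.

    The operand with the smaller exponent has its significand shifted
    right by |exp_a - exp_b| bits. Bits shifted out are dropped here;
    a real unit would keep extra bits for rounding.
    Returns (aligned_sig_a, aligned_sig_b, common_exponent).
    """
    shift = abs(exp_a - exp_b)
    if exp_a >= exp_b:
        return sig_a, sig_b >> shift, exp_a
    return sig_a >> shift, sig_b, exp_b


# Usage with assumed exponents: A = 16 * 2**3, B = 32 * 2**2.
sig_a, sig_b, exp = align(16, 3, 32, 2)
total = sig_a + sig_b  # significands now share one exponent and can be added
```

After alignment both operands carry the larger exponent, so the significands can be passed directly to an integer adder or subtractor.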

Because the significand is stored in binary, dividing it by a power of two amounts to
a simple shift to the right by the corresponding number of bits, which adjusts the number. The
next step is to simply add or subtract the significands and normalize the result. This
procedure is standard, but the actual implementations in hardware vary. The schematic