HardwareTeams.com - The #1 job board and blog for electrical and computer engineers

Introduction to Fixed Point Numbers #

Fixed-point numbers are a common representation of real numbers in computer systems, used for a wide range of applications such as embedded systems, signal processing, and digital communications. Fixed-point numbers are typically represented as binary numbers with a fixed number of bits allocated for the integer and fractional parts. The range of fixed-point numbers that can be represented depends on the number of bits allocated for the integer and fractional parts.

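A signed fixed-point number (one sign bit plus the integer and fractional bits) has the range:

-2^(Integer Part) to 2^(Integer Part) - 2^(-Fractional Part)

and a resolution / step size of:

2^(-Fractional Part)

where Integer Part and Fractional Part are the number of integer and fractional bits.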
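
As a quick check of the range and step-size formulas, here is a minimal sketch in plain Python. The helper name fixed_point_range and its parameters are hypothetical and chosen only for illustration; it assumes one sign bit in addition to the integer and fractional bits.

```python
def fixed_point_range(int_bits, frac_bits):
    """Return (lowest, highest, step) for a signed fixed-point format.

    Assumes a two's-complement word made of one sign bit, int_bits
    integer bits and frac_bits fractional bits.
    """
    step = 2.0 ** -frac_bits            # resolution: 2^(-Fractional Part)
    lowest = -(2.0 ** int_bits)         # most negative value
    highest = 2.0 ** int_bits - step    # most positive value
    return lowest, highest, step


# 1 sign bit + 4 integer bits + 3 fractional bits = 8 bits total
print(fixed_point_range(4, 3))  # (-16.0, 15.875, 0.125)
```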

8 Bit Signed Fixed Point Number Example #

  Sign     Integer            Fractional
  Bit      Part               Part
  |        |                  |
  v        v                  v 
  +---+    +---+---+---+---+  +---+---+---+
  | 0 |    | 0 | 1 | 0 | 1 |  | 1 | 0 | 1 |  
  +---+    +---+---+---+---+  +---+---+---+

The sign bit is a single bit that indicates the sign of the fixed-point number: 0 for a non-negative number and 1 for a negative number. The examples in this post use two's complement, so the sign bit carries a negative weight rather than simply flipping the sign.
The integer part consists of the bits that represent the whole-number portion of the fixed-point number. In this example, we have 4 bits reserved for the integer part, which can represent values from 0 to 15.
The fractional part consists of the bits that represent the portion of the number below 1. In this example, we have 3 bits reserved for the fractional part, with bit weights from 2^-1 to 2^-3, so it can represent values from 0 to 0.875 in steps of 0.125. The bit pattern shown above encodes 5 + 0.625 = 5.625.
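
To make the bit layout concrete, the sketch below decodes a bit pattern by hand in plain Python. The function decode_fixed_point is a hypothetical helper, not part of any library, and it assumes the bits are stored in two's complement.

```python
def decode_fixed_point(bits, frac_bits):
    """Interpret a two's-complement bit string as a fixed-point value."""
    raw = int(bits, 2)              # read the bits as an unsigned integer
    if bits[0] == "1":              # sign bit set: undo the two's-complement wrap
        raw -= 1 << len(bits)
    return raw / (1 << frac_bits)   # scale by 2^(-frac_bits)


# Sign 0, integer 0101, fraction 101 (the 8-bit example above)
print(decode_fixed_point("00101101", 3))  # 5.625
```

The same helper handles negative patterns: decode_fixed_point("11011", 2) gives -1.25.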

Adding Fixed Point Numbers #

When adding fixed-point numbers, the binary points of the operands are aligned and the bit patterns are added like ordinary integers. Any carry out of the fractional part propagates into the integer part.

Example: Adding Fixed-Point Numbers

Number 1:  1 1 0.1 1    (-1.25 in decimal) 5 bits
Number 2:  0 1 1.0 1    (3.25 in decimal)  5 bits
---------------------
Sum:     0 0 1 0.0 0    (2 in decimal)     6 bits

In the example above, we add two fixed-point numbers. The integer parts (-1 + 3) and fractional parts (-0.25 + 0.25) are added separately. The resulting sum is 2 in decimal.

A Python library that can help us explore fixed-point math is fxpmath. If you have pip, install it via pip install fxpmath.

from fxpmath import Fxp
a = Fxp(-1.25, signed=True, n_word=5, n_frac=2)
b = Fxp(3.25, signed=True, n_word=5, n_frac=2)
c = a+b
c
c.bin()
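
For comparison, the same sum can be sketched without any library by working on the raw integers directly. This is only an illustration under the assumption that both operands share the same number of fractional bits; the names to_raw and to_real are hypothetical helpers.

```python
FRAC_BITS = 2  # both operands use 2 fractional bits


def to_raw(value, frac_bits=FRAC_BITS):
    """Scale a real value to its underlying integer (value * 2^frac_bits)."""
    return round(value * (1 << frac_bits))


def to_real(raw, frac_bits=FRAC_BITS):
    """Scale an underlying integer back to the real value it represents."""
    return raw / (1 << frac_bits)


a_raw = to_raw(-1.25)    # -5
b_raw = to_raw(3.25)     # 13
sum_raw = a_raw + b_raw  # plain integer addition, binary points already aligned

print(sum_raw, to_real(sum_raw))  # 8 2.0
```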

Multiplying Fixed Point Numbers #

When multiplying fixed-point numbers, the bit patterns are multiplied like ordinary integers. The number of fractional bits in the product is the sum of the operands' fractional bits, and the product is then truncated or rounded back to the target format as needed.

Example: Multiplying Fixed-Point Numbers

Number 1:  1 1 0.1 1    (-1.25 in decimal) 5 bits
Number 2:  0 1 1.0 1    (3.25 in decimal)  5 bits
---------------------
Product: 1 1 1 0 1 1.1 1 1 1 (-4.0625 in decimal) 10 bits

from fxpmath import Fxp
a = Fxp(-1.25, signed=True, n_word=5, n_frac=2)
b = Fxp(3.25, signed=True, n_word=5, n_frac=2)
c = a*b
c
c.bin()

In the example above, we multiply two fixed-point numbers. Note that the product's width is the sum of the operand widths: 5 + 5 = 10 bits, with 2 + 2 = 4 fractional bits!
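
The same product can be reproduced with ordinary integers, which shows where the extra bits come from. This is a minimal sketch assuming two's-complement operands with 2 fractional bits each; the variable names are illustrative only.

```python
FRAC_BITS = 2    # fractional bits in each operand
WORD_BITS = 5    # total bits in each operand

a_raw = round(-1.25 * 2**FRAC_BITS)   # -5
b_raw = round(3.25 * 2**FRAC_BITS)    # 13

prod_raw = a_raw * b_raw              # -65, an ordinary integer multiply
prod_frac_bits = 2 * FRAC_BITS        # fractional bits add: 2 + 2 = 4
prod_word_bits = 2 * WORD_BITS        # word widths add: 5 + 5 = 10

prod = prod_raw / 2**prod_frac_bits
# Mask to the product width to view the two's-complement bit pattern
prod_bits = format(prod_raw & ((1 << prod_word_bits) - 1), "010b")

print(prod)       # -4.0625
print(prod_bits)  # 1110111111
```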

Bit-Growth #

One important consideration in fixed-point arithmetic is bit growth, which can occur when adding or multiplying fixed-point numbers. Bit growth refers to the result of an operation needing more bits than the original operands. If the result is stored in too few bits, the outcome is overflow or loss of precision.

Addition Overflow #

For example, when adding fixed-point numbers, if the sum requires more integer bits than were originally allocated, overflow occurs and the stored result wraps around or saturates.

Example: Bit Growth in Addition

Number 1:  0 1.1 1  (1.75 in decimal) 4 bits
Number 2:  0 0.1 1  (0.75 in decimal) 4 bits
---------------------
Sum:      0 1 0.1 0  (2.5 in decimal) 5 bits

from fxpmath import Fxp
a = Fxp(1.75, signed=True, n_word=4, n_frac=2)
b = Fxp(0.75, signed=True, n_word=4, n_frac=2)
c = a+b
c
c.bin()

In the example above, the fractional parts (0.75 + 0.75) sum to 1.5, which carries into the integer part, and the result 2.5 needs 3 bits for the sign and integer part. What if we didn't grow the word and the result stayed at 4 bits: 1 sign bit, 1 integer bit, and 2 fractional bits?

In that case our sum would be:

Example: Overflow in Addition Without Bit Growth

Number 1:  0 1.1 1  (1.75 in decimal) 4 bits
Number 2:  0 0.1 1  (0.75 in decimal) 4 bits
---------------------
Sum:       1 0.1 0  (-1.5 in decimal) 4 bits

from fxpmath import Fxp
a = Fxp(1.75, signed=True, n_word=4, n_frac=2)
b = Fxp(0.75, signed=True, n_word=4, n_frac=2)
c = Fxp(a+b, signed=True, n_word=4, n_frac=2, overflow='wrap')
c
c.bin()

That is, our sum wrapped around to -1.5. We do not have enough bits to store the result!
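
The wrap-around itself is easy to reproduce by hand. The sketch below uses a hypothetical helper, wrap_signed, that keeps only the low n_bits of an integer and reinterprets them as two's complement, which is one common way a fixed-width adder behaves on overflow.

```python
def wrap_signed(raw, n_bits):
    """Wrap an integer into the range of an n_bits two's-complement word."""
    raw &= (1 << n_bits) - 1          # keep only the low n_bits
    if raw >= 1 << (n_bits - 1):      # top bit set: value is negative
        raw -= 1 << n_bits
    return raw


FRAC_BITS = 2
a_raw = 7    # 01.11 = 1.75
b_raw = 3    # 00.11 = 0.75

full_raw = a_raw + b_raw              # 10, needs 5 bits as a signed value
wrapped_raw = wrap_signed(full_raw, 4)

print(full_raw / 2**FRAC_BITS)     # 2.5
print(wrapped_raw / 2**FRAC_BITS)  # -1.5
```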

Addition Bit Growth Requirements #

Adding two N-bit numbers together requires one extra bit to represent the result. If you have a signed number in the range

-2^(N-1) <= x <= 2^(N-1) - 1

then adding two together produces, at the extremes:

most positive: (2^(N-1) - 1) + (2^(N-1) - 1) = 2^N - 2
most negative: -2^(N-1) - 2^(N-1) = -2^N

Both fit in a signed (N+1)-bit number, whose range is -2^N to 2^N - 1.

When adding M N-bit numbers together, the result needs

N + ceil(log2(M))

bits, that is, a bit growth of ceil(log2(M)).
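
One way to sanity-check a growth estimate for sums is brute force: take the most negative and most positive possible sums and count how many bits a signed word needs to hold both. The sketch below does that in plain Python. The function bits_for_sum is a hypothetical helper, and the comparison against N + ceil(log2(M)) is only checked for the few cases printed here.

```python
import math


def bits_for_sum(n_bits, m_terms):
    """Smallest signed width that holds any sum of m_terms n_bits-wide signed values."""
    lowest = -m_terms * (1 << (n_bits - 1))          # all terms most negative
    highest = m_terms * ((1 << (n_bits - 1)) - 1)    # all terms most positive
    width = n_bits
    # Widen until both extremes fit in a two's-complement word
    while not (-(1 << (width - 1)) <= lowest and highest <= (1 << (width - 1)) - 1):
        width += 1
    return width


for m in (2, 3, 4, 8):
    # brute-force width next to N + ceil(log2(M)) for N = 8
    print(m, bits_for_sum(8, m), 8 + math.ceil(math.log2(m)))
```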

Multiplication Overflow #

Similarly, when multiplying fixed-point numbers, the product has more integer and fractional bits than either operand. If the result is not allowed to grow, the outcome is overflow or, as in the example below, loss of precision.

Example: Bit Growth in Multiplication

Number 1:  0 0 . 0 1 (0.25 in decimal)   4 bits
Number 2:  0 0 . 1 0 (0.5  in decimal)   4 bits
---------------------
Product: 0 0 0 0 . 0 0 1 0 (0.125 in decimal) 8 bits

from fxpmath import Fxp
a = Fxp(0.25, signed=True, n_word=4, n_frac=2)
b = Fxp(0.5, signed=True, n_word=4, n_frac=2)
c = a*b
c
c.bin()

In the example above, when multiplying the fractional parts (0.25 x 0.5), the full product carries 2 + 2 = 4 fractional bits.

What would happen if we only allocated 2 bits for the fractional part of the result instead of letting it grow?

from fxpmath import Fxp
a = Fxp(0.25, signed=True, n_word=4, n_frac=2)
b = Fxp(0.5, signed=True, n_word=4, n_frac=2)
c = Fxp(a*b, signed=True, n_word=4, n_frac=2)
c
c.bin()

If we don’t grow the required bits and truncate back down to 1 sign bit, 1 integer bit, and 2 fractional bits, our product will be 0!
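
The loss is easy to see on the raw integers. In this minimal sketch, truncation is modeled as an arithmetic right shift that drops the two extra fractional bits. That is an assumption about the rounding mode, and other modes such as round-to-nearest would behave differently.

```python
FRAC_BITS = 2

a_raw = 1    # 00.01 = 0.25
b_raw = 2    # 00.10 = 0.5

prod_raw = a_raw * b_raw                 # 2, with 4 fractional bits
full = prod_raw / 2**(2 * FRAC_BITS)     # exact product

truncated_raw = prod_raw >> FRAC_BITS    # drop the 2 low fractional bits
truncated = truncated_raw / 2**FRAC_BITS

print(full)       # 0.125
print(truncated)  # 0.0
```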

Multiplication Bit Growth Requirements #

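Consider multiplying an N-bit signed number by an M-bit signed number. The largest product comes from the two most negative operands:

(-2^(N-1)) * (-2^(M-1)) = 2^(N+M-2) < 2^(N+M-1)

This is one more than the largest positive value of a signed (N+M-1)-bit number, 2^(N+M-2) - 1, so it only fits in N+M bits. When you do multiplication, you need to grow to N+M bits!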
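
For small word sizes the N+M requirement can be checked exhaustively. The sketch below multiplies every pair of signed operands and finds the narrowest signed word that holds all the products. The name product_width is a hypothetical helper used only for this illustration.

```python
def product_width(n_bits, m_bits):
    """Smallest signed width that holds every product of an n_bits and an m_bits signed value."""
    def signed_range(bits):
        return range(-(1 << (bits - 1)), 1 << (bits - 1))

    products = [x * y for x in signed_range(n_bits) for y in signed_range(m_bits)]
    lowest, highest = min(products), max(products)

    width = 1
    # Widen until both extremes fit in a two's-complement word
    while not (-(1 << (width - 1)) <= lowest and highest <= (1 << (width - 1)) - 1):
        width += 1
    return width


print(product_width(4, 4))  # 8
print(product_width(5, 3))  # 8
```

The extra bit is needed only for the single case where both operands are at their most negative value.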

Always Allocate Enough Bits! #

To handle bit growth, careful consideration of the number of bits allocated for the integer and fractional parts is necessary. It’s important to allocate enough bits to accommodate the expected range and precision of the fixed-point numbers being used in calculations to avoid overflow or loss of precision.