Chapter 1. Introduction
calculations is the bit; all other numbers, such as rationals (fractions) and irrationals, are
represented, often only approximately, by real numbers. Clearly, a standard had to be
established for representing these numbers so that they can be used in computing.
The term floating point refers to the fact that a number’s radix point can “float” in a
computer. That is, it can be placed anywhere relative to the
significant digits of the number. This position is indicated as the exponent component
in the internal representation, and floating point can thus be thought of as a computer
realization of scientific notation. All floating point numbers are represented with the
following formula:
$$\text{SignificantDigits} \times \text{base}^{\text{exponent}} \qquad (1.1)$$
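As a rough illustration of formula (1.1), the sketch below splits a number into its significand and exponent. It assumes base 2 and uses Python's built-in float type with the standard-library function math.frexp; the helper name decompose is hypothetical and chosen only for this example.

```python
import math

def decompose(x):
    """Return (significand, exponent) such that x == significand * 2**exponent.

    Assumption: base 2. math.frexp returns a significand whose magnitude
    lies in [0.5, 1) for any non-zero x.
    """
    significand, exponent = math.frexp(x)
    return significand, exponent

significand, exponent = decompose(6.0)
print(significand, exponent)

# Recombining the two parts gives back the original value exactly,
# because scaling by a power of two does not change the significand.
assert significand * 2**exponent == 6.0
```

Other conventions normalize the significand to a different interval, for example [1, 2), which shifts the exponent by one without changing the value represented.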
The numbers are, in general, represented approximately to a fixed number of significant
digits (the significand) and scaled using an exponent. The base for the scaling is normally
2, 10 or 16. The advantage of floating-point representation over intrinsically integer fixed-point
numbers, which consist purely of a significand, is that extending the significand with an exponent
component achieves greater range. For instance, to represent large values, e.g. distances
between galaxies, there is no need to keep all 39 decimal places down to femtometre
resolution (employed in particle physics). Assuming that the best resolution is in light
years, only the 9 most significant decimal digits matter, whereas the remaining 30 digits
carry pure noise and can thus be safely dropped. This represents a saving of about 100 bits
of computer data storage, since each decimal digit takes log2(10) ≈ 3.32 bits. Instead of
these 100 bits, far fewer are used to represent the scale (the exponent), e.g. 8 bits or
2 decimal digits. One number can thus encode both astronomical and subatomic distances
with the same nine digits of accuracy. However, a 9-digit significand is 100 times less
precise than the 11 digits that would be available if the 2 digits reserved for the scale
also held significant digits, so this is considered a trade-off between range and precision. The example
of using scaling to extend the dynamic range reveals another contrast with fixed-point
numbers: Floating-point values are not uniformly spaced. Small values, close to zero,
can be represented with much higher resolution (e.g. one femtometre) than large ones
because a greater scale (e.g. light years) must be selected for encoding significantly
larger values.[1] That is, floating-point numbers cannot represent point coordinates with
atomic accuracy at galactic distances, only close to the origin.
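The non-uniform spacing described above can be observed directly. The following is a minimal sketch, assuming Python 3.9 or later (for math.ulp) and a platform where float is an IEEE 754 double with a 53-bit significand; the helper name spacing is hypothetical.

```python
import math

def spacing(x):
    """Gap between x and the next representable float of larger magnitude."""
    return math.ulp(x)

# The gap grows with the magnitude of the value.
for value in (1e-15, 1.0, 1e15, 1e30):
    print(f"value {value:g}: spacing {spacing(value):g}")

# Near 1e16 the gap is already 2, so adding 1 cannot change the stored value.
assert 1e16 + 1.0 == 1e16
```

Under these assumptions the gap near 1.0 is 2**-52, while near 1e30 it is 2**47, which mirrors the point that fine resolution is only available close to the origin.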