# A Floating-Point Notation Exercise

1. Jul 4, 2007

### hastings

Consider a 16-bit binary floating-point notation. From left to right, it consists of 1 bit for the sign (0 = "+"), $$e$$ bits for the exponent, represented in excess $$2^{e-1}$$, and the remaining bits for the fractional part of the mantissa, which is normalized between 1 and 2 ($$1 \leq m <2$$).

a) Calculate the minimum value $$e_{min}$$ of $$e$$ (the number of exponent bits) that allows both the numbers r = -8147.31 and s = $$0.103 \cdot 10^{-6}$$ to be written in the above notation.

This is what I would do:
1. Calculate the order of magnitude of both r and s.
2. Write a proportion using $$2^{10} \approx 10^3$$ (for example 10 : 3 = x : 4, where 4 is the result of step 1).
3. Solve the proportion for x and find the smallest power of 2 that is at least x (for example, if x = 15, then $$2^3 \leq 15 \leq 2^4$$, so I'd take $$2^4$$).
4. Calculate $$e_{min}$$: since the exponent is in excess $$2^{e-1}$$, I solve the equation $$2^4=2^{e-1} \Rightarrow e=e_{min}=4+1=5$$, where $$2^4$$ is the result of step 3.

Is this reasoning right?

Now, when I calculated the orders of magnitude of r and s, I got:
Order of magnitude of r = $$10^4$$, or rather 4.
Order of magnitude of s = $$10^{-5}$$, or rather -5.
Which should I take as a starting point, $$10^4 \mbox{ or } 10^{5}$$?
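
One way to check these orders of magnitude numerically is sketched below. It assumes "order of magnitude" means $$\lfloor \log_{10}|x| \rfloor$$ (other conventions round up instead), and the helper names `decimal_order` and `binary_exponent` are made up for illustration. It also prints the exact binary exponent, which is the quantity the exponent field actually has to hold.

```python
import math

def decimal_order(x):
    """floor(log10|x|): the exponent of x in scientific notation (assumed convention)."""
    return math.floor(math.log10(abs(x)))

def binary_exponent(x):
    """E such that |x| = m * 2**E with 1 <= m < 2."""
    # math.frexp returns (m, k) with 0.5 <= m < 1, so the exponent is shifted by one.
    _, k = math.frexp(abs(x))
    return k - 1

r = -8147.31
s = 0.103e-6

for name, x in (("r", r), ("s", s)):
    print(name, decimal_order(x), binary_exponent(x))
```

Under this convention r comes out as $$10^3$$ with binary exponent 12, and s as $$10^{-7}$$ with binary exponent -24.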

Last edited: Jul 4, 2007

2. Jul 4, 2007

### hastings

Hey, I really, REALLY need help with this exercise; I've got an exam the day after tomorrow.

Got an idea: suppose I do steps 1 to 4 for each of the numbers? In the end I see which result "includes" the other and I choose that one.

Let's do it.

Let's take r; its order of magnitude is 4.

$$10^4 \approx 2^{10} \Rightarrow 2^2 \leq 10 \leq \underbrace{2^3} \Rightarrow 2^3=2^{e-1} \Rightarrow e=3+1=4$$

Then take s; its order of magnitude is -5, but since what we get in the end is a number of bits, which cannot be negative, we simply treat it as 5.
This time we need to write a proportion between the exponents:
since $$2^{10} \approx 10^3$$, we want to find the $$x$$ for which $$10^5 \approx 2^x$$.

$$10:3=x:5 \Rightarrow x=\frac{10 \cdot 5}{3} \approx 17 \Rightarrow 2^4 \leq 17 \leq \underbrace{2^5} \Rightarrow 2^5=2^{e-1} \Rightarrow e=5+1=6$$

Since e=6 includes e=4, $$e_{min}=6$$.
So I suppose I should have taken 5 as the common order of magnitude for both numbers in the decimal system.
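
A direct way to cross-check this result, without going through powers of 10, is to take the exact binary exponent of each number and look for the smallest e whose excess-$$2^{e-1}$$ range contains both. The sketch below assumes that all $$2^e$$ exponent codes are usable (none reserved for zero or other special cases), so the representable exponents run from $$-2^{e-1}$$ to $$2^{e-1}-1$$. The function names are hypothetical.

```python
import math

def binary_exponent(x):
    """E such that |x| = m * 2**E with 1 <= m < 2."""
    _, k = math.frexp(abs(x))   # frexp gives 0.5 <= m < 1, hence the shift
    return k - 1

def min_exponent_bits(values, total_bits=16):
    """Smallest e whose excess-2**(e-1) field can hold every exponent needed."""
    exps = [binary_exponent(v) for v in values]
    # Leave 1 bit for the sign and at least 1 bit for the mantissa.
    for e in range(1, total_bits - 1):
        bias = 2 ** (e - 1)
        # Assumption: stored codes 0 .. 2**e - 1 are all ordinary exponents.
        if all(-bias <= E <= bias - 1 for E in exps):
            return e
    raise ValueError("no exponent width fits in the given word size")

print(min_exponent_bits([-8147.31, 0.103e-6]))  # 6
```

Here r needs exponent 12 and s needs exponent -24. The -24 is what forces e = 6, since e = 5 only reaches down to -16.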

Is what I just said a big bunch of nonsense?

Last edited: Jul 4, 2007

3. Jul 4, 2007

### rcgldr

Don't forget the common "cheat" used by most formats, where the first bit of the mantissa is assumed to be 1 and is not stored among the bits of the floating-point number. You may have covered this case already, since you stated that the mantissa represents a number between 1 and 2. In most floating-point formats, the mantissa represents a number greater than 0 and less than 1. Reserved combinations of values are used for special cases, such as all zero bits for zero.

However, you've got a problem: 8147.31 takes more than 16 bits to represent to the nearest 1/100th.
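
To get a feel for the size of the problem, the sketch below truncates a value to a given number of fraction bits. It assumes e = 6 (which leaves 9 fraction bits) and truncation rather than rounding; both are assumptions for illustration, and `truncate_to_format` is a made-up helper name.

```python
import math

def truncate_to_format(x, frac_bits):
    """Truncate |x| to 1 + frac_bits significant bits, keeping the sign.

    Returns the truncated value and the spacing between neighbouring
    representable values at that exponent.
    """
    _, k = math.frexp(abs(x))
    E = k - 1                        # |x| = m * 2**E with 1 <= m < 2
    step = 2.0 ** (E - frac_bits)    # weight of the last fraction bit
    truncated = math.floor(abs(x) / step) * step
    return math.copysign(truncated, x), step

value, step = truncate_to_format(-8147.31, frac_bits=9)
print(value, step)  # -8144.0 8.0
```

With 9 fraction bits, neighbouring values near 8147 are 8 apart, so the hundredths are lost entirely.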

4. Jul 4, 2007

### hastings

Sorry, I didn't get you.

The notation is

| sign | exponent in excess $$2^{e-1}$$ | mantissa, normalized between 1 and 2 |
|---|---|---|
| 1 bit | e bits | (15-e) bits |

Total: 16 bits

Example:
Suppose we find e=6 and we want to represent +1.0101 * 2^3 (its value doesn't matter).
Exponent: since it's in excess 2^(e-1), we store 3 + 32 = 35 (the 32 comes from 2^(e-1) = 2^5 = 32). With 6 bits, 35 is 100011.
Mantissa: 9 bits (15-e = 15-6 = 9): 0101 00000

So ultimately the number in the above notation is
0 100011 010100000 (from left to right: 0 means it's positive, the following 6 bits are the exponent and the remaining 9 are the mantissa)
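
The same encoding can be sketched in a few lines of Python. This is only an illustration under some assumptions: the mantissa is truncated (not rounded) to the available bits, zero and special values are not handled, and `encode` is a made-up name. The test value 10.5 is the number from the example above, since 1.0101 in binary is 1.3125 and 1.3125 * 2^3 = 10.5.

```python
import math

def encode(x, e, total_bits=16):
    """Return the 'sign exponent mantissa' bit fields of x as a string."""
    if x == 0:
        raise ValueError("zero has no normalized 1 <= m < 2 form")
    frac_bits = total_bits - 1 - e       # bits left for the fraction
    sign = "1" if x < 0 else "0"
    m, k = math.frexp(abs(x))            # |x| = m * 2**k with 0.5 <= m < 1
    E = k - 1                            # exponent for 1 <= mantissa < 2
    fraction = 2 * m - 1                 # part after the leading "1."
    stored = E + 2 ** (e - 1)            # excess-2**(e-1) code
    if not 0 <= stored < 2 ** e:
        raise ValueError("exponent does not fit in e bits")
    mant = int(fraction * 2 ** frac_bits)  # truncation (assumed)
    return f"{sign} {stored:0{e}b} {mant:0{frac_bits}b}"

print(encode(10.5, e=6))  # 0 100011 010100000
```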

5. Jul 7, 2007

### rcgldr

My point is that it takes 20 bits (or at least more than 19 bits) to represent 814731 x 10^? accurately.
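
The bit count is quick to check: storing 8147.31 exactly to the hundredth amounts to storing the integer 814731, with the decimal scale factor kept separately. A minimal check in Python, where `int.bit_length` gives the number of bits needed for the magnitude:

```python
n = 814731                  # 8147.31 counted in hundredths

print(n.bit_length())       # 20
print(2**19 < n < 2**20)    # True
```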