# IEEE 754 - Little questions here and there for a problem that has sols

1. Mar 7, 2013

### s3a

1. The problem statement, all variables and given/known data
The problem and its solution are attached as "IEEE754_ProblemAndSolution.jpg".

2. Relevant equations
The 32-bit/single-precision IEEE 754 procedure.

3. The attempt at a solution
1) What is the reasoning behind the log|-8.573 × 10^13| / log(2) computation?

2) When obtaining the exponent 47 in 2^47 from the computation in my question #1, I noticed that -8.573 × 10^13 was multiplied by a factor of 2^47 / (1.4074 × 10^14) = 1.

My question for this part #2 is: why was that 47 “hunted down”? Why couldn't I have multiplied -8.573 × 10^13 by 2^50 / (1.1259 × 10^15) = 1, for example? Is there a significance to this, or could I have chosen to multiply by 2^46 / (7.037 × 10^13) as well?

3) For the number of significant bits, I get the 14 – 1 part but, I don't understand how the 4/log(2) computation was obtained.

If something is unclear with what I am asking, tell me and I will attempt to clarify the situation.

Any input would be greatly appreciated!

2. Mar 8, 2013

### vela

Staff Emeritus
Say you wanted to write the number 1234 in scientific notation. Noting that $10^3 < 1234 < 10^4$, you know it has to be
$$1.234 \times 10^3.$$ If you were to choose a larger exponent, the mantissa would be less than 1, e.g., $0.1234\times 10^4$. If you used a smaller exponent, the mantissa would be greater than 10, e.g., $12.34\times 10^2$. You want the closest power of 10 that's less than 1234. So how do you know 1234 is between $10^3$ and $10^4$? It's because $\log_{10} 1234 = 3.091$.

You're doing the same thing here, except you're working with powers of 2. Recall that $\log_b x = \frac{\log x}{\log b}$.

The first calculation tells you that $2^{46} < 8.573\times 10^{13} < 2^{47}$. (You really want to use $2^{46}$ instead of the $2^{47}$ that was used in the solution. Note that there had to be a multiplication by 2 at the end to fix the mantissa up.)
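As a quick sanity check of that bracketing, here is a minimal Python sketch. The helper name `binary_exponent` is just an illustrative choice, not something from the attached solution.

```python
import math

def binary_exponent(x: float) -> int:
    """Return e such that 2**e <= |x| < 2**(e + 1)."""
    return math.floor(math.log2(abs(x)))

x = -8.573e13
e = binary_exponent(x)            # floor(log2|x|)
significand = abs(x) / 2**e       # falls in [1, 2) for this choice of e
too_big = abs(x) / 2**(e + 1)     # one exponent higher: leading factor drops below 1

print(e, significand, too_big)
```

Picking the next exponent up gives a leading factor below 1 (about 0.609), which is why the solution needed the extra multiplication by 2 at the end.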

The bit about 4/log(2) = 13.28 is a typo (as it's clearly false). It should have said $\log(10^4)/\log(2) = 13.28$. Why $10^4$? It's because the 8.573 has 4 significant figures.

3. Mar 8, 2013

### s3a

Hello and, thank you very much! :)

Just to say, 4/log(2) = 13.28 is correct if you assume log is log_10 (as opposed to log being log_2 as was previously assumed in the solution). As you likely know, the ratio of two logarithms is independent of the base of the logarithms as long as the base of the logarithm in the numerator is the same as the base of the logarithm in the denominator.
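A small Python sketch of that base-independence, assuming 4 significant decimal digits as in 8.573 (the variable names are only for illustration):

```python
import math

digits = 4  # significant decimal digits in 8.573

bits_via_log10 = digits / math.log10(2)               # 4 / log10(2)
bits_via_ratio = math.log(10**digits) / math.log(2)   # log(10^4) / log(2), natural logs

# Round up, since a whole number of bits is needed to cover 4 decimal digits.
bits_needed = math.ceil(bits_via_log10)

print(bits_via_log10, bits_via_ratio, bits_needed)
```

Both expressions give the same value of roughly 13.28, and rounding up gives the 14 significant bits used in the solution.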

Also, just to put it in my own words: the procedure with what should have been 2^46 instead of 2^47 was for rewriting the decimal number as a power of 2 times a mantissa of the form 1.(fraction field). The point of the work that yields the approximate value of 13.28, on the other hand, was to find how many bits are needed to keep the same precision (or, possibly, a little more precision?) in a binary representation instead of a decimal one. Right?

4. Mar 8, 2013

### vela

Staff Emeritus
Yes, of course, you're right. I was using a natural log here.

Yup, you seem to have it down.

5. Mar 8, 2013

### s3a

Sorry, I double posted.

6. Mar 8, 2013

### s3a

Looking at the final answer, did the person who made the solutions make a mistake?

I ask because, to me, it seems that the person took 0.6092 (which was the fractional part in front of the 2^47) and put the part to the right of the radix point (where the mantissa to the left of the radix point is 0) as the fraction field portion of the IEEE 754 number. Also, the decimal version of the value stored in the exponent field is 86 which just seems wrong to me and, I have trouble trying to figure out what kind of mistake the person could have made in order to try and make sense of things. After that, it seems that the hexadecimal number was also incorrectly converted as can be seen here ( http://www.wolframalpha.com/input/?i=convert+11010110100110111111001000000_2+to+hexadecimal ).

Could you please confirm (or deny), for me, that the correct final answer is the following?:
(1 10101101 00110111111010000000000)_2 = D69BF400_16

7. Mar 8, 2013

### vela

Staff Emeritus
It looks like there's one mistake, but not the ones you're thinking of. There's a missing 0 bit, so the fourth nibble of the mantissa in the solution is 4, but it should be 2.

I think you're just misreading the final answer because the bits are split up kind of strangely. It should be

sign bit (1 bit) = 1
exponent (8 bits) = 1010 1101 = 173 = 46+127
mantissa (23 bits) = 0011 0111 1110 0010 0000 000 = 37E20_16 plus the last 3 bits, all zero.

Concatenating those together and then splitting into nibbles gives

1101 0110 1001 1011 1111 0001 0000 0000
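The concatenation step can be sketched in Python as follows. The function name `pack_fields` is hypothetical, and the fraction string is the one quoted in this post.

```python
def pack_fields(sign: int, biased_exponent: int, fraction_bits: str) -> str:
    """Join sign (1 bit), biased exponent (8 bits) and a 23-bit fraction into 8 hex digits."""
    if len(fraction_bits) != 23:
        raise ValueError("fraction field must be exactly 23 bits")
    word = (sign << 31) | (biased_exponent << 23) | int(fraction_bits, 2)
    return f"{word:08X}"

biased = 46 + 127  # true exponent plus the single-precision bias
hex_word = pack_fields(1, biased, "00110111111000100000000")
print(hex_word)
```

Reading the hex digits back as nibbles should reproduce the grouping shown above.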

8. Mar 8, 2013

### vela

Staff Emeritus
It occurred to me that the solution is actually correct as written. If you limit the mantissa to 14 bits and round up the last bit because the following bit would have been a 1, you get the result in the solution.

9. Mar 8, 2013

### s3a

Is it convention to just round up like that in binary?

Regardless, I believe 13 digits after the radix point should be considered, since the 1 before the radix point is implicit (13 + 1 = 14 significant binary digits). I bolded those 13 digits in the fraction field:

1 10101101 **0011011111100**0100000000

It seems to me that including the rounding, in this case, would cause there to be a 15th significant digit to be considered (which is not wanted – as stated in the instruction) since the 14th digit after the radix point is a 0 just like the 13th digit – it's the 15th digit after the radix point that is a 1.

Basically, shouldn't the fraction field be as follows?:
00110111111000000000000
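To compare the two options discussed here, this is a minimal Python sketch that truncates the fraction to 13 bits and, separately, rounds it to 14 bits. The helper names are made up for illustration, and the starting fraction is the 23-bit string quoted earlier in the thread.

```python
FRACTION = "00110111111000100000000"  # 23-bit fraction field as quoted in the thread

def truncate_fraction(bits: str, keep: int) -> str:
    """Keep the first `keep` bits and zero the rest."""
    return bits[:keep] + "0" * (len(bits) - keep)

def round_fraction(bits: str, keep: int) -> str:
    """Round half up at `keep` bits (no carry into the exponent is handled here)."""
    value = int(bits[:keep], 2) + (1 if bits[keep] == "1" else 0)
    return format(value, f"0{keep}b") + "0" * (len(bits) - keep)

def to_hex(sign: int, biased_exponent: int, fraction_bits: str) -> str:
    word = (sign << 31) | (biased_exponent << 23) | int(fraction_bits, 2)
    return f"{word:08X}"

truncated = to_hex(1, 173, truncate_fraction(FRACTION, 13))
rounded = to_hex(1, 173, round_fraction(FRACTION, 14))
print(truncated, rounded)
```

Truncating at 13 fraction bits clears the 1 in the 15th position, while rounding at 14 bits carries it into the 14th position, which is where the two candidate answers differ.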

10. Mar 8, 2013

### vela

Staff Emeritus
Yes, I think you're right. I wasn't clear in my previous post, but I was just guessing about the rounding. IEEE 754 does specify rounding conventions, but it's probably for truncating a number to fit into 32 or 64 bits. In any case, it shouldn't matter when all you want is 13 bits.

By the way, I found a converter online: http://babbage.cs.qc.cuny.edu/IEEE-754/index.xhtml
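For a local cross-check in the same spirit as the online converter, Python's standard `struct` module can pack a value as a big-endian single-precision float. This is only a sketch: the function name `float32_fields` is made up here, and it converts at full 24-bit precision rather than the 14 significant bits the problem asks for.

```python
import struct

def float32_fields(x: float):
    """Return (word, sign, biased_exponent, fraction) for x stored as an IEEE 754 single."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31
    biased_exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    return word, sign, biased_exponent, fraction

word, sign, biased_exponent, fraction = float32_fields(-8.573e13)
print(f"{word:08X}", sign, biased_exponent - 127, f"{fraction:023b}")
```

The leading fraction bits from this conversion can then be compared against the hand-derived ones before any truncation to 13 bits.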

11. Mar 8, 2013

### s3a

Edit: If you read my version before I edited it, read my/this post again.

Also, sorry for being pedantic but, to be extra clear, the answer (in both binary and hexadecimal forms) is actually (slightly) wrong and the correct answer (to the exact amount of significant digits requested) is (1 10101101 00110111111000000000000)_2 = D69BF000_16 instead, right?

Last edited: Mar 8, 2013
12. Mar 8, 2013

### vela

Staff Emeritus
No, I think you're right that the last 1 bit should actually be zero, since you want to truncate at 13 bits.

13. Mar 8, 2013

### s3a

Did you answer my latest update or the one before that?

14. Mar 8, 2013

### vela

Staff Emeritus
The one before that.

15. Mar 8, 2013

### s3a

Okay so, unless I've gotten too sleepy, it seems that you're agreeing with my most up-to-date post. Are you?

16. Mar 8, 2013

### vela

Staff Emeritus
Yes.

17. Mar 8, 2013

### s3a

Yay! Alright, thank you very much for following this all with me. :D