Round off error of floating point number.

SherlockOhms · Oct 1, 2013

(Mods, I posted a similar thread in the computer science forum but now realize that this is a more suitable place for it. Could you please remove said thread from the other forum)

I've attached a photo of the example below. 0.2 is the number that we're trying to approximate as a floating point number, and fl(x) is that approximation. |fl(x) - 0.2| is the round-off error. The lecturer jumps from the above equation to |-1 + (0.1001...)2| x 2^(-52) x 2^(-3).
Could somebody explain how he made this jump?

SherlockOhms · Oct 1, 2013

[Attached image: ImageUploadedByPhysics Forums1380653025.406909.jpg]

Mark44 · Oct 1, 2013

What do you get when you actually do the subtraction represented by 0.2 - fl(0.2)?

SherlockOhms · Oct 1, 2013

Mark44 said:

What do you get when you actually do the subtraction represented by 0.2 - fl(0.2)?

1.10011001...1001 x 2^-3 - 1.10011001...1010 x 2^-3.
So, I assume he factored out the 2 ^-3. Just don't know where the 1 and 2^-52 came from really.

Mark44 · Oct 1, 2013

SherlockOhms said:

1.10011001...1001 x 2^-3 - 1.10011001...1010 x 2^-3.

No. If you do the subtraction as you show it, you get 0. Look at the page you took the photo of. Do you notice the bar over part of the binary representation of .2?

SherlockOhms said:

So, I assume he factored out the 2 ^-3. Just don't know where the 1 and 2^-52 came from really.

SherlockOhms · Oct 1, 2013

Mark44 said:

No. If you do the subtraction as you show it, you get 0. Look at the page you took the photo of. Do you notice the bar over part of the binary representation of .2?

Why would you get 0? The binary ends 1010 for fl(x) and 1001 for 0.2, would this give 0? Yeah, I see the bar. So, 1001 is an infinite pattern.

SherlockOhms · Oct 1, 2013

I see that the two numbers are the same up until the 49th digit, then they begin to vary. Is this right? Apologies if I'm not catching this quick enough.

phinds · Oct 1, 2013

SherlockOhms said:

I see that the two numbers are the same up until the 49th digit, then they begin to vary. Is this right? Apologies if I'm not catching this quick enough.

Yes, that's the idea, but "vary" isn't quite a complete explanation. The computer representation is of necessity a fixed number of bits, whereas the actual value doesn't stop. The computer representation therefore has to be treated as though it were extended by 0's, hence the difference between the two values.
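To see the fixed-width stored value concretely, here is a minimal Python sketch. It assumes the machine stores floats as IEEE 754 doubles (52 fraction bits), and the helper name `mantissa_bits` is made up for illustration.

```python
import struct

def mantissa_bits(x: float) -> str:
    """Return the 52 stored fraction bits of a double as a string of 0s and 1s."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    # The low 52 bits are the fraction field (the leading 1 is implicit).
    return format(raw & ((1 << 52) - 1), "052b")

bits = mantissa_bits(0.2)
print(bits[:48])   # first 48 stored bits: the 1001 pattern, twelve times
print(bits[48:])   # bits 49 through 52, where rounding took effect
```

Everything past the 52nd bit is simply not stored, which is why the stored value behaves as if it were followed by 0's.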

SherlockOhms · Oct 1, 2013

So, while the fl(x) value continues on as a string of 0s, 0.2 continues as 1001 infinitely?

phinds · Oct 1, 2013

SherlockOhms said:

So, while the fl(x) value continues on as a string of 0s, 0.2 continues as 1001 infinitely?

I'm pretty sure that's what I just said.

SherlockOhms · Oct 1, 2013

When subtracting fl(x) from 0.2, do we get 0.0000...(for 52 places)1001(repeating now)? If so, is 1010 - 1001 = 0000? We haven't properly covered binary/floating point arithmetic in proper detail. Thanks.

SherlockOhms · Oct 1, 2013

phinds said:

I'm pretty sure that's what I just said.

Hah. Just clarifying in my own words in case I took you up incorrectly.

Mark44 · Oct 1, 2013

SherlockOhms said:

When subtracting fl(x) from 0.2, do we get 0.0000...(for 52 places)1001(repeating now)? If so, is 1010 - 1001 = 0000?

No. It should be obvious that you don't get zero, because the two numbers on the left are different. Anyway, the answer is 0001.

Subtraction in base-2 works the same way as subtraction in base-10, but there are way fewer "facts" to remember.

1 - 1 = 0
1 - 0 = 1
0 - 0 = 0
0 - 1 ---> requires a borrow from the next place to the left.
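As a rough illustration of those rules, here is a short Python sketch that subtracts one bit string from another column by column. The function name `sub_binary` is hypothetical, and it assumes both strings have the same length and that the first number is at least as large as the second.

```python
def sub_binary(a: str, b: str) -> str:
    """Subtract bit string b from a (same length, a >= b), column by column."""
    result = []
    borrow = 0
    # Work from the rightmost (least significant) column toward the left.
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        diff = int(bit_a) - int(bit_b) - borrow
        if diff < 0:
            diff += 2      # borrow one unit from the next column to the left
            borrow = 1
        else:
            borrow = 0
        result.append(str(diff))
    return "".join(reversed(result))

print(sub_binary("1010", "1001"))  # the four-bit case discussed above
```

For 1010 - 1001, only the rightmost column needs a borrow, and the result is 0001.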

SherlockOhms said:

We haven't properly covered binary/floating point arithmetic in proper detail. Thanks.

SherlockOhms · Oct 1, 2013

Cool. So, multiplying by 2^-52 is then used to bring the 1001 back to the binary point? And the -1 is the result of the subtraction?

Mark44 · Oct 1, 2013

The first 48 bits of both numbers are the same. The subtraction is for the 49th through 52nd bits. If they were subtracting 1001 from 1010, they would get 1, but the subtraction is the other way around, so they get -1 (after multiplying by 2⁴⁸⁺⁴). To balance multiplying the number by 2⁵², they also multiply by 2^-52, which is equivalent to dividing by 2⁵². The other term, (0.1001...)2, is the repeating part of the binary representation of 0.2 that lies beyond the 52nd bit.
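The expression can be checked numerically with exact rational arithmetic. This is only a sketch: it assumes IEEE 754 doubles, takes the 2^-3 factor from the lecturer's expression as given, and the variable names are my own.

```python
from fractions import Fraction

exact = Fraction(1, 5)      # the real number 0.2
stored = Fraction(0.2)      # exact value of the double nearest to 0.2

# 0.1001 1001 ... in binary is a geometric series: (9/16) / (1 - 1/16) = 3/5.
tail = Fraction(9, 16) / (1 - Fraction(1, 16))

# The lecturer's expression: |-1 + (0.1001...)_2| x 2^-52 x 2^-3
formula = abs(-1 + tail) * Fraction(1, 2**52) * Fraction(1, 2**3)

print(abs(exact - stored) == formula)  # do the two agree exactly?
print(float(formula))                  # size of the round-off error
```

Since the repeating tail is worth 3/5, the bracket is |-1 + 3/5| = 2/5, so the round-off error is 0.4 x 2^-55, roughly 1.1 x 10^-17.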

Can you figure out why they also have the factor of 2^-3?

SherlockOhms · Oct 2, 2013

Mark44 said:

The first 48 bits of both numbers are the same. The subtraction is for the 49th through 52nd bits. If they were subtracting 1001 from 1010, they would get 1, but the subtraction is the other way around, so they get -1 (after multiplying by 2⁴⁸⁺⁴). To balance multiplying the number by 2⁵², they also multiply by 2^-52, which is equivalent to dividing by 2⁵². The other term, (0.1001...)2, is the repeating part of the binary representation of 0.2 that lies beyond the 52nd bit.

Can you figure out why they also have the factor of 2^-3?

Thanks for that. Is the 2^-3 there because it was there initially in the representation of both 0.2 and fl(x)? 0.2 was 1.1001... x 2^-3 initially, and after the approximation fl(x) was made, it was also represented in base-2 scientific notation with an exponent of -3. Is that correct?

Mark44 · Oct 2, 2013

Yes. Without the 2^-3 scaling factor, 0.2₁₀ would be 0.0011001...₂.

What they've done in "normalizing" this number is move the "binary" point three places to the right, so that there is a 1 to the left of the binary point. That amounts to multiplying by 2³, with a compensating multiplier of 2^-3.
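For anyone who wants to check the normalization on a machine, here is a small Python sketch using the standard `math.frexp` function. Note that `frexp` uses a different convention (significand in [0.5, 1)), so the sketch shifts by one more place to match the 1.xxx x 2^-3 form used in the notes; the variable names are my own.

```python
import math

# math.frexp returns (m, e) with x == m * 2**e and 0.5 <= m < 1.
m, e = math.frexp(0.2)

# Shift one more place so the significand lies in [1, 2), as in the notes.
significand, exponent = m * 2, e - 1

print(significand, exponent)               # normalized form of the stored 0.2
print(significand * 2.0**exponent == 0.2)  # scaling by a power of 2 is exact
print((0.2).hex())                         # same information in hexadecimal
```

The exponent comes out as -3, matching the 2^-3 factor in the lecturer's expression.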

SherlockOhms · Oct 2, 2013

That really helped. Thanks a million.
