IEEE Floating-Point Division: Is the Result Guaranteed?

SUMMARY

The discussion centers on whether IEEE 754 double-precision division gives consistent results for two rational numbers with the same mathematical value but different numerator/denominator pairs. Participants concluded that because the IEEE standard guarantees exactly rounded basic operations, the two divisions yield identical results whenever both numerators and denominators are whole numbers representable in fewer than 50 bits: each operand converts to a double exactly, so the division is the only rounding step. More generally, discrepancies can arise from the binary representation of decimal fractions, so caution is still advised when comparing floating-point numbers for equality.

PREREQUISITES
  • Understanding of IEEE 754 floating-point arithmetic
  • Knowledge of binary representation of numbers
  • Familiarity with rounding errors in floating-point calculations
  • Basic programming skills in languages like C that utilize floating-point comparisons
NEXT STEPS
  • Study IEEE 754 floating-point standard for detailed rounding rules
  • Learn about binary representation of decimal numbers and its implications
  • Explore techniques for comparing floating-point numbers safely, such as using epsilon
  • Implement tests for floating-point arithmetic in programming languages to observe rounding behavior
USEFUL FOR

Software developers, computer scientists, and anyone involved in numerical computing or programming that requires precise floating-point arithmetic and comparisons.

JakeD
Hi,

First of all, if this is wrong forum, I apologize; please direct me to the correct one.

I have 4 × 8-byte integers representing 2 rational numbers (i.e. 2 pairs of a numerator and a denominator).
The 2 rational numbers have the same mathematical value but different numerator and denominator components.

E.g. 7/50, 21/150.

I cast the numbers to doubles (8-byte floating-point numbers) and then divide each cast numerator by its denominator.

Is it guaranteed that the two division operations produce the same result, for arbitrary integers?

The following assumptions may be made:
1. The numerator and denominator are whole numbers representable in fewer than 50 bits.
2. The denominator is greater than or equal to the numerator.

Regards
 
JakeD said:
Is it guaranteed that the two division operations produce the same result, for arbitrary integers?
No, because you are working in decimal and the computer is working in binary. Your decimal fractions are exact, but their binary conversions may not be.

For example, if you add 1.4 + 1.6 you would certainly expect the answer to the question "is the sum exactly 3.0?" to be yes, but 1.4 and 1.6 have no exact binary representation, so the stored sum is not guaranteed to be exactly 3.0; it may come out as something like 2.999999999..., and an exact comparison then fails.

Thus if you convert 7 and 50 to binary and divide them you get one answer, but if you convert 21 and 150 to binary and divide them you may get a slightly different answer. Again, rounding may save you, but carried out to enough digits the results are often not the same for this kind of thing.
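For example, in Python (whose float is an IEEE 754 double) you can inspect the stored values directly; the 0.1 + 0.2 case below is the classic illustration of an exact comparison failing:

```python
# Decimal fractions usually have no finite binary expansion, so what a
# double actually stores is only the nearest representable value.
print(f"{1.4:.17g}")        # 1.3999999999999999
print(f"{1.6:.17g}")        # 1.6000000000000001

# The classic consequence: an exact equality test can fail.
print(0.1 + 0.2 == 0.3)     # False
print(f"{0.1 + 0.2:.17g}")  # 0.30000000000000004
```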

Further discussion in this thread:
http://www.vbforums.com/showthread.php?211054-Comparison-of-Double-values&s=
 
phinds said:
if you convert 21 to binary long and 150 to binary long and divide them you may get a slightly different answer

7/50 does give the same answer as 21/150.

I'm asking specifically about division under the constraints I've provided. I have reason to believe it's consistent, but I'm not acquainted enough with the spec to be sure.
I would appreciate an answer that references the spec.
 
You asked a very specific question and I answered it. If you think one example proves my answer wrong, you are mistaken.
 
phinds said:
You asked a very specific question and I answered it. If you think one example proves my answer wrong, you are mistaken.

Of course I don't think so.
I don't think that you provided a convincing answer, either. You provided a very general description of floats, which doesn't address my scenario.
I thank you for your participation, anyway.
 
Phinds is correct. Rather than argue with me, read David Goldberg's paper "What Every Computer Scientist Should Know About Floating-Point Arithmetic", which most CS students read as undergraduates. This is the gold standard.

www.lsi.upc.edu/~robert/teaching/master/material/p5-goldberg.pdf

The IMPORTANT part is that comparing floats for equality (like == in C) is more than problematic. You should never do that in program code. Period, the end. Please read the whole paper carefully.
 
jim mcnamara said:

I actually read it, partially.

It says:
The IEEE standard requires that the result of addition, subtraction, multiplication and division be exactly rounded. That is, the result must be computed exactly and then rounded to the nearest floating-point number (using round to even)

That passage does seem to imply that I can expect consistency in my scenario. Can you demonstrate why or where I'm wrong?
Bear in mind that I divide only whole numbers which fit fully in the mantissa part.
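For instance, a quick Python sketch (float here is an IEEE 754 binary64; the struct and Fraction checks are just one way I can think of to inspect the bits and the rounding error) seems to bear this out:

```python
import struct
from fractions import Fraction

# Both divisions round the same mathematical value (7/50 = 21/150 = 0.14),
# so the exactly rounded results should be bit-for-bit identical.
a = 7 / 50
b = 21 / 150
assert struct.pack("<d", a) == struct.pack("<d", b)

# "Exactly rounded" means the stored result differs from the true rational
# value by at most half a unit in the last place; for 0.14 that is
# ulp(0.14)/2 = 2^-56, since 0.14 lies in [2^-3, 2^-2).
err = abs(Fraction(a) - Fraction(7, 50))
assert err <= Fraction(1, 2**56)
```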
 
Stop right here, please. In the time it took you to answer you cannot possibly have read and understood the citation. I'll try one last time - try the code examples here: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

If you cannot understand this second article then we can help. If you want to assert something that is a really bad idea based on a single strawman example, we cannot possibly help you. Thanks for understanding our position.
 
Thanks for the references, I'll read them (or at least try).
 
  • #10
jim mcnamara said:
Stop right here, please. In the time it took you to answer you cannot possibly have read and understood the citation.
The OP may well have read that before coming to this forum.
That may be right, but the original question was specifically about division when both numerator and denominator are exact integers. Responses should address that situation; answers and examples that refer to other situations are not directly applicable.
 
  • #11
I hope that the OP understands that testing for exact equality of floats is a terrible practice and that the responses above are giving wise advice.

That being said, I have an example program that I used in the past to show, for instance, that 25.4 / 10.0 * 1.0 / 2.54 does not give exactly the same answer as (25.4 / 10.0) * (1.0 / 2.54) in that program. I modified that program to loop through numerators, denominators, and multipliers over all combinations of 1..1000. In every case numerator/denominator == (numerator*multiplier)/(denominator*multiplier). It is not clear to me whether this is guaranteed on all machines that follow the IEEE standard, or whether it is due to the particular hardware implementation of multiplication when all numerators and denominators are exact integers.
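A minimal Python sketch of that brute-force check (the function name and the reduced 1..60 range are mine, chosen so it runs quickly; Python's float is an IEEE 754 double):

```python
def first_inconsistency(limit):
    """Search for n, d, m with float(n)/float(d) != float(n*m)/float(d*m)."""
    for n in range(1, limit + 1):
        for d in range(1, limit + 1):
            for m in range(1, limit + 1):
                # All operands here are small integers, so every conversion
                # to double is exact; only the divisions round.
                if float(n) / float(d) != float(n * m) / float(d * m):
                    return (n, d, m)  # first counterexample, if any
    return None  # every combination was consistent

print(first_inconsistency(60))  # prints None on IEEE 754 doubles
```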
 
  • #12
Analyzing floating-point is tricky. 1.4 + 1.6 does not necessarily give you 3.0, because in that scenario rounding occurs three times: when converting 1.4 to double, when converting 1.6 to double, and when adding the two numbers. With that many roundings it is possible for the answer not to be the mathematically correct one.

In the example of the 8-byte integers there *may* only be one rounding - the division. If that is the case then the double value created by the division will be the correctly rounded result - by default the result closest to the mathematically correct answer. Therefore the different calculations will give exactly the same result. This correct rounding of division is a guarantee of the IEEE standard.

The only exception would be if the 8-byte integers were so large (larger than 2^53 IIRC) that there was rounding when converting from them to double.

I wrote that floating-point comparison article but I don't think I ever said you should *never* compare floating-point numbers for equality. You just have to know what you're doing. A more relevant citation might be this one:

https://randomascii.wordpress.com/2014/01/27/theres-only-four-billion-floatsso-test-them-all/

in which I wrote "Conventional wisdom says that you should never compare two floats for equality – you should always use an epsilon. Conventional wisdom is wrong."

My analysis of this integer division question might be incorrect, but if so it needs to be shown where my analysis was wrong. Other examples are not necessarily applicable. Floating-point math doesn't just arbitrarily mangle results. It has very specific guarantees that can be used in some cases.
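Both points, the single correctly rounded division and the 2^53 conversion boundary, are easy to verify in Python (whose float is an IEEE 754 binary64); this sketch is illustrative, not code from the thread:

```python
# Equal-valued fractions whose integer operands convert to double exactly
# give identical quotients: the division is the only rounding step, and
# IEEE 754 requires it to be correctly rounded.
assert 7 / 50 == 21 / 150 == 70 / 500

# The exception: integers above 2^53 may already round on conversion.
assert float(2**53) == 2**53              # 2^53 itself is still exact
assert float(2**53 + 1) == float(2**53)   # 2^53 + 1 rounds (ties-to-even)
```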
 
  • #13
Bruce Dawson said:
In the example of the 8-byte integers there *may* only be one rounding - the division. If that is the case then the double value created by the division will be the correctly rounded result - by default the result closest to the mathematically correct answer. Therefore the different calculations will give exactly the same result. This correct rounding of division is a guarantee of the IEEE standard.

The only exception would be if the 8-byte integers were so large (larger than 2^53 IIRC) that there was rounding when converting from them to double.

Thank you Bruce!
This is what I understood from the article referenced above as well.

As said, the numbers in my use case are guaranteed to require less than 50 bits, so it seems I can expect consistency.
 
  • #14
JakeD said:
Thank you Bruce!
This is what I understood from the article referenced above as well.

As said, the numbers in my use case are guaranteed to require less than 50 bits, so it seems I can expect consistency.
Jake, after looking at it again, I think given the fairly extreme constraints you have specified, you are correct. Jim and I were addressing the more general case.
 
  • #15
phinds said:
Jake, after looking at it again, I think given the fairly extreme constraints you have specified, you are correct. Jim and I were addressing the more general case.

Thank you phinds for clarifying it.
 
  • #16
JakeD said:
As said, the numbers in my use-case are guaranteed to require less then 50 bits, therefore it seems like I can expect consistency.
You should be careful about relying on use cases for verification; the problem may occur only in very specific situations. That is why I decided to just brute-force a test of all numerator/denominator/multiplier combinations from 1 through 1000. Although that is not an exhaustive set of tests, I felt confident that any special problem case would occur at least once in that set. On my computer it passed all the tests, as I thought it might. In my opinion, use cases are better suited to specifying requirements than to formal verification.
 
  • #17
Note that 123456789 is a 27-bit integer, well under the 53-bit mantissa (with an implicit leading 1 bit) used for doubles.
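A quick Python check of that bit count, using the standard int.bit_length method:

```python
assert (123456789).bit_length() == 27   # 2^26 <= 123456789 < 2^27
assert float(123456789) == 123456789    # converts to double exactly
```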
 
  • #18
rcgldr said:
Note that 123456789 is a 27 bit integer, less than the 53 bit (leading bit assumed to be 1) mantissa used for doubles.

This does not make the scenarios similar.
In my scenario I start with integers and end up with floats, while in the other scenario it's the other way around.

rcgldr said:
A similar issue will occur with division with integer values stored in doubles.

I'm not sure this statement follows from the former.
 
  • #19
rcgldr said:
Note that 123456789 is a 27 bit integer, less than the 53 bit (leading bit assumed to be 1) mantissa used for doubles.

JakeD said:
This does not make the scenarios similar. In my scenario I start with integers and end up with floats, while in the other scenario it's the other way around.
I struck out my prior post. Your prior posts seem correct: a/b should == (a*c)/(b*c) as long as both products are < 2^50 as stated by the OP (probably < 2^53 is good enough).
 
  • #21
Cool, thanks for sharing. Reading later.

Edit: That was a very good read, thanks!
 
  • #22
Bruce Dawson said:
This discussion inspired me to write a new blog post explaining why 0.1 + 0.2 != 0.3, and why that's no reason to give up on floating-point math:
https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/
Some floating-point additions give exact results; for example, 1.5 + .25 + .125 comes out to exactly 1.875. The reason is that all three of these numbers have exact binary representations, while .1, .2, and .3 do not.
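This contrast can be shown directly in Python (whose float is an IEEE 754 double):

```python
# 1.5, 0.25 and 0.125 are sums of a few powers of two, so each is stored
# exactly and the additions introduce no rounding at all.
assert 1.5 + 0.25 + 0.125 == 1.875

# 0.1, 0.2 and 0.3 have no finite binary expansion, so the analogous
# exact comparison fails.
assert 0.1 + 0.2 != 0.3
```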
 
