# IEEE float division

JakeD
Hi,

First of all, if this is the wrong forum, I apologize; please direct me to the correct one.

I have four 8-byte integers, representing 2 rational numbers (i.e. 2 pairs of a numerator and a denominator).
The 2 rational numbers have the same mathematical value, but different numerator and denominator components.

E.g. 7/50, 21/150.

I cast the numbers to doubles (8-byte floating-point numbers), and then divide each cast numerator by its cast denominator.

Is it guaranteed that the result is the same among the 2 division operations, for arbitrary integers?

The following assumptions may be made:
1. The numerator and denominator are whole numbers, representable in fewer than 50 bits.
2. The denominator is greater than or equal to the numerator.

Regards

Gold Member
2022 Award
> Is it guaranteed that the result is the same among the 2 division operations, for arbitrary integers?

No, because you are working in decimal and the computer is working in binary. That means that your decimal fractions are exact but when converted to binary may not be.

For example, if you add 1.4 + 1.6, you would certainly expect a "yes" to the question "is the sum of these two numbers 3.0?". In fact, unless rounding works in your favor, the answer may not be exactly 3.0 but something like 2.999999999, so an exact comparison does not give the result you expect.

Thus if you convert 7 to binary long and 50 to binary long and then divide them, you get one answer, but if you convert 21 to binary long and 150 to binary long and divide them, you may get a slightly different answer. Again, rounding may save you, but carried out to enough digits the results are often not the same for this kind of thing.

JakeD
phinds,
> if you convert 21 to binary long and 150 to binary long and divide them you may get a slightly different answer

7/50 does give the same answer as 21/150.

I'm asking specifically about division in the context of the constraints I've provided. I have a reason to believe it's consistent, but am not acquainted enough with the spec to be sure.
I would appreciate an answer that references the spec.

Gold Member
2022 Award
You asked a very specific question and I answered it. If you think one example proves my answer wrong, you are mistaken.

JakeD
> You asked a very specific question and I answered it. If you think one example proves my answer wrong, you are mistaken.

Of course I don't think so.
I don't think that you provided a convincing answer, either. You provided a very general description of floats, which doesn't address my scenario.
I thank you for your participation, anyway.

Mentor
Phinds is correct. Rather than argue with me, consider David Goldberg and a paper most CS students read as undergraduates. This is the gold standard.

www.lsi.upc.edu/~robert/teaching/master/material/p5-goldberg.pdf

The IMPORTANT part is that comparing floats for equality (as with == in C) is more than problematic. You should never do that in program code. Period, the end. Read the whole paper carefully, please.

JakeD

It says:
> The IEEE standard requires that the result of addition, subtraction, multiplication and division be exactly rounded. That is, the result must be computed exactly and then rounded to the nearest floating-point number (using round to even).

This passage does seem to imply that I can expect consistency in my scenario. Can you demonstrate why or where I'm wrong?
Bear in mind that I divide only whole numbers which fit fully in the mantissa part.
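A minimal sketch of what "exactly rounded" means for a single division, assuming Python 3.9+ (for `math.nextafter`) and a platform where Python floats are IEEE 754 doubles. The helper name `is_correctly_rounded` is made up for this illustration; it is not part of any library.

```python
from fractions import Fraction
import math


def is_correctly_rounded(a: int, b: int) -> bool:
    """Check that float(a) / float(b) is the double nearest to the exact a/b.

    Assumes |a| and |b| are below 2**53, so both conversions to float are exact
    and the division is the only rounding step.
    """
    q = float(a) / float(b)
    exact = Fraction(a, b)            # exact rational quotient
    err = abs(Fraction(q) - exact)    # Fraction(q) is the exact value stored in q
    below = math.nextafter(q, -math.inf)
    above = math.nextafter(q, math.inf)
    # Neither neighbouring double may be strictly closer to the exact quotient.
    return (err <= abs(Fraction(below) - exact)
            and err <= abs(Fraction(above) - exact))


print(is_correctly_rounded(7, 50), is_correctly_rounded(21, 150))
print(float(7) / float(50) == float(21) / float(150))
```

Because 7/50 and 21/150 are the same rational number, and each quotient is rounded once to the nearest double, both divisions should land on the same double.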

Mentor
Stop right here, please. In the time it took you to answer you cannot possibly have read and understood the citation. I'll try one last time - try the code examples here: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

If you cannot understand this second article then we can help. If you want to assert something that is a really bad idea based on a single strawman example, we cannot possibly help you. Thanks for understanding our position.

JakeD
Thanks for the references, I'll read them (or at least try).

Homework Helper
Gold Member
> Stop right here, please. In the time it took you to answer you cannot possibly have read and understood the citation.

The OP may well have read that before coming to this forum.
That may be right, but the original question was specifically about division when both numerator and denominator are exact integers. I think that responses should address that situation. I don't think that answers or examples that refer to other situations are directly applicable.

Homework Helper
Gold Member
I hope that the OP understands that testing for exact equality of floats is a terrible practice and that the responses above are giving wise advice.

That being said, I have an example program that I used in the past to show (for instance) that 25.4 / 10.0 * 1.0 / 2.54 does not give the exact same answer as (25.4 / 10.0) * (1.0 / 2.54) in that program. I modified that program to loop through numerators, denominators and multipliers with all combinations of (1..1000). In all cases numerator/denominator = (numerator*multiplier)/(denominator*multiplier). It is not clear to me that this is guaranteed on all machines that follow the IEEE standard, or if it is due to particular hardware implementations of multiplication when all numerators and denominators are exact integers.
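A scaled-down sketch of that kind of brute-force check, written in Python rather than the original program (which is not shown in the thread). The function name `count_mismatches` and the smaller limit are assumptions chosen so the loop finishes quickly; the integers are converted with `float()` before dividing to mirror the cast-then-divide scenario.

```python
def count_mismatches(limit: int) -> int:
    """Count (n, d, m) triples in 1..limit where n/d != (n*m)/(d*m) as doubles."""
    mismatches = 0
    for n in range(1, limit + 1):
        for d in range(1, limit + 1):
            q = float(n) / float(d)
            for m in range(1, limit + 1):
                # n*m and d*m are computed exactly as integers, then cast.
                if float(n * m) / float(d * m) != q:
                    mismatches += 1
    return mismatches


print(count_mismatches(50))
```

A passing run is evidence rather than proof; the guarantee itself comes from the correct-rounding rule for division.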

Bruce Dawson
Analyzing floating-point is tricky. 1.4 + 1.6 does not necessarily give you 3.0 because in that scenario rounding occurs three times: when converting 1.4 to double, when converting 1.6 to double, and when adding the two numbers. Because so many roundings are happening it is possible for the answer to not be the mathematically correct answer.
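The multiple roundings can be made visible with a short sketch, assuming Python floats are IEEE 754 doubles. `Decimal(x)` prints the exact value a literal was rounded to. With doubles, the roundings in 1.4 + 1.6 happen to cancel, while those in 0.1 + 0.2 do not.

```python
from decimal import Decimal

# Decimal(float) shows the exact binary value each literal was rounded to.
print(Decimal(0.1))  # slightly above 0.1
print(Decimal(0.2))  # slightly above 0.2
print(Decimal(0.3))  # slightly below 0.3

# Three roundings (two conversions, one addition) that do not cancel:
print(0.1 + 0.2 == 0.3)  # False

# Three roundings that happen to cancel in double precision:
print(1.4 + 1.6 == 3.0)  # True
```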

In the example of the 8-byte integers there *may* only be one rounding - the division. If that is the case then the double value created by the division will be the correctly rounded result - by default the result closest to the mathematically correct answer. Therefore the different calculations will give exactly the same result. This correct rounding of division is a guarantee of the IEEE standard.

The only exception would be if the 8-byte integers were so large (larger than 2^53 IIRC) that there was rounding when converting from them to double.
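The conversion boundary can be checked directly. This is a small sketch assuming Python floats are IEEE 754 doubles with a 53-bit significand; `converts_exactly` is a hypothetical helper name.

```python
LIMIT = 2 ** 53  # doubles carry a 53-bit significand


def converts_exactly(n: int) -> bool:
    """True if converting n to a double and back returns n unchanged."""
    return int(float(n)) == n


print(converts_exactly(2 ** 50 - 1))  # True: well inside the exact range
print(converts_exactly(LIMIT))        # True: a power of two is representable
print(converts_exactly(LIMIT + 1))    # False: 2**53 + 1 needs 54 bits
```

Integers below 2^50, as in the original question, therefore convert without any rounding.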

I wrote that floating-point comparison article but I don't think I ever said you should *never* compare floating-point numbers for equality. You just have to know what you're doing. A more relevant citation might be this one:

https://randomascii.wordpress.com/2014/01/27/theres-only-four-billion-floatsso-test-them-all/

in which I wrote "Conventional wisdom says that you should never compare two floats for equality – you should always use an epsilon. Conventional wisdom is wrong."

My analysis of this integer division question might be incorrect, but if so it needs to be shown where my analysis was wrong. Other examples are not necessarily applicable. Floating-point math doesn't just arbitrarily mangle results. It has very specific guarantees that can be used in some cases.

JakeD
> In the example of the 8-byte integers there *may* only be one rounding - the division. If that is the case then the double value created by the division will be the correctly rounded result - by default the result closest to the mathematically correct answer. Therefore the different calculations will give exactly the same result. This correct rounding of division is a guarantee of the IEEE standard.
>
> The only exception would be if the 8-byte integers were so large (larger than 2^53 IIRC) that there was rounding when converting from them to double.

Thank you Bruce!
This is what I understood from the article referenced above as well.

As said, the numbers in my use case are guaranteed to require fewer than 50 bits, so it seems I can expect consistency.

Gold Member
2022 Award
> Thank you Bruce!
> This is what I understood from the article referenced above as well.
>
> As said, the numbers in my use case are guaranteed to require fewer than 50 bits, so it seems I can expect consistency.

Jake, after looking at it again, I think given the fairly extreme constraints you have specified, you are correct. Jim and I were addressing the more general case.

JakeD
> Jake, after looking at it again, I think given the fairly extreme constraints you have specified, you are correct. Jim and I were addressing the more general case.

Thank you phinds for clarifying it.

Homework Helper
Gold Member
> As said, the numbers in my use case are guaranteed to require fewer than 50 bits, so it seems I can expect consistency.

You should be careful about relying on use cases for verification. The problem may occur only in very specific situations. That is why I decided to just "brute force" a test of all numerators / denominators / multipliers in all combinations of 1 through 1000. Although that is not an exhaustive set of tests, I felt confident that any special problem cases would occur at least once in that set. On my computer it passed all the tests, as I thought it might. In my opinion, use cases are more appropriate for specification of requirements than they are for a formal verification.

Homework Helper
Note that 123456789 is a 27 bit integer, less than the 53 bit (leading bit assumed to be 1) mantissa used for doubles.

JakeD
Mod note: Fixed the originator of the quote.
rcgldr said:
> Note that 123456789 is a 27 bit integer, less than the 53 bit (leading bit assumed to be 1) mantissa used for doubles.

This does not make the scenarios similar.
In my scenario I start with integers and end up with floats, while in the other scenario it's the other way around.

> A similar issue will occur with division with integer values stored in doubles.

I'm not sure this statement follows from the former.

Homework Helper
> Note that 123456789 is a 27 bit integer, less than the 53 bit (leading bit assumed to be 1) mantissa used for doubles.
>
> This does not make the scenarios similar. In my scenario I start with integers and end up with floats, while in the other scenario it's the other way around.

I struck out my prior post. Your prior posts seem correct: a/b should == (a*c)/(b*c), as long as both products are < 2^50 as stated by the OP (probably < 2^53 is good enough).
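A randomized sketch of that claim for operands much larger than a small brute-force loop reaches, assuming Python floats are IEEE 754 doubles. The function name, the seed, and the range chosen for the multiplier `c` are all assumptions made for illustration; the constraints (products below 2^50, numerator no larger than denominator) follow the original question.

```python
import random


def scaled_division_agrees(trials: int, seed: int = 0) -> bool:
    """Test a/b == (a*c)/(b*c) in double precision with all products < 2**50."""
    rng = random.Random(seed)
    for _ in range(trials):
        c = rng.randrange(1, 2 ** 20)        # common multiplier
        b = rng.randrange(1, 2 ** 50 // c)   # guarantees b * c < 2**50
        a = rng.randrange(0, b + 1)          # numerator <= denominator
        # Products are exact Python integers; each cast to float is exact too.
        if float(a) / float(b) != float(a * c) / float(b * c):
            return False
    return True


print(scaled_division_agrees(100_000))
```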

JakeD
Cool, thanks for sharing. Reading later.

Edit: That was a very good read, thanks!

Mentor
This discussion inspired me to write a new blog post explaining why 0.1 + 0.2 != 0.3, and why that's no reason to give up on floating-point math:
https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/
Some floating point additions give exact results -- for example 1.5 + .25 + .125 comes out exactly to 1.875. The reason is that all three of these numbers have exact binary representations, while .1, .2, and .3 do not.
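The difference between the two groups of numbers can be inspected with the standard library. This is a short sketch assuming Python floats are IEEE 754 doubles; `Fraction(x)` returns the exact value stored in the double `x`.

```python
from fractions import Fraction

terms = [1.5, 0.25, 0.125]
total = sum(terms)
print(total == 1.875)  # True: every term and every partial sum is exact

# Each term is a short sum of powers of two, so the stored value is exact.
print([x.hex() for x in terms])
print(Fraction(0.125) == Fraction(1, 8))  # True

# 0.1 has no finite binary expansion, so the stored value is only close to 1/10.
print(Fraction(0.1) == Fraction(1, 10))   # False
```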