Python: Why do 0.1 and 1/10 work but 0.3 - 0.2 doesn't?

  • Thread starter: SamRoss
  • Tags: Work
Summary
The discussion centers on the differences in floating-point arithmetic in Python, specifically why operations like 0.3 - 0.2 yield unexpected results, such as 0.09999999999999998, while 0.1 and 1/10 display correctly as 0.1. This discrepancy arises because numbers like 0.3 and 0.2 cannot be precisely represented in binary, leading to truncation errors during calculations. The conversation highlights that when subtracting two floating-point numbers, precision loss occurs, resulting in a value that is slightly off from the expected result. It is noted that Python's print function rounds displayed values, which can further obscure these precision issues. Understanding these limitations is crucial for users to avoid misinterpretation of results in floating-point arithmetic.
  • #91
anorlunda said:
Who else is old enough to remember the early IBM computers, 1401, 1620, 650?
The 1620 was the second computer I worked with.
The first machine I worked with was the 402 accounting machine (programmed via a jumper panel); then the Honeywell 200, then the IBM 1620 - and much later a 1401.

But getting back to the points in hand:

1) Encoding:
The central issue here is "encoding" - how a computer represents numbers - especially non-integer values.
One method is to pick units that allow integer representation. So 0.3 meters minus 0.2 meters might be an issue, but 30 centimeters minus 20 centimeters is no issue at all.
So the encoding could be a 16-bit 2's complement integer with centimeter units. The results will be precise within the range of -327.68 meters to 327.67 meters.
But the floating point arithmetic supported by many computer languages and (for the past few decades) most computer processors is intended to support a wide range of applications. The values could represent time, distance, or non-transcendental numbers, so fixed point arithmetic will not do.
The purpose of floating point arithmetic is to provide convenience. If it does not suit you, use your own encoding. I have a case in point right in front of me: the device I am working with now is intended for low power to extend battery life. It has very little ROM programming area and RAM - much too little to hold a floating point library. So I choose encodings that always preserve precision, and I get better results than floating point. In fact, my target is ideal results (no loss of precision from the captured measurements to the decisions based on those measurements), and with planning I always hit that target.
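To make that concrete, here is a minimal sketch of the integer-centimeter idea (illustration only; the names and the print formatting are mine, not the device code):

Python:
# Illustration: lengths carried as whole centimeters, so the arithmetic is exact.
length_a_cm = 30                  # 0.3 m stored as 30 cm
length_b_cm = 20                  # 0.2 m stored as 20 cm

diff_cm = length_a_cm - length_b_cm
print(diff_cm)                    # 10, exactly

# Convert to a decimal string only at the point of display.
print(f"{diff_cm / 100:.2f} m")   # 0.10 m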

2) The floating point exponent:
A lot of the discussion has focused on how 1/10, 2/10, and 3/10 are not precisely encoded. But there is a floating point vulnerability in play that is more central to the results that @SamRoss is describing. When 0.1 or 1/10 is evaluated, the floating point encoding will be as close to 0.1 as the encoding allows. The same is true for 2/10 and 3/10.
Fundamentally, the floating point encoding is a signed binary exponent (a power of 2) and a mantissa. If I wanted to be precise, I would discuss the phantom bit and other subtleties related to collating sequence - but I will stay general to stay on point.
The precision problem comes with mantissa precision (which is limited to the number of bits reserved for the mantissa) and the absolute precision (which is a function of the mantissa bits and that binary exponent).
And we will keep our mantissa in the range of 0.5 to 1.0.
So 1/10 will be something like ##\frac{4}{5} \times 2^{-3}##, 2/10 becomes ##\frac{4}{5} \times 2^{-2}##, and 3/10 becomes ##\frac{3}{5} \times 2^{-1}##.
During the subtraction, the intermediate (internally hidden) results will be:
##\frac{3}{5} \times 2^{-1} - \frac{4}{5} \times 2^{-2}##
aligning the mantissas: ##\frac{6}{5} \times 2^{-2} - \frac{4}{5} \times 2^{-2} = \frac{2}{5} \times 2^{-2}##
then readjusting the result for floating point encoding: ##\frac{4}{5} \times 2^{-3}##.
During that final readjustment, the mantissa is shifted leaving the low-order bit unspecified. A zero is filled in - but that doesn't create any precision.
The problem would be even more severe (and easier to catch) if the subtraction was 10000.3 - 10000.2.
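A minimal sketch with the standard-library decimal module (my own illustration, not part of the argument above) makes those hidden low-order digits visible, because converting a float to Decimal shows the exact binary value that is actually stored:

Python:
from decimal import Decimal

# Exact values of the doubles stored for the literals 0.1, 0.2 and 0.3:
print(Decimal(0.1))    # 0.1000000000000000055511151231257827021181583404541015625
print(Decimal(0.2))    # 0.200000000000000011102230246251565404236316680908203125
print(Decimal(0.3))    # 0.299999999999999988897769753748434595763683319091796875

# Exact value of the float produced by 0.3 - 0.2; it is a different
# double than the one stored for the literal 0.1:
print(Decimal(0.3 - 0.2))
print(0.3 - 0.2 == 0.1)   # False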

3) Is the computer "wrong":
Clearly this is an issue of semantics. But I would note that compiler statements are imperative. Even compiler statements described as "declarations" (such as "int n") are instructions (i.e., "imperatives") to the compiler and computer. Assuming there is no malfunction, the results, right or wrong, are the programmer's.
 
  • Like
Likes anorlunda
  • #92
PeroK said:
Okay, but I would say that there may be those who agree with me but are inhibited from taking my side of the argument by the aggressive and browbeating tone of the discussion.

I am surprised that challenging the supposed perfection of the silicon chip has elicited so strong a response.

There are places I am sure where the belief that a computer can never be wrong would be met by the same condemnation with which my belief that they can be wrong has been met here.
The current behaviour is caused by two things:
1. Python uses the IEEE floating point standard. This is important for compatibility with everything else.
2. Python will produce a different output for every different IEEE floating point number.
Since the IEEE standard isn't exact, this will necessarily cause long decimal expansions for numbers that are close to a round value. This can be inconvenient for quick programs, and I might have to look up how format works again in Python, because I always forget.
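For the record, a quick sketch of the formatting tools I keep forgetting (standard format specifiers, nothing exotic assumed):

Python:
x = 0.3 - 0.2
print(x)                  # 0.09999999999999998 (default repr)
print(f"{x:.3f}")         # 0.100 (fixed, three decimal places)
print(format(x, ".10g"))  # 0.1 (ten significant digits)
print(round(x, 10))       # 0.1 (rounds the value, then prints its repr)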

I think it's much more important that, if a floating point number changes, I can see that it changed. I think you might get undetected errors if you can't see that a floating point number changed. So I really want point 2. The Python designers seem to think this also.

I don't understand the whole discussion of "the computer can never be wrong". The design of Python might be stupid, although I don't think so, but apart from faulty hardware or bugs, Python can't really be wrong if it does what it was designed to do.

I think challenging the supposed perfection of the silicon chip has elicited a strong response because no one understood what you meant. AFAIK we are just talking about the design of Python.
 
  • #93
The problem discussed here actually has nothing to do with the design of Python specifically. This is the standard floating point implementation used by all computers in the underlying hardware.

Programmers of all languages are expected to understand how this works because of how standard it is.
 
  • Like
Likes Mark44
  • #94
No. At least since post #7, we have been discussing the design of Python, and how it prints floats by default. Other languages do not do this in the same way, even if they also use the same hardware and the same bit patterns to represent floats.
 
  • Skeptical
  • Like
Likes vela and Dale
  • #95
PeroK said:
There's nothing wrong with the floating point math, given that it must produce an approximation.
That is not the only way to think about it.

Floating point math does not deliver an approximation. It delivers an exact result. Given a pair of 64 bit floating point operands, their sum (or difference, or product, or quotient) is a defined 64 bit floating point result. The values and operations form an algebra.

It is not the algebra of the real numbers under addition, subtraction, multiplication and division. It is the algebra of the 64 bit IEEE floats under addition, subtraction, multiplication and division.

The floating point algebra does not have all of the handy mathematical properties that we are used to in the algebra of the real numbers. For instance, ##\frac{3x}{3}## may not be equal to ##3\cdot\frac{x}{3}## in the floating point algebra. That does not make floating point wrong. It merely makes it different.

The two algebras deliver results that closely approximate each other in most cases. Programmers need to be aware that there are differences.
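A hedged sketch of that difference (the literals below are my own examples, not anything mandated by the standard): familiar identities such as associativity of addition, or ##\frac{3x}{3} = 3\cdot\frac{x}{3}##, can fail by a unit in the last place.

Python:
import random

# Addition is not associative in the 64 bit float algebra:
print((0.1 + 0.2) + 0.3)                       # 0.6000000000000001
print(0.1 + (0.2 + 0.3))                       # 0.6
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

# Count how often (3*x)/3 and 3*(x/3) disagree for random doubles in [0, 1):
random.seed(0)
mismatches = 0
for _ in range(100_000):
    x = random.random()
    if (3 * x) / 3 != 3 * (x / 3):
        mismatches += 1
print(mismatches)   # typically a nonzero fraction of the samples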
 
  • Like
Likes DrClaude, Dale, anorlunda and 1 other person
  • #96
jbriggs444 said:
That does not make floating point wrong. It merely makes it different.
Well said.

I just want to add that your point is not limited to floating point. My first big project was a training simulator. All of the models were implemented with 24 bit fixed point integer arithmetic. That was because of the dismally slow performance of floating point in those days.

Everything you said about floating point algebra being different, yet delivering results that closely approximate other algebras, applies to fixed point algebras too.
 
  • Like
Likes jbriggs444
  • #97
willem2 said:
No. At least since post #7, we have been discussing the design of Python, and how it prints floats by default. Other languages do not do this in the same way, even if they also use the same hardware and the same bit patterns to represent floats.
As was mentioned in post #13 by @pbuk, Python does provide ways of printing out rounded results.
That said, Python rounds a floating point 0.1 to "0.1" on output. I would not suggest a change to Python that would, by default, also round 0.3-0.2 to "0.1" on output.

I don't know which "other language" you are using for comparison.
Try this in JavaScript: console.log(0.1); console.log(0.3 - 0.2);
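For comparison, the Python equivalent behaves the same way:

Python:
print(0.1)        # 0.1
print(0.3 - 0.2)  # 0.09999999999999998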
 
  • #98
Dale said:
The problem discussed here actually has nothing to do with the design of Python specifically. This is the standard floating point implementation used by all computers in the underlying hardware.
It's the point most here seem to be fixated on. I think everyone who posted in the thread understands why the result of 0.3-0.2 is slightly different than 0.1 as approximated by the computer hardware. So the repeated explanations of why the results are different miss the point.

Dale said:
Programmers of all languages are expected to understand how this works because of how standard it is.
Sure, programmers should be aware of the ins and outs of floating-point arithmetic on a computer, but should the average user have to be?

Which way of displaying the result is more useful most of the time: 0.1 or 0.09999999999999998? For the same calculation, Excel, APL, Mathematica, Wolfram Alpha, Desmos, and, I would expect, most user-centric software render the result as 0.1. Why? Because most people don't care that the computer approximates 0.3-0.2 as 0.09999999999999998.

Note that both 0.09999999999999998 and 0.1 are rounded results. Neither is the exact representation of the computer's result from calculating 0.3-0.2. As mentioned in the Python documentation, displaying the exact result wouldn't be very useful to most people, so it displays a rounded result. The question is why round to 16 decimal places instead of, say, 7 or 10, by default?

To me, the choice of 16 digits reveals a computer-centric mindset, i.e., "53 bits of precision is about 16 decimal places, so this is what you should see." This is the mindset most people who posted seem to have. The choice of fewer digits, on the other hand, points to a user-focused mindset, i.e., "if I have $0.30 and spend $0.20, then I'll have $0.10 left over, not $0.09999999999999998." (Don't bother with a "you should represent these as integers" tangent.)
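One way to see that 0.09999999999999998 is itself a rounded display (a small sketch of my own, not from the Python docs): it is the shortest decimal string that converts back to exactly the same double, whereas "0.1" converts to a different one.

Python:
x = 0.3 - 0.2
print(x)                                   # 0.09999999999999998
print(float("0.09999999999999998") == x)   # True  - this string round-trips to the same double
print(float("0.1") == x)                   # False - "0.1" names a different double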
 
  • Love
Likes PeroK
  • #99
vela said:
Which way of displaying the result is more useful most of the time
Which kind of user is more likely to be typing in "0.3 - 0.2" directly as Python code or using simple print() calls to display results?

The kind of "average user" you mention is probably not doing that, because they're probably not writing code or using the Python interactive prompt. They're probably using an application written by someone else, which is displaying results to them based on the programmer's understanding of the needs of typical users of that application. That application is going to be using specific input and output functions that are suitable for that application.

For example, if the application is a typical calculator program, it is not going to take the input "0.3 - 0.2" from the user and interpret it as floating point arithmetic. It's going to interpret it as decimal arithmetic. So it's not going to take the user input "0.3 - 0.2" and interpret it directly as Python source code. Nor is it going to just use print() to display results; it's going to format the output suitably for the output of a calculator program.

In other words, the ultimate answer to the OP's question is that Python source code and the Python interactive prompt are not user interfaces for average users. They're user interfaces for Python programmers. So criticizing them on the basis that they don't do what would be "the right thing" for average users is beside the point.
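As a hedged sketch of what such a calculator program might do (the decimal module is my own choice of illustration, not a claim about any particular application): it can parse the user's text as decimal values rather than as binary floats.

Python:
from decimal import Decimal

# Interpret the user's "0.3 - 0.2" as decimal arithmetic, not binary floats.
result = Decimal("0.3") - Decimal("0.2")
print(result)                     # 0.1
print(result == Decimal("0.1"))   # True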

vela said:
The question is why round to 16 decimal places instead of, say, 7 or 10, by default?
You give the answer to this later in your post:

vela said:
To me, the choice of 16 digits reveals a computer-centric mindset, i.e., "53 bits of precision is about 16 decimal places, so this is what you should see."
And for a user interface for programmers, I think this is perfectly reasonable.

I agree it would not be reasonable for a user interface for average users, but, as above, that is not what Python source code or the Python interactive prompt are. I believe I have already said earlier in this discussion that interfaces for average users should be built using the formatting functions that are provided by Python for that purpose. Nobody writing a program for average users should be using raw print() statements for output or interpreting user input as if it were raw Python source code. And as far as I know, nobody writing a program for average users does that.
 
  • Like
Likes .Scott
  • #100
vela said:
Sure, programmers should be aware of the ins and outs of floating-point arithmetic on a computer, but should the average user have to be?
If you are writing Python code and using the Python print function then you are by definition a programmer and not an average user.
 
Last edited:
  • Like
Likes .Scott and PeterDonis
  • #101
vela said:
Which way of displaying the result is more useful most of the time: 0.1 or 0.09999999999999998? For the same calculation, Excel, APL, Mathematica, Wolfram Alpha, Desmos, and, I would expect, most user-centric software render the result as 0.1. Why? Because most people don't care that the computer approximates 0.3-0.2 as 0.09999999999999998.

Note that both 0.09999999999999998 and 0.1 are rounded results. Neither is the exact representation of the computer's result from calculating 0.3-0.2. As mentioned in the Python documentation, displaying the exact result wouldn't be very useful to most people, so it displays a rounded result. The question is why round to 16 decimal places instead of, say, 7 or 10, by default?
There is more to it than "most useful most of the time". If you are going to go with 7 or 10 digits, it is important that you document that and that programmers know what it is. By defaulting to the full precision of the floating point encoding, you are setting a rule that is easy to recognize and easy to keep track of.

If you wanted to use a different number, I might go with 2 places to the right of the decimal. Aside from being great for US currency, it would also be enough of a nuisance that Python programmers would quickly figure out how to be explicit about the displayed decimal precision.

One of the primary uses of the default precision is in writing debug code - code added to the code in the course of tracking down a bug - then soon deleted. In that case, defaulting to the full precision of the floating point encoding is perfect.
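A small sketch of what that explicitness can look like in debug output (the format choices are mine, not a Python recommendation):

Python:
x = 0.3 - 0.2
print(repr(x))       # shortest string that round-trips: 0.09999999999999998
print(f"{x:.17g}")   # 17 significant digits, always enough to round-trip a double
print(f"{x:.2f}")    # 0.10 - two places hides the discrepancy entirely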
 
  • #102
PeterDonis said:
And for a user interface for programmers, I think this is perfectly reasonable.
Just to be clear, I'm not saying either choice is inherently right or wrong. You think the choice is reasonable. I understand your reasons, but I just don't agree with them. Likewise, I'm sure you don't agree with my reasons that fewer digits would be a better default. Either way, it's just our opinions.

I just see this in a similar vein to another topic about Python that came up a while ago. Someone considered Python a terrible language because it didn't require strong typing of variables. How dare someone use a computer without understanding the different ways computers represent numbers! And you rightly pointed out that it was a feature, not a bug, and has saved you a lot of time not having to cater unnecessarily to the machine.
 
  • Like
Likes Dale
  • #103
vela said:
You think the choice is reasonable. I understand your reasons, but I just don't agree with them.
I understand that, but to me your disagreement appears to be based on a belief that Python source code and the Python interactive prompt are a user interface for average users instead of programmers. If that is the basis for your disagreement, I think you should reconsider what Python source code and the Python interactive prompt are actually for. I don't disagree with you if you are describing what you think a user interface for average users should do. I just don't think such requirements are appropriate for Python source code or the Python interactive prompt, since those aren't a user interface for average users.

If you actually believe a user interface for programmers should work the way you are claiming, then we have a much more fundamental disagreement and I don't find your arguments at all convincing. Your basic complaint seems to be that Python's approach is "computer centric", but that's exactly what a user interface for programmers should be.

vela said:
Either way, it's just our opinions.
I don't think the statement that Python source code and the Python interactive prompt are user interfaces for programmers is a matter of opinion.
 
  • Like
Likes jbriggs444
  • #104
As a programmer or a troubleshooter, I want a print statement that tells the truth about the internal value of a variable or of an expression. I will settle for 16 decimal digits since that is a reasonable match for the resolution of the underlying binary data. But I would not appreciate output that leads me to accept an answer that tosses away the low order 19 or 20 bits of the actual result (as opposed to the "mathematically correct" result).

If I am dealing with discrepancies at about machine epsilon, I'll probably be looking for binary output. But for quick and dirty stuff a few orders of magnitude above machine epsilon, it would be nice for the print statement to lend a hand by default.

I started on a Texas Instruments SR51. So I know about guard digits. They are nice. The SR51 computed with, but did not display, three low order decimal "guard digits". But there is a difference between the user of a handheld calculator and the programmer of a Python application.
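For that binary-level view, a brief sketch using only the standard library (math.ulp needs Python 3.9 or later):

Python:
import math
import sys

x = 0.3 - 0.2
print(sys.float_info.epsilon)   # machine epsilon for doubles: 2.220446049250313e-16
print(math.ulp(0.1))            # spacing of adjacent doubles near 0.1
print(x.hex())                  # exact binary (hex) representation of the result
print((0.1).hex())              # exact binary (hex) representation of the literal 0.1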
 
  • #105
Haven't we flogged this topic to death yet?
 
  • #106
anorlunda said:
Haven't we flogged this topic to death yet?
I think it's machine epsilon away from death and, due to finite precision arithmetic, it always will be...
 
  • Like
  • Haha
Likes vela and jbriggs444
  • #108
After moderator review, the thread will remain closed as the topic has been sufficiently discussed and the OP is long gone. Thanks to all who participated!
 
  • Like
Likes scottdave
