Evaluating expression (numerical analysis)

SUMMARY

The discussion centers on evaluating the expression (1+1/x)*x-x in Octave with x = 10^(15). Mathematically the result is 1, but Octave returns 1.125. The discrepancy comes from the limits of double precision floating point representation (IEEE 754): 1+1/x cannot be stored exactly, and subtracting the nearly equal quantities (1+1/x)*x and x exposes the rounding error. In format long, (1+1/x)*x displays as 1e15 because the trailing digits are not shown, so the error only becomes visible after the subtraction.

PREREQUISITES
  • Understanding of Octave programming language
  • Familiarity with double precision floating point representation
  • Knowledge of IEEE 754 standards
  • Basic concepts of numerical analysis and precision loss
NEXT STEPS
  • Research IEEE 754 floating point representation in detail
  • Learn about numerical stability and precision in computations
  • Explore Octave's numerical functions and their precision limits
  • Investigate techniques to mitigate precision loss in numerical analysis
USEFUL FOR

Mathematicians, software developers, data scientists, and anyone involved in numerical analysis or using Octave for computational tasks will benefit from this discussion.

blob84
Hello, using Octave, when I evaluate the expression (1+1/x)*x-x with x = 10^(15), I get 1.125 as the result, and I don't understand why.
I know that Octave shows 15 digits in format long, so when I evaluate (1+1/x)*x I get 1e15; the last digit is not visible and it is 1, so when I do the subtraction I get 1.12500000000000.
 
blob84 said:
Hello, using Octave, when I evaluate the expression (1+1/x)*x-x with x = 10^(15), I get 1.125 as the result, and I don't understand why.
I know that Octave shows 15 digits in format long, so when I evaluate (1+1/x)*x I get 1e15; the last digit is not visible and it is 1, so when I do the subtraction I get 1.12500000000000.

It is not my area of expertise but...

You are subtracting two nearly equal quantities from each other. This is a classical way to lose precision.

From what I can discover with a quick trip to Google, Octave works with double precision floating point representation, most likely IEEE 754. In this format, floating point numbers are expressed in binary with a 53-bit mantissa, an 11-bit exponent and a 1-bit sign, for a total of 65 bits. This fits into an eight-byte (64-bit) field because the leading bit of the mantissa is always 1 for normalized numbers and need not be stored.
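To make the layout concrete, here is a minimal sketch that splits a double into its three stored fields. It is written in Python rather than Octave, on the assumption that Python floats are IEEE 754 doubles (true on typical platforms); `double_fields` is a hypothetical helper name, not something from Octave or the thread.

```python
import struct

def double_fields(value):
    """Split a float (assumed IEEE 754 double) into sign, exponent, fraction."""
    # Reinterpret the 8 bytes of the double as one 64-bit unsigned integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    sign = bits >> 63                    # 1 stored bit
    exponent = (bits >> 52) & 0x7FF      # 11 stored bits, biased by 1023
    fraction = bits & ((1 << 52) - 1)    # 52 stored bits; the leading 1 is implicit
    return sign, exponent, fraction

sign, exponent, fraction = double_fields(1e15)
unbiased = exponent - 1023               # 1e15 lies between 2^49 and 2^50
spacing = 2.0 ** (unbiased - 52)         # gap between neighbouring doubles near 1e15
print(sign, unbiased, spacing)           # 0 49 0.125
```

The spacing of 0.125 near 1e15 is worth noting: any result in that range can only be a multiple of 0.125, which is why a value like 1.125 can appear but 1.1 cannot.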

http://en.wikipedia.org/wiki/Double-precision_floating-point_format

10^-15 is approximately equal to 1.125 * 2^-50. Add 1 and you get, in binary:

1.00000000000000000000000000000000000000000000000001001 [54 bits]

If you hypothesize IEEE floating point with one guard bit, that would fit with a result of 1.125.
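One way to check where the 1.125 comes from is to trace each intermediate value exactly. The sketch below again uses Python as a stand-in for Octave, assuming IEEE 754 doubles with round-to-nearest at every step; under that assumption the same 1.125 appears without needing any extra guard bit in the stored values. The variable names are illustrative only.

```python
from fractions import Fraction

x = 1e15
inv = 1 / x        # nearest double to 10^-15
s = 1 + inv        # rounded to the 2^-52 grid just above 1
p = s * x          # rounded to the 0.125 grid near 1e15
result = p - x     # this subtraction is exact; it only reveals the earlier rounding

# Fraction(float) converts the stored double exactly, so this checks the stored bits.
print(Fraction(s) - 1 == Fraction(5, 2**52))   # True: s is 1 + 5 * 2^-52
print(result)                                   # 1.125
```

Here 1 + 10^-15 falls between 1 + 4 * 2^-52 and 1 + 5 * 2^-52 and rounds up to the latter, which is about 1 + 1.11e-15. Multiplying by 1e15 gives roughly 1e15 + 1.11, and the nearest multiple of 0.125 is 1e15 + 1.125.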
 
There are actually two classical flaws. The first is in (1 + 1/x), which sums two values of vastly different magnitudes. The second is in the subtraction.
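The two flaws can be separated numerically by comparing against exact rational arithmetic. This is a rough sketch in Python (assuming IEEE 754 doubles, as in Octave); the names `err_sum` and `amplified` are made up for illustration.

```python
from fractions import Fraction

x = 1e15
exact_x = Fraction(x)                  # 10^15 is exactly representable as a double

# Flaw 1: adding 1 and 1/x discards most of the bits of 1/x.
s = 1 + 1 / x
err_sum = Fraction(s) - (1 + 1 / exact_x)   # rounding error of the sum, about 1.1e-16

# Multiplying by x scales that tiny error up to order 0.1.
amplified = err_sum * exact_x

# Flaw 2: the subtraction cancels the leading digits, so only the error is left.
final_error = (s * x - x) - 1          # the exact answer would be 1

print(float(err_sum))
print(float(amplified))
print(final_error)                     # 0.125
```

The error introduced in the sum is below one part in 10^15, but after scaling by x and cancelling against x it makes up the entire 0.125 discrepancy, with the last step from about 0.11 to 0.125 coming from rounding the product.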
 
