How to Handle Floating Point Errors in Fortran90 Real Number Calculations

Discussion Overview

The discussion revolves around handling floating point errors in Fortran90 real number calculations. Participants explore issues related to precision, representation of numbers, and potential solutions for improving accuracy in computations.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant reports unexpected results in their Fortran code, specifically with the value of Ks appearing as 9.9999999E-5 instead of the expected 1E-4.
  • Another participant suggests familiarizing oneself with floating point representation and provides a link to an article on the topic.
  • A participant questions whether there are solutions to the floating point issue, expressing uncertainty.
  • One suggestion for higher precision is to use double precision by declaring variables with REAL(KIND(1d0)).
  • Another participant proposes using formatted write statements to round numbers for display purposes, which could help present values more clearly.
  • It is noted that the printed value of Ks is very close to the assigned value, with a difference of only 1E-12, highlighting the limitations of binary representation of fractions.
  • Participants discuss that certain fractions can be stored exactly in binary, while others cannot, leading to representation errors.
  • A participant references an article they wrote about the floating point arithmetic problem.
  • One participant expresses gratitude for the insights and plans to read the linked articles to address their issue.
  • A later participant shares a separate calculation issue regarding the difference between two values, Vi and Vs, and later acknowledges finding their own error.

Areas of Agreement / Disagreement

Participants express a range of views on the nature of floating point errors, with some suggesting solutions while others highlight the inherent limitations of numerical representation. No consensus is reached on a definitive solution to the original problem.

Contextual Notes

Participants mention the dependence on the precision of variable types and the representation of numbers in binary, which may affect calculations. There are unresolved aspects regarding the best practices for handling floating point errors in Fortran90.

abdulsulo
Hello everyone, I am trying to write the code below.

But my results seem to be quite wrong.

I noticed that some of my real numbers are not what I assigned them. For example, Ks shows in the watch window as 9.9999999E-5.

How can I fix this?

Fortran:
program hw1
  
    REAL:: G,DVIS,Ks, EPS,LENGTH,D,Vs,Re,Vi,V0,FFACT,DELt,I,Vi1,H,T,PHO
  
    G=9.81
    D=0.3
    H=8
    Ks=1E-4
    T=60
    DVIS=0.001
    PHO=1000
    EPS=0.01
    DELt=0.02
    write(*,*) Ks
    write(*,*) G
    write(*,*) H
    write(*,*) T
    write(*,*) DVIS
    write(*,*) PHO
    write(*,*) EPS
    write(*,*) DELt
    WRITE(*,*) 'INPUT CYLINDER DIAMETER'
    READ(*,*) D
    WRITE(*,*) 'LENGTH'
    READ(*,*) LENGTH
  
    IF (LENGTH==50) THEN
        Vs=6.586
    ELSE IF (LENGTH==100) THEN
        Vs=5.00
    ELSE IF (LENGTH==200) THEN
        Vs=3.669
    ELSE IF (LENGTH==500) THEN
        Vs=2.36
    END IF
    write(*,*) Ks
    Re=PHO*Vs*D/DVIS
    write(*,*) Re
    FFACT=0.25/((log((Ks/(3.7*D))+(5.74/Re**0.9)))**2)   
    write(*,*) FFACT
    Vi=0
    write(*,*) Vi
    DO I=0.02, 200, 0.02
        if (I==0.02) THEN
            Vi1=Vi+(DELt*(H)*G/LENGTH)
            Vi=Vi1
            write(*,*) Vi1
        ELSE
        Re=PHO*Vi*D/DVIS
        FFACT=0.25/((log((Ks/(3.7*D))+(5.74/Re**0.9)))**2)
        Vi1=Vi+(DELt*(H-((1+(FFACT*LENGTH/D))*((Vi**2)/(2*G))))*G/LENGTH)
        Vi=Vi1
        write(*,*) Vi1
        end if
    end do
  
    pause
  
end program
<<Moderator's note: added code tags>>
 
I read it and tried to understand, but it seems there is no solution for this. Am I wrong?
 
If you want higher precision, then use
Fortran:
REAL(KIND(1d0)) :: G,DVIS,Ks, EPS,LENGTH,D,Vs,Re,Vi,V0,FFACT,DELt,I,Vi1,H,T,PHO
to get double precision. On modern computers, single precision should not be used anyway.
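A minimal sketch of the double precision suggestion above. The kind constant `dp` and the variable names are made up for illustration and do not come from the thread. One detail worth checking in your own code: a literal such as `1E-4` is still single precision, so it is rounded before it is stored in a double precision variable unless the literal also carries the kind.

```fortran
program double_precision_demo
    implicit none
    ! dp is a hypothetical name for the double precision kind
    integer, parameter :: dp = kind(1d0)
    real     :: ks_single
    real(dp) :: ks_widened, ks_double

    ks_single  = 1E-4      ! single precision literal in a single precision variable
    ks_widened = 1E-4      ! single precision literal widened to double: the rounding error stays
    ks_double  = 1E-4_dp   ! double precision literal (1d-4 is equivalent)

    write(*,*) 'single  :', ks_single
    write(*,*) 'widened :', ks_widened
    write(*,*) 'double  :', ks_double
end program double_precision_demo
```

On typical IEEE 754 hardware the "widened" line should still show an error around the eighth significant digit, while the "double" line should agree with 1E-4 to roughly 16 digits. The exact digits printed by list-directed output depend on the compiler.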
 
Notice that the printed value of Ks is actually very close to the number you set it to. 1E-4 and 9.9999999E-5 differ by only 0.000000000001 = 1E-12.
Number fractions on the computer are stored in binary and will not be exactly as you set them in base 10. The formats mentioned by others above will make them appear the same. If you need more accuracy, you should use a variable type for Ks that keeps more significant digits.
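As a sketch of the formatting idea, assuming the edit descriptors `ES10.3` and `F8.5` as arbitrary choices (the thread does not say which formats were suggested): a formatted write rounds the value for display only, and the stored value is unchanged.

```fortran
program formatted_output_demo
    implicit none
    real :: Ks
    character(len=16) :: text   ! hypothetical buffer used to capture the formatted value

    Ks = 1E-4

    write(*,*) Ks               ! list-directed: the compiler chooses how many digits to show
    write(*,'(ES10.3)') Ks      ! scientific notation rounded to 3 decimals
    write(*,'(F8.5)') Ks        ! fixed notation rounded to 5 decimals

    ! An internal write puts the same rounded text into a character variable
    write(text,'(ES10.3)') Ks
    write(*,*) trim(adjustl(text))   ! 1.000E-04
end program formatted_output_demo
```

This only changes how the number looks. Any later arithmetic still uses the binary value that is slightly below 1E-4.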
 
FactChecker said:
Number fractions on the computer are stored in binary and will not be exactly as you set them in base 10.
... and some will not be exactly as you set them.

Fractions that are finite sums of 1/2, 1/4, 1/8, 1/16, and so on, are stored in exact form, provided they fit in the available bits. Fractions such as 1/5 and 1/10 have nice, compact forms in base-10 (i.e., 1/5 = .2, 1/10 = .1), but their representations as binary fractions have patterns that repeat endlessly, and so they are not stored in exact form.
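One way to see this for yourself is sketched below, assuming IEEE 754 arithmetic. The names `dp`, `sum_matches` and `tenth_matches` are invented for the example. Widening a single precision value to double precision does not change it, so a fraction that is exact in single precision compares equal to the double precision literal, while an inexact one does not, because the two precisions cut off the repeating binary pattern at different places.

```fortran
program binary_fraction_demo
    implicit none
    integer, parameter :: dp = kind(1d0)   ! hypothetical name for the double precision kind
    real    :: exact_sum, one_tenth
    logical :: sum_matches, tenth_matches

    exact_sum = 0.5 + 0.125   ! 1/2 + 1/8 = 0.101 in binary, a finite pattern
    one_tenth = 0.1           ! 0.000110011001100... in binary, repeats endlessly

    ! Compare the widened single precision values with double precision literals
    sum_matches   = (real(exact_sum, dp) == 0.625_dp)
    tenth_matches = (real(one_tenth, dp) == 0.1_dp)

    write(*,*) '0.625 same in single and double: ', sum_matches
    write(*,*) '0.1   same in single and double: ', tenth_matches

    ! Show more digits than single precision actually holds
    write(*,'(F12.10)') exact_sum
    write(*,'(F12.10)') one_tenth
end program binary_fraction_demo
```

On IEEE 754 hardware the first comparison should print T and the second F.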
 
Thank you all for your kind insights. Now I need to read through the articles you linked and try to find a solution to my problem.
 
I am trying to do the simple calculation below:
Vi=6.586542
Vs=6.586000

The program finds a as 6.554608 from the equation below. How is this possible?
a=ABS(Vi-Vs)

EDIT: I found my error. Thank you for your interest.
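The poster did not say what the error was, so the following is only a sketch of what the calculation gives in isolation, using single precision as in the original program. The format `ES12.4` is an arbitrary choice.

```fortran
program abs_difference_demo
    implicit none
    real :: Vi, Vs, a

    Vi = 6.586542
    Vs = 6.586000

    ! The two values agree in their first four digits, so those digits cancel
    a = abs(Vi - Vs)

    write(*,'(ES12.4)') a   ! roughly 5.4E-04
end program abs_difference_demo
```

The result should be close to 5.42E-4, nowhere near 6.55. Because the leading digits cancel, only about three significant digits of the single precision difference can be trusted.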
 