Understanding IEEE Representation for Single Precision

In summary, normalized single-precision numbers range from ##2^{-126}## up to just under ##2^{128}## (the largest is ##2^{128}-2^{104}##), while denormalized numbers range from ##2^{-149}## up to ##2^{-126}##.
  • #1
gfd43tg
Hello,

For the IEEE representation of a number, I wanted to ask something for clarification. For single precision, you have 3 parts: S, Exponent, and Fraction.

The S takes 1 bit (1 slot)
Exponent is 8 bits (8 slots)
Fraction is 23 bits (23 slots).

I was watching a video, and it helped me clear up how to do this, with one tiny caveat. After you divide a number down until it is under two, say 1.35703125, your exponent is ##2^7##, so with a bias of 127 you get 134, which is 10000110 in binary. Now for the fraction part: since the number 1.35703125 is normalized, does that mean that the first '1' (the one before the binary point) is implied, and therefore does not take up one of the 23 slots permitted for the fraction? From the video it seems like that is what was done, but I got a little bit murky on that point.
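For what it's worth, the field layout is easy to inspect directly. Here is a minimal Python sketch (the thread's own snippets are MATLAB, but Python's standard struct module exposes the raw bits; the helper name is mine, not a library API):

```python
import struct

def float32_fields(x):
    """Unpack a value into IEEE single-precision S / Exponent / Fraction fields."""
    bits = struct.unpack('>I', struct.pack('>f', x))[0]
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF       # 23 bits; the leading 1 is implied, not stored
    return sign, exponent, fraction

# 1.35703125 * 2^7 = 173.7, so the biased exponent field should be 7 + 127 = 134
sign, exponent, fraction = float32_fields(173.7)
print(sign, exponent, bin(fraction))
```

The exponent field comes out as 134 (10000110 in binary), matching the worked example, and the fraction field holds only the digits after the implied leading 1.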

Thanks
 
  • #2
That's exactly how normalized numbers are represented.

When you represent a non-zero number in scientific notation (e.g., Avogadro's number, ##6.0221413\times10^{23}##), the leading digit can be any digit between 1 and 9. In the base-2 equivalent of scientific notation, the leading digit of a non-zero number can only be 1. So why store that leading digit? Not storing it means you get an extra binary digit of precision at no cost.
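That extra digit can be seen by round-tripping values through the 32-bit format: with 23 stored bits plus the implied leading 1, the significand is effectively 24 bits wide. A quick Python check (struct-based; the helper name is illustrative):

```python
import struct

def to_f32(x):
    """Round a value to the nearest IEEE single-precision value and back."""
    return struct.unpack('>f', struct.pack('>f', x))[0]

# 23 stored fraction bits + 1 implied bit = 24 significand bits,
# so 2^-23 fits next to 1.0 but 2^-24 gets rounded away.
print(to_f32(1 + 2**-23) > 1.0)
print(to_f32(1 + 2**-24) == 1.0)
```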
 
  • #3
Okay, now in the case of non-normalized numbers, does the 0 take up one of the 23 slots, or is it also implied and follows the same as a normalized number?
 
  • #4
It's an implied leading zero for the denormalized numbers rather than one. There's another special rule for the denormalized numbers: an exponent field of zero means a factor of ##2^{-126}## rather than the ##2^{-127}## that an exponent of zero would imply using the bias-127 notation.
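Both rules can be confirmed by decoding raw bit patterns. A small Python sketch (single precision; the decoder helper is mine, not a library API):

```python
import struct

def decode_f32(bits):
    """Interpret a 32-bit pattern as an IEEE single-precision value."""
    return struct.unpack('>f', struct.pack('>I', bits))[0]

# Exponent field all zeros, fraction = 1 (just the LSB):
# value = 0.fraction * 2^-126 = 2^-23 * 2^-126 = 2^-149
smallest_denormal = decode_f32(0x00000001)

# Exponent field = 1, fraction = 0: the smallest *normalized* value, 1.0 * 2^-126
smallest_normal = decode_f32(0x00800000)

print(smallest_denormal == 2**-149, smallest_normal == 2**-126)
```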
 
  • #5
And I would assume the same goes for double precision, where the factor is ##2^{-1022}## instead of ##2^{-1023}##
 
  • #7
Brilliant. That begs the question: why would one wish to represent a number as either normalized or denormalized? Why are there two different ways, and are there certain numbers that can only be represented one way or the other? Outliers such as Inf or NaN come to mind here.

Why would the idea to 'normalize' a number come about? And what is so normal about it??
 
  • #8
Numbers between ##2^{-126}## (about ##1.175\times10^{-38}##) and ##2^{128}-2^{104}## (about ##3.4\times10^{38}##) are represented as normalized numbers. If the format treated exponent bits = 0 as it does every other exponent (i.e., no denormalized numbers), the smallest representable non-zero number would be ##2^{-127}##. Adding the concept of denormalized numbers extends that lower range down to ##2^{-149}##, but at the expense of a loss of a bit of precision for numbers between ##2^{-127}## and ##2^{-126}##.
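The upper bound can be confirmed from the extreme bit pattern. A Python sketch (the helper name is mine):

```python
import struct

def bits_to_f32(bits):
    """Interpret a 32-bit pattern as an IEEE single-precision value."""
    return struct.unpack('>f', struct.pack('>I', bits))[0]

# Largest finite single: sign 0, exponent field 0xFE (254), all 23 fraction bits set
largest = bits_to_f32(0x7F7FFFFF)

# (2 - 2^-23) * 2^127 = 2^128 - 2^104, about 3.4e38
print(largest == 2.0**128 - 2.0**104)
```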

Regarding infinity and NaNs: The ability to represent those is a "feature" that I always turn off. My experience is that infinities and NaNs almost always represent a bug in the underlying code. I want the program to blow up the instant one of those beasts appear. That gives a nice handle for chasing down the bug. Let them persist and you'll have a much harder time finding the bug because those infinities and NaNs poison every calculation in which they appear.
 
  • #9
Maylis said:
Why would the idea to 'normalize' a number come about? And what is so normal about it??

In most numerical computing, you are using a finite-precision floating point representation (like IEEE) as an approximate model of the mathematical real numbers.

Denormalization is necessary to preserve an important property of this number model: if a and b are two different numbers (whether normalized or not), then a-b should never be calculated as zero.

The fact that the denormalized number has lower precision is irrelevant, because subtraction of any two nearly-equal floating point numbers will lose precision, even if the result can be normalized.
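In Python, which uses IEEE doubles, the property is easy to demonstrate: take the smallest normalized double and its nearest representable neighbor above, and their difference is the smallest denormal rather than zero. A sketch:

```python
# Python floats are IEEE doubles. Smallest normalized double:
a = 2.0**-1022
# Its nearest representable neighbor above (one ulp away):
b = 2.0**-1022 * (1 + 2**-52)

# With gradual underflow (denormals), the difference is the smallest
# denormal, 2^-1074, rather than flushing to zero.
diff = b - a
print(diff == 2.0**-1074, diff != 0.0)
```

Without denormalized numbers, `b - a` would have to be flushed to zero, violating the property that distinct representable values have a non-zero difference.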
 
  • #10
How do you remember what the limits are for the normalized and denormalized values for both single and double precision?

For example, normalized single precision has a bias of 127, and normalized double precision has a bias of 1023.

denormalized single precision has a bias of 126, denormalized double precision has a bias of 1022.

It appears you take the bias and add one for the exponent of the upper bound, then take the negative of (bias minus one) for the exponent of the lower bound.

Is the range for normalized single precision ##2^{-126}## to ##2^{128}##? Normalized double precision would be ##2^{-1022}## to ##2^{1024}##.

Now that seems to fall apart for denormalized numbers. Apparently the range is brought down to ##2^{-149}##. Is that just a fact to remember? Where does the 149 come from? Is there an increase in the upper bound?

I am trying to keep all of this information straight, because you have to remember both normalized and denormalized, single and double precision. My lecture notes don't suffice or have good information, and I haven't been able to find any concise information on the web.
 
  • #11
Why do you need to remember the exact values? Humans invented writing so they didn't have to remember everything :smile:

I just remember that IEEE single precision is about 6 or 7 decimal digits with exponents up to about ##10^{\pm 38}##, and double precision is about 16 decimal digits and exponents up to about ##10^{\pm 300}## - actually it's a bit more than 300, but I can never remember exactly how much more.
 
  • #12
AlephZero said:
Denormalization is necessary to preserve an important property of this number model: if a and b are two different numbers (whether normalized or not), then a-b should never be calculated as zero.
The sole purpose of denormalization is to extend the range of numbers that are representable by the floating point standard. That's it. Denormalization certainly does not help with the problem you mentioned. That problem is inherent to using a fixed-width representation to represent the reals. It's the reason behind having a concept of "machine epsilon", the largest positive number ε such that (1.0+ε)-1.0 == 0.0. There are a number of properties of the reals that don't hold with the IEEE floating point representation. Most importantly, associativity is gone. You can no longer trust that (a+b)+c is equal to a+(b+c).
Maylis said:
How do you remember what the limits are for the normalized and denormalized values for both single and double precision?
In a real world setting? You don't. You look them up. You should know those concepts exist, but knowing the specific values is asking too much from our lousy human memory. I would assume you're in a college setting. Understand the concepts inside and out, remember that single precision is 32 bits wide and double is 64, and remember the exponent biases for each format. That will tell you how big the exponent field is, which will in turn tell you how big the mantissa is.

Now that seems to fall apart for denormalized numbers. Apparently its range is brought down to ##2^{-149}##. Is that just a fact to remember? Where does the factor of 149 come in?
It's easy. There are three easy additional concepts to remember for the denormalized numbers, and they all make sense.
  1. The exponent bits for a denormalized number are all zero.
  2. The implied leading binary digit is zero for the denormalized numbers rather than one.
  3. The exponent factor is the same as that for an exponent field of 1. For single precision, the factor is ##2^{-126}## rather than the ##2^{-127}## that would apply if you used the bias concept. For double precision, it is ##2^{-1022}## rather than ##2^{-1023}##.

So, just knowing the above concepts, here's how to calculate the smallest representable single precision number. The offset for the single precision IEEE format is 127, or ##2^7-1##. Seven bits are needed to represent this number. The exponent field uses one more bit than this, so the exponent takes up eight bits. The sign takes up one more, leaving 32-9=23 bits for the mantissa. The smallest representable number is all bits zero except for the LSB. That LSB represents ##2^{-\text{mantissa length}}##, or ##2^{-23}##. The exponent factor is ##2^{1-\text{bias}}##, or ##2^{-126}##. Multiply ##2^{-126}## and ##2^{-23}## and you get ##2^{-149}##.

Doing the same with the double precision format, the offset is 1023, or ##2^{10}-1##, so that means an eleven bit exponent. The mantissa takes up 64-(11+1)=52 bits. The smallest representable number in double precision format is therefore ##2^{-1022}\times2^{-52}=2^{-1074}##.
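The two derivations can be folded into one small Python helper (hypothetical, for illustration only) that starts from nothing but the bias:

```python
def smallest_denormal(bias):
    """Derive the smallest denormal from the exponent bias alone,
    following the reasoning above (illustrative helper, not a library API)."""
    exponent_bits = (bias + 1).bit_length()         # bias = 2^(k-1) - 1 -> k exponent bits
    total_bits = 32 if exponent_bits == 8 else 64   # single vs double format
    mantissa_bits = total_bits - exponent_bits - 1  # minus the sign bit
    # LSB of the mantissa (2^-mantissa_bits) times the minimum exponent factor 2^(1 - bias)
    return 2.0**(1 - bias - mantissa_bits)

print(smallest_denormal(127))   # single precision: 2^-149
print(smallest_denormal(1023))  # double precision: 2^-1074
```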
 
  • #13
AlephZero said:
Denormalization is necessary to preserve an important property of this number model: if a and b are two different numbers (whether normalized or not), then a-b should never be calculated as zero.

D H said:
Denormalization certainly does not help with the problem you mentioned. That problem is inherent to using a fixed-width representation to represent the reals.

OK, the wording of my post was ambiguous - what I meant was "if a and b are numbers represented by different floating-point bit patterns, then a-b should never be calculated as zero". The point I was trying to make had nothing to do with approximating real values with finite computer arithmetic.

If you don't allow denormalized numbers, you can't store the difference between any two normalized numbers when both have the minimum exponent. That would mean the concept of "machine epsilon" loses some of its nice properties.
 
  • #14
Thanks both AZ and DH, great information.

I was wondering, why is the machine epsilon not equal to the smallest representable number? My intuition tells me that the difference between a representable number and its next closest representable number should be the smallest representable number.

Some commands of interest.
Code:
EDU>> eps(1)

ans =

   2.2204e-16

EDU>> 2^-1074

ans =

  4.9407e-324

EDU>> 2^-1075

ans =

     0
 
  • #15
[Attached image: an old exam with questions F.1-F.5 on IEEE representation]

This is an old exam with questions about IEEE representation. What do they mean by eps(1), at 1? Is there such a thing as eps at 2, 3, ...?

For the 2nd question, I'm not sure if it's correct.

For 3, thankfully this thread helped me know that.

And similarly for 4, and for 5 I wonder if what I got is correct? Of course I have to be able to explain all of them as well..
 
  • #16
Based on your last two unanswered questions, you still appear to be a bit confused.

Perhaps it might be easier to forget about base 2 for a bit and look to base 10 instead. Suppose you want to represent positive real numbers using the form ##0.ddd\times10^{\pm ee}##. This scheme provides the ability to represent numbers between ##0.001\times10^{-99}## and ##0.999\times10^{99}##. There are three ways to represent 1 in this scheme: ##0.100\times10^{1}##, ##0.010\times10^{2}##, and ##0.001\times10^{3}##. The first is the normalized representation. In this scheme, all numbers between ##0.100\times10^{-99}## and ##0.999\times10^{99}## are represented normalized. The denormalized representation, where the leading digit is zero, is reserved for numbers smaller than ##0.100\times10^{-99}##.

What about zero? That's simple. That's ##0.000\times10^{-99}##. It's the smallest denormalized number.

The next representable number after zero is ##0.001\times10^{-99}##. The difference between this and zero is of course ##0.001\times10^{-99}##. The same holds for the difference between consecutive representable numbers up to ##0.999\times10^{-99}## and ##0.100\times10^{-98}##. The next step up, however, is ##0.101\times10^{-98}##. The difference between ##0.101\times10^{-98}## and ##0.100\times10^{-98}## is ##0.001\times10^{-98}##. At the very top of the scale, you're looking at the difference between ##0.999\times10^{99}## and ##0.998\times10^{99}##, or ##0.001\times10^{99}##. The difference between successive representable numbers depends very much on the magnitude of the number in question.

The exact same concept holds for the binary IEEE floating point representations. The difference between the smallest representable number larger than one and one itself is "machine epsilon". The difference between the smallest representable number larger than two and two itself is twice machine epsilon, and so on. The delta between successive representable numbers is very small when the number in question is small, but is rather large when the number in question is large.
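Python's math.ulp (available from Python 3.9) reports exactly this gap for IEEE doubles, so the magnitude-dependent spacing is directly visible:

```python
import math  # math.ulp requires Python 3.9+

# Gap between a number and the next representable double above it
print(math.ulp(1.0))      # machine epsilon, 2^-52
print(math.ulp(2.0))      # twice machine epsilon, 2^-51
print(math.ulp(0.5))      # half machine epsilon, 2^-53
print(math.ulp(2.0**52))  # 1.0: consecutive doubles here are whole numbers
```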
 
  • #17
Was F.1 through F.3 at least correct? In F.2 they ask for the smallest non-representable positive integer. I just guessed that it is one, since that is the smallest positive integer, but why is 1 not representable?

I still don't understand what they mean by eps(1) at 1. What would eps at 2 look like?

Are there 2^53 representable numbers in the domain [2^52,2^53)?

I think the problem is that I don't know what I don't know
 
  • #18
You got everything right except for question F.2. You shouldn't get full credit for those answers you did get correct because you only gave an explanation for question F.3.

Regarding question F.2: One is a representable number. One is ##2^0##, so the sign bit is 0, the biased exponent is 1023, and the mantissa is zero, so the IEEE 64 representation of one is 0x3ff0000000000000.

Hint: 1+2^100 is an integer, yet it can't be represented exactly in the IEEE64 floating point format. Why not? (BTW, this is not the answer to the question.)

That's not the smallest positive integer that cannot be represented exactly in the IEEE64 floating point format. There are smaller integers that can't be represented in that format.
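Both the bit pattern of 1.0 and the hint can be checked in Python, whose floats are IEEE doubles:

```python
import struct

# IEEE 64 bit pattern of 1.0: sign 0, biased exponent 1023, mantissa 0
bits = struct.unpack('>Q', struct.pack('>d', 1.0))[0]
print(hex(bits))  # 0x3ff0000000000000

# The hint: 1 + 2^100 needs 101 significant bits, far more than the
# 53 (52 stored + 1 implied) available, so the low-order 1 is rounded away.
print(1.0 + 2.0**100 == 2.0**100)
```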
 
  • #19
Between any two consecutive powers of 2, are there ##2^{52}## representable numbers? I just guessed on that one. Is the number of representable numbers between ##2^3## and ##2^4## also ##2^{52}##? Or how do you determine that?

Code:
log2(eps(1))

ans =

   -52
so the distance between 1 and the next representable number is ##2^{-52}##; now I see that one. I think I discovered a pattern for determining the value of eps(x).

So you just express x as ##2^n##; if n is a whole number, then eps(x) is ##2^{n-52}##. If x is not a power of 2, then you just go back to the nearest lower number that is, so eps(5) = eps(4).

So for eps(4): 4 is ##2^2##, so eps(4) is ##2^{2-52} = 2^{-50}##.

So now, I can say with some confidence I understand F.1, F.3, and F.4. Now where I am stuck is F.2 and F.5.

Edit: I might be able to justify F.5 now, let's give this a whirl

So I know eps(##2^{52}##) is 1, because ##2^{52-52} = 2^0 = 1##. So everything in between there also has eps(x) = 1. That means you can have ##2^{52}, 1+2^{52}, 2+2^{52}, 3+2^{52}, \dots, 2^{53}##. So there are ##2^{52}## numbers between ##2^{52}## and ##2^{53}##.

So I guess what you should do to find the number of representable numbers is subtract the lowest number in the interval from the highest, then divide by eps(lower bound)?

##(2^{53} - 2^{52})/1 = 2^{52}## representable numbers

So to create another example, the number of representable numbers in [4, 8).

eps(4) = ##2^{2-52} = 2^{-50}## spacing between them.

##8-4 = 4##, so ##4/2^{-50} = 2^{52}## representable numbers between 4 and 8
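These guesses can be cross-checked against Python's math.ulp (3.9+), which returns the spacing at a given IEEE double:

```python
import math

# eps(x) = 2^(n-52) where 2^n <= x < 2^(n+1)
print(math.ulp(4.0) == 2**-50)   # eps(4)
print(math.ulp(5.0) == 2**-50)   # eps(5) == eps(4): same binade
print(math.ulp(2.0**52) == 1.0)  # doubles here are one apart

# Representable numbers in [4, 8): interval width / spacing
print((8 - 4) / 2**-50 == 2**52)
```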
 
  • #20
There are ##2^{52}## representable numbers in the interval ##[2^n, 2^{n+1})## if ##2^n## has a normalized representation. For example, there are none in the interval ##[2^{1024}, 2^{1025})## because ##2^{1024}## is out of range.

What about the denormalized numbers? There is only one representable number in the interval ##[2^{-1074}, 2^{-1073})##: ##2^{-1074}## itself. The next largest representable number after ##2^{-1074}## is ##2^{-1073}##. There are two representable numbers in the interval ##[2^{-1073}, 2^{-1072})##, three in ##[2^{-1072}, 2^{-1071})##, and so on, until you get to the interval ##[2^{-1022}, 2^{-1021})##, which contains ##2^{52}## representable numbers. Every power-of-2 interval from that one up to ##[2^{1023}, 2^{1024})## does contain ##2^{52}## representable numbers.
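The denormal intervals can be checked with math.nextafter (Python 3.9+), since denormal doubles are just integer multiples of ##2^{-1074}##:

```python
import math

tiny = 2.0**-1074  # smallest denormal double

# Only one representable number in [2^-1074, 2^-1073):
# the next one up is already 2^-1073.
print(math.nextafter(tiny, 1.0) == 2.0**-1073)

# Two representables in [2^-1073, 2^-1072): 2^-1073 and 3 * 2^-1074
print(math.nextafter(2.0**-1073, 1.0) == 3 * tiny)
```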
 
  • #21
D H said:
There are ##2^{52}## representable numbers in the interval ##[2^n, 2^{n+1})##

Does that mean the relative spacing between all numbers is the same for normalized numbers?
 
  • #22
Of course not. It means the spacing is the same for all representable numbers between ##2^n## and ##2^{n+1}##.
 
  • #23
EDIT: nevermind, I get it now.

I still don't understand the thing about the smallest non-representable integer. Someone said it was ##1+2^{53}##; that is very big to be a smallest anything. How do you do the analysis to determine that?
 
  • #24
Your question F.2 asked for the smallest non-representable integer. That obviously needs to be a biggish number.
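A quick Python check of why the answer has to be big: with 53 significand bits (52 stored plus the implied 1), every positive integer up to ##2^{53}## round-trips through a double exactly, and ##2^{53}+1## is the first that does not, matching the ##1+2^{53}## mentioned above.

```python
# Doubles have 53 significand bits, so any integer with at most
# 53 significant bits is exact.
print(float(2**53) == 2**53)          # exact
print(float(2**53 + 1) == 2**53 + 1)  # rounds back to 2^53, so False
print(float(2**53 + 2) == 2**53 + 2)  # exact again: even, fits in 53 bits
```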
 

