Converting Decimal Numbers to Floating Point Representation

In summary: 3.671875 in decimal has the binary representation [itex]11.101011_2[/itex]. Normalizing it so that a single 1 sits to the left of the binary point gives [itex]1.1101011_2\times2^1[/itex]. The first bit is 0, indicating the number is positive. The 8-bit exponent field equals 128, which is 1+127. The remaining 23 bits are the bits after the binary point, since the leading 1 is dropped.
  • #1
XodoX
1. Convert -45 and 3.54625 to 32-bit (single-precision) floating point representation.



I know how to convert positive and even decimal numbers, but I don't know how to do these. Is there somebody who knows? Help would be appreciated.
 
  • #2
XodoX said:
1. Convert -45 and 3.54625 to 32-bit (single-precision) floating point representation.



I know how to convert positive and even decimal numbers, but I don't know how to do these. Is there somebody who knows? Help would be appreciated.

What reference are you using for floating point representation?
 
  • #3
berkeman said:
What reference are you using for floating point representation?

What do you mean?
 
  • #4
XodoX said:
What do you mean?

I mean what spec, or what rules? I can go to wikipedia to look it up, but I was hoping you could point me to what you are using. What textbook, or what other learning materials? How are you supposed to know how to do any conversions involving floating point?

I've done a few conversions involving floating point, and I have to look up the conversion rules each time I do.
 
  • #5
berkeman said:
I mean what spec, or what rules? I can go to wikipedia to look it up, but I was hoping you could point me to what you are using. What textbook, or what other learning materials? How are you supposed to know how to do any conversions involving floating point?

I've done a few conversions involving floating point, and I have to look up the conversion rules each time I do.

Doesn't matter how. I tried to convert it to binary first and then go from there, using the formula I found, but that didn't work.
 
  • #6
So which one of the IEEE 754 formats are you supposed to use?

http://en.wikipedia.org/wiki/Floating_point

Or is it when they say "3.54625 32-bit (single precision)" that it's confusing? I don't know myself what they mean by a decimal number represented by 32-bit single precision...
 
  • #7
berkeman said:
So which one of the IEEE 754 formats are you supposed to use?

http://en.wikipedia.org/wiki/Floating_point

Or is it when they say "3.54625 32-bit (single precision)" that it's confusing? I don't know myself what they mean by a decimal number represented by 32-bit single precision...

The floating point is going to look like this: 0 1111 1111 0110 1010 0100 0000 0000 0000

32 digits. The single precision is easy. I know how to do that once I know how to convert it. I saw that on wikipedia too. It's just not clear to me.
 
  • #8
Negative numbers are easy if you know how to do positive numbers. You just flip the sign bit.
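
To see the sign-bit rule in action, here is a minimal Python sketch. The helper name `float32_bits` is made up for illustration; it leans on the standard `struct` module to produce the IEEE 754 single-precision bits, so it shows the result rather than the hand method.

```python
import struct

def float32_bits(x: float) -> str:
    """Return the 32 bits of x as an IEEE 754 single, grouped as sign / exponent / fraction."""
    # Pack as a big-endian float32, then reinterpret the same 4 bytes as an unsigned int.
    (n,) = struct.unpack(">I", struct.pack(">f", x))
    s = f"{n:032b}"
    return f"{s[0]} {s[1:9]} {s[9:]}"

print(float32_bits(45.0))
print(float32_bits(-45.0))
```

The two printed lines should differ only in the leading sign bit; the exponent and fraction fields are identical.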

Floating point representation is essentially just scientific notation using base 2. To convert to base 2, first convert the integer part. Then multiply the fraction by 2. The integer part of the result is the next bit. To get successive bits, repeat the process of discarding the integer part and multiplying the resulting fraction by 2.
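
The repeated-doubling step can be sketched in Python as follows. The function name `fraction_bits` and the `max_bits` cutoff are assumptions for illustration; `Fraction` is used so that decimal inputs such as 0.1 are handled exactly instead of as already-rounded binary floats.

```python
from fractions import Fraction

def fraction_bits(frac: str, max_bits: int = 23) -> str:
    """Binary digits after the point for 0 <= frac < 1, found by repeated doubling."""
    f = Fraction(frac)          # exact value of the decimal string
    bits = []
    while f and len(bits) < max_bits:
        f *= 2
        bit = int(f)            # the integer part of the result is the next bit
        bits.append(str(bit))
        f -= bit                # discard the integer part and keep the fraction
    return "".join(bits)

print(fraction_bits("0.625"))    # 101
print(fraction_bits("0.1", 8))   # 00011001 (cut off; the expansion never terminates)
```

The loop stops early when the fraction becomes zero, which only happens for values that are a whole number of negative powers of two.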

For example, the binary representation of 3.671875 starts with 11. After the binary point, the bits are

2x0.671875 = 1.34375
2x0.34375 = 0.6875
2x0.6875 = 1.375
2x0.375 = 0.75
2x0.75 = 1.5
2x0.5 = 1.0

So 3.671875 in decimal has the binary representation [itex]11.101011_2[/itex]. Now you want to normalize it by moving the binary point until exactly one 1 remains to its left, yielding [itex]1.1101011_2\times2^1[/itex].

In IEEE 754 single-precision format, you first have the sign bit, 0 for positive and 1 for negative. Add an offset of 127 to the exponent and store the result in the next 8 bits. You then drop the first bit of the mantissa, because you know it's always 1, and store the rest in the remaining 23 bits. For our example, you'd have

0 10000000 11010110000000000000000

The first bit is 0, indicating the number is positive. The 8-bit exponent field equals 128, which is 1+127. The remaining bits are from the binary representation, except for the leading one before the binary point.
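
Putting the pieces together, here is a rough Python sketch of the hand encoding, assuming the usual IEEE 754 convention that the significand is normalized to 1.f times a power of two. The names `encode_float32` and `reference_bits` are hypothetical, the sketch only covers nonzero values in the normal range (no zero, subnormals, infinities or NaN), and it rounds the fraction to nearest with ties to even.

```python
import struct
from fractions import Fraction

def encode_float32(value: str) -> str:
    """Hand-encode a nonzero decimal string as 'sign exponent fraction' bits."""
    x = Fraction(value)
    if x == 0:
        raise ValueError("zero is not handled by this sketch")
    sign = "1" if x < 0 else "0"
    x = abs(x)
    # Normalize to 1.f * 2**e by shifting the binary point.
    e = 0
    while x >= 2:
        x /= 2
        e += 1
    while x < 1:
        x *= 2
        e -= 1
    # Drop the leading 1 and keep 23 fraction bits (round half to even).
    frac = round((x - 1) * 2**23)
    if frac == 2**23:           # rounding carried into the next power of two
        frac = 0
        e += 1
    return f"{sign} {e + 127:08b} {frac:023b}"

def reference_bits(x: float) -> str:
    """Same layout, produced by the struct module, for cross-checking."""
    (n,) = struct.unpack(">I", struct.pack(">f", x))
    s = f"{n:032b}"
    return f"{s[0]} {s[1:9]} {s[9:]}"

for text in ("-45", "3.54625"):
    print(text, encode_float32(text), encode_float32(text) == reference_bits(float(text)))
```

Comparing against `struct` is just a sanity check on the hand method; for the two numbers in the original question the comparison should print `True`.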
 
  • #9
Thank you.
Just multiply it by 2? Why did you take the 0.75 from 1.375 and not 0.375?
 
  • #10
Take another look at what Vela wrote, starting with "Then multiply the fraction by 2. The integer part of the result is the next bit. To get successive bits, repeat the process of discarding the integer part and multiplying the resulting fraction by 2."
 
  • #11
XodoX said:
Thank you.
Just multiply it by 2?
Multiplying by 2 just shifts the binary point over by one spot (just like multiplying by 10 shifts the decimal point over by one spot), so it causes the leading bit in the fraction to become the integer part.

You could also multiply a fraction by a power of two to turn it into an integer and then find the binary representation of the integer. For example, [itex]0.671875\times2^6 = 43[/itex], which is represented by [itex]101011_2[/itex]. So [itex]0.671875 = 43\times2^{-6} = 0.101011_2[/itex]. The main complications with this approach are that you can't turn most fractions neatly into an integer by multiplying by a power of two, so you need to choose the power of 2 to get the number of bits you want and toss the remaining part of the fraction, and then you still have to find the (possibly large) integer's binary representation.
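
A small Python sketch of this scaling approach, with a made-up helper name `fraction_bits_by_scaling`. It truncates whatever is left of the fraction after scaling, as described above, rather than rounding.

```python
from fractions import Fraction

def fraction_bits_by_scaling(frac: str, nbits: int) -> str:
    """First nbits binary digits of 0 <= frac < 1, via multiplication by 2**nbits."""
    scaled = int(Fraction(frac) * 2**nbits)   # toss the remaining part of the fraction
    return f"{scaled:0{nbits}b}"              # binary form of the integer, zero-padded

print("0." + fraction_bits_by_scaling("0.671875", 6))  # 0.101011
print("0." + fraction_bits_by_scaling("0.1", 8))       # 0.00011001 (truncated)
```

The zero padding matters: leading zero bits of the fraction would otherwise be lost when the integer is written in binary.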

Why did you take the 0.75 from 1.375 and not 0.375?
I think you missed the line between the 1.375 and the 0.75.
 

What is decimal to floating point conversion?

Decimal to floating point conversion is the process of converting a decimal number to its equivalent representation in floating point format. Floating point format is commonly used in computers to represent real numbers with a fractional component.

Why do we need to convert decimals to floating point?

Computers use binary numbers to perform calculations, but decimal numbers are more convenient for humans to work with. Converting a decimal value to a binary floating point format lets the computer store it in a fixed number of bits and operate on it efficiently, at the cost of some rounding.

How does the conversion from decimal to floating point work?

The conversion from decimal to floating point involves writing the number in binary and breaking it down into its components of sign, mantissa, and exponent. The mantissa represents the significant digits of the number, while the exponent determines the placement of the binary point.

What are the limitations of converting decimals to floating point?

One limitation of converting decimals to floating point is that not all decimal numbers can be accurately represented in binary form. This can lead to rounding errors and loss of precision. Additionally, floating point numbers can only represent a certain range of values, which can also lead to errors in calculations.
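
As a rough illustration of the precision limit, this Python sketch round-trips values through single precision with the standard `struct` module. The helper name `to_float32` is an assumption for illustration.

```python
import struct
from decimal import Decimal

def to_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

# 3.671875 is a sum of a few powers of two, so it survives unchanged.
print(to_float32(3.671875) == 3.671875)   # True

# 0.1 has no finite binary expansion, so the stored value is only close to it.
print(Decimal(to_float32(0.1)))           # exact decimal value of the stored float32
```

`Decimal` is used here only to print the stored value exactly, without any further rounding by the display routine.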

How can I ensure the accuracy of decimal to floating point conversion?

To ensure the accuracy of decimal to floating point conversion, it is important to understand the limitations and potential errors involved in the process. It is also helpful to use programming languages or tools that have built-in functions for handling floating point numbers and minimizing rounding errors.
