
Both signed and unsigned int storing signs?

  1. Sep 5, 2009 #1
    In this code -

    Code (Text):

    #include <stdio.h>

    int main(void)
    {
        signed int a = -9485;
        unsigned int b = -9485;
        printf("%d", a);
        printf("%d", b);
        return 0;
    }
     
    I get the output as -
    Code (Text):

    -9485-9485
     
    And I was wondering about this, since I expected that b wouldn't store the negative sign.
     
  3. Sep 5, 2009 #2
    It's not storing the sign in an unsigned integer.

    "%d" is the format specifier for a signed integer, so whatever argument you give it will be interpreted as a signed integer when it is printed. "%u" is the format specifier for an unsigned integer.

    If you are using C++ then you can print out the value without converting it to a different type by using std::cout.
     
  4. Sep 5, 2009 #3
    I applied %u on b and this is what I get -

    -9485
    4294957811

    This is the new code -
    Code (Text):

    #include <stdio.h>

    int main(void)
    {
        signed int a = -9485;
        unsigned int b = -9485;
        printf("%d \n", a);
        printf("%u", b);
        return 0;
    }
     
     
  5. Sep 5, 2009 #4
    Actually, it's more complex than that. Consider the following:

    Code (Text):
    #include <iostream>
    using namespace std;

    int main()
    {
        int a = -9485;
        unsigned int b = -9485;

        cout << "a = " << a << endl;
        cout << "b = " << b << endl;

        return 0;
    }
     
    Your compiler should hit you with a warning (but not typically an error!) when you attempt to compile this because of the implicit conversion in this line:

    Code (Text):
    unsigned int b = -9485;
    Ordinarily, there's no difficulty with implicit conversions between signed and unsigned ints if the signed int is [itex]\ge[/itex] zero. Converting a negative int to an unsigned int is actually well-defined by the standard -- the value is reduced modulo 2^N, where N is the number of bits in the unsigned type -- but the result is almost never what a naive reading of the code suggests, which is why the compiler warns about it. (It's the reverse direction, converting an out-of-range unsigned value to a signed type, that is implementation-defined.)

    To understand what's really going on in the C case, look at what the C99 spec (section 6.3.1.3) has to say on the subject:

        Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

    For what it's worth, C++ has much, much stricter type enforcement than C. GCC will happily compile something like

    Code (Text):
    unsigned int b = -9485;
    without giving you any warning that it's dangerous unless you pass the '-Wconversion' flag at compile time. Implicit type conversions like this can be a nightmare in C because compilers will often just let you hunt for the bug on your own. Note also that you can lead yourself to a whole world of pain by casting the conversion explicitly; something like

    Code (Text):

    int a = -9485;
    unsigned int b = (unsigned int)(a);
    will quite happily compile in C without giving you any warnings at all even if you pass the '-Wconversion' flag to the compiler. The reason is that the compiler will believe you know what you're doing if you've gone to the trouble of explicitly casting from signed to unsigned int. (Something similar happens in C++ if you go to the trouble of static_cast-ing from signed to unsigned.)
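    To put numbers on it, here is a quick check of what that cast actually produces (a sketch assuming a 32-bit unsigned int, where -9485 wraps to 2^32 - 9485):

    ```c
    #include <assert.h>
    #include <stdio.h>

    int main(void)
    {
        int a = -9485;
        unsigned int b = (unsigned int)a;  /* compiles silently, as described */

        printf("%u\n", b);
        /* With a 32-bit unsigned int this is 4294967296 - 9485 = 4294957811,
           matching the %u output shown earlier in the thread. */
        if (sizeof(unsigned int) == 4)
            assert(b == 4294957811u);
        return 0;
    }
    ```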
     
  6. Sep 5, 2009 #5
    Oook... so that's what made the output chaotic.

    Yep it did!


    So, as I thought before, the sign won't simply be removed... so we can't use an unsigned datatype as a shortcut to convert a negative number into a positive one.

    The only advantage we have with an unsigned type is that it has a larger range of positive values.
     
  7. Sep 5, 2009 #6
    There are two advantages to using an unsigned type. First, you can represent numbers that are twice as large. Second, you no longer have to worry about handling negative values. For example, if you want to check that an integer coordinate is within some range [0, width): if the coordinate is signed you need to check (val >= 0 && val < width), but if it's unsigned you can just check (val < width). Of course, there are disadvantages to using unsigned types as well, such as making your code more error-prone and making loops that count downwards trickier.
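    The coordinate check can be sketched like this (the helper names are made up for illustration):

    ```c
    #include <assert.h>
    #include <stdbool.h>

    /* Signed coordinate: two comparisons are needed. */
    static bool in_range_signed(int val, int width)
    {
        return val >= 0 && val < width;
    }

    /* Unsigned coordinate: a negative input wraps to a huge value,
       so a single comparison rejects it as well. */
    static bool in_range_unsigned(unsigned int val, unsigned int width)
    {
        return val < width;
    }

    int main(void)
    {
        assert(in_range_signed(5, 10));
        assert(!in_range_signed(-3, 10));
        /* -3 converted to unsigned becomes a huge number,
           so the single test still rejects it. */
        assert(!in_range_unsigned((unsigned int)-3, 10u));
        return 0;
    }
    ```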

    In C++ the conversion just reinterprets the bits. Imagine you have 4 bits and you want to store a signed integer. One bit position is given over to the sign, leaving only 3 bits for the magnitude, so in two's complement your number must be in the range -8 to 7. With an unsigned int, all 4 bits store the number, so it can range from 0 to 15. If you reinterpret the unsigned value 15 (bit pattern 1111) as signed, the top bit contributes -8 instead of +8, giving a result of -8 + 7 = -1.
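    You can check the 4-bit picture directly by masking; this sketch assumes the usual two's-complement rule, where the top of the 4 bits carries weight -8:

    ```c
    #include <assert.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int u = 15u;   /* 4-bit pattern 1111 */

        /* Reinterpret the 4-bit pattern as two's complement:
           the top bit has weight -8, the lower three +4, +2, +1. */
        int s = (u & 0x8u) ? (int)u - 16 : (int)u;

        printf("%u reinterpreted as a signed 4-bit value is %d\n", u, s);
        assert(s == -1);        /* -8 + 4 + 2 + 1 = -1 */
        return 0;
    }
    ```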
     
  8. Sep 5, 2009 #7

    Hurkyl


    Actually, that's not strictly true -- the C standard allows a small variety of possibilities. One's complement, for example.

    As a general rule, I would advise against writing code that relies on bit patterns of signed numbers or on the behavior of signed overflow or other similar things.

    (A brief internet search suggests that converting an unsigned value that is too big to a signed type is actually implementation-defined behavior -- and so is unreliable across platforms, except for the fact that most desktop computers implement integer arithmetic the same way and C implementations cater to that)
     
    Last edited: Sep 5, 2009
  9. Sep 5, 2009 #8

    D H


    The C99 list of undefined behaviors does contain "conversion to or from an integer type produces a value outside the range that can be represented" -- but that item is about conversions between floating and integer types (6.3.1.4). Plain signed-to-unsigned conversion is defined to wrap modulo 2^N, while converting an out-of-range value to a signed type is implementation-defined (6.3.1.3). Even so, relying on the murkier corners is asking for trouble. Long ago, I used to use

    unsigned int some_variable_name = -1;

    as an easy way to set some_variable_name to the largest possible unsigned integer. This worked on many different computers and with many different compilers -- until someone ported my code to some strange machine. My lesson learned: never invoke undefined or implementation-defined behavior, intentionally or unintentionally. Doing so not only opens the doorway to Murphy's law, it opens the doorway and begs Murphy to please come in.
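    For comparison, here are the common spellings of the largest unsigned int side by side (a sketch; UINT_MAX from <limits.h> states the intent most plainly):

    ```c
    #include <assert.h>
    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int a = -1;        /* wraps to UINT_MAX under the modulo rule */
        unsigned int b = 0u - 1u;   /* unsigned arithmetic also wraps */
        unsigned int c = UINT_MAX;  /* the explicit, self-documenting spelling */

        assert(a == c && b == c);
        printf("UINT_MAX = %u\n", c);
        return 0;
    }
    ```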

    Regarding
    unsigned int foo; ... printf ("%d\n", foo);

    printf does not know the types of what was passed to it, as it takes a variable number of arguments; it uses the format string to interpret what it receives. For example,

    int foo; double bar; ... printf ("foo=%f, bar=%d\n", foo, bar);

    will compile and will yield some rather inscrutable output.
     
  10. Sep 5, 2009 #9

    Hurkyl


    For the record, 0u-1u would be a portable way to do that. I've nagged myself into doing that somewhat pedantically, and I feel a little better knowing it really is worthwhile!
     
  11. Sep 5, 2009 #10
    We are now talking about two different languages. Anyway, it can be a measurable reduction in computation time to use unsigned wraparound in this manner -- which is why I use it. When you have to wait around for hours for a job to complete, it starts to matter.

    Note that integer conversion between signed and unsigned types is not undefined behavior in general.

    Section 4.7:

        If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n, where n is the number of bits used to represent the unsigned type).

    Also in C++, you can always use a reinterpret_cast.
     
  12. Sep 5, 2009 #11
    You can, but using reinterpret_cast for anything - particularly something as dumb as casting where the result of the cast is explicitly undefined in the standard - is liable to get you fired more quickly than punching the boss.

    If you really insist on using casts, the superior type-safety of C++ coupled with boost::*cast is the only sane way to go, in my opinion.
     
  13. Sep 5, 2009 #12
    Width?... the width of the data type?... how exactly do we get this? Looks pretty useful.

    After a warning (in gcc)... I tried that.

    If foo was 5, it would be printed as 4.999999998... or something like that.
     
  14. Sep 5, 2009 #13
    But still the main question sort of persists...

    "unsigned int b = -9485;"

    Why was this able to store the number -9485 (when printed with %d using printf), when I would normally expect some chaotic value... or actually the maximum possible value stored in b?

    I bet this has to do with the bit pattern, which I don't know much about... but I gather it's how a data type's values are represented in memory.
     
  15. Sep 6, 2009 #14

    D H


    Also in our forum we have icons: :surprised

    Many places have programming standards that explicitly forbid reinterpret_cast. OK, so just do some_type x = *(some_type*)(&y). Problem solved.
    That won't work if another programming standard is to compile with no warnings -- and with -Wold-style-cast (or its equivalent) enabled.


    sizeof()


    No, it would print as something completely bizarre. printf is a variable arguments function. This means arguments are passed to it in the most primitive form, the way C worked thirty years ago. The only things that were passed between functions in the original implementation of C were ints, doubles, and pointers. Printf interprets what was put on the stack according to the format list. If the format doesn't match up with what is actually placed on the stack things can get quite bizarre. Example:
    Code (Text):

    #include <stdio.h>

    int main ()
    {
       printf ("%g\n%d\n", 5, 1.0/3.0);
       return 0;
    }
     
    Gives me
    1.11391e-313
    1431655765

    That assignment worked because of some rather arcane rules of C, and because the compiler was being "nice" to you. -9485 as an int has a certain bit pattern. So, just reinterpret that bit pattern as an unsigned int. This reinterpreted value will not be -9485.

    Why printf printed -9485 with format "%d"? When you call any function all that is put on the stack is a value of some sort, a bunch of bits. In most cases, the compiler knows what should be put on the stack because of the function prototype. For example, sqrt expects a double as an argument. If you call sqrt(4), the compiler kindly converts that 4 (an int) into 4.0 (a double). printf is a special kind of function. It's prototype is int printf(char*,...). Those ellipses mean it is a variadic function. The compiler doesn't do any conversion. It just puts the arguments on the stack in their native type. The format string tells printf how to interpret the things on the stack. printf works great if the things you put on the stack agree with the types implied by the format string. It doesn't work so great if the format string and the arguments don't quite match.
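    The mechanism shows up in miniature with any variadic function; sum_ints below is a made-up example, not anything from the standard library:

    ```c
    #include <assert.h>
    #include <stdarg.h>

    /* A tiny variadic function: reads 'count' ints off the argument list.
       Like printf, it has no way to verify that the caller really passed
       ints -- it simply trusts its "format" (here, the count). */
    static int sum_ints(int count, ...)
    {
        va_list ap;
        int total = 0;

        va_start(ap, count);
        for (int i = 0; i < count; ++i)
            total += va_arg(ap, int);  /* wrong type here = garbage,
                                          just like %d on a double */
        va_end(ap);
        return total;
    }

    int main(void)
    {
        assert(sum_ints(3, 1, 2, 3) == 6);
        assert(sum_ints(0) == 0);
        return 0;
    }
    ```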
     
  16. Sep 6, 2009 #15
    It appears gcc is generally nice to human beings.

    So I presume the bit pattern is stored in memory in a certain format... e.g. the first bit of an int is taken as the sign.

    I sort of didn't get the behavior of printf... can you please explain the behavior directly, without going into the actual printf function?

    From what I understood, the main question is still a mystery to me.
     
  17. Sep 6, 2009 #16

    D H


    I'll try again, with sqrt and printf. You can call sqrt(4), rather than sqrt(4.0), because the compiler knows that the argument to sqrt has to be a double. It knows this because you typed #include <math.h> in your source code, and that file in turn has the statement double sqrt(double);. Because the argument to sqrt has to be a double, the compiler inserts code to convert that 4 (an int) into a double (4.0) before calling sqrt.

    What about printf? Suppose i and x are ints and doubles, respectively. This is perfectly valid code, and it does just what you expect:
    printf ("i=%d\n", i);
    printf ("x=%g\n", x);

    The above wouldn't work if the compiler converted arguments the way it does with sqrt. The compiler doesn't do any such conversion because the prototype for printf is int printf(char *, ...);. The first argument is the format string. It has to be a string; the prototype says so. Those ellipses ("...") tell the compiler that the first argument is the only thing it needs to worry about. printf can take anywhere from zero extra arguments (printf("Hello world");) to dozens, maybe hundreds of arguments (there's some machine limit) after the format string -- and the compiler had better not do any conversion at all. printf is a special kind of function.
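    The two calling conventions side by side (just a sketch):

    ```c
    #include <assert.h>
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* Prototyped function: the compiler converts the int 4
           to the double 4.0 before the call. */
        double r = sqrt(4);
        assert(r == 2.0);

        /* Variadic function: nothing after the format string is
           converted, so the caller must spell out the types correctly. */
        printf("%d %g\n", 4, 2.0);
        return 0;
    }
    ```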
     
  18. Sep 6, 2009 #17
    So if I ask printf to print a float with %d, it will assume the bit pattern of float to be that of an integer.

    Since printf does not ask the compiler to convert the type, it does not, as a result we get chaos.

    So for e.g -

    Code (Text):
    short t = -1;
    The first bit of the short type will contain information about the sign.

    Here also -

    Code (Text):
    unsigned short k = -1;
    The first bit will hold the sign; since %u does not expect a sign bit, it will misinterpret that bit as part of the number, so we'll get awkward output... right?

    If we print k as an integer, the first bit is taken as the sign and so we get the right output.

    Is that right?
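    A quick experiment to check this (assuming two's complement; USHRT_MAX is 65535 when shorts are 16 bits):

    ```c
    #include <assert.h>
    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned short k = -1;   /* wraps around to USHRT_MAX */

        /* In a varargs call k is promoted to int before printf sees it,
           so the value printed is already non-negative -- there is no
           "sign bit" left for %u to misread. */
        printf("%u\n", (unsigned int)k);
        assert(k == USHRT_MAX);
        return 0;
    }
    ```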
     