Signed and unsigned integer expressions?

  1. Aug 25, 2015 #1
    So I just completed a simple exercise that converts all letters of a string to upper case letters. The program works but it comes with the warning: comparison between signed and unsigned integer expressions [Wsign-compare].
    What does this mean? While whatever is causing the warning does not seem to have any effect on the operation of the program, I would like to know what is happening here in case it shows up at some point in the future and DOES cause problems. Here's the prog...
    Code (Text):

    #include <iostream>
    #include <string>
    using namespace std;

    //Changes letters to upper case
    void converter(string& sent_par);

    int main ()
    {
      string sentence;
      cout << "Enter a sentence: ";
      getline (cin, sentence, '\n');
      converter (sentence);
      cout << endl << sentence << endl;

      return 0;
    }

    void converter (string& sent_par)
    {
      for (int i = 0; i <= sent_par.size(); i++)
      {
      sent_par[i] = toupper(sent_par[i]);
      }
    }
     
  2. jcsd
  3. Aug 25, 2015 #2

    Titan97

    Gold Member

    I ran the program in code::blocks and it worked perfectly (I saved it as TXT.cpp).

    Capture.PNG
     
  4. Aug 25, 2015 #3
    Interesting. Is that Code::blocks 13.12?
     
  5. Aug 25, 2015 #4

    Titan97

    Gold Member

  6. Aug 25, 2015 #5
    Ghost in the machine of some variety or other then.
     
  7. Aug 25, 2015 #6

    Titan97

    Gold Member

    Should I post a video of it?
     
  8. Aug 25, 2015 #7
    Nah, not necessary. Thanks for the input.
     
  9. Aug 25, 2015 #8
    Due to how two's-complement signed numbers are designed, you will probably be OK when comparing signed and unsigned numbers if both values lie in the range the two types share. If they do not, you will get results like -1 comparing equal to 4294967295 (for 32 bit numbers).
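
A small sketch of that point, assuming 32 bit values. The helper name equal_after_conversion is made up for illustration; it performs explicitly the conversion that the compiler applies implicitly when a signed and an unsigned value of the same width meet in a comparison.

```cpp
#include <cstdint>

// Hypothetical helper: compares a signed and an unsigned 32 bit value the
// way the built-in == does, i.e. after converting the signed side to unsigned.
bool equal_after_conversion(std::int32_t s, std::uint32_t u)
{
    return static_cast<std::uint32_t>(s) == u;
}
```

equal_after_conversion(5, 5u) is true, as expected, because 5 lies in the range both types share. equal_after_conversion(-1, 4294967295u) is also true, because -1 has no unsigned representation and wraps around to 2^32 - 1.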
     
  10. Aug 25, 2015 #9

    Filip Larsen

    Gold Member

    I would guess the warning comes from the expression i <= sent_par.size(), where the two sides of the comparison are signed and unsigned, respectively? If so, the warning is to make you aware that the compiler cannot in general make this comparison without converting the signed int to an unsigned integer, which is bound to go wrong when the number is negative. For a general comment on how to handle such comparisons, see [1].

    In your case the code will work when sent_par has fewer elements than can be expressed in a signed int, which is around half the theoretical maximum size that the size() method [2] can return. If it has more than that your code will probably crash as it tries to overwrite memory outside the sent_par string. A safer approach would be to declare i of type size_t.

    [1] http://jwwalker.com/pages/safe-compare.html
    [2] http://www.cplusplus.com/reference/string/string/size/
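
A minimal sketch of the size_t suggestion, not the original poster's code: the index has the same unsigned type that size() returns, so the comparison no longer mixes signedness. Two changes beyond that suggestion are assumptions of this sketch: the bound uses < rather than <=, and each character is passed to std::toupper as unsigned char, since passing a negative char value is undefined.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Upper-cases every character of sent_par in place.
void converter(std::string& sent_par)
{
    for (std::size_t i = 0; i < sent_par.size(); ++i)
    {
        // std::toupper expects a value representable as unsigned char.
        const unsigned char c = static_cast<unsigned char>(sent_par[i]);
        sent_par[i] = static_cast<char>(std::toupper(c));
    }
}
```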
     
  11. Aug 25, 2015 #10

    D H

    Staff Emeritus
    Science Advisor

    As @Filip_Larsen noted, the warning results from comparing i to sent_par.size(). The first is a signed int, typically a 32 bit signed integer; the latter is of type std::size_t, typically a 64 bit unsigned integer. The results from comparing a small negative signed integer to a not-so-small unsigned integer can be rather surprising. For example, for any reasonably sized string, -1 < some_string.length() is false.
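
To make that concrete, here is a hedged sketch. The names converted_less and safe_less are invented for this example. The first function spells out the conversion the compiler performs implicitly on typical platforms; the second checks the sign first, which is one common way to get the mathematically expected answer.

```cpp
#include <cstddef>
#include <string>

// What `i < s.length()` actually computes: i is converted to std::size_t first.
bool converted_less(int i, const std::string& s)
{
    return static_cast<std::size_t>(i) < s.length();
}

// Hypothetical safe variant: a negative i is less than every length.
bool safe_less(int i, const std::string& s)
{
    return i < 0 || static_cast<std::size_t>(i) < s.length();
}
```

For std::string s = "hello", converted_less(-1, s) is false while safe_less(-1, s) is true. The two agree for every non-negative i.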

    That said, there are better ways to write your function converter(string&). With modern c++, you should rethink what you are doing whenever you find yourself writing a loop of the form for (int ii = 0; ii < some_limit; ++ii). Some alternative formulations follow.

    1. Iterators.
    Using iterators instead of indices makes your code much more generic (and also, much more amenable to standard algorithms; see alternative #3). An implementation of your function converter(string&) using iterators:
    Code (C):
    void converter (std::string& sent_par)
    {
      for (std::string::iterator it = sent_par.begin(); it != sent_par.end(); ++it)
      {
        *it = std::toupper(*it);
      }
    }

    2. Range-based for loop (C++11 and higher).
    This new feature pretty much eliminates the ugliness/verbosity of c++ iterators. An implementation of your function converter(string&) using a range-based for loop:
    Code (C):
    void converter (std::string& sent_par)
    {
      for (auto& c : sent_par)
      {
        c = std::toupper(c);
      }
    }

    3. The std::transform function.
    You'll need to use #include <algorithm> to use this function. There's a lot of good stuff in the c++ algorithm library. An implementation of your function converter(string&) using std::transform:
    Code (C):
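    void converter (std::string& sent_par)
    {
      // A lambda avoids the ambiguity between the <cctype> and <locale> overloads of std::toupper.
      std::transform (sent_par.begin(), sent_par.end(), sent_par.begin(),
                      [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    }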
     
    Last edited: Aug 25, 2015
  12. Aug 26, 2015 #11

    harborsparrow

    Gold Member

    It is really good that you followed up on such a compiler warning. A lot of times, those warnings are not important--but they are always there for a reason. Understanding that reason can save your butt, someday.
     
  13. Aug 27, 2015 #12
    As you say, sometimes the warnings which show up are in a way spurious. In this case though, while things ran as required, the information provided by Filip Larsen and D H shows that the issue is actually rather insidious.

    Being a beginner I've never encountered size_t before now. Thanks a lot for this info, folks.
     
    Last edited by a moderator: Aug 27, 2015
  14. Aug 27, 2015 #13
    Whoops. Messed that up a bit.
     
  15. Aug 27, 2015 #14

    D H

    Staff Emeritus
    Science Advisor

    That's okay. I fixed it for you -- or so I think. Let me know if I didn't get it right.

    You'll run into std::size_t again and again and again. It is pervasive throughout the C++ standard library. Ditto @harborsparrow, it truly is a good thing you followed up on this warning. It is also a good thing that you even saw this warning. That means you have your compilation options set at a reasonably high level.


    Note well: I used std::size_t rather than size_t. I'm pedantic, and I never, ever use using namespace std. There would be a good deal of unhappiness (and worse) if I saw that construct (using namespace std) in code that I ask someone to write for me. This construct is very widely considered to be extremely bad style amongst professional c++ programmers. Unfortunately, many introductory c++ texts use this construct everywhere.

    Quoting from the Zen of Python, "Namespaces are one honking great idea -- let's do more of those!" What this means is (a) it's a great idea to get in the habit of creating namespaces, and (b) it's a bad idea to use constructs such as using namespace whatever that explicitly subvert the very concept of separate namespaces.
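
As a rough illustration of point (a), here is a sketch that puts the thread's function in its own namespace and qualifies every standard name. The namespace name text_tools and the function name to_upper are made up for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

namespace text_tools {

// Returns an upper-cased copy of s.
std::string to_upper(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return s;
}

} // namespace text_tools
```

A caller writes text_tools::to_upper("abc"), so it is always clear which function is meant, and a same-named function added elsewhere later cannot silently change what gets called.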
     
  16. Aug 27, 2015 #15
    Makes sense for when programs get more intricate I suppose. On the other hand, it makes sense to get into the habit of such practices early, yes? Bad habits being hard to break and so on. As for the fixing part, close enough. "As you say, sometimes the warnings which show up are in a way spurious. In this case though, while things ran as required, the information provided by Filip Larsen and D H shows that the issue is actually rather insidious." That bit should be plain text as well but unless you feel it really should be altered, so be it.
     
  17. Aug 27, 2015 #16
    @Titan97, his code is indeed wrong, check your compiler settings. Compilers default to being fairly flexible, but good code should pass the strictest settings it has.

    Forget size_t, that's just going to confuse you. size_t is unsigned long (usually.)

    Think of how your data is represented: bits. So here is an 8 bit value
    00100101
    That's 37 in decimal. How do negative numbers work? I suggest learning that. In the simplest scheme (sign and magnitude), -37 looks like this:
    10100101
    The first bit of an 8 bit int tells you whether it's positive or negative. Real hardware uses two's complement instead, where -37 is 11011011, but the first bit still carries the sign. Either way this leaves only 7 bits for storing the value, since you need one for the sign. If you know your value is always positive and won't go negative, you can then use that first bit to store data.

    So to summarize:
    unsigned int - 32 bits of data (0 to 4294967295)
    int - 31 bits of data + 1 sign bit (-2147483648 to 2147483647)

    That's why the compiler warns when you compare them and why you should cast explicitly: the two types have different ranges.
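
To look at such bit patterns on your own machine, here is a short sketch using std::bitset. The function name bits8 is made up for this example. It relies on the fact that converting to unsigned char is defined as reduction modulo 256, which yields the same pattern a two's complement machine stores.

```cpp
#include <bitset>
#include <string>

// Returns the low 8 bits of value as a string of 0s and 1s.
std::string bits8(int value)
{
    // Converting to unsigned char keeps the low 8 bits (reduction modulo 256).
    return std::bitset<8>(static_cast<unsigned char>(value)).to_string();
}
```

bits8(37) gives 00100101 and bits8(-37) gives 11011011: the first bit is set for the negative value, but the remaining bits are not simply those of 37.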
     
  18. Aug 27, 2015 #17

    D H

    Staff Emeritus
    Science Advisor

    There's nothing wrong per se with using a loop of the form for (int i = 0; i < object.size(); i++) for an object of a reasonable size. Problems arise if the object is huge and its size is greater than the maximum value an int can attain. In the code at hand, this problem would arise if the user enters a line that is over two billion characters long. This is rather unlikely.

    A bigger problem with comparing signed to unsigned values occurs when the signed value is negative. The signed integer is widened to a signed integer with the same width as the unsigned integer and then converted to an unsigned integer by the standard promotion and conversion rules. Those promotion and conversion rules mean that -1 is equivalent to 18446744073709551615 when compared to a std::size_t value on a modern computer with 64 bit integers. This in turn means that -1 is greater than or equal to any unsigned integer. This is massively confusing and massively counterintuitive. This rather than overflow is the primary reason this warning exists.
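
The two steps described above can be written out explicitly. This is a sketch using fixed-width types from <cstdint>, so that it does not depend on how wide int and std::size_t happen to be on a given machine; the function name widen_then_convert is invented for illustration.

```cpp
#include <cstdint>

// Mimics what happens to a 32 bit int compared against a 64 bit unsigned value.
std::uint64_t widen_then_convert(std::int32_t value)
{
    const std::int64_t widened = value;          // step 1: sign-extend, -1 stays -1
    return static_cast<std::uint64_t>(widened);  // step 2: wrap modulo 2^64
}
```

widen_then_convert(-1) yields 18446744073709551615, the largest 64 bit unsigned value, which is why -1 compares as greater than or equal to every unsigned value of that width.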

    Compilers are rather dumb. The compiler warned about i <= object.size() because it has a rule to always warn about comparing signed and unsigned integers. This warning is almost certainly superfluous in this case. However, the compiler did not warn that the code used <= instead of the idiomatic <. Whether some_string[some_string.size()] is valid depends on the compiler and on the version of C++. (Compare with std::vector some_vector. In that case, some_vector[some_vector.size()] most definitely invokes undefined behavior.) He's a bit lucky that his code worked. It could have erased his hard drive, which is the canonical response to invoking undefined behavior.
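
One way to have an off-by-one index reported rather than silently tolerated is std::string::at, which checks the index and throws std::out_of_range when it is not less than size(). A small sketch, with a made-up function name:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns true if s.at(index) is rejected as out of range.
bool index_is_rejected(const std::string& s, std::size_t index)
{
    try
    {
        (void)s.at(index);   // bounds-checked access
        return false;
    }
    catch (const std::out_of_range&)
    {
        return true;
    }
}
```

For std::string s = "abc", index_is_rejected(s, 2) is false and index_is_rejected(s, 3) is true. Index 3 is exactly the extra element a loop written with <= would visit.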

    As for size_t being unsigned long (usually): not on my computer, and not on several other computers I use. There, std::size_t is unsigned long long.
     
  19. Aug 27, 2015 #18
    Expanding on that, you have issues with unsigned numbers when you are iterating backwards:

    Code (C):

    for(unsigned int i = 0; i < 200; ++i)   //Fine
    for(unsigned int i = 200; i >= 0; --i)  //Infinite loop, i can never be less than zero
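
A common idiom for counting down with an unsigned index is to test and decrement in one step, so the loop body never sees a wrapped value. A sketch, with an invented function name:

```cpp
#include <cstddef>
#include <string>

// Returns the characters of s in reverse order.
std::string reversed(const std::string& s)
{
    std::string out;
    out.reserve(s.size());
    // i-- > 0 compares the old value, then decrements: the body sees
    // size()-1 down to 0. The final decrement wraps i, but i is not used after that.
    for (std::size_t i = s.size(); i-- > 0; )
    {
        out.push_back(s[i]);
    }
    return out;
}
```

reversed("abc") returns "cba", and an empty string is handled without the loop body running at all.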
     
    Is unsigned long long different than unsigned long? I've seen them different on weird machines, but usually not.

    The standard specifies this:
    unsigned shorts and ints must be at least 16 bits
    unsigned longs must be at least 32 bits
    unsigned long longs must be at least 64 bits
    None of them is required to be able to hold a pointer; std::uintptr_t exists for that, where the platform provides it.

    try this program if you want to see what's what
    Code (C):
    #include <climits>   // CHAR_BIT
    #include <cstddef>   // std::size_t
    #include <iostream>
    #define PRINT_SIZE(x) std::cout << #x << ": " << sizeof(x) * CHAR_BIT << " bits" << std::endl
    int main(int argc, char ** argv){
         PRINT_SIZE(unsigned char);
         PRINT_SIZE(unsigned short);
         PRINT_SIZE(unsigned int);
         PRINT_SIZE(unsigned long);
         PRINT_SIZE(unsigned long long);
         PRINT_SIZE(char);
         PRINT_SIZE(short);
         PRINT_SIZE(int);
         PRINT_SIZE(long);
         PRINT_SIZE(long long);
         PRINT_SIZE(std::size_t);
         PRINT_SIZE(void *);
         return 0;
    }
     
    My output on my 64 bit mac
    Code (C):
    unsigned char: 8 bits
    unsigned short: 16 bits
    unsigned int: 32 bits
    unsigned long: 64 bits
    unsigned long long: 64 bits
    char: 8 bits
    short: 16 bits
    int: 32 bits
    long: 64 bits
    long long: 64 bits
    std::size_t: 64 bits
    void *: 64 bits
     
     
  20. Aug 27, 2015 #19

    D H

    Staff Emeritus
    Science Advisor

    Almost always. Try this on your mac, a windows machine, and a linux machine:
    Code (Text):
    #include <iostream>
    #include <typeinfo>
    #include <cstddef>

    int main ()
    {
        std::cout << typeid(unsigned long).name() << '\n';
        std::cout << typeid(unsigned long long).name() << '\n';
        std::cout << typeid(std::size_t).name() << '\n';
    }
    unsigned long and unsigned long long can be different types, and they are on your mac. (This killed me with some templates.) The underlying type of std::size_t is very system/compiler dependent.
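
The same question can be asked with std::is_same from <type_traits>, which avoids the implementation-defined strings that typeid(...).name() returns. A sketch; the function name size_t_underlying_type is made up here, and the result depends on the system and compiler.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Reports which standard unsigned type std::size_t is an alias for.
std::string size_t_underlying_type()
{
    if (std::is_same<std::size_t, unsigned int>::value)       return "unsigned int";
    if (std::is_same<std::size_t, unsigned long>::value)      return "unsigned long";
    if (std::is_same<std::size_t, unsigned long long>::value) return "unsigned long long";
    return "something else";
}

// Distinct types even where both are 64 bits wide.
static_assert(!std::is_same<unsigned long, unsigned long long>::value,
              "unsigned long and unsigned long long are always different types");
```

Printing the returned string on a mac, a windows machine, and a linux machine shows directly which alias each one uses.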
     
    Last edited: Aug 27, 2015
  21. Aug 27, 2015 #20
    The typeids for long and long long will always be different; I was referring to their layout in memory. For templates the type is important; for holding numbers, the bit size is important. C++ can't convert between two template instantiations (the issue isn't the type, it's how templates work, they have different function pointers), but it can easily implicitly convert between number types (you shouldn't, but you can.)

    My output of your program showed std::size_t as the same type as unsigned long, not long long on my mac.
     