Signed and unsigned integer expressions?

In summary: The warning results from comparing a signed integer to an unsigned one, and the results of comparing a small negative signed integer to a not-so-small unsigned integer can be rather surprising. For example, for any reasonably sized string, -1 < some_string.length() is false.
  • #1
Lord Anoobis
So I just completed a simple exercise that converts all letters of a string to upper case letters. The program works but it comes with the warning: comparison between signed and unsigned integer expressions [Wsign-compare].
What does this mean? While whatever is causing the warning does not seem to have any effect on the operation of the program, I would like to know what is happening here in case it shows up at some point in the future and DOES cause problems. Here's the prog...
Code:
#include <iostream>
#include <string>
using namespace std;

//Changes letters to upper case
void converter(string& sent_par);

int main ()
{
  string sentence;
  cout << "Enter a sentence: ";
  getline (cin, sentence, '\n');
  converter (sentence);
  cout << endl << sentence << endl;

  return 0;
}

void converter (string& sent_par)
{
  for (int i = 0; i <= sent_par.size(); i++)
  {
    sent_par[i] = toupper(sent_par[i]);
  }
}
 
  • #2
I ran the program in code::blocks and it worked perfectly (I saved it as TXT.cpp).

 
  • #3
Titan97 said:
I ran the program in code::blocks and it worked perfectly.(I saved it as TXT.cpp)
Interesting. Is that Code::blocks 13.12?
 
  • #4
Yes.
 
  • #5
Titan97 said:
Yes.
Ghost in the machine of some variety or other then.
 
  • #6
Should I post a video of it?
 
  • #7
Titan97 said:
Should I post a video of it?
Nah, not necessary. Thanks for the input.
 
  • #8
Due to how two's-complement signed numbers are designed, you will probably be OK when comparing signed and unsigned numbers if both values lie in the range the two types share. If they do not, you will get stuff like -1 being equal to 4294967295 (for 32 bit numbers).
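
As a rough illustration of the point above, here is a minimal sketch using the fixed-width types from <cstdint>. The function names as_unsigned32 and equal_after_conversion are made up for this example and are not part of any library.

```cpp
#include <cstdint>

// Converts a 32-bit signed value to 32-bit unsigned. The language defines
// this conversion as reduction modulo 2^32.
constexpr std::uint32_t as_unsigned32(std::int32_t value)
{
    return static_cast<std::uint32_t>(value);
}

// True when the two operands compare equal once the signed one has been
// converted to unsigned (hypothetical helper, for illustration only).
constexpr bool equal_after_conversion(std::int32_t s, std::uint32_t u)
{
    return as_unsigned32(s) == u;
}

static_assert(as_unsigned32(-1) == 4294967295u, "-1 maps to 2^32 - 1");
static_assert(as_unsigned32(5) == 5u, "values in the shared range are unchanged");
```

Values from 0 up to 2147483647 are unaffected by the conversion, which is why a comparison within that shared range behaves as expected.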
 
  • #9
I would guess the warning comes from the expression i <= sent_par.size(), where the two sides of the comparison are signed and unsigned, respectively? If so, the warning is to make you aware that the compiler cannot in general make this comparison without converting the signed int to an unsigned type, which is bound to go wrong when the number is negative. For a general comment on how to handle such comparisons, see [1].

In your case the code will work when sent_par has fewer elements than can be expressed in a signed int, which (when int and size_t have the same width) is around half the theoretical maximum size that the size() method [2] can return. If it has more than that your code will probably crash as it tries to overwrite memory outside the sent_par string. A safer approach would be to declare i of type size_t.

[1] http://jwwalker.com/pages/safe-compare.html
[2] http://www.cplusplus.com/reference/string/string/size/
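
A minimal sketch of that suggestion, assuming the same function name converter as in the original post. It also uses < rather than <= so the index stays inside the string, and casts through unsigned char because std::toupper expects a value in that range.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Upper-cases every character of the string in place.
void converter(std::string& sent_par)
{
    // std::size_t is unsigned, like the value returned by size(), so no
    // signed/unsigned comparison takes place. Using < (not <=) keeps the
    // index inside the string.
    for (std::size_t i = 0; i < sent_par.size(); ++i)
    {
        // Cast through unsigned char: std::toupper expects a value in
        // that range (or EOF).
        sent_par[i] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(sent_par[i])));
    }
}
```

Strictly speaking size() returns std::string::size_type, which is std::size_t for the default allocator.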
 
  • #10
As @Filip_Larsen noted, the warning results from comparing i to sent_par.size(). The first is a signed int, typically a 32 bit signed integer; the latter is of type std::size_t, typically a 64 bit unsigned integer. The results from comparing a small negative signed integer to a not-so-small unsigned integer can be rather surprising. For example, for any reasonably sized string, -1 < some_string.length() is false.
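
To make the surprise concrete, here is a small sketch. naive_less spells out the conversion that the comparison applies on typical platforms (where std::size_t is at least as wide as int), and safe_less is one possible way to get the mathematically expected answer. Both names are invented for this example.

```cpp
#include <cstddef>
#include <string>

// The comparison as it is typically performed: the int is converted to
// std::size_t first, so a negative value becomes a huge positive one.
bool naive_less(int lhs, std::size_t rhs)
{
    return static_cast<std::size_t>(lhs) < rhs;
}

// A comparison that matches mathematical intuition: any negative int is
// smaller than every unsigned value; otherwise the conversion is lossless.
bool safe_less(int lhs, std::size_t rhs)
{
    return lhs < 0 || static_cast<std::size_t>(lhs) < rhs;
}
```

With a string of length 5, naive_less(-1, 5) is false while safe_less(-1, 5) is true. C++20 also offers std::cmp_less in <utility> for this purpose.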

That said, there are better ways to write your function converter(string&). With modern c++, you should rethink what you are doing whenever you find yourself writing a loop of the form for (int ii = 0; ii < some_limit; ++ii). Some alternative formulations follow.

1. Iterators.
Using iterators instead of indices makes your code much more generic (and also, much more amenable to standard algorithms; see alternative #3). An implementation of your function converter(string&) using iterators:
Code:
void converter (std::string& sent_par)
{
  for (std::string::iterator it = sent_par.begin(); it != sent_par.end(); ++it)
  {
    *it = std::toupper(*it);
  }
}
2. Range-based for loop (C++11 and higher).
This new feature pretty much eliminates the ugliness/verbosity of c++ iterators. An implementation of your function converter(string&) using a range-based for loop:
Code:
void converter (std::string& sent_par)
{
  for (auto& c : sent_par)
  {
    c = std::toupper(c);
  }
}
3. The std::transform function.
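You'll need #include <algorithm> to use this function (and #include <cctype> for std::toupper). There's a lot of good stuff in the C++ algorithm library. An implementation of your function converter(string&) using std::transform:
Code:
void converter (std::string& sent_par)
{
  // Wrapping std::toupper in a lambda avoids overload ambiguity when <locale> is also visible.
  std::transform (sent_par.begin(), sent_par.end(), sent_par.begin(),
                  [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
}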
 
  • #11
It is really good that you followed up on such a compiler warning. A lot of times, those warnings are not important--but they are always there for a reason. Understanding that reason can save your butt, someday.
 
  • #12
harborsparrow said:
It is really good that you followed up on such a compiler warning. A lot of times, those warnings are not important--but they are always there for a reason. Understanding that reason can save your butt, someday.

As you say, sometimes the warnings which show up are in a way spurious. In this case though, while things ran as required, the information provided by Filip Larsen and D H shows that the issue is actually rather insidious.

Being a beginner, I've never encountered size_t before now. Thanks a lot for this info, folks.
 
  • #13
Whoops. Messed that up a bit.
 
  • #14
Lord Anoobis said:
Whoops. Messed that up a bit.
That's okay. I fixed it for you -- or so I think. Let me know if I didn't get it right.

Lord Anoobis said:
Being a beginner, I've never encountered size_t before now. Thanks a lot for this info, folks.
You'll run into std::size_t again and again and again. It is pervasive throughout the C++ standard library. Ditto @harborsparrow, it truly is a good thing you followed up on this warning. It is also a good thing that you even saw this warning. That means you have your compilation options set at a reasonably high level.

Note well: I used std::size_t rather than size_t. I'm pedantic, and I never, ever use using namespace std. There would be a good deal of unhappiness (and worse) if I saw that construct (using namespace std) in code that I ask someone to write for me. This construct is very widely considered to be extremely bad style amongst professional C++ programmers. Unfortunately, many introductory C++ texts use this construct everywhere.

Quoting from the Zen of Python, "Namespaces are one honking great idea -- let's do more of those!" What this means is (a) it's a great idea to get in the habit of creating namespaces, and (b) it's a bad idea to use constructs such as using namespace whatever that explicitly subvert the very concept of separate namespaces.
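
A small sketch of what that habit can look like. The namespace name text_tools is hypothetical, and the function is deliberately called toupper to show that a name inside your own namespace does not clash with std::toupper as long as both are referred to by their qualified names.

```cpp
#include <cctype>
#include <string>

namespace text_tools  // hypothetical namespace name, for illustration only
{
    // Returns an upper-cased copy of text. It shares its name with
    // std::toupper on purpose; the namespaces keep the two apart.
    std::string toupper(const std::string& text)
    {
        std::string result = text;
        for (char& c : result)
        {
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        }
        return result;
    }
}
```

A caller writes text_tools::toupper("abc") and there is no doubt about which function is meant.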
 
  • #15
Makes sense for when programs get more intricate I suppose. On the other hand, it makes sense to get into the habit of such practices early, yes? Bad habits being hard to break and so on. As for the fixing part, close enough. "As you say, sometimes the warnings which show up are in a way spurious. In this case though, while things ran as required, the information provided by Filip Larsen and D H show that the issue is actually rather insidious." That bit should be plain text as well but unless you feel it really should be altered, so be it.
 
  • #16
@Titan97, his code is indeed wrong, check your compiler settings. Compilers default to being fairly flexible, but good code should pass the strictest settings it has.

Forget size_t, that's just going to confuse you. size_t is unsigned long (usually.)

Think of how your data is represented: bits. So here is an 8 bit value
00100101
That's 37 in decimal. How do negative numbers work? I suggest learning that; here is what -37 looks like in two's complement, the representation used on essentially all current hardware:
11011011
The first digit of an 8 bit int is how you know if it's positive or negative. This leaves only 7 bits for storing the magnitude, since one is needed for the sign. If you know your value will never go negative, you can use an unsigned type and put that bit to work storing data as well.

So to summarize:
unsigned int - 32 bits of data (0 to 4294967295)
int - 31 bits of data + 1 sign bit (-2147483648 to 2147483647)

That's why the compiler warns about such comparisons and why you should cast explicitly: the two types have different ranges.
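
If you want to look at bit patterns yourself, here is a minimal sketch using std::bitset. The helper name bits8 is made up for this example. On two's-complement hardware it gives 00100101 for 37 and 11011011 for -37.

```cpp
#include <bitset>
#include <cstdint>
#include <string>

// Returns the 8-bit pattern of a signed value as a string of '0' and '1'.
std::string bits8(std::int8_t value)
{
    // Converting to uint8_t keeps the low 8 bits (reduction modulo 256).
    return std::bitset<8>(static_cast<std::uint8_t>(value)).to_string();
}
```

Note that -1 comes out as 11111111, the same pattern as the unsigned value 255, which is the 8-bit version of the -1 versus 4294967295 confusion.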
 
  • #17
newjerseyrunner said:
@Titan97, his code is indeed wrong, check your compiler settings. Compilers default to being fairly flexible, but good code should pass the strictest settings it has.
There's nothing wrong per se with using a loop of the form for (int i = 0; i < object.size(); i++) for an object of a reasonable size. Problems arise if the object is huge and its size is greater than the maximum value an int can attain. In the code at hand, this problem would arise if the user enters a line that is over two billion characters long. This is rather unlikely.

A bigger problem with comparing signed to unsigned values occurs when the signed value is negative. The signed integer is widened to a signed integer with the same width as the unsigned integer and then converted to an unsigned integer by the standard promotion and conversion rules. Those promotion and conversion rules mean that -1 is equivalent to 18446744073709551615 when compared to a std::size_t value on a modern computer with 64 bit integers. This in turn means that -1 is greater than or equal to any unsigned integer. This is massively confusing and massively counterintuitive. This rather than overflow is the primary reason this warning exists.

Compilers are rather dumb. The compiler warned about i <= object.size() because it has a rule to always warn about comparing signed and unsigned integers. This warning is almost certainly superfluous in this case. However, the compiler did not warn that the code used <= instead of the idiomatic <. Whether some_string[some_string.size()] is valid depends on the compiler and on the version of C++. (Compare with std::vector some_vector. In that case, some_vector[some_vector.size()] most definitely invokes undefined behavior.) He's a bit lucky that his code worked. It could have erased his hard drive, which is the canonical response to invoking undefined behavior.
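
One way to catch an off-by-one index like this while debugging is the checked accessor at(), which throws std::out_of_range for any index greater than or equal to size(), whereas operator[] performs no check. A minimal sketch follows; the helper name is_valid_index is invented for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns true when index is a valid character position in text.
bool is_valid_index(const std::string& text, std::size_t index)
{
    try
    {
        (void)text.at(index);  // throws if index >= text.size()
        return true;
    }
    catch (const std::out_of_range&)
    {
        return false;
    }
}
```

For a three-character string, indices 0 to 2 are valid and index 3, the one the <= loop reaches, is rejected.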

newjerseyrunner said:
Forget size_t, that's just going to confuse you. size_t is unsigned long (usually.)
Not on my computer, and not on several other computers I use. There, std::size_t is unsigned long long.
 
  • #18
D H said:
There's nothing wrong per se with using a loop of the form for (int i = 0; i < object.size(); i++) for an object of a reasonable size. Problems arise if the object is huge and its size is greater than the maximum value an int can attain. In the code at hand, this problem would arise if the user enters a line that is over two billion characters long. This is rather unlikely.

Expanding on that, you have issues with unsigned numbers when you are iterating backwards:

Code:
for(unsigned int i = 0; i < 200; ++i)   //Fine
for(unsigned int i = 200; i >= 0; --i)  //Infinite loop, i can never be less than zero
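
An unsigned counter can never drop below zero, so a condition such as i >= 0 is always true. One common idiom for counting down with an unsigned index is to test and decrement in the loop condition. Here is a minimal sketch; the function name reversed is made up for this example.

```cpp
#include <cstddef>
#include <string>

// Returns the characters of text in reverse order, walking the index
// downwards with an unsigned counter.
std::string reversed(const std::string& text)
{
    std::string result;
    // Test first, then decrement: the body sees size()-1 down to 0, and
    // the loop stops once i is 0 before the decrement.
    for (std::size_t i = text.size(); i-- > 0; )
    {
        result += text[i];
    }
    return result;
}
```

The same loop also handles an empty string correctly, because the test fails before the body ever runs.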

D H said:
Not on my computer, and not on several other computers I use. There, std::size_t is unsigned long long.
Is unsigned long long different than unsigned long? I've seen them different on weird machines, but usually not.

The standard specifies this:
unsigned shorts and ints must be at least 16 bits
unsigned longs must be at least 32 bits AND contain all valid pointers
unsigned long longs must be at least 64 bits AND contain all valid pointers

Try this program if you want to see what's what:
Code:
#include <cstddef>
#include <iostream>
#define PRINT_SIZE(x) std::cout << #x << ": " << sizeof(x) * 8 << " bits" << std::endl
int main(int argc, char ** argv){
     PRINT_SIZE(unsigned char);
     PRINT_SIZE(unsigned short);
     PRINT_SIZE(unsigned int);
     PRINT_SIZE(unsigned long);
     PRINT_SIZE(unsigned long long);
     PRINT_SIZE(char);
     PRINT_SIZE(short);
     PRINT_SIZE(int);
     PRINT_SIZE(long);
     PRINT_SIZE(long long);
     PRINT_SIZE(std::size_t);
     PRINT_SIZE(void *);
     return 0;
}

My output on my 64 bit mac
Code:
unsigned char: 8 bits
unsigned short: 16 bits
unsigned int: 32 bits
unsigned long: 64 bits
unsigned long long: 64 bits
char: 8 bits
short: 16 bits
int: 32 bits
long: 64 bits
long long: 64 bits
std::size_t: 64 bits
void *: 64 bits
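
The program above assumes 8 bits per byte. A slightly more portable sketch uses CHAR_BIT from <climits> and std::numeric_limits. The helper names storage_bits and value_bits are invented for this example.

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Storage size of T in bits. CHAR_BIT is the number of bits in a byte,
// which is 8 on mainstream platforms but is not fixed by the standard.
template <typename T>
constexpr std::size_t storage_bits()
{
    return sizeof(T) * CHAR_BIT;
}

// Number of value bits of T (the sign bit of signed types is not counted).
template <typename T>
constexpr int value_bits()
{
    return std::numeric_limits<T>::digits;
}
```

For example, storage_bits<long long>() is at least 64 everywhere, and value_bits<unsigned char>() equals CHAR_BIT.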
 
  • #19
newjerseyrunner said:
Is unsigned long long different than unsigned long?
Almost always. Try this on your mac, a windows machine, and a linux machine:
Code:
#include <iostream>
#include <typeinfo>
#include <cstddef>

int main ()
{
    std::cout << typeid(unsigned long).name() << '\n';
    std::cout << typeid(unsigned long long).name() << '\n';
    std::cout << typeid(std::size_t).name() << '\n';
}

unsigned long and unsigned long long can be different types, and they are on your mac. (This killed me with some templates.) The underlying type of std::size_t is very system/compiler dependent.
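
The names printed by typeid are implementation specific, so a compile-time check with std::is_same may be easier to read. This is only a sketch, and the two constant names are made up for this example.

```cpp
#include <cstddef>
#include <type_traits>

// True when std::size_t is an alias for unsigned long on this platform.
constexpr bool size_t_is_unsigned_long =
    std::is_same<std::size_t, unsigned long>::value;

// True when std::size_t is an alias for unsigned long long on this platform.
constexpr bool size_t_is_unsigned_long_long =
    std::is_same<std::size_t, unsigned long long>::value;

// The two types are always distinct, even where they have the same size.
static_assert(!std::is_same<unsigned long, unsigned long long>::value,
              "unsigned long and unsigned long long are different types");
```

At most one of the two constants can be true on any given platform, and which one (if either) depends on the system and compiler.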
 
  • #20
D H said:
unsigned long and unsigned long long are different types (this killed me with some templates), and the underlying type of std::size_t is very system/compiler dependent.

Their typeids will always be different for long and long long; I was referring to their layout in memory. For templates the type is important; for holding numbers, the bit size is important. C++ can't convert between two template types (the issue isn't the type, it's how templates work, they have different function pointers), but it can easily implicitly convert between number types (you shouldn't, but you can.)

My output of your program showed std::size_t as the same type as unsigned long, not long long on my mac.
 
  • #21
newjerseyrunner said:
The standard specifies this:
unsigned shorts and ints must be at least 16 bits
unsigned longs must be at least 32 bits AND contain all valid pointers
unsigned long longs must be at least 64 bits AND contain all valid pointers
Neither the C nor the C++ standard specifies that (the "AND" part). There is no requirement that any of the integer types be able to represent a pointer. If in some implementation, some integer type is capable of containing a pointer, the implementation should define the type uintptr_t (or std::uintptr_t in c++), but even that is optional.
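
Where std::uintptr_t is available, a pointer can be round-tripped through it. Here is a minimal sketch; the function name round_trips is invented for this example, and the code only compiles on implementations that provide the optional type.

```cpp
#include <cstdint>

// Converts a pointer to std::uintptr_t and back, and reports whether the
// original pointer value was recovered.
bool round_trips(int* p)
{
    const std::uintptr_t as_integer = reinterpret_cast<std::uintptr_t>(p);
    int* back = reinterpret_cast<int*>(as_integer);
    return back == p;
}
```

No such guarantee exists for unsigned long or unsigned long long, which is the point being made above.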

The restrictions on size_t are amazingly few. The only requirement is that it is the type of the result of the sizeof operator, and the standards allow implementations to limit arrays and allocated memory to very small sizes.
 
  • #22
You are right, my mistake, I'm not sure where I read that, but it's not in the standard text.

If anyone is curious where I'm looking: http://open-std.org/JTC1/SC22/WG21/docs/papers/2015/n4527.pdf
That's a 2015 working draft of the standard, from after C++14 was published. I can't find the C++11 one.

It's worth noting though, the C++ Standards committee doesn't write any compilers. The standard is released usually long before compilers catch up to them. Clang for example, still doesn't support a select set of features from C++11. Compilers are also allowed to implement their own features: long long was just put in the standard, but most C++ compilers have had it for quite some time.
 

What are signed and unsigned integer expressions?

Signed and unsigned integer expressions are numerical values that are used to represent whole numbers. The main difference between the two is that signed integers can represent both positive and negative numbers, while unsigned integers can only represent non-negative numbers (zero and positive values).

How are signed and unsigned integer expressions stored in memory?

Signed and unsigned integer expressions are stored in memory using a specific number of bits, which determines the range of values that can be represented. For example, a 32-bit integer can store values from -2,147,483,648 to 2,147,483,647 for signed integers, and 0 to 4,294,967,295 for unsigned integers.
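
These limits can be checked at compile time with std::numeric_limits. The sketch below assumes the exact-width types std::int32_t and std::uint32_t are available, as they are on mainstream platforms; the constant names are made up for this example.

```cpp
#include <cstdint>
#include <limits>

// Limits of the exact-width 32-bit types.
constexpr std::int32_t  min_i32 = std::numeric_limits<std::int32_t>::min();
constexpr std::int32_t  max_i32 = std::numeric_limits<std::int32_t>::max();
constexpr std::uint32_t max_u32 = std::numeric_limits<std::uint32_t>::max();

static_assert(max_i32 == 2147483647, "largest 32-bit signed value");
static_assert(min_i32 == -max_i32 - 1, "smallest 32-bit signed value");
static_assert(max_u32 == 4294967295u, "largest 32-bit unsigned value");
```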

What are some common uses for signed and unsigned integer expressions?

Signed and unsigned integer expressions are commonly used in computer programming for mathematical calculations, storing and processing data, and representing values in various data structures such as arrays and lists.

What happens if a signed integer expression exceeds its maximum value?

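If a signed integer expression exceeds its maximum value, the result is an overflow: the value cannot be represented with the given number of bits. In C++, signed overflow is undefined behavior, so the result may be unexpected or incorrect. Unsigned arithmetic, by contrast, wraps around modulo 2^N, where N is the number of bits.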
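
A minimal C++ sketch of both cases follows. Unsigned arithmetic wraps around in a well-defined way, while a signed addition has to be checked before it is performed. The function names wrapping_increment and checked_add are invented for this example.

```cpp
#include <limits>

// Unsigned arithmetic wraps around modulo 2^N; this is well defined.
unsigned int wrapping_increment(unsigned int value)
{
    return value + 1u;
}

// Signed overflow is undefined behavior in C++, so check before adding.
// Returns false (leaving result untouched) when a + b would not fit.
bool checked_add(int a, int b, int& result)
{
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    if ((b > 0 && a > max - b) || (b < 0 && a < min - b))
    {
        return false;
    }
    result = a + b;
    return true;
}
```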

How can I convert between signed and unsigned integer expressions?

To convert between signed and unsigned integer expressions, you can use the conversion operators of your programming language, such as static_cast in C++. Check first that the value lies within the range of the destination type; a negative value converted to an unsigned type, for example, wraps around to a large positive one.
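
One possible shape for such range-checked conversions in C++ is sketched below. The function names to_size and to_int are hypothetical, and the sketch assumes std::size_t is at least as wide as int, which holds on mainstream platforms.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Converts an int to std::size_t, rejecting negative values instead of
// letting them wrap around to huge positive ones.
std::size_t to_size(int value)
{
    if (value < 0)
    {
        throw std::out_of_range("negative value cannot be a size");
    }
    return static_cast<std::size_t>(value);
}

// Converts a std::size_t to int, rejecting values too large for an int.
int to_int(std::size_t value)
{
    if (value > static_cast<std::size_t>(std::numeric_limits<int>::max()))
    {
        throw std::out_of_range("value does not fit in an int");
    }
    return static_cast<int>(value);
}
```

Throwing is only one option; returning a bool or an optional value would work just as well, depending on how the caller wants to handle the failure.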
