What are signed and unsigned chars?

adjacent · Nov 20, 2014

I am reading a book called C++ Primer, which mentions a C++ data type called char (character). I know that this data type can hold characters such as "A" or "5". But what do signed and unsigned char mean?
The book says: "an 8-bit unsigned char can hold values from 0 through 255 inclusive". What does values mean here?
I have Googled but found no good explanation. The book also does not explain this.

jedishrfu · Nov 20, 2014

ASCII is a 7-bit character set stored in 8-bit bytes (i.e., 1 byte per character), so every ASCII code, 0 to 127, fits in a signed char. This is the official ASCII character set:

http://en.wikipedia.org/wiki/ASCII

The codes from 128 to 255, which need the unsigned range, are the ones added by PC vendors like IBM and MS as extensions to ASCII. An example is the box-drawing characters used in the days when graphics weren't an option for computer displays (i.e., only characters could be displayed).

http://en.wikipedia.org/wiki/Code_page_437

Code pages vary from country to country, with glyphs relevant to each country replacing others.

adjacent · Nov 20, 2014

Oh, thank you very much. Do I have to know a lot about unsigned and signed chars to learn C++? I am not a computer student, so I don't know much about computers yet.

jtbell · Nov 20, 2014

A Google search for "C++ unsigned char" gave me this as its first hit:

http://stackoverflow.com/questions/75191/what-is-an-unsigned-char

It's from 2008, so the situation may have changed somewhat in newer versions of C++.

adjacent · Nov 20, 2014

jtbell said:

A Google search for "C++ unsigned char" gave me this as its first hit:

http://stackoverflow.com/questions/75191/what-is-an-unsigned-char

It's from 2008, so the situation may have changed somewhat in newer versions of C++.

Yes, I read that. The part that confused me was this:

"signed char, which gives you at least the -127 to 127 range. (-128 to 127 is common)
unsigned char, which gives you at least the 0 to 255 range."

What does values mean here? I am guessing that each character is represented by a specific value, like in that code page given above by jedishrfu.

nsaspook · Nov 20, 2014

When using C or C++ for embedded or systems programming, the signedness and bit size of integers become very important (and confusing when using 'char' and 'int'). Bit patterns (the possible values within a variable's range, determined purely by its bit size) are manipulated with logical and arithmetic operations that also read the sign from those patterns, so a large in-range unsigned 8-bit 'uint8_t' number can be a small (negative) signed 8-bit 'int8_t' number with the same bits. So there is usually a header that defines standard types, such as <stdint.h> in C99 or <cstdint> in C++11.
http://embeddedgurus.com/stack-over...t-c-tips-1-choosing-the-correct-integer-size/

There are esoteric reasons for the precise use of signedness that are mainly concerned with optimizing software performance.
http://clip.dia.fi.upm.es/~jorge/docs/wrapped-intervals-aplas12.pdf

jtbell · Nov 20, 2014

8 bits can store a set of binary bit patterns ranging from 00000000 to 11111111. What those patterns mean as integers depends on whether the data type is specified as 'signed char' or 'unsigned char'. Try the following program:

Code:

#include <iostream>
using namespace std;

int main ()
{
// hexadecimal 61 = binary 01100001
    unsigned char my_unsigned_char = 0x61;
    signed char my_signed_char = 0x61;

// display them as ints
    cout << "Unsigned: " << int(my_unsigned_char) << endl;
    cout << "Signed: " << int(my_signed_char) << endl;

// hexadecimal E1 = binary 11100001
    my_unsigned_char = 0xE1;
    my_signed_char = 0xE1;

// display them as ints
    cout << "Unsigned: " << int(my_unsigned_char) << endl;
    cout << "Signed: " << int(my_signed_char) << endl;

    return 0;
}

voko · Nov 20, 2014

Most computers, including all mainstream ones, operate with binary digits (bits), which are grouped into bytes (8 bits), words (16 bits), double words (32 bits) and quad-words (64 bits). These entities can be used in various ways, but very frequently they are used to represent integers. Integers, in turn, can be signed or unsigned.

In C and C++, 'char' is an integer of the smallest width, but no less than 8 bits; in most practical cases it is exactly 8 bits, i.e., a byte. Because it is an integer, it can be signed or unsigned. One byte can represent 256 different integer values; for signed characters they are typically -128 ... 127, and for unsigned characters they are 0 ... 255.

These values are in turn re-interpreted as symbols. For example, the value 65, which is the same for both signed and unsigned, is interpreted as symbol 'A'.

If the char type is used only to store characters (i.e., symbols), then the difference between the signed and unsigned char types is practically absent. But in certain cases, char types are used for arithmetic, in which case the difference may be crucial. An example of such arithmetic is the Base-64 encoding.

Mark44 · Nov 20, 2014

adjacent said:

What does values mean here? I am guessing that each character is represented by a specific value, like in that code page given above by jedishrfu.

The only things that can be stored on a computer are numbers, which means that characters are stored as numbers. For example, the character 'A' is stored as a bit pattern 0100 0001 in binary, or 65 as a decimal number. In C you can use printf to display a character in various forms.

Code:

#include <stdio.h>
int main(void)
{
    char ch = 'A';
    printf("Decimal: %d\n", ch);    // Prints 65, the decimal form
    printf("Hex: %x\n", ch);        // Prints 41, the hexadecimal form
    printf("Octal: %o\n", ch);      // Prints 101, the octal form
    printf("Character: %c\n", ch);  // Prints A, the character form
    return 0;
}

The underlying value of the variable ch is 0100 0001. We get four different representations in the code above because printf is converting from the bit pattern to representations in other bases or other forms.

adjacent · Nov 21, 2014

Thanks a lot everyone, I think I should learn a bit about computers before learning to program in C++. I am glad that I chose to learn C++.
