What is a signed int? an unsigned int? a char?

In the C language, the natural numbers are known as unsigned integers, while the integers are known as signed integers. Unsigned integers are implemented by the data types unsigned char and unsigned int. Signed integers are implemented by the data types char and int. To create variables u, n, c, and i of these types, just write the declarations unsigned char u; unsigned int n; char c; int i;

(In place of unsigned int you can simply write unsigned.) There are also the types short int and long int and the corresponding unsigned versions, but this site seldom uses them.

The unsigned char type

An unsigned char is an integer in the interval 0 . . 2⁸−1, that is, 0 . . 255. Each unsigned char is represented by 1 byte using binary notation.

The integers outside the interval 0 . . 255 are reduced modulo 2⁸, that is, represented by the remainder on division by 256. In other words, every unsigned integer N is represented by the unsigned char u for which the difference N − u is a multiple of 256.
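The reduction modulo 2⁸ can be observed directly, because C defines the conversion of an integer to an unsigned type as exactly this reduction. The following is a minimal sketch, assuming 8-bit bytes as the text does; the function name reduce_to_uchar is hypothetical and was chosen only for this illustration.

```c
#include <assert.h>
#include <limits.h>

// Assumption of this sketch: each byte has 8 bits, as in the text.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Hypothetical helper: returns the unsigned char that represents N.
// The conversion to unsigned char reduces N modulo 2^8 = 256.
unsigned char reduce_to_uchar(unsigned int N) {
   return (unsigned char) N;
}
```

For instance, reduce_to_uchar (256) is 0, reduce_to_uchar (257) is 1, and reduce_to_uchar (300) is 44, since 300 = 1×2⁸ + 44.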

Example. To show an example that will fit on the page, we shall pretend that each byte has only 4 bits. Then the value of every unsigned char belongs to the interval 0 . . 2⁴−1. This interval can be pictured as a circle, which suggests the reduction modulo 2⁴.

For example, 16 is represented by 0 since 16 = 1×2⁴ + 0. Likewise, 17 is represented by 1 since 17 = 1×2⁴ + 1. Similarly, 36 is represented by 4 since 36 = 2×2⁴ + 4.

Characters. In spite of the char in its name, the type unsigned char does not necessarily represent characters. But unsigned chars smaller than 128 can be used to represent the characters in the ASCII alphabet.

The char type

A char is an integer in the interval −2⁷ . . 2⁷−1, that is, in −128 . . 127. (Actually, the C language standard does not impose this interval. Some implementations may use the interval 0 . . 2⁸−1. But we shall ignore this possibility.) Each char is represented by 1 byte using two's complement notation.

Integers outside the interval −128 . . 127 are reduced modulo 2⁸. In other words, every integer N is represented by the char c for which the difference N − c is an integer (positive or negative) multiple of 256.

In spite of its name, the type char does not necessarily represent characters. To stress this, we shall often use byte as a stand-in for char, which can be arranged with a declaration such as typedef char byte.

Example. Let's pretend, in this example, that every byte has only 4 bits. Then the value of every char belongs to the interval −8 . . 7. This interval can be pictured as a circle, which suggests the reduction modulo 2⁴.

For example, 8 modulo 2⁴ is −8 (since 8 = 1×2⁴ − 8), 9 modulo 2⁴ is −7 (since 9 = 1×2⁴ − 7), and −30 modulo 2⁴ is 2 (since −30 = −2×2⁴ + 2).
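The 4-bit reduction used in this example can be sketched in ordinary int arithmetic. The function below is only an illustration under the pretend assumption of 4-bit bytes; its name, reduce_to_4bit_signed, is hypothetical.

```c
#include <assert.h>

// Hypothetical helper: returns the value in -8..7 that differs from N
// by a multiple of 2^4 = 16 (the pretend 4-bit char that represents N).
int reduce_to_4bit_signed(int N) {
   int r = N % 16;        // the C remainder has the sign of N: -15..15
   if (r < 0) r += 16;    // now r is in 0..15
   if (r >= 8) r -= 16;   // now r is in -8..7
   assert(r >= -8 && r <= 7);
   return r;
}
```

With this helper, reduce_to_4bit_signed (8) is −8, reduce_to_4bit_signed (9) is −7, and reduce_to_4bit_signed (-30) is 2, in agreement with the values computed above.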

Characters. Numbers of type char between 0 and 127 may be used to represent characters in the ASCII alphabet. In this interval, a char and an unsigned char have the same pattern of 8 bits and therefore represent the same character.

Exercises 1

Write, in binary notation, the smallest and the largest value that an unsigned char can have. Repeat the exercise for char.
Write, in hexadecimal notation, the smallest and the largest value that an unsigned char can have. Repeat the exercise for char.
Which unsigned char represents the number 1000 (one thousand)? Which char represents 1000? Which char represents −1000?
Write and test a program to display on the screen the characters represented by chars 32 to 127. (See printing in %c format in the Input and output chapter.) Display ten characters per line.
Write and test a program that attempts to display on the screen the characters represented by the bytes whose first bit is 1. (See printing in %c format in the Input and output chapter.)
Write a program to read two characters of the ASCII alphabet typed in by the user and say whether the first comes before or after the second in the ASCII table. (See input format %c in the Input and output chapter.)
The ctype library. Study the documentation of the functions in the ctype library. The function isspace, for example, decides whether a given ASCII character is a white-space character. The function isalpha decides whether a given ASCII character is a letter.

The unsigned int type

The hardware of every computer works with blocks of s consecutive bytes, where s can be 1, 2, 4 or 8 depending on the machine. Each block of s consecutive bytes is a word. Each word can have 2⁸ˢ different values.

An unsigned int is an integer in the interval 0 . . 2⁸ˢ − 1. Each unsigned int is represented in one word using binary notation. The value of s is given by the expression sizeof (unsigned int) and the number 2⁸ˢ − 1 is recorded in the constant UINT_MAX defined in the limits.h interface.

Integers greater than UINT_MAX are reduced modulo UINT_MAX + 1. Hence, every positive integer N is represented by the unsigned int n for which the difference N − n is an integer multiple of UINT_MAX + 1.

From this point on, the examples assume that s = 4. Hence, UINT_MAX is 2³²−1, equal to 4294967295.

Unsigned int arithmetic. The operations of addition, subtraction, and multiplication between numbers of type unsigned int are subject to overflow since the exact result of an operation can fall outside the range 0..UINT_MAX. (Overflows are unusual because most of the time we work with small numbers.) Overflows are not treated as errors and the exact result of each operation is quietly reduced modulo UINT_MAX + 1. For example, if n has value 4294967295 and m has value 1, the sum n + m has value 0.
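The quiet reduction modulo UINT_MAX + 1 can be checked with a few small functions. This is a minimal sketch that assumes s = 4, as the text does from this point on; the names wrap_add, wrap_sub, and wrap_mul are hypothetical.

```c
#include <assert.h>
#include <limits.h>

// Assumption of this sketch: s = 4, so UINT_MAX is 2^32 - 1.
static_assert(UINT_MAX == 4294967295u, "this sketch assumes 4-byte unsigned int");

// Each operation is carried out in unsigned int arithmetic, so the
// exact result is reduced modulo UINT_MAX + 1 without any error.
unsigned int wrap_add(unsigned int n, unsigned int m) { return n + m; }
unsigned int wrap_sub(unsigned int n, unsigned int m) { return n - m; }
unsigned int wrap_mul(unsigned int n, unsigned int m) { return n * m; }
```

For instance, wrap_add (4294967295, 1) is 0, wrap_sub (0, 1) is 4294967295, and wrap_mul (65536, 65536) is 0 because 65536 × 65536 = 2³².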

The quotient of division between unsigned ints is truncated: the expression 9/2, for example, has value ⌊9/2⌋, that is, the floor of 9/2.

The int type

An int is an integer in the interval −2⁸ˢ⁻¹ . . 2⁸ˢ⁻¹ − 1. Each int is represented in s consecutive bytes using two's complement notation. The value of s is given by the expression sizeof (int), which is equal to sizeof (unsigned int). The numbers −2⁸ˢ⁻¹ and 2⁸ˢ⁻¹ − 1 are recorded in the constants INT_MIN and INT_MAX respectively, both defined in the limits.h interface. The difference INT_MAX − INT_MIN is of course equal to UINT_MAX.

Integers outside the interval INT_MIN..INT_MAX are reduced modulo UINT_MAX + 1. Hence, every integer N is represented by the int i for which the difference N − i is a (positive or negative) integer multiple of UINT_MAX + 1.

From this point on, we assume that s = 4. Hence, INT_MIN is −2³¹, equal to −2147483648, and INT_MAX is 2³¹−1, equal to 2147483647.
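The rule that associates each integer N with an int can be sketched without ever overflowing, by doing the reduction in a wider type. The function below is an illustration only: it assumes s = 4 and a long long type wider than 32 bits, and the name reduce_to_int is hypothetical.

```c
#include <assert.h>
#include <limits.h>

// Assumption of this sketch: s = 4, so INT_MAX is 2^31 - 1.
static_assert(INT_MAX == 2147483647, "this sketch assumes 4-byte int");

// Hypothetical helper: returns the int i in INT_MIN..INT_MAX such that
// N - i is a multiple of UINT_MAX + 1 = 2^32. All the arithmetic is
// done in long long, so no int overflow occurs.
int reduce_to_int(long long N) {
   const long long M = 4294967296LL;   // UINT_MAX + 1
   long long r = N % M;                // r is in -(M-1)..M-1
   if (r < INT_MIN) r += M;
   if (r > INT_MAX) r -= M;
   return (int) r;
}
```

For instance, reduce_to_int (2147483648) is INT_MIN and reduce_to_int (-2147483649) is INT_MAX.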

Int arithmetic. The operations of addition, subtraction, and multiplication between numbers of type int are subject to overflow since the exact result of an operation may fall outside the range INT_MIN..INT_MAX. (Overflows are unusual because most of the time we work with small numbers.) Overflows may go unnoticed because they are not reported as errors; on most machines the exact result of the operation is quietly reduced modulo UINT_MAX + 1. (Strictly speaking, the C standard leaves the result of an int overflow undefined, so a program should not rely on this behavior.) For example, if i and j both have value 2147483000, the sum i + j typically has value −1296.

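The assignment of an unsigned int to an int can also result in an overflow and is then typically done modulo UINT_MAX + 1 (the C standard leaves the exact result to the implementation). For example, assigning the unsigned int 2147483650 to an int typically produces −2147483646.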

The division operation between ints is truncated: the expression 9/2, for example, has value ⌊9/2⌋, that is, the floor of 9/2. In the case of strictly negative numbers, the result of the division is truncated towards zero: the expression -9/2 has value −⌊9/2⌋ rather than ⌊−9/2⌋.
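The difference between truncation towards zero and the floor can be made concrete with two small functions. This is a sketch for illustration; the names quotient and floor_quotient are hypothetical, and floor_quotient is not part of the C language but is built here from the / and % operators.

```c
#include <assert.h>

// The C division operator on ints: the quotient is truncated towards zero.
int quotient(int a, int b) {
   assert(b != 0);
   return a / b;
}

// Hypothetical helper for comparison: the floor of the exact quotient a/b.
int floor_quotient(int a, int b) {
   assert(b != 0);
   int q = a / b;
   // When the division is inexact and the operands have opposite signs,
   // truncation rounded up, so step down by one to reach the floor.
   if (a % b != 0 && (a < 0) != (b < 0)) --q;
   return q;
}
```

For instance, quotient (9, 2) is 4 and quotient (-9, 2) is −4, while floor_quotient (-9, 2) is −5.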

Exercises 2

Sizeof. Compile and execute the following program:

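#include <limits.h>   // UINT_MAX, INT_MIN, INT_MAX
#include <stdio.h>    // printf
#include <stdlib.h>   // EXIT_SUCCESS

int main (void) {
   // sizeof yields a value of type size_t, printed with %zu
   printf ("sizeof (unsigned): %zu\n",
            sizeof (unsigned));
   printf ("UINT_MAX: %u\n",
            UINT_MAX);
   printf ("sizeof (int): %zu\n",
            sizeof (int));
   printf ("INT_MIN: %d\nINT_MAX: %d\n",
            INT_MIN,     INT_MAX);
   return EXIT_SUCCESS;
}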

Write the numbers UINT_MAX, INT_MIN, and INT_MAX in hexadecimal notation.
Suppose that we must count the number of occurrences of some event using a computer in which sizeof (unsigned) is 2. We know beforehand that the event does not occur more than 65535 times. Should we use a variable of type unsigned to count the occurrences of the event? What if the event may occur more than 65535 times? Propose a solution that uses a linked list to represent a counter in base 100.
Suppose that sizeof (unsigned) is 2 on your machine. What is the value of 60000 + 30000 in unsigned int arithmetic? What is the value of 30000 + 15000 in int arithmetic? What is the value of 60000 × 11 in unsigned int arithmetic? What is the value of 30000 × 2 in int arithmetic?
★ Detecting overflow. Write a boolean function that will receive unsigned ints n and m and decide (true or false) whether the addition of n and m will produce an overflow. Repeat the exercise assuming that n and m are of type int.
★ How far? Write a program to receive an unsigned int n and print the powers n², n³, n⁴, n⁵, etc. Your program must stop only when it is not able to store the next power in a variable of type unsigned int.

Constants

Integer constants, such as the 999 in an assignment like i = 999, are treated as if they were of type int, and so their values must lie between INT_MIN and INT_MAX.

Character constants, such as the 'a' in an assignment like c = 'a', are wrapped in single quotes to distinguish them from names of variables. They are restricted to the ASCII alphabet. The value of a character constant is given by the ASCII table (hence, 'a' is the same as 97).

Character constants are of type int, rather than of type char, as one might expect. But when a constant is assigned to a variable of type char, it is converted into a byte (for example, 'a' is converted into 01100001).

As an illustration, a loop in which a char variable runs from 'A' to 'Z', and in which each value is printed in %c format, displays the twenty-six uppercase letters of the ASCII alphabet. Note that these letters occupy consecutive positions in the ASCII table and are in the usual alphabetic order. Something similar happens with the lowercase letters.
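A loop over the uppercase letters might look like the sketch below. It is not necessarily the fragment the original page had in mind; the function name uppercase_letters is hypothetical, and the sketch collects the letters in a string before printing them so that the result is easy to inspect.

```c
#include <assert.h>
#include <stdio.h>

// Assumption of this sketch: the letters A..Z are consecutive, as in ASCII.
static_assert('Z' - 'A' == 25, "this sketch assumes consecutive uppercase letters");

// Hypothetical helper: stores the string "ABC...Z" in buf,
// which must have room for 26 letters and the final '\0'.
void uppercase_letters(char buf[27]) {
   int k = 0;
   for (char c = 'A'; c <= 'Z'; ++c)
      buf[k++] = c;
   buf[k] = '\0';
   printf ("%s\n", buf);
}
```

After the call uppercase_letters (buf), the array buf holds the twenty-six letters in alphabetic order, with 'A' in buf[0] and 'Z' in buf[25].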

Exercises 3

What is the difference between 'O', 'o', '0', '\0' and 0?

What is the effect of the following code fragment?

for (int i = 1; i <= 26; ++i) 
   printf ("%c\n", 'a' + i - 1);

Mixed type arithmetic expressions

Arithmetic operations between operands of type char and/or unsigned char are not executed modulo 2⁸, as one could expect. In such operations, all the operands are first converted (promoted) to type int and the operation is executed in int arithmetic.

For example, if the variables u and v are of type unsigned char and their values are 255 and 2 respectively, the expression u + v is of type int and has value 257. (But the assignment of u + v to a variable of type unsigned char is, of course, done modulo 2⁸.)

Similar considerations apply to arithmetic expressions on operands of different types, such as int, char, and unsigned char. For example, if a variable c is of type char and has value 127, the expression c + 2 is of type int and has value 129 (the constant 2 is of type int by definition).
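The promotion to int described in this section can be checked with the values used in the text. The following is a minimal sketch, assuming 8-bit bytes; the function names sum_as_int, sum_as_uchar, and char_plus_two are hypothetical.

```c
#include <assert.h>
#include <limits.h>

// Assumption of this sketch: each byte has 8 bits.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// u and v are promoted to int before the addition,
// so the sum is computed in int arithmetic.
int sum_as_int(unsigned char u, unsigned char v) {
   return u + v;
}

// Converting the int sum back to unsigned char reduces it modulo 2^8.
unsigned char sum_as_uchar(unsigned char u, unsigned char v) {
   return (unsigned char) (u + v);
}

// c is promoted to int and the constant 2 is already of type int.
int char_plus_two(char c) {
   return c + 2;
}
```

With u equal to 255 and v equal to 2, sum_as_int (u, v) is 257 while sum_as_uchar (u, v) is 1. Likewise, char_plus_two (127) is 129.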

Exercises 4