In the C language, the natural numbers are known as unsigned integers, while the integer numbers are known as signed integers. Unsigned integers are implemented by the data types unsigned char and unsigned int. Signed integers are implemented by the data types char and int. To create variables u, n, c, and i of these types, just write
unsigned char u; unsigned int n; char c; int i;
(In place of unsigned int you can simply write unsigned.)
There are also types short int and long int and the corresponding unsigned versions, but this site uses them very seldom.
An unsigned char is an integer in the interval 0 . . 2^{8}−1, that is, 0 . . 255. Each unsigned char is represented by 1 byte using binary notation.
The integers outside the interval 0 . . 255 are reduced modulo 2^{8}, that is, represented by the remainder on division by 256. In other words, every unsigned integer N is represented by the unsigned char u for which the difference N − u is a multiple of 256.
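The reduction described above can be observed directly with a conversion. The following is only a sketch: the function name to_uchar is hypothetical, and the expected values assume that a byte has 8 bits.

```c
/* Returns the unsigned char that represents n, that is, n reduced
   modulo 2^8 (assuming 8-bit bytes). The C standard defines the
   conversion to a narrower unsigned type to work exactly this way. */
unsigned char to_uchar (unsigned int n) {
   return (unsigned char) n;
}
```

For instance, to_uchar (300) is 44 because 300 − 44 is a multiple of 256.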
Example. To show an example that will fit on the page, we shall pretend that each byte has only 4 bits. Then the value of every unsigned char belongs to the interval 0 . . 2^{4}−1. This interval can be represented by a circle to suggest the reduction modulo 2^{4}:
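[Figure: a circle with the numbers 0 to 15 around it. Starting at 0 and going one way come 1, 2, . . . , 7; going the other way come 15, 14, . . . , 9; the number 8 sits opposite 0.]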
For example, 16 is represented by 0 since 16 = 1×2^{4} + 0. Likewise, 17 is represented by 1 since 17 = 1×2^{4} + 1. Similarly, 36 is represented by 4 since 36 = 2×2^{4} + 4.
Characters. In spite of the char in its name, the type unsigned char does not necessarily represent characters. But unsigned chars smaller than 128 can be used to represent the characters in the ASCII alphabet.
A char is an integer in the interval −2^{7} . . 2^{7}−1, that is, in −128 . . 127. (Actually, the C language standard does not impose this interval. Some implementations may use the interval 0 . . 2^{8}−1. But we shall ignore this possibility.) Each char is represented by 1 byte using two's complement notation.
Integers outside the interval −128 . . 127 are reduced modulo 2^{8}. In other words, every integer N is represented by the char c for which the difference N − c is an integer (positive or negative) multiple of 256.
In spite of its name, the type char does not necessarily represent characters. To stress this, we shall often use byte as a stand-in for char:
typedef char byte;
Example. Let's pretend, in this example, that every byte has only 4 bits. Then, the value of every char belongs to the interval −8 . . 7. This interval can be represented by a circle to suggest the reduction modulo 2^{4}:
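[Figure: a circle with the numbers −8 to +7 around it. Starting at +0 and going one way come +1, +2, . . . , +7; going the other way come −1, −2, . . . , −7; the number −8 sits opposite +0.]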
For example, 8 modulo 2^{4} is −8 (since 8 = 1×2^{4} − 8), 9 modulo 2^{4} is −7 (since 9 = 1×2^{4} − 7), and −30 modulo 2^{4} is 2 (since −30 = −2×2^{4} + 2).
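The arithmetic of this 4-bit example can be checked with a small function. This is only a sketch of the pretend 4-bit char; the name reduce4 is hypothetical, and the code works on ordinary ints rather than on real chars.

```c
/* Reduces n modulo 2^4 into the interval -8 . . 7, mimicking a
   pretend 4-bit char in two's complement notation. */
int reduce4 (int n) {
   int r = n % 16;        /* in C, r has the sign of n: -15 . . 15 */
   if (r < 0) r += 16;    /* now r is in 0 . . 15 */
   if (r >= 8) r -= 16;   /* the upper half maps to the negatives */
   return r;
}
```

With this definition, reduce4 (8) is −8, reduce4 (9) is −7, and reduce4 (-30) is 2, in agreement with the examples above.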
Characters. Numbers of type char between 0 and 127 may be used to represent characters in the ASCII alphabet. In this interval, a char and an unsigned char have the same pattern of 8 bits and therefore represent the same character.
The hardware of every computer works with blocks of s consecutive bytes, where s can be 1, 2, 4 or 8 depending on the machine. Each block of s consecutive bytes is a word. Each word can have 2^{8s} different values.
An unsigned int is an integer in the interval 0 . . 2^{8s}−1. Each unsigned int is represented in one word using binary notation. The value of s is given by the expression sizeof (unsigned int) and the number 2^{8s}−1 is recorded in the constant UINT_MAX defined in the limits.h interface.
Integers greater than UINT_MAX are reduced modulo UINT_MAX + 1. Hence, every positive integer N is represented by the unsigned int n for which the difference N − n is an integer multiple of UINT_MAX + 1.
From this point on, the examples assume that s = 4. Hence, UINT_MAX is 2^{32}−1, equal to 4294967295.
Unsigned int arithmetic. The operations of addition, subtraction, and multiplication between numbers of type unsigned int are subject to overflow since the exact result of an operation can fall outside the range 0..UINT_MAX. (Overflows are unusual because most of the time we work with small numbers.) Overflows are not treated as errors and the exact result of each operation is quietly reduced modulo UINT_MAX + 1. For example:
unsigned int n, m, x;
n = 4000000000;
m = 300000000;
x = n + m; // overflow
// x == 5032704
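Since an unsigned overflow is not reported, a program that cares about it has to test for it before adding. One possible test is sketched below; the name uadd_overflows is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if the exact sum n + m does not fit in an unsigned
   int. The subtraction UINT_MAX - m can never overflow. */
bool uadd_overflows (unsigned int n, unsigned int m) {
   return n > UINT_MAX - m;
}
```

Assuming s = 4, the call uadd_overflows (4000000000u, 300000000u) returns true.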
The quotient of division between unsigned ints is truncated: the expression 9/2, for example, has value ⌊9/2⌋, that is, the floor of 9/2.
An int is an integer in the interval −2^{8s−1} . . 2^{8s−1}−1. Each int is represented in s consecutive bytes using two's complement notation. The value of s is given by the expression sizeof (int), which is equal to sizeof (unsigned int). The numbers −2^{8s−1} and 2^{8s−1}−1 are recorded in the constants INT_MIN and INT_MAX respectively, both defined in the limits.h interface. The difference INT_MAX − INT_MIN is of course equal to UINT_MAX.
Integers outside the interval INT_MIN..INT_MAX are reduced modulo UINT_MAX + 1. Hence, every integer N is represented by the int i for which the difference N − i is a (positive or negative) integer multiple of UINT_MAX + 1.
From this point on, we assume that s = 4. Hence, INT_MIN is −2^{31}, equal to −2147483648, and INT_MAX is 2^{31}−1, equal to 2147483647.
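The relation between INT_MAX, INT_MIN, and UINT_MAX can be checked by a program, as long as the subtraction is not done in int arithmetic, where it would overflow. The sketch below assumes that long long is wider than int (true when s = 4); the name int_range_width is hypothetical.

```c
#include <limits.h>

/* Computes INT_MAX - INT_MIN exactly. The operands are converted
   to long long first because the result does not fit in an int. */
long long int_range_width (void) {
   return (long long) INT_MAX - (long long) INT_MIN;
}
```

The value returned is equal to UINT_MAX, that is, 4294967295 when s = 4.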
Int arithmetic. The operations of addition, subtraction, and multiplication between numbers of type int are subject to overflow since the exact result of an operation may fall outside the range INT_MIN..INT_MAX. (Overflows are unusual because most of the time we work with small numbers.) Overflows may go unnoticed because they are not reported as errors. Strictly speaking, the C standard leaves the result of an int overflow undefined, but on most machines the exact result of each operation is reduced modulo UINT_MAX + 1. For example:
int i, j, x;
i = 2147483000;
j = 2147483000;
x = i + j; // overflow
// x == -1296
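Because an int overflow goes unreported, it can be worth testing for it before the addition is carried out. The function below is a sketch of one such test; the name add_overflows is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if the exact sum i + j falls outside the range
   INT_MIN..INT_MAX. Neither comparison can itself overflow. */
bool add_overflows (int i, int j) {
   if (j > 0) return i > INT_MAX - j;
   if (j < 0) return i < INT_MIN - j;
   return false;   /* adding 0 never overflows */
}
```

Assuming s = 4, the call add_overflows (2147483000, 2147483000) returns true.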
The assignment of an unsigned int to an int can also result in an overflow and is therefore done modulo UINT_MAX + 1. For example:
int i;
unsigned int n;
n = 2147483700;
i = n; // overflow
// i == -2147483596
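The reduction modulo UINT_MAX + 1 that takes place in such an assignment can be spelled out with wider arithmetic. The following sketch assumes that long long is wider than int; the name as_int_value is hypothetical.

```c
#include <limits.h>

/* Returns, as a long long, the value in INT_MIN..INT_MAX that
   differs from n by a multiple of UINT_MAX + 1. */
long long as_int_value (unsigned int n) {
   if (n <= (unsigned int) INT_MAX)
      return (long long) n;                       /* fits as it is */
   return (long long) n - ((long long) UINT_MAX + 1);
}
```

Assuming s = 4, as_int_value (2147483700u) is −2147483596, the same value as in the example above.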
The division operation between ints is truncated: the expression 9/2, for example, has value ⌊9/2⌋, that is, the floor of 9/2. In the case of strictly negative numbers, the result of the division is truncated towards zero: the expression -9/2 has value −⌊9/2⌋ rather than ⌊−9/2⌋.
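When the floor of a quotient is needed even for negative operands, the truncated quotient has to be adjusted. A possible adjustment is sketched below; the name floor_div is hypothetical and the function assumes that b is different from 0.

```c
/* Returns the floor of a/b, for b != 0. */
int floor_div (int a, int b) {
   int q = a / b;   /* truncated towards zero */
   /* If the division is not exact and the operands have opposite
      signs, truncation rounded up, so step down by one. */
   if (a % b != 0 && ((a < 0) != (b < 0)))
      q -= 1;
   return q;
}
```

So -9/2 has value −4, while floor_div (-9, 2) has value −5.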
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main (void) {
   printf ("sizeof (unsigned): %zu\n", sizeof (unsigned));
   printf ("UINT_MAX: %u\n", UINT_MAX);
   printf ("sizeof (int) = %zu\n", sizeof (int));
   printf ("INT_MIN: %d\nINT_MAX: %d\n", INT_MIN, INT_MAX);
   return EXIT_SUCCESS;
}
The code of almost every C program contains integer constants. Many programs also have character constants (also known as literals). For example:
a = 999; c = 'a';
Integer constants, like the 999 in the example, are treated as if they were of type int, and their values must be between INT_MIN and INT_MAX.
The character constants, like the 'a' in the example above, are wrapped in single quotes to distinguish them from names of variables. They are restricted to the ASCII alphabet. The value of a character constant is given by the ASCII table (hence, 'a' is the same as 97).
Character constants are of type int, rather than of type char, as one might expect. But when a constant is assigned to a variable of type char, it is converted into a byte (for example, 'a' is converted into 01100001).
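The type of a character constant can be observed through the sizeof operator. The two functions below are only a sketch, with hypothetical names; they must be compiled as C, since C++ treats character constants differently.

```c
#include <stddef.h>

/* Size in bytes of the constant 'a', which is of type int. */
size_t char_constant_size (void) {
   return sizeof 'a';
}

/* Size in bytes of a char variable that holds 'a': always 1. */
size_t char_variable_size (void) {
   char c = 'a';
   return sizeof c;
}
```

The first function returns sizeof (int) and the second returns 1.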
As an illustration, the following code fragment displays the twenty-six uppercase letters of the ASCII alphabet. Note that these letters occupy consecutive positions in the ASCII table and are in the usual alphabetic order. Something similar happens with the lowercase letters.
char c; for (c = 'A'; c <= 'Z'; ++c) printf ("%c ", c);
for (int i = 1; i <= 26; ++i) printf ("%c\n", 'a' + i - 1);
Arithmetic operations between operands of type char and/or unsigned char are not executed modulo 2^{8}, as one could expect. In such operations, all the operands are first converted (promoted) to type int and the operation is executed in int arithmetic.
For example, if the variables u and v are of type unsigned char and their values are 255 and 2 respectively, the expression u + v is of type int and has value 257. (But the assignment of u + v to a variable of type unsigned char is, of course, done modulo 2^{8}.)
Similar considerations apply to arithmetic expressions on operands of different types, such as int, char, and unsigned char. For example, if a variable c is of type char and has value 127, the expression c + 2 is of type int and has value 129 (the constant 2 is of type int by definition).
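The promotion to int can be made visible with sizeof, which reports the size of the type of an expression. The functions below are a sketch with hypothetical names.

```c
#include <stddef.h>

/* Size in bytes of the expression u + v. The operands are promoted
   to int, so the result is sizeof (int) rather than 1. */
size_t size_of_sum (unsigned char u, unsigned char v) {
   return sizeof (u + v);
}

/* The char c is promoted to int before the addition, so the exact
   sum is returned even when it exceeds 127. */
int char_plus_two (char c) {
   return c + 2;
}
```

For instance, char_plus_two (127) returns 129.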
unsigned char u, v, w; u = 255; v = 2; printf ("%d", u + v); w = u + v; printf ("%d", w);
unsigned char u; for (u = 0; u < 256; ++u) printf (".");
char c; for (c = 0; c < 128; ++c) printf (".");
Some applications require large integer numbers that do not fit into an int. To deal with these applications, C has the data type long int, which usually occupies more bytes than the type int. On my machine, a long int occupies 8 bytes, that is, 64 bits. (But see the page Is there any need of “long” data type in C and C++? in GeeksforGeeks.)
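As an illustration, a product of two ints that would overflow in int arithmetic can be computed in long int arithmetic. This sketch assumes a machine where a long int occupies 8 bytes and an int occupies 4; the name product is hypothetical.

```c
/* Returns the exact product of a and b, assuming that a long int
   has 8 bytes and an int has 4. The operands are converted before
   the multiplication, so it is carried out in long int arithmetic. */
long product (int a, int b) {
   return (long) a * (long) b;
}
```

Under these assumptions, product (100000, 100000) is 10000000000, a number that does not fit into a 4-byte int.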
Answer: The data type int32_t, for example, occupies exactly 4 bytes regardless of the value of sizeof (int). The other types are defined similarly. For some applications, the use of these types improves the portability of the programs. But since the present site is more concerned with the algorithms than with portability and implementation details, we shall continue to use int and unsigned. (See the answer of Matt Whiting to the question Why does the C library have their own Int and other datatypes? on Quora.)
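A short sketch of how these fixed-width types might be used: the types int32_t and int64_t come from the stdint.h interface, while the function name wide_sum is hypothetical.

```c
#include <stdint.h>

/* Adds two 32-bit integers in 64-bit arithmetic, so the exact sum
   always fits, whatever the size of int on the machine. */
int64_t wide_sum (int32_t a, int32_t b) {
   return (int64_t) a + (int64_t) b;
}
```

For instance, wide_sum (2147483000, 2147483000) is 4294966000, with no overflow.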