Input and output

This chapter describes, rather superficially, the most important input and output functions of the C language.  They are all in the stdio library. To access this library, your program must include the interface of the library by means of

#include <stdio.h>

Keyboard and screen

The function  printf  (the name is a shorthand for print formatted) prints on the screen a formatted list of numbers, characters, strings, etc.  The first argument of the function is a string that specifies the printing format.

The function  scanf  (shorthand for scan formatted) reads from the keyboard a list of numbers, characters, strings, etc.  The first argument of the function is a string that specifies the format of the list.  The other arguments are the addresses of the variables where the read values must be stored.  The function treats all the white-spaces as if they were ordinary spaces (i.e., ' ' characters).  Here is an example:

#include <stdio.h>
#include <stdlib.h>

int main (void) {
   int a, b;
   scanf ("%d %d", &a, &b);
   double avg;
   avg = (a + b)/2.0;
   printf ("The average of %d and %d is %f\n", a, b, avg);
   return EXIT_SUCCESS;
}

If the name of the program is average, we shall see the following exchange on the screen (the user types the second line; the computer writes the third):

~$ ./average
222 333
The average of 222 and 333 is 277.500000
~$

To print the ASCII character represented by a char or unsigned char variable, use the function printf with format specification  %c.  For example, the code fragment

char a = 65; // equivalently, a = 'A';
printf ("%d %c", a, a);

displays

65 A

To read an ASCII character from the keyboard, use the function scanf with format specification %c. By itself, %c reads the very next character, even if it is a white-space; a space before %c in the format makes scanf discard any preceding white-spaces. In the example below, the function will read the first non-white-space character, convert the character into the corresponding byte, and store the byte in the variable a:

char a;
scanf (" %c", &a);
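The role of the space in the format can be checked without a keyboard. The sketch below is only an illustration: it uses sscanf, a stdio function not otherwise covered in this chapter that reads from a string instead of the keyboard, and the two function names are made up for the example.

```c
#include <stdio.h>

// Returns the first byte of text, white-space included,
// or '\0' if text is empty. (Hypothetical helper.)
char first_byte (const char *text) {
   char a = '\0';
   sscanf (text, "%c", &a);
   return a;
}

// Returns the first non-white-space byte of text,
// or '\0' if there is none. (Hypothetical helper.)
char first_visible_byte (const char *text) {
   char a = '\0';
   sscanf (text, " %c", &a);
   return a;
}
```

If text is the string "  x", the first function returns a space and the second returns the character x.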

To read a character chain from the keyboard, use scanf with format specification %s. The function stops reading as soon as it runs into the first white-space character. (To read a character chain that contains white-spaces, use the fgets function.)  For example, the program fragment below reads a character chain of at most 99 bytes, converts the sequence into a string and stores the string in the array str. (The number 99 in the format keeps scanf from writing beyond the end of the array; the hundredth position is reserved for the final null byte.)

char str[100];
scanf ("%99s", str);

Files

A file is a sequence of bytes that resides in the slow memory of the computer (a hard drive, for example). The bytes of a file have no individual addresses. So, with the functions described in this chapter, access to a file is strictly sequential: to reach the fifth byte we must go through the first, the second, the third, and the fourth bytes.

Before a program can manipulate a file, the file must be associated with a variable of type  FILE *  (the type FILE is defined in the stdio.h interface).  This operation is known as opening the file and is executed by the fopen function. The first argument of the function is the name of the file and the second argument is "r" or "w" to indicate whether the file must be opened for reading or for writing. The fopen function returns the address of a FILE (or NULL, if the file could not be opened, for example because it was not found).  After you are done using the file, call the fclose function to close it.

Suppose, for example, that the text file data.txt contains a sequence of integers (in decimal notation) separated by white-space characters. The program below computes the average of these numbers. To read the file, the program uses the function fscanf, which generalizes scanf:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

int main (void) {
   FILE *input;
   input = fopen ("data.txt", "r");
   if (input == NULL) {
      printf ("\nFile not found!\n");
      exit (EXIT_FAILURE);
   }  
   double sum = 0.0;
   int n = 0;
   while (true) {
      int x;
      int k = fscanf (input, "%d", &x);
      if (k != 1) break;
      sum += x;
      n += 1;
   }
   fclose (input);
   printf ("The average is %f\n", sum/n);
   return EXIT_SUCCESS;
}

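The function fscanf, just as scanf, returns the number of objects actually read. If the input does not match the format, it returns 0; if the end of the file is reached before anything is read, it returns the constant EOF, which is negative.  The program above uses this feature to detect the end of the file: in both cases, k is different from 1. (The program assumes that the file contains at least one number.)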
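The same loop can be packaged as a function that also copes with a file containing no numbers. The following is a minimal sketch, not the only way to do it; the function name average and its parameters are chosen just for this illustration.

```c
#include <stdio.h>

// Reads decimal integers from in until fscanf fails.
// Stores their average in *avg and returns how many
// were read; if none was read, returns 0 and leaves
// *avg untouched. (Hypothetical helper.)
int average (FILE *in, double *avg) {
   double sum = 0.0;
   int n = 0;
   int x;
   // fscanf returns 1 on success, 0 if the next
   // characters are not a number, EOF at end of file
   while (fscanf (in, "%d", &x) == 1) {
      sum += x;
      n += 1;
   }
   if (n > 0) *avg = sum / n;
   return n;
}
```

The caller tests the returned count before using the average, so a division by zero never happens.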

Stdin and stdout.  The standard input file is the keyboard.  It is permanently open and is represented by the constant stdin. Therefore fscanf (stdin, ...) is equivalent to scanf (...).

Something similar holds for the standard output file stdout, which corresponds to the screen: fprintf (stdout, ...) is equivalent to printf (...).
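One consequence is that a function that receives the address of a FILE can write either to the screen or to a file on disk. The sketch below illustrates the idea; the function write_squares is invented for the example.

```c
#include <stdio.h>

// Writes the lines "1 1", "2 4", ..., "n n*n" to out.
// (Hypothetical helper.)
void write_squares (FILE *out, int n) {
   for (int i = 1; i <= n; i += 1)
      fprintf (out, "%d %d\n", i, i * i);
}
```

The call write_squares (stdout, 3) displays three lines on the screen. If out is instead the value returned by a successful fopen with second argument "w", the same three lines go to that file.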

The functions putc and getc

The most basic output function, more basic than fprintf, is  putc  (the name is a shorthand for put character).  The function receives a byte and saves it to the specified file.  (Often the byte has value between 0 and 127 and therefore represents an ASCII character.)  If c is a char and f is the address of a FILE then putc (c, f) saves c to f.  For example, putc ('?', stdout) displays the character ? on the screen.

The corresponding input function is  getc  (the name is a shorthand for get character).  Each call to the function reads a byte from the specified file.  (Often the byte represents an ASCII character.)  For example, getc (stdin) reads a byte from the keyboard and returns this byte.

The expressions putchar (x) and getchar () are shorthands for putc (x, stdout) and getc (stdin) respectively.

Example.  The program below reads from the keyboard a sequence of bytes (for example, a line of ASCII characters), stores this sequence in an array, and displays the corresponding characters on the screen.  The program assumes that the sequence has at most 100 bytes (including the final \n):

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

int main (void) {
   char line[100];
   int n = 0;
   while (true) {
      line[n] = getchar ();
      if (line[n] == '\n') break;
      n = n + 1;
   }
   for (int i = 0; i <= n; i += 1)
      putchar (line[i]);
   return EXIT_SUCCESS;
}

Another example.  The program below should read the first byte of the file data.txt and display the corresponding character (assuming that it is an ASCII character) on the screen:

#include <stdio.h>
#include <stdlib.h>

int main (void) {
   FILE *input;
   input = fopen ("data.txt", "r");
   if (input == NULL) exit (EXIT_FAILURE);
   char c; // wrong!
   c = getc (input);
   fclose (input);
   putc (c, stdout);
   return EXIT_SUCCESS;
}

The program has a defect to be discussed in the next section.

What kind of object does getc return?

What happens if getc attempts to read a byte from a file whose end has already been reached?  One would need getc to return some kind of invalid byte, but all the 256 bytes are valid!

To solve this dilemma, the function getc always returns an int rather than a byte. The set of int values contains all the unsigned char values and then some.  Hence, if the end of the file has been reached, getc can return an int that is distinct from any unsigned char.  More specifically,

  1. if the file has a next byte, getc reads that byte as if it were an unsigned char, converts it to a (nonnegative) int, and returns the int;
  2. if the file has no more bytes, getc returns a strictly negative int.

More exactly, if the file has no more bytes, getc returns the constant  EOF  (the name is a shorthand for end of file), which is defined in the stdio.h interface and is equal to −1 on most computers.

In summary, the function getc returns an element of a superset of the set in which we are really interested. By doing so, the output of getc is always of the same type (an int), even in exceptional situations.  This trick is an important lesson in program design.

Example.  The following code fragment displays the next byte of a file unless the file has ended:

   int c;
   c = getc (inpt);
   if (c != EOF) 
      putc (c, stdout);
   else 
      printf ("\nThe file has ended!");

(By the way, if the file inpt is stdin and the input comes from the keyboard of a Unix-like system, the end of the file is signaled by the key combination Ctrl D typed at the beginning of a line.)
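The usual way to process a whole file with getc is a loop that stops when EOF shows up. Here is a minimal sketch; the function count_bytes is made up for the illustration. If c were declared as a char, a byte of value 255 could be mistaken for EOF on machines where char is signed, and EOF would never be recognized on machines where char is unsigned.

```c
#include <stdio.h>

// Returns the number of bytes between the current
// position of in and the end of the file.
// (Hypothetical helper.)
long count_bytes (FILE *in) {
   long n = 0;
   int c; // an int, so that EOF differs from every byte
   while ((c = getc (in)) != EOF)
      n += 1;
   return n;
}
```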

Exercises 1

  1. A programmer proposes to redefine the function getc so that it returns a char (rather than an int) and the expression getc (inpt) has the following effect: if there is no next byte in file inpt, the function returns the character \a and displays the message file ended on the screen.  Criticize this proposal.
  2. Write a complete program to do a byte-by-byte copy of the file whose name the user types on the keyboard.  [Solution: ./solutions/io1.html]
  3. Write a program to delete all the comments (of the /*...*/ and //... kinds) from the source file of a C program.  Write the output to a new file.

The functions fputs and fgets

The function fputs (the name is a shorthand for file put string) of the stdio library writes a string to a file. For example,

fputs (str, file);

writes the string str (except its final null byte) to the file. (The effect is essentially the same as that of fprintf (file, "%s", str).)

If the second argument of fputs is stdout, the string str is displayed on the terminal screen after being converted to a character chain.

The function fgets (the name is a shorthand for file get string) of the stdio library reads a line from a file and stores the line in a string. For example, if str is an array of at least 100 bytes then

fgets (str, 100, file); 

stores in str the current line of file (including the \n that marks the end of the line) and adds a final null byte.

The second argument of fgets is a protection against very long lines, longer than the space allocated to the first argument. In the example above, if the number of bytes of the current line of file (including the final \n) is greater than 99, only the first 99 bytes will be stored in str.

If the last argument of fgets is stdin, the function reads a character chain from the keyboard up to the end of the line (Enter key), converts the chain into a string, and stores the string at the address specified by the first argument.
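Two details not mentioned above matter in practice: fgets returns NULL when there is nothing left to read, and the stored \n is often unwanted. The sketch below shows one common way of dealing with both; the function read_line is hypothetical, and strcspn comes from the string library.

```c
#include <stdio.h>
#include <string.h>

// Reads the next line of in into buf (capacity size)
// and removes the final \n, if present. Returns 1 if
// a line was read and 0 at the end of the file.
// (Hypothetical helper.)
int read_line (FILE *in, char *buf, int size) {
   if (fgets (buf, size, in) == NULL)
      return 0;
   // strcspn gives the index of the first \n, or the
   // length of buf if there is no \n
   buf[strcspn (buf, "\n")] = '\0';
   return 1;
}
```

A line longer than size − 1 bytes is delivered in pieces by successive calls, since fgets never stores more than size − 1 bytes at a time.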

Command line arguments

The execution of any C program consists of the execution of the main function (which usually calls other functions).  The main function may be declared with two parameters, which we call numargs and arg (the traditional names are argc and argv).  The second parameter is an array of strings while the first is the number of elements of the array.

int main (int numargs, char *arg[]) {
   . . . 
} 

The arguments are typed by the user on the terminal when calling the program. (The user types one or more spaces to separate each argument from the previous one.)  The first string typed is the name of the program and will become the value of arg[0].  The following strings are known as command line arguments and will become the values of arg[1], arg[2], …  The value of numargs is defined implicitly by the number of strings typed.

Suppose, for example, that the name of the program is prog.  Then, after the user types the command line

~$ ./prog -a  bb   ccc 2222

the value of numargs will be 5 and the values of arg[0] to arg[4] will be the strings  "./prog", "-a", "bb", "ccc", and "2222" respectively.

Example.  The following program computes the average of the integer numbers typed on the command line.  The user must make sure that each of the numbers belongs to the interval INT_MIN..INT_MAX.

#include <stdio.h>
#include <stdlib.h>

int main (int numargs, char *arg[]) {
   int sum = 0;
   for (int i = 1; i < numargs; ++i) 
      sum += strtol (arg[i], NULL, 10);
   int n = numargs - 1;
   printf ("average = %.2f\n", (double) sum / n);
   return EXIT_SUCCESS;
} 

Assuming that the name of the program is prog, we may see the following exchange on the screen:

~$ ./prog +22   33 -11      +44
average = 22.00
~$
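Because the program passes NULL as the second argument of strtol, an argument such as 12abc is silently read as 12 and an argument such as abc is read as 0. The sketch below shows one way of using that second argument to reject such input; the function parse_long is invented for the illustration. It checks that the number fits in a long, not that it fits in an int.

```c
#include <errno.h>
#include <stdlib.h>

// Converts the decimal string s to a long. Returns 1
// and stores the number in *value if the whole string
// is a number that fits in a long; returns 0 otherwise.
// (Hypothetical helper.)
int parse_long (const char *s, long *value) {
   char *end;
   errno = 0;
   long x = strtol (s, &end, 10);
   if (end == s) return 0;        // no digits at all
   if (*end != '\0') return 0;    // junk after the number
   if (errno == ERANGE) return 0; // does not fit in a long
   *value = x;
   return 1;
}
```

After the call, end points to the first byte that strtol did not use, which is why comparing it with s and inspecting *end is enough to detect malformed input.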

Another example.  The following program displays a conversion table from Celsius degrees to Fahrenheit degrees or vice-versa. For example, if you type

./temperature c-f 10 40

the program will print a conversion table from Celsius to Fahrenheit that starts at 10 degrees Celsius and proceeds in steps of 1 until 40 degrees.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (int numargs, char *arg[]) {
   if (numargs != 4) {
      printf ("Wrong number of arguments.\n");
      return EXIT_FAILURE;
   }
   int inf = strtol (arg[2], NULL, 10);
   int sup = strtol (arg[3], NULL, 10);
   // arg[2] and arg[3] must represent integers
   // in the interval INT_MIN..INT_MAX
   if (strcmp (arg[1], "c-f") == 0) {
      printf ("Celsius Fahrenheit\n");
      for (int c = inf; c <= sup; c += 1) 
         printf ("%7d %10.2f\n", c, 9.0/5.0*c + 32);
      return EXIT_SUCCESS;
   }
   if (strcmp (arg[1], "f-c") == 0) {
      printf ("Fahrenheit  Celsius\n");
      for (int f = inf; f <= sup; f += 1) 
         printf ("%10d %8.2f\n", f, 5.0*(f-32)/9.0);
      return EXIT_SUCCESS;
   }
   return EXIT_FAILURE;
} 

Redirecting input and output.  If an argument on the command line is the name of a file preceded by the character  <  then this file will assume the role of standard input.  Similarly, if an argument on the command line is the name of a file preceded by the character  >  then this file will assume the role of standard output.  For example, if the file in.txt in the current directory contains

222  333

and the program shown in the first example of this chapter is named average, then the call

~$ ./average  < in.txt > out.txt

will produce a file out.txt with the following contents:

The average of 222 and 333 is 277.500000

Exercises 2

  1. Modify the program of the example above so that it will abort if strtol returns a number that does not fit into an int.
  2. Write and test a program that will display a table with the values of all the arguments typed in by the user on the command line, and then stop.
  3. Word count.  Write a program to count the number of occurrences of each character in a text file whose bytes represent ASCII characters.  The program must read the name of the file from the command line and display on the screen a table with the number of occurrences of each character.  (To gain inspiration, study the behavior of the wc utility.)

Questions and answers