File Input and Output

Textbook Reading

Up to this point, we have been reading information from the keyboard and saving it in temporary memory while we work with it. But once the program ends, all of that information is gone. While this is fine for small programs, having to re-enter large amounts of data every time we run a program is frustratingly tedious. You may have already wondered, as you are testing your programs, if there is not a better way.

Thankfully, as you know from working with other programs, we can save information in files between different sessions of running a program such as Word. Managing file input and output is a big part of the C standard library, and there are many powerful functions that we can use. In this reading and lab, we will be introduced to just the beginning of our options. The necessary background details on high-level file reading and writing (input and output, or I/O) are given in the following textbook reading.

King: Sections 9.5, pp. 202-204; 22.1-22.6, pp. 539-572

Text File I/O in C

As the King reading tells us, there are two major types of files: text files and binary files. Text files are those that contain human-readable characters and are divided into lines. When a program tries to read past the end of a file, the input functions return a special integer, EOF, which tells the program that it has reached the end of the file. We will focus mostly on reading and writing text files.

Many of the input and output functions we have been using to this point, such as scanf and printf, have related functions that are used when reading from and writing to a file. Instead of scanf, we would use fscanf. Instead of printf, we can use fprintf. They are nearly identical in how they are used, but the file-based versions take an additional first argument: the stream (a FILE * variable) that is the source or destination of the operation.
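As a minimal sketch of how fprintf mirrors printf, the following function writes a small table to a text file. The function name write_squares and the output format are assumptions made for illustration; they do not come from the reading.

```c
#include <stdio.h>

/* Hypothetical helper: writes the integers 1..n and their squares,
   one pair per line, to the text file named by path.
   Returns 0 on success and -1 on failure. */
int
write_squares (const char *path, int n)
{
  FILE *out = fopen (path, "w");     /* "w" creates or truncates the file */
  if (out == NULL)
    return -1;

  for (int i = 1; i <= n; i++)
  {
    /* Same format string as printf; the stream is the first argument */
    if (fprintf (out, "%d %d\n", i, i * i) < 0)
    {
      fclose (out);
      return -1;
    }
  }

  return (fclose (out) == 0) ? 0 : -1;
}
```

A call such as write_squares ("squares.txt", 5) would produce a five-line text file that can be inspected with cat.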

For example, see: fileioexample.c. Note that we use the value returned by fscanf to determine when the end of the file is reached. Remember that EOF is not a character. It is a negative constant of type int, which is a larger type than char. Any time you compare a variable to EOF, usually the result of a function call, that variable should be of type int.
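The two points above can be sketched as follows. This is not the contents of fileioexample.c; the helper names sum_ints and count_chars are hypothetical, and the code only illustrates testing the return value of fscanf and storing the result of fgetc in an int.

```c
#include <stdio.h>

/* Hypothetical helper: adds up every integer that fscanf can read
   from the stream. Stores the total in *sum and returns how many
   integers were read. */
int
sum_ints (FILE *in, long *sum)
{
  int value;
  int count = 0;

  *sum = 0;
  /* fscanf returns the number of items it converted: 1 on success,
     0 if the next input is not an integer, and EOF at end of file.
     Testing for == 1 stops the loop in both failure cases. */
  while (fscanf (in, "%d", &value) == 1)
  {
    *sum += value;
    count++;
  }
  return count;
}

/* Hypothetical helper: counts the characters remaining in the stream. */
long
count_chars (FILE *in)
{
  int ch;           /* int, not char, so EOF is distinct from every character */
  long count = 0;

  while ((ch = fgetc (in)) != EOF)
    count++;
  return count;
}
```

Both functions take an already opened stream, so they work equally well with stdin or with a FILE * returned by fopen.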

Handling Errors

Nearly all of the C library calls we have introduced have the ability to report a status about their execution. The functions might report errors, successes, or something else. The programmer must decide whether to heed this information, and if so, how to handle it.

This type of handling is even more important where files are concerned, because so many other complicating factors enter into the fray, such as permissions or disk and network failure. While it is rare to encounter problems, they still do occur, especially when handling files that are located on other computers and accessed through a network file system (such as our MathLAN).

Many programming languages provide robust exception handling mechanisms. C, a low-level programming language, does not, so one should develop the necessary habit of checking each call's return value to detect and handle errors. Often this means simply printing a helpful message and exiting the program. To see the plethora of possible failures, type man errno in your Linux terminal.

Good design accommodates failures. When things go wrong, it is important to deliver the unfortunate news to the right audience. When you write library code (code linked to other programs), the correct approach is usually an appropriate return code. After all, your client program likely wants to decide how to deal with your failure, rather than have your library print output to the user that is likely to befuddle them.
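A minimal sketch of the library approach follows. The function count_lines and its return convention (0 for success, -1 for failure) are assumptions chosen for illustration. The function prints nothing and leaves the decision about reporting to its caller.

```c
#include <stdio.h>

/* Hypothetical library function: counts the newline characters in the
   file named by path and stores the result in *lines.
   Returns 0 on success and -1 on failure. It never prints; the caller
   decides how (and whether) to report the problem. */
int
count_lines (const char *path, long *lines)
{
  FILE *in = fopen (path, "r");
  if (in == NULL)
    return -1;

  int ch;
  long count = 0;
  while ((ch = fgetc (in)) != EOF)
  {
    if (ch == '\n')
      count++;
  }

  if (ferror (in))                /* the loop ended on an error, not EOF */
  {
    fclose (in);
    return -1;
  }
  if (fclose (in) != 0)
    return -1;

  *lines = count;                 /* only written on success */
  return 0;
}
```

A client program might test the result with if (count_lines (name, &n) != 0) and then choose its own wording, retry with another file, or exit.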

When writing programs directly for the end user, the correct approach to an unrecoverable error is usually to print an informative error message to the unbuffered stderr file stream. If the failure is due to a system or library call, the standard library functions perror and strerror can assist in providing an even more informative message.

You are likely used to the terminal utilities providing such helpful messages. You could try the following example yourself.

cat /imaginary/file

cat: cannot access /imaginary/file: No such file or directory

/bin/cat /etc/shadow-

/bin/cat: /etc/shadow-: Permission denied

How do these programs do that? The following example gives us one possible approach.

fopen-test.c

/* Program to demonstrate simple error reporting */

#include <stdio.h>  /* for fopen, fprintf */
#include <stdlib.h> /* for EXIT_FAILURE, normally used with exit function */
#include <string.h> /* for strerror */
#include <errno.h>  /* for errno variable */

int
main (int argc, char* argv[])
{

  if (argc != 2)                            /* Verify user command-line input */
  {
    fprintf (stderr, "Usage: %s filename\n",argv[0]);
    return EXIT_FAILURE;
  }
  
  FILE* stream = fopen (argv[1],"r");      /* Open the file named by the user */

  if (stream == NULL)                       /* Verify the open was successful */
  {                                         /* Report a failure */
    fprintf (stderr, "%s: Cannot open %s: %s\n",
             argv[0], argv[1], strerror(errno));
    return EXIT_FAILURE;
  }

  /* stream ready for reading with fread, fgets, etc. ... */

  if (fclose (stream))                       /* Close the file stream */
  {
    fprintf (stderr, "%s: Error closing file %s: %s\n",
             argv[0], argv[1], strerror(errno));
    return EXIT_FAILURE;
  }
  
  return EXIT_SUCCESS;
} // main

This approach allows us to give very specifically formatted error messages. For example, we include the name of the executable (just like cat does) and the file we are unable to open (which can help disambiguate errors when multiple files could be the cause of failure).

Sometimes we do not need such extensive, configurable prefaces; the simpler perror would then suffice (and it only requires you to #include <stdio.h>). For example:

perror ("Cannot open file");

The perror function prints your message to stderr, followed by a colon, a space, the error message that corresponds to the current value of errno, and a newline.
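As a small sketch of perror in context, the following hypothetical helper (the name open_for_reading is an assumption) passes the file name as the message, so the report names the file that could not be opened.

```c
#include <stdio.h>

/* Hypothetical helper: opens path for reading. On failure it reports
   the reason on stderr with perror and returns NULL. */
FILE *
open_for_reading (const char *path)
{
  FILE *stream = fopen (path, "r");
  if (stream == NULL)
    perror (path);    /* prints the path, ": ", and the system's error text */
  return stream;
}
```

Calling perror immediately after the failed fopen matters, because a later library call could change errno before the message is printed.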

Note in both cases that the error message is printed to stderr, a different stream designed to receive error messages. We may not want to clutter up normal output (particularly when it is captured with shell-based redirection) and we may want to see errors immediately (standard output may be buffered; standard error is completely unbuffered by default so error messages can appear immediately).

Finally, it is important to reiterate that perror and strerror are only applicable if a system or standard library function call fails. For other functions (e.g., one of your own, or one from the MyroC library), you will likely need to craft your own error message for use with fprintf(stderr,...).

Thus, while we might be able to change the error reporting to use perror within the (stream == NULL) or fclose(stream) tests, we cannot change the fprintf of the initial argc test to use perror because the program usage error is not caused by a failed function call.

Binary I/O in C

Writing output

fwrite literally writes bits (the 0s and 1s that represent whatever is pointed to in its first argument) to the file you specify in its last argument. However, to know how many bits to write, fwrite needs to be told how big each item of data is (in bytes) and how many items there are. This information corresponds to the second and third arguments, respectively.

Suppose you want to write an array of 10 doubles (we'll call it array) to a file stream pointer. The call to fwrite would look something like this:

fwrite (array, sizeof(double), 10, yourStream);

Similarly, if you wanted to write just one Pixel structure you had previously declared as pxl, the call would resemble:

fwrite (&pxl, sizeof(Pixel), 1, yourStream);

Note that because the raw bits (rather than an ASCII representation) are being written, trying to view the resulting file data in the terminal with a program like cat or less will yield gibberish.
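A fuller sketch of binary output might look like the following. The helper name save_doubles is hypothetical. The code relies on fwrite returning the number of whole items it wrote, which lets the caller detect a short write.

```c
#include <stdio.h>

/* Hypothetical helper: writes count doubles from values to a binary
   file named by path. Returns 0 on success and -1 on failure. */
int
save_doubles (const char *path, const double *values, size_t count)
{
  FILE *out = fopen (path, "wb");    /* "b" requests binary mode */
  if (out == NULL)
    return -1;

  /* fwrite returns the number of whole items actually written */
  size_t written = fwrite (values, sizeof (double), count, out);

  if (fclose (out) != 0 || written != count)
    return -1;
  return 0;
}
```

The resulting file holds exactly count * sizeof (double) bytes, in whatever representation the machine uses for double, so it is generally only safe to read back on a similar system.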

Reading input

Not surprisingly, fread functions exactly like fwrite except in reverse. It reads the data in the same manner: you specify where the data should be stored, the size (sizeof) of each item, and how many 'packets' of data you want.
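A minimal sketch of binary input follows, assuming the file holds raw doubles written by fwrite on the same kind of machine. The helper name load_doubles is hypothetical. Like fwrite, fread returns a count of whole items, which may be smaller than the number requested.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to max doubles from the binary file
   named by path into values. Returns the number of doubles read,
   which is 0 if the file cannot be opened. */
size_t
load_doubles (const char *path, double *values, size_t max)
{
  FILE *in = fopen (path, "rb");     /* "b" requests binary mode */
  if (in == NULL)
    return 0;

  /* fread returns the number of whole items read; it is less than max
     if the end of the file or an error is reached first */
  size_t got = fread (values, sizeof (double), max, in);

  fclose (in);
  return got;
}
```

Because a short count can mean either end of file or an error, a more careful version could call feof or ferror before closing the stream to tell the two apart.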