File Input and Output
Textbook Reading
Up to this point, we have been reading information from the keyboard and saving it in temporary memory while we work with it. But once the program ends, all of that information is gone. While this is fine for small programs, having to re-enter large amounts of data every time we run a program is frustratingly tedious. You may have already wondered, as you are testing your programs, if there is not a better way.
Thankfully, as you know from working with other programs, we can save information in files between different sessions of running a program such as Word. Managing file input and output is a big part of the C standard library, and there are many powerful functions that we can use. In this reading and lab, we will be introduced to just the beginning of our options. The necessary background details on high-level file reading and writing (input and output, or I/O) are given in the following textbook reading.
- King: Section 9.5, pp. 202-204; Sections 22.1-22.6, pp. 539-572
Text File I/O in C
As the King reading tells us, there are two major types of files: text files and binary files. Text files contain human-readable characters and are divided into lines. When a reading function reaches the end of a file, it reports this by returning a special integer value, EOF; that value is a signal from the library, not something stored in the file. We will focus mostly on reading and writing text files.
Many of the input and output functions we have been using to this point, such as scanf and printf, have related functions that are used when reading from and writing to a file. Instead of scanf, we would use fscanf. Instead of printf, we can use fprintf. They are nearly identical in how they are used, but the file-based versions take an additional first argument: the stream (a FILE * variable) that is the source or destination of the operation.
For example, see: fileioexample.c. Note that we use the value returned by fscanf to determine when the end of the file is reached. Remember that EOF is not a character. It is a negative value of type int, which is a larger type than char. Any time you compare a variable to EOF (usually a variable holding the result of a function call), that variable should be of type int.
Handling Errors
Nearly all of the C library calls we have introduced have the ability to report a status about their execution. The functions might report errors, successes, or something else. The programmer must decide whether to heed this information, and if so, how to handle it.
This type of handling is even more important where files are concerned, because so many other complicating factors enter into the fray, such as permissions or disk and network failure. While problems are rare, they do occur, especially when handling files located on other computers through a network file system (as on our MathLAN).
Many programming languages provide robust exception handling
mechanisms. C is a low-level programming language with no such
mechanism, so one should develop the necessary habit of detecting and
handling errors explicitly. Often this means simply printing a helpful
message and exiting the program. To see the plethora of possible
failures, type man errno in your Linux terminal.
Good design accommodates failures. When things go wrong, it is important to deliver the unfortunate news to the right audience. When you write library code (code linked to other programs), the correct approach is usually an appropriate return code. After all, your client program likely wants to decide how to deal with your failure, rather than have your library print output to the user that is likely to befuddle them.
When writing programs directly for the end user, the correct approach
to an unrecoverable error is usually to print an informative error
message to the unbuffered stderr file stream. If the
failure is due to a system or library call, the standard library
functions perror and strerror can assist in
providing an even more informative message.
You are likely used to the terminal utilities providing such helpful messages. You could try the following example yourself.
cat /imaginary/file
cat: /imaginary/file: No such file or directory
/bin/cat /etc/shadow-
/bin/cat: /etc/shadow-: Permission denied
How do these programs do that? The following example gives us one possible approach.
/* Program to demonstrate simple error reporting */
#include <stdio.h>  /* for fopen, fprintf */
#include <stdlib.h> /* for EXIT_FAILURE, normally used with exit function */
#include <string.h> /* for strerror */
#include <errno.h>  /* for errno variable */

int
main (int argc, char* argv[])
{
  if (argc != 2)  /* Verify user command-line input */
    {
      fprintf (stderr, "Usage: %s filename\n", argv[0]);
      return EXIT_FAILURE;
    }

  FILE* stream = fopen (argv[1], "r");  /* Open the file named by the user */
  if (stream == NULL)                   /* Verify the open was successful */
    {                                   /* Report a failure */
      fprintf (stderr, "%s: Cannot open %s: %s\n",
               argv[0], argv[1], strerror (errno));
      return EXIT_FAILURE;
    }

  /* stream ready for reading with fread, fgets, etc. ... */

  if (fclose (stream))  /* Close the file stream; nonzero means failure */
    {
      fprintf (stderr, "%s: Error closing file %s: %s\n",
               argv[0], argv[1], strerror (errno));
      return EXIT_FAILURE;
    }

  return EXIT_SUCCESS;
} // main
This approach allows us to give very specifically formatted error
messages. For example, we include the name of the executable (just
like cat does) and the file we are unable to open (which
can help disambiguate errors when multiple files could be the cause
of failure).
Sometimes we do not need such extensive, configurable prefaces; the
simpler perror would then suffice (and it only requires you to
#include <stdio.h>). For example:
perror ("Cannot open file");
The perror function prints your message followed by a colon, a
space, and a description of the most recent error (the current value
of errno), ending with a newline.
Note in both cases that the error message is printed
to stderr, a different stream designed to receive error
messages. We may not want to clutter up normal output (particularly
when it is captured with shell-based redirection) and we may want to
see errors immediately (standard output may be buffered; standard
error is completely unbuffered by default so error messages can
appear immediately).
Finally, it is important to reiterate that perror
and strerror are only applicable if a system or
standard library function call fails and sets errno. For other
functions (e.g., one of your own, or one from the MyroC library), you
will likely need to craft your own error message for use with
fprintf(stderr,...).
Thus, while we might be able to change the error reporting to
use perror within
the (stream == NULL) or
fclose(stream)
tests, we cannot change the fprintf of the
initial argc test to use perror because the
program usage error is not caused by a failed function call.
Binary I/O in C
Writing output
fwrite literally writes bits (the 0s and 1s that
represent whatever is pointed to by its first argument) to the stream you
specify in its last argument. To know how much to write, fwrite
needs the size in bytes of each element and the number of elements to
write. This information corresponds to the second and third arguments,
respectively. The return value is the number of elements actually written.
Suppose you want to write an array of 10 doubles (we'll call it array) to a file stream (we'll call it yourStream). The
call to fwrite would look something like this:
fwrite (array, sizeof(double), 10, yourStream)
Similarly, if you wanted to write just one Pixel structure you had previously declared as pxl, the call
would resemble:
fwrite (&pxl, sizeof(Pixel), 1, yourStream)
Note that because the raw bits (rather than an ASCII representation)
are being written, trying to view the resulting file data in the
terminal with a program like cat or less will yield gibberish.
Reading input
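Not surprisingly, fread functions exactly
like fwrite except in reverse. It takes its arguments in
the same order: a pointer to the memory where the data should be stored,
the sizeof each element, how many elements you want, and the stream to
read from. It returns the number of elements actually read, which may be
fewer than requested if the end of the file is reached or an error occurs.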