We’ve spent much of this week thinking about how we can scale up our projects in C. Specifically, we now know how to:
Makefile.
This reading introduces some additional techniques you can use to divide up code in a large C project. Not all of these techniques are great for creating easily understandable C projects, but you will encounter them, so it’s important that you’re familiar with them.
We often define constants using #define to give a descriptive name to a number that would otherwise be unclear.
We’ve seen #define used to set the size of an array, an upper limit on user input, or when we use a value for a special meaning (like an error).
We can put #define lines in source files or header files.
When a #define appears in a header it is available to any source file that includes it.
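As a rough sketch of these uses, the snippet below names an array size and a special error value with #define. The names MAX_SCORES, ERR_FULL, scores, and add_score are made up for illustration; in a larger project the two #define lines could just as easily sit in a header.

```c
/* Hypothetical constants: these two lines could live in a header file. */
#define MAX_SCORES 8      /* capacity of the scores array */
#define ERR_FULL   (-1)   /* special return value meaning "no room left" */

int scores[MAX_SCORES];
int score_count = 0;

/* Store a score and return its index, or ERR_FULL if the array is full. */
int add_score(int score) {
    if (score_count >= MAX_SCORES) {
        return ERR_FULL;
    }
    scores[score_count] = score;
    score_count++;
    return score_count - 1;
}
```

If the array ever needs to grow, only the MAX_SCORES line has to change; the bounds check in add_score follows along automatically.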
C allows us to define new types.
The only way we know to do this right now is with typedef, which appeared briefly in an earlier reading.
It is common to define new types when we use a built-in C type for some special meaning.
For example, a program that keeps a database of students may use an int for student IDs, but it could be clearer if we included this line:
typedef int student_id_t;
Now any time a function expects a student ID it will use the student_id_t type for its parameter instead of just an int.
If we ever want to change the underlying type for a student ID we can modify the typedef, but the code that uses it doesn’t necessarily have to change.
Note that the name student_id_t has the suffix _t;
this is a common convention in C when defining new types with typedef.
That suffix helps to distinguish them from variables, which have the same naming rules.
The _t suffix is so common that many programming editors (including VS Code) will look for the suffix to highlight types differently from variable names.
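Here is a small sketch of how student_id_t might show up in function signatures. The helper functions is_valid_id and next_student_id are hypothetical, and so is the assumption that valid IDs are positive and handed out in increasing order.

```c
#include <stdbool.h>

typedef int student_id_t;

/* Hypothetical helper: assumes valid student IDs are positive numbers. */
bool is_valid_id(student_id_t id) {
    return id > 0;
}

/* Hypothetical helper: assumes IDs are handed out in increasing order. */
student_id_t next_student_id(student_id_t last) {
    return last + 1;
}
```

The signatures make it clear which values are student IDs, even though the compiler treats student_id_t exactly like int.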
When a function is defined in one file, we can call it from another file by declaring it in that second file.
We can do something similar with global variables.
As an example, consider this excerpt from a file data.c:
int somevalue;
Another file, user.c, may want to access the somevalue variable.
It can do this using the following declaration:
extern int somevalue;
The important keyword here is extern.
The extern keyword tells the compiler that the variable is defined somewhere else, so this line does not create a new variable.
So the extern line in user.c plays the same role as a function declaration, while the line in data.c acts as the definition.
We can make this a bit more modular by putting the extern line in a header file, perhaps data.h.
That would allow any source file that includes data.h to access somevalue.
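The sketch below squeezes the three files into one listing so it can be compiled as a single unit; the comments mark where each piece would live in a real project. The function double_somevalue and the initial value 21 are assumptions made for illustration.

```c
/* --- would live in data.h --- */
extern int somevalue;           /* declaration only: no storage is created */

/* --- would live in user.c, after #include "data.h" --- */
int double_somevalue(void) {    /* hypothetical function that reads the global */
    return somevalue * 2;
}

/* --- would live in data.c --- */
int somevalue = 21;             /* definition: storage is created exactly once */
```

The declaration may appear in as many files as we like, but the definition should appear in exactly one.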
extern variables in header files?
You may wonder whether we’re allowed to create global variables in header files instead of a C source file.
As an example, we could do away with data.c entirely and write this in data.h:
int somedata;
While this might seem reasonable, it does not work reliably in C.
That’s because every source file that includes data.h will define its own global variable called somedata.
When we link the program we will usually get an error because the somedata variable is defined multiple times (some older toolchains accept this as an extension, but we shouldn’t rely on that).
So, as a general rule we cannot put non-extern variables in header files.
Our examples so far have all included header files from .c source files, but C allows us to include header files from other headers.
For example, the header file cow.h may contain:
#include "util.h"
void cow_sound(int count);
In this example, any file that includes cow.h would also have access to the declarations in util.h.
The risk with nested includes is that they could end up being recursive;
util.h may include helpers.h, which could in turn include util.h.
This is exactly the situation include guards are designed to resolve.
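To see why include guards help, the sketch below pastes the body of a hypothetical util.h into one file twice, which is roughly what the preprocessor produces when the header is reached through two different include paths. The names UTIL_H, util_size_t, and util_area are assumptions made for illustration.

```c
/* First inclusion: UTIL_H is not defined yet, so the body is kept. */
#ifndef UTIL_H
#define UTIL_H
typedef struct {
    int width;
    int height;
} util_size_t;
int util_area(util_size_t s);
#endif

/* Second inclusion: UTIL_H is already defined, so the body is skipped. */
#ifndef UTIL_H
#define UTIL_H
typedef struct {
    int width;
    int height;
} util_size_t;
int util_area(util_size_t s);
#endif

/* Would normally live in util.c. */
int util_area(util_size_t s) {
    return s.width * s.height;
}
```

Without the #ifndef/#define/#endif lines, the second copy of the typedef would conflict with the first and the file would not compile.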
static keyword
The static keyword is one of the most confusing parts of C because it has so many different meanings.
If all you remember from this part of the reading is that static can be confusing and you should check its meaning before using it, that would be enough.
Here are some of the ways static can be used in C:
We can use static to create variables or functions that have limited visibility like this:
static int x = 1;
static void do_something(void) {
    ...
}
In this example, only the file that contains these declarations can use x and do_something.
If we put these lines in a .c file, only that .c file can access them.
If we put these lines in a header file, we end up with something slightly different.
Every .c file that includes the header will have its own version of x and do_something.
It doesn’t usually matter whether two source files share a function (provided the implementation is the same) but there is a significant difference in how x will behave.
If one file includes this header and modifies x, only that file will see the update.
Another file that includes the header has its own separate x variable that can have a different value.
Note that header files are not treated specially:
the meaning of static variables or functions in a header file is no different than in a .c file. It just has slightly different consequences because a header file can be included in multiple places.
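Returning to the simpler .c-file case, here is a sketch of how limited visibility is often used to hide a file’s internal details behind one public function. The file name counter.c and the names hits, bump, and record_hit are hypothetical.

```c
/* Hypothetical contents of a single source file, e.g. counter.c. */

/* static: only code in this file can name hits or call bump. */
static int hits = 0;

static void bump(void) {
    hits++;
}

/* No static: other files may declare and call record_hit. */
int record_hit(void) {
    bump();
    return hits;
}
```

Another source file could declare and call record_hit, but an extern int hits; declaration there would fail to link because hits is not visible outside this file.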
We can also write static next to a local variable inside a function.
In this case, the variable is initialized only once and its value is preserved across calls to the function.
For example, this function returns a number that increases by one each time you call it:
int count_up(void) {
    static int next = 0;
    int result = next;
    next++;
    return result;
}
You can think of these static variables a bit like globals.
This implementation of count_up is equivalent:
int next = 0;
int count_up(void) {
    int result = next;
    next++;
    return result;
}
The only difference between the two examples is the scope of next.
The first version makes next available inside the count_up function and nowhere else.
In the second version, next is a global and can be accessed and modified anywhere in your program.
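One consequence of that narrower scope is that two functions can each keep a static local with the same name without interfering with each other. The sketch below pairs count_up with a second, made-up function called count_by_tens to show this.

```c
/* Each function has its own private variable named next. */
int count_up(void) {
    static int next = 0;   /* initialized once, keeps its value between calls */
    int result = next;
    next++;
    return result;
}

/* Hypothetical second counter, added only for comparison. */
int count_by_tens(void) {
    static int next = 0;   /* a different variable from the one in count_up */
    int result = next;
    next += 10;
    return result;
}
```

Calling count_up never changes what count_by_tens returns, and the reverse is also true. With a single global next, the two functions would have had to share one counter or use different names.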
There are many different practices for organizing large C projects, but almost any C project will place limits on some of the features described in this reading.
You’ll also find that most rules have exceptions.
For example, many C projects disallow the use of extern global variables, but the C standard library exposes errno, which is used like an extern global variable and is so common and important it has its own man page.
We will spend time in class discussing the advantages and drawbacks of different methods of organizing our code. That discussion will lead us to a reasonable set of standards we’ll follow for the rest of this class, but you’ll also need to be prepared to adapt to other styles if you work with C in the future.