Now that we understand how to allocate memory with malloc, we have the added responsibility to track that memory carefully and free it when we are finished with it.
Even before learning about malloc, we also dealt with functions that take pointer parameters for a variety of reasons.
This reading covers a broad range of concerns that all fall under the general notion of memory ownership.
This reading will introduce a standard model for thinking about memory ownership in C that will help you answer the following questions:
malloc versus reserving space with a local variable?
These are important questions to ask about any C program, but it’s easier to answer them when your code conforms to common conventions in C. These conventions even span language boundaries; the Rust and C++ languages both use similar conventions for memory ownership, either in the language itself (for Rust) or in the language’s standard library (for C++). As with any convention, there are sometimes exceptions. You’ll inevitably encounter a project written in C that uses a different approach, but every non-trivial system will adopt some model of memory ownership to help you safely manage memory.
Our discussion of memory ownership has to start with how parameters are passed to functions in C. Recall that every parameter in C is passed by value. Consider the code fragment below:
int fun(int x, int* y);
int main() {
  int a = 4;
  int b = 5;
  fun(a, &b);
  ...
}
The main function calls fun by passing in a and &b.
The values that fun receives as parameters are copies of these values.
The value of x is 4 and the value of y is &b.
There are languages that can pass parameters other than by value;
the alternative is to pass a parameter by reference, meaning the function being called (the callee) can modify the value stored in the caller’s scope.
At this point you may be thinking “isn’t &b a reference?”
It is true that the value of y will be a pointer to b, but the parameter was passed as a copy of that pointer.
The example above gives fun a pointer to b, but what does that mean for the function’s behavior?
The fun function could modify b using that pointer, or it could just read through the pointer to see the current value of b.
This ambiguity makes it difficult to tell what a C function will do with a pointer parameter.
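To make the two possibilities concrete, here is a minimal sketch of one body that fun might have. The reading only gives the declaration of fun, so this body is an assumption chosen for illustration: it changes its copy of x, writes through y, and then overwrites its own copy of y.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical body for fun; the reading shows only the declaration.
int fun(int x, int* y) {
  assert(y != NULL);  // this sketch assumes the caller passes a valid pointer
  x = x + 1;          // changes only fun's local copy of the first argument
  *y = *y + 1;        // writes through the copied pointer, changing the caller's int
  y = NULL;           // changes only fun's local copy of the pointer
  return x;
}
```

Called as fun(a, &b) with a equal to 4 and b equal to 5, this version leaves a at 4 and changes b to 6, because only the write through y reaches the caller's memory.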
Now that we know how to allocate memory, we have even more ambiguity to deal with:
what if fun calls free(y) somewhere in its implementation?
If it does, we have to be careful because it’s not safe to pass &b to fun anymore;
&b does not point to memory allocated by malloc so it should not be freed.
If we passed a malloced pointer to fun we’d also need to be careful not to use that memory again because it has been freed.
const parameters
One good way to resolve ambiguity around how a function will use pointer parameters is with the const qualifier.
You can read about the general use of const in Beej’s Guide to C, section 16.1.1, but our use of const will be limited to a single common case:
indicating that a pointer can be used to read the data it points to, but not to change it.
Consider this slightly modified version of the fun declaration from above:
int fun(int x, const int* y);
This declaration removes all of the ambiguity from the earlier version.
The y parameter has type const int*, which you should read as “a pointer to an integer that cannot be modified through this pointer”.
That doesn’t mean the integer y points to can never change, just that fun cannot use the pointer y to change the stored value.
From this declaration, we can tell that y is an input to fun, not an output.
Our question about whether fun calls free(y) at any point is also resolved by this update.
The free function takes a void* parameter, not a const void*, so it expects a pointer to memory that may be modified.
C does not let us silently convert from const int* to int*, so fun can’t pass y to free without jumping through some extra hoops.
Removing the const-ness of a pointer is generally bad practice in C, so we (a) won’t do it ourselves, and (b) trust that functions we call won’t do it either.
If y was meant to be an output—meaning fun will change the value of b by writing through y—then we wouldn’t include const in the type.
In C, we’re left to resolve ambiguity about memory management with comments.
When you write functions that take non-const pointer parameters you’ll need to write comments to explain how that pointer will be used and what is expected at that location.
Similarly, pay attention to documentation for provided functions that take pointer parameters.
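As a sketch of how const and comments can work together, the function below takes one read-only input and one output. The name square_into and its parameter names are hypothetical and do not come from the reading.

```c
#include <assert.h>
#include <stddef.h>

/**
 * Hypothetical example: compute the square of an integer.
 *
 * \param input  Read-only input. The function never writes through it.
 * \param output Output location. The function overwrites the int stored there.
 *               The caller keeps ownership of both pointers.
 */
void square_into(const int* input, int* output) {
  assert(input != NULL && output != NULL);
  // *input = 0;  // would be rejected by the compiler: input points to a const int
  *output = (*input) * (*input);
}
```

The const on input documents itself, while the role of output has to be spelled out in the comment.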
The ambiguity around passing pointer parameters to functions is fairly easy to avoid with single values like ints or floats;
inputs are usually passed as values, and we typically only pass pointers to these values when they are meant to be used as outputs from the function.
The ambiguity that const helps to resolve is most important for arrays and structures.
Recall that an array in C is converted to a pointer to its first element when it is passed to a function.
There is no way to pass an array to a C function that doesn’t give a function a pointer into the original array;
the only way we can indicate a read-only array parameter is using const.
For example, the following function takes an array parameter and could modify it:
void array_function(int arr[], int length);
By contrast, the following function takes an array as input only;
the const qualifier prevents the function from modifying the array’s contents, so arr is unambiguously an input parameter:
void array_function(const int arr[], int length);
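The difference shows up in what each kind of function is allowed to do in its body. The two functions below are hypothetical examples (sum_array and fill_array are not names from the reading), sketched under the assumption that length is the number of valid elements in arr.

```c
#include <assert.h>
#include <stddef.h>

// Input only: const means this function can read arr[i] but not assign to it.
int sum_array(const int arr[], int length) {
  assert(arr != NULL);
  int total = 0;
  for (int i = 0; i < length; i++) {
    total += arr[i];
  }
  return total;
}

// Output: without const, the function may overwrite the caller's array.
void fill_array(int arr[], int length, int value) {
  assert(arr != NULL);
  for (int i = 0; i < length; i++) {
    arr[i] = value;
  }
}
```

Adding a line such as arr[0] = 0; to sum_array would be a compile-time error, which is exactly the guarantee a caller is relying on.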
We sometimes do the same thing with structs in C.
It is possible to pass a struct by value (instead of passing a pointer to a struct):
struct complex {
  float real;
  float imaginary;
};
void print_complex_number(struct complex c);
Calling this function will copy the struct complex parameter you provide.
Copying takes extra time, so sometimes we pass structs as pointers to avoid copying;
the only value copied is the pointer to the struct, instead of the entire struct (which could be quite large).
Here is a function that takes a struct input, passed via pointer:
void print_complex_number(const struct complex* c);
Again, const prevents the function from modifying the struct through the pointer c.
We could drop the const if we want c to be an output parameter instead, although that would be an odd decision for a printing function.
As always, comments will help to avoid ambiguity.
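Here is a small sketch that contrasts an input struct pointer with an output one. It reuses the struct complex definition from above; the two function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct complex {
  float real;
  float imaginary;
};

// Input: reads the struct through c but cannot modify it.
float complex_norm_squared(const struct complex* c) {
  assert(c != NULL);
  return c->real * c->real + c->imaginary * c->imaginary;
}

// Output: replaces the caller's struct with its complex conjugate.
void complex_conjugate(struct complex* c) {
  assert(c != NULL);
  c->imaginary = -c->imaginary;
}
```

Neither function copies the struct; each receives only a copy of the pointer, and the presence or absence of const tells the caller whether the struct may change.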
Now that we have a handle on how const can limit a function’s uses of a pointer parameter we are prepared to discuss memory ownership.
Let’s start with some terminology:
An owning pointer is the “primary” pointer to some memory returned by malloc or one of its relatives.
If you call malloc and save the result in a local variable, that local variable is the owning pointer.
There are two key responsibilities we have when dealing with owning pointers:
free to release the memory when we are finished with it.
In other words, our maintenance of the owning pointer is what prevents our program from leaking memory.
While we always have an owning pointer for any allocated memory, we sometimes pass copies of that pointer to other functions for use.
For example, the following code “loans” a pointer to the strcpy and printf functions:
char* str = malloc(sizeof(char) * STRING_LENGTH);
strcpy(str, "Hello, memory!\n");
printf("Message: %s\n", str);
free(str);
In other words, we can say that strcpy and printf borrow the pointer str.
We give borrowed pointers to functions so they can access allocated memory, but they should not free the memory with a borrowed pointer;
that responsibility lies exclusively with the owning pointer, str.
We can “loan” a pointer to a function without granting write access to the loaned memory by using const, as we see with the strlen standard function:
int strlen(const char* s);
Note: the actual return type of strlen is size_t, an unsigned integer type appropriate for measuring the size of arrays, strings, or memory allocations. We haven’t discussed this type in class, but it behaves just like any unsigned integer. It is defined in stddef.h and is also available through stdlib.h and string.h (the header that declares strlen).
For this function, we allow strlen to borrow the pointer we pass to it, but we do not expect (or want) strlen to free the memory that pointer points to.
Not all borrowed pointers will be const—we can loan a pointer to a function as an output parameter that it writes through—but a const pointer is always a borrowed pointer.
Owning pointers must be non-const, since we presumably needed the allocated memory to hold data, and a pointer that doesn’t allow us to write through it is not useful for storing anything.
The free function takes a non-const pointer as a parameter, so we couldn’t free a const pointer without removing its const-ness through casting, which is strongly discouraged.
We can also borrow pointers to memory that wasn’t allocated by malloc, like this example:
char message[] = "Test message";
printf("%s is %zu characters long\n", message, strlen(message));
In this example, we’ve reserved space to hold a string on the stack instead of calling malloc.
We can safely borrow pointers to this memory by passing it to strlen and printf because these functions will not try to free the pointer we pass them.
This is why many standard C functions take pointers as parameters: it gives us the flexibility to decide where to store values (i.e. on the stack or the heap) regardless of how we plan to use those values.
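The same flexibility applies to functions we write ourselves. The sketch below uses two hypothetical functions: count_char borrows a read-only string, and count_in_both loans it one string stored on the stack and one stored on the heap.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Borrows s for reading only; never frees it and never keeps a copy of the pointer.
size_t count_char(const char* s, char target) {
  assert(s != NULL);
  size_t count = 0;
  for (size_t i = 0; s[i] != '\0'; i++) {
    if (s[i] == target) {
      count++;
    }
  }
  return count;
}

// Loans the same borrowing function a stack string and a heap string.
size_t count_in_both(void) {
  char on_stack[] = "Test message";             // storage reserved on the stack
  char* on_heap = malloc(strlen(on_stack) + 1); // on_heap is the owning pointer
  if (on_heap == NULL) {
    return 0;
  }
  strcpy(on_heap, on_stack);
  size_t total = count_char(on_stack, 'e') + count_char(on_heap, 'e');
  free(on_heap);  // the owner frees only after the borrower has finished
  return total;
}
```

Because count_char only borrows its argument, it does not need to know or care where the string lives.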
With both owning and borrowed pointers, we have the following details to add to our ownership model:
The second item above is a critical new detail: as long as there are borrowed copies of a pointer, we are not done with the memory it points to.
To round out our model of memory ownership we need the ability to change which pointer owns an allocated block of memory. This often comes up when we are building data structures; we may allocate memory to hold an important value, then “insert” that value into a data structure by saving a pointer to the allocated memory in the data structure. Here is an example that uses an array:
char* messages[MAX_MESSAGES];
int message_count = 0;
/**
 * Add a message to the messages array.
 *
 * \param new_message The message to store in the array.
 *                    This function takes ownership of new_message from the caller.
 * \returns true if the message was added successfully
 */
bool add_message(char* new_message) {
  if (message_count >= MAX_MESSAGES) {
    return false;
  }
  messages[message_count] = new_message;
  message_count++;
  return true;
}
Notice the comment about the new_message parameter to add_message;
this function’s documentation declares that it will take ownership of the pointer you pass to it.
That means the pointer you pass in stops being the owning pointer.
After calling this function, the owning pointer is stored in the messages array.
At some point, before the program exits, it must free each of the pointers in messages because these are owning pointers to allocated memory.
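One way to meet that obligation is a cleanup function that walks the array. This is a sketch, not part of the original example: free_messages is a hypothetical name, and the value of MAX_MESSAGES is assumed because the reading does not give it. The array and add_message are repeated so the block stands on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_MESSAGES 8  // assumed value; the reading does not specify one

char* messages[MAX_MESSAGES];
int message_count = 0;

// Takes ownership of new_message from the caller, as in the example above.
bool add_message(char* new_message) {
  if (message_count >= MAX_MESSAGES) {
    return false;
  }
  messages[message_count] = new_message;
  message_count++;
  return true;
}

// Hypothetical cleanup: frees every owning pointer stored in messages.
void free_messages(void) {
  assert(message_count >= 0 && message_count <= MAX_MESSAGES);
  for (int i = 0; i < message_count; i++) {
    free(messages[i]);
    messages[i] = NULL;  // avoid leaving a dangling pointer in the array
  }
  message_count = 0;
}
```

Note that when add_message returns false, ownership was never transferred, so the caller is still responsible for freeing the message it tried to add.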
We also have to deal with functions that transfer ownership to our code by returning a pointer.
The most obvious example of a function that does this is malloc;
we are responsible for retaining and then freeing the pointer that malloc gives us.
We will look at some additional functions that transfer ownership by returning pointers in the lab for this topic.
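As a small illustration of the same pattern in our own code, the sketch below defines a hypothetical function, copy_string, that borrows its input and returns a newly allocated copy. The name and the comment wording are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/**
 * Hypothetical example: make a heap-allocated copy of a string.
 *
 * \param source Borrowed, read-only string to copy.
 * \returns An owning pointer to the copy, or NULL if allocation fails.
 *          Ownership transfers to the caller, who must eventually free it.
 */
char* copy_string(const char* source) {
  assert(source != NULL);
  size_t size = strlen(source) + 1;  // +1 for the terminating '\0'
  char* copy = malloc(size);
  if (copy == NULL) {
    return NULL;
  }
  memcpy(copy, source, size);
  return copy;
}
```

The const on source marks it as borrowed, while the comment on the return value is the only place the transfer of ownership is recorded.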
Unlike with borrowing, we cannot transfer ownership of memory on the stack. When we take the address of a local variable we are always borrowing a pointer to that memory; it will always be “freed” when the function returns. That means you should never pass a pointer to a local variable to a function if that function expects to take ownership of the pointer.
For example, this would be an unsafe function call in the context of our previous example:
int main() {
  char some_message[] = "Hello, locals.";
  add_message(some_message);
  ...
}
The C compiler won’t do anything to stop us from writing, compiling, and running this code.
But when the program eventually frees the elements of messages we will end up with an error;
freeing a pointer that didn’t come from malloc, calloc, or realloc is always an error, and can have unpredictable and serious consequences.
You’ll need to be careful to understand the behavior of functions you call, and to be aware of whether or not they take ownership of the pointers you pass them.
This model of ownership reflects the conventions you’ll see in the C standard library and many C projects, but what does it mean for us as C programmers and how does it help us write memory-safe programs?
Critically, we must always be able to decide whether any given pointer owns the memory it points to, or if it is a borrowed pointer to that memory. Given that information, we then know our responsibilities:
If the pointer is an owning pointer, we must free the memory when we are done with it. It is safe to loan out copies of an owning pointer, but we should not free the pointer until all borrowed copies are gone or will no longer be used.
If we ever transfer ownership of this pointer to another part of the program, this pointer becomes a borrowed pointer.
You can even think of calling free(p) as a way of transferring ownership of the pointer back to the heap, although we shouldn’t do this if other parts of the program are still using borrowed copies of the pointer.
If the pointer is a borrowed pointer, we must never free it.
We can borrow copies of borrowed pointers;
for example, a function that receives a borrowed pointer as a parameter can safely loan that pointer to strlen or memcpy.
We cannot transfer ownership of a borrowed pointer (it is not ours to transfer), and there is no way to convert a borrowed pointer into an owning one.