Memory Management Revisited

C is a unique programming language in that the programmer is wholly responsible for managing their program uses and/or abuses memory. The old adage is particularly useful to remember here:

With great power comes great responsibility!

Indeed, manual memory management is both C's greatest strength and weakness. Certain patterns of low-level, systems programming are quite natural to perform in C. However, we must be diligent in ensuring that our programs do not cause memory safety violations.

For this final leg of the course, we revisit this critical "feature" of C, looking at more advanced, modern perspectives on how to take advantage of manual memory management and stay safe while doing so.

Review: Stack and Heap Allocation

In C, we can allocate data in one of two places:

  • On the stack via a local variable declaration or a function parameter.

    int num_widgets = 0;
    
    void process(int amount) {
      // ...
    }
    
  • On the heap via a call to malloc.

    int *mutable_cell = (int*) malloc(sizeof(int))
    

    (Note: this line allocates two chunks of memory: a pointer on the stack called mutable_cell and an int-sized chunk on the heap that mutable_cell points to!)

Stack allocation has many beneficial properties:

  • Stack allocation is fast, requiring no additional function calls.
  • Stack allocation is direct, i.e., we (usually) manipulate stack-allocated values directly rather than indirectly via pointers.
  • Cleaning up stack-allocated memory is automatic, occurring whenever a function call returns.

Because of these properties, we should allocate our data primarily on the stack whenever possible. However, there are limitations to this method of allocation:

  • Stack-allocated chunks cannot exist outside of the stack frame they were allocated in.
  • The stack has a limited amount of space. On modern operating systems, this is usually in the range of 1–8 MiB.
  • The size of stack-allocated chunks must be known at compile-time, i.e., their sizes must be statically determined. In other words, a stack-allocated chunk's size cannot depend on a value known only at runtime, e.g., the number of characters a user will enter at a prompt.

In some cases, we can design our program to circumvent these problems and use stack allocation. In other cases, we must resort to using heap allocation via malloc.

"Out" Parameters

In C, like most languages, a function can only return one value. However, we frequently encounter situations where we want a function to produce multiple values. For example, consider a function read_scores that reads a series of integers from the user, e.g., using fgetc and strtol. This function ought to return an array (likely allocated on the heap) that contains the entered integers. However, the function also needs to report the size of the array as well.

Because C is imperative, i.e., allows values to be mutated, and has explicit pointer types, the idiomatic way of handling this problem is by using the return value of read_scores for the array and passing a pointer to a location that read_scores will fill in with the size of the array. This pointer parameter is called an out parameter. We would write the function read_scores as follows:

int* read_scores (int *sz_out) {
    // ... 
}

Inside of read_scores, we can then write the size of the array to the sz_out parameter. The caller of read_scores would then allocate space for the size, presumably on the stack, and pass a pointer to that chunk to read_scores.

int sz = 0;   // this 0 will be overwritten by read_scores
int *nums = read_scores(&sz);

In some cases, these "out" parameters also allow us to get around the lifetime restrictions of stack-allocated chunks by "inverting" where we perform allocation. For example, recall that stack-allocated arrays are not returned by-copy like other types. We, instead, return a pointer to that stack-allocated array implicitly.

char* read_fixed_size_input() {
  char result [50];
  // ...
  return result;    // returns a pointer to the array, not a copy!
}

This function immediately has a memory error! The returned pointer to result is a dangling pointer because the result array is deallocated when the function returns.

Instead of allocating result inside of read_fixed_size_input, the caller of the function can allocate the result array and pass it as an "out" parameter to the function:

void read_fixed_size_input(char *result) {
  // Now, the function mutates result and returns nothing!
}

int main() {
  char result[50];
  read_fixed_size_input(result);
}

Variable-length Arrays

While "out" parameters solve the lifetime issues associated with stack memory, they do not address the other issues surrounding large, dynamically-determined sizes. An extension to C available in most compilers, variable-length arrays, allows us to use stack-allocated arrays in situations where the size of the desired data is fixed-size but that size is determined.

As a somewhat contrived example, consider the following program snippet that first prompts the user for the size of their name and then uses a variable-length array (VLA) to store that name:


int main() {
  int len = 0;
  printf("How long is your name? ");
  scanf("%d", &len);

  // Consumes the newline character left behind from scanf
  fgetc(stdin);

  printf("What is your name? ");
  char buf [len+1];
  for (int i = 0; i < len; i++) {
    buf[i] = fgetc(stdin);
  }
  buf[len] = '\0';

  printf("Your name is \"%s\"\n", buf);
}

In the code snippet above, note how buf's length is determined dynamically, i.e., it is a value entered by the user! Otherwise, the code behaves as expected, prompting the user first for the length of their name before asking for it. The VLA allows us to cope with a name of any (reasonable) size provided the user gives us the length up-front.

While useful in some cases, VLAs tend to not be as useful in practice as we might expect. The above code snippet demonstrates why: it is rare to be in a situation where our data is both fixed-size but dynamically known. A better prompting function would use our charlist implementation from a previous lab to grow our buffer as the user enters characters rather than demanding the user give us the bound up-front.

When to Allocate on the Heap

We have seen several ways of mitigating the perils of stack allocation. Notably, if we need to avoid a dangling pointer issue where we want to return a chunk of memory from a function, we can allocate the memory outside of the function and use "out" parameters to modify that chunk. However, in situations where:

  • We need to allocate large chunks of memory that can quickly exceed the size of the stack.
  • Our memory demands are variable-sized and dynamically determined.

We need to resort to heap allocation with malloc!

Copying and Sharing Data

One area where we need to be especially cognizant of how we manage our memory is copying data. For example, consider our standard linked list implementation over ints:

typedef struct node {
    int value;
    struct node *next;
} node;

typedef struct {
    node *first;
} list;

node* make_node(int value, node *next) {
    node *ret = (node*) malloc(sizeof(node));
    ret->value = value;
    ret->next = next;
    return ret;
}

list* make_list() {
    list *ret = (list*) malloc(sizeof(list));
    ret->first = NULL;
    return ret;
}

Now consider implementing list* clone_list(list *l) which returns a copy of input list l. A natural implementation of this function is to create new copies of each node as we traverse l:

list* clone_list(list *l) {
    list *ret = make_list()
    if (l->first != NULL) {
        ret->first = make_node(l->first->value, NULL)
        node *cur = l->first;
        node *ret_cur = ret->first;
        while (cur->next != NULL) {
            ret_cur->next = make_node(cur->next->value, NULL);
            cur = cur->next;
            ret_cur = ret_cur->next;
        }
    }
    return ret;
}

For example, the following small program utilizing clone_list:

int main() {
  list *l1 = make_node(0, make_node(1, make_node(2, NULL)));
  list *l2 = clone_list(l1);
  // Point A
}

Produces the following memory layout at the end of the program:

Stack       Heap
-----       ----
l1 [o]----->[o]-->[0][o]-->[1][o]-->[2][/]

l2 [o]----->[o]-->[0][o]-->[1][o]-->[2][/]

However, this is not the only choice of implementation for clone_list. Consider this much simpler implementation instead:

list* clone_list(list *l) {
    list *ret = make_list()
    ret->first = l->first;
    return ret;
}

What memory layout do we obtain with our main program now? We obtain this, very different, layout!

Stack       Heap
-----       ----
l1 [o]----->[o]-->[0][o]-->[1][o]-->[2][/]
                   ∧ 
l2 [o]----->[o]----|

With this second implementation, l2 now shares its nodes with l1!

We say that the first implementation of clone_list produces a deep copy of its input. In contrast, the second implementation produces a shallow copy. With a shallow copy, the resulting copy of the value shares part or all of its memory with its original. In contrast, a deep copy is completely independent of the original.

From both implementations, it should be clear that creating shallow copies of an object is more efficient than producing deep copies. A deep copy requires that we traverse the entire structure to copy values whereas a shallow copy can be made with just a few pointer copies.

However, is sharing nodes between linked lists like this bad? If we never mutated the contents of the linked list, then we would not be able to observe a difference between a shallow and a deep copy! But, we're programming in C and mutation is everywhere, so we have to contend with these effects. In short, whether sharing nodes is acceptable or even preferred depends on the application at hand.

In today's class, we'll continue exploring some of these concepts surrounding memory sharing through several examples. Importantly, we'll also look at a major problem unique to our low-level context when sharing memory: deletion.