Pointers and Memory Allocation

Goals

This lab provides practice with basic elements of pointers, addresses, values, and memory allocation in C.

Parts of the lab will deliberately produce "errors" and ask you to work out why they occur. If you read a little further in the lab, the explanation will usually be given.

Since MathLAN is often updated, it is possible that instructions will not run as expected.

Printing Memory Addresses

Download and examine the short C program memory.c that declares and initializes an int, a double, and a string. It then prints the address of and value stored in each of the variables.
  1. To run a new shell that disables memory address randomization (a security feature) in Linux, run the following Terminal command before proceeding.
    setarch $(uname -m) --addr-no-randomize /bin/bash
    1. Compile and run the memory.c program. Make sure you understand the output.
    2. Draw a small memory diagram showing the location of each of the variables in the program. Are they allocated in the same order that you declared them? Is there any empty space between them?
    3. Modify the program by rearranging the variable declarations and/or changing the length of the string. (In particular, try a string that uses 5 or 7 bytes, including the null terminator.) Does this change the results you got previously?

The take-home message:

Small changes within a program can change how its memory is laid out. The compiler will try to arrange memory for good performance, and this may include aligning variables on 4-byte (or larger) boundaries. For C programmers, this can mean that a program that appears to work correctly (but in fact writes past the end of an array) suddenly stops working after a seemingly innocuous change, for example, changing the order in which variables are declared.

Allocating and Freeing Memory

The Variable-Length Array (VLA) option within 1999 Standard C allows yet another mechanism for declaring arrays, as follows.

int numapps = 0;
printf("Number of applicants? \n");
scanf("%d", &numapps);
Applicant roster[numapps];

Here the size of the array is given by a variable (numapps), and a value must be assigned to this variable before space for the array is allocated. This technique allows the program to set the size of an array at run time (possibly getting it from the user); but once the array is declared, its size cannot be changed. (The details are discussed in King, Section 8.3.)

  1. What can the compiler tell us about the size of VLAs, versus statically declared arrays, versus dynamically allocated memory? To answer this question, consider the following program.
    /* Program to compare sizes of static arrays, VLAs, and malloced memory. */
    #include <stdio.h>
    #include <stdlib.h>
    
    int
    main(void)
    {
      int length = 5;
    
      int staticArray[5];                  /* Compiler statically allocated array */
      int varLenArray[length];             /* Program dynamically allocates array */
                                       /* Programmer dynamically allocates memory */
      int * dynamicArray = malloc(length * sizeof(int)); 
    
      if (dynamicArray==NULL) {      /* Verify memory was available and allocated */
        printf("unable to allocate dynamic array. exiting.\n");
        return 1;
      }
      
      /* sizeof yields a size_t, which is printed with %zu */
      printf("sizeof(int) = %zu\n", sizeof(int) );
      printf("sizeof(staticArray) = %zu\n", sizeof(staticArray) );
      printf("sizeof(varLenArray) = %zu\n", sizeof(varLenArray) );
      printf("sizeof(dynamicArray) = %zu\n", sizeof(dynamicArray) );
    
      free(dynamicArray);                               /* Free memory allocation */
    }
    
    1. What output do you expect this program to produce?
    2. Copy, compile, and run the program to verify your predictions.

The take-home message:

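You should only use the sizeof operator on types, and never on variable names. Applied to a variable, sizeof reports the size of that variable's own type, so for a pointer such as dynamicArray it gives the size of the pointer rather than the size of the memory it points to. Such uses frequently lead to confusion, especially with arrays that are passed to functions as parameters.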

  1. We are going to practice with memory allocation by creating a roster of applicants for a student club. Each student's information will be stored in a variable that has a structure type. But since we hope to have quite a lot of them (and we are not sure how many applicants we will have until we start running the program), we will use a variable-length array to store the collection of applicants. Once the records are entered, the program will print them in entry order and then in reverse order. So, copy, compile, and run club-roster.c and be sure you understand how it works. (Use a small number for the number of applicants!)

  2. Next, we explore the alternative strategy using dynamic memory allocation. You might want to comment out the existing declaration for roster rather than deleting it!
    1. Within club-roster.c, replace the declaration
        Applicant roster[numapps];
      with the lines
        Applicant * roster;
        roster = malloc (numapps * sizeof (Applicant));
      Notes:
      • In this revised declaration, roster is a pointer to the first element of an array of applicants. That is, roster identifies the location of an array where each element has type Applicant.
      • In the first line above, roster only indicates a location for an array. Thus the program must allocate space for the array separately, in the second line. The malloc statement asks the C library to perform this memory allocation.
      • Once declared and initialized, references to the roster array are exactly the same as in the original version.
    2. Add a statement to verify the memory allocation (printing an error message and exiting if it fails).
    3. Add a statement to free the memory when it is no longer needed.
    4. Recompile and run the revised program club-roster.c.
    5. Draw a diagram of main memory for both the original and revised versions of club-roster.c. In the diagram, show what variables are stored on the run-time stack and what information (if any) is stored elsewhere.

Memory Leaks and Other Problems

  1. Consider the following program.
    #include <stdio.h>
    #include <stdlib.h>
    
    int
    main (void)
    {
      int j=0;
    
      while (j>=0) {
        int n = 100000000;
        int * a = (int*) malloc (n * sizeof(int));
        
        for (int i=0; i < n; i++)
          a[i] = i;
        
        j++;
        printf ("%d\n", j);
      }
    
      return 0;
    } // main
    
    1. What is wrong with the program? What do you expect it to do when run?
    2. Now copy the program and run it. On some machines, it prints numbers up to somewhere between 20 and 30 before it crashes. How about yours? Do you understand why it crashes?

      (If you find yourself waiting for the crash, read on below about Address Sanitizer.)

    3. Add the following code immediately after the malloc call to confirm your understanding.
      if (a==NULL)
        {
          perror ("Error allocating memory");
          exit (EXIT_FAILURE);
        }
      The library function perror(), declared in stdio.h, prints a message describing the most recent error recorded (in the errno variable) by a system or C library call. Thus, with this placement, perror will report any error that occurs in the call to malloc. (We will discuss system calls later in the course.)

      If you still are not sure why the error occurred, please ask.

Detecting Memory Errors

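In the next few exercises, you will experiment with AddressSanitizer (or ASan), a tool built into clang (and also available in recent versions of gcc) that can detect and report on several types of errors related to dynamic memory management:

  • freeing the same memory twice (a "double free"),
  • accessing memory after it has been freed ("use after free"),
  • accessing memory outside the bounds of an allocated block, and
  • memory leaks.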

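ASan instruments your program when it is compiled, adding checks around memory accesses and substituting its own versions of malloc and free. This allows it to monitor your use of memory and report related errors as the program runs. The extra checks add overhead, so you may notice that the program runs more slowly and uses more memory.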

  1. Modify your program from the previous exercise so that it allocates (and fails to free) only ten arrays by changing your while loop to while (j < 10). To enable AddressSanitizer in your executable, you need to use the compiler flag -fsanitize=address along with the -g flag to get the helpful debugging symbols. Using make, you'd add:
    make CFLAGS="-g -fsanitize=address" target
  2. After compiling with ASan enabled, run your program. Your program will function normally until AddressSanitizer detects one of the errors above, at which point it prints a lot of diagnostic information related to the problem.
  3. Read through the output to make sure you understand what problem it has diagnosed and where (i.e., on what line of code) the problem occurs.
  4. Modify your code from the previous exercise to free the memory you have allocated. Note that you will need a call to free in each loop iteration, so that you can free the memory before you lose the pointer to it!

    Now rebuild your code with ASan again and run it. What happens to the output when you run the program?

  5. In this exercise, you will experiment with a few more memory-related errors ASan can catch.
    1. Add an extra call to free() somewhere in your program. Then rebuild your program, run it, and examine the diagnostic output. (After you have done so, remove the offending call again.)
    2. Another common error that ASan can catch is accessing memory after it has been freed (known as "use after free" or a "dangling pointer dereference"). To test this, add statements such as the following immediately after your call to free().
      a[0] = 5;
      printf("a[0]=%d\n", a[0]);
      Compile your program, run it, and study the ASan diagnostic output. After you understand what "use after free" means, remove the offending code from your program.
    3. ASan can also tell you when you access elements that are out-of-bounds of an allocated memory block. Modify your program to test this, noting what information ASan gives you about the error. (Then remove the error afterwards.)
  6. Note that ASan will stop your program when it encounters the first three kinds of memory errors, but it can only report a memory leak after the program completes. To learn more, look at the on-line documentation for AddressSanitizer.

Note that compilers have become increasingly "smart" and may optimize away some obviously problematic code, for example by removing an allocation whose contents are never used. If that happens, there may be no error for AddressSanitizer to catch when you do this lab on your personal computer.