Array-based Lists

Today we introduced the notion of an abstract data type, a type that exposes a set of operations on its values, its interface, but does not expose how it is implemented. Abstract data types are a foundational concept in computer science. They allow us to build interchangable software components that we can develop independently, a requirement for architecting large software systems.

In this lab, we'll build a complete C project that contains:

  • The definition of a charlist abstract data type that represents a list of char values.
  • An implementation of the charlist abstract data type using arrays, an array list implementation.
  • A main function that demonstrates usage of a charlist to implement a function getline that retrieves a complete line of input from the user.

Part 1: Motivation

Consider the following implementation of a function that attempts to get a line of input from the user:

#include <stdio.h>

void fixed_getline(char *buf, FILE *in) {
    char ch = fgetc(in);
    while (ch != '\n') {
        *buf = ch;
        buf++;
        ch = fgetc(in);
    }
    *buf = '\0';
}

Now consider the following main function that invokes fixed_getline to get a line of input from the user:

int main(void) {
    printf("What is your name? ");
    char name [10];
    fixed_getline(name);
    printf("Hello %s!", name);
    // Point A
}

With your partner, discuss a scenario where the user's input causes a memory error in the program. In a sentence or two, classify this error and why it arises in the code.

Part 2: Setting Up the Project

Next, we will set up our project as a multifile C program. Create a new folder and add the following files to it:

  • charlist.h: a C header file defining the charlist interface.
  • charlist.c: an implementation of the charlist interface using arrays.
  • main.c: the entry point of the program which will demonstrate usage of charlist.

In addition to these files, we will also add a Makefile to the folder so that we can build the program in a single short command, make, rather than typing out the clang invocation directly. Add a file called Makefile to this folder with the following contents:

main : charlist.c main.c
	clang -Wall -Werror -g $^ -o $@

.PHONY : clean
clean:
	rm -rf main

Note that the indentation in this file are genuine tab characters and not spaces. With this Makefile, you should be able to issue the command make in the folder to build the program main.

Part 3: Array-based Lists

Before we dive into the code, let's first understand how to implement lists using arrays. Recall from the reading that an array is a fixed-sized, homogeneous, sequential structure. Notably, a list is like an array but it can grow at runtime! Thus, our choice of backing datatype for our charlist will be an array with some extra data needed to facilitate dynamic grow.

As a starting point, let's imagine the following scenario for our charlist:

['a']['b']['c'][   ][   ]

Here, our charlist contains three elements, 'a', 'b' and 'c' whereas the backing array has five slots in total. Thus, in our charlist, we must track two values:

  • The number of elements of contained by the charlist, its size.
  • The overall size of the backing array, its capacity.

Throughout the lifetime of a charlist, we'll maintain the invariant that a charlist's size never exceeds its capacity.

With this basic representation in mind, briefly discuss the following questions with your partner. Ask a member of the course staff if you are unsure about any of them!

  1. Why must our charlist maintain this "size ≤ capacity" invariant?
  2. How would we add a new character, say 'd' to our charlist? Where would 'd' go and how would we update our size and capacity accordingly?
  3. Given that the elements of the list are stored in an array, how can we retrieve, say the element 'b' from the list?

Part 4: The charlist Interface and Array-based Lists

Next, we'll define the charlist interface in charlist.h interface. Recall from the reading that an ADTs interface defines (a) the datatypes and (b) functions exposed by the ADT. Implementation of these functions will eventually be found in charlist.c.

To complete the charlist interface, add the following to charlist.h.

  • A #pragma once directive at the top of the file. Agree with your partner on what this directive does and ask a member of the course staff if you are still unsure!

  • Add a definition for a typedefed struct called charlist. charlist should contain three fields corresponding to data identified in part 3.

  • Signatures for the following functions over charlists that we will implement:

    • make_charlist takes no arguments and returns a pointer to a newly allocated charlist on the heap.
    • charlist_size takes a pointer to a charlist as input and returns the size of the charlist as output.
    • charlist_get takes a pointer to a charlist and index as input and returns the element at that index as output.
    • free_charlist takes a pointer to a charlist and deallocates it.
    • charlist_add takes a pointer to a charlist and a char as input and adds that char to the end of the charlist. The function does not return anything as output.

Part 5: The charlist Implementation

Now, let's implement the first four charlist functions in charlist.c. In charlist.c you should first include a #include "charlist.h" directive so that our implementations are consistent with the signatures defined in our interface. From there, implement:

  • make_charlist. You will find that you will need to pick an initial size for your charlist's backing array. For now, you can initialize the charlist's array to be an array of size 10.
  • charlist_size. If your charlist struct is set up correctly, this function should be a one-liner!
  • charlist_get. Like charlist_size, this function should also be a one-liner. For now, you do not need to worry about the case where the index is not valid for the charlist.
  • free_charlist. In this function make sure you deallocate not only the memory for the charlist struct but also its backing array!

Part 6: charlist Add

As you saw in part 3, when there is still room in the charlist's backing array, adding an element onto the end of charlist is straightforward. However, what happens when the charlist's backing array is full?

['a']['b']['c']['d']['e']

Suppose we wanted to add 'f' to this list. How can we do it when the backing array is full?

Our strategy is to:

  1. Allocate a new backing array with more space. To avoid having to constantly reallocate memory, let's allocate a new backing array that is twice the size of the original.
  2. Copy the element from the old backing array to the new backing array.
  3. Update our charlist to point to this new backing array, updating the auxiliary fields as needed. Additionally, we should also free the old backing array since it is no longer needed.

In our example, we would allocate a new array of size 10 since the backing array has size 5 and copy the elements over. We can then add 'f' to the end of the new array!

['a']['b']['c']['d']['e']
  |    |    |    |    |
  ∨    ∨    ∨    ∨    ∨
['a']['b']['c']['d']['e']['f'][   ][   ][   ][   ]

Implement this functionality in charlist_add.

(Hint: critically, charlist_add should do two different depending on whether the backing array is full. How can we test whether the backing array is full with the fields in the charlist struct?)

Part 7: Trying Out charlist

At this point, we have a complete charlist implementation. But we should make sure everything is in working order. Copy the following skeleton into your main.c file, filling in the middle portion with appropriate code that exercises your implementation.

#include <stdio.h>

// This directive is necessary for main.c to reference our charlist functions!
#include "charlist.h"

int main(void) {
    charlist *lst = make_charlist();

    // TODO: perform operations on lst and check that they work correctly!

    free_charlist(lst);
    return 0;
}

Part 8: Line-by-line Input

Now that our charlist implementation is complete, we can now use it for a pragmatic task: implementing getline in a memory fashion. Add an implementation of void my_getline(charlist *buf, FILE *in) to main.c. You can base your implementation on fixed_getline found in part 1, replacing the explicit array manipulation with calls to charlist functions.

Once you are done, try out my_getline in main.c on a fresh charlist, ensuring that you appropriately store the line of input entered by the user into the input charlist buffer.

(Hint 1: when calling my_getline, you can pass stdin for the in parameter to read from the user's keyboard, i.e., standard input. Note that this parameterization means that we can read lines from any stream, including files via fopen!)

(Hint 2: provided that you add a '\0' null-terminator to the end of the buffer, you can simply pass the backing buffer to a printf call to check your work because it ought to have type char*!)