Array-based Lists
Today we introduced the notion of an abstract data type, a type that exposes a set of operations on its values, its interface, but does not expose how it is implemented. Abstract data types are a foundational concept in computer science. They allow us to build interchangable software components that we can develop independently, a requirement for architecting large software systems.
In this lab, we'll build a complete C project that contains:
- The definition of a
charlist
abstract data type that represents a list ofchar
values. - An implementation of the
charlist
abstract data type using arrays, an array list implementation. - A
main
function that demonstrates usage of acharlist
to implement a functiongetline
that retrieves a complete line of input from the user.
Part 1: Motivation
Consider the following implementation of a function that attempts to get a line of input from the user:
#include <stdio.h>
void fixed_getline(char *buf, FILE *in) {
char ch = fgetc(in);
while (ch != '\n') {
*buf = ch;
buf++;
ch = fgetc(in);
}
*buf = '\0';
}
Now consider the following main
function that invokes fixed_getline
to get a line of input from the user:
int main(void) {
printf("What is your name? ");
char name [10];
fixed_getline(name);
printf("Hello %s!", name);
// Point A
}
With your partner, discuss a scenario where the user's input causes a memory error in the program. In a sentence or two, classify this error and why it arises in the code.
Part 2: Setting Up the Project
Next, we will set up our project as a multifile C program. Create a new folder and add the following files to it:
charlist.h
: a C header file defining thecharlist
interface.charlist.c
: an implementation of thecharlist
interface using arrays.main.c
: the entry point of the program which will demonstrate usage ofcharlist
.
In addition to these files, we will also add a Makefile
to the folder so that we can build the program in a single short command, make
, rather than typing out the clang
invocation directly.
Add a file called Makefile
to this folder with the following contents:
main : charlist.c main.c
clang -Wall -Werror -g $^ -o $@
.PHONY : clean
clean:
rm -rf main
Note that the indentation in this file are genuine tab characters and not spaces.
With this Makefile
, you should be able to issue the command make
in the folder to build the program main
.
Part 3: Array-based Lists
Before we dive into the code, let's first understand how to implement lists using arrays.
Recall from the reading that an array is a fixed-sized, homogeneous, sequential structure.
Notably, a list is like an array but it can grow at runtime!
Thus, our choice of backing datatype for our charlist
will be an array with some extra data needed to facilitate dynamic grow.
As a starting point, let's imagine the following scenario for our charlist
:
['a']['b']['c'][ ][ ]
Here, our charlist
contains three elements, 'a'
, 'b'
and 'c'
whereas the backing array has five slots in total.
Thus, in our charlist
, we must track two values:
- The number of elements of contained by the
charlist
, its size. - The overall size of the backing array, its capacity.
Throughout the lifetime of a charlist
, we'll maintain the invariant that a charlist
's size never exceeds its capacity.
With this basic representation in mind, briefly discuss the following questions with your partner. Ask a member of the course staff if you are unsure about any of them!
- Why must our
charlist
maintain this "size ≤ capacity
" invariant? - How would we add a new character, say
'd'
to ourcharlist
? Where would'd'
go and how would we update oursize
andcapacity
accordingly? - Given that the elements of the list are stored in an array, how can we retrieve, say the element
'b'
from the list?
Part 4: The charlist
Interface and Array-based Lists
Next, we'll define the charlist
interface in charlist.h
interface.
Recall from the reading that an ADTs interface defines (a) the datatypes and (b) functions exposed by the ADT.
Implementation of these functions will eventually be found in charlist.c
.
To complete the charlist
interface, add the following to charlist.h
.
-
A
#pragma once
directive at the top of the file. Agree with your partner on what this directive does and ask a member of the course staff if you are still unsure! -
Add a definition for a
typedef
edstruct
calledcharlist
.charlist
should contain three fields corresponding to data identified in part 3. -
Signatures for the following functions over
charlist
s that we will implement:make_charlist
takes no arguments and returns a pointer to a newly allocatedcharlist
on the heap.charlist_size
takes a pointer to acharlist
as input and returns the size of thecharlist
as output.charlist_get
takes a pointer to acharlist
and index as input and returns the element at that index as output.free_charlist
takes a pointer to acharlist
and deallocates it.charlist_add
takes a pointer to acharlist
and achar
as input and adds thatchar
to the end of thecharlist
. The function does not return anything as output.
Part 5: The charlist
Implementation
Now, let's implement the first four charlist
functions in charlist.c
.
In charlist.c
you should first include a #include "charlist.h"
directive so that our implementations are consistent with the signatures defined in our interface.
From there, implement:
make_charlist
. You will find that you will need to pick an initial size for yourcharlist
's backing array. For now, you can initialize thecharlist
's array to be an array of size 10.charlist_size
. If yourcharlist
struct
is set up correctly, this function should be a one-liner!charlist_get
. Likecharlist_size
, this function should also be a one-liner. For now, you do not need to worry about the case where the index is not valid for thecharlist
.free_charlist
. In this function make sure you deallocate not only the memory for thecharlist
struct
but also its backing array!
Part 6: charlist
Add
As you saw in part 3, when there is still room in the charlist
's backing array, adding an element onto the end of charlist
is straightforward.
However, what happens when the charlist
's backing array is full?
['a']['b']['c']['d']['e']
Suppose we wanted to add 'f'
to this list.
How can we do it when the backing array is full?
Our strategy is to:
- Allocate a new backing array with more space. To avoid having to constantly reallocate memory, let's allocate a new backing array that is twice the size of the original.
- Copy the element from the old backing array to the new backing array.
- Update our
charlist
to point to this new backing array, updating the auxiliary fields as needed. Additionally, we should alsofree
the old backing array since it is no longer needed.
In our example, we would allocate a new array of size 10 since the backing array has size 5 and copy the elements over.
We can then add 'f'
to the end of the new array!
['a']['b']['c']['d']['e']
| | | | |
∨ ∨ ∨ ∨ ∨
['a']['b']['c']['d']['e']['f'][ ][ ][ ][ ]
Implement this functionality in charlist_add
.
(Hint: critically, charlist_add
should do two different depending on whether the backing array is full.
How can we test whether the backing array is full with the fields in the charlist
struct
?)
Part 7: Trying Out charlist
At this point, we have a complete charlist
implementation.
But we should make sure everything is in working order.
Copy the following skeleton into your main.c
file, filling in the middle portion with appropriate code that exercises your implementation.
#include <stdio.h>
// This directive is necessary for main.c to reference our charlist functions!
#include "charlist.h"
int main(void) {
charlist *lst = make_charlist();
// TODO: perform operations on lst and check that they work correctly!
free_charlist(lst);
return 0;
}
Part 8: Line-by-line Input
Now that our charlist
implementation is complete, we can now use it for a pragmatic task: implementing getline
in a memory fashion.
Add an implementation of void my_getline(charlist *buf, FILE *in)
to main.c
.
You can base your implementation on fixed_getline
found in part 1, replacing the explicit array manipulation with calls to charlist
functions.
Once you are done, try out my_getline
in main.c
on a fresh charlist
, ensuring that you appropriately store the line of input entered by the user into the input charlist
buffer.
(Hint 1: when calling my_getline
, you can pass stdin
for the in
parameter to read from the user's keyboard, i.e., standard input.
Note that this parameterization means that we can read lines from any stream, including files via fopen
!)
(Hint 2: provided that you add a '\0'
null-terminator to the end of the buffer, you can simply pass the backing buffer to a printf
call to check your work because it ought to have type char*
!)