Assignment 3 – Exploring course data

Assigned
Friday, Feb 7, 2020
Due
Thursday, Feb 13, 2020 by 10:30pm
Summary
For this assignment, you will write procedures for exploring data about courses at Grinnell.
Collaboration
You must work with your assigned partner(s) on this assignment. You may discuss this assignment with anyone, provided you credit such discussions when you submit the assignment.
Submitting
Email your answers to csc151-02-grader@grinnell.edu. The subject of your email should be [CSC151-02] Assignment 3, and the email should contain your answers to all parts of the assignment. Submit your Scheme code as an attachment.

Throughout this course, we will be considering how to represent and process collections of data. At present, you know only a limited set of operations. You can extract portions of strings, do some numeric computation, define procedures, and work with lists using procedures like map, reduce, and drop. Perhaps not so surprisingly, even this limited set of operations provides you with some power to explore data sets.

The file courses.rkt contains a list of courses called courses. Each course is represented as a string, such as "80631 CSC-151-02 4.00 00 32 Functional Prob Solving w/lab" or "80495 THD-245-01 4.00 04 12 Lighting for the Stage". In case you could not tell, the string is arranged as follows.

  • Five characters that represent some special number used by the Office of the Registrar.
  • A space.
  • Three letters that represent the department.
  • A dash.
  • Three digits that represent the course number.
  • A dash.
  • Two digits that represent the section number.
  • A space.
  • Four characters that represent the number of credits associated with the course.
  • A space.
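  • Two characters that represent the number of available spaces. These are usually two digits, but the value can be negative (such as -1) when a course is over capacity.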
  • A space.
  • Two digits that represent the capacity of the class.
  • A space.
  • An arbitrary number of characters that represent the course title.

This approach to representing data is typically called “fixed-width fields”.
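Because every field before the title has a fixed width, each field sits at the same positions in every course string. Here is a minimal sketch of that idea, assuming 0-based positions counted from the layout above and the standard substring procedure. The name sample is used only for illustration.

```racket
#lang racket
;; One of the sample course strings from this assignment.
(define sample "80495 THD-245-01 4.00 04 12 Lighting for the Stage")

;; Positions 0 through 4 hold the registrar's five-character code.
;; substring takes an inclusive start and an exclusive end.
(substring sample 0 5)   ; "80495"

;; The title has no fixed width, so it runs from position 28 to the end.
(substring sample 28)    ; "Lighting for the Stage"
```

The other fields can be located the same way by counting characters, spaces, and dashes in the layout.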

Preparation

  • Make a copy of courses.rkt. You should not change this file.
  • Make a copy of hw03.rkt. You will put your answers to the assignment in this file.
  • Rename hw03.rkt to include your user names. For example, Fahmida and Charlie might call it hamidfah-curtsing-hw03.rkt.

Problem 1: Extracting course information

Topics: Writing your own procedures, Strings, Numbers

Our first step is to figure out how to extract all of the portions of a course. Write the following procedures.

  • (course-code course), which extracts the five-digit code as a string.
  • (course-department course), which extracts the three-letter department code as a string.
  • (course-number course), which extracts the three-digit course number as a string.
  • (course-section course), which extracts the two-digit course section as a string.
  • (course-credits course), which extracts the course credits as a real number.
  • (course-available course), which extracts the number of available seats as an integer.
  • (course-capacity course), which extracts the capacity of the course as an integer.
  • (course-name course), which extracts the name of the course.

For example,

> (course-code "80495 THD-245-01 4.00 04 12 Lighting for the Stage")
"80495"
> (course-department "80495 THD-245-01 4.00 04 12 Lighting for the Stage")
"THD"
> (course-number "80495 THD-245-01 4.00 04 12 Lighting for the Stage")
"245"
> (course-section "80495 THD-245-01 4.00 04 12 Lighting for the Stage")
"01"
> (course-credits "80495 THD-245-01 4.00 04 12 Lighting for the Stage")
4.0
> (course-available "80495 THD-245-01 4.00 04 12 Lighting for the Stage")
4
> (course-capacity "80495 THD-245-01 4.00 04 12 Lighting for the Stage")
12
> (course-name "80495 THD-245-01 4.00 04 12 Lighting for the Stage")
"Lighting for the Stage"
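Some of these procedures return numbers rather than strings, so the extracted text must be converted. One possible approach, assuming the standard string->number procedure, is sketched below on literal field values taken from the sample strings in this assignment.

```racket
#lang racket
;; string->number parses the text of a numeral.
(string->number "4.00")  ; 4.0, an inexact real
(string->number "04")    ; 4, the leading zero is accepted
(string->number "-1")    ; -1, negative values also work
(string->number "THD")   ; #f, because this text is not a numeral
```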

Problem 2: Deriving course information

Topics: Writing your own procedures, Numeric computation

In addition to extracting information, we may also want to compute values based on the information in a course. Write the following procedures:

  • (course-enrollment course), which gives the number of students enrolled in a course.
  • (course-level course), which gives the “level” of a course as a string (“100”, “200”, “300”, or “400”).
  • (course-sch course), which gives the “student credit hours” for a course, the product of the number of students and the number of credits.

For example,

> (course-enrollment "80495 THD-245-01 4.00 04 12 Lighting for the Stage")
8
> (course-level "80495 THD-245-01 4.00 04 12 Lighting for the Stage")
"200"
> (course-sch "80495 THD-245-01 4.00 04 12 Lighting for the Stage")
32.0
> (course-enrollment "81348 ALS-100-04 2.00 -1 08 Brazilian Portuguese I")
9
> (course-level "81348 ALS-100-04 2.00 -1 08 Brazilian Portuguese I")
"100"
> (course-sch "81348 ALS-100-04 2.00 -1 08 Brazilian Portuguese I")
18.0
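The arithmetic behind the second set of examples can be checked by hand. The sketch below uses literal values copied from the Brazilian Portuguese I string. The names credits, available, capacity, and enrolled are illustrative and are not part of the assignment. It also shows one way, among several, to turn a course number such as "245" into a level.

```racket
#lang racket
;; Values copied from "81348 ALS-100-04 2.00 -1 08 Brazilian Portuguese I".
(define credits 2.0)
(define available -1)   ; negative: the course is over capacity
(define capacity 8)

;; Enrollment is the capacity minus the seats still available.
(define enrolled (- capacity available))  ; 9

;; Student credit hours: students times credits.
(* enrolled credits)  ; 18.0

;; A level keeps the first digit of the course number and appends "00".
(string-append (substring "245" 0 1) "00")  ; "200"
```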

Problem 3: Sorting lists of courses

Topics: Writing your own procedures, Sorting, Strings and string manipulation

We’ve been working with individual courses. But we have a list of courses. And there are, of course, many things we can do with a list.

You can find the first few courses with a command like (take courses 20). As you may have noticed, the courses are organized by course number. What if we wanted to organize them in a different way?

Write and document (with 6P’s) a procedure, (sort-courses-by-course courses), that rearranges the courses so that they are sorted by the course info. Your procedure must return a list of courses ordered by department. Courses in the same department must be ordered by course number. Courses with the same department and course number must be ordered by section number.

Hint: Recall that you can sort a list of strings with (sort courses string<=?). How can you sort using part of the string? Here’s one approach: Put the course info at the start of each string (using map), sort the new list of changed strings (using sort), and then strip out the portion that you added (using map again).
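The three steps in the hint can be sketched on a toy list that is unrelated to the course data. The records below are hypothetical: each is a name, a space, and a two-character key, and the helper names record-key, decorate, and undecorate are made up for this illustration.

```racket
#lang racket
;; Hypothetical records: a name, a space, and a two-character key.
(define records (list "pear 03" "plum 01" "fig 02"))

;; The key is the last two characters of a record.
(define (record-key r)
  (substring r (- (string-length r) 2)))

;; Step 1: put the key at the start of each string.
(define (decorate r)
  (string-append (record-key r) r))

;; Step 3: strip the two characters that decorate added.
(define (undecorate s)
  (substring s 2))

;; Step 2 happens in the middle: sort the decorated strings.
(define result
  (map undecorate (sort (map decorate records) string<=?)))

result  ; '("plum 01" "fig 02" "pear 03")
```

Stripping is simple here because the added key always has the same length.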

Problem 4: Filtering lists of courses

Topics: Writing your own procedures, Sorting, Lists and list operations

What else can we do with the list of courses? We might want to extract sublists of courses. For example, it would be useful to have just the courses in Theatre and Dance, or just in Computer Science.

a. Write and document (with 6P’s) a procedure, (filter-department courses department), that extracts all of the courses in a particular department. You may not use the filter operation (which you should not yet have learned).

How will you write this procedure? You can achieve these results with a combination of drop, map, sorting, index-of, reverse and take.

Hint: You may want to build some associated lists, such as a list of just the department identifier.

Hint: Break the problem down into parts and write a separate procedure for each part. Then tie them together.
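To see how these operations can combine, consider a toy list of department identifiers that is already sorted. The sketch below assumes the index-of from standard Racket's racket/list, which takes the list first and returns #f when the value is absent. The course library may order the arguments differently. The names depts, start, after-end, and how-many are illustrative.

```racket
#lang racket
;; A hypothetical, already-sorted list of department identifiers.
(define depts (list "ART" "CSC" "CSC" "MAT" "THD"))

;; Position of the first "CSC".
(define start (index-of depts "CSC"))                ; 1

;; Number of elements that follow the last "CSC",
;; found by searching the reversed list.
(define after-end (index-of (reverse depts) "CSC"))  ; 2

;; Whatever is neither before the first nor after the last is a match.
(define how-many (- (length depts) start after-end)) ; 2

(take (drop depts start) how-many)  ; '("CSC" "CSC")
```

The subtraction fails if the department does not appear at all, so that case needs separate thought.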

b. Write and document (with 6P’s) a procedure, (filter-level courses level), that extracts all of the courses at a particular level (100, 200, 300 or 400).

Problem 5: Computing course statistics

Topics: Writing your own procedures, Lists and list operations, Numeric computation

a. Write and document (with 6P’s) a procedure, (average-course-size courses), that takes a list of courses as input and computes the average course capacity.

b. Using average-course-size and other procedures you have written or will write, find out the average course size in five departments of your choice. (When you submit this exercise, include the expressions you typed to compute those average sizes.)

c. Using average-course-size and other procedures, find out the average course size at each course level.

d. Write a RackUnit test suite for average-course-size. You may assume that the inputs to average-course-size have the proper types and are formatted correctly, but be sure to consider course lists different from the example given.

We are likely to run your tests on some non-working versions of average-course-size. Your tests should catch any reasonable errors. Try to think of edge cases, or reasonable ways in which someone could write a non-working version.
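As a reminder of the general shape of a RackUnit suite, here is a sketch that tests a hypothetical procedure, mean, on plain lists of numbers. Neither mean nor the suite name mean-tests is part of the assignment. The sketch uses standard Racket's apply so that it runs without the course library.

```racket
#lang racket
(require rackunit)
(require rackunit/text-ui)

;; Hypothetical procedure under test: the mean of a nonempty list of numbers.
(define (mean numbers)
  (/ (apply + numbers) (length numbers)))

(define mean-tests
  (test-suite
   "Tests of mean"
   (test-case "a single value is its own mean"
     (check-equal? (mean (list 12)) 12))
   (test-case "repeated values"
     (check-equal? (mean (list 8 8 8)) 8))
   (test-case "exact inputs can give a fraction"
     (check-equal? (mean (list 1 2)) 3/2))
   (test-case "inexact inputs are compared within a tolerance"
     (check-= (mean (list 1.5 2.5)) 2.0 0.0001))))

;; Prints a summary and returns the number of unsuccessful tests.
(run-tests mean-tests)
```

Cases like the single-element list and the fractional result tend to expose versions that round too early or miscount the list.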

Problem 6: Exploring course information

Topics: Writing your own procedures, Lists and list operations, Numeric computation, Data science

Using the tools you have written and any others you choose to write, find three other interesting characteristics of the course offerings at Grinnell.

Note: This problem is intentionally left open ended. Your challenge is to think about what kinds of information you might extract or compute.

Evaluation

We will primarily evaluate your work on correctness (does your code compute what it’s supposed to and are your procedure descriptions accurate); clarity (is it easy to tell what your code does and how it achieves its results; is your writing clear and free of jargon); and concision (have you kept your work short and clean, rather than long and rambly).