Mini-Project 2: Check digits

Assigned: Wednesday, 14 April 2021
Summary: In this demonstration exercise, we will explore check digits, a technique for verifying that we correctly enter large strings of numbers, e.g., for account numbers or credit cards.
Collaboration: Each student should submit their own responses to this assignment. You may consult other students in the class as you review the course materials and work on the assignment. If you receive help from anyone, make sure to cite them in your responses. If you refer to reference pages on the course Web site or elsewhere, you should cite them.

Many important pieces of data are represented as large strings of digits, for example, account numbers or credit card numbers. Furthermore, these data are entered in by hand with alarming frequency. What happens if we mistype one or two characters of our credit card number into an online vendor’s website? Will we mistakenly use someone else’s account?

It turns out that these account numbers are designed so that this is not the case. Most such strings of numbers include a check digit, which is an extra number calculated from the other numbers in the string. In this manner, a system can check that an account number is entered correctly because if it isn’t, then the calculated check digit will differ from the check digit present in the entered account number. In this project we will work on developing your string skills by exploring the check-digit ideas described above.

Preliminaries & Background

Documentation

For each function that you write in this mini project, include a function comment that captures the types of the function as well as describes its output in a sentence or two. For example, here is a function comment for a function that finds the minimum of three numbers:

;;; (min-of-three x y z) -> real?
;;;   x : real?
;;;   y : real?
;;;   z : real?
;;; Returns the minimum of x, y, and z
(define min-of-three
  (lambda (x y z)
    (cond [(and (<= x y) (<= x z))
           x]
          [(and (<= y x) (<= y z))
           y]
          [else
           z])))

There are three important components to this function comment that your documentation should include:

(min-of-three x y z) -> real? : The signature of the function, which names its arguments and describes the output type of the function. In Racket, we express the types with the predicate functions that we use in code to test whether an expression has that type. For example, this signature says that min-of-three has three arguments and that it produces a real number (as tested by the real? function).
x : real? ... : The types of each of the parameters mentioned in the signature.
Returns the minimum of x, y, and z : A brief sentence or two which describes the behavior and output of the function. Here, the behavior of the function is simple so we have comparatively little to say: the function returns the minimum of its arguments.

Tests

Up until this point, we have asked you to experiment with the functions that you write in the interactions pane to check for correctness. This has the upside of being fast, but if you change your code, you need to manually type in all those tests again which is tedious. A better solution is to codify your tests in your code so that you can rerun the tests at will.

We will later learn formally about a DrRacket library called rackunit that makes test execution a breeze. For now, we’ll look at two basic RackUnit procedures: (test-true DESCRIPTION EXP) and (test-false DESCRIPTION EXP). These procedures evaluate the expression EXP and print a failure notice containing the description when the expression does not evaluate as expected. For example:

;;; (even-number? num) -> boolean?
;;;    num : number?
;;; Determines whether num is an even number
(define even-number?
  (lambda (num)
    (and (number? num) (= (remainder num 2) 0))))

> (test-true "even number" (even-number? 12))
> (test-true "zero" (even-number? 0))
> (test-true "negative even number" (even-number? -8))
> (test-false "odd number" (even-number? 3))
> (test-false "large even number" (even-number? 100000))
----------
large even number
FAILURE
name:      check-false 
location:  interactions from an unsaved editor:15:2
params:    '(#t)
----------

As you can see, when the test succeeds, we get no output. When the test fails, we get a large failure notice. We can take advantage of this behavior by putting the tests immediately after our code in the definitions pane. To use these procedures we need to (require rackunit) at the top of the definitions pane.
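As a minimal sketch of that layout, the definitions pane below pairs the even-number? procedure from above with its tests. The last test is written with test-true so that every test passes; running the file should then print nothing.

```racket
#lang racket
(require rackunit)

;;; (even-number? num) -> boolean?
;;;    num : number?
;;; Determines whether num is an even number
(define even-number?
  (lambda (num)
    (and (number? num) (= (remainder num 2) 0))))

; Tests live right after the definition, so they rerun every time we click Run.
(test-true "even number" (even-number? 12))
(test-true "zero" (even-number? 0))
(test-true "negative even number" (even-number? -8))
(test-false "odd number" (even-number? 3))
(test-true "large even number" (even-number? 100000))
```

If you later change even-number? and break it, the failing tests will announce themselves the next time you run the file.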

Turn-in details

For this assignment, create one single file titled isbn.rkt. Include your answers to all parts in this single file. Your file should contain a header (see example below), and should clearly label the various parts and subproblems using comments. Make sure to organize the file so that it’s easy for another human (e.g. your professor, mentors, graders) to read it. Turn in your file on Gradescope, and attempt to fix any errors that appear from the autograder.

#lang racket
(require rackunit)
(require csc151)

; isbn.rkt
; Author: Stu Dent
; Class: 151-02 Spring 02 2021
; Mini-Project 2, Parts 1-3
; Date: April 15, 2021
; Citations:
;  XXX
;  YYY

; Code here

Part 1: String Utilities

In this problem we will write a number of procedures that we may find useful later on when working with strings.

a. Write a procedure (all-digits? str), that takes in a string as input and determines if that string contains only digits. That is, it should return #t, if str contains only digits, and #f otherwise. Include documentation for all-digits?.

Hint: You can do this in a couple of lines of code by calling just a few of the procedures we learned in the readings on basic types. If you’re stuck on this one, ask for a hint.

b. Write six or more tests (including at least two that use test-true and at least two that use test-false) that one could use to determine whether all-digits? is behaving correctly. Make sure to consider edge cases, such as when the string starts with a plus sign or includes a period.

c. Write and document a procedure (digit->integer char) that takes a character as input and produces the integer that corresponds to that input. If the input is not a digit character, your procedure should return -1.

> (digit->integer #\2)
2
> (digit->integer #\a)
-1

Note that simply using char->integer doesn’t get you what you might expect:

> (char->integer #\0)
48

Why 48? It’s the sequence number of the character #\0. This is certainly not what we want!

How can we convert a digit-character into a number? An important property of these sequence numbers is that #\0 through #\9 are assigned sequential sequence numbers:

> (map char->integer (list #\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9))
'(48 49 50 51 52 53 54 55 56 57)

With this fact in mind, you can derive a simple computation involving the digit-character to convert that digit-character into a number.
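The same kind of reasoning works for any run of characters with consecutive sequence numbers. As an illustration that goes beyond the digits, the sketch below assumes that the lowercase letters #\a through #\z are also numbered consecutively (which holds in Unicode). The names a-code and letter-offset are hypothetical, chosen only for this example.

```racket
#lang racket

; Sequence number of the first lowercase letter.
(define a-code (char->integer #\a))

;;; (letter-offset ch) -> integer?
;;;   ch : char?, assumed to be one of #\a ... #\z
;;; Returns how far ch is from #\a (0 for #\a, 25 for #\z).
(define letter-offset
  (lambda (ch)
    (- (char->integer ch) a-code)))

(letter-offset #\a) ; 0
(letter-offset #\d) ; 3
(letter-offset #\z) ; 25
```

Subtracting the sequence number of the first character in the run turns an arbitrary-looking code into a position within the run.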

d. Document and write a procedure (trim str) that takes a string as input, checks whether there is a space at the beginning or a space at the end, and if so, removes those spaces.

e. Document and write a procedure (remove-spaces str) that takes a string as input and removes any spaces from that string. You may find string-replace or string-split useful.
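If you have not used string-replace or string-split before, the short sketch below shows how each behaves on made-up inputs. The strings are illustrative only and are not part of the assignment.

```racket
#lang racket

; string-replace swaps every occurrence of one substring for another.
(string-replace "2021 04 14" " " "-")  ; "2021-04-14"

; With no separator given, string-split breaks a string at runs of whitespace.
(string-split "the  quick brown")      ; '("the" "quick" "brown")

; With a separator, string-split breaks the string at that separator instead.
(string-split "a-b-c" "-")             ; '("a" "b" "c")
```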

f. Document and implement two more procedures that work with characters or strings that you expect might be useful.

g. Write tests for each of the procedures you wrote in parts (c) - (e). Each procedure should have 6 tests. There are no requirements on how many times you use test-true or test-false, but you should attempt to cover a variety of edge cases.

You should feel free to use any of these procedures in the remaining parts of this project.

Part 2: International Standard Book Numbers (ISBNs)

International Standard Book Numbers (ISBNs) serve as unique identifiers for books. For example, the ISBN-10 for The Book of M: A Novel is:

0062669613

(Note that ISBN-10 refers to the 10-digit version of these numbers. There also exists a 13-digit version of these numbers, ISBN-13, that was created to accommodate more books. In this assignment, we’ll focus exclusively on ISBN-10s.)

The first nine numbers of the ISBN contain various information, e.g., the location in which the title was registered and the publisher identity. The final digit is called the check digit. In our example above, 3 is the check digit.

Calculating check digits

The check digit is not arbitrarily assigned. It is instead calculated from the other 9 numbers of the ISBN number as follows:

First, we take each of the 9 non-check digits of the ISBN and, in left-to-right order, multiply the first number by 10, the second number by 9, and so forth. We then sum up the results. Note that there are 9 such digits, so the left-most is multiplied by 10 and the right-most number is multiplied by 2. In our example, we would compute:
```
10×0 + 9×0 + 8×6 + 7×2 + 6×6 + 5×6 + 4×9 + 3×6 + 2×1 = 184
```
Next, we take the sum s and compute s modulo 11. Recall that modulo computes the remainder of the division of the two numbers. This will result in a number in the range 0 through 10. In our example:
```
184 mod 11 = 8
```
We then subtract this result from 11 to arrive at the check digit:
```
11 - 8 = 3
```
*Note: This process can produce a value of 11, which happens when s modulo 11 is 0. In that case, the check digit is 0.*

Note that the expected check digit based on the remaining digits of the ISBN is 3 which is the check digit of the actual ISBN. This means that the ISBN is correctly formed and likely doesn’t contain a typo!
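The arithmetic above can be sketched in Racket as follows. This is one possible decomposition, not the required one: the names weighted-sum and expected-check-value are hypothetical, and the sketch assumes the nine non-check digits have already been converted to a list of integers.

```racket
#lang racket

;;; (weighted-sum digits) -> integer?
;;;   digits : list of nine integers, each in the range 0-9 (assumed)
;;; Multiplies the digits by 10, 9, ..., 2 in left-to-right order
;;; and returns the sum of the products.
(define weighted-sum
  (lambda (digits)
    (foldl + 0 (map * digits (range 10 1 -1)))))

;;; (expected-check-value digits) -> integer?
;;;   digits : list of nine integers, each in the range 0-9 (assumed)
;;; Returns 11 minus the weighted sum modulo 11, with 11 replaced by 0.
(define expected-check-value
  (lambda (digits)
    (modulo (- 11 (modulo (weighted-sum digits) 11)) 11)))

(weighted-sum (list 0 0 6 2 6 6 9 6 1))          ; 184
(expected-check-value (list 0 0 6 2 6 6 9 6 1))  ; 3
```

The outer modulo is one way to handle the special case in the note: it leaves the values 1 through 10 unchanged and turns 11 into 0.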

‘X’ check digit values

Note that check digits are a value in the range of 0 to 10. However 10 is not a single digit—it is made up of 2 digits (in base 10)! In this case, rather than using 10, we use X for the check digit value which is why you sometimes see ISBN-10s that end in X!

For example, let’s consider checking the ISBN 123456789X:

First we compute the multiplied sum of the first 9 digits of the ISBN as described above:
```
10×1 + 9×2 + 8×3 + 7×4 + 6×5 + 5×6 + 4×7 + 3×8 + 2×9 = 210
```
We compute the sum modulo 11: 210 mod 11 = 1
We finally subtract this quantity from 11: 11 - 1 = 10

So the expected check digit is 10 but really X in the ISBN-10. Note that this is precisely what the ISBN 123456789X ends in!
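One way to express the step from a numeric check value to the character that appears in the ISBN is a lookup into a string of the eleven possible characters. The procedure name check-value->char is hypothetical, and the sketch assumes its input is already an integer from 0 to 10.

```racket
#lang racket

;;; (check-value->char v) -> char?
;;;   v : integer in the range 0-10 (assumed)
;;; Returns the character that represents check value v in an ISBN-10:
;;; a digit character for 0-9, and #\X for 10.
(define check-value->char
  (lambda (v)
    (string-ref "0123456789X" v)))

(check-value->char 3)   ; #\3
(check-value->char 10)  ; #\X
```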

The ISBN checker program

In this problem, you will create a procedure valid-isbn?:

As input, valid-isbn? takes in a string that contains an ISBN-10 number to be verified.
As output, (valid-isbn? str) produces #t if str is a valid ISBN-10 number according to its check digit and #f otherwise.

If valid-isbn? is given a string that is not formatted correctly, then the procedure also returns #f. An input string is formatted correctly if:

The string is 10 characters.
Each character is drawn from the digits 0–9 (#\0 … #\9) or the letter ‘X’ (#\X).

For example:

> (valid-isbn? "0062669613")
#t
> (valid-isbn? "0062569613")
#f
> (valid-isbn? "123456789X")
#t
> (valid-isbn? "123456780X")
#f
> (valid-isbn? "hello world!")
#f

Note that the input to valid-isbn? is a string! It is not a single integer—note that because one of the characters of the ISBN could be ‘X’, the input cannot be an integer.

You should use your decomposition techniques to break up this procedure into relevant sub-problems so that the program is manageable to write. In particular, you ought to write an isbn? (or, better yet, a correct-isbn-format?) procedure that checks to see if its input is a correctly formatted (but not necessarily valid) ISBN according to the rules above.

There are a number of procedures that we have discussed in the course so far that you may find useful for this demonstration exercise. We list some of them here for your reference:

(modulo x y): computes x mod y, the whole number remainder left over after dividing x by y.
(string->list str): returns a list that is composed of the individual characters of str.
(reverse lst): returns lst, but reversed.
(range n): returns a list of the numbers 0 … n-1.
(map f l): returns the list l but with every element of l transformed by procedure f.
(foldl f init l): returns the result of applying binary procedure f to every element of l in left-to-right order, starting with initial value init.
(list-ref lst n): retrieves the element of lst at index n. Note that lists are zero-indexed, so the index of the first element is 0, and the index of the last element is (- (length lst) 1).
(char->integer c): converts a character c into its sequence number. (NOTE: (char->integer #\0) does not evaluate to 0! See Part 1(c) above for how to use this procedure to convert characters that are digits into integers you can do computation over.)
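As a quick refresher, here is each of the procedures listed above applied to small made-up inputs, with the expected result in a comment. The inputs are illustrative only.

```racket
#lang racket

(modulo 184 11)                ; 8
(string->list "12X")           ; '(#\1 #\2 #\X)
(reverse (list 1 2 3))         ; '(3 2 1)
(range 4)                      ; '(0 1 2 3)
(map add1 (list 1 2 3))        ; '(2 3 4)
(foldl + 0 (list 1 2 3))       ; 6
(foldl cons '() (list 1 2 3))  ; '(3 2 1), since elements are visited left to right
(list-ref (list 'a 'b 'c) 2)   ; 'c
(char->integer #\0)            ; 48
```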

Be sure to also include documentation as outlined at the start of this assignment.

Part 3: Testing your ISBN checker

Write at least 10 (ten) test cases for valid-isbn?.

(test-true MSG (valid-isbn? ...))
(test-false MSG (valid-isbn? ...))
...

Make sure you choose a variety of ISBNs that cover all possible cases of your procedure. In particular, make sure to include negative tests, ones where the ISBN is invalid as well as corner cases such as when the check digit is an X. Use real examples (from books you have lying around) and artificial examples (ones you make up).

Partial Rubric

Redo or above

Submissions that lack any of these characteristics will get an I.

[ ] Includes the specified file with the correct name
[ ] Includes an appropriate header which indicates the course, author, etc.
[ ] Code runs in DrRacket
[ ] Documentation for most procedures is included
[ ] 10 test cases are provided in part 3

Meets Expectations or above

Submissions that lack any of these characteristics will get an R.

[ ] Code for part one passes at least 90% of tests.
[ ] Code for part two passes at least 90% of tests.
[ ] All procedures in part one include 6 tests each.
[ ] Code is well-formatted with appropriate names and indentation.
[ ] Code has been reformatted with Ctrl-I before submitting.
[ ] Documentation for all procedures is included
[ ] Documentation for most procedures is correct / has the correct form

Exceeds Expectations

Submissions that lack any of these characteristics will get an M.

[ ] The extra procedures in part one are particularly creative or useful
[ ] Part 2 contains no fewer than three helper procedures that decompose the problem in appropriate ways
[ ] Code from part one passes 100% of the tests.
[ ] Code from part two passes 100% of the tests
[ ] Avoids repeated work, such as re-calculation of digits.
[ ] Tests are well-designed and include edge cases.
[ ] Documentation for all procedures is correct / has the correct form.

Citations

The Spring 2021 Term 2 version of this assignment is based closely on an earlier assignment distributed by Peter-Michael Osera and Samuel Rebelsky.

Copyright © Charlie Curtsinger, Sarah Dahlby Albright, Janet Davis, Nicole Eikmeier, Fahmida Hamid, Titus Klinge, Peter-Michael Osera, Samuel A. Rebelsky, Anya Vostinar, and Jerod Weinman. Selected materials are copyright by John David Stone or Henry Walker and are used with permission.

Unless specified otherwise elsewhere on this page, this work is licensed under a Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/3.0/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
