## Software development final exam answers: Part 1

These are the answers, along with some commentary, for part 1 (Algorithms and Data Structures) of my software development final exam.

Each question was graded out of 5, with points awarded for different parts of each question as described below; in some cases I awarded partial marks (e.g., 2 of 3 possible marks for part of a question) when someone had the right general idea but missed some details.

### Algorithms and Data Structures

1. Question: Is O(2^n) equal to O(3^n)? Why?

Answer: No. O(f(n)) = O(g(n)) if and only if f(n)/g(n) is bounded between two positive real numbers for all sufficiently large values of n; but as n increases, the ratio 2^n / 3^n tends to zero.

Grading: I awarded 2 marks for recognizing that these two sets of functions are not equal, and 3 marks for explaining why; for full marks, I wanted to see some mention of the limiting behaviour as n increases.

Commentary: The average score on this question was 2.8/5, which I found disappointingly low. Roughly a third of people thought that the two classes were equal, with the most common justification being that "we can ignore constants". While it is true that these both fall into the "exponential with linear exponent" class E, big-O notation is more precise and requires that functions are asymptotically within a constant multiplicative factor of each other.

A few people were confused by the notation, having seen "f(n) = O(g(n))" too many times; that is a misleading abuse of the "=" symbol, since it does not exhibit equivalence-relation behaviour. A better notation is "f(n) ∈ O(g(n))", since that makes it clear that O(g(n)) represents a set of functions.

I included this question as a test of whether people knew what big-O notation really means — in practice, it's very unlikely that anyone will need to compare O(2^n) and O(3^n) algorithms, but this made the two classes of functions obscure enough that people needed to rely on their understanding of big-O notation rather than memorized hierarchies.

2. Question: What is the expected run-time of quicksort? What is the worst-case run-time?

Answer: The expected run-time of quicksort (for random inputs) is O(n log n). The worst-case run-time of quicksort is O(n^2).

Grading: I awarded 2 marks for the expected run-time and 3 marks for the worst-case run-time.

Commentary: The average score on this question was 4.8/5, which I found quite surprisingly high. I have seen many people use quicksort in situations where the worst case either occurs quite often or can be easily caused by an attacker to occur, so my hypothesis was that many software developers had forgotten about the poor worst-case behaviour; in fact, more people got the worst-case behaviour correct than the expected behaviour.

Some people pointed out that using a linear-time median-finding algorithm allows quicksort to have an improved worst-case; the observation was welcome but not necessary. A few people stated that the expected running time of quicksort was O(log n); I have no idea what sort of confusion would lead to this, unless perhaps it was simply a common typo.

3. Question: What is the difference between a binary tree [EDIT: that should be binary search tree] and a B-tree? When and why does this difference matter?

Answer: Internal nodes in a binary tree have at most two children, while internal nodes in a B-tree can have many children (depending on the block size). This is useful when the tree data is stored in a manner which exhibits large read latencies (e.g., on disk) since the higher fan-out means that the height of a B-tree is less than the height of a corresponding binary tree.

Grading: I awarded 2 marks for identifying that a B-tree can have more than 2 children per internal node; 2 marks for identifying that this makes B-trees faster on disk (or generally when access times are determined primarily by latency rather than bandwidth); and 1 mark for drawing the connection from having more children per internal node to having a smaller tree height. I awarded 1 mark to people who didn't mention that B-trees have an advantage on disk but instead mentioned that they are balanced (which not all binary search trees are); while this is true, the issue of balance alone should not cause someone to reach for a B-tree, since there are balanced binary trees (e.g., red-black trees) which are much simpler to implement than B-trees.

Commentary: The average score on this question was 3.2/5; most people knew that a B-tree had more children per internal node, but only half of respondents knew that this made them useful for data stored on disk. A sizeable number of people (including several who didn't know what exactly a B-tree was) remembered that B-trees were used by databases and filesystems but weren't sure why.

I included this question because I wanted to test not only whether people knew standard datastructures, but also whether people would be able to pick the right data structure for a task — and knowing when and why one datastructure is better than another is absolutely critical for that.

Marking this question brought to mind a common remark my mother — a former high school mathematics teacher, and IB examination marker — used to make: Answer the question asked. There were many people who explained what a B-tree is and when it's useful, but didn't explain why — and while in many cases I thought they probably knew the answer to that third part, I couldn't give marks for knowledge they didn't express.

4. Question:Name the heap operations used by heapsort and the asymptotic running time of each.

Answer:There are two operations: Heapify, which takes n elements and builds a heap out of them in O(n) time; and pop-max, which deletes the largest element from the heap and returns it in O(log n) time.

Grading: I accepted a number of different formulations for heapsort and names for the functions. For the initial heap-building process, I awarded 3 marks for stating that the process took O(n) time, or 1 mark for saying that an element could be added in O(log n) time. I awarded 2 marks for stating that it took O(log n) time to remove the maximum (or minimum, in some formulations) element and make the necessary adjustments to retain the heap property.

Commentary: The average score on this question was 2.2/5; a large number of people couldn't remember what a heap was or how heapsort worked. I found this quite disappointing; while in most cases developers will want to use either quicksort or samplesort (if not concerned about worst-case behaviour) or mergesort, heaps are nonetheless a good data structure to know about for their relevance to priority queues.

Of those people who remembered about heapsort, a large majority did not know that a heap can be constructed in linear time. This is an important thing to know if you ever intend to use a heap: While it doesn't affect the asymptotic behaviour of heapsort (since extracting the elements still takes n log n time), it reduces the implicit constant; it can also be useful if you only want to extract a limited number of the largest elements.

A few people explained how linear-time heap construction works — by "building up" with progressively larger sub-heaps — which was welcome but not necessary.

5. Question:Describe an algorithm that determines if a graph is bipartite.

Answer: Attempt to 2-colour the graph as follows:

1. Pick an uncoloured vertex; if there are none remaining, return success.
2. Colour that vertex red.
3. Perform a graph traversal (depth-first search and breadth-first search both work) starting from the vertex we picked. Every time we reach a vertex, colour it green if all its coloured neighbours are red, or red if all of its coloured neighbours are green; if it has both red and green neighbours, return failure.
4. Go to step 1.

The graph is bipartite if and only if the 2-colouring algorithm did not fail.

Grading: I awarded 2 marks for showing an understanding of what "bipartite" meant (either by stating the definition, or by providing an algorithm which demonstrated that understanding). I awarded 3 marks for a correct algorithm; or 2 marks for an algorithm which worked only for connected graphs.

Commentary: The average score on this question was 2.3/5; this had the largest number of zero scores — from people who didn't know what a bipartite graph was — but of those who didn't score zero, most received either 4/5 or 5/5 (depending on whether they handled the case of disconnected graphs; the majority did not).

I decided to include this question in part because it is not something which many software developers are likely to have needed to know recently, and is not an algorithm which is often taught (it's too simple to cover in a graph theory course, but in a first algorithms course few instructors will want to spend time explaining what a bipartite graph is). As such, it acted as a test for whether people could invent an algorithm for a simple problem rather than merely reciting those they had learned in class.

I deliberately did not provide a definition for "bipartite", opting instead to use that as a rough measure for exposure (and recollection) of basic graph theory. One person admitted to consulting a dictionary and coming up with the (incorrect) definition that a bipartite graph had two connected components; I suspect he was not the only person who turned to a dictionary, since this was a common theme among those who did not know the correct definition.

For this entire part, the average combined score was 15.2/25; approximately equal numbers of people received scores in each of the ranges 0-9, 10-15, 16-21, and 22-25. The largest surprise to me was how long it took to mark each set of answers; even after selecting questions in part based on wanting to be able to mark them quickly, it took me many hours.

I will be posting my answers, along with commentary, to part 2 of the exam — Computer Architecture and Operating Systems — shortly after I finish marking the responses I have received to it.