March 29, 2010
“The programmer builds from pure thought-stuff: concepts and very flexible representations thereof. Because the medium is tractable, we expect few difficulties in implementation; hence our pervasive optimism. Because our ideas are faulty, we have bugs; hence our optimism is unjustified.”
- Frederick Brooks, The Mythical Man-Month
I was on an evening train home when, re-reading The Mythical Man Month for the first time in almost ten years, I came across the above words. It summed up my fourth attempt at the Karate Chop code kata: because our ideas are faulty, we have bugs.
My fourth algorithm idea was faulty, I just didn’t quite want to admit it. My ideas were faulty because of lack of mental preparation. I was tired (I’ve not been sleeping well for a while now), distracted, and in a hurry to get past the Karate Chop kata and move onto another challenge. I didn’t think things through when I started, I just fired up Aquamacs and began to type.
The idea was simple enough – strip the algorithm down of a lot of state and just “bounce” from position to position within the list. Starting from the 0th element, we determine an offset which is equal to half the length of the list. The offset is added to the starting point to determine the position to check.
If we don’t match, we halve the offset and either add or subtract the offset from the current position to yield our new position, depending on whether the item we checked was higher or lower in value than the one we’re searching for.
It sounds simple enough, and plausible enough, so I really should’ve run a quick check in my head or on a sticky note to verify it actually worked. The fact I don’t recall seeing this process before should’ve tipped me off to needing to double-check it.
Thing were fine to begin with, except for one failing test. It was the first off-by-one error I had encountered so far during the Karate Chop implementations, I’m rather proud to say, so it seemed to be quite a simple thing to fix. Oh was I wrong: a simple change resulted in multiple failed tests. Hasty debug statements (I’m guessing I’ll need to learn a Java debugging tool for Clojure development) began to highlight much weirdness with each loop through the search. Values didn’t seem to tally to the expected values, and the search position was often wildly off.
A hasty bit of refactoring to eliminate duplication and side-effects as possible causes didn’t offer much improvement, though I learnt to apply some interesting bits of Clojure in the process. In the end, I had a pretty good idea that the algorithm was flawed, but I kept convincing myself that things like padding the list or adding special cases would massage it into place.
Some days you just have to down tools, stand back, learn from your experiences and move on. The idea started off simple but started to grow… and smell.
In the current form, the algorithm fails on three tests:
(assert (= 0 (chop 1 [1 3 5]) )) (assert (= 2 (chop 5 [1 3 5]) )) (assert (= 0 (chop 1 [1 3 5 7]) ))
The bit that stood out straight away was that they were all tests for a value on the boundary of the list. Analysis of the code determined that the search algorithm never reaches the first and last elements in the list. And padding the list doesn’t help either.
In the first test case, our initial offset is 1 since our list consists of three elements, and integer division of three by two yields 1.
Position 1 of the list (3) is not what we’re looking for, so we halve the offset (integer halving of 1 yields 0). 3 is greater than the value we’re looking for, so we subtract 0 from the current position i.e. we stay where we are. The algorithm detects the zero offset, and evaluates to not-found. See the problem?
Searching for 5 in the same list has the same problem, except we try to add the zero offset to our current position with identical result.
What they both should do is try an offset of 1 again, but that wouldn’t involving halving the offset each time.
The four element list has an initial offset of 2, which yields the value 5 at list index 2. We know to check the lower half of the list, so we halve the offset to yield 1 and then subtract that from the current position. Our new position yields the value 3, which means we have to search left again. Alas, our offset of 1 is halved to 0 and the search terminates. We never get to check that first element in the list.
Going through the problem on a sticky note diagnosed the problem really easily, and working through for 5, 6 and 7 element lists made me realise that the maths would need a reworking to cover those edge cases. Typing up my experiences just now, I’ve even had a thought about one way to solve the problem without getting too complex… but I will revisit another time.
All-in-all, this was actually the most interesting experience so far during the kata. We often learn a great deal from our mistakes than our successes, failure can be good for us sometimes – it takes us out of our comfort zone, which is where true learning happens.
As ever, source for this and the previous solutions is available from http://bitbucket.org/metaljoe/metaljoe_codekata/ under the MIT license. If you want to know more about the Karate Chop code kata, head over to http://codekata.pragprog.com/2007/01/kata_two_karate.html and take a look.