“The programmer builds from pure thought-stuff: concepts and very flexible representations thereof. Because the medium is tractable, we expect few difficulties in implementation; hence our pervasive optimism. Because our ideas are faulty, we have bugs; hence our optimism is unjustified.”

- Frederick Brooks, The Mythical Man-Month

I was on an evening train home when, re-reading The Mythical Man Month for the first time in almost ten years, I came across the above words. It summed up my fourth attempt at the Karate Chop code kata: because our ideas are faulty, we have bugs.

My fourth algorithm idea was faulty, I just didn’t quite want to admit it. My ideas were faulty because of lack of mental preparation. I was tired (I’ve not been sleeping well for a while now), distracted, and in a hurry to get past the Karate Chop kata and move onto another challenge. I didn’t think things through when I started, I just fired up Aquamacs and began to type.

The idea was simple enough – strip the algorithm down of a lot of state and just “bounce” from position to position within the list. Starting from the 0th element, we determine an offset which is equal to half the length of the list. The offset is added to the starting point to determine the position to check.

If we don’t match, we halve the offset and either add or subtract the offset from the current position to yield our new position, depending on whether the item we checked was higher or lower in value than the one we’re searching for.

It sounds simple enough, and plausible enough, so I really should’ve run a quick check in my head or on a sticky note to verify it actually worked. The fact I don’t recall seeing this process before should’ve tipped me off to needing to double-check it.

Thing were fine to begin with, except for one failing test. It was the first off-by-one error I had encountered so far during the Karate Chop implementations, I’m rather proud to say, so it seemed to be quite a simple thing to fix. Oh was I wrong: a simple change resulted in multiple failed tests. Hasty debug statements (I’m guessing I’ll need to learn a Java debugging tool for Clojure development) began to highlight much weirdness with each loop through the search. Values didn’t seem to tally to the expected values, and the search position was often wildly off.

A hasty bit of refactoring to eliminate duplication and side-effects as possible causes didn’t offer much improvement, though I learnt to apply some interesting bits of Clojure in the process. In the end, I had a pretty good idea that the algorithm was flawed, but I kept convincing myself that things like padding the list or adding special cases would massage it into place.

Some days you just have to down tools, stand back, learn from your experiences and move on. The idea started off simple but started to grow… and smell.

In the current form, the algorithm fails on three tests:

(assert (=  0 (chop 1 [1 3 5]) ))
(assert (=  2 (chop 5 [1 3 5]) ))
(assert (=  0 (chop 1 [1 3 5 7]) ))

The bit that stood out straight away was that they were all tests for a value on the boundary of the list. Analysis of the code determined that the search algorithm never reaches the first and last elements in the list. And padding the list doesn’t help either.

In the first test case, our initial offset is 1 since our list consists of three elements, and integer division of three by two yields 1.

Position 1 of the list (3) is not what we’re looking for, so we halve the offset (integer halving of 1 yields 0). 3 is greater than the value we’re looking for, so we subtract 0 from the current position i.e. we stay where we are. The algorithm detects the zero offset, and evaluates to not-found. See the problem?

Searching for 5 in the same list has the same problem, except we try to add the zero offset to our current position with identical result.

What they both should do is try an offset of 1 again, but that wouldn’t involving halving the offset each time.

The four element list has an initial offset of 2, which yields the value 5 at list index 2. We know to check the lower half of the list, so we halve the offset to yield 1 and then subtract that from the current position. Our new position yields the value 3, which means we have to search left again. Alas, our offset of 1 is halved to 0 and the search terminates. We never get to check that first element in the list.

Going through the problem on a sticky note diagnosed the problem really easily, and working through for 5, 6 and 7 element lists made me realise that the maths would need a reworking to cover those edge cases. Typing up my experiences just now, I’ve even had a thought about one way to solve the problem without getting too complex… but I will revisit another time.

All-in-all, this was actually the most interesting experience so far during the kata. We often learn a great deal from our mistakes than our successes, failure can be good for us sometimes – it takes us out of our comfort zone, which is where true learning happens.

As ever, source for this and the previous solutions is available from
http://bitbucket.org/metaljoe/metaljoe_codekata/
under the MIT license. If you want to know more about the Karate Chop code kata, head over to
http://codekata.pragprog.com/2007/01/kata_two_karate.html
and take a look.

This is my third Clojure-based solution to the Karate Chop kata detailed at
http://codekata.pragprog.com/2007/01/kata_two_karate.html

The write up is going to be a little shorter than before, because I’m pretty tired as I type this. As always, source code to the solution is available from BitBucket at
http://bitbucket.org/metaljoe/metaljoe_codekata/
under the MIT license and feedback is definitely always welcome.

For this solution, the aim was to produce a concurrent version of the code. Surprisingly, this proved to be quite tricky to write compared to implementing an equivalent concurrent solution in a language like Python or Erlang. The main problem was that the “Programming Clojure” book apparently missed a few important details about Agents (Agents can’t call Agents, shutdown-agents needs to be called for cleaning up). I got rather frustrated with the documentation and examples available, with the code raising assorted semi-cryptic Agent-related exceptions that weren’t adequately explained anywhere.

After a considerable amount of time on Google, and a drastic scaling back of my intended implementation, I finally created a working solution this evening.

The original aim was to keep dividing the list in two, spawning a new Agent/thread to investigate each new list by repeating the splits. This would continue until a match was found or an empty list reached. Not very efficient or elegant, but a way to incorporate an element of concurrency into the kata.

Since Agents can’t call Agents, the first major hurdle I had, I ended up running just two Agents: one to check the left half of the initial list, the other to check the right. The spawning of further Agents was eliminated in favour of an inefficient recursive check of each half of these lists, and so on. I opted to keep this style of checking, to “simulate” what my original implementation would have operated like.

This worked well, but the script would not quit until killed with Control-C. After another search through the book and docs, I found a reference to calling shutdown-agents. This only worked when called at the end of the script, not within the function calling the agents for various reasons.

I also took the opportunity to try and make a little more legible approach, as well as try out the split-at function.

Now I need sleep.

This is the second attempt at a Clojure-based solution to the Karate Chop kata detailed at
http://codekata.pragprog.com/2007/01/kata_two_karate.html

Source code is available from BitBucket at
http://bitbucket.org/metaljoe/metaljoe_codekata/
and is offered under the liberal MIT License. Since the aim of a kata is to learn and become a better programmer, feedback is always welcome.

Since the previous solution became a recursive one, my aim for this attempt was to work on an iterative implementation. Concerned about the rather ugly feel to the recursive code, I was also looking to produce something a bit more compact, a bit more elegant.

The code actually came to me pretty quickly compared to the first attempt. One advantage is that the problem domain is clearer on the second attempt, my brain has had time to consider the pitfalls and map out a more concise understanding of the problem. The other factor is that writing my own Clojure code, as opposed to quick REPL exploration or fill-in-the-blank exercises, has become more natural. I have begun to start thinking in Clojure. Note that I said “have begun to” – I still have more work to do on that front.

As far as implementation went, it was pretty smooth until I hit one snag: an infinite loop. It took a while to figure out what was going on, until I spotted a glaring issue. If the slice/partition of the list I’m looking at reaches one element, I check it for a match. If it didn’t match, I would loop around and search it again. An embarrassing newbie error, but a useful mistake to make. I added the appropriate check and all the tests passed.

After completing the implementation I spotted the second mistake when I reviewed the code later on. I was testing for an empty list on each iteration. While it has no impact on the output, it’s an unnecessary check for the implementation – if the list does suddenly become empty, we have bigger problems!

One thing that did intrigue me was that with almost no effort, the solution was much smaller and more readable than the first approach. I usually find the opposite with recursive versus iterative code, but in the case it’s probably because the problem seems more natural expressed iteratively. Or maybe my recursive approach just sucked :-)

Here’s the completed algorithm, with tests removed and the code tidied for clarity. See the Mercurial repository at BitBucket for the original source.

(defn chop [item list]
  (if (empty? list)
    -1
    (loop [ min 0 mid (quot (count list) 2) max (count list)]
      (if (= item (get list mid))
        mid
        (if (or (= mid min) (= mid max))
          -1
          (if (< item (get list mid))
            (recur min (quot (+ mid min) 2) mid)       ; look left
            (recur mid (quot (+ mid max) 2) max))))))) ; look right

If I broke my rule to revisit the code, I’d change a few minor things for readability. For example: why did I put those two comments there, and not elsewhere? The loop initialisation looks a bit of a blur at first glance, the or statement might need to be wrapped in an intention-revealing function name. However, the code is good enough for the task at hand and that’s all that really matters.

I thought it might be interesting to work through the code katas created by Pragmatic Programmer Dave Thomas to assist with my Clojure learning. If you’re looking for Kata One, don’t worry about searching – I started with Kata Two as I needed an actual coding exercise.

The instructions for the Kata can be found at:
http://codekata.pragprog.com/2007/01/kata_two_karate.html

The task is to implement five different binary search algorithms, one per day. I don’t actually have five consecutive days to do this, so don’t expect the five parts to be written about each day. There will be gaps.

It’s been about 15 years since I wrote a binary search algorithm, using the language Modula-2. I was toying with the idea of going off and reading up on binary searches, but I decided the best way to approach it was from first principles – only allowing myself access to a Clojure reference manual. After all, I go on about the importance of learning basic algorithms even in this day and age… so about time I put my money where my mouth is!

The trickiest part of the kata wasn’t so much coming up with the algorithm, an iterative binary search is easy, but expressing the algorithm using Clojure’s syntax. That’s mainly the inexperience with the language showing, but the resulting code didn’t feel too elegant. In particular, the parentheses became more of a hindrance than a help – they start to look plain ugly, get in the way and I don’t think I’ve ever relied on my IDE’s parenthesis matching so much. I still don’t agree with Neal Stephenson that LISP is the only language which can truly be described as beautiful… despite all the power and flexibility it can yield in the right hands.

I may not have reached Clojure enlightenment yet, but gradually I built up a rhythm with the coding. Once the tests passed, I decided to stop rather than try to rework the code into something that looked and felt better. Take the lessons learnt and apply to the next algorithm – the work for this one has already been done.

First step with the coding was to create the tests. Although not a fan of Python’s doc tests, I opted to use Clojure’s equivalent for quickness – and I’ve decided I’m not a massive fan of them either. Moan, moan, moan. ;-)

(defn
 #^{:test (fn []
 (assert (= -1 (chop 3 []) ))
 (assert (= -1 (chop 3 [1]) ))
 (assert (=  0 (chop 1 [1]) ))

 (assert (=  0 (chop 1 [1 3 5]) ))
 (assert (=  1 (chop 3 [1 3 5]) ))
 (assert (=  2 (chop 5 [1 3 5]) ))
 (assert (= -1 (chop 0 [1 3 5]) ))
 (assert (= -1 (chop 2 [1 3 5]) ))
 (assert (= -1 (chop 4 [1 3 5]) ))
 (assert (= -1 (chop 6 [1 3 5]) ))

 (assert (=  0 (chop 1 [1 3 5 7]) ))
 (assert (=  1 (chop 3 [1 3 5 7]) ))
 (assert (=  2 (chop 5 [1 3 5 7]) ))
 (assert (=  3 (chop 7 [1 3 5 7]) ))
 (assert (= -1 (chop 0 [1 3 5 7]) ))
 (assert (= -1 (chop 2 [1 3 5 7]) ))
 (assert (= -1 (chop 4 [1 3 5 7]) ))
 (assert (= -1 (chop 6 [1 3 5 7]) ))
 (assert (= -1 (chop 8 [1 3 5 7]) )) ) }
 chop
 [item list]
 ...

I made the chop function a wrapper around the real action, meaning I could reuse it for different algorithms easily as well as keep the implementation separate. I opted to call a function called bsearch, passing the list of values, the value to search for, and a start and end index to define the “partition” under examination. The initial start and end indexes set the partition to be the whole of the list – since we have no idea where the value is to begin with.

(bsearch list item 0 (count list))

The initial check in bsearch is for an empty list. If we have nothing to search, we know for sure that the value doesn’t exist in it.

Next, we take our partition, or slice as I called it in the code, for examination. If the slice contains one item, we have a simple case to check: either that item is the one we’re looking for, or it isn’t.

If we have more than one item, our proper binary search comes into play. We find the mid-point and check the value of that item. As with the single item list, it’s either the value we want, or it isn’t. If it isn’t, we split the slice on that point and search the “left half” (values lower than the mid value) or the “right half” (values higher than the mid value) accordingly.

Although my intention was to do an iterative algorithm, Clojure’s syntax nudged me into a more recursive approach. For the tests supplied, blowing the stack is unlikely but for larger lists the algorithm won’t be the most efficient approach.

A couple of helper functions first:

(defn is-single-item? [ list ]
 (= 1 (count list)))

(defn mid-index [ list ]
 (quot (count list) 2))

These were just to make a couple of spots in the code slightly more readable at a glance.

And the bsearch function in all its nested glory:

(defn bsearch [ list value slice-start slice-end ]
 (if (empty? list)
  -1
  (let [ slice (subvec list slice-start slice-end)]
   (if (is-single-item? slice)
    (if (= value (first slice) )
     slice-start
     -1)
    (let [ mid (mid-index slice) ]
     (let [ mid-value (get slice mid) ]
      (if (= value mid-value)
       (+ slice-start mid)
       (if (< value mid-value)
        (bsearch list value slice-start (- slice-end mid))
        (bsearch list value (+ slice-start mid) slice-end)))))))))

Check out the multiple closing parentheses. That just looks wrong, but I see lots of example Clojure and LISP code like that. When writing the code, I ended up writing the parentheses in alignment for readability, then when I finish coding I reposition the closing parentheses to match the style above. I can’t help but think it’s a code smell indicator.

Certainly an interesting exercise and my first go at writing Clojure code from scratch, not following along with the Programming Clojure book. Definitely has room for improvement.

Now, how am I going to write my second binary search algorithm…?

Follow

Get every new post delivered to your Inbox.

Join 548 other followers