(This is based on a 15 minute talk for the London Python Code Dojo – slides available from SlideShare)

My interest in FluidDB began earlier this year when I attended a talk by Nicholas Tollervey at the London Clojure Dojo. I was expecting yet another talk about yet another non-relational database, but what I discovered was something different. The idea of a shared database storing “things” which anyone could tag with data seemed to be a rather powerful concept, yet simple and elegant. I thought it was a very cool and interesting idea.

But there was a problem.

How could people actually explore the “Fluidverse”? While people using FluidDB are building up conventions, such as the naming and content of tags, and there are tools out there to drill down through the hierarchies of tags and namespaces… there had to be an easier way to find the tags that were of interest to me. I decided I needed data in order to begin finding ways to explore… but what data?

So, while pondering the idea one evening, I was listening to the band Napalm Death and realised I had the answer. One of the things I love to do is find new bands, particularly extreme metal ones, and one way I do this is follow links between bands on Wikipedia. Wikipedia is a great source of band biographies, the content is under a Creative Commons license, and the band biographies often have lists of related bands and genres.

This seemed like a really good starting point to take data I’m interested in, build relationships between the data and give me something to start exploring with. I hacked together a scraper which used Napalm Death as a starting point and branched outwards in a “six degrees of separation” way, initially dumping the information directly into the FluidDB sandbox.
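As a rough illustration of the "six degrees" idea (this is not the released scraper code), the branching can be pictured as a breadth-first walk with a depth limit. The `RELATED` dictionary below is a small hand-written stand-in for what the scraper pulled from Wikipedia, and the names `crawl` and `get_related` are assumptions made for this sketch.

```python
from collections import deque

# Hypothetical stand-in for Wikipedia: band -> related bands.
# A real scraper would fetch and parse pages instead of reading this dict.
RELATED = {
    "Napalm Death": ["Carcass", "Godflesh"],
    "Carcass": ["Napalm Death", "Arch Enemy"],
    "Godflesh": ["Napalm Death", "Jesu"],
    "Arch Enemy": ["Carcass"],
    "Jesu": ["Godflesh"],
}


def crawl(start, max_depth, get_related=RELATED.get):
    """Breadth-first walk from `start`; returns {band: degrees of separation}."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        band = queue.popleft()
        if depths[band] >= max_depth:
            continue  # do not follow links beyond the depth limit
        for other in get_related(band) or []:
            if other not in depths:  # skip bands that were already seen
                depths[other] = depths[band] + 1
                queue.append(other)
    return depths


if __name__ == "__main__":
    for band, depth in crawl("Napalm Death", max_depth=2).items():
        print(depth, band)
```

Tracking the depth of each band is what keeps the walk from wandering across the whole of Wikipedia, and the "already seen" check stops two bands that link to each other from causing an endless loop.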

After a few runs, it made more sense to scrape to an intermediate file, and load that instead – allowing me to clean up typos, adjust names, amend tags and also allow me to regenerate the data in FluidDB’s sandbox without having to keep hitting Wikipedia. An example of the output format is as follows:

band:Burzum
  metaljoe/music/band_name = Burzum
  metaljoe/music/source_url = http://en.wikipedia.org/wiki/Burzum
  metaljoe/music/genre/black_metal -> Black metal
  metaljoe/music/genre/dark_ambient -> Dark ambient
  metaljoe/music/related_bands = ['Darkthrone', 'Mayhem', 'Old Funeral']
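
The loader itself has not been released, so the following is only a sketch of how a file in this format might be read back in. It assumes that each `band:` line starts a new record, that `=` and `->` both separate a tag path from its value, and that values starting with `[` are Python list literals. The function names are made up for this example.

```python
import ast
import re

SAMPLE = """\
band:Burzum
  metaljoe/music/band_name = Burzum
  metaljoe/music/source_url = http://en.wikipedia.org/wiki/Burzum
  metaljoe/music/genre/black_metal -> Black metal
  metaljoe/music/related_bands = ['Darkthrone', 'Mayhem', 'Old Funeral']
"""

# Indented line: tag path, then "=" or "->", then the value.
TAG_LINE = re.compile(r"^\s+(\S+)\s+(=|->)\s+(.*)$")


def parse_value(text):
    """Turn list-looking values into lists; leave everything else as a string."""
    if text.startswith("["):
        try:
            return ast.literal_eval(text)
        except (ValueError, SyntaxError):
            pass  # not a valid literal after all, keep the raw text
    return text


def parse_bands(text):
    """Return {band name: {tag path: value}} from the intermediate format."""
    bands = {}
    current = None
    for line in text.splitlines():
        if line.startswith("band:"):
            current = bands.setdefault(line[len("band:"):].strip(), {})
            continue
        match = TAG_LINE.match(line)
        if match and current is not None:
            tag, _separator, value = match.groups()
            current[tag] = parse_value(value.strip())
    return bands


if __name__ == "__main__":
    for name, tags in parse_bands(SAMPLE).items():
        print(name, tags)
```

`ast.literal_eval` is used rather than `eval` so that a hand-edited file can only ever produce plain data, never run code.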

I wasn’t planning to release the source code, but have had some interest in it so I’ve decided to release it under the MIT license. You can find the code on my BitBucket account: http://bitbucket.org/metaljoe/fluidinyourear-scraper – note this is just the scraper code, not the loader.

With the data in place, I then needed to build something for exploring the relationships in the data. Enter “Fluid In Your Ear”, a very simple web application built around Python, Django and the excellent FOM (Fluid Object Manager) created by Ali Afshar. Given the nature of the bands, there is also a liberal application of Heavy Metal Umlauts – the power of which, courtesy of a particular Black Metal band, managed to crash the FluidDB sandbox a few times by exposing a Unicode bug.

The application is deliberately very simple. I’m not a graphics genius (painting with real acrylic paints is my field), and at the moment it’s a basic core – you can browse genres and bands, and explore relationships between the two. I’ve already discovered some new bands through following the links, and re-discovered some older ones.

Due to the six-degrees nature, there is quite a lot that doesn’t fit into a metal or punk category which is quite cool. I’ve encountered a jazz musician called John Zorn who has crossed into hardcore punk and grindcore, to produce some outstanding music I would probably not have found before.

The source code is pretty grotty, and the first casualty was testing. Shocking. In order to improve my confidence in the code and make it easier to refactor, I added unit tests using Django’s test harness and some functional testing using the Twill web testing framework. An example of the Twill test code is as follows:

# test missing genre
go http://127.0.0.1:8000/genre/progressive_vegetarian_grindcore
code 404

# test with trailing slash
go http://127.0.0.1:8000/genre/jazz/
code 200

# test without trailing slash
go http://127.0.0.1:8000/genre/jazz
code 200

# check page contents
find '<h2>Jazz</h2>'
find '<div id="related_bands">'
find '<li><a href="/band/Frank%20Zappa">Frank Zappa</a></li>'

So where next?

Well, the first job is to get the application online, and my plan was to port it to Google App Engine. Unfortunately, I hit a few snags because my app runs Django 1.2 and App Engine is using 1.1. I considered bundling Django in the app, but it became obvious that I’m not really using much of Django’s functionality – some URL routing and templates. The creator of FOM introduced me to Flask, a lightweight web framework, and it looks perfect for my needs. So I’m going to port to Flask and Google App Engine at the same time.

Another thing I want to do is add a JavaScript “social” layer over the top, allowing some of the richness of FluidDB to shine through and making room for functionality not originally envisaged. I’m also hoping people will tag bands with ratings, annotations and the like, in the hope of making recommendations possible.

In a similar way, I want the application code to be reusable and reskinnable so people can customise and create their own starting point. Maybe someone will produce a Classical In Your Ear in the future?

Source code is available from BitBucket, if you fancy a giggle at the clumsy bits: http://bitbucket.org/metaljoe/fluidinyourear – released under the GNU Affero GPL.

London Python Code Dojo #6

February 5, 2010

Last night was the sixth London Python Code Dojo, and the third I have attended. Organised, as ever, by Nicholas Tollervey, hosted by Fry-IT and attended by many.

Unlike the previous dojo, the evening was run with one laptop hooked up to a projector and with a pair of programmers “on stage” for ten minutes to undertake a task. The original plan was to code for a maximum of ten minutes or until a unit test passed, then the driver-programmer would step down. Somewhere along the line we lost sight of the testing aims. We also lost half the group to a discussion at the back of the room, while the front rows were engrossed in the problems at hand. Volunteers to join the pair on stage were a little sparse, and I must admit I held back a bit longer than I should have – once I was up, I had a thoroughly enjoyable ten minutes as Ciaran’s co-pilot, and was then very sorry when my own ten minutes in the hot seat were up!

So, what happened?

Firstly, the tasks of the evening were based around integrating the best core components of the previous dojo’s many solutions. A data file format and parser from one solution, the Cmd-based command line code from another, that sort of thing. While an interesting exercise in itself, it perhaps lacked the need for the creative approaches used last month. Work was primarily a copy and paste job, followed by some smoothing of the edges, with little need to sit down and think through a higher plan of action. The test-driven development idea got lost along the way too, and I admit furthering that loss once it had happened.

Secondly, group size had its effect. I don’t have a head count, but we filled the room as usual so probably about 20 people. It was interesting that those at the back gradually drifted into their own discussions and even ended up on laptops, not looking at what was going on at the front. Even with a projector, it could be difficult to see what was going on and to listen in on the conversations at the front between driver, co-pilot and hecklers. We use the term “hecklers” in an affectionate way in the dojo – there’s no malice or negativity involved with comments from the audience. I wonder if those who wrote the code being transplanted at a given time lost interest, since they knew the code already?

Following on from group size was group dynamics. With smaller teams like last month, it was easier for people to contribute: less intimidating than being in front of everyone, easier to have your say, easier to be engaged more. However, the single laptop approach does prevent hogging of the keyboard, and we’re all working on the same thing. Swings and roundabouts.

We finished with a wrap up of the situation and where we can progress for the next dojo. The team-based approach is going to make a return, and the delegation of different tasks to different teams sounds like fun – we’re still all working on the same final goal (an adventure game), the same code base, but we’re breaking up the chunks of work. That in itself will bring in some interesting extra skills such as integration and inter-team communication.

I think it was an important learning process, and one we needed to make this early in the project. Sometimes you need a setback to make you understand the dynamics of a situation and achieve a breakthrough. When I teach snowboarding, I often find those who go through a bit of a regression phase, a bit of a setback, come out with a much better technique and understanding than those who breeze through the lesson. We learn more from our mistakes than our successes.

Despite the problems, it was still a good evening – I had fun, met loads of cool people, got taken out of my comfort zone, learnt some new things. That’s a positive result in my opinion. I was rather excitable and involved in what was going on, so didn’t really notice what was happening until I was in the hotseat and could look out on the group.

The next London Python Code Dojo will be on Thursday 4th March. If you’re a Pythonista in London, come along and take a look – there’s free pizza and beer/Coke too!

A Night At The Dojo

January 8, 2010

Yesterday evening I attended my second London Python Code Dojo, and the first where any coding was involved.

After devouring the free pizza and beer (or Coke in my case, being a teetotaller), we opted to split into equal-sized groups and tackle the rather interesting problem of building a text adventure game. Now, building such a game in the space of an hour is a little unrealistic so the decision was made to attempt the simplest thing that would work: namely, navigating around our game world using the four directions of the compass.

Despite the fact my team didn’t complete the task, due to some hasty debugging in the final few minutes of the hour, I must admit that we still produced something to be proud of. We took a couple of different approaches to the other teams, which contributed to holding us back but at the same time yielded a rather satisfying result.

Whereas the others were opting for pre-generated game maps, some teams even splitting into dedicated data and game engine teams, we decided to go with the idea of creating our map on the fly. This meant we didn’t have to bother creating any initial game data other than a starting room in our dungeon, with random doors leading off this room. As our hero progressed through the doors, we would create new rooms leading off the door – taking care to ensure the rooms matched up if we backtracked or went in a circle. Okay, a little more complex than it needed to be, but a rather bold and enjoyable solution.
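The team's code from the evening is not shown here, so what follows is only one way the "create rooms on the fly" idea could be sketched. It assumes rooms sit on a grid and are stored in a dictionary keyed by position, which is what keeps doors matched up when the hero backtracks or walks in a circle. The `Dungeon` class and its method names are invented for this example.

```python
import random

# Compass directions as (dx, dy) grid offsets, plus each direction's opposite.
OFFSETS = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}
OPPOSITE = {"north": "south", "south": "north", "east": "west", "west": "east"}


class Dungeon:
    """Rooms are created the first time they are entered, keyed by position."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.doors = {}  # (x, y) -> set of directions that have a door
        self._create((0, 0))

    def _create(self, pos):
        doors = set()
        for direction, (dx, dy) in OFFSETS.items():
            neighbour = (pos[0] + dx, pos[1] + dy)
            if neighbour in self.doors:
                # Known neighbour: copy its side of the wall so doors line up.
                if OPPOSITE[direction] in self.doors[neighbour]:
                    doors.add(direction)
            elif self.rng.random() < 0.5:
                doors.add(direction)  # unexplored side: maybe a door
        if not self.doors and not doors:
            # Starting room only: guarantee at least one way out.
            doors.add(self.rng.choice(sorted(OFFSETS)))
        self.doors[pos] = doors

    def move(self, pos, direction):
        """Return the new position, or `pos` unchanged if no door leads that way."""
        if direction not in self.doors[pos]:
            return pos
        dx, dy = OFFSETS[direction]
        new_pos = (pos[0] + dx, pos[1] + dy)
        if new_pos not in self.doors:
            self._create(new_pos)
        return new_pos


if __name__ == "__main__":
    dungeon = Dungeon(seed=42)
    print("Doors in the starting room:", sorted(dungeon.doors[(0, 0)]))
```

Because a new room checks every neighbour that already exists, the door the hero came through is always present on the far side, and a loop back to an old room never finds a door where there used to be a wall.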

Despite having pair programmed before, I did have a sudden brain failure when I found myself in the hot seat. It was the first time I’d paired up with people I didn’t know before (even though a colleague of mine was also in the same team as me), and more importantly it wasn’t so much a pair programming session as a group programming session. It was also interesting being in front of an unfamiliar development environment – I’m not a fan of vi at the best of times, and my brain totally gave up on me with the key bindings. Embarrassing! I was definitely out of my comfort zone… which is actually no bad thing, because it forces me to think and deal with the situation.

It was great mixing with different developers, bouncing ideas off new people, being taken out of my comfort zone and tackling something I wouldn’t ordinarily have thought about working on. At the end of the evening we all demoed our work (or traceback) and ran through the code really quickly. You get to see all the different approaches, pick up interesting ideas and learn about other people’s coding styles. I also learnt about the Python Cmd module! (And was a bit surprised to find a lot of people there knew it already – I felt a bit of a n00b)
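For anyone else who has not met it, here is a minimal sketch of how the standard library's `cmd` module can drive compass navigation. It is not any team's code from the evening: the three-room map and the `Adventure` class are made up for illustration, and only `cmd.Cmd`, its `do_*` method convention, `onecmd` and `cmdloop` come from the module itself.

```python
import cmd

# A tiny hand-written map: room -> {direction: neighbouring room}.
ROOMS = {
    "hall": {"north": "library", "east": "kitchen"},
    "library": {"south": "hall"},
    "kitchen": {"west": "hall"},
}


class Adventure(cmd.Cmd):
    prompt = "> "

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.room = "hall"

    def _go(self, direction):
        target = ROOMS[self.room].get(direction)
        if target is None:
            self.stdout.write("You can't go that way.\n")
        else:
            self.room = target
            self.stdout.write("You are in the %s.\n" % target)

    # cmd.Cmd dispatches the input "north" to the method do_north, and so on.
    def do_north(self, arg):
        """Go north."""
        self._go("north")

    def do_south(self, arg):
        """Go south."""
        self._go("south")

    def do_east(self, arg):
        """Go east."""
        self._go("east")

    def do_west(self, arg):
        """Go west."""
        self._go("west")

    def do_quit(self, arg):
        """Leave the game."""
        return True  # a true return value stops cmdloop()


if __name__ == "__main__":
    Adventure().cmdloop("You are in the hall.")
```

The docstrings double as help text, so typing `help north` at the prompt prints "Go north." without any extra code.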

Code dojos are great fun, and an excellent way to meet fellow developers, hone your skills, learn new ones and get yourself out of the comfort zone / rut.

My thanks to organiser Nicholas Tollervey, the guys at Fry-IT for providing the room (and essential pizza), and everyone who attended. Hopefully see you all at the next one! (Well, probably the next London Pyssup first)
