mikeash.com: just this guy, you know?

Posted at 2005-07-13 00:00
Tags: rant university
Fun With Beowulf Clusters
by Mike Ash  

I've been working on my Master's thesis for the past four months or so, and having an interesting time of it. Today, I finally reached an important goal: running a fluid simulation split into separate simulation and visualization components, with different components running on different computers for speed. The bad news is that I'm not using a single bit of the university clustering library that I'm supposed to be using.

This library, which shall remain nameless, is supposed to be an incredibly sophisticated piece of work that's geared directly towards this kind of heterogeneous clustering environment. Something like MPI expects all of the different nodes to be computing basically the same thing, but this library allows you to have multiple simulation components, multiple visualization components, etc., all of which communicate over a network to get things done. It's aimed at high-end graphics applications where you often have multiple back-end modules all communicating with a visualization module, running on your standard Beowulf cluster.

Since this library is developed at my university, it was pretty much a given that I would use it for my project. So in March, I started reading up on it, looking over documentation and examples, and trying it out. There was even supposed to be a Mac OS X port!

Of course, nothing worked. I pretty much gave up on it for the time being sometime in April and concentrated on non-parallel simulation and visualization. But this week, I had gone as far as I could go without parallelizing some code, so I hit the cluster room and buckled down to make it work.

So guess what: it still doesn't work! Now, this code is developed on Linux, and aimed at Linux. It has a Mac port, with Mac benchmarks. All of the code is supposed to work. As far as I can tell, all it does is compile. On the Mac, none of the dynamic libraries have their install names set, and so none of the binaries can execute. When I fix the install names, other issues crop up. I have no idea how this is supposed to be working code, yet that's what they say.

Monday was the big failure with the threading lib. While leaving campus, I was pretty upset with the whole deal. It's frustrating to be told to use a library which simply doesn't work no matter what you do. I'm the world's biggest advocate of using existing code and existing libraries, but I finally decided that I'd implement my own communications, at least as a stopgap. Maybe it wouldn't be as efficient or flexible, but it would work!

Monday was despair, Tuesday was implementation and Wednesday was success! Maybe it's just because my little simulation isn't big and complicated, but it really wasn't that hard to get things up and running. Of course, my thesis supervisor has no idea that I'm completely bypassing their favorite clustering library. I meet with her next Monday to give her the good news.
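
For illustration only, here is a minimal sketch of what hand-rolled communication between a simulation component and a visualization component could look like, assuming plain TCP-style stream sockets and length-prefixed messages. The post doesn't say how the real implementation worked, so every name here (`send_msg`, `recv_msg`, `pack_frame`, `unpack_frame`, `run_demo`) and the frame layout are hypothetical.

```python
import socket
import struct
import threading


def send_msg(sock, payload: bytes) -> None:
    """Send one message: a 4-byte big-endian length, then the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n: int) -> bytes:
    """Read exactly n bytes; recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed connection mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock) -> bytes:
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def pack_frame(step: int, values) -> bytes:
    """Hypothetical frame layout: step number, then the values as doubles."""
    return struct.pack(f"!I{len(values)}d", step, *values)


def unpack_frame(payload: bytes):
    """Inverse of pack_frame: returns (step, list_of_values)."""
    count = (len(payload) - 4) // 8
    step, *values = struct.unpack(f"!I{count}d", payload)
    return step, values


def run_demo(steps: int = 3):
    """Run a toy 'simulation' and 'visualizer' over a connected socket pair.

    socketpair() stands in for two machines; on a real cluster each side
    would use socket.create_connection() / a listening socket instead.
    """
    sim_sock, vis_sock = socket.socketpair()
    received = []

    def visualizer():
        # The visualizer only consumes frames; it never computes them.
        for _ in range(steps):
            received.append(unpack_frame(recv_msg(vis_sock)))

    worker = threading.Thread(target=visualizer)
    worker.start()
    for step in range(steps):
        # Stand-in for real simulation output at each time step.
        send_msg(sim_sock, pack_frame(step, [step * 0.5, step * 1.5]))
    worker.join()
    sim_sock.close()
    vis_sock.close()
    return received
```

Calling `run_demo()` returns the frames in the order the visualizer received them. The length prefix is the one design choice that matters: a stream socket has no message boundaries, so the receiver needs to know how many bytes make up each frame.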


Alf Watt at 2005-07-14 02:47:00:
That’s Linux for you. One guy gets it to work in his mom’s basement, then it’s on the net, and good luck to you! The one Linux box in my possession is rigged to boot with a serial console just so I won’t have to configure X11, or the video card drivers, or figure out if the keyboard and mouse ports are swapped or whatever. I’m seriously considering a FreeBSD or Darwin install…
