Tags: fridayqna valgrind
Welcome back to another late Friday Q&A. My apologies to all of my readers for missing last week's edition. Some family events beyond the scope of this blog prevented me from writing one. And I should probably point out right now that WWDC is almost certainly going to prevent me from writing one next week. This week, however, I do have a post, and I'm going to be talking about Valgrind as suggested by Landon Fuller.
What It Is
A few months ago I talked about the Clang Static Analyzer and how it could help you find bugs in your code. Valgrind is a similar sort of program except it checks for errors at runtime instead.
There's an entire class of bugs which are easy to write and difficult to track down in C-based languages, such as reading from uninitialized memory or writing past the end of an array. Reading from uninitialized memory just gives junk values, and a lot of the time those junk values actually work. Writing past the end of an array is frequently harmless since arrays are generally backed by storage that's larger than what was requested. Because of this, these bugs might only rarely show up as crashes. The really bad ones never crash at all, but just cause bad behavior. Figuring out what piece of code is causing the misbehavior can be extremely difficult.
Thus Valgrind. The way it works is it essentially runs your program inside an emulator. By doing this, it has total control over everything your program does. Something that's undetectable when running on the processor, like reading from a memory location that was never initialized, suddenly becomes easy to see.
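To make the idea concrete, here is a toy model of what an emulator can track that a real processor cannot. This is only a sketch of the general technique, not Valgrind's actual implementation: the ToyMemory type and the toy_ functions are hypothetical names invented for illustration, and the model tracks a single flag per byte.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MEM_SIZE 64

/* Toy "emulated" memory: every data byte has a shadow flag recording
   whether the program has ever stored a defined value there.
   All addresses passed to the functions below must be < TOY_MEM_SIZE. */
typedef struct {
    uint8_t data[TOY_MEM_SIZE];
    bool    defined[TOY_MEM_SIZE];
} ToyMemory;

/* Fresh memory: the contents are junk as far as the program knows. */
void toy_init(ToyMemory *m)
{
    for (size_t i = 0; i < TOY_MEM_SIZE; i++) {
        m->data[i] = 0xAA;      /* arbitrary stand-in for junk */
        m->defined[i] = false;
    }
}

/* Storing a known value marks the byte as defined. */
void toy_store(ToyMemory *m, size_t addr, uint8_t value)
{
    m->data[addr] = value;
    m->defined[addr] = true;
}

/* Copying moves the shadow flag along with the data, so "undefined"
   follows the value wherever it goes. */
void toy_copy(ToyMemory *m, size_t dst, size_t src)
{
    m->data[dst] = m->data[src];
    m->defined[dst] = m->defined[src];
}

/* A load the program is about to act on. Returns false (an "error")
   if the byte was never defined, true otherwise. */
bool toy_checked_load(const ToyMemory *m, size_t addr, uint8_t *out)
{
    *out = m->data[addr];
    return m->defined[addr];
}
```

On real hardware the junk byte 0xAA is indistinguishable from a value the program stored on purpose. In the toy model, the shadow flag is what makes the difference visible.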
There are some downsides to this approach. The most obvious one is that the target program runs about an order of magnitude slower than it normally would, due to being run under emulation. A less obvious downside is that Valgrind needs to know the behavior of every syscall in order to make everything work properly, and right now on the Mac there are some missing ones. For example, QuickTime uses the aio family of functions which aren't currently supported by Valgrind, so QuickTime won't work. Still, lots of things do work, and you can run an entire Cocoa application under Valgrind.
How to Get It
Valgrind's Mac support has only recently been merged into their main code repository, and is not yet available as an official release. This means that, for now, the only way to get it is by checking out their Subversion repository:
$ svn co svn://svn.valgrind.org/valgrind/trunk valgrind
Then follow the instructions in the README or just do this:
$ cd valgrind
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install
Valgrind should now be installed, and you can run it by typing valgrind in the shell. Note that as far as I know, Valgrind for Mac only works on Intel machines. If you have a PowerPC Mac you're probably out of luck, although there's no harm in trying.
$ sudo /usr/local/hermes/bin/hermesctl unload
$ sudo /usr/local/hermes/bin/hermesctl load
Finding Bugs
Let's take a look at this example program:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

char *bad_strdup(char *s)
{
    char *ret = malloc(strlen(s));
    strcpy(ret, s);
    return ret;
}

int main(int argc, char **argv)
{
    char *str = "hello world";
    char *str2 = bad_strdup(str);
    int i;
    printf("%s\n", str2);
    printf("%d\n", i);
    free(str2);
    return 0;
}
This program has two bugs. One of them is obvious: it prints i at the end, even though that variable was never initialized. One of them is more subtle: bad_strdup doesn't allocate enough memory to hold the NUL byte at the end of the string. This would normally go undetected, because memory allocations are padded, and that extra byte is often available. It would only fail when the string length was a nice round number, and even then it might simply fail by overwriting something else and causing corrupted data far later.
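The padding argument can be worked through with a little arithmetic. The sketch below assumes an allocator that rounds every request up to a multiple of 16 bytes. That quantum is an assumption for illustration only, since real allocators vary, and padded_size and nul_byte_escapes_padding are hypothetical helper names.

```c
#include <stdbool.h>
#include <stddef.h>

/* ASSUMPTION: the allocator rounds every request up to a multiple of
   16 bytes. Real allocators differ; this number is for illustration. */
#define ASSUMED_QUANTUM 16

/* Size of the block the assumed allocator actually hands back. */
size_t padded_size(size_t requested)
{
    if (requested == 0)
        return ASSUMED_QUANTUM;
    return (requested + ASSUMED_QUANTUM - 1) / ASSUMED_QUANTUM * ASSUMED_QUANTUM;
}

/* bad_strdup requests len bytes, but strcpy writes len + 1.
   Returns true when that extra NUL byte lands outside the padded block. */
bool nul_byte_escapes_padding(size_t len)
{
    return len + 1 > padded_size(len);
}
```

Under that assumption, the 11-byte "hello world" gets a 16-byte block and the stray NUL lands harmlessly in the padding, while a 16-byte string would put its NUL in whatever comes next in memory.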
Let's compile and run with Valgrind:
$ gcc -g valgrind.c
$ valgrind ./a.out
==4296== Memcheck, a memory error detector.
==4296== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==4296== Using LibVEX rev 1899, a library for dynamic binary translation.
==4296== Copyright (C) 2004-2009, and GNU GPL'd, by OpenWorks LLP.
==4296== Using valgrind-3.5.0.SVN, a dynamic binary instrumentation framework.
==4296== Copyright (C) 2000-2009, and GNU GPL'd, by Julian Seward et al.
==4296== For more details, rerun with: -v
==4296==
==4296== Invalid write of size 1
==4296== at 0x18B9E: strcpy (mc_replace_strmem.c:303)
==4296== by 0x1F8C: bad_strdup (valgrind.c:8)
==4296== by 0x1FB6: main (valgrind.c:15)
==4296== Address 0x3ec35b is 0 bytes after a block of size 11 alloc'd
==4296== at 0x15516: malloc (vg_replace_malloc.c:193)
==4296== by 0x1F77: bad_strdup (valgrind.c:7)
==4296== by 0x1FB6: main (valgrind.c:15)
==4296==
==4296== Invalid read of size 1
==4296== at 0x17BB1: strlen (mc_replace_strmem.c:275)
==4296== by 0x268125: puts (in /usr/lib/libSystem.B.dylib)
==4296== by 0x1FC4: main (valgrind.c:17)
==4296== Address 0x3ec35b is 0 bytes after a block of size 11 alloc'd
==4296== at 0x15516: malloc (vg_replace_malloc.c:193)
==4296== by 0x1F77: bad_strdup (valgrind.c:7)
==4296== by 0x1FB6: main (valgrind.c:15)
hello world
==4296==
==4296== Conditional jump or move depends on uninitialised value(s)
==4296== at 0x1F8E5E: __vfprintf (in /usr/lib/libSystem.B.dylib)
==4296== by 0x22CE66: vfprintf_l (in /usr/lib/libSystem.B.dylib)
==4296== by 0x251FBA: printf (in /usr/lib/libSystem.B.dylib)
==4296== by 0x1FD9: main (valgrind.c:18)
==4296==
==4296== Conditional jump or move depends on uninitialised value(s)
==4296== at 0x2C9A66: __ultoa (in /usr/lib/libSystem.B.dylib)
==4296== by 0x1FA305: __vfprintf (in /usr/lib/libSystem.B.dylib)
==4296== by 0x22CE66: vfprintf_l (in /usr/lib/libSystem.B.dylib)
==4296== by 0x251FBA: printf (in /usr/lib/libSystem.B.dylib)
==4296== by 0x1FD9: main (valgrind.c:18)
...
The first two errors both come from bad_strdup: strcpy writes one byte past the end of the 11-byte block, and strlen, called from puts while printing the string, then reads that same byte. After that you can see it successfully printing "hello world", then it tries to print the uninitialized i, which Valgrind immediately catches and complains about. Valgrind appears to cascade the uninitialized state of memory as that memory moves around, as it complains about uninitialized memory access many, many times during the course of printing (most of which I cut out for the sake of brevity). This bug manifests in an obvious way here, but it's not uncommon to have uninitialized variable reads which cause much more subtle bugs than this.
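For comparison, here is one way the two bugs could be fixed. Treat it as a sketch rather than the definitive repair: fixed_strdup and run_fixed_example are hypothetical names, and the body of the example is wrapped in an ordinary function instead of main so that it can be called from anywhere.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Like bad_strdup, but reserves one extra byte for the terminating NUL.
   Returns NULL if the allocation fails. */
char *fixed_strdup(const char *s)
{
    char *ret = malloc(strlen(s) + 1);
    if (ret != NULL)
        strcpy(ret, s);
    return ret;
}

/* Same steps as main in the example, with both bugs removed.
   Returns 0 on success, 1 if the allocation failed. */
int run_fixed_example(void)
{
    const char *str = "hello world";
    char *str2 = fixed_strdup(str);
    int i = 0;  /* initialized before it is read */

    if (str2 == NULL)
        return 1;
    printf("%s\n", str2);
    printf("%d\n", i);
    free(str2);
    return 0;
}
```

With the extra byte allocated and i given a value, neither the write past the end of the block nor the read of an undefined variable remains in the code.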
Conclusion
It's easy to write bugs in C and C-based languages that are extremely difficult to track down. Valgrind is an incredibly useful tool for discovering and diagnosing these bugs, and we're fortunate to have a tool of this caliber available on the Mac.
That wraps up this edition of Friday Q&A. Come back... well, probably in two weeks for another exciting installment.
As always, Friday Q&A is powered by your suggestions. If you have a topic you would like to see discussed here, post it below or e-mail it to me.
Comments:
Also, forgive the annoyance, but shouldn't the second hermesctl invocation supply the "load" argument and not "unload" again?
I have never noticed this in the years I've used this code, and obviously no one else has either, or at least it's never been fixed by Apple. The code is Copyright 2002! It has no real effect except probably causing some excess redraws which no one ever noticed, but still, very impressive!
How do I get the full error report even if the process gets killed?