mikeash.com: just this guy, you know?

Posted at 2015-07-03 13:52
Tags: c fridayqna valgrind
Friday Q&A 2015-07-03: Address Sanitizer
by Mike Ash  

Outside of Swift 2 news, one of the exciting announcements to come out of WWDC was that clang's Address Sanitizer is now available directly in Xcode 7. Today I'm going to discuss what it is, how it works, and how to use it, a topic suggested by Konstantin Gonikman.

A Most Precarious Situation
C is a great programming language in many ways. The fact that it's still going strong after more than four decades is a testament to that. It's not the first (or second) programming language I learned, but it is the one that first made me truly understand what was going on in these mysterious computer machines, and it's the first language I learned that I still use.

C is also a frighteningly dangerous programming language that's responsible for many woes in the world, and allows for the casual creation of bugs so ridiculous that most other languages can't even express them.

A major problem is memory safety. C has none. Code like this will compile and likely run without a problem:

    char *ptr = malloc(5);
    ptr[12] = 0;

This allocates an array of five bytes, then writes into the 13th byte of that array, silently corrupting whatever happens to be in memory at that location. Probably nothing. Maybe something unimportant. Maybe something important. (On Apple platforms, malloc always allocates at least 16 bytes even if you asked for less, so this should always work there. Do not write code that relies on this fact.)

More sane languages keep track of array sizes and validate indexes before allowing an operation to go through. The equivalent Java code, for example, will reliably throw an exception. When you can count on this, it makes debugging mysterious problems a lot easier. For example, if a variable should contain the value 4, but actually contains the value 5, you know that some piece of code that modifies that variable is to blame. (At least until you reach that stage of debugging where you start looking carefully at the compiler.) In C, you can't assume this. It could be a piece of code that deliberately modifies that variable, or it could be a piece of code that accidentally modifies that variable by using a bad pointer or a bad index.

A whole cottage industry has sprung up to produce solutions to this problem. Clang's static analyzer, for example, can detect certain types of memory safety problems in source code. Programs like Valgrind detect unsafe memory accesses at runtime.

Address Sanitizer is another one of these solutions. It uses a new approach which has some advantages and disadvantages, but it can be a valuable tool for discovering problems in your code.

Memory Access Validation
Many of these tools work by validating memory access at runtime. The theory is that if you can validate accesses as they happen by comparing them against the memory actually allocated by the program, these bugs can be discovered as they happen, rather than being discovered through their side effects long after.

Ideally, every pointer would include data about the size and location of the overall memory region it belongs to, and each access could be validated against that. There's no particular reason a C compiler couldn't be built that does that, but the extra metadata attached to each pointer would make its programs incompatible with code compiled by normal C compilers. That means system libraries couldn't easily be used, and that would severely limit the code that could be tested with such a system.

Valgrind approaches this problem by running the entire program in an emulator. This allows it to work with binaries produced by a normal C compiler without any changes. It then analyzes the program as it runs and keeps track of the state of each chunk of memory as the program manipulates it. This allows it to work with essentially any program without modifications, and system libraries too. This comes with a huge speed penalty, which can make it impractical to run on performance-sensitive code. This approach also requires deep understanding of the semantics of each system call on the platform so that their changes to memory can be appropriately tracked, which requires tight integration with the hosting OS. As a result of that, Valgrind support on the Mac has been hit-or-miss over the years, and as of this writing it does not support 10.10.

Guard Malloc takes advantage of the CPU's built-in memory checking facilities. It replaces the standard malloc function with one that marks the memory off the end of each allocation as unreadable and unwriteable. When the program attempts to access memory off the end, the program traps predictably.

The problem with this is that hardware memory protection is relatively coarse. Memory can only be marked as readable or unreadable with page-level granularity, and a memory page on any modern system is at least 4kB in size. That means that each allocation uses at least 8kB of memory: one page for the allocation itself, and one forbidden page off the end, even if the allocation is just a few bytes. It also means that small overruns may not be detected. Memory needs to be allocated on a 16-byte boundary in order to preserve the guarantees of standard malloc, so any allocation which isn't a multiple of 16 bytes will have a few bytes off the end which aren't marked as forbidden.
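Here's a minimal sketch of that arithmetic in C. It is not Guard Malloc's actual implementation: the function names are hypothetical, and the 4096-byte page size and 16-byte alignment are assumptions taken from the numbers in the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, matching the figures quoted in the text. */
enum { GUARD_PAGE_SIZE = 4096, GUARD_ALIGNMENT = 16 };

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align) {
    return (n + align - 1) & ~(align - 1);
}

/* Address space consumed by one guarded allocation:
   the pages holding the data plus one forbidden page after them. */
size_t guarded_footprint(size_t size) {
    assert(size > 0);
    return round_up(size, GUARD_PAGE_SIZE) + GUARD_PAGE_SIZE;
}

/* Bytes between the end of the allocation and the forbidden page.
   An overrun that stays within this slack does not trap. */
size_t undetected_slack(size_t size) {
    assert(size > 0);
    return round_up(size, GUARD_ALIGNMENT) - size;
}
```

Under these assumptions, the five-byte allocation from the first example has a footprint of 8192 bytes and 11 bytes of slack, so the stray write to ptr[12] would land inside the slack and go unnoticed.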

Address Sanitizer attempts to make this concept of forbidden memory more granular. It's essentially a slower but more practical approach to how Guard Malloc works.

Tracking Forbidden Memory
If hardware memory protection can't be used, then forbidden memory must be tracked in software. Since extra data can't be passed around with a pointer, it must be tracked in some sort of global table. This table needs to be fast to read and fast to modify.

Address Sanitizer uses a simple but brilliant approach. It reserves a fixed section within the process's address space called the shadow memory. In Address Sanitizer terms, a byte that is marked as forbidden is "poisoned," and the shadow memory tracks which bytes are poisoned. A simple formula translates each address within the process's address space into a spot in the shadow memory. Each eight-byte chunk of regular memory maps to a byte of shadow memory, which tracks the poison state of those eight bytes.
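As a rough sketch, the translation is a shift plus a constant offset. The function name is hypothetical, and the offset used here is an assumption: it varies by platform, and this particular value was picked because it is consistent with the addresses in the sample report later in this article.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shadow offset; the real value is platform-specific. */
#define SHADOW_OFFSET UINT64_C(0x100000000000)
/* 2^3 = 8 application bytes are covered by each shadow byte. */
#define SHADOW_SCALE 3

/* Translate an application address into the address of its shadow byte. */
uint64_t shadow_address(uint64_t addr) {
    /* Sanity check for this sketch only: assume a 47-bit user address space. */
    assert(addr < (UINT64_C(1) << 47));
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET;
}
```

With these values, the faulting address 0x60200000df9c from the sample report maps to 0x1c0400001bf3, which is the bracketed shadow byte in that report's dump.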

Since eight bytes of memory map to eight bits of shadow memory, it would be natural to think that each byte's poison state is tracked by one bit in the shadow memory. However, Address Sanitizer actually keeps a single integer in the shadow memory byte. It's assumed that all poisoned memory within an eight-byte chunk is contiguous and at the end, so the shadow byte describes the number of unpoisoned bytes within the chunk. A value of 1 through 7 indicates that the corresponding number of bytes at the beginning of the region are unpoisoned. A value of 0 indicates that the entire region is unpoisoned. A negative value indicates that the entire region is poisoned. This slightly odd scheme allows for simpler computations when checking accesses against the shadow memory. Allocations are never that close together to begin with, so the assumption that poisoned bytes are contiguous and at the end doesn't cause any trouble.
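The encoding can be written out as a pair of small helpers. This is an illustrative sketch with hypothetical function names, not code from Address Sanitizer. The value -6 is simply the byte 0xfa read as a signed number, which the sample report's legend labels as a heap redzone; any negative value would mean "fully poisoned" here.

```c
#include <assert.h>
#include <stdint.h>

/* How many bytes at the start of an 8-byte chunk may be accessed,
   given that chunk's shadow byte. */
int addressable_bytes(int8_t shadow) {
    if (shadow == 0)
        return 8;            /* 0 means the whole chunk is unpoisoned */
    if (shadow < 0)
        return 0;            /* negative means the whole chunk is poisoned */
    assert(shadow <= 7);
    return shadow;           /* 1..7: only the first `shadow` bytes are usable */
}

/* Shadow byte for a chunk whose first k bytes (0 through 8) are unpoisoned. */
int8_t shadow_for(int k) {
    assert(k >= 0 && k <= 8);
    if (k == 8)
        return 0;
    if (k == 0)
        return -6;           /* 0xfa as a signed byte (assumed poison marker) */
    return (int8_t)k;
}
```

For a 12-byte allocation, the first chunk gets shadow_for(8), which is 0, and the second gets shadow_for(4), which is 4.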

With this table structure in place, Address Sanitizer generates extra code in the program to check every read and write through a pointer, and throw an error if the memory in question is poisoned. This is the advantage of being integrated into the compiler and not merely existing as an external library or runtime environment: every pointer access can be reliably identified and the appropriate checks added into the machine code.
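To get a feel for the check that gets inserted before each access, here's a toy model in C. It follows the shape of the check described in the Address Sanitizer algorithm documentation, but everything else is an assumption made for illustration: the 64-byte arena, the toy_ names, the use of offsets instead of real addresses, and the -6 poison value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { ARENA_SIZE = 64 };

/* One shadow byte per 8 bytes of the toy arena. */
static int8_t toy_shadow[ARENA_SIZE / 8];

/* Poison the whole arena, then unpoison the range [start, start + size).
   start must be 8-byte aligned, as allocations are assumed to be. */
void toy_allocate(size_t start, size_t size) {
    assert(start % 8 == 0 && start + size <= ARENA_SIZE);
    for (size_t i = 0; i < ARENA_SIZE / 8; i++)
        toy_shadow[i] = -6;                 /* fully poisoned */
    size_t chunk = start / 8;
    for (; size >= 8; size -= 8)
        toy_shadow[chunk++] = 0;            /* fully unpoisoned chunk */
    if (size > 0)
        toy_shadow[chunk] = (int8_t)size;   /* partially unpoisoned chunk */
}

/* Returns true if an access of `size` bytes at `offset` touches poisoned
   memory. The access must not cross an 8-byte chunk boundary. */
bool toy_access_is_bad(size_t offset, size_t size) {
    assert(size >= 1 && (offset & 7) + size <= 8);
    assert(offset + size <= ARENA_SIZE);
    int8_t shadow = toy_shadow[offset / 8];
    if (shadow == 0)
        return false;                       /* fast path: whole chunk is fine */
    int last = (int)(offset & 7) + (int)size - 1;
    return last >= shadow;                  /* always true for negative shadow */
}
```

After toy_allocate(16, 12), a one-byte access at offset 27 passes and one at offset 28 fails, which mirrors the 12-byte overflow in the sample program later in this article. The fast path may also be the reason 0 means "fully unpoisoned": most memory a program touches is valid, so the common case reduces to a single comparison against zero.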

Compiler integration also allows neat tricks like the ability to poison and guard local and global variables, not just heap allocations. Locals and globals are allocated with a bit of extra padding in between them, and the padding is poisoned to catch any overflows. This is something that Guard Malloc can't do, and that Valgrind has difficulty with.

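Compiler integration has downsides as well. In particular, Address Sanitizer can't catch bad memory accesses made by system libraries, since their code isn't compiled with the extra checks. It is compatible with system libraries, in that you can turn on Address Sanitizer, build a program that links against Cocoa (for example) and have it work, but it won't catch bad memory accesses performed by Cocoa itself. Bad accesses performed by your own code on heap memory allocated by Cocoa can still be caught, because those allocations also go through Address Sanitizer's replacement allocator and get poisoned redzones.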

Address Sanitizer also helps to catch use-after-free errors. When memory is freed, it's all marked as poisoned, so subsequent accesses will be trapped. Use-after-free errors are particularly nasty when the memory is reused for a new allocation first, because then you corrupt unrelated bits of data. Address Sanitizer defends against this by placing newly freed memory into a recycling queue that keeps it unallocated for a while before it can be reused.
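The recycling queue can be pictured as a simple first-in, first-out buffer. The sketch below is a toy model, not Address Sanitizer's implementation: the names are hypothetical, blocks are represented by integer ids, and the queue holds a fixed number of blocks, which is a simplification I'm assuming for clarity.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed capacity for this toy model. */
enum { QUARANTINE_CAPACITY = 4 };

static int    quarantine[QUARANTINE_CAPACITY];
static size_t quarantine_head;    /* index of the oldest entry */
static size_t quarantine_count;   /* number of entries currently held */

/* Called when a block is freed. The block stays poisoned and unavailable
   while it sits in the queue. Returns the id of the block that leaves the
   queue and becomes reusable as a result, or -1 if none does. */
int toy_free(int block_id) {
    assert(block_id >= 0);
    int released = -1;
    if (quarantine_count == QUARANTINE_CAPACITY) {
        /* Queue is full: release the oldest block to make room. */
        released = quarantine[quarantine_head];
        quarantine_head = (quarantine_head + 1) % QUARANTINE_CAPACITY;
        quarantine_count--;
    }
    size_t tail = (quarantine_head + quarantine_count) % QUARANTINE_CAPACITY;
    quarantine[tail] = block_id;
    quarantine_count++;
    return released;
}
```

In this model a freed block only becomes eligible for reuse after four more blocks have been freed behind it. Any access to it in the meantime hits poisoned memory and is reported.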

Adding a check for every single pointer access carries substantial overhead, of course. It depends heavily on just what your code is doing, since different types of code may access pointer contents much more or less frequently. On average, expect a roughly 2-5x slowdown. This is significant, but usually not enough to make a program unusable.

How to Use
With Xcode 7, using Address Sanitizer is simple. When compiling from the command line, add -fsanitize=address to the clang invocation. Here's a program that exercises it:

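    #include <stdlib.h>

    // Write one byte through the pointer. Nothing here checks the index.
    void Write(char *ptr, size_t index, char value) {
        ptr[index] = value;
    }

    int main(int argc, char **argv) {
        char *ptr = malloc(12);   // valid indexes are 0 through 11
        Write(ptr, 12, 42);       // writes one byte past the end
    }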

Compiling and running with Address Sanitizer:

    $ clang -fsanitize=address test.c
    $ ./a.out

It quickly crashes and produces a bunch of output:

    ==18186==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000df9c at pc 0x000101025efc bp 0x7fff5ebda8a0 sp 0x7fff5ebda898
    WRITE of size 1 at 0x60200000df9c thread T0
        #0 0x101025efb in Write (/Users/mikeash/Dropbox/shell/asan/./a.out+0x100000efb)
        #1 0x101025f46 in main (/Users/mikeash/Dropbox/shell/asan/./a.out+0x100000f46)
        #2 0x7fff940025c8 in start (/usr/lib/system/libdyld.dylib+0x35c8)
        #3 0x0  (<unknown module>)

    0x60200000df9c is located 0 bytes to the right of 12-byte region [0x60200000df90,0x60200000df9c)
    allocated by thread T0 here:
        #0 0x101070960 in wrap_malloc (/Applications/Xcode-beta.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib+0x42960)
        #1 0x101025f2d in main (/Users/mikeash/Dropbox/shell/asan/./a.out+0x100000f2d)
        #2 0x7fff940025c8 in start (/usr/lib/system/libdyld.dylib+0x35c8)
        #3 0x0  (<unknown module>)

    SUMMARY: AddressSanitizer: heap-buffer-overflow ??:0 Write
    0x1c0400001ba0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x1c0400001bb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x1c0400001bc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x1c0400001bd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x1c0400001be0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    =>0x1c0400001bf0: fa fa 00[04]fa fa 00 06 fa fa 00 00 fa fa 00 04
    0x1c0400001c00: fa fa 00 06 fa fa 00 07 fa fa 00 fa fa fa 00 00
    0x1c0400001c10: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
    0x1c0400001c20: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
    0x1c0400001c30: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
    0x1c0400001c40: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
    Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable:           00
    Partially addressable: 01 02 03 04 05 06 07 
    Heap left redzone:       fa
    Heap right redzone:      fb
    Freed heap region:       fd
    Stack left redzone:      f1
    Stack mid redzone:       f2
    Stack right redzone:     f3
    Stack partial redzone:   f4
    Stack after return:      f5
    Stack use after scope:   f8
    Global redzone:          f9
    Global init order:       f6
    Poisoned by user:        f7
    Container overflow:      fc
    Array cookie:            ac
    Intra object redzone:    bb
    ASan internal:           fe
    Left alloca redzone:     ca
    Right alloca redzone:    cb
    ==18186==ABORTING
    Abort trap: 6

This is a wealth of information, and in a real-world scenario it would be enormously helpful in tracking down the problem. It not only shows where the bad write occurred, but where the memory was originally allocated, and a bunch of extra data besides.

Using Address Sanitizer from within Xcode is just as easy: edit your scheme, click the Diagnostics tab, and check the box labeled "Enable Address Sanitizer." Then just build and run as usual, and watch the diagnostics roll in.

Bonus Feature: Undefined Behavior Sanitizer
Bad memory accesses are just one of the many entertaining undefined behaviors offered by C. Clang offers another sanitizer which catches many instances of undefined behavior. Here's an example program:

    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        int value = 1;
        for(int x = 0; x < atoi(argv[1]); x++) {
            value *= 10;
            printf("%d\n", value);
        }
    }

Let's run it:

    $ clang undefined.c 
    $ ./a.out 15
    10
    100
    1000
    10000
    100000
    1000000
    10000000
    100000000
    1000000000
    1410065408
    1215752192
    -727379968
    1316134912
    276447232
    -1530494976

That got a little weird at the end. No surprise: signed integer overflow is undefined behavior in C. It would be great to catch that instead of just producing bad data. Undefined behavior sanitizer to the rescue! This is enabled by passing -fsanitize=undefined-trap -fsanitize-undefined-trap-on-error:

    $ clang -fsanitize=undefined-trap -fsanitize-undefined-trap-on-error undefined.c
    $ ./a.out 15
    10
    100
    1000
    10000
    100000
    1000000
    10000000
    100000000
    1000000000
    Illegal instruction: 4

This doesn't give any additional information like Address Sanitizer does, but it does stop execution right at the point where the undefined behavior occurred and the problem can easily be inspected in the debugger.

The undefined behavior sanitizer is not integrated with Xcode at the moment. You can enable it for your app by adding the above compiler flags to your project's build settings directly.
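If you want to guard one specific calculation without a sanitizer, the same kind of check can be written by hand. This sketch uses the __builtin_mul_overflow builtin, which I'm assuming is available in your compiler (recent versions of clang and gcc provide it). The function names are hypothetical, and this is not a description of how the sanitizer itself is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Multiply *value by 10 in place. On overflow, leave *value untouched
   and return false instead of invoking undefined behavior. */
bool times_ten_checked(int *value) {
    assert(value != NULL);
    int result;
    if (__builtin_mul_overflow(*value, 10, &result))
        return false;
    *value = result;
    return true;
}

/* Run the loop from the example program, stopping at the first overflow.
   Returns how many multiplications completed safely. */
int count_safe_steps(int iterations) {
    int value = 1;
    int steps = 0;
    for (int x = 0; x < iterations; x++) {
        if (!times_ten_checked(&value))
            break;
        steps++;
    }
    return steps;
}
```

Assuming a 32-bit int, count_safe_steps(15) stops after nine multiplications, the same point at which the sanitized build above traps.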

Conclusion
Address Sanitizer is a great piece of technology that can catch a lot of memory errors in C code. It's not perfect and it won't find all errors, but even so it provides some extremely useful diagnostics. I highly recommend that you try it on your code base and see what it finds. The results might surprise you.

That's it for today. Come back next time for more gooey goodness. Friday Q&A is driven by reader suggestions, as always, so if you have a topic you'd like to see me discuss here, please send it in!


Comments:

If you compile with -fsanitize=undefined, you get diagnostics (like with ASan), but the program continues running:

$ clang -fsanitize=undefined undefined.c && ./a.out 15
10
100
1000
10000
100000
1000000
10000000
100000000
1000000000
undefined.c:7:19: runtime error: signed integer overflow: 1000000000 * 10 cannot be represented in type 'int'
1410065408
1215752192
-727379968
1316134912
276447232
-1530494976

Shameful that it took Apple *years* to merge ASan into their fork of clang. I wonder if it was NIH syndrome, what with ASan coming out of Google...

Also: worth noting that UBSan isn't new to Xcode 7, it's there in earlier versions.

Nice article. Address Sanitizer is excellent. A couple of comments:

It's essentially a slower but more practical approach to how Guard Malloc works.

I'm not sure that Address Sanitizer is actually slower than Guard Malloc. Guard Malloc (or pageheap, the Windows equivalent) can be incredibly slow, for a few reasons.

1) Allocating a page for every allocation spreads out the address space which can cause paging, or at least TLB thrashing, and allocating pages is typically much more costly than suballocating small amounts of memory.

2) Guard Malloc has horrible cache efficiency. All small allocations are clustered at the end of pages. That means that they all use the same cache sets. If most allocations are <= 64 bytes then the cache sets corresponding to page offsets from 0 to 4031 bytes will be unused - that's 98.4% of the cache that is unusable. Not surprisingly, most programs run *much* slower when their cache size is ~1.6% of normal.

However there is also one area in which this article exaggerates the costs of Guard Malloc:

That means that each allocation uses at least 8kB of memory

Each allocation needs at least 8kB of *address space*. It should be possible to use 4kB of memory for the allocation and 4kB of address space reserved-but-not-mapped for the guard page. At least, that's how it works on pageheap on Windows.

Compiler integration has downsides as well. In particular, Address Sanitizer can't catch bad memory accesses in system libraries. It is compatible with system libraries, in that you can turn on Address Sanitizer, build a program that links against Cocoa (for example) and have it work, but it won't catch bad memory accesses performed by Cocoa, or performed by your code on memory allocated by Cocoa.

I don't think the very last part is true. ASan *will* catch bad memory access performed by *your code* on memory allocated by Cocoa. All allocations go through ASan's allocator and they get poisoned redzones even when allocated by Cocoa.

It's assumed that all poisoned memory within an eight byte chunk is contiguous and at the end, so the shadow byte contains the number of unpoisoned bytes within the chuck. A value of 0 indicates that all memory is unpoisoned, 1 indicates that the last byte is poisoned, 2 indicates that the last two bytes are poisoned, etc. and 7 indicates that all bytes are poisoned. For the case where all 8 bytes are poisoned, the value is negative.

This seems contradictory. If the shadow byte contains the number of unpoisoned bytes within the chunk, then why would a value of 0 – meaning "zero unpoisoned blocks" – indicate that all memory is unpoisoned? And then at the end, you say that both 7 and a negative value indicate all 8 bytes are poisoned.

Clang's documentation says:
There are only 9 different values for any aligned 8 bytes of the application memory:

All 8 bytes in qword are unpoisoned (i.e. addressible). The shadow value is 0.
All 8 bytes in qword are poisoned (i.e. not addressible). The shadow value is negative.
First k bytes are unpoisoned, the rest 8-k are poisoned. The shadow value is k

Source: https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
This makes more sense to me.

Thank you, I screwed that description up a bit. I've rewritten it a bit and I think I got it right now. Note however that the odd meaning of 0 is real. For reasons I don't fully understand, a positive number means that many bytes are unpoisoned, but zero means all bytes are unpoisoned. This is inconsistent, but must make things simpler somehow.

You should (almost) always add "-g" to invocations of "clang -fsanitize=address" to get even more useful error reports.

Any idea why Apple clang doesn't support

-fsanitize=undefined?

When I try that I get:

clang -fsanitize=undefined ub.c
...
ld: file not found: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin/libclang_rt.ubsan_osx_dynamic.dylib

If you use -fsanitize=address,undefined it works. Hopefully this helps anyone else who finds this post while searching for that error like I did.

Compiler integration also allows neat tricks like the ability to poison and guard local and global variables, not just heap allocations. Locals and globals are allocated with a bit of extra padding in between them, and the padding is poisoned to catch any overflows. This is something that Guard Malloc can't do, and that Valgrind has difficulty with.

Sorry, I don't quite understand why locals and globals are allocated with a bit of extra padding in between them. In my opinion, local variables such as those variables allocated in stack frame can be next to each other without paddings in them. How can address sanitizer detect such memory access problem?

Mike --

I have a class data member, SomeClass * mFooPtr.

mFooPtr gets deallocated as expected.

After that point, I have code that sets:

mFooPtr = NULL;

So there can be no doubt that mFooPtr has been deallocated.

With address sanitizer turned on, it halts at my assignment to mFooPtr:

AddressSanitizer report breakpoint hit. Use 'thread info -s' to get extended information about the report.

(lldb) thread info -s
thread #1: tid = 0x6ca2f7, 0x9caf3d2f libsystem_c.dylib`__abort + 230, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)

Does that seem normal? It's pretty common to clear out a pointer data member after deallocation.
