mikeash.com: just this guy, you know?

Posted at 2010-12-03 16:53 | RSS feed (Full text feed) | Blog Index
Next article: Friday Q&A 2010-12-17: Custom Object Allocators in Objective-C
Previous article: Friday Q&A 2010-11-19: Creating Classes at Runtime for Fun and Profit
Tags: accessors fridayqna memory objectivec threading
Friday Q&A 2010-12-03: Accessors, Memory Management, and Thread Safety
by Mike Ash  

It's once again time for a brand new edition of Friday Q&A. This week, I'm going to talk about accessors, and how to properly deal with memory management and thread safety when creating them, a topic suggested by Daniel Jalkut.

More Complicated Than You Might Think
Cocoa's reference counting memory management usually works pretty well, and for the most part accomplishes its goal of making memory management decisions a local, rather than global, affair.

There's a big exception. Spot the bug in this code:

    NSMutableDictionary *dict = ...;
    id obj = [dict objectForKey: @"foo"];
    [dict removeObjectForKey: @"foo"];
    [obj something];
This bug isn't too hard to find. By removing obj from dict, you've potentially destroyed the object, which then causes your program to crash on the last line.

Of course, this bug isn't always this easy to find. The removal may be buried several method calls deep, and the object may sometimes be retained elsewhere, hiding the bug and making your program crash only inconsistently.

The same problem can happen with a regular object and accessors:

    id obj = [otherObj foo];
    [otherObj setFoo: newFoo];
    [obj something]; // crash
This is not necessarily a problem. It all depends on how exactly -foo and -setFoo: are implemented.

Basic Accessors
The most basic accessors look something like this:

    - (void)setFoo: (id)newFoo
    {
        [newFoo retain];
        [_foo release];
        _foo = newFoo;
    }
    
    - (id)foo
    {
        return _foo;
    }
With these basic accessors, the example is buggy and must be avoided.

This does not mean that these basic accessors are wrong. It does mean that you need to be more careful when writing code that uses them:

    id obj = [[otherObj foo] retain];
    [otherObj setFoo: newFoo];
    [obj something]; // safe
    [obj release];
Since most code doesn't set new values before using the old ones, most code won't need to worry about the problem at all. However, since you do sometimes write code like this, you do need to keep this problem in mind, and code accordingly.

Autoreleasing Accessors
Rather than make callers retain and release temporary objects, another approach is to modify the accessors to use autorelease. You can do this in the setter:

    - (void)setFoo: (id)newFoo
    {
        [_foo autorelease];
        _foo = [newFoo retain];
    }
However, this still leaves you vulnerable to a crash if there's a temporary autorelease pool set up when the setter is called. It's better to do it in the getter instead:
    - (id)foo
    {
        return [[_foo retain] autorelease];
    }
This retain/autorelease combo keeps the object alive even after the setter releases it.

This solves the problem, but there are a couple of downsides. The obvious one is that it's less efficient. Using autorelease is a bit slower than release, and it keeps the target object alive longer, which could lead to more memory usage. This isn't usually very important, but it's something to keep in mind.

More importantly, pervasive use of autorelease can make it harder to track down memory management errors. If you over-release an object, you want to crash as soon as possible. Using autorelease can often cause the crash to be in NSAutoreleasePool code, making it considerably harder to track down. Using zombies and Instruments will help a lot, but it can still make your job harder.

Thread Safe Accessors
When writing thread safe accessors, the autorelease getter becomes mandatory. Another thread could call the setter after it returns, and the retain/autorelease dance is the only way to ensure that this does not destroy the previous object in the middle of being used.

Thread safe accessors, like any access to shared data, need to use a lock. You can use @synchronized(self), an instance of NSLock, or whatever other locking mechanism you prefer.

(Lockless access to shared data is, of course, possible, but far too tricky and difficult to cover here. Better to skip it and use locks.)

By locking all shared access, and by using the autorelease version of the getter, you end up with thread safe accessors:

    - (void)setFoo: (id)newFoo
    {
        [newFoo retain];
        @synchronized(self)
        {
            [_foo release];
            _foo = newFoo;
        }
    }
    
    - (id)foo
    {
        id returnFoo;
        @synchronized(self)
        {
            returnFoo = [_foo retain];
        }
        return [returnFoo autorelease];
    }
The locking adds a performance penalty and increases the complexity of the code, so you should only use this style when you need it.

Do You Need Thread Safe Accessors?
In almost every case, the answer to this question is "no". Accessors are usually not the right place to worry about thread safety.

The problem is that thread safety isn't a composable attribute. If you take a bunch of thread safe components and bundle them together, the result may not be thread safe.

For a simple example, imagine a Person class with properties for the first and last name of the person it represents. One thread does this:

    [person setFirstName: @"John"];
    [person setLastName: @"Doe"];
While another thread does this:
    NSString *f = [person firstName];
    NSString *l = [person lastName];
    NSString *fullName = [NSString stringWithFormat: @"%@ %@", f, l];
Thread safe accessors will keep this code from crashing, but it does not protect against an inconsistent result! If the previous name was "Bob Smith", this code could easily result in a fullName of "John Smith", which doesn't correspond to either person.

In order to make this code safe, thread safety needs to be applied at a higher level. For example, you might have a PersonDatabase object which could be locked and unlocked for any manipulations. You might decree that all Person access go through a single serial Grand Central Dispatch queue. You might just make all Person access happen on the main thread.

No matter which solution you pick, once you've picked it, the thread safe accessors are no longer necessary. The larger-scale thread safety takes care of problems, and all the thread safe accessors do is make your code unnecessarily slow and complex.

There are cases where a thread safe accessor is useful. You may have a single property that's accessed from many threads and which won't experience problems being inconsistent with other properties, in which case you'd want a thread safe accessor. However, 99% of the time there is no point in making them.

Properties and nonatomic
Given the above, it's extremely puzzling that Apple has made @property declarations default to atomic. Most of the time it's pointless, and it can give the mistaken impression that the programmer doesn't have to worry about thread safety anymore, because the @property handles it all.

The @property and @synthesize constructs can be a good way to generate accessors without writing code, just be aware that the default atomic behavior is not all that useful, and doesn't mean you can forget about thread safety.

Garbage Collection
If you're using garbage collection, this whole question becomes vastly simpler. Here's what a correct, thread safe accessor pair looks like in garbage collected code:

    - (void)setFoo: (id)newFoo
    {
        _foo = newFoo;
    }
    
    - (id)foo
    {
        return _foo;
    }
The lack of explicit memory management calls really simplifies accessors here.

Conclusion
Writing accessors is easy, but there are some subtleties. The vast majority of the time, a plain retain/release setter and a return _ivar getter is all you need. However, depending on your situation and your individual taste, you may want to put retain/autorelease into the getter in order to simplify the code that calls it.

When it comes to thread safety, accessors are usually the wrong place to worry about it. While there are legitimate cases where it's useful, most of the time you should back up and take on the problem of thread safety at a higher level of your code. And don't let the atomic @property keyword fool you: it doesn't do anything special in this regard.

If you're coding exclusively for garbage collection, you can pretty much forget about this whole business and write much simpler code.

That's it for this edition of Friday Q&A. As usual, another one will be posted in two weeks. Until then, you can pass the time by sending me suggestions for topics. If you have an topic that you would like to see covered here, please send it in.

Did you enjoy this article? I'm selling a whole book full of them. It's available for iBooks and Kindle, plus a direct download in PDF and ePub format. It's also available in paper for the old-fashioned. Click here for more information.

Comments:

Brian at 2010-12-03 21:35:03:
In the section where you use the Person object, you mention that your first example that @synchronized would keep the code from crashing. Can you please explain why it would crash without it? Would it be when a thread uses the getter in between another thread calling release and assigning the new value? Thanks for all the great info.

mikeash at 2010-12-03 22:29:38:
Yes, that's exactly right. Imagine this sequence of events:

// thread 1
[_foo release];

// thread 2
id foo = [obj foo]

// thread 1
_foo = [newFoo retain];

Thread 2 ends up with a reference to a released, potentially destroyed object.

foobaz at 2010-12-04 22:48:05:
I wish -[NSDictionary objectForKey:] returned a retained+autoreleased object, to avoid the bug in your first example. Most Cocoa methods return autoreleased objects, so this behavior would be more consistent.

Tommy at 2010-12-18 01:47:28:
In the thread safe accessor section, why is the synchronize call necessary in the getter function?

mikeash at 2010-12-18 02:13:22:
The getter needs to be synchronized because [_foo retain] is not an atomic operation. It's split into two steps: retrieving the value of _foo, and then sending retain to it. If the setter were to run in between those two steps, the retain would be sent to a dangling pointer and can crash.


Comments RSS feed for this page

Add your thoughts, post a comment:

Spam and off-topic posts will be deleted without notice. Culprits may be publicly humiliated at my sole discretion.

Name:
Web site:
Comment:
Formatting: <i> <b> <blockquote> <code>. URLs are automatically hyperlinked.
Code syntax highlighting thanks to Pygments.
Hosted at DigitalOcean.