Next article: Cocoa Unbound
Previous article: Friday Q&A 2010-08-12: Implementing NSCoding
Tags: cocoa defensive fridayqna objectivec
Welcome back to another word-laden edition of Friday Q&A. About a year ago, I wrote a post on defensive programming. That post covered defensive programming in a general sense, and Scott Gould has requested that I write one specific to various standard Cocoa practices, which is what I will be talking about today.
Recap
Defensive coding essentially boils down to constantly asking yourself, "what if it fails?" and coding appropriately. Your code's response to failure can be ranked:
- Corrupt/delete user data
- Crash/freeze
- Fail silently
- Display an error
- Work around the failure
When it comes to defensive Cocoa programming, a lot of failures are really just unanticipated changes. What happens if something makes a subclass of your class? What happens if your superclass's dealloc
implementation changes? What happens if the behavior of an object changes but remains within the API contract?
Initializers
What better place to start than at the beginning of an object's lifecycle?
I wrote a full post on implementing Cocoa initializers, so I won't go into too much detail here. The main things you need to do to write a defensive initializer are to always ensure that your superclass's initializer gets called (even if it doesn't do anything, it could start doing something later), always assign self
to the result, and always check the result for nil
. Thus:
- (id)init
{
self = [super init];
if(self)
{
// ... initialize instance variables here
}
return self;
}
When your object is destroyed, it must clean up after itself. Naturally, you need to be careful when doing this.
First, always write your dealloc
implementation to tolerate an uninitialized or partially initialized object. It's possible that your initializer, or a superclass initializer, will encounter an error and decide to destroy your object before it's fully formed, and your code should tolerate this. You can take advantage of the fact that Objective-C objects start out zero-filled. For example:
- (void)dealloc
{
// if statement guards against uninitialized object
if(someStructPointer)
[someStructPointer->object release];
[super dealloc];
}
nil
do nothing, and not have to write any extra code to handle the case of an uninitialized object.
Setters
The same thing goes for setters, especially those which are overridden from a superclass. The superclass may call the setter with nil
to destroy an object rather than directly using release
. Always write your code to tolerate this. For example:
- (void)setFoo: (id)newFoo
{
// make sure we don't do something bad if newFoo is nil!
if(newFoo)
[[NSNotificationCenter defaultCenter]
addObserver: self
selector: @selector(_fooChanged)
name: FooDidChangeNotification
object: newFoo];
[super setFoo: newFoo];
}
nil
. It's just good practice, and doesn't hurt.
+initialize
Just as you have to code defensively when initializing your objects, so do you have to for initializing your classes.
The +initialize
method is extremely useful for doing basic setup of class-wide data. It's invoked automatically by the runtime the first time any message (such as alloc
) is sent to your class.
For example, say you need a global dictionary to hold certain items:
static NSMutableDictionary *gDictionary;
+ (void)initialize
{
gDictionary = [[NSMutableDictionary alloc] init];
}
+initialize
. The runtime still sends the message, and so it ends up invoking your version instead. Suddenly you've leaked the old dictionary and lost all of the data in it. Whoops.
The fix is simple: just check the identify of self
before you do your initialization.
static NSMutableDictionary *gDictionary;
+ (void)initialize
{
if(self == [MyClass class])
gDictionary = [[NSMutableDictionary alloc] init];
}
+initialize
implementation should always be a check of self
to ensure that it's the correct class.
You might be thinking that you didn't write any subclasses and don't plan to, so you don't need to do this. There are two problems with that approach.
First is the obvious one that you might simply be wrong about your future plans. If in a year you change your mind and decide that you do need a subclass, you don't want this code to suddenly break in mysterious and hard-to-debug ways.
Less obvious is the fact that subclasses of your class can be created dynamically at runtime. This is done by Cocoa's key-value observing system, a key component of Cocoa bindings. These dynamic subclasses still cause +initialize
to execute, so be prepared for it.
Memory Management
If you have an object pointer instance variable, never assign to that variable except in init
, dealloc
, and a setter method. In other words, don't write code that does this directly:
[myObjIvar release];
myObjIvar = [[self makeNewObj] retain];
[self setMyObjIvar: [self makeNewObj]];
On the subject of setters, if you're writing your own setter, remember that you must either do an if
check to make sure that the object you have is genuinely new, or you must do a retain
(or copy
) before you do a release
, to make your code robust against calling the setter with the same object as is already held in the variable. In other words, don't do this:
- (void)setMyObjIvar: (id)obj
{
[myObjIvar release]; // could destroy obj!
myObjIvar = [obj retain];
}
- (void)setMyObjIvar: (id)obj
{
if(obj != myObjIvar)
{
[myObjIvar release];
myObjIvar = [obj retain];
}
}
- (void)setMyObjIvar: (id)obj
{
[obj retain]; // or obj = [obj copy];
[myObjIvar release];
myObjIvar = obj;
}
Copying
Object copying is a source of horrors when it comes to subclassing Cocoa objects, due to the two different ways that exist to create a copy of an object in the top-level implementation of copyWithZone:
.
The sane way to implement copyWithZone:
is to either return [self retain]
(only if the object is immutable) or to create a new instance of the object's class and populate it to hold the same values using accessors or direct instance variable access:
- (id)copyWithZone: (NSZone *)zone
{
// note use of [self class] rather than MyClass
// this is defensive programming against subclassing!
MyClass *newObj = [[[self class] alloc] init];
[newObj setFoo: _foo];
[newObj setBar: _bar];
return newObj;
}
copyWithZone:
is to use the NSCopyObject
function. This function allocates a new object of the same class and returns it. The problem is that it also performs a bitwise copy of all instance variables. These instance variables include things like pointers to objects. It does not retain them, merely copies their value.
Never, ever, ever use NSCopyObject
. Easy, right? The problem is that Cocoa uses it, and if you ever subclass Cocoa objects, you have to be aware.
Even worse: you often can't count on the Cocoa implementation using one technique or another. It could use either one, and which one it uses could switch at any time. So if you ever subclass a Cocoa object that implements NSCopying
, and you add instance variables, you need to write a copyWithZone:
override that handles both cases correctly. Fortunately this is easier than it sounds.
First, let's examine the problem in more detail. Here's an example NSCell subclass:
@interface MyCell : NSCell
{
NSString *_someString;
}
- (void)setSomeString: (NSString *)newString;
@end
@implementation MyCell
- (void)setSomeString: (NSString *)newString
{
[newString retain];
[_someString release];
_someString = newString;
}
- (void)dealloc
{
[_someString release];
[super dealloc];
}
@end
MyCell *cell = [[MyCell alloc] init];
[cell setSomeString: string];
MyCels *cell2 = [cell copy];
[cell release];
[cell2 release];
The implementation of -[NSCell copyWithZone:]
uses NSCopyObject
. This means that it copies the _someString
pointer, but does no memory management on it. You end up with two pointers to the string but only one of which is retained. They both get released in dealloc
. Crash.
The solution is simple: override copyWithZone:
and retain or copy the instance variable:
- (id)copyWithZone: (NSZone *)zone
{
MyCell *newObj = [super copyWithZone: zone];
[newObj->_someString retain];
return newObj;
}
NSCell
implementation changed to no longer call NSCopyObject
, this implementation will fail, because newObj->_someString
will be nil
. It won't crash, but the copy won't be a proper copy either. And you can't just switch to using [newObj setSomeString: _someString]
because that crashes the NSCopyObject
case. We need code that works equally well for both.
The answer is to directly assign to the instance variable of the other object and do memory management at the same time, like so:
- (id)copyWithZone: (NSZone *)zone
{
MyCell *newObj = [super copyWithZone: zone];
newObj->_someString = [_someString retain];
return newObj;
}
NSCopyObject
case, it overwrites the existing pointer with a new one, but without releasing the old one.
Note that NSCell
is by far the most commonly subclassed class that conforms to NSCopying
. If you ever subclass NSCell
or one of its subclasses, you must implement copyWithZone:
, and you must do it correctly using the above technique. Otherwise you leave yourself open to extremely mysterious crashes and corruption problems. I've been there, it's no fun.
Error Returns
A lot of Cocoa methods take NSError **
parameters to tell you why something went wrong. Always check the retun value from these methods! And always at least log the error if there is one.
Don't do this:
NSString *string = [NSString stringWithContentsOfFile: path encoding: NSUTF8StringEncoding error: NULL];
[self corruptImportantDataIfNil: string];
string
and use the error if it's nil
. At the very least, use an assert to nicely stop the current operation rather than continuing with bad data:
NSError *error;
NSString *string = [NSString stringWithContentsOfFile: path encoding: NSUTF8StringEncoding error: &error];
NSAssert(string, @"Could not load string, error: %@", error);
[self corruptImportantDataIfNil: string];
Always at least check for failure and log the error. Even if subsequent code won't corrupt data in the error case, it's still much easier to figure out why your code isn't working if you can catch the failure as early as possible.
A Note on Checking Errors
This has been repeated in many other places, but bears another mention. Always check the return value of such a method, and not the NSError
variable directly. For example, this is wrong:
NSError *error = nil;
NSString *string = [NSString stringWithContentsOfFile: path encoding: NSUTF8StringEncoding error: &error];
NSAssert(!error, @"Could not load string, error: %@", error);
error
variable with junk upon success. This is a stupid policy, but it's the policy which is there, so you must take it into account.
Weak References
The standard way to set up an NSTableView
using a data source is to create an object to be the data source, implement the required methods, and then hook up the dataSource
outlet of the table view. However, if you don't do anything else, this is actually a dangerous setup.
The problem is that the order of object destruction isn't defined. If your data source gets destroyed before the table view, the table view could potentially message the data source after it's been destroyed, causing your program to crash.
To ensure that this never happens, you must zero out the data source in your dealloc
. However, simply adding that isn't safe either! It's possible for the table view to be destroyed first, and then your message to it will crash. What to do?
The answer to this conundrum is to retain the table view outlet. Don't rely on direct instance variable twiddling, but rather write a setter for it that retains it. Then you can write your dealloc
like this:
- (void)dealloc
{
[tableView setDataSource: nil];
[tableView release];
[super dealloc];
}
Most weak references are subject to this problem, including most delegate references from Cocoa objects. Most of the time you can get away with ignoring it, but it's much safer to do it right.
For a more comprehensive solution to the problem of weak references, check out MAZeroingWeakRef. By using the MAZeroingWeakProxy
class and the Objective-C associated object API, it becomes trivial to create delegate and data source connections which are completely safe.
Categories
Categories are a great feature of Objective-C. It's really handy to be able to add new methods to Cocoa classes that you can then use anywhere in your program. However, it can also be dangerous.
The danger comes when your method name clashes with one that Apple has put in the class. Your method will override the existing one. If they do different things, trouble will ensue. Even worse, your category method could be just fine today, but conflict with a method added by Apple in the next release of the OS.
For example, say you add a map:
method to NSArray:
@interface NSArray (MyAdditions)
- (NSArray *)map: (id (^)(id obj))block;
@end
map:
method in 10.7, and it doesn't have identical semantics, you'll be in big trouble.
The only way to avoid this is to ensure that your category methods on Apple classes will never have a name conflict. The most obvious way to ensure this is to add a prefix to the method name:
- (NSArray *)ma_map: (id (^)(id obj))block;
Another way to ensure that you never have a conflict is to give your method an extremely specific name that you can be confident will never be used by Apple. For example:
- (NSArray *)arrayByMappingMyFooElements: (id (^)(MyFoo *foo))block;
Conclusion
Cocoa is a complex system that can have some hidden gotchas. However, for the most part, all you need to do is be aware of the fact that your application is really just one component in a larger system. Write your application to be a good citizen and cooperate nicely with the other components that get loaded into your process. Be particularly careful when subclassing and when making modifications to framework classes. The cases presented above are just a sampling of useful defensive programming techniques in Cocoa. Keep defensive programming on your mind no matter what you write, and your code will be better for it.
That's it for this round of Friday Q&A. I will probably be taking a few weeks off for (good) personal reasons, so don't panic if two weeks go by and I haven't posted another one. I will resume before too long, but won't make any promises as to when just yet.
As always, if you have ideas for topics that you would like to see covered here, send them to me. I may not get to it right away, but I will certainly put your suggestion on my list for the future.
Comments:
I personally prefer to use a method suffix instead of a method prefix. That way I can still name the method something logical and find it in the autocompletion list, without having to remember that it's a category method. So instead of doing:
- (void) dd_map:(id (^)(id obj))block;
I do:
- (void) map_dd:(id(^)(id obj))block;
1st:
I use the
- (id)init
{
if(! (self=[super init]) )
{
return nil;
}
// put your code herebut remember
if( somethingfails )
{
[self release];
return nil;
}
return self;
}
this makes it easier to understand in my opinion as you don't have a mile long open/close bracket if your init routine is long. You don't have to keep in mind that open bracket the whole time.
Furthermore remind people to use [self release] in the init method once they have self filled with something but can't initialize the object anyways.
2nd: the link to http://www.mikeash.com/pyblog/the-how-and-why-of-cocoa-initializers.html is broken as it contains a URI which is too long.
(http://developer.apple.com/iphone/library/documentation/Cocoa/Conceptual/CodingGuidelines/CodingGuidelines.pdf page 11)
Rob Wills: I've always found that warning to be a bit strange. Using an underscore prefix is at the very least no more dangerous than not using a prefix at all. Apple can and does add un-prefixed methods to their classes, so unprefixed names can clash too. Adding an underscore at the very least does not increase the probability of breaking something in the future.
I would think colliding with a public Apple method would be preferable to colliding with a private one. In the case of colliding with an underscore prefix method, you're guaranteed to hit a private method which will be trickier to catch (I realize I might be missing something here).
Suffixes would seem to solve this problem as Dave DeLong mentioned above. Although using suffixes when a method takes an argument (or multiple arguments in particular) looks very ugly in Objective-C to my eyes.
In fact, I'd say that you should ALWAYS implement -hash, if only to make sure that you know you've thought of the implications.
Anyone: Related: I've always wondered if it would be fine to use two or maybe even three underscores as a method prefix rather than initials or something. Granted this would only work if by convention it was only used in application code and never framework code, so that nobody had the monopoly on multi-underscores thus disabling anyone else from using them when adding a framework to their app.
Dave DeLong: Oh that's even uglier (IMHO). It comes right before the argument! And what happens when your method has multiple arguments?
Karan: Do you mean checking if an ivar of the newly copied object is nil, or checking if the newly copied object itself is nil? If the latter, I agree that it would be safer to do that, but I can't think of a single time where -copyWithZone: has failed or would ever fail or was documented to fail, nor have I ever seen an implementation where it would fail and return nil. But yes, technically it should be safer.
Patrick Stein: As a style, I personally find that one too repetitive to use, because you return several times when you really only need to return twice (self at the end of the method, or nil after an error).
But on the note of different coding styles for -init, if only you could use "return self" or "break" as the last clause of a for() loop, you could do something like this:
- (id) init {
for (self = [super init]; self; break) {
/* ... */
if (something_goes_wrong) {
[self release], self = nil;
break;
}
/* ... */
}
return self;
}
Granted it would be more likely if Mike Ash submitted this as an actual bug report to Apple (I would but it's just too early in the morning).
obj
can be freed even if it's unequal to the released ivar, if the ivar holds the only reference to obj
and releases it during dealloc. (For example, suppose you're implementing a linked list.) The second pattern you propose is safer: always retain before releasing.
With both patterns, the ivar is temporarily assigned to a value which could have been freed, so I'm not sure that either is thread-safe. (I may be wrong -- I'll admit I was releasing first before I saw this post.) Wouldn't the following be safer?
NSObject * temp = myObjIvar;
myObjIvar = [obj retain];
[temp release];
Or, more elegantly,
[myObjIvar autorelease];
myObjIvar = [obj retain];
It is possible to do a lockless thread-safe setter but it is an enormous amount of work, something like 100-200 lines of code last time I tried it. Basically, if you need thread-safe setters (and that's usually the wrong place for thread safety, but not always) then use a lock.
"[self dealloc] or [self release] are bad because they might call some subclass's -dealloc method even though the subclass's -init hasn't done anything yet."[1] Remember method dispatch is always dynamic in Obj-C.
[1] http://lists.apple.com/archives/objc-language/2008/Sep/msg00133.html
[super dealloc]
directly has been in vogue lately, it's contradicted by the documentation:
Subclass implementations of this method should initialize and return the new object. If it can’t be initialized, they should release the object and return nil.
(From the docs for
-[NSObject init]
.)
I mention that this is why you should always write your
dealloc
implementation to tolerate an uninitialized object. I would go so far as to say that not doing so would be a bug in the subclass.
Personally, I trust Greg Parker's advice (in the link I posted above) more than the docs. Of course, it doesn't contradict your advice to 'always write your dealloc implementation to tolerate an uninitialized object', one should do that _also_.
Anyway, this is another nice thing about GC... no release/dealloc anyway. :)
NSCopyObject()
problem that doesn't involve tricks with C pointers. The Smalltalk idiom for Object copy is to implement it as follows:
copy
^self shallowCopy postCopy
shallowCopy
is a primitive that behaves essentially the same way as NSCopyObject()
. Apple's documentation states that it will be deprecated at some point, but I think the alternatives are worse.
in Objective-C syntax the most abstract class might contain something like:
-(id)copyWithZone:(NSZone *)zone {
// shallow copy
id copy = NSCopyObject(self, 0, nil);
[copy postCopy];
return copy;
}
As a rule, subclasses should not overrride
copy
, but they could implement postCopy
to maintain reference counting semantics:
-(void)postCopy {
[super postCopy];
[someString retain];
}
typo:
s/The superclass my call the setter/The superclass may call the setter/
Copy:
copyWithZone: shouldn't call alloc, as you've written, it should call allocWithZone: That's what allocWithZone: is for.
Regarding
copyWithZone:
, while you need to support the zone APIs when overriding in case someone decides to call them directly, I don't think it's important to actually respect the zone parameter. Zones are pretty much unused and useless these days. Certainly it would do no harm to use allocWithZone:
in your implementation, but I don't think it's worth any bother.Comments RSS feed for this page
Add your thoughts, post a comment:
Spam and off-topic posts will be deleted without notice. Culprits may be publicly humiliated at my sole discretion.
There is a lot of practical wisdom here. I know the future me really appreciates all of the debugging time and anguish you have likely saved him. Enjoy your time off.