Welcome back to another cromulent Friday Q&A. After taking a few weeks off I intend to resume the regular schedule. We'll see how far that intention takes me, but I'm hopeful. This week I'm going to take Daniel Jalkut's suggestion to discuss class loading and initialization in Objective-C.
How classes actually get loaded into memory in Objective-C isn't something that you, the programmer, need to worry about most of the time. It's a bunch of complicated stuff that's handled by the runtime linker and is long done before your code ever starts to run.
For most classes, that's all you need to worry about. But some classes need to do more, and actually run some code in order to perform some kind of setup. A class may need to initialize a global table, cache values from user defaults, or do any number of other tasks.
The Objective-C runtime uses two methods to provide this functionality: +initialize and +load.
+load
+load is invoked as the class is actually loaded, if it implements the method. This happens very early on. If you implement +load in an application or in a framework that an application links to, +load will run before main(). If you implement +load in a loadable bundle, then it runs during the bundle loading process.
Using +load can be tricky because it runs so early. Obviously some classes need to be loaded before others, so you can't be sure that your other classes have had +load invoked yet. Worse than this, C++ static initializers in your app (or framework or plugin) won't have run yet, so any code you run that relies on them will likely crash. The good news is that frameworks you link to are guaranteed to be fully loaded by this point, so it's safe to use framework classes. Your superclasses are also guaranteed to be fully loaded, so they are safe to use as well. Keep in mind that there's (usually) no autorelease pool present at loading time, so you'll need to wrap your code in one if you're calling into Objective-C stuff.
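As a rough sketch of the advice above, here is what a defensive +load might look like. The class name EarlySetup and the two globals are made up for illustration, and the @autoreleasepool syntax assumes a reasonably modern compiler (older code would create an NSAutoreleasePool by hand).

```objc
#import <Foundation/Foundation.h>
#include <assert.h>

// Hypothetical globals and class, used only for this illustration.
static BOOL gLoadRan = NO;
static NSUInteger gTableCount = 0;

@interface EarlySetup : NSObject
@end

@implementation EarlySetup

+ (void)load
{
    // There is usually no autorelease pool this early, so create one
    // before touching any Objective-C objects.
    @autoreleasepool {
        // Classes from linked frameworks (here, Foundation) are safe to use,
        // because those frameworks are fully loaded before this +load runs.
        NSArray *table = [NSArray arrayWithObjects:@"one", @"two", nil];
        gTableCount = [table count];
        gLoadRan = YES;
    }
}

@end

int main(void)
{
    // For a class compiled into the executable, +load has already run here.
    assert(gLoadRan);
    assert(gTableCount == 2);
    return 0;
}
```

Note that the sketch only talks to Foundation, never to other classes in the same binary, since those may not have had their own +load invoked yet.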
An interesting feature of +load is that it's special-cased by the runtime to be invoked in categories which implement it as well as the main class. This means that if you implement +load in a class and in a category on that class, both will be called. This probably goes against everything you know about how categories work, but that's because +load is not a normal method. This feature means that +load is an excellent place to do evil things like method swizzling.
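A small experiment makes the category behavior easy to see. The Widget class, its Extras category, and the gLoadOrder bookkeeping below are all hypothetical names invented for this sketch. It records each +load invocation in a plain C array so that no Objective-C objects are needed at load time.

```objc
#import <Foundation/Foundation.h>
#include <assert.h>
#include <string.h>

// Hypothetical bookkeeping: records which +load implementations ran.
static const char *gLoadOrder[2];
static int gLoadCount = 0;

@interface Widget : NSObject
@end

@implementation Widget
+ (void)load { gLoadOrder[gLoadCount++] = "class"; }
@end

@interface Widget (Extras)
@end

@implementation Widget (Extras)
// For an ordinary method this would replace the class's implementation.
// For +load, the runtime calls this one in addition to the class's.
+ (void)load { gLoadOrder[gLoadCount++] = "category"; }
@end

int main(void)
{
    // Both implementations ran before main(), class first, then category.
    assert(gLoadCount == 2);
    assert(strcmp(gLoadOrder[0], "class") == 0);
    assert(strcmp(gLoadOrder[1], "category") == 0);
    return 0;
}
```

This is what makes a category +load a convenient hook for swizzling: it runs without disturbing whatever the class itself does at load time.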
+initialize
The +initialize method is invoked in a more sane environment and is usually a better place to put code than +load. +initialize is interesting because it's invoked lazily and may not be invoked at all. When a class first loads, +initialize is not called. When a message is sent to a class, the runtime first checks to see if +initialize has been called yet. If not, it calls it before proceeding with the message send. Conceptually, you can think of it as working like this:
    id objc_msgSend(id self, SEL _cmd, ...)
    {
        if(!self->class->initialized)
            [self->class initialize];
        ...send the message...
    }
+initialize happens once per class, and it happens the first time a message is sent to that class. Like +load, +initialize is always sent to all of a class's superclasses before it's sent to the class itself.
This makes +initialize safer to use because it's usually called in a much more forgiving environment. Obviously the environment depends on exactly when that first message send happens, but it's virtually certain to at least be after your call to NSApplicationMain().
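The lazy, superclass-first behavior described above can be observed with a tiny test program. Base, Derived, the +poke method, and the gInitOrder array are hypothetical names used only for this sketch. Both classes implement +initialize themselves, so each implementation runs exactly once.

```objc
#import <Foundation/Foundation.h>
#include <assert.h>
#include <string.h>

// Hypothetical bookkeeping: records the order in which +initialize ran.
static const char *gInitOrder[2];
static int gInitCount = 0;

@interface Base : NSObject
+ (void)poke;
@end

@implementation Base
+ (void)initialize { gInitOrder[gInitCount++] = "Base"; }
+ (void)poke {}   // does nothing; exists only so there is a message to send
@end

@interface Derived : Base
@end

@implementation Derived
+ (void)initialize { gInitOrder[gInitCount++] = "Derived"; }
@end

int main(void)
{
    // Nothing has messaged either class yet, so neither is initialized.
    assert(gInitCount == 0);

    // The first message to Derived initializes Base, then Derived.
    [Derived poke];
    assert(gInitCount == 2);
    assert(strcmp(gInitOrder[0], "Base") == 0);
    assert(strcmp(gInitOrder[1], "Derived") == 0);

    // Later messages do not trigger +initialize again.
    [Derived poke];
    assert(gInitCount == 2);
    return 0;
}
```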
Because +initialize runs lazily, it's obviously not a good place to put code to register a class that otherwise wouldn't get used. For example, NSValueTransformer or NSURLProtocol subclasses can't use +initialize to register themselves with their superclasses, because that sets up a chicken-and-egg situation: the class won't be messaged until it's registered, and it won't register itself until it's messaged.
This makes it a good place to do virtually everything else as far as class loading goes, though. The fact that it runs in a much more forgiving environment means you can be much freer with the code you write, and the fact that it runs lazily means that you don't waste resources setting your class up until your class actually gets used.
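For the registration case mentioned earlier, +load is the natural alternative, since it runs whether or not anybody messages the class. The following is only a sketch: UppercaseTransformer and the name @"ExampleUppercase" are invented for illustration, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>
#include <assert.h>

// Hypothetical transformer that uppercases strings.
@interface UppercaseTransformer : NSValueTransformer
@end

@implementation UppercaseTransformer

+ (void)load
{
    // Register at load time. Doing this in +initialize would never happen,
    // because nothing messages this class before looking the name up.
    @autoreleasepool {
        [NSValueTransformer setValueTransformer:[[self alloc] init]
                                        forName:@"ExampleUppercase"];
    }
}

+ (Class)transformedValueClass { return [NSString class]; }
+ (BOOL)allowsReverseTransformation { return NO; }

- (id)transformedValue:(id)value
{
    return [value uppercaseString];
}

@end

int main(void)
{
    @autoreleasepool {
        // The lookup succeeds even though main() never touched the class.
        NSValueTransformer *transformer =
            [NSValueTransformer valueTransformerForName:@"ExampleUppercase"];
        NSString *result = [transformer transformedValue:@"hello"];
        assert([result isEqualToString:@"HELLO"]);
    }
    return 0;
}
```

The registered name deliberately differs from the class name, so the lookup can only succeed through the registration done in +load.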
There's one more trick to +initialize. In my pseudocode above I wrote that the runtime does [self->class initialize]. This implies that normal Objective-C dispatch semantics apply, and that if the class doesn't implement it, the superclass's +initialize will run instead. That's exactly what happens. Because of this, you should always write your +initialize method to look like this:
    + (void)initialize
    {
        if(self == [WhateverClass class])
        {
            ...perform initialization...
        }
    }
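Without that check, your initialization code will run again for every subclass that doesn't implement its own +initialize method. This is not just a theoretical concern, even if you don't write any subclasses. Apple's Key-Value Observing creates dynamic subclasses which don't override +initialize.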
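To see the difference the self check makes, here is a small sketch comparing an unguarded +initialize with a guarded one. All class names, the +poke method, and the counters are hypothetical. Neither child class implements +initialize, so each inherits its parent's.

```objc
#import <Foundation/Foundation.h>
#include <assert.h>

// Hypothetical counters: how many times each setup body actually ran.
static int gUnguardedRuns = 0;
static int gGuardedRuns = 0;

@interface Unguarded : NSObject
+ (void)poke;
@end

@implementation Unguarded
+ (void)initialize { gUnguardedRuns++; }   // no check on self
+ (void)poke {}
@end

@interface UnguardedChild : Unguarded
@end

@implementation UnguardedChild
@end

@interface Guarded : NSObject
+ (void)poke;
@end

@implementation Guarded
+ (void)initialize
{
    if(self == [Guarded class])
    {
        gGuardedRuns++;
    }
}
+ (void)poke {}
@end

@interface GuardedChild : Guarded
@end

@implementation GuardedChild
@end

int main(void)
{
    // Messaging each child initializes its parent first, then the child,
    // which falls back to the inherited +initialize implementation.
    [UnguardedChild poke];
    [GuardedChild poke];

    assert(gUnguardedRuns == 2);   // once for Unguarded, once for the child
    assert(gGuardedRuns == 1);     // the check filtered out the child
    return 0;
}
```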
Conclusion
Objective-C offers two ways to automatically run class-setup code. The +load method is guaranteed to run very early, as soon as a class is loaded, and is useful for code that must also run very early. This also makes it dangerous, as it's not a very friendly environment to run code in.
The +initialize method is much nicer for most setup tasks, because it runs lazily and in a nice environment. You can do pretty much anything you want from here, as long as it doesn't need to happen until some external entity messages your class.
That wraps up Friday Q&A for this week. Come back next week for another exciting edition. As always, e-mail your suggestions or post them below. Without your valuable contribution of ideas, Friday Q&A can't operate, so send yours in today!
Comments:
Mike: I love your Q&A's! Didn't know about +load, thanks :)
Back when I "coded" in C++, I wrote a StaticInitialization class that one could inherit from to get the equivalent of +load (it used a macro to define a static int being initialized by a function call). It had a function called require() that would throw an exception if the required class hadn't been +load'ed, which would be caught by that global initialize function and the throwing initializer would be put on a queue to be checked again once the require()'d class was +load'ed. Phew! It actually worked, too!
Hm, come to think of it, I even have the source here: http://tr.im/static_h http://tr.im/static_cpp
http://www.friendlystapler.se/browser/RMS/trunk/Source/Include/Utility/StaticInitializer.h
http://www.friendlystapler.se/browser/RMS/trunk/Source/Implementation/Utility/StaticInitializer.cpp
Sorry for the double post
Thanks
More info here if anyone wants to look at the code:
https://github.com/billymeltdown/nsdate-helper/issues/7
Cheers!
My recommendation would be to use dispatch_once to lazily initialize your static variable. The additional overhead is insignificant and you completely avoid problems like this.
    static WhateverClass *sharedInstance;
    + (void)initialize
    {
        if(self == [WhateverClass class])
        {
            sharedInstance = [[self alloc] init];
        }
    }
It would mean we can get rid of all these ugly dispatch_once! It looks too good to be true!
"You really can't call out to any other Objective-C classes, because they may rely on +load too, and yours might run first. "
However, that seems to contradict what you said in your article above: "The good news is that frameworks you link to are guaranteed to be fully loaded by this point, so it's safe to use framework classes." Since Billy Gray's category is in his framework, and it links against Foundation framework, shouldn't that mean NSCalendar's currentCalendar stuff (which is in Foundation) should have already been "+load"ed?
That just points to NSCalendar not setting currentCalendar via +load.
    + (void) initialize {
        static dispatch_once_t onceToken;
        dispatch_once(&onceToken, ^{
            // <do stuff here once, Johnny-Dangerously style>
        });
    }
Great work btw.
./scc
There is no need to add dispatch_once(). It has already been provided since at least iOS 6, and I presume the same is true for Mac OS X. You can put a breakpoint in +initialize and see it in the stack trace.
(I recall testing iOS 5 and found that it had acquired a pthread_mutex lock before calling +initialize)
My case: A class that needs to register itself to some other class from the same library, so in +load, I can't be sure that this other class is loaded yet, and I can't use +initialize because the class might never get messages then.
One option is a +load that then uses dispatch_async or CFRunLoopPerformBlock to schedule the real code for execution once the main runloop starts up. It should be safe to use these APIs from +load since they're lower level, and frameworks that you link to are guaranteed to be initialized first.
For example:
-------
    static NSDateFormatter *dateFormatter = nil;
    static NSCalendar *gregorian = nil;
    @implementation WhatEver
    + (void)initialize {
        if(self != [WhatEver class]) { return; }
        dateFormatter = [[NSDateFormatter alloc] init];
        [dateFormatter setDateFormat:@"MMM dd, yyyy"];
        gregorian = [[NSCalendar alloc] initWithCalendarIdentifier:NSGregorianCalendar];
    }