Next article: Friday Q&A 2010-03-05: Compound Futures
Previous article: Friday Q&A 2010-02-19: Character Encodings
Tags: blocks fridayqna futures
Welcome back to another shiny edition of Friday Q&A. Guy English suggested taking a look at implementing futures in Objective-C using blocks, and for this week's post I'm going to talk about the futures implementation that I built.
Futures
A future is, in short, an object which hides a calculation. When a future is resolved, the resolution blocks until the calculation has finished. My code involves implicit futures: a future is a proxy for the calculated result and, when messaged, transparently resolves the future and then passes the message on to the result. To code which uses it, a future is essentially indistinguishable from the object that it represents.
In Objective-C, it's natural to represent the calculation using a block. Futures are created by simply calling a function which takes a block representing the calculation. They return a proxy object which captures messages sent to that object and resolves the future as needed. My code has two kinds of futures.
Background futures begin the calculation immediately on a background thread as soon as the future is created. When the future is resolved, if the calculation is not yet complete, the call blocks until it's done. If the calculation finishes first, then resolution completes immediately. Background futures provide a way to write parallel code without needing to worry about details of synchronization. For example, if you're going to pass an NSData
to another object that won't actually use that NSData
for a while, you can use a background future to allow other code to run concurrently with the disk access with very little effort:
NSString *filename = ...;
NSData *future = MABackgroundFuture(^{ return [NSData dataWithContentsOfFile: filename]; });
[object doSomethingLaterWithData: future];
NSString *filename = ...;
NSData *future = MALazyFuture(^{ return [NSData dataWithContentsOfFile: filename]; });
[object doSomethingOrNotWithData: future];
Getting the Code
As usual, I'm just going to cover the highlights of the code here, but the library is available from my subversion repository:
svn co http://mikeash.com/svn/MAFuture/Or just click on the hyperlink above to browse it.
A Custom Proxy Class
When building object proxies, Cocoa provides a convenient root proxy class to subclass in the form of NSProxy
. The idea is that NSProxy
provides a minimal implementation, which allows almost all messages to be captured and proxied.
Unfortunately, NSProxy
doesn't quite live up to its promise. It implements a whole bunch of unnecessary methods, including important ones like -hash
and -isEqual:
. This is a problem because I don't want NSProxy
's implementations of these, I want the implementations in the proxied object. To make that happen, I'd have to either manually override these methods, or play runtime tricks to make them hit the forwarding path. Neither alternative is particularly appealing.
Instead, I chose a third route: implement my own proxy class. MAProxy
is a true minimalistic proxy class. It implements memory management methods and -isProxy
. It also implements +initialize
, which the runtime requires to exist in every class. The header is really basic and the implementation nearly as simple. The only tricky stuff at work is the inline reference count using atomic functions for thread safety.
Future Basics
To make implementing futures easier, I created a base class called MABaseFuture
which provides common facilities. A basic future needs to be able to store a value, to store whether the future has been resolved yet or not, and a condition variable to make it all thread safe:
@interface MABaseFuture : MAProxy
{
id _value;
NSCondition *_lock;
BOOL _resolved;
}
- (id)init
{
_lock = [[NSCondition alloc] init];
return self;
}
- (void)dealloc
{
[_value release];
[_lock release];
[super dealloc];
}
- (void)setFutureValue: (id)value
{
[_lock lock];
[self setFutureValueUnlocked: value];
[_lock unlock];
}
- (id)futureValue
{
// skip the usual retain/autorelease dance here
// because the setter is never called more than
// once, thus value lifetime is same as future
// lifetime
[_lock lock];
id value = _value;
[_lock unlock];
return value;
}
- (void)setFutureValueUnlocked: (id)value
{
[value retain];
[_value release];
_value = value;
_resolved = YES;
[_lock broadcast];
}
- (BOOL)futureHasResolved
{
return _resolved;
}
- (id)waitForFutureResolution
{
[_lock lock];
while(!_resolved)
[_lock wait];
[_lock unlock];
return _value;
}
-resolveFuture
method which is abstract. Subclasses must override it and do whatever they need to do:
- (id)resolveFuture
{
NSLog(@"-[MABaseFuture resolveFuture] called, this should never happen! Did you forget to implement -[%@ resolveFuture]?", NSStringFromClass(isa));
NSParameterAssert(0);
return nil;
}
-class
. Normally this implementation wouldn't be necessary, as the proxy mechanism will proxy that method just fine. The problem arises in the implementation of -[NSCFString isEqual:]
, part of the CFString
toll-free bridging, and with other bridged classes. That code checks the class of the other object, and if it's an NSCFString
as well, hits a fast path that depends on internal implementation details of NSCFString
. If -class
returns NSCFString
when the object is really a proxy, that code fails and the two strings will never compare as equal, even when they are.
The fix is simple, if bizarre. Get the real class, check to see if it starts with NSCF
, and return the superclass if it does. If the real class is NSCFString
, this will return NSString
, the code goes through the general equality path, and all is well. This is the implementation of -class
:
- (Class)class
{
Class c = [[self resolveFuture] class];
if([NSStringFromClass(c) hasPrefix: @"NSCF"])
return [c superclass];
else
return c;
}
MABaseFuture
.
Deepening the Hierarchy
Building on MABaseFuture
, I want to then create a tree of subclasses. First, _MASimpleFuture
will contain some more common facilities for "simple" futures (futures which immediately resolve when accessed), then I'll create two subclasses of that for background and lazy futures.
_MASimpleFuture
is very, well, simple. It implements -forwardingTargetForSelector:
to resolve the future and return the object that resulted:
- (id)forwardingTargetForSelector: (SEL)sel
{
LOG(@"%p forwardingTargetForSelector: %@, resolving future", self, NSStringFromSelector(sel));
return [self resolveFuture];
}
-resolveFuture
, and they get forwarding for free.
Forwarding to nil
There's a bad corner case here, which happens if the future returns nil
. Messaging nil
is no problem, but forwardingTargetForSelector:
takes a nil
return as meaning that there is no forwarding target, and the runtime should start on the slow forwarding path instead.
And here there's a major problem, because the slow forwarding path requires a method signature, but it's impossible to get one from nil
. I've partially solved this in an extremely brute-force fashion by writing a class called MAMethodSignatureCache
, which will check every class registered with the runtime for a selector and return whatever method signature it can dig up. (I didn't write it solely for this, there's more handy stuff to do with it in another post.) I can use this class to implement the slow forwarding path of _MASimpleFuture
to return zero:
- (NSMethodSignature *)methodSignatureForSelector: (SEL)sel
{
return [[MAMethodSignatureCache sharedCache] cachedMethodSignatureForSelector: sel];
}
- (void)forwardInvocation: (NSInvocation *)inv
{
// this gets hit if the future resolves to nil
// zero-fill the return value
char returnValue[[[inv methodSignature] methodReturnLength]];
bzero(returnValue, sizeof(returnValue));
[inv setReturnValue: returnValue];
}
nil
".
If that's not enough, there's another problem with futures that return nil
. This problem is quite simple: although the futured value may be nil, the future object itself is not nil. Any code which checks the object pointer for nil before using it will fail in weird ways. Imagine this code using an NSData:
NSData *data = ...;
if(data)
[self doSomethingWithBytes: [data bytes]];
data
is a future that resolves to nil
, then the if
check will pass, but [data bytes]
will return NULL
, causing a crash.
Because of these two problems, you should avoid futuring any computation which might return nil
.
Background Futures
To implement background futures, I created a class called _MABackgroundBlockFuture
. Since computation is supposed to begin immediately in the background, I create an initializer which takes the block to compute, and uses Grand Central Dispatch to execute it in the background. Once the computation is finished, it simply calls -setFutureValue:
to set the computed value and mark the future as resolved:
- (id)initWithBlock: (id (^)(void))block
{
if((self = [self init]))
{
dispatch_async(dispatch_get_global_queue(0, 0), ^{
[self setFutureValue: block()];
});
}
return self;
}
-resolveFuture
is then extremely simple. Since the future is already being computed, it just waits for it to finish, then returns the result:
- (id)resolveFuture
{
return [self waitForFutureResolution];
}
I created
_MALazyBlockFuture
to implement lazy futures. A lazy future doesn't begin computation right away, so it just needs to store a copy of the block when initialized, and release it when deallocating:
- (id)initWithBlock: (id (^)(void))block
{
if((self = [self init]))
{
_block = [block copy];
}
return self;
}
- (void)dealloc
{
[_block release];
[super dealloc];
}
- (id)resolveFuture
{
[_lock lock];
if(![self futureHasResolved])
{
[self setFutureValueUnlocked: _block()];
[_block release];
_block = nil;
}
[_lock unlock];
return _value;
}
This code now has all the functionality that's needed, but I want a couple of wrappers to make it nicer to use:
id MABackgroundFuture(id (^block)(void))
{
return [[[_MABackgroundBlockFuture alloc] initWithBlock: block] autorelease];
}
id MALazyFuture(id (^block)(void))
{
return [[[_MALazyBlockFuture alloc] initWithBlock: block] autorelease];
}
id
, the compiler won't be able to catch mistakes like:
NSArray *array = MALazyFuture(^{ return [self somethingThatReturnsNSString]; });
NSString *
instead of id
) even though they're completely compatible.
I worked around both of these problems by using two really scary-looking macros:
#define MABackgroundFuture(...) ((__typeof((__VA_ARGS__)()))MABackgroundFuture((id (^)(void))(__VA_ARGS__)))
#define MALazyFuture(...) ((__typeof((__VA_ARGS__)()))MALazyFuture((id (^)(void))(__VA_ARGS__)))
First, they take variable arguments, because block syntax doesn't play completely nice with the preprocessor. If you write a block which contains a comma that isn't inside parentheses (which can be written completely legally), the preprocessor won't realize that it's inside a block, and will use that as an argument separator. The preprocessor will therefore see two (or more) arguments, and a single-argument macro will fail. By making the macros take ...
, that problem is avoided. Thus, the block parameter is represented in the macro by __VA_ARGS__
.
__typeof
is a gcc language extension which pretty much does what it says. You give it an expression, and it gives you the type of the expression.
The argument to __typeof
is (__VA_ARGS__)()
. Remember that __VA_ARGS__
is the block being passed to the macro. This expression calls the block. But since it's an argument being passed to __typeof
, which is a compile-time construct, it doesn't really call the block. Put the two together, and you get the return type of the block.
Next, the whole __typeof
combination is wrapped in another set of parentheses and put right before the call through to the real function, which casts the return value of the function.
Finally, the function argument is cast to id (^)(void)
, because gcc is too stupid to understand that a block which returns NSString *
is compatible with a block type that returns id
.
Potential Uses
Background futures are useful any time you have computation which can happen asynchronously. In essence, you can think of the future as a synchronization mechanism, like a lock, which ensures that the job is completed before the result is used. Because these futures automatically forward requests, you can pass the future into code that doesn't know what it is, and it will be resolved automatically.
For example, let's say you're building a composite image by loading one image from disk, shrinking another image that already exists, and then putting the result into an image view:
NSImage *image1 = [[NSImage alloc] initWithContentsOfFile: ...];
NSImage *image2 = [self shrinkImage: existingImage];
NSImage *composite = [[NSImage alloc] initWithSize: ...];
[composite lockFocus];
[image1 drawAtPoint: NSZeroPoint fromRect: NSZeroRect operation: NSCompositeSourceOver fraction: 1.0];
[image2 drawAtPoint: NSZeroPoint fromRect: NSZeroRect operation: NSCompositeSourceOver fraction: 1.0];
[composite unlockFocus];
[imageView setImage: composite];
image1
and image2
to run in parallel, and there's at least the possibility that some of the work for composite
could run in parallel with main thread work too:
NSImage *image1 = MABackgroundFuture(^{ return [[[NSImage alloc] initWithContentsOfFile: ...] autorelease]; });
NSImage *image2 = MABackgroundFuture(^{ return [self shrinkImage: existingImage]; });
NSImage *composite = MABackgroundFuture(^{
NSImage *img = [[NSImage alloc] initWithSize: ...];
[img lockFocus];
[image1 drawAtPoint: NSZeroPoint fromRect: NSZeroRect operation: NSCompositeSourceOver fraction: 1.0];
[image2 drawAtPoint: NSZeroPoint fromRect: NSZeroRect operation: NSCompositeSourceOver fraction: 1.0];
[img unlockFocus];
return [img autorelease];
});
[imageView setImage: composite];
imageView
never has to know that it's getting a proxy instead of the real thing.
Lazy futures are useful any time you have objects that may never be needed, or simply may not be needed for a long time. Even if the object is used, deferring computation can spread out the load and improve responsiveness and startup times.
As an example, imagine some code which sets up a bunch of data file contents to be accessed through a dictionary:
gGlobalDictionary = [[NSDictionary alloc] initWithObjectsAndKeys:
[NSData dataWithContentsOfFile: ...], @"dataFile",
[NSData dataWithContentsOfFile: ...], @"anotherDataFile",
[NSData dataWithContentsOfFile: ...], @"moreDataFile",
[NSData dataWithContentsOfFile: ...], @"fourthDataFile",
nil];
gGlobalDictionary = [[NSDictionary alloc] initWithObjectsAndKeys:
MALazyFuture(^{ return [NSData dataWithContentsOfFile: ...] }), @"dataFile",
MALazyFuture(^{ return [NSData dataWithContentsOfFile: ...] }), @"anotherDataFile",
MALazyFuture(^{ return [NSData dataWithContentsOfFile: ...] }), @"moreDataFile",
MALazyFuture(^{ return [NSData dataWithContentsOfFile: ...] }), @"fourthDataFile",
nil];
Conclusion
Futures are an interesting technique which can make it much easier to use lazy evaluation and parallel computation. Futures make it easy to use lazy evaluation even when you have no control over (or no desire to change) the code that will eventually use the value in question. They also make for a handy synchronization mechanism for performing heterogeneous parallel computations. The dynamic nature of Objective-C makes it possible to mostly hide the existence of the future from code that isn't involved in creating it.
That's it for this week's edition. Come back in a week for the next one. As always, Friday Q&A is driven by user suggestions, so if you have a topic that you'd like to see covered here, send it in!
Comments:
Also, for the problem of nil resolved values, wouldn't it be possible to use some kind of default value ? You add a parameter to your method that allows the implementor to provide a value (for example if an NSString is required you put a @"" as default value) so in case the resolved value is nil it will return that object instead.
On a side note, when you use an instance of NSString class itself (the abstract class, which is illegal), the abstract methods throw an exception like that if that interests you :D :
@throw [NSException exceptionWithName:NSInvalidArgumentException reason:[NSString stringWithFormat:@"*** -%s cannot be sent to an abstract object of class %@: Create a concrete instance!", sel_getName(_cmd), [self class]] userInfo:nil];
In other posts with other interesting constructs you made, you have sometimes warned against not using them in "production code". What would be your advice with your futures implementation?
It seems the main concern is the 'nil' value. Remy's suggestion solves both issues of method signature and messaging to nil. But it can often only push the problem further away: the code that relies on checking for nil (which may not even be your code) may not crash or raise an exception, but will still be handling an unexpected result without realizing it, and it is likely to result in other problems down the road. Failing early might be a better option, and this can be done within MABaseFuture.
Regarding the default object, I don't see much point in making that part of the API, because it's so easy to just write your block to handle it:
id foo = MALazyFuture(^{
id obj = [self mayReturnNil];
if(!obj)
obj = placeholderValue;
return obj;
});
More useful, I think, would be extending the API to take a placeholder class that could be used to look up method signatures, e.g.:
id foo = MALazyFuture(SomeClass, ^{ return [self mayReturnNil]; });
However, this still suffers from the problem of nil checks.
Charles: I always expect my readers to use their own judgement, of course, but if I don't make the warning then I generally consider it to be safe. The nil thing is an annoying problem, but only if you're writing code which might actually return nil. If you know it can't return nil (and most code has tons of places where an unexpected nil would be fatal anyway) then everything should be safe.
There is the potential for trouble if you pass futures to code that is written in such a way as to not deal properly with proxies (like the one my
-class
override hack works around) but I expect that sort of thing to be extremely rare, and could be considered a bug in Cocoa, depending on the exact circumstances.The documentation for -isProxy says this method is required because -isMemberOfClass: and -isKindOfClass: methods do type-check on their proxied object. I would suppose then that -class would return the class of the proxy itself.
So you could do the same thing in your code, supposing people always do actual type-check using -isMemberOfClass: and -isKindOfClass: and not [obj class] == [MyClass class].
-[NSProxy class]
returns the proxy class.
Which behavior is correct is definitely debatable. Code which checks
class
and then starts accessing instance variables directly if it matches will fail hard if you proxy that message. On the other hand, code which checks class
as a first check for equality will fail if you don't proxy it.
Ultimately I lean towards making the proxy look as much like the original object as possible, which means proxying
class
.In order to maximize parallelism, the Future has to be created as early as possible and used as late as possible. That encourages developers to restructure code for performance instead of legibility. This could be done by the compiler, but obviously the compiler doesn't know about the internals.
It's a bit of a pity, because it's a very nice approach.
MALazyFuture((id) ^ () { return title; });
Comments RSS feed for this page
Add your thoughts, post a comment:
Spam and off-topic posts will be deleted without notice. Culprits may be publicly humiliated at my sole discretion.
As usual a very interesting article that I've already begun to think of how to use Futures in code I'm working on today!
-john