Welcome to another Friday Q&A. This week I thought I would take fellow amoeboid Jeff Johnson's suggestion and talk about blocks in Objective-C.
The word "blocks" is kind of ambiguous, so to clarify, I'm not talking about the compound statement structure which has existed in C since the beginning of time. I'm talking about a new addition to the language being created by Apple which adds anonymous functions to the language.
Since they're not available to the public in finished form yet, the discussion is going to be a bit imprecise in terms of syntax. But since I mainly want to talk about what they will do for us and not the absolute precise details of how to type them out, that's not a big problem. First let's see how they look:
x = ^{ printf("hello world\n"); };
x();
x = ^(int a, char *b){ printf("a is %d and b is %s", a, b); };
x(42, "fork!");
int a = 42;
char *b = "fork!";
x = ^{ printf("a is %d and b is %s", a, b); };
x();
int a = 42;
char *b = "fork!";
callblock(^{ printf("a is %d and b is %s", a, b); });
When the callblock() function calls that block, the block will still get access to our local variables a and b even though we never passed them to the function explicitly.
We're just about done with the basics of what blocks are. One more quick example, a block that returns a value:
x = ^(int n){ return n + 1; };
printf("%d\n", x(2));
So what's the big deal? A major advantage of blocks is that they essentially allow you to write your own control structures in the language without having to alter the compiler. As one example, take the for(... in ...) syntax that appeared in Leopard. This syntax is a wonderful addition to the language. Previously we had to write a bunch of code just to iterate over an array:
NSEnumerator *enumerator = [array objectEnumerator];
id obj;
while((obj = [enumerator nextObject]))
// finally we can do something with obj
for(id obj in array)
my_for(array, ^(id obj){ /* loop body goes here */ });
[array do:^(id obj){ /* loop body goes here */ }];
The implementation of the -do: method is left up to the reader, but rest assured that it's relatively simple.
As another example, consider the @synchronized directive. This could be redone using blocks too:
[obj synchronized:^{ /* this is protected by the lock */ }];
But if for/in and @synchronized are already part of the language, why would you rewrite them?
Of course you wouldn't. That would be silly. Those examples serve only to illustrate the idea: that you can build your own control structures. But of course it's only interesting to build control structures that are new! So here are some ideas.
- Open a file and ensure that it gets closed when you're done:
[[NSFileHandle fileHandleForReadingAtPath:path] closeWhenDone:^(NSFileHandle *handle){ /* use handle here */ }];
- Build a new array by working with the objects of an existing one:
newArray = [existingArray map:^(id obj){ return [obj stringByAppendingString:@"suffix"]; }];
- Filter the contents of an array:
newArray = [existingArray filter:^(id obj){ return [obj hasPrefix:@"my"]; }];
- Main thread synchronization:
/* threaded code */ PerformOnMainThread(^{ /* synchronized code */ }); /* more threaded code */
- Delayed execution:
PerformWithDelay(5.0, ^{ /* will run 5 seconds later */ });
- Parallel enumeration:
[array doParallelized:^(id obj){ /* will get executed on all of your CPU cores at once */ }];
And many other examples abound.
Another place where blocks will make things much nicer is when dealing with callbacks. If you've ever written much Cocoa code you've probably had to write a sheet callback, and it's a pain in the ass. If you need to pass variables through to the other side then it gets really frustrating with code like this:
- (void)method {
int foo;
NSString *bar;
/* do some work with those variables */
NSDictionary *ctx = [[NSDictionary alloc] initWithObjectsAndKeys:
[NSNumber numberWithInt:foo], @"foo",
bar, @"bar",
nil];
[NSApp beginSheet:sheet
modalForWindow:window
modalDelegate:self
didEndSelector:@selector(methodSheetDidEnd:returnCode:contextInfo:)
contextInfo:ctx];
}
- (void)methodSheetDidEnd:(NSWindow *)sheet returnCode:(int)code contextInfo:(void *)ctx {
NSDictionary *ctxDict = ctx;
[ctxDict autorelease];
int foo = [[ctxDict objectForKey:@"foo"] intValue];
NSString *bar = [ctxDict objectForKey:@"bar"];
/* do some more stuff with those variables */
}
- (void)method {
int foo;
NSString *bar;
/* do some work with those variables */
[sheet beginSheetModalForWindow:window didEndBlock:^(int code){
/* do stuff with foo */
/* do stuff with bar */
/* do stuff with code, or sheet, or window, or anything */
}];
}
Let's take another example, sorting an array with a custom comparison function using some variables that you pass in. NSArray has functionality for this, with the -sortedArrayUsingFunction:context: method. The old-style code is annoying, and I'm not going to write it. It's much like the sheet method above. You have to define a separate function, way outside of your code where it's not really visible. You have to set up the context to pass into it. If you're passing more than one thing then you have to pass a dictionary (and unpack it) or a pointer to a struct. Now here's the blocks version of a custom comparator:
sorted = [array sortedArrayUsingBlock:^(id a, id b){
/* compare, use local variables to decide what to do, run wild */
}];
Callbacks are one of the most powerful things in C and Objective-C but in many situations their use can be extremely difficult and unnatural. Blocks promise to allow callbacks and custom control constructs to be created and used in a much more natural fashion.
So far I've only shown examples of using a blocks API, but how about creating one? Well, it's a little worse, but not much. The only problematic thing is that the syntax for declaring a block type is kind of ugly, as it's modeled after function pointer syntax. But it's not too bad, and the rest is nice and simple. For example, here's how you could write that -map: method from above:
- (NSArray *)map:(id (^)(id))block { // takes an id, returns an id
NSMutableArray *ret = [NSMutableArray array];
for(id obj in self)
[ret addObject:block(obj)];
return ret;
}
Information on Apple's implementation of blocks is still a bit sparse. Some more details can be found in a mailing list post to the Clang development list. For more purely conceptual ideas on how blocks can be used, check out the Smalltalk language, where blocks are used for virtually every control structure right down to if/then and basic loops. Here's hoping that blocks allow for some major changes in how we work on Snow Leopard!
That wraps it up for this Friday Q&A. Be sure to come back next week for another round. Keep those suggestions coming. Post suggestions in the comments or e-mail them directly. I may use your name unless you tell me otherwise, so say so if you want to remain anonymous.
Comments:
Typo apart, great piece, thanks!
Pity they're just blocks, not true closures, but that'd be a bit harder to jam on to C ;)
All of the great reasons to use closures in other languages that support them apply to Objective-C. It's about time we get more truly dynamic programming constructs.
http://en.wikipedia.org/wiki/Closure_(computer_science) [WikiPedia]
If blocks in C are in fact just anonymous functions then it'll take away a bit of the dynamic aspect of blocks, but it's still better than nothing.
meh i already see the many compiler warnings i'll get when I keep forgetting the return statement at the end of a block :-)
anyway, it's great seeing objective-c becoming an even more smalltalkish c :-)...but then... there's already f-script :-D
Paste the smalltalk below into a workspace (command k), select an example (all 3-4 lines each) and press command-d to "do it" (which by the way was the original OK button in Mac OS but people read it as 'dolt' so they changed it to OK)
Blocks are wrapped in brackets []. "[:value" defines a parameter for the block named value, then a | to separate the parameters from the code.
"Begin Smalltalk"
"Example 1: Pass block to do: method of any collection"
"This rocks because you can pass any block of code around and never write an iteration loop again"
total := 0.
{1. 2. 3. 4. 5.} do:[:value | total := total + value].
total inspect.
"Example 2: get the value of a block"
blockExample := ['hello ','world'].
blockExample value inspect.
"Example 3: Use a block to detect an object in a collection. Can reuse block with different values for a scoped local variable"
detectString := 'hello world'.
detectExample := [:value | value = detectString].
detected := {'oh my'. 'wow!'. 'hello world'} detect: detectExample.
detected inspect.
"Now this will cause an error because no match was found. (detect: ifNone: is how you handle this case)"
detectString := 'hello there'.
detected := {'oh my'. 'wow!'. 'hello world'} detect: detectExample.
Blocks go so well together with Cocoa, too! Almost every place where we today use a delegate or callback, a block would do just as nicely or even more so. They'll also make for some beautiful parallel code, as you touch on with -doParallelized:.
sitharus: They *are* true closures. They're doing some very fancy magic.
One note though, blocks aren't an ObjC feature; they can be used in .c files as well, afaik.
Karsten: The return statement only returns an object from the block. In this respect, the block fails to work exactly like a built-in control construct. It's impossible to return from the enclosing method from block code. This is unfortunate but, in my opinion, not particularly limiting. It's something you'll have to be aware of and ensure you work with in your code.
Given the way blocks work, I think you can see that it must be this way. You can pass a block off and it can get copied and invoked later, potentially multiple times. What would it mean to return from the enclosing method from inside a block, when the block is executing ten minutes after the enclosing method finished executing? They're more general purpose, so ironically they lose this capability.
charles: Thanks very much for your praise and your typo-finder. I've fixed it.
Jonathan: The reason I didn't mention closures is because I don't really fully understand what sets apart a closure from all the other stuff we can have, and I was sure that the moment I mentioned the idea I'd get people like sitharus coming in and telling me that they aren't really closures. I have no idea which one is right. Honestly I don't care. I just want to know how they work and what I can do with them. What the theoretical construct is called is less interesting.
To everyone else, thanks much for writing. Don't forget to keep those suggestions coming. This stuff has sparked such great conversation that I want to be sure to keep it going! I have material for a few more already, but more is always better.
sitharus, how are they not true closures?
It's not clear what this would do:
void foo()
{
int a = 42;
char *b = "fork!";
x = ^{ printf("a is %d and b is %s", a, b); };
store(x);
}
void bar()
{
a = 9;
char *b = "if you see this, it's not a closure";
x();
}
void baz()
{
x(); // will this bomb if a and b aren't on the stack?
}
foo(); bar(); baz();
Or that Lisp has been having for over half a century. Here's a dime kid. Go get yourself a better programming language.
iphone dev: No idea what you mean by "pass a block as an argument to a selector", I'm afraid. A selector is just a way to identify the name of a message in a fast way. It doesn't have arguments. Did you mean can you pass a block as an argument to a method? If so, I'd hope that my examples make it abundantly clear that you can. Otherwise if you can rephrase your question I will do my best to answer it.
And methodSheetDidEnd:returnCode:contextInfo: should be sheetDidEnd:returnCode:contextInfo:, right? Just wanting to make sure I understand this correctly.
x = ^{ printf("a block called x.\n"); };
First, the callback block most likely won't get called until after the stack frame where it's defined has been released. In this case (according to Lattner's post), the block needs to be saved to a global using _Block_copy() and later released using _Block_release().
Second, any local objects the block will access (NSString* bar in the example) need to be retained, since the block is unlikely to be called before the autorelease pool is refreshed.
- the type of a block is like the type of a corresponding function pointer, but with the * replaced by a ^. So (id (*)(id)) (a function taking an id and returning an id) becomes (id (^)(id)).
wtd's type would be (void (^)(void))
- in Objective-C, the type of a block is also id. That means you can send it messages (but only -copy and -release).
- if you want to keep a block that's been passed to you, you'll have to send it a -copy message. That's like any other Objective-C object.
- The block knows which objects are accessed in it, and will retain/release them in its -copy/-dealloc method.
- function-local blocks start out as light-weight stack objects and get promoted to regular heap objects on -copy. Global blocks (i.e. those assigned to global variables) remain global; they're like static NSStrings in that regard.
- variables accessed from within a block have pass-by-value semantics normally, and pass-by-reference semantics if declared with __block. That means that all instances of the block as well as the current stack scope share the same variable; it will be transparently moved to the heap if one of the blocks survives the current stack scope.
Grady: No, why would I use NSApplication? The method I'm using is an instance method, not a class method. Using NSApplication would result in a compiler warning and a runtime error because I'm trying to send this message to the class. And no, the method name is exactly what I want it to be. Perhaps you're reading in the documentation and they mention a callback method called "sheetDidEnd:", but that's just a model. The whole point of the SEL argument is to be able to specify a method of your choice.
tedge: The copy/release is not an omission on my part, because it would happen inside Cocoa. Any API which is going to keep a block past the current function call is going to copy it internally, rather than force callers to copy it. As for retaining objects, I'd say you're right about that except that the next post claims that the runtime does this for you, which is neat. If that's true then my code will work 100% as-is, assuming an appropriately written NSApplication method of that particular name.
(int (^)) newCounter(int t0) {
int t = t0;
return ^{ return t++; };
}
x = newCounter(10);
printf("%d", x()); // 10?
printf("%d", x()); // 10 or 11?
__block int t = t0;
Then your code would compile and would print 10 followed by 11.
This requirement for an extra type decorator is kind of annoying but it seems that it's done for efficiency reasons, as it sounds like block-mutable variables are considerably more expensive than const-copied ones.
The important thing is code legibility - being able to look at a line of code and know immediately and unambiguously exactly what it's going to do. If you need to keep asking yourself questions like "what does that extension do", "where's that defined", or "was the behaviour of this construct that I understand overridden in some other file to do something completely different here" then that introduces so many obscure and difficult to track down bugs that it far, far outweighs any benefit it might bring in the short term.
Things like this are great for academic computer scientists and people trying to prove how clever they are, but an utter disaster for real world software engineering projects with a steady stream of different people joining and leaving the team.
There are lots of things that limit my productivity as a developer but the extra typing time it takes to write out a loop long hand sure isn't one of them!
Closures are not a loop replacement. If that's all they were then you'd be right. Read the rest of the examples, there's a ton of stuff there besides loops.
Also I don't know what you mean by the questions you ask. Closures *avoid* that by putting the code inline, so you *don't* need to go hunting around for definitions.
Calling it a "new" feature is also a bit silly; closures are decades old, and almost every higher level language has them these days (ruby, python, javascript, any functional language, C#, java is getting them, etc...).
But even more importantly, as Mike mentioned, this allows for clearer code to be written. Specifically, when reading this article, I was reminded of KBCollectionExtensions, a very nifty HOM implementation for Cocoa written in February by Guy English (yet another amoebian).
Using KBCollectionExtensions is nifty because it cleans up code clutter without sacrificing the "every line tells" idea. Specifically, having taken this example right off of Guy's blog, this would be the code to enumerate an array named allEmployees and return the names of any employees who have been idle for more than 5 minutes:
NSArray *names = [allEmployees valueForKeyPath: @"[collect].{idleTime>5}.name"];
This is a nifty implementation, but can be redone in blocks without swizzling out the NSObject class and overriding -valueForKeyPath: which to me, is very exciting!
Will there be a reflection API for blocks? Like asking blocks for their number of arguments etc?
Earlier in the comments, Johannes Fortmann mentioned that as currently implemented, blocks behave as objects in Objective-C, but only respond to -copy and -release. You already can tell a block to execute with certain arguments, though:
multiplyByFive = ^(int toMultiply){ return toMultiply * 5; };
int theNumber = multiplyByFive(10); // and theNumber is 50!
Thinking of Objective-C variables as separate from C isn't exactly accurate. Essentially "id" variables (or Objects) are just C pointers, and as such can be passed around just like a float, int, or char*, et al.
I don't have anything specifically against closures as such, it's the general tendency of people to keep trying to invent clever hacks and extensions that I'm suspicious of. If you introduce a new framework or bit of technology to save you 5 minutes' work and then need to spend a week fixing some obscure deployment issue or memory leak as a result then it's not a sensible trade off!
The expressiveness of a language is extremely important. When you can do more work and express more meaning with less code that means that you can spend more time thinking about what your program does and less about the boring details of how it does it. Programming is all about automating boring tasks to make people more productive. If we want to make ourselves more productive, then getting the machine to automate our boring tasks is the way to do it.
I understand if you don't think closures are important. Opinions differ. But a wholesale rejection of all language additions and features as being only for academics and people who don't write real software is an idea which I reject in the strongest possible terms!
Steven Degutis: Good point mentioning HOM. HOM is essentially a poor man's substitute for blocks/closures/whatever. (The guy who invented HOM will disagree with me and say that HOM is often better, but we're all entitled to our opinions.) HOM is fairly limiting and in Objective-C doubly so due to the primitive/object type disconnect, and blocks pretty much completely remove the need for HOM by providing a more generalized solution that enables even more and better capabilities.
I.e.
^{ return 1; }
versus
static int one() { return 1; }
Great for hacking, perhaps nice when stuffed inside fancy header files, but in general it means a great deal more ad-hoc code that probably ought to be reusable and reused but isn't. In a way it's the equivalent of using a number instead of a manifest constant.
more interesting: the closure thing, using variables from enclosing scope. Is there an easy way to do more real closures, i.e. the equivalent of:
qsort(base, nel, width, ^(const void *l, const void *r) { return memcmp(l, r, width); });
like you would get with some languages:
qsort(base, nel, width, memcmp(,,width));
I'm a bit worried about the "infers type of function" thing. Obviously, it's usually passed as a function pointer to something that defines what type it takes, so the compiler just has to confirm that the anonymous function matches the prototype. But there are weird corner cases I suspect it's hard to reason about with conviction.
enum Alignment { Short, Integer, Long };
static short shortArr[10];
static int intArr[10];
static long longArr[10];
x = ^(Alignment a) { switch(a) { case Short: return shortArr; case Integer: return intArr; case Long: return longArr; } return 0; }
arguably x returns void* and casts the 0 to NULL, but my real question is: how smart is the compiler? How smart do we want?
int * getInts() { return x(Integer); }
As for your question with the qsort example, I'm not sure I understand what you're asking that hasn't already been answered several times in the comments above. If you mean literal qsort, which takes a function pointer rather than a block, then the answer is no, you cannot do this. Block pointers are conceptually similar to function pointers but implementation-wise are completely different. You couldn't pass a block to the built-in qsort function. What you could do is write an adapter, say qsort_block, which would then call through to the built-in qsort_r with a comparator that calls through to your block. Then your example would work, using qsort_block instead of qsort.
As for type inference, if Apple is smart they are requiring identical return types, not merely compatible ones, and are erroring out (or at least putting up a very loud warning) if you have multiple return statements which don't match. If you want to return different types and have them be converted, you should either have to declare an explicit return type (the inference thing is optional) or explicitly cast enough of your return statements to make them all match. I don't know if that's how it works but that is how it ought to work.
(Back in the mid-'80s I was working on productivity apps in Smalltalk-80, one of the ancestor languages of both Ruby and Objective-C, and it definitely made for very quick and readable code.)
Consider that any language feature you're not used to will seem confusing and hard to read at first. That's just the nature of learning. I can assure you that blocks/closures are quite readable, and lead to extremely clear and concise code if they're used correctly. Of course you can do crazy stuff with them (and there are a few APIs in Rails that I think go too far that direction) but that's true of anything powerful...
While HOM was invented to solve the same kinds of problems that blocks solve, my inspiration were ideas such as Backus's FP and APL, not blocks, which I actually found rather distasteful.
Why distasteful? It's difficult to explain, but distaste really best describes how it felt having to deal with (name, process, return) individual elements after previously just dealing with entire collections in one fell swoop. It's just a lower level of abstraction.
Blocks are undeniably the more powerful construct, just like goto is more powerful than structured constructs such as if, for and while. There are also some legitimate applications such as nested HTML generation in Seaside that don't seem to be doable with HOM, or at least I haven't been able to figure it out yet.
Just as undeniable, though, is the fact that HOM-based code can do things that blocks can't do, and is frequently even more concise and readable, giving you even better abstraction abilities than blocks do without some of the inherent drawbacks, such as a strong incentivizing of very, very badly structured code.
And of course the typing issues are purely a factor of HOM implementations to date not having any compiler support, they would be solvable with a fraction of the infrastructure required by the block implementation.
For more details, see:
http://www.metaobject.com/papers/Higher_Order_Messaging_OOPSLA_2005.pdf
Note that I never talked about why HOM was invented, only what it is, and that most certainly is a matter of opinion. Marcel, you could do with being a bit less combative on this sort of thing, it will give you a better chance at recruiting people....
As for the rest, I shall let it stand. Of course I disagree, but I think all of my points have already been adequately made. I certainly encourage everyone to read the linked PDF, as it's very interesting, and if you aren't already familiar with HOM it will teach you some good things.
In fact, I pretty much made the same mistake, dismissing my discomfort with blocks as an obvious symptom of inventor's pride. It was only gentle but insistent nagging over many years (Hi Philippe!) that got me to finally consider publishing the paper linked above.
I do have one big ally in my discomfort with blocks:
But later I decided I didn't like blocks as values because they are
super time bombs when passed around and completely violate
encapsulation. I really wanted external objects (not internal blocks)
to be passed around, and wanted a simpler way to think of contexts
internally. So we left them out of the first few Smalltalks. (I still
don't like them ...)
http://lists.squeakfoundation.org/pipermail/squeak-dev/2007-September/120493.html
While he may be just as wrong as the idiot (that would be me), I do believe Alan is hard to dismiss as just being ignorant on the matter.
Happy New Year!
Mike Ash clever spam-filtering system. I'll have to steal it.
Everyone Happy New Year (soon)!
You're perfectly entitled to your opinion about HOM and blocks and I have no real problem with people who think that blocks don't survive on their own merits. What I do have a problem with is someone criticizing me for "getting [something] wrong" when I have merely stated an opinion. If you persist in thinking that I have gotten something wrong then please quote the wrongness verbatim and explain it. However you will find that I have merely stated my opinion as to the merits of each, and never stated any factual information (right or wrong) about the origins of HOM or its relationship with blocks.
As a matter of fact, that essence is not even achievable with blocks, so saying that HOM is essentially a poor man's substitute for blocks is not just false, it is almost non-sensical.
Similarly, if "GPS Enthusiast" were to evaluate the iPhone and say that an iPhone is "essentially a poor man's substitute for a real GPS receiver", then I would not have qualms with their saying that a "real" GPS receiver makes for a better GPS unit, but I would object to any claims that that is what an iPhone "essentially" is. It does a lot of other things, and does things differently, because the aims of the developers of the iPhone (and its GPS capabilities) were different from those of a GPS receiver.
I hope that makes things clear.
Cheers,
Marcel
I welcome your continued contribution of ideas and differing opinions, but if you're going to persist in calling my opinions wrong and carrying on arguments long after they've reached their sell-by date I will have to ask you to refrain from posting.
Furthermore, blocks are not "Ruby features", unless by "Ruby features", you just mean "features that Ruby happened to borrow from other languages while it was being designed". Might as well call them C++ or Java features if you're going to do that.
The biggest problem of ObjC is that it's between two separate families... It's too high-level to seem suitable for resource-intensive processes, and it's too low-level to be as flexible as scripting languages.
So to me, those new features, instead of pushing the language to one side or the other, increase its range of action.
With blocks you're getting a high-level feature, a high-level abstraction allowing you to literally customize code written by others without being able to even see that code, but also a very low-level system as it is made to be as fast as normal functions.
newArray = [existingArray map:^(id obj){ return [obj stringByAppendingString:@"suffix"]; }];
If return would make the calling method return, this wouldn't assign anything, it would just return the first object of the block.
If you consider a call like:
myObject = [someObject doSomething:^{...} onErrorDo:^{ return nil; }]; this return is meant to make the method return nil instead of assigning nil to myObject. How do c-blocks distinguish one return from the other? Do c-blocks have implicit return values? I'm guessing they don't, but how could you make a block return from a method?
Karsten