Next article: An Objective-C Continuations Library
Previous article: Friday Q&A 2010-04-02: OpenCL Basics
Tags: fridayqna nsfastenumeration objectivec
Welcome back to another edition of Friday Q&A. Preston Sumner has suggested that I talk about different ways of enumerating over collections in Cocoa, and how to implement Fast Enumeration. This will be a two part series: this week I will look at the different enumeration techniques and their pros and cons, and then next week I will take you through implementing Fast Enumeration on a custom object.
A Baseline
To establishe a baseline for comparison, consider iterating over a C array of Objective-C objects:
id *array = ...;
NSUInteger length = ...;
for(NSUInteger i = 0; i < length; i++)
// do something with array[i]
For single-threaded code, this is about as fast as you're going to get for object enumeration. You have a small amount of overhead from the loop, and it's pretty much the minimum possible. (If you really want to get radical, you can try manually unrolling the loop, but that begins to get crazy, and could actually hurt performance.)
Of course, this sort of object enumeration is not usually practical in Cocoa apps. It's rare that we encounter a C array of objects. Much more common is an object that implements a collection.
NSEnumerator
In times past, the standard way of enumerating over a collection in Cocoa was to use NSEnumerator
:
NSEnumerator *enumerator = [collection objectEnumerator];
id obj;
while((obj = [enumerator nextObject]))
// do something with obj
objectAtIndex:
For those who preferred something more traditional, or who disliked creating a whole new object just for enumeration, another way to enumerate over an array was to simply call objectAtIndex:
on it repeatedly:
NSUInteger length = [array count];
for(NSUInteger i = 0; i < length; i++)
{
id obj = [array objectAtIndex: i];
// do something with obj
}
NSEnumerator
object so it can be a bit of a win, depending on just how fast objectAtIndex:
is for that particular array. (The performance characteristics of objectAtIndex:
compared to objectEnumerator
aren't always completely obvious, especially on very large arrays, due to how NSArray
is implemented internally.)
A big disadvantage, besides being verbose and error-prone, is that it simply doesn't work for enumerating NSSet
or NSDictionary
. Conversely, a big advantage is that, with careful management of the loop index, it's safe to mutate the array inside the loop, something that's not true of any other enumeration technique (unless you do something like enumerate over a copy instead).
NSFastEnumeration
In 10.5, Apple finally solved this problem. They solved the verbosity problem by introducing the for
/in
syntax. And they solved the speed problem by building for
/in
on top of a protocol called NSFastEnumeration
.
for(id obj in collection)
// do something with obj
NSFastEnumeration
works by fetching objects in bulk whenever possible. The compiler generates code that calls the collection and asks the collection to return as many objects as possible. For collections that store objects contiguously, the collection is able to return an interior pointer directly to those objects. If every object in the array is contiguous, the loop turns into something very much like the baseline, and with the same overall performance. If there are multiple contiguous object stores, NSFastEnumeration
allows the collection to return interior pointers one after another, allowing for a quick loop implementation over each store, and requiring an Objective-C message only for getting the next interior pointer. For collections without contiguous storage, NSFastEnumeration
allows the collection to copy objects out to temporary storage in bulk, reaping many of the same benefits. For collections where none of this works, NSFastEnumeration
still allows a collection to efficiently return objects one by one.
Nice syntax, good performance, it's a great combination.
Blocks-Based Enumeration
With 10.6, Apple introduced blocks into Objective-C, and also introduced blocks-based enumeration. Blocks are a natural fit for creating new control constructs like enumeration, and Apple added blocks-based enumeration methods to their collections as well:
[array enumerateObjectsUsingBlock: ^(id obj, NSUInteger index, BOOL *stop) {
// do something with obj
}];
for
/in
syntax. The syntax is a bit clumsier, and iteration is a bit slower. The code has to call your block for every object. This overhead is less than that of a message send, as in the NSEnumerator
case, but is more than the simple C for
loop of NSFastEnumeration
. There are two places where the block syntax is useful.
First is when you need something more than simple enumeration. Apple gives two enumeration options, to enumerate concurrently and to enumerate in reverse. Neither is directly supported by for
/in
syntax. Concurrent enumeration is very difficult to do any other way, so if your enumeration can take advantage of multithreading, this is extremely useful. Reverse enumeration can be done by sending reverseObjectEnumerator
to an array and then using that as the target of a for
/in
, but this still has the overhead of creating an NSEnumerator
and indirecting through it for enumeration, so the blocks-based method is probably a win.
Second is when you're enumerating over a dictionary and need both keys and objects. The for
/in
syntax can only give you one object at a time. This means that you have to enumerate over keys, then ask the dictionary for objects as a separate step:
for(id key in dictionary)
{
id obj = [dictionary objectForKey: key];
// do something with key and obj
}
for
/in
, it's also much slower. The extra message send and dictionary lookup will kill the nice performance characteristics of NSFastEnumeration
.
NSDictionary
provides a blocks-based enumeration method that passes both key and object directly to the block:
[dictionary enumerateKeysAndObjectsUsingBlock: ^(id key, id obj, BOOL *stop) {
// do something with key and obj
}];
for
/in
loop.
Conclusion
With any code, you should always prefer the technique which is easiest to maintain and read unless you know for sure that there's a speed problem which would benefit from a more difficult approach. This is especially true with collection enumeration, where the work that you do inside the loop is virtually certain to dwarf the work that's done by the loop itself.
Fortunately, Apple has made it so that we don't have to make any tradeoffs in most cases. The for
/in
syntax is simultaneously the nicest and fastest code for enumerating over a collection in the majority of cases. For the rare cases where it's not the best, 10.6 provides blocks-based enumeration constructs which fill in the holes. You should pretty much never write an NSEnumerator
loop unless you have to support 10.4. Manually fetching objects using objectAtIndex:
can be handy if you need to mutate the array while enumerating, but besides that has no advantage over for
/in
.
That's it for this week. Come back next week, when I'll talk about how to build your own implementation of the NSFastEnumeration
protocol. Until then, keep sending me your ideas for topics to talk about. Friday Q&A is driven by reader submissions, so if you have a topic you'd like to see discussed here, send it in! Next week is already booked, but after that, the future is wide open.
Comments:
NSDictionary *myList = [MyDataGrabber activeList];
fields = [NSMutableDictionary dictionaryWithCapacity: [myList count]];
MyField *field;
[myList enumerateKeysAndObjectsUsingBlock: ^(id myID, id myData, BOOL *stop) {
field = [[MyField alloc] initWith: self fieldID: myID andFieldData: myData];
[fields setObject: field forKey: myID];
[field release];
}];
(Note that fields is an ivar for the object this code is in a method for)
If I get rid of the "MyField *field" declaration before the enumeration and put the "MyField *" in front of the use of "field" within the enumeration the compiler error goes away!
In other words, this compiles without an error:
NSDictionary *myList = [MyDataGrabber activeList];
fields = [NSMutableDictionary dictionaryWithCapacity: [myList count]];
[myList enumerateKeysAndObjectsUsingBlock: ^(id myID, id myData, BOOL *stop) {
MyField *field = [[MyField alloc] initWith: self fieldID: myID andFieldData: myData];
[fields setObject: field forKey: myID];
[field release];
}];
__block
qualifier to your declaration.Comments RSS feed for this page
Add your thoughts, post a comment:
Spam and off-topic posts will be deleted without notice. Culprits may be publicly humiliated at my sole discretion.
-Krystof