Before there were events, there were callbacks.
I remember finally grasping callbacks… it gave me a perspective on my code where different components could have an asynchronous conversation — you know, just like we do every day with email (and any other form of messaging).
Finally getting events, and in particular AS3’s event model, was a similar revelation. Components could be oblivious to each other, or even to how far away they were from each other in the giant graph of your application, and still carry out a conversation of signaling and acknowledging various messages.
Of course, this isn’t new to anyone who learned to code in Smalltalk, which seems to be the language that nailed every good thing about OOP without tripping over any of the bad things. No surprise, it came out of Xerox PARC. Damn, did those guys invent all of modern computing like 30 years ago?
Naturally, when we built Mockingbird we used events to notify the view of changes in the model. Very standard MVC stuff, folks have been doing it for decades. And like everyone doing it with AS3, we used the Flash Player’s native EventDispatcher class. In fact, since we were using the Flex Framework, we even used the [Bindable] tag to do all of the work for us.
Well, everything has a catch. And even though EventDispatcher is a native part of the Flash Player and an integral aspect of AS3 coding, it still has overhead that you may not need and can cut out by crafting your own solution.
Up front, let me say that premature optimization is the root of all evil. And by evil, I mean you’ll waste your time on it with no apparent benefit. The following information is really only useful if you’re dispatching tens of thousands of events per second. You’re probably not, or at least not very often. We were, but Mockingbird is a special case.
One thing bugged me about using EventDispatcher: it made a copy of my event every time I called dispatchEvent(). Now, my pesky lizard brain that I honed while working on the PS2 taught me that lots of little memory allocations every frame, in a garbage-collected environment, are not ideal for steady, consistent performance.
If EventDispatcher were a standard class, we could hopefully just subclass it and try to plug in a pooled memory allocator or similar. But it’s native, so we can’t do that.
So, our only choice is to create our own event dispatching mechanism.
It’s actually not that hard, particularly if you don’t need features like bubbling or propagation control. We didn’t even need priorities as it was purely notification of change, usually to just one or two listeners. I ended up with a couple of methods for adding and removing functions to an array and a dispatch method that simply iterated through the array calling each function in turn (callbacks!).
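Here’s roughly what that looks like. This is a minimal sketch rather than the actual Mockingbird code: the class name CallbackDispatcher and the method names addListener, removeListener and dispatch are made up for illustration, and it assumes every listener takes a single argument.

```actionscript
package
{
    /**
     * Hypothetical minimal notifier: no bubbling, no priorities,
     * no propagation control. All names here are illustrative.
     */
    public class CallbackDispatcher
    {
        // Registered listener functions, called in registration order.
        private var listeners:Array = [];

        public function addListener(listener:Function):void
        {
            // Ignore duplicate registrations of the same function.
            if (listeners.indexOf(listener) == -1)
                listeners.push(listener);
        }

        public function removeListener(listener:Function):void
        {
            var index:int = listeners.indexOf(listener);
            if (index != -1)
                listeners.splice(index, 1);
        }

        public function dispatch(payload:Object):void
        {
            // Plain indexed loop over the array: the dispatcher itself
            // creates no new objects per dispatch.
            var count:int = listeners.length;
            for (var i:int = 0; i < count; i++)
                listeners[i](payload);
        }
    }
}
```

One thing this sketch doesn’t handle: a listener that removes itself (or another listener) in the middle of a dispatch will shift the array under the loop. For simple change notification to one or two listeners that may be an acceptable trade.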
And boy, did it work. Zero memory allocations. It even simplified the API compared to IEventDispatcher. Here’s a quick performance test I built to show the benefits in a theoretical context.
Source | SWF
It runs 100,000 iterations on each test, and each iteration “dispatches” an Event object that has 3 registered listeners. I don’t time registrations, just the dispatch. The listeners are actually do-nothing functions, but since mxmlc doesn’t dead-strip empty functions the test should be representative.
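For anyone who can’t get at the source, the dispatchEvent half of a test like this can be sketched as below. It’s a reconstruction from the description above, not the original file: the class name DispatchBenchmark, the listener names, the use of Event.CHANGE, and the choice to reuse a single Event instance across iterations are all assumptions.

```actionscript
package
{
    import flash.display.Sprite;
    import flash.events.Event;
    import flash.events.EventDispatcher;
    import flash.utils.getTimer;

    // Hypothetical benchmark shell; names are illustrative.
    public class DispatchBenchmark extends Sprite
    {
        private static const ITERATIONS:int = 100000;

        public function DispatchBenchmark()
        {
            var dispatcher:EventDispatcher = new EventDispatcher();

            // Three do-nothing listeners. Registration is not timed.
            dispatcher.addEventListener(Event.CHANGE, onChangeA);
            dispatcher.addEventListener(Event.CHANGE, onChangeB);
            dispatcher.addEventListener(Event.CHANGE, onChangeC);

            // Assumption: one event object is reused for every dispatch.
            var event:Event = new Event(Event.CHANGE);

            // Time only the dispatch loop.
            var start:int = getTimer();
            for (var i:int = 0; i < ITERATIONS; i++)
                dispatcher.dispatchEvent(event);
            var elapsed:int = getTimer() - start;

            trace("dispatchEvent: " + elapsed + " ms");
        }

        private function onChangeA(event:Event):void {}
        private function onChangeB(event:Event):void {}
        private function onChangeC(event:Event):void {}
    }
}
```

getTimer() only has millisecond resolution, which is why the loop count needs to be large enough for the totals to mean anything.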
On my machine, comparing dispatchEvent vs. a Function callback is 1275ms vs. 80ms. That means the callbacks are 15.9x faster. I didn’t test the memory usage, but there would be a significant difference there as well, which means eventually you’ll have to pay the garbage collector his timeslice, compounding the overall performance cost for your application.
Damn, I was proud of myself. I just found a 15.9x performance gain in an integral part of our technology that’s been a hotspot on the profiler since forever. So, I went and swapped all of our event dispatch code for callbacks, tweeted several times about my success, and patted myself on the back.
Then I started writing this blog post. And in the process of consolidating my thoughts and cleaning up my source code I had an idea.
What if EventDispatcher didn’t do any memory allocations? How much faster could it run?
You’re probably asking yourself, didn’t we already go over this a few paragraphs up? We can’t hook EventDispatcher’s allocator. Or can we?
How does EventDispatcher allocate new instances of your custom events? It calls the clone() method, which you’re supposed to override if you need those clones to be typed as your custom event. The default clone() returns a new Event instance. How about a clone() that didn’t allocate any memory? What if we returned this?
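In code, the idea is tiny. This is a sketch of the technique, not the exact class from my test: the name ModelChangeEvent and the MODEL_CHANGED constant are invented for illustration, and it assumes a non-bubbling, non-cancelable event.

```actionscript
package
{
    import flash.events.Event;

    /**
     * Hypothetical custom event whose clone() does not allocate.
     */
    public class ModelChangeEvent extends Event
    {
        public static const MODEL_CHANGED:String = "modelChanged";

        public function ModelChangeEvent(type:String = MODEL_CHANGED)
        {
            // Plain change notification: no bubbling, not cancelable.
            super(type, false, false);
        }

        override public function clone():Event
        {
            // Normally this would be `return new ModelChangeEvent(type);`.
            // Returning the same instance skips the allocation entirely.
            return this;
        }
    }
}
```

The intended usage is to create one instance up front, keep it around, and pass that same instance to dispatchEvent() every time. Whether handing back this is safe in every dispatch scenario is an open question.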
Source | SWF
This performance test compares dispatchEvent with custom events that don’t allocate memory when cloned vs. Function callbacks. You can view it here.
Results? 7 ms vs. 83 ms. That’s right, when you take the new out of dispatchEvent() it becomes 182x faster (1275 ms down to 7 ms)… and about 11.9x faster than a callback!
How fast is it? Well, the theoretical maximum speed for calling functions can be determined by inlining the callbacks (calling them directly instead of going through an array of Functions, which have closure overhead). This runs around 2ms.
What does this mean? Well, it means that I need to go back and revert some of my commits from this afternoon. It also means you should check out your profiler and see if event dispatch is a hotspot for you, and if it is, try subbing in custom events that don’t allocate.
What do we lose by not allocating? No idea. I’ve not done that testing yet. Some have suggested (based on the docs, I believe) that EventDispatcher can’t change the target property after an event has been dispatched, so it has to create a clone if you re-dispatch the event. That just doesn’t make sense to me: I can’t think of an implementation that would require that or benefit from it. It could affect bubbling, though I’m not sure how.
In the end, it may just be a safety measure to ensure that listeners can’t tamper with the original event before other listeners get a chance to consume it. Kinda like cloning a private array before returning it through a public method (for example, in collection classes). I’d be curious to know.
NOTE: My timings were done in the standalone Flash Player. Testing out this post I’m noticing that the numbers aren’t quite the same in the browser (not surprising).