Exupery at: speedup plans
Last updated at 12:46 pm UTC on 17 January 2006
at: and at:put: are rather slow in Squeak. This is a consequence of the image format, which was designed to be space efficient. The problem is that both the class and the size of an object can be encoded in more than one place, so every access has to work out which encoding is in use before it can check the bounds.
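To make the cost concrete, here is a minimal sketch of a generic at: over a deliberately simplified object model. The field names (compact_class_index, class_word, short_size, size_word), the compact class table and the make_object helper are all assumptions made for illustration. They are not the real Squeak header layout.

```python
# Hypothetical, simplified object model. NOT the real Squeak header layout.
COMPACT_CLASSES = {1: "Array", 2: "String"}  # assumed table of compact classes


def make_object(class_name, slots, compact_index=0, use_size_word=False):
    """Build a toy object whose class and size may each live in one of two places."""
    return {
        "compact_class_index": compact_index,            # 0 means "see class_word"
        "class_word": None if compact_index else class_name,
        "has_size_word": use_size_word,
        "short_size": None if use_size_word else len(slots),
        "size_word": len(slots) if use_size_word else None,
        "slots": list(slots),
    }


def class_of(obj):
    """The class is either a compact index or a separate class word."""
    index = obj["compact_class_index"]
    if index != 0:
        return COMPACT_CLASSES[index]
    return obj["class_word"]


def size_of(obj):
    """The size is either a short field or a separate size word."""
    if obj["has_size_word"]:
        return obj["size_word"]
    return obj["short_size"]


def slow_at(obj, index):
    """Generic at: with 1-based indexing. Decodes class and size on every call."""
    if class_of(obj) not in ("Array", "String"):
        raise TypeError("receiver is not indexable in this toy model")
    if not 1 <= index <= size_of(obj):
        raise IndexError(index)
    return obj["slots"][index - 1]
```

Every call to slow_at takes two decoding branches before it reaches the bounds check, which is the kind of per-access overhead the rest of this page tries to remove.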
Speeding up at: and at:put: much further will probably require inlining the calls. I think I can get at: down to around 16 instructions by inlining specialised primitives. This is a nice first use of inlining driven by polymorphic inline caches.
The next at: optimisation is likely to be based on polymorphic inline caches, with at: code specialised for each receiver class. My estimates show that this would take about 14 instructions plus call overhead.
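As a rough illustration of the idea, the sketch below models one send site with a polymorphic inline cache that dispatches to at: code specialised per receiver class. The InlineCache class, the SPECIALISED table and the use of Python's list and bytes as stand-ins for Array and ByteArray are all hypothetical. Exupery itself generates machine code, and nothing here reflects its actual implementation.

```python
def list_at(receiver, index):
    """at: specialised for list receivers (stand-in for Array), 1-based."""
    if not 1 <= index <= len(receiver):
        raise IndexError(index)
    return receiver[index - 1]


def bytes_at(receiver, index):
    """at: specialised for bytes receivers (stand-in for ByteArray), 1-based."""
    if not 1 <= index <= len(receiver):
        raise IndexError(index)
    return receiver[index - 1]


def generic_at(receiver, index):
    """Slow path used when no specialised version exists for the receiver."""
    if not 1 <= index <= len(receiver):
        raise IndexError(index)
    return receiver[index - 1]


# Assumed table mapping receiver classes to their specialised at: code.
SPECIALISED = {list: list_at, bytes: bytes_at}


class InlineCache:
    """Hypothetical polymorphic inline cache for a single at: send site."""

    def __init__(self, limit=4):
        self.entries = []   # (class, specialised function) pairs seen at this site
        self.limit = limit  # stop growing the cache past this many classes
        self.misses = 0

    def send_at(self, receiver, index):
        cls = type(receiver)
        # Fast path: a short chain of class checks, one per cached class.
        for cached_cls, fn in self.entries:
            if cached_cls is cls:
                return fn(receiver, index)
        # Miss: look up specialised code and remember it for next time.
        self.misses += 1
        fn = SPECIALISED.get(cls)
        if fn is None:
            return generic_at(receiver, index)
        if len(self.entries) < self.limit:
            self.entries.append((cls, fn))
        return fn(receiver, index)
```

After the first send for a given class, later sends from the same site skip the lookup and go straight to the specialised code, which is where the saving over the generic path would come from.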
After that it may need either an image format change or a global optimiser (probably SSA).
Adding global optimisations would have two nice effects. First, it would allow at: and at:put: to be optimised without needing an image change. Second, it would provide the basic platform needed to compete with C for speed. The basic plan is to move the object size calculations out of the loop so they will not have a noticeable cost. Global optimisation is a long way off, but I'm currently thinking about the options for at: and at:put: optimisation.
The two options to go further are either changing the image or adding global optimisations. Both have advantages.
Changing the image could speed up individual at: or at:put: operations, but there would still always be some overhead inside the loop.
Adding global optimisations would speed up at: sends in loops by a lot more, and would provide the framework to get near C's speed. Global optimisation should be able to move the at: overhead out of loops entirely without an image change. There would still be the overhead of decoding the object size once per object.
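The difference between paying the size decoding on every access and paying it once per object can be sketched as follows. The Obj class, its size_decodes counter and both sum functions are hypothetical and exist only to count how often the size is decoded. A real optimiser would do this transformation on compiled code, not by hand.

```python
class Obj:
    """Toy indexable object that counts how often its size is decoded."""

    def __init__(self, slots):
        self.slots = list(slots)
        self.size_decodes = 0

    def decode_size(self):
        # Stands in for the header decoding that at: has to do.
        self.size_decodes += 1
        return len(self.slots)


def checked_at(obj, index):
    """at: with the size decoded and the bounds checked on every call (1-based)."""
    size = obj.decode_size()
    if not 1 <= index <= size:
        raise IndexError(index)
    return obj.slots[index - 1]


def sum_in_loop(obj):
    """Unoptimised loop: every iteration pays for decoding and a bounds check."""
    total = 0
    for i in range(1, obj.decode_size() + 1):
        total += checked_at(obj, i)
    return total


def sum_hoisted(obj):
    """Hand-hoisted loop: the size is decoded once, before the loop.

    Because the loop runs over exactly 0..size-1, every index is known to
    be in bounds and the per-iteration check can be dropped.
    """
    size = obj.decode_size()
    slots = obj.slots
    total = 0
    for i in range(size):
        total += slots[i]
    return total
```

For a 100 element object, sum_in_loop decodes the size 101 times while sum_hoisted decodes it once, which matches the remaining "once per object" overhead described above.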