Exupery 0.06 release
Last updated at 4:56 am UTC on 8 March 2005
There are really two major directions for the 0.06 release. The first is finishing the core architecture, which would provide faster sends and blocks. The second is improving reliability so that it would be sensible to run Exupery as a JIT.
It wouldn't yet be wise to run Exupery as a JIT because of a few missing features and known problems:
- Single-stepping through a compiled method will confuse the compiled code when normal compiled execution resumes.
- Compiled code doesn't support interrupts. Alt-. will not interrupt compiled code.
- Compiled code is known to crash the image. Göran managed this using the pystone benchmarks.
- The at: and at:put: implementation only handles object arrays. I'm not sure what it would do if it was used to index into byte storage.
The options are:
- Support for blocks.
- Specialising primitives (especially at: and at:put:).
- Inlining specialised simple primitives.
- Specialising primitives that have a method after them.
- Inlining methods. This should make sends even faster than in 0.05.
- Improving the register allocator. This means adding interference range spilling, spilling context registers (stack values, temporaries, and arguments) directly into the context, and moving temporaries and arguments into registers.
- Tuning and general reliability improvements.
- Building the JIT infrastructure. This should be easy: just a profiler and a background process that compiles. The hard part will be making sure Exupery is reliable enough to run without supervision.
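The last option, a profiler feeding a background compiling process, can be illustrated with a rough sketch. This is Python rather than Smalltalk and is not Exupery's code: the call-counting profiler, the `HOT_THRESHOLD` value, and the names `Profiler`, `compile_method`, `compiler_loop`, and `run_demo` are all assumptions made for illustration. The point it tries to show is that a failed compile should leave the method running interpreted instead of taking the system down.

```python
import queue
import threading
from collections import Counter

HOT_THRESHOLD = 3  # assumed value: sends seen before a method counts as hot


class Profiler:
    """Counts sends per method and queues each hot method once (hypothetical)."""

    def __init__(self, threshold, work_queue):
        self.counts = Counter()
        self.threshold = threshold
        self.work_queue = work_queue
        self.queued = set()

    def record(self, method_name):
        self.counts[method_name] += 1
        is_hot = self.counts[method_name] >= self.threshold
        if is_hot and method_name not in self.queued:
            self.queued.add(method_name)
            self.work_queue.put(method_name)


def compile_method(method_name):
    # Stand-in for the real compiler; it just returns a placeholder string.
    return f"<native code for {method_name}>"


def compiler_loop(work_queue, code_cache):
    """Background loop: compile queued methods until a None sentinel arrives."""
    while True:
        method_name = work_queue.get()
        try:
            if method_name is None:
                return
            try:
                code_cache[method_name] = compile_method(method_name)
            except Exception:
                # A failed compile leaves the method interpreted.
                pass
        finally:
            work_queue.task_done()


def run_demo():
    work_queue = queue.Queue()
    code_cache = {}
    worker = threading.Thread(
        target=compiler_loop, args=(work_queue, code_cache), daemon=True
    )
    worker.start()

    profiler = Profiler(HOT_THRESHOLD, work_queue)
    for name in ["Array>>do:"] * 3 + ["Object>>printString"]:
        profiler.record(name)

    work_queue.put(None)  # ask the worker to stop once the queue is drained
    work_queue.join()
    worker.join()
    return code_cache
```

Calling `run_demo()` compiles only the method that crossed the threshold. A real profiler would more likely sample the running process than count every send; counting is used here only to keep the sketch deterministic.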
The issues are:
- What would it take to make Exupery really useful? Useful enough for a few people to use it seriously.
- What is the minimal architecture that is worth tuning? It's not really worthwhile tuning critical primitives like at:, at:put:, and new without first adding inlining, because the best way to tune them is to generate specialised versions for each receiver class and then inline them.
- Improving the register allocator would speed up bytecode performance. It's really a self-contained change. I can't realistically move temporaries into machine registers without doing more work on the register allocator, because register pressure on the x86 will become more of an issue.
- Tuning and general reliability improvements would be worthwhile for users. There are probably a few bugs still in Exupery. The Python benchmarks crash.
- Support for blocks, improving at:, and inlining require a single architectural change; however, this change could be introduced gradually.
It would be nice to have a fast, flexible implementation for at: and at:put: because they tend to be used in inner loops. The problem is finding a quick way to select the appropriate at: code and execute it quickly. My best idea is to use polymorphic inline caches (PICs) to select a specialised version of the primitive: compile a version of the primitive for each receiving class, then have the PIC dispatch to the matching version.
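The PIC idea can be sketched roughly as follows. This is a Python model of the dispatch logic only, not Exupery's generated code: the classes `ObjectArray` and `ByteStorage`, the `SPECIALISED_AT` table, the `MAX_ENTRIES` limit, and the `PolymorphicInlineCache` class are all hypothetical names chosen for illustration. Each send site keeps a short list of (class, specialised routine) pairs and checks the receiver's class against them before falling back to a slower lookup.

```python
class ObjectArray:
    """Stand-in for an object with pointer slots."""

    def __init__(self, size):
        self.slots = [None] * size


class ByteStorage:
    """Stand-in for an object with byte-indexed storage."""

    def __init__(self, size):
        self.data = bytearray(size)


def at_object_array(receiver, index):
    # Smalltalk indices are 1-based.
    return receiver.slots[index - 1]


def at_byte_storage(receiver, index):
    return receiver.data[index - 1]


def generic_at(receiver, index):
    # Slow path for receivers with no specialised version.
    raise TypeError(f"at: is not specialised for {type(receiver).__name__}")


# One specialised at: routine per receiver class (assumed table).
SPECIALISED_AT = {ObjectArray: at_object_array, ByteStorage: at_byte_storage}


class PolymorphicInlineCache:
    """Per-send-site cache mapping receiver class to a specialised routine."""

    MAX_ENTRIES = 4  # assumed limit on cached classes per send site

    def __init__(self, specialisations, fallback):
        self.entries = []  # list of (class, routine) pairs, checked in order
        self.specialisations = specialisations
        self.fallback = fallback
        self.misses = 0

    def send(self, receiver, *args):
        cls = type(receiver)
        for cached_cls, routine in self.entries:
            if cached_cls is cls:  # fast path: class check, then direct call
                return routine(receiver, *args)
        # Miss: look up the specialised routine and remember it for next time.
        self.misses += 1
        routine = self.specialisations.get(cls, self.fallback)
        if routine is not self.fallback and len(self.entries) < self.MAX_ENTRIES:
            self.entries.append((cls, routine))
        return routine(receiver, *args)
```

A send site would then be used like this: create `pic = PolymorphicInlineCache(SPECIALISED_AT, generic_at)` once, and call `pic.send(receiver, index)` on every execution. Only the first send for each receiver class takes the miss path. In compiled code the class checks would be inline compare-and-jump instructions instead of a loop, which is what makes the selection cheap.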