Last updated at 10:20 pm UTC on 3 September 2004
Dan Ingalls June 25, 2004 Status Report:
A while ago I promised a status report on the 64-bit project. Well, here it is.
The first thing I learned is that this is actually much harder than I thought. The good news is that it's definitely doable, and the VM will be much better for the changes. Moreover, we've already made good progress. I say "we" because Ian Piumarta has been of great help, and has actually done half the work so far.
What Ian has done
Ian is strongly rooted in the world of C and C compilers (I am not), and he has performed a wonderful service that I could not. If you are not familiar with the Slang code for the interpreter, I will tell you that it has always been full of coercions that allow complete anarchy to reign. This was never elegant, but it also never caused a lot of trouble when almost everything was 32-bit pointers. Now, however, in addition to bytes (and halfwords), there are things that are supposed to be 32 bits, things that are either 32 or 64 bits depending on the machine, and things that are supposed to be 64 bits.
Ian's great contribution was to re-cast the basic operations in the Interpreter and ObjectMemory in terms of these different types, in a way that is consistent and that isn't everywhere undermined by coercions to 'char *'. This has produced two results of immediate value. The first is that, with these changes and a few load and store macros, it is now possible to build a VM that will run a 32-bit image on a 64-bit machine. The second is that now, if you make a mistake in usage, e.g., between 32-bit quantities and things that are the size of a native machine word, the compiler will diagnose it before you crash and have to debug it. Ian's work was done almost entirely by eliminating the damaging coercions, and then using well-typed access routines until the compiler errors went away. Ian's goal is that, simply by changing load and store macros, it should be possible to generate VMs that will run any of the 4 combinations of 32- or 64-bit images on 32- or 64-bit platforms.
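To make the idea of well-typed access routines concrete, here is a minimal sketch in C. It is not Ian's actual code: the names IMAGE_WORD_BITS, image_word, oop, load_oop, store_oop, load_int32 and store_int32 are all hypothetical, and wrapping the image word in a struct is just one way (assumed here for illustration) to get the compiler to reject a plain integer or a host pointer where an object pointer is expected.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical build-time switch: width of an object pointer in the image. */
#ifndef IMAGE_WORD_BITS
#define IMAGE_WORD_BITS 32
#endif

#if IMAGE_WORD_BITS == 64
typedef uint64_t image_word;
#else
typedef uint32_t image_word;
#endif

/* A distinct type for object pointers. Because it is a struct, passing a
   bare int or a char pointer in its place is a compile-time error. */
typedef struct { image_word bits; } oop;

/* Toy object memory; a real VM would allocate this from the host. */
static unsigned char heap[64];

/* Load an image-sized word from a byte offset in object memory. */
static oop load_oop(image_word offset) {
    oop value;
    assert(offset + sizeof value <= sizeof heap);
    memcpy(&value, heap + offset, sizeof value);
    return value;
}

/* Store an image-sized word at a byte offset in object memory. */
static void store_oop(image_word offset, oop value) {
    assert(offset + sizeof value <= sizeof heap);
    memcpy(heap + offset, &value, sizeof value);
}

/* Fields that are always 32 bits get their own accessors, so they never
   change size when IMAGE_WORD_BITS does. */
static uint32_t load_int32(image_word offset) {
    uint32_t value;
    assert(offset + sizeof value <= sizeof heap);
    memcpy(&value, heap + offset, sizeof value);
    return value;
}

static void store_int32(image_word offset, uint32_t value) {
    assert(offset + sizeof value <= sizeof heap);
    memcpy(heap + offset, &value, sizeof value);
}
```

In this sketch the image word size is chosen independently of the host pointer size, so the same accessors could in principle be compiled for any of the four image/host combinations by changing only IMAGE_WORD_BITS.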
In addition to the Interpreter and ObjectMemory changes, Ian has also rewritten parts of his Unix support code to make it all 64-bit clean. The end result of all of these changes is that Ian is now able to generate and compile a 32-bit VM that runs on both 32- and 64-bit platforms.
What I have done
On my side, I have written a true 64-bit image in a format that makes the minimum changes that can possibly work. I have also made changes equivalent to Ian's throughout the Simulator and Interpreter, but mine have to address all the changes related to different word sizes in the image. While some of these details were anticipated in symbolic constants, many are pragmatic. There are no automatic ways to find the problems, other than, e.g., searching for the integer 4. Or 32. Or 2 (the amount by which something may have been shifted to get a word offset). Or 3 (a baaad folding of wordsize-1). And so on. Then run the simulator and see where it breaks. It is a deep immersion in the debugger. That said, today I can run over 12000 bytecodes, up from only 300 a couple of days ago, so I think I'm getting close.
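The hunt for 4, 2 and 3 is easier to picture with a small C sketch. The constant and function names below (BYTES_PER_WORD, SHIFT_FOR_WORD, WORD_MASK, field_offset, round_up_to_word and the legacy_ variants) are hypothetical and chosen only for illustration; the legacy functions show the kind of hard-coded 32-bit arithmetic that silently goes wrong in a 64-bit image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Symbolic constants for a 64-bit image (hypothetical names). */
#define BYTES_PER_WORD 8
#define SHIFT_FOR_WORD 3                   /* log2(BYTES_PER_WORD) */
#define WORD_MASK      (BYTES_PER_WORD - 1)

/* Hard-coded 32-bit versions: the literal 2 is a word shift and the
   literal 3 is a folded (wordsize - 1). */
static size_t legacy_field_offset(size_t header_bytes, size_t index) {
    return header_bytes + (index << 2);
}

static size_t legacy_round_up_to_word(size_t n) {
    return (n + 3) & ~(size_t)3;
}

/* The same operations written against the symbolic constants. */
static size_t field_offset(size_t header_bytes, size_t index) {
    return header_bytes + (index << SHIFT_FOR_WORD);
}

static size_t round_up_to_word(size_t n) {
    assert(n <= SIZE_MAX - WORD_MASK);     /* guard against wrap-around */
    return (n + WORD_MASK) & ~(size_t)WORD_MASK;
}
```

With these definitions, rounding 9 bytes up gives 16 with the symbolic version but 12 with the legacy one, which is not a multiple of the 8-byte word. Nothing in the legacy code looks wrong in isolation, which is why such spots have to be found by search and by running the simulator.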
What is yet to be done
I am now in the process of merging Ian's and my changes, after which I will verify that the result works as well as before in both environments (running 32 on 64 natively and running 64 on 64 in the Simulator). After this, I need to finish finding the 64-bit offset bugs, and produce a VM that will run 64 on 64 natively.
The VM that Ian has made to run on a 64-bit host, and the Simulations that I have made of running a 64-bit image, are both "kernel" VMs. That is, they include the kernel primitives and a couple of the heavily used plugins, but there are a number of large and important plugins that have not yet been touched by our conversions. We simply let them fail, and they run the failure code as an emulation.
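The "let it fail and run the failure code" arrangement can be sketched as follows. This is only a toy model in C, not the VM's real dispatch: in Squeak the failure code is the Smalltalk method body that follows the primitive call, whereas here a C function stands in for it, and every name (primitive_fn, unconverted_plugin_add, converted_add, fallback_add, send_add) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A primitive reports success or failure; on success it fills in *result. */
typedef bool (*primitive_fn)(int receiver, int arg, int *result);

/* Stand-in for a plugin primitive that has not been converted yet:
   it always fails, so the caller must fall back. */
static bool unconverted_plugin_add(int receiver, int arg, int *result) {
    (void)receiver; (void)arg; (void)result;
    return false;
}

/* Stand-in for a primitive that has been converted and works. */
static bool converted_add(int receiver, int arg, int *result) {
    *result = receiver + arg;
    return true;
}

/* Stand-in for the failure code: slower, but produces the same answer. */
static int fallback_add(int receiver, int arg) {
    return receiver + arg;
}

/* Try the primitive first; if it is missing or fails, run the fallback. */
static int send_add(primitive_fn prim, int receiver, int arg,
                    bool *used_fallback) {
    int result = 0;
    assert(used_fallback != NULL);
    if (prim != NULL && prim(receiver, arg, &result)) {
        *used_fallback = false;
        return result;
    }
    *used_fallback = true;
    return fallback_add(receiver, arg);
}
```

The point of the sketch is that a failing primitive changes speed, not results, as long as the failure code is a faithful emulation.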
Calling All Cars
We are coming to a time when the 64-bit image runs, and we will have finished the kernel conversion, but there will still be much work to do on the plugins. It is my hope to enlist some help from the larger Squeak community to complete this task. Ian and I will document precisely the new conventions for data access in the VM, along with an example conversion of a couple of simple plugins. As a test-bed, we will produce a 64-on-32 release whose kernel works, but whose plugins need conversion. A 64-on-32 is a Simulator and VM that can run a 64-bit image on a 32-bit machine. It is important because it can be tested on any old 32-bit machine, and yet, because the image word size is different from the machine word size, we believe that if code works in that configuration, it is very likely to work in any of the other combinations.
So stay tuned. If you know any of the plugins well, or if you know the VM generally and would like to help us finish the job, please let us know (reply to me) and we'll plan to include you in the big plugin party, probably in about a month.
Remember the Version 4 format changes? Well, I haven't even thought about them since I wrote my first 64-bit image. But I can tell you that they are mostly small compared to this task, and it should be easy to fold them into this project toward the end. My plan is to work on these during the period after the kernel works fully and before all the plugins have been converted.
We talked about this all getting folded into the 3.8 changes so that 3.8 could essentially be the same as 4.0 except for the VM changes for 64 bits and associated image-side tweaks. I originally thought that we might have to rush 3.8 to sync the two schedules, but it now looks like a fairly consistent time frame. I think it will take until the end of August, and maybe even into the fall, for the last of the plugin conversions to be done and the bug tail to have died down.