===================================================
===================================================
Draft for Block Closure Semantics for Squeak v0.8.2
===================================================
===================================================

Author: Stephan Rudlof (sr@evolgo.de)
Date: 20.02.2002

Changes:
  v0.8.2: review of v0.8.1 with further rework
  v0.8.1: big rework (to be reviewed)
  v0.8.0: Bytecode for reaching outer read/write temporaries:
          semantics has changed!
          Semantics: -> big rework of this document necessary
          (coming soon!)
  v0.7.0: Evaluation semantics has changed for *not* sharing locally
          created temps (special thanks to Andreas Raab for
          triggering this)
  v0.6.0: other and more (needed) BCSBlockContext inst vars
  v0.5.0: Exploiting read only property of temps (typically
          method/block arguments).


To the Reader
=============

Please mail any
- detected errors,
- suggested corrections, and
- questions you may have
to me!


Open Questions
==============

Allen Wirfs-Brock; Comments on Smalltalk block closure designs, part 1:
- Should they have consequences for this design?
- If so, which ones?


New Ideas not worked out
========================


Basic Ideas
===========

- Make it as simple as possible first, improve it later if necessary
  (this principle has already seen some iterations);
- introduce local block variables,
  - with direct references (oops) for
    - block arguments (as usual for BlockMethods),
    - locally created temps (by '| |' expressions),
    - readonly temporary variables from outer contexts;
  - containing outer contexts for realizing indirect references
    (see below)
    - to read/write temporary variables from outer contexts;
- store these variables in the indexable fields of a new block context
  class.

Legend
======

- temps -> temporary variables residing in indexable fields holding
  oops of
  - readonly method/block arguments, or
  - readonly outer temps, or
  - read/write temps created by '| |' expressions;
- local temps -> temps in the current method or block context;
- locally created temps -> temps created by a '| |' expression in the
  current context;
- outer temps -> temps in outer method or block contexts;
- o/i pairs -> an o/i pair represents the indirect reference to a
  *writable* temp in an outer context o: the i is the index into the
  indexable fields of o;
- indexable fields for outer references -> indexable fields containing
  readonly or r/w references (readonly oops or outer context oops
  accessed by o/i pairs) to outer temps;
- indexable fields for pushed outer contexts -> indexable fields
  containing outer contexts accessed by o/i pairs;
- indexable fields for indirect references -> indexable fields for
  pushed outer contexts.


BCSBlockContexts
================

BCS stands for 'block closure semantics'.

BCSBlockContexts have some properties of MethodContexts: the layout of
their indexable fields regarding temps is like that of MethodContexts,
which have room for storing temporary variables. But unlike
MethodContexts, they also store references to outer temps in the
indexable fields before the first dynamically used stack entry, if
needed.


Indexable Fields Layout of BCSBlockContexts
===========================================

Currently there are just block arguments stored in a BlockContext's
indexable fields.
Let's enhance the indexable fields functionality as follows for
BCSBlockContexts:

Example
-------

In this example the block has
- 2 arguments (a's);
- 3 read only outer references (r's);
- 2 outer contexts for accessing read/write outer temps (o's);
- 5 locally created temps (l's).

After block copy:
  ???????rrroo
         -----
             ^
At begin of evaluating the block:
  ???????rrrooaa
         -----
               |
After initialization of locally created temps with nil:
  ??00000rrrooaa
         -----
               |
After storing block arguments into local temps:
  aa00000rrroo
         -----
             ^
After assigning to all locally created temps:
  aalllllrrroo
         -----
             ^

?: uninitialized;
a: read only block argument, stored as temp at begin of evaluating the
   block;
0: locally created temp (by '| |' expression) initialized with nil;
l: oop of locally created temp;
r: readonly outer temp, stored directly as oop here;
o: oop of outer block or method context to be referenced by o/i pairs;
i in o/i pair: index of local temp (a or l) location in indexable
   fields of outer context o;
-: outer references, direct (r's) and indirect (o's accessed by o/i
   pairs) ones;
^: stack pointer after copying outer references to the indexable fields
   - top of stack after the BCS blockCopy operation before evaluation,
     and
   - top of stack directly after storing block args into local temps
     while evaluating;
|: stack pointer just after copying the block arguments from the stack
   of the outer context to the current one while evaluating the block
   (in the next step these block arguments are popped and stored into
   the first indexable fields).

r's have to be globally read only during the lifetime of this block.
Typically they are (outer) block/method arguments stored as temps in
their indexable fields, but with a smart compiler there may be more.

An o/i pair represents the indirect reference to a *writable* temp l in
the outer context o: the i's are not visible here; they are in the
compiler-generated bytecodes for accessing temps in outer contexts o.
On the stack there are just the o's.

Note: a simplified implementation may use indirect references (o/i
pairs) for readonly outer vars, too.

Block copy with outer temps
---------------------------

A block copy has to generate a BCSBlockContext with space for
- block arguments (a's) (like for MethodContexts (not BlockContexts!)),
- locally created temps (l's) (like for MethodContexts),
- read only outer temps (r's),
- outer contexts (o's) (for accessing read/write outer temps (o/i
  pairs)), and
- the stack (like for Block/MethodContexts),
in its indexable fields.

Before a block copy, two groups of oops are pushed onto the stack:
first the readonly oops of local (args) or outer temps, then the
contexts holding read/write temps (the current context for locally
created temps, outer contexts for outer ones). Their numbers are pushed
afterwards. The block copy then copies both groups to the indexable
fields for outer references of the newly created BCSBlockContext: the
first group as readonly oops, the second as outer contexts to be
accessed by o/i pair bytecodes.

In the current context there may be locally created (directly
referenced) *and* outer (indirectly referenced) r/w temps; seen from
the new context these are either directly referenced oops (r's) or, for
r/w temps, outer context oops (o's) accessed by o/i pairs.

Stack of current context before block copy:
  ...rrroooCALRO

o: oop of outer or current context, may be taken from indexable fields
   (as intermediate store) here;
r: oop of readonly temp in current context;
C: current context;
A: number of block arguments;
L: number of locally created temps (l's);
R: number of outer readonly temps of the BCSBlockContext to be created;
O: number of outer contexts o for accessing r/w temps (by o/i pairs) of
   the BCSBlockContext to be created.

The indices i have to be generated by the compiler to access the right
indexable field i in an outer context o stored in the inner context:
both together represent an o/i pair.
The same pushed o may be used for different o/i pairs (the i differs
then) in the inner context. After an outer context o has been copied to
an inner BCSBlockContext, it is always used there for indirect
references, via a bytecode using it as part of an o/i pair!

Additional Instance Variables
-----------------------------

BCSBlockContexts have the following inst vars in addition to those of
BlockContext:
- numLocalTemps (L),
- numDirectOuterTemps (R),
- numPushedOuterContexts (O).

They serve the following purposes:
- nil'ing locally created temps;
- to know the contexts containing outer temps;
- to know where the stack pointer has started.

The stack pointer value after a blockCopy is computed as:
- stackP := numArgs + numLocalTemps + numDirectOuterTemps
            + numPushedOuterContexts
(in the example above: 2 + 5 + 3 + 2 = 12).

Important: All outer contexts o have to be copied to the outer contexts
(part of the) indexable fields of the newly created BCSBlockContext,
which are used
- inside the block context of the block currently being blockCopied,
  and
- in block contexts created later, while evaluating that block.
The latter means that an outer context has to be propagated from the
outer BCSBlockContexts to the inner ones until the last block that uses
it is reached.


Evaluation
==========

Unchanged semantics
-------------------

As is done currently, arguments are pushed onto the local stack when
the block is to be evaluated.

Semantics changes
-----------------

The semantics of the #value method family has to be changed as follows:
1. Read/write locally created temps (l's) are always nil'ed when the
   block is evaluated.
2. If a second evaluation of a BCSBlockContext with args and/or locally
   created temps starts while another one hasn't finished yet, the
   block has to be cloned first. This block cloning creates a new
   BCSBlockContext, which is then evaluated instead.

Why?
Because otherwise arg and locally created temps would be shared by
inner BCSBlockContexts, and avoiding this is one of the reasons to
introduce these beasts! Arg and locally created temps are exactly the
temps we don't want to share, in contrast to the outer ones.

Implementation idea (from Andreas Raab): the IP of the BCSBlockContext
to be evaluated could be used to check whether it is already in
evaluation; if it is *not*, cloning can be avoided. After evaluation
the IP has to be nil'ed (or 0'ed) so that the BCSBlockContext can be
reused for a new evaluation.

Performance consideration: Use a free contexts list, as is done for
MethodContexts, for recycling contexts.


What the Compiler has to do
===========================

- differentiating between different kinds of variables:
  - intelligent naming scheme,
  - qualifying as
    - local and outer context,
    - readonly and read/write temps;
- generating bytecodes (see below) for
  - accessing (storing/getting) indirectly referenced outer context
    temps (represented by o/i pairs),
  - block copy operations generating BCSBlockContexts;
- dealing with the propagation of outer contexts from outer
  BCSBlockContexts to inner ones;
- renaming shadowed variables.

Note: these extensions are not trivial as the implementation has
shown...


What the VM has to do
=====================

- clearing locally created block temps in the #value family of
  primitives (by switching there between BlockContext and
  BCSBlockContext (as long as there isn't a clean BCS image));
- doing a BCSBlockContext cloning inside the #value family of
  primitives, if
  - the block is in use already (called recursively or from other
    processes), and
  - has arguments and/or locally created temps.


Misc
====

- Outer temps living in MethodContexts are treated as outer temps in
  BCSBlockContexts (and not accessed, as usual, via their home pointer
  (containing the MethodContext)).
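
The evaluation rules from "Semantics changes" (nil'ing locally created
temps, cloning a block that is already in evaluation, and using the IP
as the in-use marker) can be tried out in a small model outside the VM.
The following Python sketch is only an illustration under simplifying
assumptions: the class BCSBlockContextModel and all of its attribute
names are hypothetical, the indexable fields are a plain list, and the
"IP" is just a marker that is None when the context is not in
evaluation.

```python
# Illustrative model only; class and attribute names are hypothetical and
# merely mirror the terms used in this document.
import copy


class BCSBlockContextModel:
    def __init__(self, num_args, num_local_temps, outer_refs, body):
        self.num_args = num_args
        self.num_local_temps = num_local_temps
        # Layout: args, locally created temps, then outer references.
        self.fields = [None] * (num_args + num_local_temps) + list(outer_refs)
        self.body = body          # callable taking the evaluating context
        self.ip = None            # None means: not in evaluation

    def in_use(self):
        return self.ip is not None

    def needs_clone(self):
        has_private_temps = self.num_args + self.num_local_temps > 0
        return self.in_use() and has_private_temps

    def clone(self):
        twin = copy.copy(self)             # shallow: outer refs stay shared
        twin.fields = list(self.fields)    # private args and local temps
        twin.ip = None
        return twin

    def value(self, *args):
        if len(args) != self.num_args:
            raise ValueError("wrong number of block arguments")
        target = self.clone() if self.needs_clone() else self
        target.fields[:target.num_args] = args
        first_local = target.num_args
        for i in range(first_local, first_local + target.num_local_temps):
            target.fields[i] = None        # rule 1: nil locally created temps
        target.ip = 0                      # mark as being evaluated
        try:
            return target.body(target)
        finally:
            target.ip = None               # reusable for a new evaluation


def countdown(ctx):
    n = ctx.fields[0]                      # block argument
    ctx.fields[1] = n                      # locally created temp
    if n > 0:
        block.value(n - 1)                 # re-entrant evaluation
    return ctx.fields[1] == n              # is our own temp still intact?


block = BCSBlockContextModel(1, 1, [], countdown)
result = block.value(3)
```

In this model `result` is True: every nested evaluation runs on a clone,
so the temp of the outer evaluation is not overwritten. Without the
cloning step all evaluations would write to the same two fields.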

Used current bytecodes
----------------------

( 16  31 pushTemporaryVariableBytecode)
(104 111 storeAndPopTemporaryVariableBytecode)

They are usable for BCSBlockContexts, since these behave like
MethodContexts here.


Proposed new bytecodes
======================

Bytecode for reaching outer read/write temporaries
--------------------------------------------------

Bytecode 126 (currently unused)

extendedOuterAccessBytecode
    "This 3-byte code performs access to read/write temporary variables
    in outer contexts. This access is an indirect one: varIndex refers
    to the indexable fields of the outer BCSBlockContext residing as oop
    in the current context at contextIndex. This oop at contextIndex is
    an outer Block/MethodContext accessed by varIndex to get a local
    temporary residing in its (the outer context's) indexable fields.
    The 2nd byte consists of
        a 2-bit operation type, and
        a 6-bit context index into the indexable fields of the current
        context.
    The 3rd byte contains the temporary var index into the indexable
    fields of the outer context."
    | descriptor varIndex opType contextIndex |
    descriptor := self fetchByte.
    varIndex := self fetchByte.
    self fetchNextBytecode. "as fast as possible"
    opType := (descriptor >> 6) bitAnd: 2r11. "high 2 bits shifted down"
    contextIndex := descriptor bitAnd: 2r111111. "all but high 2 bits"
    opType = 0 ifTrue: [^ self pushContext: contextIndex at: varIndex].
    opType = 1 ifTrue: [^ self storeContext: contextIndex at: varIndex].
    opType = 2 ifTrue: [^ self storePopContext: contextIndex at: varIndex].
    opType = 3 ifTrue: [self error: 'unknown bytecode']

So we are able to address 64 possible outer contexts and 256 variables
inside them. We don't need a lexical level here, since access to the
r/w outer temps is realized by indirect references via outer contexts
stored in the indexable fields of the current BCSBlockContext. The
varIndex here refers to an indexable field in the outer context; this
is an argument or locally created temp there.
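
The descriptor layout and the indirect o/i access described above can
be checked with a small model outside the VM. The following Python
sketch is an illustration only, not the Interpreter code: contexts are
plain lists standing in for indexable fields, indices are 0-based, the
stack is kept as a separate list for simplicity (in the proposal it
lives in the same indexable fields), and all function names are
hypothetical.

```python
# Illustrative model only: contexts are Python lists standing in for the
# indexable fields of a context; indices are 0-based here. All names are
# hypothetical and do not exist in the Squeak VM.

PUSH, STORE, STORE_POP = 0, 1, 2


def decode_descriptor(descriptor):
    """Split the 2nd byte into (op_type, context_index)."""
    op_type = (descriptor >> 6) & 0b11        # high 2 bits shifted down
    context_index = descriptor & 0b111111     # low 6 bits
    return op_type, context_index


def encode_descriptor(op_type, context_index):
    """Inverse of decode_descriptor, as a compiler would emit it."""
    if not 0 <= op_type <= 2:
        raise ValueError("op type 3 is not defined")
    if not 0 <= context_index < 64:
        raise ValueError("context index must fit into 6 bits")
    return (op_type << 6) | context_index


def outer_access(fields, stack, descriptor, var_index):
    """Perform one o/i access: fields[context_index] is the outer context o,
    var_index is the i into o's indexable fields."""
    op_type, context_index = decode_descriptor(descriptor)
    outer = fields[context_index]
    if op_type == PUSH:
        stack.append(outer[var_index])
    elif op_type == STORE:
        outer[var_index] = stack[-1]          # value stays on the stack
    elif op_type == STORE_POP:
        outer[var_index] = stack.pop()
    else:
        raise ValueError("unknown bytecode")


# Usage: the outer context has one writable temp at index 1.
outer_context = ["arg", 10]
inner_fields = [None, outer_context]          # o stored at index 1
stack = []

outer_access(inner_fields, stack, encode_descriptor(PUSH, 1), 1)
stack.append(stack.pop() + 1)                 # increment the pushed value
outer_access(inner_fields, stack, encode_descriptor(STORE_POP, 1), 1)
```

The usage part models an inner block incrementing a writable temp of
its outer context: the write goes through the o stored in the inner
context's fields, so the outer context sees the new value.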

But how do outer contexts reach the newly created BCSBlockContext? They
have to be copied to its indexable fields for pushed outer contexts,
after its directly referenced outer vars, by new Interpreter block copy
methods. These methods create the BCSBlockContexts. So there is a need
for more bytecodes:

Bytecodes for block copying
---------------------------

Bytecode 138 (currently experimental)

bytecodePrimBCSBlockCopyClean
    "clean block"
    ...

and

Bytecode 139 (currently experimental)

bytecodePrimBCSBlockCopy
    "unclean block"
    ...

which call the corresponding primitives:

primitiveBCSBlockCopyClean
    "clean block"
    | context methodContext newContext initialIP numOfArgsOop
      numOfLocalTempsOop methodContextSize numOfLocalTemps numOfArgs
      blockContextSize |
    numOfLocalTempsOop _ self stackValue: 0.
    numOfArgsOop _ self stackValue: 1.
    numOfLocalTemps _ self integerValueOf: numOfLocalTempsOop.
    numOfArgs _ self integerValueOf: numOfArgsOop.
    context _ self stackValue: 2.
    ...

and

primitiveBCSBlockCopy
    "unclean block"
    | context methodContext newContext initialIP numOfArgsOop
      numOfLocalTempsOop methodContextSize numOfLocalTemps numOfArgs
      blockContextSize numOfPushedContextsOop numOfPushedContexts |
    numOfPushedContextsOop _ self stackValue: 0.
    numOfLocalTempsOop _ self stackValue: 1.
    numOfArgsOop _ self stackValue: 2.
    numOfPushedContexts _ self integerValueOf: numOfPushedContextsOop.
    numOfLocalTemps _ self integerValueOf: numOfLocalTempsOop.
    numOfArgs _ self integerValueOf: numOfArgsOop.
    context _ self stackValue: 3.
    ...

Interpreter>>primitiveBCSBlockCopy has to copy the outer references
(pushed as oops onto the stack) as readonly oops (missing here) or as
outer contexts (to be accessed by o/i pair bytecodes) to the indexable
fields of a newly created BCSBlockContext.

Note: These methods are written to convey the idea; the concrete
implementation may differ.


Limitations
===========

Unknown so far.

Compatibility and Upgrading
===========================

This proposal is as compatible as possible with the old scheme:
- (old) BlockContexts generated by already translated methods behave as
  before;
- a new VM is needed for BCS (block closure semantics);
- if the (modified) compiler starts to translate with BCS, the
  semantics change for newly compiled methods -> this affects
  recompiling;
- older images are usable with a BCS VM;
- after upgrading to a BCS compiler the image isn't usable by a pre-BCS
  VM!

The upgrade process has to be carried out with caution!