Finalization is not resource management


	links to this page:

Last updated at 3:43 pm UTC on 14 January 2006

From: Michael Roberts Sent: January 22, 2004 Subject: Sockets and finalisation

I have been playing around with Sockets lately. I was wondering what was the best way to handle cleaning them up? . . . I have an object aNode that holds onto aSocket then I can do all sorts of things with the socket. . . . I can clean up orphaned open sockets by doing something like

   Socket allInstances do:[:s | s closeAndDestroy]

As long as I'm sure that I've created all the sockets. If I wanted the semantics that after dropping aNode I wanted to send closeAndDestroy to its socket I feel I might want to head in the area of finalisation but I don't know how to do this.

. . . What is the best, or considered, way to handle processes in objects when you then forget about the object without cleaning up the process?

The Short Answer: Finalization is not resource management

From: Andreas Raab
The thing to keep in mind for all of the places where finalization is provided in the image is that finalization is ALMOST NEVER a replacement for proper resource management. In short: The only reason why we provide finalization support for some objects (such as files and sockets) is that without them, any "forgotten" file/socket would leave to dangling OS resources and therefore get problematic before too long. As with all resource-containing objects, having proper resource-deallocation is critical as well as application dependent. The only thing the system can do in general is to make sure that it closes the appropriate resources when it can prove that this object is no longer used (e.g., GCed). Finalization is sometimes understood to be "sort of" resource management but in almost all "real world applications" this won't work too well.

Therefore, if you write an app which uses resource-allocating objects, it is YOUR task to make sure they are closed properly. Which is one of the reasons why patterns such as:

    file := self openFile: 'foo.txt'
   [self doSomethingWith: file] ensure:[file close].

are used. IOW, the above ensures that the opened file is indeed properly closed after we are finished with it, finalization or not.

(Completely OT, but I have long thought that it might be wiser not to provide the current finalization facilities as they typically lead to worse code since people are relying on it as the default rather than the exceptional facility that it actually constitutes)

[If by "dropping" aNode,] you mean "wait until it is GCed" rather than somehow explicitly "shut down", then you _may_ want to use finalization. If so, have a look at Object>>toFinalizeSend:to:with:.

From: John McIntosh
[Seconding Andreas]. I personally have received phone calls from the far side of the world when folks have abused finalization, then on production day realize it's "sort of"

I always say: objects get finalized "someday" in the future in a different order than you expect, at a different rate than you expect, and at a different time than you expect. Each of those 'expects' is cause for disaster.

The Long Answer: A discusion

From: Avi Bryant
Given [Andreas' advice to use [] ensure:[], why does FileStream not have methods something like?

FileStream class>>fileNamed: fileName do: aBlock
	|file|
	file := self fileNamed: fileName.
	^[aBlock value: file] ensure: file close

So that you could do things like

    contents := FileStream fileNamed: 'foo' do: [:f | f contentsOfEntireFile].

This is quite common in other languages that make heavy use of closures, i.e. Lisp (with-file 'foo' (f) ...) and Ruby (File.open('foo'){|f|...}), but for some reason it doesn't seem to be idiomatic in Smalltalk.
Also useful would be #readOnlyFileNamed:do:, #newFileNamed:do:, etc.

From: Julian Fitzell
And that combinatorial explosion is probably why the methods don't exist... :) Still, I agree they're useful.

From: Craig Latta
It should also do the creation inside the ensured block. As it is, the process running the code could get stopped before the ensured block is evaluated, leaving the file dangling.
[Julian Fitzell:In that case, you'd presumably need to add an #ifNotNil: check before closing the file?]
[Craig:] I've always just implemented UndefinedObject>>close, doing nothing.

From: Andreas Raab
Putting the creation in the ensured block doesn't prevent [leaving the file dangling]. It is quite possible for #fileNamed: to be terminated before returning to the sender and since the file variable hasn't been assigned yet this wouldn't prevent you from leaving a dangling file any more than the above. [Craig agrees.]
And the chance of the process being terminated in between the assignment and entering the ensured block is - compared to the probability of being terminated somewhere inside #fileNamed: - so remote it isn't even worth mentioning. [Craig disagrees. He's seen it.]

Really, using #ensure: in the above way is about resource management, not dealing with horrifically complex exceptional situations []. For dealing with the resource it doesn't matter at all if the file creation is inside or outside the block - the basic deal is that once you're done with the block we assume that you don't need the resource any longer and close it.

[] And for all but the most horrific situations we still have finalization to rely on. This is one of the few places where "finalization for resource management" is actually helpful - since somehow the system broke the contract to hand you the file handle, yet created it, it's the system's responsibility to clean up afterwards. [Craig: No, it's about ensuring things that you want to have happen.]

From: Tim Rowledge
[Back to Avi's original query about why does FileStream not have methods something like . . ] Others have pointed to some of the problems with this when opening the file goes wrong; I'd add that there should be a check for that before the block is evaluated, possibly with an exception being raised. That would make it important to consider what response the exception might elicit so that we could continue cleanly if appropriate.

Another problem was alluded to in the swamp of methods needed to mirror all the FileStream creation methods. My response would be along the lines of "don't do that". Instead, factor it out better. Keep the FileStream creation separate and make the block handling an instance method:-

   myFile := FileStream newFileNamed:'wibble').
   myFile ifNil:[FileStream newFileCreationException raise].
   myFile doBlock:[:fl|
	fl nextPutAll: war and peace' asBook entireContents.
	fl thingy].

Or something a bit like that. I don't like the name #doBlock: but I can't think of anything better right now. #do: and #perform: are almost right but have rather specific meanings already.

From: Craig Latta
[Seconding, the idea that an exception being raised.] Indeed; I think the file creation behavior should raise an exception instead of answering nil. I guess there wasn't exception-handling support when it was (first) written.

From: Andreas Raab
What I'm trying to say here is that there's a fundamental uncertainty principle when you put the "creation part" for some resource inside a block that should ensure the release of said resource. Let's look at this more closely. You may have something that says:
[resource :=
"... more stuff ..."
] ensure:[resource release "close, whatever"].
If, for whatever reason, the "resource creating expression" fails to complete (raising an exception, just terminating whatever) there is no way by which you can release the resource in the above code simply because the "resource" variable hasn't been assigned yet. Meaning that it is basically useless to put the creation part inside the ensured block. The only thing you gain by doing this is that you cover the microscopically small chance that the computation gets terminated right after assigning to the resource variable but before entering the ensured block. One way to make it "safer" would be to use a pattern where you say (going back to the file example):

     file := FileStream new.
    [file openFileNamed: 'foo.bar'.
    "... do stuff ..."
    ] ensure:[file close].

since here you'd have in fact a valid "file" variable when you open the file, but then again, you could not ensure the actual creation (FileStream new) as the same uncertainty principle applies (e.g., if #new fails you wouldn't have a valid file variable and therefore your release operation cannot handle the failure).

That's why I'm saying putting the resource creation inside the ensured block doesn't give you anything, and that's why you don't need explicit guards nor stub methods in UndefinedObject. You simply can't handle failures that arise from resource creation contained inside some ensured block that should cover the release of that resource.

From: Lex Spoon
[Regarding Craig’s report that although “the chance of the process being terminated in between the assignment and entering the ensured block is - compared to the probability of being terminated somewhere inside #fileNamed: - so remote it isn't even worth mentioning”, he in fact has experienced it] Where is the terminate coming from? If it's from an exception internal to File, then you shouldn't need to #close the file as a user. If it's from another thread, well, don't do that. :) You'll have many other difficulties than this one.

From: Tim Rowledge
[Regarding Andreas’s statement “You simply can't handle failures that arise from resource creation contained inside some ensured block that should cover the release of that resource.”]
Not quite completely true, though it would require a change in the resource allocation behaviour to be clean.
a) One could simply do the ugly thing and make UndefinedObject respond to the resource release message (yuck)
b) One could return an instance of FailedResourceAllocationDoohickey rather than nil when the file open fails and provide it with protocol to do various useful things. Like maybe retry under some situations, release 'cleanly' and so on.

With b) we would anticipate an exception from the "do stuff" part of the ensured block which would then unwind and the 'file' would handle #close by doing nothing much. Is this a good pattern? Not sure right now.

[In a latter message:] Note that I'm not claiming any rightness or wrongness in the approach, simply that saying "you can't do it" isn't correct. Besides, we haven't had an argument about something fun in ages. :-)

From: Andreas Raab
[Don’t think Tim’s enhancements would...] matter one bit. (Am I really explaining myself so poorly?) All I'm saying is that there's no reason to put the resource creation into the ensured block, as, no matter what you do, if the resource allocation fails (and returns anything but that allocated resource) you don't need to handle it (the resource hasn't been allocated so why free it?). Given that this is the case, it seems that for reasons of clarity alone the ensured block can (and should) only cover the portion of code where you can rightfully assume that you do in fact have a valid resource. For example, even with your FailedResourceAllocationDoohickey I would claim that it's simpler and more easy to understand if you have code that says, e.g.,

    file := self openFile:'foo.bar'.
    file == FailedResourceAllocationDoohickey
        ifTrue:[^self error: 'could not open file']. "or whatever"
    [self doSomethingWith: file] ensure:[file close].

From: Tim Rowledge
I agree that [the pattern Andreas just suggested] is pretty much how I would choose to do it too but if someone really preferred the pattern with the allocation inside the ensured block then I feel sure it could be made to function adequately. And I'd rather see a pattern I don't favour actually working than not working; at least it doesn't mean broken functionality.

From: Richard O Keefe
[Regarding Andreas Raab's recommendation against putting resource creation inside an #ensure: block. ... That“it is basically useless to put the creation part inside the ensured block.”]
What if you do

 	resource := nil.
	[resource := .
	 "... more stuff ..."
	] ensure: [resource ifNotNil: [resource release]]

instead? The only thing you gain by doing this is that you cover the microscopically small chance that the computation gets terminated right after assigning to the resource variable but before entering the ensured block.

"A microscopically small chance" is usually defined as "it has never happened to me and I don't care if it happens to you." Let me give an anecdote with names removed:
sold a mainframe, together with a number of machines to act as terminal controllers. had serious problems; the machines had to be rebooted at least once a day, sometimes several times, to unwedge them. eventually traced this to a timing window in the OS kernal for the machines. Their decision: "nobody else tries to put this much load on their machines so we're not going to fix the bug, you will have to buy some more of our machines." Oh yes, it was who recommended that configuration in the first place. I _think_ they eventually provided an extra machine free, but I'm not sure about that.

The lesson I learned from that is "microscopically unlikely bugs WILL happen" (or to put it in Discworld terms, one in a million chances come off nine times out of ten).

A similar anecdote: a Prolog system I once maintained (NOT Quintus Prolog) had a subtle timing window: if you pressed the interrupt key twice in quick succession you could trigger a bug that left some core data structures (think database, think stdio) scrambled. I did my best to fix that bug, but the stdio problem needed source access which I didn't have.

The

	resource := nil.
	[resource := allocate....]
	ensure: [resource ifNotNil: [resource release]]

pattern can't do, and isn't intended to do, anything about errors in the allocation code, but it DOES close a timing window which WILL give rise to an observed failure some day.

From: Finalization is not resource management
Andreas, doesn’t [your approach] assume that the only thing that happens inside the resource allocation method is in fact the resource allocation. For instance, when opening a file stream, the file is also registered for finalization. If this fails (for whatever reason), the resource has been allocated, but will possibly cause a walkback, and closing the walkback will not release the resource.

From: Andreas Raab
[Jon’s ]You are precisely right. But if you look at, say:

    [file := FileStream newOnlyFileNamed: 'foo.bar'.
    ] ensure:[file close].

then, if the registrations fails inside #readOnlyFileNamed: you wouldn't be able to release the resource by doing "file close" simply because the variable "file" has not been assigned yet. In other words, at the point where your registration fails and you get your walkbalk the value of the variable "file" is nil - and therefore the operation "file close" can't do anything either.

You can try it for yourself if you wish - just evaluate something like

    [file := (FileStream newFileNamed: 'foo.bar') halt.
    ] ensure:[file close].

which will simulate a "right hand side error" (e.g., any error inside the resource creation method) and look at the "file" variable inside the debugger. As you will see it's nil (since the RHS expression hasn't been completed yet) and closing the debugger will get you a DNU for nil>>close. And yet, the resource has been allocated, but the above pattern (e.g., putting resource creation inside the ensured block) is simply unable to handle this kind of problem.

That's why I keep saying that there's really no point in putting the resource creation inside the ensured block. And yes, I must be explaining myself really, really poorly here ;-)

From: Brent Pinkney
This is what I often do:

  [file := FileStream newFileNamed: 'foo.bar'.
     ] ensure:
		[ file ifNotNilDo: [ :x | x close ] ].

Does this not catch all the possibilities ?

From: Andreas Raab
No. That's why I find the style of putting the resource creation inside the ensured block so terribly misleading. It puts you under the impression that you caught all the possibilities, where it is easy to see from an example like

    [file := (FileStream newFileNamed: 'foo.bar') halt.
    ] ensure:[file close].

that you do not catch anything that happens on the right hand side of the assignment. IOW, execute the above and close the resulting debugger - it will end up with a DNU for file close which illustrates that - although the file has been created - it is not being cleaned up after encountering the halt.

From: Brent Pinkney
Ahhh - I think a light has just gone on.

Am I correct in understanding that the assignment operator should instead be thought of as a function call (as it can be in C++) which itself can fail AFTER the rhs has been successfully evaluated but before the lhs has been set to point to the result of said rhs computation ?

In the above example, the rhs whould have created a file but the lhs (file) would never be set to reference it.
So the ensure: block would have no reference to the file which still needs to be closed.

From: Jesse Welton
Andreas, you correctly point out that this failure mode [“anything that happens on the right hand side of the assignment”] cannot be caught and handled. If I understood Craig, the failure mode he was referring to would be simulated by

    (file := (FileStream newFileNamed: 'foo.bar')) halt.
    [ file doStuff ] ensure: [ file close ].

Here, the failure can be caught by moving the file creation into the block. You argue that this is statistically unlikely, but as Craig and Richard point out, this kind of thing can and does happen occasionally. So, while moving the creation code into the block cannot catch all possible failure modes, it can catch a failure mode which having the creation outside the block cannot, albeit a rare one. The cost for this is having to be able to cope with a nil file variable in the ensure block, in case of the more likely failure mode you mention. Apparently, the two sides in this argument differ in the weight they give to each of these factors.

From: Lex Spoon
[Regarding Jesse’s point that “the failure simulated by

     (file := (FileStream newFileNamed: 'foo.bar')) halt.
     [ file doStuff ] ensure: [ file close ].

...can be caught by moving the file creation into the block.]
Why would this failure mode happen? The only time I can see is if you are sending #terminate to your processes from code, and you should not do that.

If you are talking about people pressing alt-., then I think you just have to live with an occasional stray resource. That seems perfectly reasonable; run a garbage collection if need be.

If you are doing #terminate, then you cannot win. Even if you do this version:

    [ file := "open a file" ]
    ensure: [ "close the file if non-nil" ]

you still do not catch it if the file gets opened, and the file-open method is returning, but the interrupt happens before the return statement completes.