History of the Character Recogniser
Last updated at 12:14 pm UTC on 31 October 2003
This is a combination of three emails sent to the Squeak Mailing Lists by Alan Kay on 7 November 1998 - Russell Allen
Interaction Between Keyboard and Mouse
(Daniel Vainsencher wrote:
I'm a keyboard type of person, and I generally don't like the mouse at all, but I hate switching between them even more. Using Smalltalk, and Squeak specifically, requires almost constant use of the mouse. Does anyone here have any experience with things like Twiddlers or such? The idea of a device that combines the mouse and keyboard into one device, not requiring the annoying context switch, sounds great in theory. Any "but"s?)
Long ago, Doug Engelbart came up with both a theory for such interaction, and a particular instance of the theory.
The theory was that you should be able to stay at either the keyboard or the pointing device for long periods of time without having to switch very often.
His instance of the theory was to have a five-finger "chord" keyboard for the non-mouse hand. This, plus two of the three buttons on the mouse, allowed 127 characters to be typed, which included both text and commands. (The other mouse button was "command accept".) So one typically would have "hands out" on the mouse and chord keyboard for most navigation, commands, correction of typos, and short text inputs. When it came time to type a whole paragraph, the hands would come in to the keyboard. I got fluent at this in the late sixties, and liked it. The PARC Alto came with the chord keyboards, but they didn't catch on.
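The figure of 127 is consistent with treating the five chord keys and the two mouse buttons as seven independent up/down switches, which gives 2^7 - 1 non-empty combinations. The short Python sketch below only checks that arithmetic. The names `CHORD_KEYS`, `MOUSE_SHIFT_BUTTONS` and `count_chords` are illustrative assumptions, and nothing here describes how Engelbart's system actually assigned characters to chords.

```python
from itertools import product

# Assumption: each of the 7 inputs (5 chord keys + 2 mouse buttons) is
# simply up or down, and the "nothing pressed" state does not count.
CHORD_KEYS = 5
MOUSE_SHIFT_BUTTONS = 2


def count_chords(inputs: int) -> int:
    """Count the non-empty combinations of `inputs` binary switches."""
    states = product((0, 1), repeat=inputs)
    return sum(1 for state in states if any(state))


total = count_chords(CHORD_KEYS + MOUSE_SHIFT_BUTTONS)
keyset_only = count_chords(CHORD_KEYS)
print(total)        # 127
print(keyset_only)  # 31
```

On this reading, the five-key set alone yields 31 chords, and the two mouse buttons act like shift keys that multiply the available codes.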
In the early Dynabook design, Engelbart's idea was adapted to a pen-based interface. RAND had done GRAIL, a really great pen-based system with a recognizer similar in spirit to, and better than, Graffiti. We realized from experience with Engelbart that even a perfect recognizer (which GRAIL almost was) was still too slow for some interactions, so I put a keyboard on the Dynabook model. Again the scheme was to navigate, give commands, fix typos, and do short text inputs with the pen, and then to switch to the keyboard for intensive text entry.
This scheme was argued for on the Newton and rejected (not on logical grounds).
GRAIL in Squeak
Now, it happens that there is a GRAIL-type character recognizer lurking in Squeak that I did in the earliest days for fun and to provide some benchmarks. It is trainable, etc. The class is called CharRecog. The code is very short and should be clear.
I don't think anything makes GRAIL unique except that it was the first really good and comprehensive all-pen-based GUI. RAND invented the tablet (in 1964) in order to do such an interface. The recognizer was done in '66 by Gabe Groner, and fit into 2k words of the single-user 360/44 that was used as the host machine. Their basic notion was that whatever a UI does, it should do it essentially perfectly, because it is so jarring to have something that is supposed to feel like a transparent extension of one's body constantly stop the action with errors. So they opted for a recognizer that the user would have to learn – and eventually decided to try to do everything with single strokes for speed. The result was really great, and the totality of human factors was better than Graffiti on the Palm Pilot.
The one I did that is now in Squeak comes from a long line of recognizers that go all the way back to GRAIL. It is only a page of Squeak code, and you can see that the GRAIL approach was quite elegant and simple.
GRAIL and Graffiti
As you might expect, I don't have any big interest in trying to fit Graffiti into GRAIL (since the latter was better on all important counts). The GRAIL recognizer doesn't need any case shifts – and characters like x can be done with one stroke. That being said, the scheme does allow for multiple strokes, and several of the recognizers we did at PARC were of this type – most of them were descendants of the Ledeen-type recognizer (look at Newman and Sproull version 1). Having tried both, I am a big fan of one stroke, because the recognizer doesn't have to wait to decide if you are done – and it is even possible to recognize on the fly so you don't even have to finish the stroke (one of the GRAIL variants at RAND was of this type).
GRAIL and Cursive
The Squeak/GRAIL recognizer can handle single character cursives pretty well – and most of the chars I trained it with are cursive.
But connected cursive is an entirely different matter, and good solutions are of an entirely different scale. The Rosetta Newton recognizer (the good connected cursive one) was done by Larry Yaeger, was very large, and used neural net technology – the problem is abstractly similar to connected speech recognition, and good solutions use similar methods.
The GRAIL approach is dog-simple, but elegant. It does require user training, but this is somewhat alleviated by it being trainable as well.