Multilingual support - unedited stuff



Last updated at 11:24 am UTC on 7 August 2002

Temporary page for saving unedited stuff of the Unicode support discussion of the Squeak mailing list.

Please feel free to move it to wherever you think it belongs.

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
[23-Sep-1999 / Robert P. Jarvis]
Conclusion
Your code added methods to String which then reflected back to the stored
objects. IMO this would add considerably to the protocol which String (or
GeneralString, or whatever) would have to support. Many of the comments
made about e.g. the bi-directional nature of Hebrew (which is, I think, more
an issue of display than of storage), the differing word-break
conventions in other languages, etc., indicate to me that String isn't as
badly broken as some may want to see it, but that subclasses are needed to
handle these special cases. I don't think one single, all-purpose,
FinalUltimateSuperString class which can handle all the possible special
cases is desirable or doable. But until someone actually sits down and
starts cutting code it's just talk anyway.

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Title Unicode support
Author agree@carltonfields.com

From: agree@carltonfields.com
Date: Tue, 21 Sep 1999 11:23:59 -0400
Subject: RE: Unicode support

Smalltalk-80 proved that we don't need to conceptually special-case the most common scenario to have full functionality AND efficiency. The Number hierarchy, and particularly the Integer hierarchy, is a case in point: Smalltalk seamlessly integrates the special case (SmallInteger) as a breathtakingly fast and almost cost-free operation, while providing broad flexibility for the more general case. The Number architecture provides the essential functionality upon which the rest is based.

Marcel seems to have this right by focusing on the essential question: "What is a string?" Once we have the essential protocol, the rest revolves around making an intelligent hierarchy, with an eye toward making the special case or cases (ASCII/UTF or whatever) efficient as hell and cost-free in terms of function.

We have already seen a number of extensions of String (Text), and the experiment was worth watching. OpenDocs provides another model. What we need to do is think bigger first: what is the essence of the String? Then, how do we provide all the encodings (and the conversions between them) within that framework, hopefully seamlessly and very, very fast whenever that is possible and desired?
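
To make the Integer analogy concrete, here is a minimal sketch, written in Python for brevity rather than in Squeak. Everything in it is an assumption made for illustration: the names AbstractString, ByteString, WideString and make_string are hypothetical and come from neither the thread nor any Squeak image. The idea is that a small essential protocol (size, code point at an index) lives in the abstract class, a compact one-byte-per-character representation serves the common case, and results move to a wider representation only when needed, much as SmallInteger arithmetic overflows into the large integer classes.

```python
from abc import ABC, abstractmethod
from array import array


class AbstractString(ABC):
    """Hypothetical essential protocol: an indexable sequence of code points."""

    @abstractmethod
    def size(self):
        """Number of characters in the string."""

    @abstractmethod
    def code_point_at(self, index):
        """Integer code point of the character at a zero-based index."""

    def code_points(self):
        return [self.code_point_at(i) for i in range(self.size())]

    def concat(self, other):
        # The result picks its own representation, so callers never
        # need to know which concrete class they are holding.
        return make_string(self.code_points() + other.code_points())

    def __eq__(self, other):
        # Equality is defined on content, not on representation.
        return (isinstance(other, AbstractString)
                and self.code_points() == other.code_points())

    def __hash__(self):
        return hash(tuple(self.code_points()))


class ByteString(AbstractString):
    """Compact special case: one byte per character (code points 0 to 255)."""

    def __init__(self, code_points):
        self._bytes = bytes(code_points)  # raises ValueError above 255

    def size(self):
        return len(self._bytes)

    def code_point_at(self, index):
        return self._bytes[index]


class WideString(AbstractString):
    """General case: one machine word (at least 32 bits) per character."""

    def __init__(self, code_points):
        self._units = array("L", code_points)

    def size(self):
        return len(self._units)

    def code_point_at(self, index):
        return self._units[index]


def make_string(code_points):
    """Choose the cheapest representation that can hold the content."""
    code_points = list(code_points)
    if all(cp <= 0xFF for cp in code_points):
        return ByteString(code_points)
    return WideString(code_points)


def from_text(text):
    """Convenience constructor from a native Python string."""
    return make_string(ord(ch) for ch in text)
```

With this sketch, from_text('Hi') yields a ByteString, while concatenating it with a string holding a code point above 255 yields a WideString without the caller doing anything special. Display concerns such as bidirectional text or word breaking are deliberately left out, since they sit above this storage-level protocol.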