Requirements for String classes
Last updated at 4:28 pm UTC on 15 September 2003
Author "Andrew C. Greenberg"
Date: Wed, 22 Sep 1999 02:09:06 -0400
Subject: Unicode support
- Strings are somewhat homogeneous;
- Strings are composed of elements of a Flyweight pattern class;
- Many of the "usual" things we do with a string are determined by the underlying character class.
Perhaps we can study what things we really do with strings, at least
to enumerate them, and see how they matter.
Questions
- What do strings do?
- What must they do?
- Which of the following are necessary, mandatory, or even useful? Which depend upon the class?
- support a (partial or total) linear ordering (=, <, >)
- substring
- catenation
- indexing
- collecting, selecting, allAre, someIs, based upon blocks
that are passed (conceptually) to individual objects.
- clumping and declumping (word-based, delimiter-based, token-based)
- random access out (not as a special case of substring, but pulling the character object out of the class as well)
- random access in (not as a special case of catenation, but validating and/or coercing the character on the way in)
- the notion and creation of a null string as the identity element for many of the preceding operations
- searching for substrings.
- sizing
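To make the list above concrete, here is a minimal Python sketch of how several of these operations could be delegated to the character class. The names `Char`, `Str`, `collect`, `select`, `all_are`, `some_is` and `find` are hypothetical, chosen only for illustration; nothing here is an existing library or a proposed design from the thread.

```python
from functools import total_ordering


@total_ordering
class Char:
    """Hypothetical character class wrapping a single code point."""

    def __init__(self, value):
        # "Random access in": validate or coerce the character on the way in.
        if isinstance(value, Char):
            value = value.code
        elif isinstance(value, str):
            if len(value) != 1:
                raise ValueError("expected a single character")
            value = ord(value)
        self.code = int(value)

    def __eq__(self, other):
        return isinstance(other, Char) and self.code == other.code

    def __lt__(self, other):
        return self.code < other.code

    def __hash__(self):
        return hash(self.code)


class Str:
    """Hypothetical string: an immutable sequence of Char objects."""

    def __init__(self, items=()):
        self._chars = tuple(Char(c) for c in items)

    def __len__(self):  # sizing
        return len(self._chars)

    def __getitem__(self, i):  # indexing and substring
        if isinstance(i, slice):
            return Str(self._chars[i])
        return self._chars[i]

    def __add__(self, other):  # catenation; Str() is the identity
        return Str(self._chars + other._chars)

    def __eq__(self, other):
        return isinstance(other, Str) and self._chars == other._chars

    def __lt__(self, other):  # ordering is decided by Char comparisons
        return self._chars < other._chars

    def collect(self, block):
        return Str(block(c) for c in self._chars)

    def select(self, block):
        return Str(c for c in self._chars if block(c))

    def all_are(self, block):
        return all(block(c) for c in self._chars)

    def some_is(self, block):
        return any(block(c) for c in self._chars)

    def find(self, sub):  # searching for substrings; -1 when absent
        n = len(sub)
        for start in range(len(self) - n + 1):
            if self._chars[start:start + n] == sub._chars:
                return start
        return -1
```

In this sketch, comparison, validation and coercion all live in `Char`, while `Str` only supplies the sequence behaviour, which is one way to read the claim that most of what we do with a string is determined by the underlying character class.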
I don't know that I agree with those who believe that a string must
be "growable/shrinkable". Perhaps we should consider making strings
that are length-immutable, per Python. Doing so can facilitate other
operations that I have now come to think of as string-like, in
particular slicing and index-shifting, by creating proxy objects
on the original string that share changes to the underlying
content. Size-shifting operations can still be permitted, but they
create copies of the original, and changes to those copies do not
affect the content of already-taken slices.
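The proxy idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `FixedString` and `SliceView` are hypothetical names, only contiguous slices are handled, and Python's built-in `str` does not behave this way (its slices are copies).

```python
class FixedString:
    """Hypothetical length-immutable string: content may change, size may not."""

    def __init__(self, text):
        self._buf = list(text)

    def __len__(self):
        return len(self._buf)

    def __getitem__(self, i):
        if isinstance(i, slice):
            start, stop, step = i.indices(len(self._buf))
            if step != 1:
                raise ValueError("only contiguous slices are sketched here")
            # A slice is a proxy onto this string, not a copy.
            return SliceView(self, start, max(stop, start))
        return self._buf[i]

    def __setitem__(self, i, ch):
        if isinstance(i, slice) or len(ch) != 1:
            raise TypeError("only single-character replacement is allowed")
        self._buf[i] = ch  # in-place change; the length never changes

    def __add__(self, other):
        # Size-shifting operation: returns an independent copy.
        return FixedString(str(self) + str(other))

    def __str__(self):
        return "".join(self._buf)


class SliceView:
    """Hypothetical proxy that shares content with its base FixedString."""

    def __init__(self, base, start, stop):
        self._base, self._start, self._stop = base, start, stop

    def __len__(self):
        return self._stop - self._start

    def _index(self, i):
        # Translate a view-relative index into an index on the base string.
        if i < 0:
            i += len(self)
        if not 0 <= i < len(self):
            raise IndexError("slice index out of range")
        return self._start + i

    def __getitem__(self, i):
        return self._base[self._index(i)]

    def __setitem__(self, i, ch):
        self._base[self._index(i)] = ch  # writes through to the base

    def __add__(self, other):
        return FixedString(str(self) + str(other))

    def __str__(self):
        return "".join(self[i] for i in range(len(self)))
```

With `s = FixedString("hello world")` and `v = s[0:5]`, assigning `s[0] = "J"` is visible through `v`, while `v + "!"` yields a separate `FixedString` that later changes to `s` do not touch.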
Consider also the AppleScript container notion.
Date: Tue, 21 Sep 1999 19:12:39 -0500
From: Joe Davison
Subject: Unicode support
I tend to agree that the root of the problem is that we lack a good abstract definition of String, and I wonder if we really can have one that's much more specific than ArrayOfCharacter. Besides the obvious
- comparisons,
- concatenation and
- substring operations
things we often like to do with strings in various languages are
- pattern matching with regular expressions,
- creating token streams,
- storing them in files,
- reading them from files,
- interpreting them as a statement in some human or computer language
But most of those operations might properly be done on objects that Strings are convertible to/conformable with.
Do we really want a regular expression capability for GeneralString? Maybe so: certainly Perl's implementation of regular expression functions on character strings is a major advantage over many other languages. I suppose one might want two methods
GeneralString>>asRegularExpression (the compiler) and
GeneralString>>match: aRegularExpression (the interpreter),
hypothesizing a RegularExpression class, which would have to be pretty tricky with GeneralCharacters...
joe
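The compiler/interpreter split suggested above might look roughly like the following Python sketch. It leans on Python's standard `re` module, which matches over the code points of a built-in `str`, so it sidesteps the GeneralCharacter difficulty rather than solving it. The class and method names are hypothetical renderings of the selectors in the post, and treating `match:` as a whole-string match is an assumption.

```python
import re


class RegularExpression:
    """Hypothetical compiled pattern, produced by as_regular_expression."""

    def __init__(self, source):
        self.source = source
        self._compiled = re.compile(source)  # the "compiler" step

    def matches(self, text):
        # Assumption: a match must cover the whole string.
        return self._compiled.fullmatch(text) is not None


class GeneralString:
    """Hypothetical string wrapper exposing the two proposed methods."""

    def __init__(self, text):
        self.text = text

    def as_regular_expression(self):
        # Counterpart of GeneralString>>asRegularExpression.
        return RegularExpression(self.text)

    def match(self, regex):
        # Counterpart of GeneralString>>match: (the "interpreter" step).
        return regex.matches(self.text)
```

For example, `GeneralString(r"[a-z]+\d*").as_regular_expression()` gives a pattern that `GeneralString("squeak3").match(pattern)` accepts and `GeneralString("Squeak").match(pattern)` rejects.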