Squeak
String asDecomposedUnicode
Last updated at 10:01 pm UTC on 10 December 2015
The class String has the method

 asDecomposedUnicode
 "Convert the receiver into a decomposed Unicode representation.
 	Optimized for the common case that no decomposition needs to take place."
 	| lastIndex nextIndex out decomposed |
 	lastIndex := 1.
 	nextIndex := 0.
 	[(nextIndex := nextIndex+1) <= self size] whileTrue:[
 		decomposed := Unicode decompose: (self at: nextIndex).
 		decomposed ifNotNil:[
 			lastIndex = 1 ifTrue:[out := WriteStream on: (String new: self size)].
 			out nextPutAll: (self copyFrom: lastIndex to: nextIndex-1).
 			out nextPutAll: decomposed.
 			lastIndex := nextIndex+1.
 		].
 	].
 	^out ifNil:[self] ifNotNil:[
 		out nextPutAll: (self copyFrom: lastIndex to: self size).
 		out contents]
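
The two paths through the method can be tried in a workspace. This is a minimal sketch, not part of the original page; the variable names are made up for illustration, and the results in the comments follow from reading the method above rather than from a recorded run.

```smalltalk
| plain mixed |

"Fast path: no character has a decomposition, so no stream is created
 and the receiver itself should be answered."
plain := 'abc'.
plain asDecomposedUnicode == plain.        "should answer true"

"Slow path: 'caf' is copied unchanged, then the decomposition of
 é (U+00E9) is appended, so the result is one character longer."
mixed := 'caf' , (String with: (Character value: 233)).
mixed asDecomposedUnicode size.            "should answer 5"
```

The identity test (==) is deliberate: it shows that the common case does not allocate a new string.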


The method
 Unicode decompose: aCharacter
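uses the Decompositions class variable of Unicode. This variable is initialized from UnicodeData.txt, using the Unicode Decomposition Mapping property. For a character without a decomposition the method answers nil, which is why asDecomposedUnicode tests the result with ifNotNil:.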
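
The lookup can also be called directly. The sketch below assumes, as the use of nextPutAll: in asDecomposedUnicode suggests, that a non-nil answer is a sequence of characters; it is an illustration, not a description of the exact return class.

```smalltalk
"ASCII letters have no decomposition mapping in UnicodeData.txt,
 so nil is expected here."
Unicode decompose: $a.

"U+00E9 (é) has the mapping 0065 0301 in UnicodeData.txt.
 Convert the answer to code points to make it readable."
(Unicode decompose: (Character value: 233))
	ifNotNil: [:seq | seq asArray collect: [:c | c asInteger]].
```

Character value: 233 is used instead of a literal $é so that the snippet does not depend on the encoding of the source text.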


Example:

'ö' asDecomposedUnicode
'é' asDecomposedUnicode asOrderedCollection collect: [:code | code asInteger] 

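an OrderedCollection(101 769)

That is, the letter e (code point 101) followed by the combining acute accent U+0301 (code point 769).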
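
The first example can be inspected in the same way. A small sketch, assuming the Decompositions table contains the UnicodeData.txt entry for U+00F6 (mapping 006F 0308); the temporary variable s is only for illustration.

```smalltalk
| s |
s := String with: (Character value: 246).    "ö, U+00F6"

"Answer the code points of the decomposed string.
 With the mapping 006F 0308 this would be 111 (o) and 776 (combining diaeresis)."
s asDecomposedUnicode asArray collect: [:c | c asInteger].
```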

more details

Notes


See also

String asPrecomposedUnicode