Squeak
Normalization test cases (Unicode)
Last updated at 9:09 pm UTC on 18 December 2015
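The database of normalization test cases is a plain-text file, NormalizationTest.txt. Each test case is one line of fields separated by semicolons; each field is a sequence of hexadecimal code points separated by spaces, and everything after a # is a comment.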

The five columns (c1, c2, ..., c5) have the following meaning:
c1 = source; c2 = NFC(c1); c3 = NFD(c1); c4 = NFKC(c1); c5 = NFKD(c1)
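
To make the column layout concrete, here is a minimal Python sketch of how one line of the file could be split into its five columns. The function name `parse_line` is hypothetical (it is not part of Squeak or of the Unicode data files), and the sketch assumes that lines starting with # or @ carry no test data.

```python
import unicodedata

def parse_line(line):
    """Return the columns c1..c5 of a test line as five strings.

    Returns None for blank lines, comment lines and @Part headers.
    """
    data = line.split("#", 1)[0].strip()   # drop the trailing comment
    if not data or data.startswith("@"):
        return None
    fields = data.split(";")[:5]           # ignore the empty field after the last ';'
    if len(fields) < 5:
        raise ValueError("expected five fields: %r" % line)
    # Each field is a space-separated list of hexadecimal code points.
    return ["".join(chr(int(cp, 16)) for cp in field.split())
            for field in fields]

sample = "1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE"
source, nfc, nfd, nfkc, nfkd = parse_line(sample)
print(unicodedata.normalize("NFD", source) == nfd)
```

The sketch compares against Python's own `unicodedata` module only for illustration; a Squeak implementation would substitute its own normalization routines at that point.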

The latest version of the file is available at
http://www.unicode.org/Public/UCD/latest/ucd/NormalizationTest.txt

An abridged excerpt:

 @Part0 # Specific cases
 #
 1E0A;1E0A;0044 0307;1E0A;0044 0307; # (Ḋ; Ḋ; D◌̇; Ḋ; D◌̇; ) LATIN CAPITAL LETTER D WITH DOT ABOVE
 ...
 @Part1 # Character by character test
 # All characters not explicitly occurring in c1 of Part 1 have identical NFC, D, KC, KD forms.
 #
 00A0;00A0;00A0;0020;0020; # ( ;  ;  ;  ;  ; ) NO-BREAK SPACE
 00A8;00A8;00A8;0020 0308;0020 0308; # (¨; ¨; ¨;  ◌̈;  ◌̈; ) DIAERESIS
 ...
 more of part 1
 ...
 @Part2 # Canonical Order Test
 #
 0061 0315 0300 05AE 0300 0062;00E0 05AE 0300 0315 0062;0061 05AE 0300 0300 0315 0062;00E0 05AE 0300 0315 0062;0061 05AE 0300 0300 0315 0062; # (a◌̕◌̀◌֮◌̀b; à◌֮◌̀◌̕b; a◌֮◌̀◌̀◌̕b; à◌֮◌̀◌̕b; a◌֮◌̀◌̀◌̕b; ) LATIN SMALL LETTER A, COMBINING COMMA ABOVE RIGHT, COMBINING GRAVE ACCENT, HEBREW ACCENT ZINOR, COMBINING GRAVE ACCENT, LATIN SMALL LETTER B
 0061 0300 0315 0300 05AE 0062;00E0 05AE 0300 0315 0062;0061 05AE 0300 0300 0315 0062;00E0 05AE 0300 0315 0062;0061 05AE 0300 0300 0315 0062; # (a◌̀◌̕◌̀◌֮b; à◌֮◌̀◌̕b; a◌֮◌̀◌̀◌̕b; à◌֮◌̀◌̕b; a◌֮◌̀◌̀◌̕b; ) LATIN SMALL LETTER A, COMBINING GRAVE ACCENT, COMBINING COMMA ABOVE RIGHT, COMBINING GRAVE ACCENT, HEBREW ACCENT ZINOR, LATIN SMALL LETTER B
 ...
 @Part3 # PRI #29 Test
 #
 09C7 0334 09BE;09C7 0334 09BE;09C7 0334 09BE;09C7 0334 09BE;09C7 0334 09BE; # (ে◌̴া; ে◌̴া; ে◌̴া; ে◌̴া; ে◌̴া; ) BENGALI VOWEL SIGN E, COMBINING TILDE OVERLAY, BENGALI VOWEL SIGN AA
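
A test line is normally checked against the conformance invariants stated in the header of NormalizationTest.txt (not reproduced in the excerpt above). The following Python sketch spells them out. The names `columns`, `failed_forms` and `is_stable` are hypothetical, and Python's `unicodedata.normalize` stands in for the implementation under test, so the result depends on the Unicode version of the Python build being used.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def columns(line):
    """Decode the five hex fields of one data line into the strings c1..c5."""
    data = line.split("#", 1)[0].strip()
    return ["".join(chr(int(cp, 16)) for cp in field.split())
            for field in data.split(";")[:5]]

def failed_forms(c1, c2, c3, c4, c5):
    """Return the names of the normalization forms whose invariants fail."""
    norm = unicodedata.normalize
    ok = {
        # c2 == NFC(c1) == NFC(c2) == NFC(c3);  c4 == NFC(c4) == NFC(c5)
        "NFC": all(norm("NFC", c) == c2 for c in (c1, c2, c3))
               and all(norm("NFC", c) == c4 for c in (c4, c5)),
        # c3 == NFD(c1) == NFD(c2) == NFD(c3);  c5 == NFD(c4) == NFD(c5)
        "NFD": all(norm("NFD", c) == c3 for c in (c1, c2, c3))
               and all(norm("NFD", c) == c5 for c in (c4, c5)),
        # c4 == NFKC(c) for every column c
        "NFKC": all(norm("NFKC", c) == c4 for c in (c1, c2, c3, c4, c5)),
        # c5 == NFKD(c) for every column c
        "NFKD": all(norm("NFKD", c) == c5 for c in (c1, c2, c3, c4, c5)),
    }
    return [form for form in FORMS if not ok[form]]

def is_stable(ch):
    """Part 1 rule: a character that does not occur in c1 of Part 1
    must be equal to all four of its normalization forms."""
    return all(unicodedata.normalize(form, ch) == ch for form in FORMS)

row = columns("00A8;00A8;00A8;0020 0308;0020 0308; # DIAERESIS")
print(failed_forms(*row))                    # an empty list means the line passes
print(is_stable("A"), is_stable("\u00A8"))
```

For a full run one would apply `failed_forms` to every data line and `is_stable` to every code point that never appears alone in c1 of Part 1 (skipping surrogates, which cannot be normalized as standalone characters).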