Introduction to Collections
Last updated at 9:20 am UTC on 16 October 2019
This page contains a simple walk-through of the collection classes with some basic examples. It is meant to be a part of the Squeak Tutorial being (slowly) written by the documentation team.
At some point in any program you'll need to gather objects together and manipulate them as a group. Squeak provides a set of collection classes designed to help in these situations. Each class is particularly appropriate for certain uses, and part of the challenge in learning Smalltalk is learning which class to choose in which situation. This section describes the most commonly used of these classes and suggests when they should be used and when they would be inappropriate. All are subclasses of the class Collection, where their basic behaviour is defined. If you open a browser on class Collection and examine it and its various subclasses you will soon see what a wide range of tools you have available.
The class Collection is the superclass of all the collection classes. In this class are defined messages that all types of collection share although some of the collection classes override the definition provided in class Collection to provide a version of their own more suited to their purpose. The collections hierarchy is a good example of inheritance in action.
An array is a collection of fixed size. You define the number of elements it can hold when you create it and cannot resize it later. This is a disadvantage when you cannot be sure how the array will be used, but it has the advantage that such collections are very fast and efficient with memory. Because of these restrictions you should probably use arrays only when you understand the problem well and/or speed and memory usage are extremely important. Remember that Arrays and OrderedCollections understand many of the same messages, so you may be able to write your program using OrderedCollections and only later, once the problem is well understood, change to arrays if there is an advantage to be had. Arrays have a literal syntax that allows you to create them without having to use 'Array new' or similar messages.
You can create an Array using the literal syntax
myArray := #('Hello' 'Goodbye').
You can check that this has indeed created an Array
myArray class. => Array
You can access an element of the array by its position
myArray at: 2. => 'Goodbye'
I can change the contents of an array
myArray at: 2 put: 'Adios'.
myArray at: 2. => 'Adios'
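The fixed size described above can also be seen when an Array is created with 'Array new:' instead of the literal syntax. The following is only a minimal sketch to evaluate in a workspace; the variable name slots is made up for illustration.

```smalltalk
| slots grown |
slots := Array new: 3.            "three slots, each initially nil"
slots at: 1 put: 'Hello'.
slots at: 3 put: 'Goodbye'.
slots size.                       "=> 3"
slots at: 2.                      "=> nil, nothing has been stored there yet"
"An Array cannot grow, so copy it into an OrderedCollection to add more."
grown := slots asOrderedCollection.
grown add: 'Adios'.
grown size.                       "=> 4"
slots size.                       "=> 3, the Array itself is unchanged"
```

Asking for an index outside 1 to 3, for example 'slots at: 4 put: ...', signals an error rather than enlarging the Array.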
A String is similar to an Array (in Squeak both inherit from the same superclass, ArrayedCollection) and behaves similarly, but it can only hold Characters. Strings are one of the classes which have a literal syntax. This means that you can create them using this special syntax without having to use 'String new' or similar messages. Strings are used so often that this makes code much easier to read.
I can create a String using the literal syntax
myString := 'This is some text'.
Note that a literal string uses single quotes and that double quotes indicate a comment.
I can later retrieve a character by its position
myString at: 3. => $i
Because Strings are a type of Collection, many of the commonly used messages work as expected. Many other languages treat Strings as a special data type with its own functionality which has to be learned separately from collection style datatypes.
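As a small illustration of Strings answering ordinary collection messages, the sketch below sends a few of them to the string used earlier. The selection of messages is my own choice, not a list from the tutorial.

```smalltalk
| text |
text := 'This is some text'.
text size.                                   "=> 17"
text first.                                  "=> $T"
text includes: $x.                           "=> true"
text reversed.                               "=> 'txet emos si sihT'"
text select: [:each | each isVowel].         "=> 'iioee'"
text collect: [:each | each asUppercase].    "=> 'THIS IS SOME TEXT'"
```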
An OrderedCollection is probably the most used Collection class. Unlike an array, you don't need to worry about defining its size ahead of time: if you add or remove objects the collection resizes itself automatically. There are therefore several messages that you can use to add and delete elements.
You can create an OrderedCollection as shown below. Unfortunately there is no literal syntax, but there is a shortcut, shown afterwards. Note the use of a cascade to add elements at the same time the OrderedCollection is created.
myOC := OrderedCollection new add: 'Hello'; add: 'Goodbye'; yourself.
And now using the shortcut
myOC := #('Hello' 'Goodbye') asOrderedCollection.
Many of the collection classes support such messages allowing you to change between them easily and in this case we took advantage of the array literal syntax to create our OrderedCollection.
I can access the contents by position, just like an Array.
myOC at: 1. => 'Hello'
I can add a new element to my OrderedCollection. The collection will expand automatically and I don't need to worry about its size.
myOC add: 'Adios'.
myOC. => an OrderedCollection('Hello' 'Goodbye' 'Adios')
And I can remove elements depending on their position.
myOC removeAt: 2.
myOC. => an OrderedCollection('Hello' 'Adios')
I can see how many elements my OrderedCollection currently holds.
myOC size. => 2
And I can create a new OrderedCollection from the current collection.
myNewOC := OrderedCollection newFrom: myOC.
The OrderedCollection class is very flexible and performs very reasonably. For this reason it is probably best to begin with OrderedCollections rather than more restrictive types such as Arrays, only changing if performance is found to be wanting.
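Besides add: and removeAt:, an OrderedCollection understands messages for working at either end and for removing a particular element. A minimal sketch, with queue as an illustrative variable name:

```smalltalk
| queue |
queue := OrderedCollection new.
queue add: 'Goodbye'.             "add: appends at the end"
queue addFirst: 'Hello'.          "addFirst: inserts at the front"
queue addLast: 'Adios'.           "addLast: is the explicit form of add:"
queue size.                       "=> 3"
queue removeFirst.                "=> 'Hello'"
queue removeLast.                 "=> 'Adios'"
queue remove: 'Goodbye'.          "remove: finds the element by equality"
queue isEmpty.                    "=> true"
```

If the element passed to remove: is not present an error is signalled; remove:ifAbsent: lets you supply a block to evaluate instead.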
All the collection classes previously mentioned are indexed by an integer. By this I mean that you refer to the contents by their numeric position in the collection: the 'third element' or the 'sixth element', for example. But what if we want to store items in a collection and refer to them in a different way, by name perhaps? A Dictionary allows us to do this. Say that you wanted to store a Car object under the name of its owner. You could do this with the following code.
myDictionaryOfCars := Dictionary new.
myDictionaryOfCars at: 'Sergio' put: sergiosCar.
Later, when I want to retrieve Sergio's car, I use the code:
myDictionaryOfCars at: 'Sergio'.
And this will return me the appropriate car. When storing items in a Dictionary we refer to the item stored as the 'value' and the label we used as its index as the 'key'. The key can be any type of object, although strings are often used. Note that when you are using your own classes as keys you have to ensure that you have defined #= (together with a consistent #hash) in a way that makes sense for that class, as these messages are used to locate the correct key. There is also a subclass of Dictionary called IdentityDictionary which uses the #== message to match keys.
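The sketch below shows a few more everyday Dictionary messages. It is only an illustration: plain strings stand in for the Car objects, and the owner names and descriptions are invented.

```smalltalk
| owners |
owners := Dictionary new.
owners at: 'Sergio' put: 'red hatchback'.    "placeholder value standing in for a Car"
owners at: 'Sergio'.                          "=> 'red hatchback'"
owners includesKey: 'Maria'.                  "=> false"
"at: signals an error for a missing key; at:ifAbsent: answers a fallback instead."
owners at: 'Maria' ifAbsent: ['no car on record'].
owners at: 'Sergio' put: 'blue estate'.       "an existing key has its value replaced"
owners size.                                  "=> 1"
owners removeKey: 'Sergio'.                   "=> 'blue estate'"
owners isEmpty.                               "=> true"
```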
Iterating over Collections
One of the most useful things about Collections is that they support messages that allow us to operate on all or some of their contents at once.
Assume I have created an OrderedCollection holding several numbers. Note that I'm taking advantage of the Array literal syntax to create an Array and then creating my OrderedCollection from that Array in a different way from the example given earlier. You will often find many ways to accomplish the same task.
myOC := OrderedCollection newFrom: #(1 2 3 4 5 6 7 8 9).
We can perform the same operation on every element using the message #do:
myOC do: [ :element | element doSomething ].
The #do: message takes a block that requires a single argument. This block is then evaluated for each of the elements of myOC.
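To make the placeholder doSomething concrete, here is a minimal sketch that uses #do: to add up the elements. The names numbers and sum are illustrative.

```smalltalk
| numbers sum |
numbers := OrderedCollection newFrom: #(1 2 3 4 5 6 7 8 9).
sum := 0.
"The block is evaluated once per element, in order."
numbers do: [:element | sum := sum + element].
sum.    "=> 45"
```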
Often we want to perform an operation on each element and then return the results as another collection.
myOC collect: [ :element | element + 5 ].
In this case our result will be another OrderedCollection of the same size holding all the results of the evaluation of the block for each element.
Often we want to search for an element that satisfies a particular test. In this case we can use #detect:
myOC detect: [ :element | element = 5 ].
In this case the message will return the first element for which the block returns true, or, if no element satisfies the test, it will throw an exception.
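When a missing element is an expected outcome rather than a mistake, the variant #detect:ifNone: avoids the exception by evaluating a second block instead. A short sketch:

```smalltalk
| numbers |
numbers := OrderedCollection newFrom: #(1 2 3 4 5 6 7 8 9).
numbers detect: [:element | element > 5].                     "=> 6, the first match"
numbers detect: [:element | element > 100] ifNone: [nil].     "=> nil, no error signalled"
numbers detect: [:element | element > 100] ifNone: [0].       "=> 0, any fallback will do"
```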
Where we want to check for multiple elements that satisfy a test we can use #select:
myOC select: [ :element | element > 5 ].
By using #select: you get an OrderedCollection containing all the elements for which the block returns true.
An example which uses select: and collect:
listString := '
# This is a comment line. Below are data lines.
dataLines := listString lines select: [:aLine | ((aLine beginsWith: '# ') or: [aLine withBlanksTrimmed isEmpty]) not].
(dataLines collect: [:aLine | aLine splitOn: ',']) inspect
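A self-contained version of the same idea follows. The sample data is hypothetical and invented for illustration. The sketch also swaps in reject: for the negated select:, and subStrings: for the splitting step, so it is a variation on the code above rather than a copy of it.

```smalltalk
| listString dataLines fields |
"Hypothetical sample data: one comment line, one blank line, two data lines."
listString := '# name,quantity
apples,3

pears,5'.
"Drop comment lines and blank lines."
dataLines := listString lines reject: [:aLine |
	(aLine beginsWith: '#') or: [aLine withBlanksTrimmed isEmpty]].
"Split each remaining line on commas."
fields := dataLines collect: [:aLine | aLine subStrings: ','].
fields size.             "=> 2"
fields first first.      "=> 'apples'"
fields last last.        "=> '5'"
```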
Sometimes we want to check if a Collection contains a particular element. Using detect we could write
myOC detect: [ :element | element = myTestObject ].
to find if myTestObject is indeed a member of the collection. However with the #includes: message we can write
myOC includes: myTestObject.
This is much shorter and clearer, and #includes: answers false instead of signalling an error when the element is absent. You will find that there are many variations on these common messages which allow you to perform slightly more complex operations with less code.
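A few such variations are sketched below. Which ones count as the most useful is my own judgement; browse the enumerating protocol of Collection in your image for the full set.

```smalltalk
| numbers |
numbers := OrderedCollection newFrom: #(1 2 3 4 5 6 7 8 9).
numbers includes: 5.                                            "=> true"
numbers includes: 42.                                           "=> false"
"reject: is the opposite of select:"
numbers reject: [:element | element > 5].                       "=> an OrderedCollection(1 2 3 4 5)"
"inject:into: threads a running result through every element"
numbers inject: 0 into: [:total :element | total + element].    "=> 45"
"Messages can be chained: square only the even numbers"
(numbers select: [:element | element even])
	collect: [:element | element * element].                     "=> an OrderedCollection(4 16 36 64)"
```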
Iterating over Dictionaries
Dictionaries of course are collections of key and value relations and therefore are slightly different. For this reason there are messages that only exist for Dictionaries and their subclasses.
If we just want to iterate over the keys of a dictionary we can use #keysDo:
myDict := Dictionary new.
myDict at: 'Andrew' put: 5; at: 'David' put: 10.
myDict keysDo: [ :key | key doSomething ].
There is also a #valuesDo: if we only want to iterate over the values.
myDict valuesDo: [ :value | value doSomething ].
Of course often we want to operate both on the key and its associated value. For this we have the message #keysAndValuesDo:
myDict keysAndValuesDo: [ :key :value | value doSomething ].
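The three messages can be tried together in a workspace with concrete blocks in place of doSomething. This is only a sketch; the variable names and the Transcript output format are illustrative.

```smalltalk
| scores total names |
scores := Dictionary new.
scores at: 'Andrew' put: 5; at: 'David' put: 10.
total := 0.
scores valuesDo: [:value | total := total + value].
total.          "=> 15"
names := OrderedCollection new.
scores keysDo: [:key | names add: key].
names size.     "=> 2; a Dictionary does not promise any particular key order"
"Print one line per entry on the Transcript."
scores keysAndValuesDo: [:key :value |
	Transcript show: key, ' scored ', value printString; cr].
```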
Some String messages of interest