Defining a new index type



Last updated at 3:15 pm UTC on 4 May 2018

Magma comes with just a few useful index types, but there are many ways we need to access our objects besides keywords. This page will serve as a tutorial for creating a new index type for Magma that can be used to find objects that reference particular Float values.

How indexing works in Magma

The key to MagmaCollections is a server-only class called MaHashIndex. You never have to deal with this class yourself, but it helps to understand a little about how it works.

MaHashIndex maintains a file structure that stores associations between Integers, very much like a Dictionary. The Integer "keys" of this dictionary are your "lookup values". The "values" of this dictionary are the oids of the objects that generated those keys; these are the objects that will be found when those keys are used as lookups.

When you add an object to a MagmaCollection, an entry is added to its "memberIndex" whose keys are the oids of the objects that have membership in that collection. Each index added to a MagmaCollection causes an additional MaHashIndex to be created (which means an extra file on the server), whose keys are used to look up its oid values.
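
The relationship between keys and oids can be pictured with an ordinary Dictionary. This is only a workspace sketch of the idea: the real MaHashIndex is file-based and lives on the server, and the key 42 and the oids 1001 and 1002 below are made-up values for illustration.

```smalltalk
"Workspace sketch only; a Dictionary standing in for a MaHashIndex."
| index found |
index := Dictionary new.
"Two objects (hypothetical oids 1001 and 1002) produce the same Integer key."
(index at: 42 ifAbsentPut: [ OrderedCollection new ]) add: 1001.
(index at: 42 ifAbsentPut: [ OrderedCollection new ]) add: 1002.
"Looking up by key answers every oid stored under it."
found := (index at: 42 ifAbsent: [ #() ]) asArray.
found.   "print it to see #(1001 1002)"
```

One key can therefore lead to several objects, which is why a lookup answers a reader over many results rather than a single object.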

Creating a new index type involves defining a method that answers an integer value for one particular "attribute" of the object being stored in the MagmaCollection that will be used to locate that object later.

Let's illustrate with an example. We'll be creating a Float index to find MoneyTransaction objects in a financial domain based on the amount of the transaction. Our MoneyTransaction class already has a getter called #amount that returns a Float. This is the attribute we want to index on. We will now go through the process of creating a new index type for Floats.

Step 1, Planning your Index

Before you do any coding, you need to make sure you fully understand how you intend to use your index.

I find it helpful sometimes to pretend I have the index, then write a little code that attempts to make use of it. In other words, write statements that create and then use the MagmaCollectionReader based on the attribute you want to index.

 myReader := myMoneyTransactionsCollection
   where: #amount
   from: 4.99
   to: 4.99.
 myReader at: ...

Additionally, though it doesn't affect implementation, you need to decide how large the key values need to be for your index. Larger keys allow a greater range of values; smaller keys perform faster.

Deciding your key size can involve trade-offs. For example, with the MaAsciiStringIndexDefinition, the keySize matters tremendously. Print this code in a workspace to demonstrate:

 (MaAsciiStringIndexDefinition attribute: nil keySize: 32) meaningfulCharacters

(Note: Normally, we would never have nil be our attribute, but since we're only using this index to find out how many meaningfulCharacters we have, we don't need to specify the attribute Symbol.)

As you can see, 32 bits only gives you four characters of uniqueness. This means, if you use this index to search for "Smith", you'll also get "Smitty", which may or may not be useful. If you need it to be more discriminating, you can use a larger keySize:

 (MaAsciiStringIndexDefinition attribute: nil keySize: 128) meaningfulCharacters

This gives 18 characters of uniqueness, more than enough to find people by name, so 128 would probably be wasteful. 64 is probably an appropriate balance. The point is, you get the range of numbers specified by your keySize. 0 is always the lowest key. To find out the highest:

 (MaAsciiStringIndexDefinition attribute: nil keySize: 32) highestPossibleKey
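
As a rough cross-check of these numbers, the arithmetic can be tried in a workspace without Magma loaded. Two assumptions are made here that this page does not state outright: that the highest key for a keySize of n bits is 2^n - 1, and that each meaningful ASCII character uses 7 bits of the key (which is consistent with the 4 and 18 characters quoted above).

```smalltalk
"Assumption: keys are unsigned Integers, so the highest is 2^keySize - 1."
| highest characters |
highest := #(32 64 128) collect: [ :keySize | (2 raisedTo: keySize) - 1 ].
"Assumption: 7 bits per ASCII character, rounded down."
characters := #(32 64 128) collect: [ :keySize | keySize // 7 ].
highest first.   "4294967295"
characters.      "#(4 9 18)"
```

Under the same assumptions, a keySize of 64 would distinguish 9 characters.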

In our tutorial case, Squeak already provides a method to convert a Float to a 32-bit Integer. It's called #asIEEE32BitWord. Let's try it:

 4.99 asIEEE32BitWord
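
Printing that expression should answer 1084206612. The conversion and its inverse can be explored with the short workspace sketch below. Float class>>fromIEEE32Bit: is the counterpart method in current Squeak images; that it is present in the image you run Magma in is an assumption. The key holds only single precision, so the round trip comes back close to, but not exactly, 4.99.

```smalltalk
| key roundTrip isExact isClose |
key := 4.99 asIEEE32BitWord.            "1084206612"
"Convert the 32-bit word back to a Float (single precision only)."
roundTrip := Float fromIEEE32Bit: key.
isExact := roundTrip = 4.99.                    "false"
isClose := (roundTrip - 4.99) abs < 1.0e-6.     "true"
{ key. isExact. isClose }.
```

In practice this means two amounts that differ only beyond single precision can share a key.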

When we add a MoneyTransaction with an amount of 4.99 to our MagmaCollection, it will add an entry at the key 1084206612. When we want to find all MoneyTransactions with that key, we'll say:

 myTransactions
   where: [ : each | each amount = 4.99 ]

and Magma will convert the 4.99 to 1084206612 to do the lookup in the correct MaHashIndex on the server, and the matching instances will be part of that reader.
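
Because the Step 1 sketch asks for a range (from: ... to: ...), it is worth checking how the keys sort. The following workspace sketch suggests that, for non-negative Floats, the 32-bit words ascend in the same order as the amounts, while a negative Float gets a larger key than a positive one because the sign bit is the highest bit of the word. Whether that matters depends on whether your amounts can be negative and on how Magma compares keys for range lookups, which this page does not spell out.

```smalltalk
| amounts keys ascending negativeIsLarger |
amounts := #(0.0 0.5 4.99 5.0 100.25).
keys := amounts collect: [ :each | each asIEEE32BitWord ].
"Sorting the keys leaves them unchanged, so they already ascend like the amounts."
ascending := keys asSortedCollection asArray = keys.
"The sign bit is the most significant bit of the word."
negativeIsLarger := -1.0 asIEEE32BitWord > 1.0 asIEEE32BitWord.
{ ascending. negativeIsLarger }.   "both true"
```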

Isn't this easy? Enough planning, let's implement it.

Step 2, Create a new class defining the index type

MagmaCollectionIndex is the abstract superclass of all index types supported by Magma. Create a subclass of MagmaCollectionIndex:

 MagmaCollectionIndex subclass: #MaFloatIndex
   instanceVariableNames: ''
   classVariableNames: ''
   poolDictionaries: ''
   category: 'Magma-Large collections'

Step 3, Define the method #indexHashForIndexObject:

This method actually converts the Float to the Integer. Define the method as follows:

 indexHashForIndexObject: aFloat

   ^aFloat asIEEE32BitWord

Though we intended to use this index definition for our MoneyTransaction, there is no reason we shouldn't be able to use a separate instance of it for other Float attributes of other classes. That's why we merely define the method that converts a Float to an Integer, not a MoneyTransaction to an Integer. The index instance gets the Float value from the MoneyTransaction by #perform:ing its attribute, in this case #amount.
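
How an index instance might combine the attribute and the conversion can be sketched in a workspace without Magma loaded. Here a block stands in for the index and a Point stands in for any object with a Float-valued getter; both are illustrative stand-ins, not Magma's actual implementation.

```smalltalk
"keyFor is a hypothetical stand-in for what the index does internally."
| keyFor sameKey |
keyFor := [ :anObject :attribute |
	(anObject perform: attribute) asIEEE32BitWord ].
"A Point's #x plays the role of MoneyTransaction>>amount."
sameKey := (keyFor value: (4.99 @ 0.0) value: #x) = 4.99 asIEEE32BitWord.
sameKey.   "true"
```

Since the conversion only ever sees the Float, the same index class works for any class and attribute that answers a Float.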

That's it, we're done. Yes, it is that easy! To use the index, you simply connect to the repository, navigate to the MagmaCollection that needs a new index and:

 | newIndex |
 newIndex := MaFloatIndex
   attribute: #amount
   keySize: 32.
 [ mySession commit: [ myMagmaCollection addIndex: newIndex ] ]
   on: MaNotification
   do: [ :noti | Transcript cr; show: noti messageText ]

Keep in mind that, depending on how large the collection is, this can take a long time. Additionally, it requires that the collection be made readOnly during the entire time the index is being added, so no one else will be able to add to or remove from it (or add additional indexes).

An important note

To perform query processing, the code for the new index type must reside in the server image prior to the server being started.