Squeak
Magma Queries
Last updated at 5:32 pm UTC on 8 January 2010
To access the objects in a MagmaCollection, send it #where:, which constructs a MagmaCollectionReader on that collection. MagmaCollectionReaders behave much like sequenceable collections themselves: they offer #size, #at:, #do:, and #sortBy:.

Constructing the query

The #where: method constructs the MagmaCollectionReader. Its argument is a block that specifies which objects to read from the receiver collection:
	myReader := aMagmaCollection where:
		[ : reader |
		(reader 
			read: #date 
			from: '3/1/2002' asDate
			to: Date today)
		& (reader
			read: #keywords
			at: #('car')) ]

The above is the long form of building a query. There is also a short form, which bears a remarkable resemblance to standard Smalltalk select: blocks:
	aMagmaCollection where:
		[ : each |
		(each date from: '3/1/2002' asDate to: Date today)
		& (each keywords at: #('car')) ]

This syntax is easier to read and consistent with standard select: blocks, but it relies on one of Smalltalk's powerful dynamic features, #doesNotUnderstand:, to interpret the query. As a consequence, any message implemented by MagmaCollectionReader (or by its superclasses, up to Object) cannot be used in the query expression. In a standard image, the following messages are the most likely index names to be affected:
(Most-likely collisions from Object):
name
size
class
creationStamp
hash
value

(Most-likely collisions from MagmaCollectionReader):
expression
first
last
pageSize

Using any of these words (or any other message implemented on Object or MagmaCollectionReader) as the name of an index requires the long form of querying. The short form should otherwise be fine.
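
As a minimal sketch of that workaround, assume a hypothetical collection myPeople with an index on #name (neither appears elsewhere on this page). Because Object already implements #name, the short form would never reach #doesNotUnderstand:, so the attribute is named explicitly with the same #read:at: message used in the long form above:

```smalltalk
"Hypothetical: myPeople is a MagmaCollection with an index on #name."
namedReader := myPeople where:
	[ : reader |
	reader
		read: #name
		at: 'Smith' ]
```

The same approach applies to the other colliding names listed above, such as #size or #first.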

Operators

The available operators are listed in the 'operators' category of MaClause.

MagmaCollectionReader

MagmaCollectionReader offers a rich set of methods for accessing the objects inside its underlying collection. If possible, applications should try to use readers directly rather than convert them to Smalltalk collections. Great care was taken to make readers as practical as normal collections. A reader can, for example, be used directly in a scrolling list.
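
The collection-like protocol named at the top of this page can be exercised roughly as follows. This is only a sketch: it assumes myReader was answered by an earlier #where: query, that it is not empty, and that #at: is 1-based as for ordinary sequenceable collections. The Transcript output is for illustration only.

```smalltalk
"Assumes myReader came from an earlier #where: query and is not empty."
Transcript show: myReader size printString; cr.
Transcript show: (myReader at: 1) printString; cr.
myReader do:
	[ : each |
	Transcript show: each printString; cr ]
```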

Unfortunately, MagmaCollections cannot be queried by unindexed attributes. To filter on an unindexed attribute, you must first convert the reader to a Smalltalk collection with one of the 'converting' methods.

Sorting

Magma will automatically optimize the query to the tightest clauses. If the query can be optimized down to one clause, the results will be sorted by the attribute of that clause and #isSorted will answer true.

You can easily determine what clause, if any, a MagmaCollectionReader is sorted by:
	myReader sortIndex

will answer the index it is optimized to, otherwise nil. If sorting on a different attribute is needed then #sortBy: may be used:
	myReader sortBy: #date

which will quickly answer a new reader, but one based on a new MagmaCollection that is being "loaded" on the server in a background process. In the meantime, this new reader may be interrogated for the results that have been sorted so far. Until #sortComplete answers true, #fractionSorted may be used to report progress on the sort.
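
A background sort could be monitored along these lines. The sketch assumes that #sortComplete and #fractionSorted are sent to the new reader and that #sortComplete answers a Boolean; the one-second polling interval is arbitrary.

```smalltalk
| sortedReader |
"Assumes myReader came from an earlier #where: query."
sortedReader := myReader sortBy: #date.
[ sortedReader sortComplete ] whileFalse:
	[ "Report progress, then wait a moment before asking again."
	Transcript show: sortedReader fractionSorted printString; cr.
	(Delay forSeconds: 1) wait ]
```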

To block program progress until the sort is complete, use the past-tense form of #sortBy:, #sortedBy:, shown here with its makeDistinct: keyword:
	myReader 
		sortedBy: #date 
		makeDistinct: true


Sorting may be toggled ascending or descending with the #ascend or #descend messages.

Magma will create temporary files on the server to manage these transient sorted result sets. These files accumulate until the next compression.

Non-distinct Results

By default, an object will be included in a (MagmaCollectionReader) result once for each disjuncted (or'd) condition for which it qualifies. For example, given the following Car objects:

#year   #make       #model
2006    Toyota      Highlander
1963    Chevrolet   Corvette
2007    Chevrolet   Colorado

the following query:
  myCars where:
    [ :eachCar |
    (eachCar year > 2000)
    | (eachCar make at: 'Toyota') ]

would produce these results:

2006    Toyota      Highlander
2007    Chevrolet   Colorado
2006    Toyota      Highlander

This duplication is a feature, not a bug. Besides offering better performance, some domain models depend on knowing the "weight" or number of qualifications for each query result.

Nevertheless, a very common use for #where: will be to present unique "search results". To force distinct results, use #sortBy:makeDistinct:. Unfortunately, eliminating duplicates requires a full enumeration of the result set and the creation of a new MagmaCollection containing its unique objects. So be sure to enumerate the result of this message, not the receiver. A good pattern is to always assume that a new reader is answered, even though, for fully optimizable queries (see Optimizing Performance), the result will be the receiver itself.

The API requires specification of a sort attribute for consistency (you get back a reader, not a MagmaCollection) and simplicity (because the most efficient way to access the objects of this (or any) new result-set MagmaCollection is by way of a reader).
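
The "enumerate the result, not the receiver" pattern might look like the sketch below. It uses the blocking #sortedBy:makeDistinct: variant shown in the Sorting section and assumes myReader came from an OR'd #where: query on the Car collection above, with #model indexed.

```smalltalk
| distinctReader |
"Keep the answered reader; do not go back to myReader for the results."
distinctReader := myReader
	sortedBy: #model
	makeDistinct: true.
distinctReader do:
	[ : eachCar |
	Transcript show: eachCar printString; cr ]
```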

A MagmaCollectionReader can also be created with the #where:distinct:sortBy:descending: convenience method:
  reader := myCollection
    where:
      [ :eachCar |
      (eachCar year > 2000)
      | (eachCar make at: 'Toyota') ]
    distinct: true
    sortBy: #model
    descending: false


Beware that #where: and #where:distinct:sortBy:descending: can also be sent to a MagmaCollectionReader, so a collection can be queried recursively. However, the reader answered by #where:distinct:sortBy:descending: is backed by a newly created MagmaCollection that is indexed only on the sort attribute, so subsequent queries on it can use only that attribute.

Optimizing Performance

Performance is optimized by not having to fault objects into the client for evaluation. Query expressions are executed on the server, leveraging MaHashIndexes to perform only integral arithmetic and comparisons.

The query algorithm tries to be as lazy as possible. A condition made up solely of ANDed clauses will answer a reader whose objects are sorted, for "free", on the clause with the fewest results. After that, only #at: will cause it to return the first requested page of results.

But the following luxury features will prevent this laziness and incur a performance cost.

feature                             cost
using an OR clause                  1
sorting by a different attribute    1+2
requiring distinct results          1+2

Cost 1 is enumeration of the entire result set on the server. This is still pretty fast.
Cost 2 is creation of a new indexed MagmaCollection in the background and loading it with the result set. This is pretty slow.

Any program which can avoid the luxury features will reap pure lazy-retrieval of results at maximum speed.

Managing Resources

To maintain server health, it is important to send #release to a MagmaCollectionReader when the application is done with it. This is especially important for Readers based on the luxury queries.
  myMagmaCollectionReader release
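
One way to make sure #release is always sent, even if enumeration fails partway through, is to wrap the work in the standard #ensure: message. This is a sketch only; it assumes the myCars collection from the example above, with an index on #year.

```smalltalk
| reader |
reader := myCars where: [ : eachCar | eachCar year > 2000 ].
"The ensure: block runs whether or not the enumeration completes normally."
[ reader do:
	[ : eachCar |
	Transcript show: eachCar printString; cr ] ]
	ensure: [ reader release ]
```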