Querying on a Field-Dynamic Match Index
The field-dynamic match index is a bit of a tricky one to wrap your head around. It indexes all string fields in your records, like the dynamic match index, but it also remembers which field the text is in. To query it, you need to specify both the text to match and the field in which the text should appear.
This is most easily demonstrated with an example.
Let’s say you create a collection of people, with this schema:
stash.create_collection("people", {
  "types" => {},
  "indexes" => {
    "allText" => {
      "kind" => "dynamic-match",
      "tokenizer" => "standard",
      "tokenFilters" => [
        { "kind" => "ngram", "tokenLength" => 3 }
      ]
    },
    "allTextWithFields" => {
      "kind" => "field-dynamic-match",
      "tokenizer" => "standard",
      "tokenFilters" => [
        { "kind" => "ngram", "tokenLength" => 3 }
      ]
    }
  }
})
collection = stash.collection("people")
Then, you load in these records:
[
  {
    name: "Alice Angleton",
    address: "42 Ambling St",
  },
  {
    name: "Bob Blessington",
    address: "61 Bowery Ln",
  },
  {
    name: "Charlene Chaise",
    address: "197 Cryston Rd",
  },
  {
    name: "David Drury",
    address: "8 Downer Cl",
    notes: "Doesn't like Bob",
  },
].each { |r| collection.insert(r) }
If we do a search for the string "Bob" in the allText index, we’ll get back two records:
collection.query { |p| p.allText.match("Bob") }.records.length # => 2
That’s because the string "Bob" appears both in Bob’s name and in David’s notes.
If we only want to find someone named “Bob”, we can use the allTextWithFields index, which is a field-dynamic-match index, and say “just search in the name field”, like this:
collection.query { |p| p.allTextWithFields.match("name", "Bob") }.records.length # => 1
Notice that the match operator for a field-dynamic-match index takes two arguments: the field name and the string to search for. This is unusual for a constraint operator, so it’s worth keeping in mind.
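To make the difference between the two index kinds concrete, here is a toy in-memory model in plain Ruby. It is only a sketch: the helper names (trigrams, match_all_text, match_field) are made up for illustration, nothing is encrypted, and the tokenizing is a rough stand-in for the standard tokenizer with a 3-character ngram filter. It does not show how the real indexes are implemented, only what each one remembers.

```ruby
# Toy model only: these helpers are hypothetical and are NOT the real index code.

# Split text into lowercase words, then into 3-character ngrams.
def trigrams(text)
  text.downcase.scan(/[a-z0-9]+/).flat_map do |word|
    word.chars.each_cons(3).map(&:join)
  end
end

RECORDS = [
  { name: "Alice Angleton",  address: "42 Ambling St" },
  { name: "Bob Blessington", address: "61 Bowery Ln" },
  { name: "Charlene Chaise", address: "197 Cryston Rd" },
  { name: "David Drury",     address: "8 Downer Cl", notes: "Doesn't like Bob" }
].freeze

# dynamic-match: terms from every string field, with no field information.
def dynamic_terms(record)
  record.values.grep(String).flat_map { |value| trigrams(value) }.uniq
end

# field-dynamic-match: the same terms, each tagged with its field name.
def field_dynamic_terms(record)
  record.select { |_, value| value.is_a?(String) }
        .flat_map { |field, value| trigrams(value).map { |t| [field.to_s, t] } }
        .uniq
end

# A record matches when every query term is present in its term set.
def match_all_text(query)
  wanted = trigrams(query)
  RECORDS.select { |r| (wanted - dynamic_terms(r)).empty? }
end

def match_field(field, query)
  wanted = trigrams(query).map { |t| [field, t] }
  RECORDS.select { |r| (wanted - field_dynamic_terms(r)).empty? }
end

puts match_all_text("Bob").length       # => 2 (Bob's name, David's notes)
puts match_field("name", "Bob").length  # => 1 (Bob's name only)
```

Because each term in the second model carries its field name, a query for "Bob" in the name field can no longer be satisfied by the "Bob" in David’s notes.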
Field-Dynamic Match vs Match Indexes
Given how powerful field-dynamic match indexes are, you might be wondering why we would ever use the match index type. Why don’t we just use field-dynamic indexes all the time?
There are a couple of reasons why a match index is usually preferred over a field-dynamic match index:
- A match index can search across more than one field simultaneously. When you define a match index, you can list multiple fields to index, with the fields parameter. Queries will then automatically apply across all those fields every time. With a field-dynamic-match index, you can only search on one field at a time.
- A field-dynamic-match index is slower to update and query. Since all text in the record is indexed, all those terms need to be generated, encrypted, and sent to the server. That then means the index itself is larger, and thus slower to query than a smaller index.
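The size difference in the second point can be illustrated with a rough term count. This is a sketch under the same simplifying assumptions as any toy model: the trigrams helper is hypothetical, there is no encryption, and real term generation will differ in detail. It only shows that indexing every string field produces more terms than indexing a single chosen field.

```ruby
# Toy term counter: hypothetical helper, not the real indexing code.
def trigrams(text)
  text.downcase.scan(/[a-z0-9]+/).flat_map { |word| word.chars.each_cons(3).map(&:join) }
end

record = { name: "David Drury", address: "8 Downer Cl", notes: "Doesn't like Bob" }

# A match index limited to the name field only generates terms from name.
name_only_terms = trigrams(record[:name]).uniq

# A field-dynamic-match index generates field-tagged terms from every string field.
all_field_terms = record.flat_map do |field, value|
  trigrams(value).map { |term| [field, term] }
end.uniq

puts "name only:  #{name_only_terms.length} terms"
puts "all fields: #{all_field_terms.length} terms"
```

Every extra term is one more thing to generate, encrypt, send, and store, which is where the update and query cost comes from.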
When you know which field(s) you wish to query at the time the collection is created, you should always prefer the match index type. Only use the field-dynamic-match index type when you need to figure out what field to match against at runtime.
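As a sketch of the runtime case, the helper below picks the field from a value that is only known when the query is made, for example a choice in a search form. The search_in_field method and the SEARCHABLE_FIELDS list are hypothetical names introduced here for illustration; the only client call used is the query shape shown earlier, on the allTextWithFields index from the example schema.

```ruby
# Hypothetical helper: the field to search is only known at runtime.
# SEARCHABLE_FIELDS is an assumed allow-list for this example, not part of the client.
SEARCHABLE_FIELDS = %w[name address notes].freeze

def search_in_field(collection, field, text)
  # Reject unexpected field names before building the query.
  unless SEARCHABLE_FIELDS.include?(field)
    raise ArgumentError, "unsupported field: #{field}"
  end

  # Same query shape as in the example above: field name first, then the text.
  collection.query { |p| p.allTextWithFields.match(field, text) }.records
end
```

With the people collection from the example, search_in_field(collection, "name", "Bob") would run the same query as the one shown earlier, while passing "notes" instead would search only the notes field.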