Supported queries
Searchable encryption index types and the query operations they enable on encrypted data
Supported queries
CipherStash provides three core index types, each optimized for specific use cases:
| Index Type | Description | Example Operators |
|---|---|---|
| Exact | Equality comparisons and unique constraints | =, !=, IN, NOT IN |
| Match | Substring and pattern matching | LIKE, ILIKE, ~~, ~~* |
| Range | Comparison operations and ordering | <, <=, >, >=, BETWEEN, ORDER BY |
You can configure multiple index types on the same field to support different query patterns. For example, a field might have both Exact and Match indexes to support both equality checks and substring searches.
Choosing index types
When designing your schema, consider what queries you'll need to perform:
| Query pattern | Required index type |
|---|---|
WHERE email = 'user@example.com' | Exact |
WHERE name LIKE '%Smith%' | Match |
WHERE price < 100 | Range |
WHERE status IN ('active', 'pending') | Exact |
ORDER BY created_at DESC | Range |
UNIQUE constraint | Exact or Range |
CipherCell storage
Each index type stores its encrypted search terms in a specific key within the CipherCell structure:
hm: Exact index terms (HMAC)bf: Match index terms (Bloom filter)ob: Range/Order index terms (ORE blocks)
A single encrypted field can contain multiple index types in its CipherCell, allowing it to support multiple query patterns simultaneously.
Exact match
Exact indexes use HMAC-SHA256 to enable equality comparisons on encrypted data.
Supported operators
= — Equality
Returns true if both operands are equal.
SELECT *
FROM users
WHERE email = $1::eql_v2_encrypted;!= — Negated equality
Returns true if operands are not equal.
SELECT *
FROM users
WHERE email != $1::eql_v2_encrypted;IN — Array comparison
Returns true if the left-hand expression's result is equal to any of the right-hand expressions.
SELECT *
FROM users
WHERE state IN ($1::eql_v2_encrypted, $2::eql_v2_encrypted);NOT IN, ANY and SOME are also supported.
JOIN — Table joins on encrypted columns
SELECT
orders.id,
orders.amount,
users.name
FROM orders
JOIN users ON orders.user_email = users.email
WHERE orders.amount > 100;GROUP BY — Grouping by encrypted columns
SELECT
state,
COUNT(*) as user_count
FROM users
GROUP BY state;CipherCell key
Exact indexes are stored in the hm key of a CipherCell.
Database indexes
PostgreSQL B-tree indexes work with Exact.
CREATE INDEX
ON users (
encrypted_column eql_v2.encrypted_operator_class
);Unique indexes
UNIQUE indexes require either Exact or Range searchable encrypted indexes to be available on the CipherCell.
CREATE UNIQUE INDEX
ON users (
encrypted_column eql_v2.encrypted_operator_class
);Cryptography
Exact terms are generated using HMAC, specifically HMAC-SHA-256 which uses SHA with an output size of 32-bytes as the underlying hash function.
Match (pattern)
Match indexes use trigrams (three-character sequences) to enable substring matching on encrypted data. When a value is encrypted, trigrams are extracted and stored in an encrypted Bloom filter, allowing for efficient pattern matching without decrypting the data.
Trigrams work best for matching patterns that are at least 3 characters long. Very short search patterns may produce more false positives.
For example, the string "laptop" produces the following trigrams:
l(note the leading spaces)lalapaptptotopop(note the trailing space)
When searching for "apt" in an encrypted field, the system checks if the encrypted Bloom filter contains the apt trigram.
Text processing pipeline
Before text is indexed in an encrypted Bloom filter, it passes through a configurable processing pipeline on the client side. This pipeline determines how text is tokenized and normalized, which directly affects query behavior.
Tokenizers
Tokenizers break text into searchable units:
Standard (Unicode whitespace): Splits text on whitespace boundaries.
"Hello World" → ["Hello", "World"]Ngram (default: trigram): Generates character n-grams from text.
"laptop" → [" l", " la", "lap", "apt", "pto", "top", "op "]EdgeGram: Generates n-grams from the beginning of text, useful for prefix matching. Accepts min and max gram size.
"search" with min=1, max=6 → ["s", "se", "sea", "sear", "searc", "search"]Token filters
Token filters normalize tokens after tokenization:
- Upcase: Converts all text to uppercase
- Downcase: Converts all text to lowercase
- Stemmer: Reduces words to their root form (
"running"→"run") - Stop Word Removal: Filters out common words like "a", "an", "the", "is", "it", etc.
Pipeline impact on queries
If a downcase token filter is configured in the pipeline, both LIKE and ILIKE will perform case-insensitive searches. The distinction between LIKE (case-sensitive) and ILIKE (case-insensitive) only applies when no case-normalizing filters are used.
Supported operators
LIKE — Pattern matching
SELECT *
FROM products
WHERE name LIKE $1::eql_v2_encrypted;ILIKE — Case-insensitive pattern matching
SELECT *
FROM products
WHERE description ILIKE $1::eql_v2_encrypted;~~ — Equivalent to LIKE
SELECT *
FROM documents
WHERE title ~~ $1::eql_v2_encrypted;~~* — Equivalent to ILIKE
SELECT *
FROM documents
WHERE title ~~* $1::eql_v2_encrypted;LIKE and ILIKE mapping
Encrypted Bloom filters don't directly implement the SQL LIKE or ILIKE operators. Instead, they perform token-based matching using the configured text processing pipeline.
CipherStash Proxy and the Encryption SDK automatically map LIKE and ILIKE queries to encrypted Bloom filter searches.
This mapping works by:
- Extracting the search pattern from the
LIKEexpression (ignoring%wildcards) - Processing the pattern through the same text pipeline (tokenizer + filters)
- Checking if the resulting tokens exist in the encrypted Bloom filter
Limitations
- Very short patterns (1-2 characters): Patterns shorter than 3 characters may produce different results because trigrams require at least 3 characters
- Wildcard-only patterns: Patterns consisting only of wildcards cannot be mapped
- Character class patterns: SQL
LIKEsupports_(single character) wildcards, but encrypted Bloom filters don't - Position-specific matching: While prefix and suffix searches generally work, the behavior is trigram-based rather than position-exact
CipherCell key
Match indexes are stored in the bf key of a CipherCell.
Database indexes
PostgreSQL GIN (Generalized Inverted Index) indexes work with Match.
CREATE INDEX
products_encrypted_gin_index
ON products USING GIN (
eql_v2.bloom_filter(encrypted_column)
);Match indexes use GIN rather than B-tree indexes. GIN indexes are specifically designed for indexing composite values and are ideal for full-text and pattern matching operations.
Cryptography
Match terms are generated using encrypted Bloom filters, which enable boolean retrieval operations over encrypted data. The system extracts trigrams from the plaintext, encrypts each trigram using a keyed pseudorandom function, and stores them in a Bloom filter structure.
Academic references
- Cryptographically Secure Bloom-Filters (Nojima, Kadobayashi 2009)
- A New Bloom Filter Structure for Searchable Encryption Schemes (Chum, Zhang 2017)
Range / Order
Range indexes use Order Revealing Encryption (ORE), which allows comparison operations on encrypted data while preserving order relationships.
Supported operators
< — Less than
SELECT *
FROM products
WHERE price < $1::eql_v2_encrypted;<= — Less than or equal
SELECT *
FROM products
WHERE price <= $1::eql_v2_encrypted;> — Greater than
SELECT *
FROM orders
WHERE order_date > $1::eql_v2_encrypted;>= — Greater than or equal
SELECT *
FROM orders
WHERE order_date >= $1::eql_v2_encrypted;BETWEEN — Range comparison
SELECT *
FROM products
WHERE price BETWEEN $1::eql_v2_encrypted AND $2::eql_v2_encrypted;ORDER BY — Sort results
If applied to a string type, ordering is applied lexicographically (i.e. ordered by comparing elements left-to-right, like words in a dictionary).
SELECT *
FROM products
ORDER BY price ASC;CipherCell key
Order / Range indexes are stored in the ob key of a CipherCell.
Database indexes
PostgreSQL B-tree indexes work with Order / Range.
CREATE INDEX
ON products (
encrypted_column eql_v2.encrypted_operator_class
);UNIQUE indexes require either Exact or Range searchable encrypted indexes to be available on the CipherCell.
Cryptography
Order / Range terms are generated using Order Revealing Encryption (ORE), which allows comparison operations on encrypted data while preserving order relationships. This enables efficient range queries and sorting on encrypted columns without decrypting the data.
Academic references
The specific scheme used is based on Block ORE (Lewi-Wu 2016) and incorporates security enhancements identified by Bogatov et al 2018. The implementation is available on GitHub.