Limitations, Quirks, and Tradeoffs
CipherStash provides transparent encryption and decryption of sensitive data in SQL databases. It works by intercepting SQL queries and encrypting or decrypting the sensitive data on the fly. This reference document outlines the limitations, quirks, and tradeoffs of using CipherStash.
On this page
Limitations
Unsupported Features
When using the CipherStash driver for PostgreSQL, some SQL features are not supported on columns encrypted by CipherStash.
However, those SQL features will continue to work on plaintext columns that are not encrypted by CipherStash.
When an unsupported feature is used, the CipherStash driver will return an error.
Setting encrypted columns to arbitrary expressions in an UPDATE
-- Not supported:
UPDATE ...
SET some_encrypted_col = some_encrypted_col + 1;
JOIN
on encrypted columns
It is generally not possible to JOIN
on encrypted columns that may contain the same plaintext:
-- Not supported:
SELECT t.id,
t.amount,
p.name
FROM transactions t
JOIN payee p ON t.payee_email = p.email
ORDER BY t.name;
SQL statements using JOIN
on plaintext columns will run successfully.
JOIN
on encrypted columns with USING
or NATURAL
Any JOIN
using USING
or NATURAL
will error, regardless of whether the JOIN
is on an encrypted or plaintext column.
-- Not supported:
SELECT *
FROM toy
NATURAL JOIN cat;
-- Not supported:
SELECT post_id,
title,
review
FROM post
INNER JOIN post_comment USING(post_id);
INSERT
from other tables
-- Not supported:
INSERT INTO some_configured_table
SELECT *
FROM other_table;
Partitioning by encrypted columns
Partitioning splits what is logically one large table into smaller physical pieces.
Partitioning on encrypted columns is not supported:
-- Not supported:
CREATE TABLE measurement (
city_id int not null,
logdate date not null,
peaktemp int,
unitsales int
) PARTITION BY RANGE (logdate);
CREATE TABLE measurement_y2006m02 PARTITION OF measurement
FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
CREATE TABLE measurement_y2006m03 PARTITION OF measurement
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
Partitioning on plaintext columns is supported.
Subqueries
-- Not supported:
SELECT id,
first_name,
last_name
FROM employees
WHERE department_id IN
(SELECT department_id
FROM departments
WHERE location_ID = 1700);
Temporary tables
-- Not supported:
CREATE
TEMPORARY TABLE table_t (column1 INT);
Window functions
-- Not supported:
SELECT product_name,
price,
group_name,
AVG (price) OVER (PARTITION BY group_name)
FROM products
INNER JOIN product_groups USING (group_id);
SQL features that differ from the upstream PostgreSQL client
These SQL features are either partially supported, or supported with workarounds.
Bulk UPDATE
UPDATE
statements with multiple modified rows are partially supported.
Bulk updates with literal values are supported:
-- Supported:
UPDATE accounts
SET balance = 100
WHERE id IN (1,
2,
3);
UPDATE
statements using values lists are not supported:
-- Not supported:
UPDATE table
SET update_column = temp.value
FROM (
VALUES ('foo', 'bar'), ('baz', 'qux'), ('et', 'cetera')
) temp (id, value)
WHERE key_column = temp.id;
UPDATE
statements (bulk or single record) that set columns to arbitrary expressions are not supported:
-- Not supported:
UPDATE account SET balance = balance + 1 WHERE id = 123.
UPDATE
statements directly from other tables are not supported:
-- Not supported:
UPDATE accounts
SET contact_first_name = first_name,
contact_last_name = last_name
FROM employees
WHERE employees.id = accounts.sales_person;
Schema-level default values for encrypted columns
-- Not supported:
CREATE TABLE products (
product_no integer,
name text,
price numeric DEFAULT 9.99
);
This can be worked around with application-level defaults (for example, in your ORM).
This will be supported in future releases.
Aggregate functions
Only COUNT()
is supported:
-- Supported:
SELECT COUNT(id)
FROM users
WHERE date_of_birth > '1984-01-01';
Other aggregate functions are not supported:
-- Not supported:
SELECT department_id,
SUM(salary)
FROM employees
GROUP BY department_id;
INSERT
with ON CONFLICT
Columns managed by CipherStash cannot be used in ON CONFLICT
clauses:
-- Not supported:
INSERT INTO table_a (cs_column1, cs_column2)
VALUES (value1, value2),
(value3, value4),
(value5, value6) ON CONFLICT cs_column1 DO NOTHING;
-- Not supported:
INSERT INTO table_a (cs_column1, cs_column2)
VALUES (value1, value2),
(value3, value4),
(value5, value6) ON CONFLICT cs_column1 DO UPDATE;
Columns not managed by CipherStash can be used in ON CONFLICT
clauses:
-- Supported:
INSERT INTO table_a (not_cs_column1, not_cs_column2)
VALUES (value1, value2),
(value3, value4),
(value5, value6) ON CONFLICT not_cs_column1 DO NOTHING;
-- Supported:
INSERT INTO table_a (not_cs_column1, not_cs_column2)
VALUES (value1, value2),
(value3, value4),
(value5, value6) ON CONFLICT not_cs_column1 DO UPDATE;
Rails 7+ uses ON CONFLICT ... DO UPDATE
for #upsert_all
.
Quirks
CipherStash has the following quirks:
- Match queries against text fields can have false positives. CipherStash uses bloom filters to perform queries against some indexed columns. However, bloom filters can have false positives, which means that a query result may contain records that do not actually match the query. This problem is not unique to CipherStash and is a known tradeoff of using bloom filters. To minimize false positives, index tuning is required.
LIKE
andILIKE
queries require minor modification. ForLIKE
andILIKE
queries to continue to work inencrypted-duplicate
andencrypted
modes, letter case functions need to be removed:-- Before: SELECT name FROM user WHERE lower(name) LIKE "%Alice%"; -- After: SELECT name FROM user WHERE name LIKE "%Alice%";
NULL
s are not yet represented in an encrypted form. BecauseNULL
s don’t yet have an encrypted representation, the driver cannot detect if aNULL
in an encrypted source column (__x_encrypted
for example) reflects the actual plaintext source value. This can lead to data loss during migrations. CipherStash intends to ship support for encryptedNULL
s in Q2 2023.- Exact queries against citext (case-insensitive text) fields are case-sensitive
Let’s suppose a PostgreSQL table
users
with a fieldemail
of typecitext
. Inplaintext-duplicate
mode, an email with the valueSam_Fisher@example.com
would match any equal case-insensitive lookups such as:
In order to support this query inSELECT email from users where email = 'sam_FISHER@example.com'; SELECT email from users where email = 'sam_fisher@example.com'; SELECT email from users where email = 'SAM_FISHER@EXAMPLE.COM';
encrypted-duplicate
andencrypted
modes, CipherStash uses the unique index column. The default behaviour for this index column is case-sensitive, which will cause the above queries to not return the record as expected. In order to preserve the case-insensitive lookup behaviour, make sure that the unique index is defined with the downcase token filter.
This will lowercase the values before encrypting and inserting them into the index column. It will also lowercase any query terms before queries are performed.tables: - path: users fields: - name: email in_place: false cast_type: utf8-str mode: encrypted-duplicate indexes: - version: 1 kind: unique token_filters: - kind: downcase
Tradeoffs
CipherStash makes the following tradeoffs:
- The same ciphertext may appear in multiple rows during bulk operations.
When transforming various queries (such as
UPDATE
) that do bulk operations, multiple rows can be set with the same encrypted value. This can make it easier for attackers to perform certain types of inference attacks. Ideally, the same plaintext value in multiple rows and fields should have a unique ciphertext. CipherStash intends to ship a release that fixes this by Q3 2023. - Encrypted left ciphertext. CipherStash stores the encrypted left ciphertext (as defined in the Order Revealing Encryption paper), which can make it easier for attackers to use inference attacks to determine the plaintext. This is a tradeoff that is made to provide better performance and usability.
CipherStash is the easiest and safest way to protect structured sensitive data in your organisation. By understanding its limitations, quirks, and tradeoffs, users can make informed choices about whether CipherStash is the right solution for their applications.