CipherStash - Protect Data, Not Just Systems.

CipherStash provides transparent encryption and decryption of sensitive data in SQL databases. It works by intercepting SQL queries and encrypting or decrypting the sensitive data on the fly. This reference document outlines the limitations, quirks, and tradeoffs of using CipherStash.

Limitations

Unsupported Features

When using the CipherStash driver for PostgreSQL, some SQL features are not supported on columns encrypted by CipherStash.

However, those SQL features will continue to work on plaintext columns that are not encrypted by CipherStash.

When an unsupported feature is used, the CipherStash driver will return an error.

Setting encrypted columns to arbitrary expressions in an `UPDATE`

-- Not supported:
UPDATE ...
SET some_encrypted_col = some_encrypted_col + 1;

`JOIN` on encrypted columns

It is generally not possible to JOIN on encrypted columns that may contain the same plaintext:

-- Not supported:
SELECT t.id,
       t.amount,
       p.name
FROM transactions t
JOIN payee p ON t.payee_email = p.email
ORDER BY t.name;

SQL statements using JOIN on plaintext columns will run successfully.

`JOIN` with `USING` or `NATURAL`

Any JOIN using USING or NATURAL will error, regardless of whether the JOIN is on an encrypted or plaintext column.

-- Not supported:
SELECT *
FROM toy
NATURAL JOIN cat;

-- Not supported:
SELECT post_id,
       title,
       review
FROM post
INNER JOIN post_comment USING (post_id);

`INSERT` from other tables

-- Not supported:
INSERT INTO some_configured_table
SELECT *
FROM other_table;

Partitioning by encrypted columns

Partitioning splits what is logically one large table into smaller physical pieces.

Partitioning on encrypted columns is not supported:

-- Not supported:
CREATE TABLE measurement (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
) PARTITION BY RANGE (logdate);

CREATE TABLE measurement_y2006m02 PARTITION OF measurement
    FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');

CREATE TABLE measurement_y2006m03 PARTITION OF measurement
    FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');

Partitioning on plaintext columns is supported.

Subqueries

-- Not supported:
SELECT id,
       first_name,
       last_name
FROM employees
WHERE department_id IN
    (SELECT department_id
     FROM departments
     WHERE location_id = 1700);
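One possible way around the subquery restriction is to run the inner query on its own and pass its results to the outer query as parameters. The sketch below is an assumption rather than a documented CipherStash pattern. It uses Python's built-in `sqlite3` module with an in-memory style connection purely as a stand-in for a PostgreSQL connection through the driver, and the function name `employees_at_location` is hypothetical. Note that `sqlite3` uses `?` placeholders where PostgreSQL uses `$1`, `$2`, and so on.

```python
import sqlite3


def employees_at_location(conn: sqlite3.Connection, location_id: int) -> list:
    """Replace `WHERE department_id IN (SELECT ...)` with two separate queries."""
    # Step 1: run what used to be the inner query on its own.
    dept_ids = [
        row[0]
        for row in conn.execute(
            "SELECT department_id FROM departments WHERE location_id = ?",
            (location_id,),
        )
    ]
    if not dept_ids:
        return []  # nothing to look up, and `IN ()` would be invalid SQL

    # Step 2: pass the ids to the outer query, one placeholder per id.
    marks = ", ".join("?" for _ in dept_ids)
    return conn.execute(
        "SELECT id, first_name, last_name FROM employees "
        f"WHERE department_id IN ({marks}) ORDER BY id",
        dept_ids,
    ).fetchall()
```

This costs an extra round trip and moves the intermediate list of ids through the application, so it suits small inner result sets best.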

Temporary tables

-- Not supported:
CREATE TEMPORARY TABLE table_t (column1 INT);

Window functions

-- Not supported:
SELECT product_name,
       price,
       group_name,
       AVG(price) OVER (PARTITION BY group_name)
FROM products
INNER JOIN product_groups USING (group_id);

Placeholder order in prepared statements

Parameter placeholders in a prepared statement must appear in sequential order ($1, then $2, and so on), and each placeholder can appear only once.

-- Supported:
PREPARE find_user AS SELECT id, first_name FROM users WHERE first_name LIKE $1 OR email LIKE $2;

-- Not supported:
PREPARE find_user AS SELECT id, first_name FROM users WHERE first_name LIKE $2 OR email LIKE $1;

-- Not supported:
PREPARE find_user AS SELECT id, first_name FROM users WHERE first_name LIKE $1 OR email LIKE $1;
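The placeholder rule can be checked before a statement is sent to the database. The following is a minimal sketch of such a check, not the driver's own validation logic. The function name `placeholders_in_order` is hypothetical, and treating a gap in the numbering (for example `$1` followed by `$3`) as invalid is an assumption about what "sequential" means here.

```python
import re

# Matches PostgreSQL-style positional placeholders such as $1, $2, $10.
# Simplification: placeholders inside string literals or dollar-quoted
# strings are not excluded.
_PLACEHOLDER = re.compile(r"\$(\d+)")


def placeholders_in_order(sql: str) -> bool:
    """Return True if placeholders read $1, $2, ... left to right, without repeats."""
    found = [int(n) for n in _PLACEHOLDER.findall(sql)]
    return found == list(range(1, len(found) + 1))
```

A check like this could run in a test suite over an application's prepared statements to catch reordered or reused placeholders early.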

SQL features that differ from the upstream PostgreSQL client

These SQL features are either partially supported, or supported with workarounds.

Bulk `UPDATE`

UPDATE statements with multiple modified rows are partially supported.

Bulk updates with literal values are supported:

-- Supported:
UPDATE accounts
SET balance = 100
WHERE id IN (1, 2, 3);

UPDATE statements using values lists are not supported:

-- Not supported:
UPDATE table
SET update_column = temp.value
FROM (
    VALUES ('foo', 'bar'), ('baz', 'qux'), ('et', 'cetera')
) temp (id, value)
WHERE key_column = temp.id;

UPDATE statements (bulk or single record) that set columns to arbitrary expressions are not supported:

-- Not supported:
UPDATE account SET balance = balance + 1 WHERE id = 123;
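Because updates with literal values are supported, one possible workaround is to read the current value, compute the new value in the application, and write it back. The sketch below assumes this read-modify-write pattern is acceptable for your workload; it is not a documented CipherStash recipe. Python's built-in `sqlite3` module stands in for a PostgreSQL connection through the driver, and `increment_balance` is a hypothetical helper.

```python
import sqlite3


def increment_balance(conn: sqlite3.Connection, account_id: int, delta: int) -> int:
    """Read the balance, add `delta` in Python, and write the result back."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no account with id {account_id}")
        new_balance = row[0] + delta
        # The UPDATE now sets the column to a plain value, not an expression.
        conn.execute(
            "UPDATE account SET balance = ? WHERE id = ?", (new_balance, account_id)
        )
    return new_balance


# Stand-in database for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account (id, balance) VALUES (123, 100)")
print(increment_balance(conn, 123, 1))  # prints 101
```

Unlike `balance = balance + 1`, this is not atomic in the database: two concurrent writers can read the same starting value, so some form of locking or retry is needed where that matters.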

UPDATE statements directly from other tables are not supported:

-- Not supported:
UPDATE accounts
SET contact_first_name = first_name,
    contact_last_name = last_name
FROM employees
WHERE employees.id = accounts.sales_person;

Schema-level default values for encrypted columns

-- Not supported:
CREATE TABLE products (
    product_no integer,
    name text,
    price numeric DEFAULT 9.99
);

This can be worked around with application-level defaults (for example, in your ORM).

This will be supported in future releases.
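As an illustration of an application-level default, the sketch below sets the default price in a plain Python dataclass instead of in the schema. The `Product` class and its `insert_params` method are hypothetical; an ORM would normally provide its own mechanism for the same idea.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Product:
    product_no: int
    name: str
    # Default applied in the application rather than via `DEFAULT 9.99` in the schema.
    price: Decimal = Decimal("9.99")

    def insert_params(self) -> tuple:
        """Values for: INSERT INTO products (product_no, name, price) VALUES ($1, $2, $3)"""
        return (self.product_no, self.name, self.price)
```

With this approach the `INSERT` always carries an explicit price, so the database never needs to fill in a value for the encrypted column.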

Aggregate functions

Only COUNT() is supported:

-- Supported:
SELECT COUNT(id)
FROM users
WHERE date_of_birth > '1984-01-01';

Other aggregate functions are not supported:

-- Not supported:
SELECT department_id,
       SUM(salary)
FROM employees
GROUP BY department_id;
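Where an aggregate other than COUNT() is needed, one option is to select the raw rows and aggregate them in the application. The sketch below assumes the rows come from a plain query such as `SELECT department_id, salary FROM employees`, which the driver returns decrypted; the function name `sum_salary_by_department` is hypothetical and this is not a documented CipherStash workaround.

```python
from collections import defaultdict
from decimal import Decimal


def sum_salary_by_department(rows) -> dict:
    """Sum salaries per department from an iterable of (department_id, salary) pairs."""
    totals = defaultdict(Decimal)  # missing keys start at Decimal("0")
    for department_id, salary in rows:
        # Going through str() avoids binary float artefacts if salary is a float.
        totals[department_id] += Decimal(str(salary))
    return dict(totals)
```

This pulls every matching row into the application, so it is only practical when the result set is small enough to transfer and hold in memory.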

`INSERT` with `ON CONFLICT`

Columns managed by CipherStash cannot be used in ON CONFLICT clauses:

-- Not supported:
INSERT INTO table_a (cs_column1, cs_column2)
VALUES (value1, value2),
       (value3, value4),
       (value5, value6)
ON CONFLICT (cs_column1) DO NOTHING;

-- Not supported:
INSERT INTO table_a (cs_column1, cs_column2)
VALUES (value1, value2),
       (value3, value4),
       (value5, value6)
ON CONFLICT (cs_column1) DO UPDATE SET cs_column2 = EXCLUDED.cs_column2;

Columns not managed by CipherStash can be used in ON CONFLICT clauses:

-- Supported:
INSERT INTO table_a (not_cs_column1, not_cs_column2)
VALUES (value1, value2),
       (value3, value4),
       (value5, value6)
ON CONFLICT (not_cs_column1) DO NOTHING;

-- Supported:
INSERT INTO table_a (not_cs_column1, not_cs_column2)
VALUES (value1, value2),
       (value3, value4),
       (value5, value6)
ON CONFLICT (not_cs_column1) DO UPDATE SET not_cs_column2 = EXCLUDED.not_cs_column2;

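Note that Rails 7+ uses ON CONFLICT ... DO UPDATE for #upsert_all, so this limitation applies to #upsert_all calls whose conflict target is a column managed by CipherStash.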

Queries that are not allowed

Queries on encrypted columns are not allowed

Queries that directly access the encrypted columns are not allowed. Encrypted columns follow these naming patterns, where [field] is the name of the plaintext column:

__[field]_encrypted
__[field]_ore
__[field]_match
__[field]_unique

-- Not allowed:
SELECT __name_encrypted FROM users WHERE id = 1;

-- Not allowed:
SELECT id, name FROM users WHERE __name_encrypted = 'John Doe';

-- Not allowed:
UPDATE users SET __name_encrypted = 'John Doe' WHERE id = 1;

-- Not allowed:
INSERT INTO users (name, __name_encrypted) VALUES ('John Doe', 'John Doe');

Quirks

CipherStash has the following quirks:

Match queries against text fields can have false positives. CipherStash uses bloom filters to perform queries against some indexed columns. However, bloom filters can have false positives, which means that a query result may contain records that do not actually match the query. This problem is not unique to CipherStash and is a known tradeoff of using bloom filters. To minimize false positives, index tuning is required.
LIKE and ILIKE queries require minor modification. For LIKE and ILIKE queries to continue to work in encrypted-duplicate and encrypted modes, letter case functions need to be removed:
```
-- Before:
SELECT name FROM users WHERE lower(name) LIKE '%Alice%';

-- After:
SELECT name FROM users WHERE name LIKE '%Alice%';
```
NULLs are not yet represented in an encrypted form. Because NULLs don't yet have an encrypted representation, the driver cannot detect if a NULL in an encrypted source column (__x_encrypted for example) reflects the actual plaintext source value. This can lead to data loss during migrations. CipherStash intends to ship support for encrypted NULLs in Q2 2023.
Exact queries against citext (case-insensitive text) fields are case-sensitive. Suppose a PostgreSQL table users has a field email of type citext. In plaintext-duplicate mode, an email with the value [email protected] would match any lookup that differs only in letter case, such as:
```
SELECT email FROM users WHERE email = '[email protected]';
SELECT email FROM users WHERE email = '[email protected]';
SELECT email FROM users WHERE email = '[email protected]';
```
In order to support this query in encrypted-duplicate and encrypted modes, CipherStash uses the unique index column. The default behaviour for this index column is case-sensitive, which will cause the above queries to not return the record as expected. In order to preserve the case-insensitive lookup behaviour, make sure that the unique index is defined with the downcase token filter.
```
tables:
  - path: users
    fields:
      - name: email
        in_place: false
        cast_type: utf8-str
        mode: encrypted-duplicate
        indexes:
          - version: 1
            kind: unique
            token_filters:
              - kind: downcase
```
This will lowercase the values before encrypting and inserting them into the index column. It will also lowercase any query terms before queries are performed.
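The effect of the downcase token filter can be pictured with a small sketch. It assumes, purely for illustration, that a unique-index term is a keyed digest (HMAC-SHA256) of the value; the real CipherStash index scheme is not described here, and `unique_index_term` is a hypothetical helper. What matters is that the same filter is applied to the stored value and to the query term.

```python
import hashlib
import hmac


def unique_index_term(key: bytes, value: str, downcase: bool = False) -> str:
    """Keyed digest standing in for a unique-index term (illustrative scheme only)."""
    if downcase:
        value = value.lower()  # applied to stored values and query terms alike
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


key = b"demo-key-not-for-production"
stored = unique_index_term(key, "Alice@Example.com", downcase=True)
query = unique_index_term(key, "ALICE@example.COM", downcase=True)
print(stored == query)  # True: both sides were lowercased before hashing
```

Without the filter, the two inputs produce different terms, which is why exact lookups become case-sensitive by default.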

Tradeoffs

CipherStash makes the following tradeoffs:

The same ciphertext may appear in multiple rows during bulk operations. When CipherStash transforms queries that perform bulk operations (such as UPDATE), multiple rows can be set to the same encrypted value. This can make certain types of inference attacks easier. Ideally, each occurrence of a plaintext value, in any row or field, would have its own distinct ciphertext. CipherStash intends to ship a release that fixes this by Q3 2023.
Encrypted left ciphertext. CipherStash stores the encrypted left ciphertext (as defined in the Order Revealing Encryption paper), which can make it easier for attackers to use inference attacks to determine the plaintext. This is a tradeoff that is made to provide better performance and usability.

CipherStash is the easiest and safest way to protect structured sensitive data in your organisation. By understanding its limitations, quirks, and tradeoffs, users can make informed choices about whether CipherStash is the right solution for their applications.