CipherStash Documentation

Part 1: Define which database columns should be encrypted

In the previous step, we:

  • Added the CipherStash driver npm packages to our Express + Sequelize app
  • Installed stash, the CipherStash CLI

Now we’re going to define which database columns should be encrypted.

In this part we will:

  1. Log in to the stash CLI tool
  2. Create a dataset, which defines what database columns are encrypted
  3. Create a client, which allows your Express app to access that dataset
  4. Push dataset configuration, so CipherStash knows what indexes to search with

1. Log in

Make sure stash is logged in:

stash login

This will save an authentication token that stash and @cipherstash/libpq can use to talk to CipherStash during local development.

2. Create a dataset

Next, we need to create a dataset for tracking what data needs to be encrypted.

A dataset holds configuration for one or more database tables that have columns you need to encrypt.

Create our first dataset by running:

stash datasets create patients --description "Data about patients"

The output will look like this:

Dataset created:
ID         : <a UUID style ID>
Name       : patients
Description: Data about patients

Note down the dataset ID, as you’ll need it in step 3.

3. Create a client

Next we need to create a client.

A client allows an application to programmatically access a dataset.

A dataset can have many clients (for example, different applications working with the same data), but a client belongs to exactly one dataset.

Use the dataset ID from step 2 to create a client (making sure you substitute your own dataset ID):

stash clients create --dataset-id $DATASET_ID "Express app"

The output will look like this:

Client created:
Client ID  : <a UUID style ID>
Name       : Express app
Description:
Dataset ID : <your provided dataset ID>

#################################################
#                                               #
#  Copy and store these credentials securely.   #
#                                               #
#  THIS IS THE LAST TIME YOU WILL SEE THE KEY.  #
#                                               #
#################################################

Client ID          : <a UUID style ID>

Client Key [hex]   : <a long hex string>

Note down the client key somewhere safe, like a password vault. You will only ever see this credential once. This is your personal key, and you should not share it.

Set the client key and client ID as environment variables, using the variable names below:

export CS_CLIENT_KEY=
export CS_CLIENT_ID=
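Since later steps depend on both variables being present, it can help to fail fast when one is missing. The following is a minimal sketch of such a check for a Node.js app; the function name loadCipherStashCredentials is hypothetical and is not part of any CipherStash package.

```javascript
// Hypothetical helper (not a CipherStash API): reads the two credentials from
// an environment-like object and throws if either one is missing or blank.
function loadCipherStashCredentials(env = process.env) {
  const required = ["CS_CLIENT_ID", "CS_CLIENT_KEY"];
  const missing = required.filter(
    (name) => typeof env[name] !== "string" || env[name].trim() === ""
  );
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return { clientId: env.CS_CLIENT_ID, clientKey: env.CS_CLIENT_KEY };
}
```

Calling loadCipherStashCredentials() once at application startup would surface a missing credential immediately, rather than at the first database query.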

4. Push the dataset configuration

Now we need to configure what columns are encrypted, and what indexes we want on those columns.

This configuration is used by the CipherStash driver to transparently rewrite your app’s SQL queries to use the underlying encrypted columns.

Our demo Sequelize app has a schema that looks like this:

const SCHEMA = {
  full_name: DataTypes.STRING,
  email: DataTypes.STRING,
  dob: DataTypes.DATEONLY,
  weight: DataTypes.FLOAT,
  allergies: DataTypes.STRING,
  medications: DataTypes.STRING
}

In this example we want to encrypt all columns, as any of them could contain sensitive information. In other circumstances you may only need to encrypt some of the columns.

We configure which columns are encrypted in a file called dataset.yml, which is in the root of the demo app:

# dataset.yml
tables:
- path: patients
  fields:
  - name: full_name
    in_place: false
    cast_type: utf8-str
    mode: plaintext-duplicate
    indexes:
    - version: 1
      kind: match
      tokenizer:
        kind: ngram
        token_length: 3
      token_filters:
      - kind: downcase
      k: 6
      m: 2048
      include_original: true
    - version: 1
      kind: ore
    - version: 1
      kind: unique
  - name: email
    in_place: false
    cast_type: utf8-str
    mode: plaintext-duplicate
    indexes:
    - version: 1
      kind: match
      tokenizer:
        kind: ngram
        token_length: 3
      token_filters:
      - kind: downcase
      k: 6
      m: 2048
      include_original: true
    - version: 1
      kind: ore
    - version: 1
      kind: unique
  - name: allergies
    in_place: false
    cast_type: utf8-str
    mode: plaintext-duplicate
    indexes:
    - version: 1
      kind: match
      tokenizer:
        kind: ngram
        token_length: 3
      token_filters:
      - kind: downcase
      k: 6
      m: 2048
      include_original: true
    - version: 1
      kind: ore
    - version: 1
      kind: unique
  - name: medications
    in_place: false
    cast_type: utf8-str
    mode: plaintext-duplicate
    indexes:
    - version: 1
      kind: match
      tokenizer:
        kind: ngram
        token_length: 3
      token_filters:
      - kind: downcase
      k: 6
      m: 2048
      include_original: true
    - version: 1
      kind: ore
    - version: 1
      kind: unique
  - name: dob
    in_place: false
    cast_type: date
    mode: plaintext-duplicate
    indexes:
    - version: 1
      kind: ore
  - name: weight
    in_place: false
    cast_type: float
    mode: plaintext-duplicate
    indexes:
    - version: 1
      kind: ore

This configuration file defines three types of encrypted indexes for the columns we want to protect:

  • A match index on the full_name, email, allergies and medications columns, for full text matches
  • An ore index on all six columns, for sorting and range queries
  • A unique index on the full_name, email, allergies and medications columns, for exact-match queries
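To give a feel for what the match index settings describe, here is a small illustration of trigram tokenisation with a downcase filter, mirroring tokenizer kind ngram, token_length 3 and the downcase token filter in dataset.yml. It is a plain-text sketch only, not CipherStash's implementation, and the function name ngrams is made up for this example.

```javascript
// Illustration only: splits a string into overlapping n-grams and lowercases
// each token. CipherStash's real index works on encrypted data; this just
// shows the kind of terms that "ngram" with "token_length: 3" refers to.
function ngrams(text, tokenLength = 3) {
  const tokens = [];
  for (let i = 0; i + tokenLength <= text.length; i += 1) {
    const token = text.slice(i, i + tokenLength);
    tokens.push(token.toLowerCase()); // the "downcase" token filter
  }
  return tokens;
}

console.log(ngrams("Grace"));
// prints [ 'gra', 'rac', 'ace' ]
```

Because a search term is broken into the same kind of lowercase tokens, a query for "RAC" and the stored value "Grace" share the token "rac". Strings shorter than the token length produce no tokens in this sketch.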

Now we push this configuration to CipherStash:

stash datasets config upload --file dataset.yml --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY

The command stash datasets config upload will display the dataset configuration that was successfully stored.
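If you would rather run the upload from a Node.js script (for example as part of a setup task), the sketch below wraps the same command using only the flags shown above. The helper names buildUploadArgs and uploadDatasetConfig are hypothetical, and the sketch assumes the stash CLI is available on your PATH.

```javascript
const { execFileSync } = require("node:child_process");

// Builds the argument list for "stash datasets config upload", using only the
// flags that appear in this guide. Hypothetical helper, not a CipherStash API.
function buildUploadArgs({ file, clientId, clientKey }) {
  return [
    "datasets", "config", "upload",
    "--file", file,
    "--client-id", clientId,
    "--client-key", clientKey,
  ];
}

// Runs the upload and returns whatever the CLI writes to standard output.
// Assumes "stash" is installed and on PATH; throws if the command fails.
function uploadDatasetConfig(options) {
  // Passing arguments as an array avoids shell quoting issues with the values.
  return execFileSync("stash", buildUploadArgs(options), { encoding: "utf8" });
}
```

A call might look like uploadDatasetConfig({ file: "dataset.yml", clientId: process.env.CS_CLIENT_ID, clientKey: process.env.CS_CLIENT_KEY }).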

You can see your dataset configuration at any time with:

stash datasets config display --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY

Now it’s time for the next part: add a database migration for the new encrypted columns.