CipherStash Documentation

Part 1: Define which database columns should be encrypted

In the previous step, we:

  • Added the CipherStash driver RubyGem to our Rails app.
  • Installed stash, the CipherStash CLI.

Now we’re going to define which database columns should be encrypted.

In this part we will:

  1. Log in to the stash CLI tool
  2. Create a dataset, which defines which database columns are encrypted
  3. Create a client, which allows your Rails app to access that dataset
  4. Push dataset configuration, so CipherStash knows what indexes to search with

1. Log in

Make sure stash is logged in:

stash login

This saves a token that stash uses when talking to CipherStash.

2. Create a dataset

Next, we need to create a dataset for tracking what data needs to be encrypted.

A dataset holds configuration for one or more database tables that contain data to be encrypted.

Create our first dataset by running:

stash datasets create patients --description "Data about patients"

The output will look like this:

Dataset created:
ID         : <a UUID style ID>
Name       : patients
Description: Data about patients

Note down the dataset ID, as you’ll need it in step 3.

3. Create a client

Next we need to create a client.

A client allows an application to programmatically access a dataset.

A dataset can have many clients (for example, different applications working with the same data), but a client belongs to exactly one dataset.

Use the dataset ID from step 2 to create a client (making sure you substitute your own dataset ID):

stash clients create --dataset-id $DATASET_ID "Rails app"

The output will look like this:

Client created:
Client ID  : <a UUID style ID>
Name       : Rails app
Description:
Dataset ID : <your provided dataset ID>

#################################################
#                                               #
#  Copy and store these credentials securely.   #
#                                               #
#  THIS IS THE LAST TIME YOU WILL SEE THE KEY.  #
#                                               #
#################################################

Client ID          : <a UUID style ID>

Client Key [hex]   : <a long hex string>

Note down the client key somewhere safe, like a password vault. You will only ever see this credential once. This is your personal key, and you should not share it.

Set the client ID and client key in your Rails credentials file:

cipherstash:
  client_id:
  client_key:

Or set them as environment variables in a .envrc file, using the variable names below:

export CS_CLIENT_KEY=
export CS_CLIENT_ID=

If you are using direnv, run:

direnv allow

If you’re not, you can export the variables by running:

source .envrc
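
Before moving on, it can help to confirm that both variables are actually visible to Ruby. The following is a minimal sketch, not part of the CipherStash driver: the helper name credential_problems is hypothetical, and the format checks assume only what the client output above describes (a UUID style ID and a hex key). It never prints the key itself.

```ruby
# Sanity check for the CipherStash client credentials (hypothetical helper).
# Assumptions: CS_CLIENT_ID is UUID-shaped and CS_CLIENT_KEY is a hex string,
# as described by the `stash clients create` output. The key length is not
# specified in this guide, so only the character set is checked.

UUID_PATTERN = /\A\h{8}-\h{4}-\h{4}-\h{4}-\h{12}\z/
HEX_PATTERN  = /\A\h+\z/

# Returns a list of human-readable problems. An empty list means both
# variables are present and look plausible. `env` defaults to the process
# environment, but any Hash-like object works.
def credential_problems(env = ENV)
  problems = []

  client_id  = env["CS_CLIENT_ID"].to_s.strip
  client_key = env["CS_CLIENT_KEY"].to_s.strip

  if client_id.empty?
    problems << "CS_CLIENT_ID is not set"
  elsif !client_id.match?(UUID_PATTERN)
    problems << "CS_CLIENT_ID does not look like a UUID"
  end

  if client_key.empty?
    problems << "CS_CLIENT_KEY is not set"
  elsif !client_key.match?(HEX_PATTERN)
    problems << "CS_CLIENT_KEY is not a hex string"
  end

  problems
end
```

Calling credential_problems with no arguments checks the current environment. An empty result means both values are set and have the expected shape; it does not prove that CipherStash will accept them.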

4. Push the dataset configuration

Now we need to configure which columns are encrypted, and which indexes we want on those columns.

This configuration is used by the CipherStash driver to transparently rewrite your app’s SQL queries to use the underlying encrypted columns.

Our demo Rails app has a schema that looks like this:

class CreatePatients < ActiveRecord::Migration[7.1]
  def change
    create_table :patients do |t|
      t.string :full_name
      t.string :email
      t.date :dob
      t.float :weight
      t.string :allergies
      t.string :medications

      t.timestamps
    end
  end
end

We want to encrypt every column, as they all contain sensitive information.

We do this with a configuration file named dataset.yml, which lives in the root of the Rails demo:

# dataset.yml
tables:
  - path: patients
    fields:
      - name: full_name
        in_place: false
        cast_type: utf8-str
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique
      - name: email
        in_place: false
        cast_type: utf8-str
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique
      - name: allergies
        in_place: false
        cast_type: utf8-str
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique
      - name: medications
        in_place: false
        cast_type: utf8-str
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique
      - name: dob
        in_place: false
        cast_type: date
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: ore
      - name: weight
        in_place: false
        cast_type: float
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: ore
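
A configuration this long is easy to misread, so it can be useful to summarize which index kinds each field declares. The sketch below is one way to do that with Ruby's standard YAML library. The helper name index_kinds_by_field is hypothetical, and the inline sample is an abbreviated stand-in with the same structure as dataset.yml.

```ruby
require "yaml"

# Returns a hash mapping each index kind to the names of the fields that
# declare it, in the order they appear in the configuration.
def index_kinds_by_field(config)
  summary = Hash.new { |hash, kind| hash[kind] = [] }

  config.fetch("tables", []).each do |table|
    table.fetch("fields", []).each do |field|
      field.fetch("indexes", []).each do |index|
        summary[index.fetch("kind")] << field.fetch("name")
      end
    end
  end

  summary
end

# Abbreviated sample with the same shape as dataset.yml (illustration only).
SAMPLE = <<~YAML
  tables:
    - path: patients
      fields:
        - name: full_name
          indexes:
            - version: 1
              kind: match
            - version: 1
              kind: ore
            - version: 1
              kind: unique
        - name: dob
          indexes:
            - version: 1
              kind: ore
YAML

index_kinds_by_field(YAML.safe_load(SAMPLE)).each do |kind, fields|
  puts "#{kind}: #{fields.join(', ')}"
end
```

To summarize the real file, pass YAML.safe_load(File.read("dataset.yml")) instead of the sample.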

This configuration file defines three types of encrypted indexes for the columns we want to protect:

  • A match index on the full_name, email, allergies and medications columns, for full text matches
  • An ore index on all six columns (full_name, email, allergies, medications, dob and weight), for sorting and range queries
  • A unique index on the full_name, email, allergies and medications columns, for exact-match lookups
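
The match indexes in dataset.yml use an ngram tokenizer with token_length: 3 and a downcase token filter. To give a feel for what that means, here is a plain-Ruby sketch of trigram tokenization. It is an illustration only, not CipherStash's implementation: the function names are hypothetical, and how the encrypted index stores and compares tokens is not covered here.

```ruby
# Illustration only: NOT the CipherStash implementation.
# Splits text into overlapping tokens of `token_length` characters after
# downcasing it, mirroring the `ngram` tokenizer and `downcase` filter
# named in dataset.yml.
def downcased_ngrams(text, token_length = 3)
  text.downcase.each_char.each_cons(token_length).map(&:join)
end

# A plaintext stand-in for a match query: the search term may match when
# every one of its tokens also appears among the stored value's tokens.
# Assumption of this sketch: terms shorter than `token_length` produce no
# tokens and therefore never match.
def may_match?(stored_value, search_term, token_length = 3)
  term_tokens = downcased_ngrams(search_term, token_length)
  return false if term_tokens.empty?

  (term_tokens - downcased_ngrams(stored_value, token_length)).empty?
end

p downcased_ngrams("Grace")
puts may_match?("Grace Hopper", "HOPP")
```

Because both sides are downcased before comparison, a search for "HOPP" and a search for "hopp" produce the same tokens, which is why the match index can support case-insensitive substring searches.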

Now we push this configuration to CipherStash:

stash datasets config upload --file dataset.yml --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY

On success, stash datasets config upload displays the dataset configuration that was stored.

You can see your dataset configuration at any time with:

stash datasets config display --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY

Now it’s time for the next part: add a database migration for the new encrypted columns.