Part 1: Define which database columns should be encrypted
In the previous step, we:
- Added the CipherStash driver npm packages to our Express + Sequelize app
- Installed stash, the CipherStash CLI
Now we’re going to define which database columns should be encrypted.
In this part we will:
- Log in to the stash CLI tool
- Create a dataset, which defines what database columns are encrypted
- Create a client, which allows your Express app to access that dataset
- Push dataset configuration, so CipherStash knows what indexes to search with
1. Log in
Make sure stash is logged in:
stash login
This will save an authentication token that stash and @cipherstash/libpq can use to talk to CipherStash during local development.
2. Create a dataset
Next, we need to create a dataset for tracking what data needs to be encrypted.
A dataset holds configuration for one or more database tables that have columns you need to encrypt.
Create our first dataset by running:
stash datasets create patients --description "Data about patients"
The output will look like this:
Dataset created:
ID : <a UUID style ID>
Name : patients
Description: Data about patients
Note down the dataset ID, as you’ll need it in step 3.
3. Create a client
Next we need to create a client.
A client allows an application to programmatically access a dataset.
A dataset can have many clients (for example, different applications working with the same data), but a client belongs to exactly one dataset.
Use the dataset ID from step 2 to create a client (making sure you substitute your own dataset ID):
stash clients create --dataset-id $DATASET_ID "Express app"
The output will look like this:
Client created:
Client ID : <a UUID style ID>
Name : Express app
Description:
Dataset ID : <your provided dataset ID>
#################################################
# #
# Copy and store these credentials securely. #
# #
# THIS IS THE LAST TIME YOU WILL SEE THE KEY. #
# #
#################################################
Client ID : <a UUID style ID>
Client Key [hex] : <a long hex string>
Note down the client key somewhere safe, like a password vault. You will only ever see this credential once. This is your personal key, and you should not share it.
Set the client ID and client key as environment variables, using the variable names below:
export CS_CLIENT_KEY=
export CS_CLIENT_ID=
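Because later steps depend on both variables, it can help to check for them when the app starts. The following is a minimal sketch, not part of the CipherStash driver: the helper name missingCipherStashEnv is hypothetical, and only the two variable names come from this guide.

```javascript
// Hypothetical startup check; not part of the CipherStash driver.
const REQUIRED_VARS = ["CS_CLIENT_ID", "CS_CLIENT_KEY"];

// Return the names of required variables that are unset or blank.
function missingCipherStashEnv(env = process.env) {
  return REQUIRED_VARS.filter(
    (name) => typeof env[name] !== "string" || env[name].trim() === ""
  );
}

// Warn early rather than failing later with a less obvious error.
const missing = missingCipherStashEnv();
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`);
}
```

Whether to warn or exit at this point is a design choice for your app; the sketch only warns.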
4. Push the dataset configuration
Now we need to configure what columns are encrypted, and what indexes we want on those columns.
This configuration is used by the CipherStash driver to transparently rewrite your app’s SQL queries to use the underlying encrypted columns.
Our demo Sequelize app has a schema that looks like this:
const SCHEMA = {
  full_name: DataTypes.STRING,
  email: DataTypes.STRING,
  dob: DataTypes.DATEONLY,
  weight: DataTypes.FLOAT,
  allergies: DataTypes.STRING,
  medications: DataTypes.STRING
}
In this example we want to encrypt all columns, as they could all contain sensitive information. In other circumstances you might encrypt only a few of the columns.
We configure which columns are encrypted in a configuration file named dataset.yml, in the root of the demo:
# dataset.yml
tables:
  - path: patients
    fields:
      - name: full_name
        in_place: false
        cast_type: utf8-str
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique
      - name: email
        in_place: false
        cast_type: utf8-str
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique
      - name: allergies
        in_place: false
        cast_type: utf8-str
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique
      - name: medications
        in_place: false
        cast_type: utf8-str
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique
      - name: dob
        in_place: false
        cast_type: date
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: ore
      - name: weight
        in_place: false
        cast_type: float
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: ore
This configuration file defines three types of encrypted indexes for the columns we want to protect:
- A match index on the full_name, email, allergies and medications columns, for full text matches
- An ore index on all six columns (full_name, email, allergies, medications, dob and weight), for sorting and range queries
- A unique index on the full_name, email, allergies and medications columns, for exact-match queries
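To get a feel for what the match index settings mean, here is a rough sketch of an ngram tokenizer with token_length: 3 followed by a downcase token filter. It is illustrative only: the function name downcasedNgrams is hypothetical, and how the CipherStash driver processes and stores the resulting tokens (including the k and m settings) is not shown.

```javascript
// Illustrative only: the real index is built by the CipherStash driver.
function downcasedNgrams(text, tokenLength = 3) {
  const tokens = [];
  // Tokenizer: slide a window of tokenLength characters across the text.
  for (let i = 0; i + tokenLength <= text.length; i++) {
    tokens.push(text.slice(i, i + tokenLength));
  }
  // Token filter: lowercase every token so matching ignores case.
  return tokens.map((token) => token.toLowerCase());
}

console.log(downcasedNgrams("Grace")); // [ 'gra', 'rac', 'ace' ]
```

A search term is broken up in the same way, which is why a partial, case-insensitive term such as "rac" can match "Grace". Values shorter than the token length produce no tokens in this sketch.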
Now we push this configuration to CipherStash:
stash datasets config upload --file dataset.yml --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY
The stash datasets config upload command will display the dataset configuration that was successfully stored.
You can see your dataset configuration at any time with:
stash datasets config display --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY
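Before moving on, you may want to confirm that every model column has an entry in the dataset configuration. The sketch below assumes the configuration has already been loaded into a plain JavaScript object (parsing the YAML file would need a YAML library, which is not shown). The helper unconfiguredColumns and the abbreviated exampleConfig are hypothetical, not part of CipherStash tooling.

```javascript
// Hypothetical helper: list model columns with no entry in the dataset config.
function unconfiguredColumns(columns, config, tablePath) {
  const table = (config.tables || []).find((t) => t.path === tablePath);
  const fields = (table && table.fields) || [];
  const configured = new Set(fields.map((field) => field.name));
  return columns.filter((name) => !configured.has(name));
}

// Abbreviated stand-in for dataset.yml, as a plain object (weight left out on purpose).
const exampleConfig = {
  tables: [
    {
      path: "patients",
      fields: [
        { name: "full_name", cast_type: "utf8-str" },
        { name: "email", cast_type: "utf8-str" },
        { name: "dob", cast_type: "date" },
      ],
    },
  ],
};

console.log(
  unconfiguredColumns(["full_name", "email", "dob", "weight"], exampleConfig, "patients")
); // [ 'weight' ]
```

With the full configuration from this guide and all six schema columns, the result would be an empty array.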
Now it’s time for the next part: add a database migration for the new encrypted columns.