Create a Collection
All data in CipherStash exist inside Collections. You can think of these as loosely analogous to a table in a database or an index in a search system. For example, you might create a collection to store the users in your application. Collections might also be used to store application logs, insurance claims, sales orders or just about anything else you can imagine.

Defining a Collection

Let's imagine we want to create a collection to store company employees. The collection stores records in the form:

    {
      name: "Ada Lovelace",
      jobTitle: "Chief Executive Officer (CEO)",
      startDate: new Date(1852, 11, 27),
      salary: 250000n,
      active: true,
      employment: "full-time"
    }

In CipherStash, we call a record the source. It is encrypted in the client before being sent to the server and CipherStash never sees any plain-text data. Records are encoded as BSON before being encrypted.
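To make the idea of client-side encryption concrete, here is a minimal sketch using Node.js's built-in crypto module. It is not CipherStash's actual scheme: the `encryptRecord` and `decryptRecord` helpers and the `EncryptedRecord` type are hypothetical, JSON stands in for BSON to avoid extra dependencies, and key management is reduced to a random in-memory key.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto"

// Hypothetical envelope: the only data that would leave the client.
type EncryptedRecord = { iv: Buffer; authTag: Buffer; ciphertext: Buffer }

// Serialise a record and encrypt it with AES-256-GCM.
// JSON is an assumption here; the real client encodes records as BSON.
function encryptRecord(record: object, key: Buffer): EncryptedRecord {
  const iv = randomBytes(12) // fresh 96-bit nonce for every record
  const cipher = createCipheriv("aes-256-gcm", key, iv)
  const plaintext = Buffer.from(JSON.stringify(record), "utf8")
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()])
  return { iv, authTag: cipher.getAuthTag(), ciphertext }
}

// Verify the authentication tag, decrypt, and parse the record.
function decryptRecord(enc: EncryptedRecord, key: Buffer): unknown {
  const decipher = createDecipheriv("aes-256-gcm", key, enc.iv)
  decipher.setAuthTag(enc.authTag)
  const plaintext = Buffer.concat([decipher.update(enc.ciphertext), decipher.final()])
  return JSON.parse(plaintext.toString("utf8"))
}

const key = randomBytes(32) // 256-bit key that never leaves the client
const encrypted = encryptRecord({ name: "Ada Lovelace", active: true }, key)
const roundTripped = decryptRecord(encrypted, key)
```

Only the `encrypted` value would travel over the network in this sketch, so a server holding it without the key learns nothing about the record's fields.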
To create a collection for our employees, we first need to create a CollectionSchema by calling CollectionSchema.defineBasic(collectionName), then pass the schema to the createCollection(schema) function. Note that the collection name will be encrypted so that it is hidden from CipherStash.
TypeScript

    import { Stash, CollectionSchema } from "@cipherstash/stashjs"

    // Shape of the records stored in the collection
    type Employee = {
      id: string,
      name: string,
      jobTitle: string,
      startDate: Date,
      email: string,
      salary: bigint,
      active: boolean,
      employment: string
    }

    // A schema without indexes: records can only be retrieved by ID
    const schema = CollectionSchema.defineBasic<Employee>("employees").notIndexed()

    const employees = await stash.createCollection(schema)

JavaScript

    import { Stash, CollectionSchema } from "@cipherstash/stashjs"

    // A schema without indexes: records can only be retrieved by ID
    const schema = CollectionSchema.defineBasic("employees").notIndexed()

    const employees = await stash.createCollection(schema)

We can insert records into our collection and they will be fully encrypted in CipherStash. However, a Collection with a schema created via `notIndexed()` only supports retrieval of records by ID. Such a collection behaves like a simple key-value store. To perform useful queries on the data, we must define our collection schema with indexes.

Indexes

While traditional databases use indexes to improve query performance, CipherStash uses indexes to enable queries over an encrypted collection. You don't need to create an index for every field (in fact a collection can work as a simple key-value store without any indexes at all) but queries on a field are not possible without an index defined.
Let's have another go at creating our collection, but this time with some useful indexes.
TypeScript

    import { Stash, CollectionSchema, downcase, ngram } from "@cipherstash/stashjs"

    // Shape of the records stored in the collection
    type Employee = {
      id: string,
      name: string,
      jobTitle: string,
      startDate: Date,
      email: string,
      salary: bigint,
      active: boolean,
      employment: string
    }

    const schema = CollectionSchema.define<Employee>("employees").indexedWith(
      mapping => ({
        // Exact indexes: equality queries
        email: mapping.Exact("email"),
        employment: mapping.Exact("employment"),
        active: mapping.Exact("active"),
        // Range indexes: comparison queries
        salary: mapping.Range("salary"),
        startDate: mapping.Range("startDate"),
        // Match index: full text search across two fields
        nameAndJobTitle: mapping.Match(["name", "jobTitle"], {
          tokenFilters: [downcase],
          tokenizer: ngram({ tokenLength: 3 })
        })
      })
    )

    const employeesCollection = await stash.createCollection(schema)

JavaScript

    import { Stash, CollectionSchema, downcase, ngram } from "@cipherstash/stashjs"

    const schema = CollectionSchema.define("employees").indexedWith(
      mapping => ({
        // Exact indexes: equality queries
        email: mapping.Exact("email"),
        employment: mapping.Exact("employment"),
        active: mapping.Exact("active"),
        // Range indexes: comparison queries
        salary: mapping.Range("salary"),
        startDate: mapping.Range("startDate"),
        // Match index: full text search across two fields
        nameAndJobTitle: mapping.Match(["name", "jobTitle"], {
          tokenFilters: [downcase],
          tokenizer: ngram({ tokenLength: 3 })
        })
      })
    )

    const employeesCollection = await stash.createCollection(schema)

If you are following along and need to delete the collection we created earlier before creating it again, you can do so using deleteCollection.

Defining indices with field mappings

Let's break down what we've done so far. In the example code above we defined a CollectionSchema named "employees" and we've also defined some mappings on the fields by calling indexedWith and providing a callback function as an argument. The return value of the callback is an object that describes the indices for our record type. Indices define how we can query our collection.
The keys of the object (in the example above, email, employment, active, salary, startDate, nameAndJobTitle) are the names of the indices. The values of the object define the type of the index which determines the supported query operations.
For example, mapping.Exact("email") defines an index that provides the ability to use an equality clause in queries on that index. mapping.Range("salary") defines the ability to use a comparison clause in a query - e.g. less than, greater than, between etc.
There is also another kind of mapping: Match. An index created with Match can perform full text search queries. The following fragment, for example, enables full text search across two fields (name and jobTitle) first by normalising the text with a downcase filter and then analyzing the text with an ngram tokenizer.
TypeScript / JavaScript

    mapping.Match(["name", "jobTitle"], {
      tokenFilters: [downcase],
      tokenizer: ngram({ tokenLength: 3 })
    })

A Match index with this configuration supports "typeahead" queries. Typeahead (search as you type) is a common pattern when interacting with data in forms.
Match indices in CipherStash work on string fields and allow for free text searches in a similar way to SQL's LIKE. They use a fast encrypted B-Tree and support large collections (using n-grams and boolean search under the hood). If you don't need to do partial string matches, you can just use an Exact index, which only matches exact strings.
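To illustrate what a downcase filter followed by an n-gram tokenizer with a token length of 3 produces, here is a plaintext sketch. It is not the CipherStash implementation: `downcaseText`, `ngramTokens` and `matches` are hypothetical helpers, splitting on whitespace before tokenizing is an assumption, and the real index operates on encrypted terms rather than a plain `Set`.

```typescript
// Lower-case the input, mirroring the role of the downcase token filter.
function downcaseText(text: string): string {
  return text.toLowerCase()
}

// Split text into overlapping n-grams of a fixed length, word by word.
function ngramTokens(text: string, tokenLength: number): string[] {
  const tokens: string[] = []
  for (const word of text.split(/\s+/).filter(w => w.length > 0)) {
    for (let i = 0; i + tokenLength <= word.length; i++) {
      tokens.push(word.slice(i, i + tokenLength))
    }
  }
  return tokens
}

// Terms that would be indexed for a record whose name is "Ada Lovelace".
const indexed = new Set(ngramTokens(downcaseText("Ada Lovelace"), 3))

// Boolean search: a query matches when every one of its trigrams is indexed.
function matches(query: string): boolean {
  const queryTokens = ngramTokens(downcaseText(query), 3)
  return queryTokens.length > 0 && queryTokens.every(t => indexed.has(t))
}

console.log([...indexed])     // trigrams for "ada" and "lovelace"
console.log(matches("LOVE"))  // partial, case-insensitive match
console.log(matches("grace")) // no shared trigram sequence
```

Because each query is reduced to trigrams, a partial string such as "love" can match a record without the full field value ever being compared, which is what makes this configuration suitable for typeahead.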
| Index Name | Index Type | Enables |
|---|---|---|
| email | Exact | Searches for exact email |
| employment | Exact | Searches for exact employment type |
| active | Exact | Searches for exact active status |
| salary | Range | Range queries on salary |
| startDate | Range | Range queries on the employee's start date |
| nameAndJobTitle | Match | Full text search on name and job title |

For a full list of supported index types see Index Types.

Adding Indexes to a Collection

At the moment it isn't possible to add a new index to an existing collection: you must define all the indexes you need before adding data to the collection. This is something that will be addressed in a future version. However, you can re-index records one at a time if you need to (say, if you have changed the collection settings). See Put, Get and Delete Records.

Primary Keys

Every Collection must have a primary key: it can be a field you have defined in a source record, or it can be automatically generated for you. We'll have CipherStash generate IDs for us.
Generated IDs are 128-bit Universally Unique Identifiers (UUID). This is in contrast to integer sequences which are common in traditional data stores and are prone to leaking information about the data (such as insertion order).
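As a rough illustration of why random UUIDs reveal nothing about insertion order, the sketch below generates two IDs with Node.js's built-in `randomUUID`. It assumes version 4 (random) UUIDs, which is consistent with the example ID shown in this guide but is not confirmed as the exact generator the CipherStash client uses; `uuidPattern` is a hypothetical helper for this example only.

```typescript
import { randomUUID } from "node:crypto"

// Version 4 UUIDs carry 122 random bits, so consecutive IDs share no ordering.
const firstId = randomUUID()
const secondId = randomUUID()

// Canonical textual form: 8-4-4-4-12 hexadecimal digits, version nibble 4.
const uuidPattern = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/

console.log(firstId, uuidPattern.test(firstId))
console.log(secondId, firstId !== secondId)
```

Unlike an integer sequence, seeing `secondId` tells an observer nothing about how many records were inserted before it or when.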

Auto-generating IDs

If we don't provide an ID, the CipherStash client will generate one for us when we insert a record using put.
TypeScript / JavaScript

    // No id is supplied, so the client generates one
    const ada = await employees.put({
      name: "Ada Lovelace",
      jobTitle: "Chief Technology Officer",
      salary: 200000n,
      startDate: new Date("2019-10-15"),
      active: true,
      employment: "full-time"
    })

The returned record will have the auto-generated ID set:
TypeScript / JavaScript

    ada.id // 'd8e54fde-3925-429f-bb56-1788dfc59773'


Other Primary Keys

If you have a field in your source record that you want to use as a primary key, you can do so by first converting it into a uniformly random-looking value. Here we'll do that by using the HMAC function in the Node.js crypto API.
TypeScript / JavaScript

    const { createHmac } = require('crypto')

    // SECRET_KEY is your application's HMAC key, loaded from secure storage
    const hmac = createHmac('sha256', SECRET_KEY)
    hmac.update('[email protected]')

    const ada = await employees.put({
      "id": hmac.digest(),
      // ...other fields
    })

Don't just use a hash like SHA256 for the ID here as this will make it very easy for an adversary to reverse engineer the values. A keyed hash function like HMAC is a much better way to go.
Note that you will need a secret key for HMAC. You can use a random 32-byte string and store this securely in your application (say, using AWS Secrets Manager or HashiCorp Vault).
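The difference between a plain hash and a keyed hash can be sketched with Node.js's crypto module alone. In the example below, `deriveId` and `plainHash` are hypothetical helpers, the email address is a placeholder, and the key is generated in memory purely for demonstration; a real application would load it from its secrets manager.

```typescript
import { createHash, createHmac, randomBytes } from "node:crypto"

// A random 32-byte key; in practice, load it from your secrets manager instead.
const secretKey = randomBytes(32)

// Hypothetical helper: derive a stable ID from a source field with HMAC-SHA256.
function deriveId(value: string, key: Buffer): Buffer {
  return createHmac("sha256", key).update(value).digest()
}

// Unkeyed hash, shown only to illustrate the weakness.
function plainHash(value: string): string {
  return createHash("sha256").update(value).digest("hex")
}

const email = "ada@example.com" // placeholder value for this sketch
const id = deriveId(email, secretKey)

// Anyone can hash a guessed value and compare it with an unkeyed ID...
const leakedPlainId = plainHash(email)
const guessConfirmed = plainHash("ada@example.com") === leakedPlainId

// ...but without the key, a guess cannot be checked against the HMAC-derived ID.
const attackerKey = randomBytes(32)
const guessRejected = !deriveId("ada@example.com", attackerKey).equals(id)

console.log(id.toString("hex"), guessConfirmed, guessRejected)
```

The same input and key always yield the same ID, so the record can still be looked up later, while an adversary who lacks the key cannot confirm guesses about the underlying value.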

Deleting a Collection

To delete the employees collection, you can use the deleteCollection function:

TypeScript / JavaScript

    await stash.deleteCollection("employees")
