CipherStash Documentation

Define a Collection

All data in CipherStash is stored in a collection.

A collection is a set of encrypted records that share a similar structure and access pattern.

It is analogous to a table in an RDBMS.

Before creating a collection, we need to define its schema.

A schema describes the structure of the collection and the indexing strategies for the data stored in it.

In CipherStash, you can only search against fields that have been indexed.

Index Types

Tokenizers and token filters

Tokenizer

Divides text into individual terms (tokens).

These terms are used for finding matches when querying records.

  • Standard: Divides text based on word boundaries. e.g. “Welcome to CipherStash” is tokenized to ["Welcome", "to", "CipherStash"]

  • NgramTokenizer: Divides text into tokens of the given token length, like a sliding window across the text (spaces included). e.g. “Define a collection” with {tokenLength: 3} is tokenized to ["Def", "efi", "fin", "ine", "ne ", "e a", " a ", "a c", " co", "col", "oll", "lle", "lec", "ect", "cti", "tio", "ion"]
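
The two tokenizers can be sketched in plain TypeScript. This is an illustrative approximation, not the CipherStash implementation: standardTokenize and ngramTokenize are hypothetical names, and splitting on whitespace is a simplifying assumption standing in for real word-boundary detection.

```typescript
// Hypothetical helper: approximates the standard tokenizer by splitting on whitespace.
function standardTokenize(text: string): string[] {
  return text.split(/\s+/).filter((term) => term.length > 0);
}

// Hypothetical helper: slides a window of `tokenLength` characters across the
// whole text, one character at a time. Spaces are part of the window.
function ngramTokenize(text: string, tokenLength: number): string[] {
  const tokens: string[] = [];
  for (let i = 0; i + tokenLength <= text.length; i++) {
    tokens.push(text.slice(i, i + tokenLength));
  }
  return tokens;
}

console.log(standardTokenize("Welcome to CipherStash"));
// → ["Welcome", "to", "CipherStash"]

console.log(ngramTokenize("Define a collection", 3));
// → 17 tokens: "Def", "efi", "fin", "ine", "ne ", "e a", " a ", ... "tio", "ion"
```

Because the ngram tokenizer works on the raw text, tokens such as "ne " and " co" span word boundaries.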

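Token Filters

Token filters transform the terms produced by the tokenizer.

  • NGram: Divides each term into tokens of the given token length.

    e.g. “Define a collection” with the standard tokenizer and an ngram filter with {tokenLength: 3}

    The standard tokenizer returns the terms ["Define", "a", "collection"]

    Then, with the ngram filter applied to each term:

    ["Def", "efi", "fin", "ine", "col", "oll", "lle", "lec", "ect", "cti", "tio", "ion"]

    The term "a" is shorter than the token length, so it produces no tokens here.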

  • Downcase: Converts terms to lowercase

  • Upcase: Converts terms to uppercase
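
The way a tokenizer and a chain of token filters combine can be sketched as follows. This is a minimal illustration, not the CipherStash implementation: TokenFilter, downcase, ngramFilter and analyze are hypothetical names, the whitespace split stands in for the standard tokenizer, and it assumes that filters run in the order listed and that terms shorter than the token length yield no tokens.

```typescript
// Hypothetical type: a token filter maps a list of terms to a new list of terms.
type TokenFilter = (terms: string[]) => string[];

// Lowercases every term.
const downcase: TokenFilter = (terms) => terms.map((term) => term.toLowerCase());

// Splits each term into n-grams of `tokenLength` characters.
// Assumption: terms shorter than `tokenLength` produce no n-grams.
const ngramFilter = (tokenLength: number): TokenFilter => (terms) => {
  const grams: string[] = [];
  for (const term of terms) {
    for (let i = 0; i + tokenLength <= term.length; i++) {
      grams.push(term.slice(i, i + tokenLength));
    }
  }
  return grams;
};

// Tokenizes on whitespace, then applies each filter in the order given.
function analyze(text: string, filters: TokenFilter[]): string[] {
  let terms = text.split(/\s+/).filter((term) => term.length > 0);
  for (const filter of filters) {
    terms = filter(terms);
  }
  return terms;
}

console.log(analyze("Define a collection", [downcase, ngramFilter(3)]));
// → ["def", "efi", "fin", "ine", "col", "oll", "lle", "lec", "ect", "cti", "tio", "ion"]
```

Unlike the ngram tokenizer, the ngram filter works term by term, so no token spans a word boundary.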

Define a schema using JSON

The JSON type definition is defined here.

JSON example:

{
  "type": {
    "title": "string",
    "runningTime": "uint64",
    "year": "uint64"
  },
  "indexes": {
    "exactTitle": { "kind": "exact", "field": "title" },
    "runningTime": { "kind": "range", "field": "runningTime" },
    "year": { "kind": "range", "field": "year" },
    "title": {
      "kind": "match",
      "fields": ["title"],
      "tokenFilters": [
        { "kind": "downcase" },
        { "kind": "ngram", "tokenLength": 3 }
      ],
      "tokenizer": { "kind": "standard" }
    },
    "allTextDynamicMatch": {
      "kind": "dynamic-match",
      "tokenFilters": [
        { "kind": "downcase" }
      ],
      "tokenizer":  { "kind": "ngram", "tokenLength": 3 }
    },
    "allTextFieldDynamicMatch": {
      "kind": "field-dynamic-match",
      "tokenFilters": [
        { "kind": "downcase" }
      ],
      "tokenizer":  { "kind": "ngram", "tokenLength": 3 }
    }
  }
}
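
Since only indexed fields are searchable, it can help to check that each index points at a field declared under "type" before submitting a schema. The sketch below is a hypothetical helper, not part of stashjs: the SchemaJSON and IndexDefinition shapes are inferred from the example above rather than taken from the official type definition, and it assumes indexes refer to top-level field names through "field" or "fields".

```typescript
// Shapes inferred from the JSON example only (assumption, not the official definition).
type IndexDefinition = {
  kind: string;
  field?: string;
  fields?: string[];
  [option: string]: unknown;
};

type SchemaJSON = {
  type: { [fieldName: string]: string };
  indexes: { [indexName: string]: IndexDefinition };
};

// Returns one message per index reference to a field that "type" does not declare.
function findUnknownFields(schema: SchemaJSON): string[] {
  const problems: string[] = [];
  for (const indexName of Object.keys(schema.indexes)) {
    const index = schema.indexes[indexName];
    // Collect the fields named by "field" and "fields"; dynamic indexes name none.
    const referenced = (index.field ? [index.field] : []).concat(index.fields || []);
    for (const fieldName of referenced) {
      if (schema.type[fieldName] === undefined) {
        problems.push(`index "${indexName}" references undeclared field "${fieldName}"`);
      }
    }
  }
  return problems;
}

const draft: SchemaJSON = {
  type: { title: "string", year: "uint64" },
  indexes: {
    exactTitle: { kind: "exact", field: "title" },
    rating: { kind: "range", field: "rating" }, // "rating" is not declared above
    allText: { kind: "dynamic-match", tokenizer: { kind: "standard" } },
  },
};

console.log(findUnknownFields(draft));
// → ['index "rating" references undeclared field "rating"']
```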

TypeScript example:

import {
  CollectionSchema,
  StashRecord,
  generateSchemaDefinitionFromJSON,
  describeError,
} from "@cipherstash/stashjs";

// Shape of the records stored in the "movies" collection.
interface Movie extends StashRecord {
  title: string;
  runningTime: number;
  year: number;
}

export const generateSchemaUsingJSON = async () => {
  const collectionName = "movies";

  const jsonMapping = JSON.stringify({
      "type": {
        "title": "string",
        "runningTime": "uint64",
        "year": "uint64",
      },
      "indexes": {
        "exactTitle": { "kind": "exact", "field": "title" },
        "runningTime": { "kind": "range", "field": "runningTime" },
        "year": { "kind": "range", "field": "year" },
        "title": {
          "kind": "match",
          "fields": ["title"],
          "tokenFilters": [
            { "kind": "downcase" },
            { "kind": "ngram", "tokenLength": 3 }
          ],
          "tokenizer": { "kind": "standard" }
        },
        "allTextDynamicMatch": {
          "kind": "dynamic-match",
          "tokenFilters": [
            { "kind": "downcase" }
          ],
          "tokenizer":  { "kind": "ngram", "tokenLength": 3 }
        },
        "allTextFieldDynamicMatch": {
          "kind": "field-dynamic-match",
          "tokenFilters": [
            { "kind": "downcase" }
          ],
          "tokenizer":  { "kind": "ngram", "tokenLength": 3 }
        }
      }
    });

  try {
    const schemaDefinition = await generateSchemaDefinitionFromJSON(
      jsonMapping
    );

    return CollectionSchema.define<Movie>(
      collectionName
    ).fromCollectionSchemaDefinition(schemaDefinition);
  } catch (err) {
    console.error(`Could not define schema. Reason: ${describeError(err)}`);
  }
};

generateSchemaUsingJSON();

Define a schema without mappings

This is useful for defining a collection that serves as a simple key-value store.

The only operations supported on a collection with this schema are put, get and delete.

It cannot be queried.

// Movie is the record interface defined in the JSON schema example above.
import { CollectionSchema } from "@cipherstash/stashjs";

export const generateSchemaWithoutMapping = () => {
  return CollectionSchema.define<Movie>("movies").notIndexed();
};

generateSchemaWithoutMapping();