Collections
Overview
A Collection is a grouping of Entities with the same Partition Key and allows you to make efficient query across multiple entities. If your background is SQL, imagine Partition Keys as Foreign Keys, a Collection represents a View with multiple joined Entities.
ElectroDB Collections use a single DynamoDB query to retrieve results. One query is made to retrieve results for all Entities (the benefits of single table design), however keep in mind that DynamoDB returns all records in order of the Entity’s sort key. In cases where your partition contains a large volume of items, it is possible some entities will not return items during pagination. This can be mitigated through the use of Index Types.
Collections are defined on an Index, and the name of the collection should represent what the query would return as a pseudo Entity
. Additionally, Collection names must be unique across a Service
.
A
collection
name must be unique to a single common index across entities.
Table Definition
Example Setup
Table Definition
{ "TableName": "electro", "KeySchema": [ { "AttributeName": "pk", "KeyType": "HASH" }, { "AttributeName": "sk", "KeyType": "RANGE" } ], "AttributeDefinitions": [ { "AttributeName": "pk", "AttributeType": "S" }, { "AttributeName": "sk", "AttributeType": "S" }, { "AttributeName": "gsi1pk", "AttributeType": "S" }, { "AttributeName": "gsi1sk", "AttributeType": "S" } ], "GlobalSecondaryIndexes": [ { "IndexName": "gsi1pk-gsi1sk-index", "KeySchema": [ { "AttributeName": "gsi1pk", "KeyType": "HASH" }, { "AttributeName": "gsi1sk", "KeyType": "RANGE" } ], "Projection": { "ProjectionType": "ALL" } } ], "BillingMode": "PAY_PER_REQUEST" }
Entities
import { Entity } from "electrodb";
export const Employee = new Entity({
model: {
entity: "employee",
version: "1",
service: "taskapp",
},
attributes: {
employeeId: {
type: "string",
},
organizationId: {
type: "string",
},
name: {
type: "string",
},
team: {
type: ["jupiter", "mercury", "saturn"],
},
},
indexes: {
staff: {
pk: {
field: "pk",
composite: ["organizationId"],
},
sk: {
field: "sk",
composite: ["employeeId"],
},
},
employee: {
collection: "assignments",
index: "gsi2",
pk: {
field: "gsi2pk",
composite: ["employeeId"],
},
sk: {
field: "gsi2sk",
composite: [],
},
},
},
});
import { Entity } from "electrodb";
const Task = new Entity({
model: {
entity: "tasks",
version: "1",
service: "taskapp",
},
attributes: {
taskId: {
type: "string",
},
employeeId: {
type: "string",
},
projectId: {
type: "string",
},
title: {
type: "string",
},
body: {
type: "string",
},
},
indexes: {
project: {
pk: {
field: "pk",
composite: ["projectId"],
},
sk: {
field: "sk",
composite: ["taskId"],
},
},
assigned: {
collection: "assignments",
index: "gsi2",
pk: {
field: "gsi2pk",
composite: ["employeeId"],
},
sk: {
field: "gsi2sk",
composite: ["projectId"],
},
},
},
});
Example
import DynamoDB from "aws-sdk/clients/dynamodb";
const table = "projectmanagement";
const client = new DynamoDB.DocumentClient();
const TaskApp = new Service({
employee: Employee,
task: Task,
});
await TaskApp.collections.assignments({ employeeId: "JExotic" }).go();
Response Format
{
data: {
task: EmployeeItem[];
employee: TaskItem[];
}
cursor: string | null;
}
Equivalent Parameters
{
"TableName": "projectmanagement",
"ExpressionAttributeNames": { "#pk": "gsi2pk", "#sk1": "gsi2sk" },
"ExpressionAttributeValues": {
":pk": "$taskapp_1#employeeid_joeexotic",
":sk1": "$assignments"
},
"KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
"IndexName": "gsi2"
}
Collection Queries vs Entity Queries
To query across entities, collection queries make use of ElectroDB’s Sort Key structure, which prefixes Sort Key fields with the collection name. Unlike an Entity Query, Collection queries for isolated indexes only leverage Composite Attributes from an access pattern’s Partition Key, while Collection queries for clustered indexes allow you to query on both Partition and Sort Keys.
To better explain how Collection Queries are formed, here is a juxtaposition of an Entity Query’s parameters vs a Collection Query’s parameters:
Entity Query
Example
await TaskApp.entities.task.query.assigned({ employeeId: "JExotic" }).go();
Equivalent Parameters
{
"KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
"TableName": "projectmanagement",
"ExpressionAttributeNames": {
"#pk": "gsi2pk",
"#sk1": "gsi2sk"
},
"ExpressionAttributeValues": {
":pk": "$taskapp#employeeid_jexotic",
":sk1": "$assignments#tasks_1"
},
"IndexName": "gsi2"
}
Collection Query
Example
await TaskApp.collections.assignments({ employeeId: "JExotic" }).go();
Equivalent Parameters
{
"KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
"TableName": "projectmanagement",
"ExpressionAttributeNames": { "#pk": "gsi2pk", "#sk1": "gsi2sk" },
"ExpressionAttributeValues": {
":pk": "$taskapp#employeeid_jexotic",
":sk1": "$assignments"
},
"IndexName": "gsi2"
}
The notable difference between the two is how much of the Sort Key is specified at query time.
Entity Query Params
{
"ExpressionAttributeValues": { ":sk1": "$assignments#tasks_1" }
}
Collection Query Params
{
"ExpressionAttributeValues": { ":sk1": "$assignments" }
}
Collection Response Structure
Unlike Entity Queries which return an array, Collection Queries return an object. This object will have a key for every Entity name (or Entity Alias) associated with that Collection, and an array for all results queried that belong to that Entity.
For example, using the “TaskApp” models defined above, we would expect the following response from a query to the “assignments” collection:
Example
const results = await TaskApp.collections
.assignments({ employeeId: "JExotic" })
.go();
Response Format
{
data: {
tasks: [...], // tasks for employeeId "JExotic"
employees: [...] // employee record(s) with employeeId "JExotic"
},
cursor: null
}
Equivalent Parameters
{
"TableName": "projectmanagement",
"ExpressionAttributeNames": { "#pk": "gsi2pk", "#sk1": "gsi2sk" },
"ExpressionAttributeValues": {
":pk": "$taskapp_1#employeeid_joeexotic",
":sk1": "$assignments"
},
"KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
"IndexName": "gsi2"
}
Because the Tasks and Employee Entities both associated their index (gsi2
) with the same collection name (assignments
), ElectroDB is able to associate the two entities via a shared Partition Key. As stated in the collections section, querying across Entities by PK can be comparable to querying across a foreign key in a traditional relational database.
Sub-Collections
Sub-Collections are an extension of Collection functionality that allow you to model more advanced access patterns. Collections and Sub-Collections are defined on Indexes via a property called collection
, as either a string or string array respectively.
Sub-Collections are only supported on “isolated” index types.
The following is an example of functionally identical collections, implemented as a string (referred to as a “collection”) and then as a string array (referred to as sub-collections):
As a string (collection)
{
collection: "assignments"
pk: {
field: "pk",
composite: ["employeeId"]
},
sk: {
field: "sk",
composite: ["projectId"]
}
}
As a string array (sub-collections)
{
collection: ["assignments"]
pk: {
field: "pk",
composite: ["employeeId"]
},
sk: {
field: "sk",
composite: ["projectId"]
}
}
Both implementations above will create a “collections” method called assignments
when added to a Service.
const results = await TaskApp.collections
.assignments({ employeeId: "JExotic" })
.go();
The advantage to using a string array to define collections is the ability to express sub-collections. Below is an example of three entities using sub-collections, followed by an explanation of their sub-collection definitions:
import { Entity } from "electrodb";
const employees = new Entity({
model: {
entity: "employees",
version: "1",
service: "taskapp",
},
attributes: {
employeeId: {
type: "string",
},
organizationId: {
type: "string",
},
name: {
type: "string",
},
team: {
type: ["jupiter", "mercury", "saturn"] as const,
},
},
indexes: {
staff: {
pk: {
field: "pk",
composite: ["organizationId"],
},
sk: {
field: "sk",
composite: ["employeeId"],
},
},
employee: {
// highlight next line
collection: "contributions",
index: "gsi2",
pk: {
field: "gsi2pk",
composite: ["employeeId"],
},
sk: {
field: "gsi2sk",
composite: [],
},
},
},
});
import { Entity } from "electrodb";
const tasks = new Entity({
model: {
entity: "tasks",
version: "1",
service: "taskapp",
},
attributes: {
taskId: {
type: "string",
},
employeeId: {
type: "string",
},
projectId: {
type: "string",
},
title: {
type: "string",
},
body: {
type: "string",
},
},
indexes: {
project: {
collection: "overview",
pk: {
field: "pk",
composite: ["projectId"],
},
sk: {
field: "sk",
composite: ["taskId"],
},
},
assigned: {
// highlight next line
collection: ["contributions", "assignments"] as const,
index: "gsi2",
pk: {
field: "gsi2pk",
composite: ["employeeId"],
},
sk: {
field: "gsi2sk",
composite: ["projectId"],
},
},
},
});
import { Entity } from "electrodb";
const projectMembers = new Entity({
model: {
entity: "projectMembers",
version: "1",
service: "taskapp",
},
attributes: {
employeeId: {
type: "string",
},
projectId: {
type: "string",
},
name: {
type: "string",
},
},
indexes: {
members: {
collection: "overview",
pk: {
field: "pk",
composite: ["projectId"],
},
sk: {
field: "sk",
composite: ["employeeId"],
},
},
projects: {
// highlight next line
collection: ["contributions", "assignments"] as const,
index: "gsi2",
pk: {
field: "gsi2pk",
composite: ["employeeId"],
},
sk: {
field: "gsi2sk",
composite: [],
},
},
},
});
import DynamoDB from "aws-sdk/clients/dynamodb";
const table = "projectmanagement";
const client = new DynamoDB.DocumentClient();
const TaskApp = new Service(
{
employees,
tasks,
projectMembers,
},
{ client, table },
);
TypeScript Note: Use
as const
syntax when definingcollection
as a string array for improved type support
The last line of the code block above creates a Service called TaskApp
using the Entity instances created above its declaration. By creating a Service, ElectroDB will identify and validate the sub-collections defined across all three models. The result in this case are four unique collections: “overview”, “contributions”, and “assignments”.
The simplest collection to understand is overview
. This collection is defined on the table’s Primary Index, composed of a projectId
in the Partition Key, and is currently implemented by two Entities: tasks
and projectMembers
. If another entity were to be added to our service, it could “join” this collection by implementing an identical Partition Key composite (projectId
) and labeling itself as part of the overview
collection. The following is an example of using the overview
collection:
Simple Collections
Example
// overview
const results = await TaskApp.collections
.overview({ projectId: "SD-204" })
.go();
Response Format
{
data: {
tasks: [...], // tasks associated with projectId "SD-204
projectMembers: [...] // employees of project "SD-204"
},
cursor: null,
}
Equivalent Parameters
{
"KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
"TableName": "projectmanagement",
"ExpressionAttributeNames": { "#pk": "pk", "#sk1": "sk" },
"ExpressionAttributeValues": {
":pk": "$taskapp#projectid_sd-204",
":sk1": "$overview"
}
}
Complex Collections
Unlike overview
, the collections contributions
, and assignments
are more complex.
In the case of contributions
, all three entities implement this collection on the gsi2
index, and compose their Partition Key with the employeeId
attribute. The assignments
collection, however, is only implemented by the tasks
and projectMembers
Entities. Below is an example of using these collections:
Collection values of
collection: "contributions"
andcollection: ["contributions"]
are interpreted by ElectroDB as being the same implementation.
Example
// contributions
const results = await TaskApp.collections
.contributions({ employeeId: "JExotic" })
.go();
Response Format
{
data: {
tasks: [...], // tasks assigned to employeeId "JExotic"
projectMembers: [...], // projects with employeeId "JExotic"
employees: [...] // employee record(s) with employeeId "JExotic"
},
cursor: null,
}
Equivalent Parameters
{
"KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
"TableName": "projectmanagement",
"ExpressionAttributeNames": { "#pk": "gsi2pk", "#sk1": "gsi2sk" },
"ExpressionAttributeValues": {
":pk": "$taskapp#employeeid_jexotic",
":sk1": "$contributions"
},
"IndexName": "gsi2"
}
Complex Collections (Continued)
This collection contains only the entities that share the second collection element.
Example
const results = await TaskApp.collections
.assignments({ employeeId: "JExotic" })
.go();
Response Format
{
data: {
tasks: [...], // tasks assigned to employeeId "JExotic"
projectMembers: [...], // projects with employeeId "JExotic"
},
cursor: null,
}
Equivalent Parameters
{
"KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
"TableName": "projectmanagement",
"ExpressionAttributeNames": { "#pk": "gsi2pk", "#sk1": "gsi2sk" },
"ExpressionAttributeValues": {
":pk": "$taskapp#employeeid_jexotic",
":sk1": "$contributions#assignments"
},
"IndexName": "gsi2"
}
Looking above we can see that the assignments
collection is actually a subset of the results that could be queried with the contributions
collection. The power behind having the assignments
sub-collection is the flexibility to further slice and dice your cross-entity queries into more specific and performant queries.
Index and Collection Naming Conventions
ElectroDB puts an emphasis on allowing users to define more domain specific naming. Instead of referring to indexes by their name on the table, ElectroDB allows users to define their indexes as Access Patterns.
Please refer to the Entities defined in the section Sub-Collections as the source of examples within this section.
Index Naming Conventions
The following is an access pattern on the “employees” entity defined here:
staff: {
pk: {
field: "pk",
composite: ["organizationId"]
},
sk: {
field: "sk",
composite: ["employeeId"]
}
}
This Access Pattern is defined on the table’s Primary Index (note the lack of an index
property), is given the name staff
, and is composed of an organizationId
and an employeeId
.
When deciding on an Access Pattern name, ask yourself, “What would the array of items returned represent if I only supplied the Partition Key”. In this example case, the entity defines an “Employee” by its organizationId
and employeeId
. If you performed a query against this index, and only provided organizationId
you would then expect to receive all Employees for that Organization. From there, the name staff
was chosen because the focus becomes “What are these Employees to that Organization?“.
This convention also becomes evident when you consider Access Pattern name becomes the name of the method you use query that index.
await employee.query.staff({ organizationId: "nike" }).go();
Collection Naming Conventions
The following are access patterns on entities defined here:
// employees entity
employee: {
collection: "contributions",
index: "gsi2",
pk: {
field: "gsi2pk",
composite: ["employeeId"],
},
sk: {
field: "gsi2sk",
composite: [],
},
}
// tasks entity
assigned: {
collection: ["contributions", "assignments"],
index: "gsi2",
pk: {
field: "gsi2pk",
composite: ["employeeId"],
},
sk: {
field: "gsi2sk",
composite: ["projectId"],
},
}
// projectMembers entity
projects: {
collection: ["contributions", "assignments"] as const,
index: "gsi2",
pk: {
field: "gsi2pk",
composite: ["employeeId"],
},
sk: {
field: "gsi2sk",
composite: [],
},
}
In the case of the entities above, we see an example of a sub-collection. ElectroDB will use the above definitions to generate two collections: contributions
, assignments
.
The considerations for naming a collection are nearly identical to the considerations for naming an index: What do the query results from supplying just the Partition Key represent? In the case of collections you must also consider what the results represent across all the involved entities, and the entities that may be added in the future.
For example, the contributions
collection is named such because when given an employeeId
we receive the employee’s details, the tasks the that employee, and the projects where they are currently a member.
In the case of assignments
, we receive a subset of contributions
when supplying an employeeId
: Only the tasks and projects they are “assigned” are returned.