Lecture 4: Data modelling in MongoDB
Learning objectives
By the end of this lecture, students should understand:
- Why data modelling is important
- The key principle of data modelling in MongoDB
- Techniques for developing a data model: estimating the workload, identifying relationships, and applying schema patterns
- The difference between embedding and referencing, and when to use each
Supplemental materials
Embed vs reference guideline
| Guideline name | Question | Embed | Reference |
|---|---|---|---|
| Simplicity | Would keeping the pieces of information together lead to a simpler data model and code? | Yes | No |
| Go Together | Do the pieces of information have a “has-a,” “contains,” or similar relationship? | Yes | No |
| Query Atomicity | Does the application query the pieces of information together? | Yes | No |
| Update Complexity | Are the pieces of information updated together? | Yes | No |
| Archival | Should the pieces of information be archived at the same time? | Yes | No |
| Cardinality | Is there a high cardinality (current or growing) in the child side of the relationship? | No | Yes |
| Data Duplication | Would data duplication be too complicated to manage and undesired? | No | Yes |
| Document Size | Would the combined size of the pieces of information take too much memory or transfer bandwidth for the application? | No | Yes |
| Document Growth | Would the embedded piece grow without bound? | No | Yes |
| Workload | Are the pieces of information written at different times in a write-heavy workload? | No | Yes |
| Individuality | For the children side of the relationship, can the pieces exist by themselves without a parent? | No | Yes |
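
To make the two options in the guideline concrete, the sketch below models the same data both ways using plain Python dictionaries in place of MongoDB documents. The blog post and comment example, the field names (`comments`, `post_id`, `author`, `text`) and the helper `comments_for` are all hypothetical and chosen only for illustration; no database connection is involved.

```python
# Hypothetical blog data; every field name here is an illustrative assumption.

# Embedding: the comments live inside the post document,
# so a single read returns the post together with its comments.
embedded_post = {
    "_id": 1,
    "title": "Data modelling in MongoDB",
    "comments": [
        {"author": "ana", "text": "Clear summary."},
        {"author": "ben", "text": "Helpful table."},
    ],
}

# Referencing: comments are separate documents that point back
# to their post through a post_id field.
referenced_post = {"_id": 1, "title": "Data modelling in MongoDB"}
comments = [
    {"_id": 101, "post_id": 1, "author": "ana", "text": "Clear summary."},
    {"_id": 102, "post_id": 1, "author": "ben", "text": "Helpful table."},
    {"_id": 103, "post_id": 2, "author": "cy", "text": "Unrelated post."},
]


def comments_for(post_id, all_comments):
    """Resolve the reference with a second lookup filtered on post_id."""
    return [c for c in all_comments if c["post_id"] == post_id]


# Both models can answer the same question; they differ in how many
# lookups are needed and in where the comment data is stored.
embedded_authors = [c["author"] for c in embedded_post["comments"]]
referenced_authors = [
    c["author"] for c in comments_for(referenced_post["_id"], comments)
]
```

In this sketch the embedded form matches the "Yes" answers in the Embed column (the data goes together and is queried together), while the referenced form suits the cases in the Reference column, for example when the number of comments can grow without bound or when comments are written at different times from the post.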