MongoDB One-to-Many and Many-to-Many (Basics)

In my previous post I talked about sub-documents. If you are here, this means you have a compelling reason not to use sub-documents (probably reasons I talked about in my previous post). One important question is that, how can we model one-to-many and/or many-to-many relationships. In this post I want to talk about ways of implementing these two relationships.
In my previous post I talked about sub-documents. If you are here, this means you have a compelling reason not to use sub-documents (probably reasons I talked about in my previous post). One important question is that, how can we model one-to-many and/or many-to-many relationships. In this post I want to talk about ways of implementing these two relationships.
One-to-Many Relationship
Imagine we have a collection called users and another collection called addresses. Each user can have multiple addresses and for some reason, we have decided not to embed these addresses in our user document.
Solution 1:
The simplest way to handle these situations, is to add a reference to the user document in each address document (as we used to do in RDBs). This is a completely correct approach.
users:
{ "_id": ObjectID("5ae41b5965a9d4390d1ac3ef"), "name": "John Doe" }
addresses:
{ "_id": ObjectId("5ae42b5965a9d4390d1ac5ef"), "address": "NO.4, First Street, London, UK", "type": "work", "user_id": ObjectID("5ae41b5965a9d4390d1ac3ef") } { "_id": ObjectId("5ae42b5965a9d4390d1ac6ef"), "address": "NO.10, Second Street, London, UK", "type": "home", "user_id": ObjectID("5ae41b5965a9d4390d1ac3ef") }
Solution 2:
However, there is a better solution. Since MongoDB supports arrays, we can use this feature to store references to related address documents, to be more precise we can have an array of ObjectID references.
Note: You can keep user_id field in addresses collection which is called Two-Way Referencing. Doing so is not required, however it can help you find the user of a specific address.
users:
{ "_id": ObjectID("5ae41b5965a9d4390d1ac3ef"), "name": "John Doe", "addresses": [ObjectId("5ae42b5965a9d4390d1ac5ef"), ObjectId("5ae42b5965a9d4390d1ac6ef")] }
addresses:
{ "_id": ObjectId("5ae42b5965a9d4390d1ac5ef"), "address": "NO.4, First Street, London, UK", "type": "work", "user_id": ObjectID("5ae41b5965a9d4390d1ac3ef") } { "_id": ObjectId("5ae42b5965a9d4390d1ac6ef"), "address": "NO.10, Second Street, London, UK", "type": "home", "user_id": ObjectID("5ae41b5965a9d4390d1ac3ef") }
How will solution 2 help us? Well, instead of looking for those documents in addresses collection with a specific user_id (which might not be an indexed field), we can fetch our user using his _id first and use the $in operator and addresses field in our fetched document to find our desired address documents.
Note: Keep in mind that _id is an indexed field, by default.
Note: When using Two-Way-Referencing, if we delete an address document, we have to remove its _id from user’s addresses array. Sometimes we might have to update our array of references, for more complicated use cases (such as reassigning a document in collection A to another document is collection B).
Many-to-Many Relationship
Imagine we have a collection called employees and another collection called departments. In our example, each employee can work in multiple departments and each department can have multiple employees working in them.
Solution 1:
Just like a relational database, we can store the relationship between each employee and department in a third collection called employee_department_relations.
employees:
{ "_id": ObjectID("5ae41b5965a9d4390d1ac1af"), "name": "John Doe" } { "_id": ObjectID("5ae41b5965a9d4390d1ac2af"), "name": "Jane Doe" }
departments:
{ "_id": ObjectId("5ae42b5965a9d4390d1ac1bf"), "name": "Sales" } { "_id": ObjectId("5ae42b5965a9d4390d1ac2bf"), "name": "Web" }
employee_department_relations:
{ "_id": ObjectId("5ae42b5965a9d4390d1ac1cf"), "employee_id": ObjectID("5ae41b5965a9d4390d1ac1af") "department_id": ObjectId("5ae42b5965a9d4390d1ac1bf") } { "_id": ObjectId("5ae42b5965a9d4390d1ac2cf"), " employee_id": ObjectID("5ae41b5965a9d4390d1ac1af") " department_id": ObjectId("5ae42b5965a9d4390d1ac2bf") } { "_id": ObjectId("5ae42b5965a9d4390d1ac3cf"), " employee_id": ObjectID("5ae41b5965a9d4390d1ac2af") " department_id": ObjectId("5ae42b5965a9d4390d1ac1bf") }
In our example, John works in both Sales and Web departments and Jane works in Sales department
Solution 2:
We can use arrays to store references to the other document. When using this solution (because of MongoDB document size limit) we have to use the array in the document with less documents on the many side! It is a little hard to understand, I know. To make it simpler, if a limited number of employees (not more than a hundred) work in each department, we use array of employee ObjectIDs in our department documents. If each employee works in a limited number of departments (not more than a hundred), we use array of department ObjectIDs in our employee documents. The following is an example of the second condition (each employee works in a limited number of departments).
employees:
{ "_id": ObjectID("5ae41b5965a9d4390d1ac1af"), "name": "John Doe", "departments": [ObjectID("5ae41b5965a9d4390d1ac1bf"), ObjectID("5ae41b5965a9d4390d1ac2bf")] } { "_id": ObjectID("5ae41b5965a9d4390d1ac2af"), "name": "Jane Doe", "departments": [ObjectID("5ae41b5965a9d4390d1ac1bf")] }
departments:
{ "_id": ObjectId("5ae42b5965a9d4390d1ac1bf"), "name": "Sales" } { "_id": ObjectId("5ae42b5965a9d4390d1ac2bf"), "name": "Web" }
Sometimes both of our conditions are met, in other words each employee works in a limited number of departments and a limited number of employees work in each department, we can use Two-Way-Referencing:
employees:
{ "_id": ObjectID("5ae41b5965a9d4390d1ac1af"), "name": "John Doe", "departments": [ObjectID("5ae41b5965a9d4390d1ac1bf"), ObjectID("5ae41b5965a9d4390d1ac2bf")] } { "_id": ObjectID("5ae41b5965a9d4390d1ac2af"), "name": "Jane Doe", "departments": [ObjectID("5ae41b5965a9d4390d1ac1bf")] }
departments:
{ "_id": ObjectId("5ae42b5965a9d4390d1ac1bf"), "name": "Sales", "employees": [ObjectId("5ae42b5965a9d4390d1ac1af"), ObjectId("5ae42b5965a9d4390d1ac2af")] } { "_id": ObjectId("5ae42b5965a9d4390d1ac2bf"), "name": "Web", "employees": [ObjectId("5ae42b5965a9d4390d1ac1af")] }
We can also use the combination of these two solutions as well. Maybe in some use cases, the combination of these two approaches is needed to store Many-to-Many relationships.
When using the approaches above, always keep the following rule in your mind. If there are more than a couple of hundred (or as suggested by MongoDB, more than a couple of thousand) documents in the many side, do not use array of references.