Course recap MongoDB#
1. Non-relational database#
Multiple Choice Questions#
What is a key difference between relational and non-relational databases?
A) Relational databases use a flexible schema, while non-relational databases use a fixed schema.
B) Relational databases use SQL for queries, while non-relational databases use various query languages.
C) Non-relational databases are always faster than relational databases.
D) Relational databases do not support ACID properties, while non-relational databases do.
Reveal solutions
What is a key difference between relational and non-relational databases?
Correct Answer: B) Relational databases use SQL for queries, while non-relational databases use various query languages.
Explanation: Relational databases typically use SQL (Structured Query Language) for querying data, whereas non-relational databases use different query languages or APIs depending on their type (e.g., MongoDB expresses queries as JSON-like documents).
Incorrect Choices:
A) Relational databases use a flexible schema, while non-relational databases use a fixed schema.
Explanation: This is incorrect because it reverses the two: relational databases use a fixed schema, and non-relational databases often use a flexible schema.
C) Non-relational databases are always faster than relational databases.
Explanation: This is not necessarily true; performance depends on the specific use case and data structure.
D) Relational databases do not support ACID properties, while non-relational databases do.
Explanation: Relational databases typically support ACID (Atomicity, Consistency, Isolation, Durability) properties, while non-relational databases may or may not support them.
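To make the query-language contrast concrete, the sketch below runs a SQL query through Python's built-in sqlite3 module and then evaluates the same condition written as a MongoDB-style filter document. The `matches` helper is a hypothetical, deliberately tiny stand-in that understands only equality and `$gt`; it is not how MongoDB evaluates queries.

```python
import sqlite3

# Relational side: a SQL query against an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [("To Kill a Mockingbird", 1960), ("Pride and Prejudice", 1813)],
)
sql_titles = [row[0] for row in conn.execute("SELECT title FROM books WHERE year > 1900")]
conn.close()

# Document side: the same condition as a MongoDB-style filter document.
books = [
    {"title": "To Kill a Mockingbird", "year": 1960},
    {"title": "Pride and Prejudice", "year": 1813},
]
query = {"year": {"$gt": 1900}}


def matches(doc, query):
    """Hypothetical mini-matcher: supports equality and {"$gt": value} only."""
    for field, condition in query.items():
        if isinstance(condition, dict) and "$gt" in condition:
            if not (field in doc and doc[field] > condition["$gt"]):
                return False
        elif doc.get(field) != condition:
            return False
    return True


doc_titles = [doc["title"] for doc in books if matches(doc, query)]
print(sql_titles, doc_titles)
```

Both approaches select the same book; the difference lies in how the condition is expressed, as a SQL string or as a nested document.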
Which of the following is a disadvantage of using a relational database?
A) They are not suitable for complex queries.
B) They require a fixed schema.
C) They do not support transactions.
D) They cannot handle large volumes of data.
Reveal solutions
Which of the following is a disadvantage of using a relational database?
Correct Answer: B) They require a fixed schema.
Explanation: Relational databases require a predefined schema, which can be less flexible when dealing with changing data structures.
Incorrect Choices:
A) They are not suitable for complex queries.
Explanation: Relational databases are actually well-suited for complex queries.
C) They do not support transactions.
Explanation: Relational databases support transactions and ACID properties.
D) They cannot handle large volumes of data.
Explanation: Relational databases can handle large volumes of data, though performance may vary based on the design and implementation.
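As a rough illustration of the fixed-schema constraint, the sketch below uses Python's built-in sqlite3 module. The table and column names are assumptions chosen for the example; the point is only that a field missing from the schema cannot be stored until the schema is migrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, author TEXT)")

# The schema has no "genre" column, so this insert is rejected.
try:
    conn.execute(
        "INSERT INTO books (title, author, genre) VALUES (?, ?, ?)",
        ("To Kill a Mockingbird", "Harper Lee", "Fiction"),
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# The schema has to be changed before the new field can be stored.
conn.execute("ALTER TABLE books ADD COLUMN genre TEXT")
conn.execute(
    "INSERT INTO books (title, author, genre) VALUES (?, ?, ?)",
    ("To Kill a Mockingbird", "Harper Lee", "Fiction"),
)
row_count = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
conn.close()
print(rejected, row_count)
```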
Which type of non-relational database is best suited for hierarchical and networked data storage?
A) Column-based
B) Key-value
C) Graph
D) Document
Reveal solutions
Which type of non-relational database is best suited for hierarchical and networked data storage?
Correct Answer: C) Graph
Explanation: Graph databases are designed to handle hierarchical and networked data structures efficiently.
Incorrect Choices:
A) Column-based
Explanation: Column-based databases are optimized for read-heavy operations and analytical queries, not hierarchical data.
B) Key-value
Explanation: Key-value databases are simple and efficient for storing key-value pairs but not ideal for hierarchical data.
D) Document
Explanation: Document databases can handle hierarchical data but are not as specialized for it as graph databases.
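The kind of question graph databases are built for can be sketched in plain Python as a traversal over an adjacency list. The `follows` data and the `reachable` helper are hypothetical and only mimic the idea; a real graph database uses its own storage and query language.

```python
from collections import deque

# Hypothetical social graph stored as an adjacency list: node -> neighbours.
follows = {
    "ana": ["ben", "cleo"],
    "ben": ["dev"],
    "cleo": ["dev"],
    "dev": [],
}


def reachable(graph, start):
    """Return every node reachable from start, in breadth-first order."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.append(neighbour)
                queue.append(neighbour)
    return seen[1:]  # drop the start node itself


print(reachable(follows, "ana"))
```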
What is a common use case for a key-value non-relational database?
A) Storing large binary files
B) Real-time analytics
C) Storing user session information
D) Complex joins and transactions
Reveal solutions
What is a common use case for a key-value non-relational database?
Correct Answer: C) Storing user session information
Explanation: Key-value databases are efficient for storing and retrieving simple data structures like user session information.
Incorrect Choices:
A) Storing large binary files
Explanation: This is better suited for object storage systems.
B) Real-time analytics
Explanation: Real-time analytics often require more complex querying capabilities.
D) Complex joins and transactions
Explanation: Key-value databases are not designed for complex joins and transactions.
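A minimal sketch of the session use case, assuming an in-memory store with per-key expiry. The `SessionStore` class and its `now` parameter (used here to keep the example deterministic) are hypothetical and not the API of any particular key-value database.

```python
import time


class SessionStore:
    """Hypothetical in-memory key-value store with per-key expiry."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, ttl_seconds, now=None):
        now = time.time() if now is None else now
        self._data[key] = (value, now + ttl_seconds)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            del self._data[key]  # expired sessions are dropped on access
            return None
        return value


store = SessionStore()
store.set("sess-1", {"user": "ana"}, ttl_seconds=30, now=1000)
print(store.get("sess-1", now=1010))
```

Every operation is a lookup by a single key, which is why this access pattern fits key-value stores well.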
Which of the following statements about the scalability of relational and non-relational databases is true?
A) Relational databases are generally easier to scale horizontally than non-relational databases.
B) Non-relational databases are typically designed to scale horizontally more easily than relational databases.
C) Both relational and non-relational databases scale horizontally with equal ease.
D) Neither relational nor non-relational databases can scale horizontally.
Reveal solutions
Which of the following statements about the scalability of relational and non-relational databases is true?
Correct Answer: B) Non-relational databases are typically designed to scale horizontally more easily than relational databases.
Explanation: Non-relational databases are often designed with horizontal scalability in mind, allowing them to distribute data across multiple servers more easily. Relational databases, on the other hand, traditionally scale vertically by adding more resources to a single server, though some modern relational databases also support horizontal scaling.
Incorrect Choices:
A) Relational databases are generally easier to scale horizontally than non-relational databases.
Explanation: This is incorrect because relational databases typically face more challenges with horizontal scaling compared to non-relational databases.
C) Both relational and non-relational databases scale horizontally with equal ease.
Explanation: This is not true; non-relational databases generally have an advantage in horizontal scalability.
D) Neither relational nor non-relational databases can scale horizontally.
Explanation: This is incorrect as non-relational databases are designed to scale horizontally, and some relational databases also support horizontal scaling.
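Horizontal scaling usually means spreading records across several servers. One common approach, hash-based sharding, can be sketched as follows; the `shard_for` function and the user IDs are assumptions for illustration and do not reflect how any specific database routes data.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Distribute hypothetical user IDs across three shards.
user_ids = [f"user-{n}" for n in range(1, 7)]
shards = {index: [] for index in range(3)}
for user_id in user_ids:
    shards[shard_for(user_id, 3)].append(user_id)

print(shards)
```

Because the hash is deterministic, every client that knows the shard count sends the same key to the same server.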
2. Document model in MongoDB#
Multiple Choice Questions#
What is a key feature of the document model data structure?
A) It uses a tabular format for data storage.
B) It allows for flexible, schema-less data storage.
C) It requires a fixed schema.
D) It is optimized for complex joins and transactions.
Reveal solutions
What is a key feature of the document model data structure?
Correct Answer: B) It allows for flexible, schema-less data storage.
Explanation: The document model supports flexible, schema-less data storage, allowing for varying data structures within the same collection.
Incorrect Choices:
A) It uses a tabular format for data storage.
Explanation: This is characteristic of relational databases, not document model databases.
C) It requires a fixed schema.
Explanation: Document model databases do not require a fixed schema.
D) It is optimized for complex joins and transactions.
Explanation: Document model databases are not typically optimized for complex joins and transactions.
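The flexibility described in this question can be pictured with plain Python dictionaries standing in for documents. The titles, fields, and values below are made up for the example; the sketch only shows that two documents in the same collection need not share the same fields.

```python
import json

# Two documents in the same hypothetical "books" collection.
books = [
    {"title": "Book A", "author": "Author A"},
    {
        "title": "Book B",
        "author": "Author B",
        "tags": ["classic"],  # extra array field
        "publisher": {"name": "Example Press", "city": "Example City"},  # nested document
    },
]

serialised = [json.dumps(book) for book in books]
field_sets = [set(book) for book in books]
print(field_sets)
```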
In which format are data stored in a document model database like MongoDB?
A) JSON
B) XML
C) BSON
D) CSV
Reveal solutions
In which format are data stored in a document model database like MongoDB?
Correct Answer: C) BSON
Explanation: MongoDB stores data in BSON (Binary JSON) format, which is a binary representation of JSON-like documents.
Incorrect Choices:
A) JSON
Explanation: While MongoDB uses JSON-like documents, the actual storage format is BSON.
B) XML
Explanation: XML is not used for data storage in MongoDB.
D) CSV
Explanation: CSV is a format for tabular data, not used for document model databases.
What is BSON?
A) A binary representation of JSON-like documents.
B) A text-based format for storing data.
C) A query language for document databases.
D) A type of relational database schema.
Reveal solutions
What is BSON?
Correct Answer: A) A binary representation of JSON-like documents.
Explanation: BSON is a binary-encoded serialization of JSON-like documents, designed to be efficient in both storage and scanning.
Incorrect Choices:
B) A text-based format for storing data.
Explanation: BSON is a binary format, not text-based.
C) A query language for document databases.
Explanation: BSON is a data format, not a query language.
D) A type of relational database schema.
Explanation: BSON is not related to relational database schemas.
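To show what "binary representation" means in practice, here is a minimal sketch of a BSON encoder that handles only flat documents with string and 32-bit integer values. It follows the layout in the public BSON specification (little-endian total length, typed elements, trailing null byte); the `encode_bson` name is an assumption, and real applications should rely on the `bson` package that ships with PyMongo.

```python
import struct


def encode_bson(doc):
    """Encode a flat dict of str -> (str | int32) values as BSON bytes."""
    body = b""
    for key, value in doc.items():
        name = key.encode("utf-8") + b"\x00"  # element names are C strings
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # Type 0x02: int32 byte length (including the null) + bytes.
            body += b"\x02" + name + struct.pack("<i", len(data)) + data
        elif isinstance(value, int) and not isinstance(value, bool):
            # Type 0x10: 32-bit signed integer.
            body += b"\x10" + name + struct.pack("<i", value)
        else:
            raise TypeError(f"unsupported value type for {key!r}")
    total = 4 + len(body) + 1  # length prefix + elements + terminator
    return struct.pack("<i", total) + body + b"\x00"


print(encode_bson({"hello": "world"}))
```

The length prefixes are what let a reader skip over elements without parsing them, which is part of why BSON is efficient to scan.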
What is a collection in a document model?
A) A group of tables with fixed schemas.
B) A set of key-value pairs.
C) A group of related documents stored together.
D) A type of index used for fast querying.
Reveal solutions
What is a collection in a document model?
Correct Answer: C) A group of related documents stored together.
Explanation: In a document model database, a collection is a group of related documents stored together, similar to a table in a relational database.
Incorrect Choices:
A) A group of tables with fixed schemas.
Explanation: Collections are not groups of tables; they are groups of documents.
B) A set of key-value pairs.
Explanation: This describes a key-value store, not a collection in a document model.
D) A type of index used for fast querying.
Explanation: Collections are not indexes; they are groups of documents.
Which of the following is NOT a key feature of the document model?
A) Schema-less data storage
B) Flexible data structure
C) Fixed schema requirement
D) Hierarchical data representation
Reveal solutions
Which of the following is NOT a key feature of the document model?
Correct Answer: C) Fixed schema requirement
Explanation: The document model does not require a fixed schema, allowing for flexible and schema-less data storage.
Incorrect Choices:
A) Schema-less data storage
Explanation: Schema-less data storage is a key feature of the document model.
B) Flexible data structure
Explanation: The document model supports flexible data structures.
D) Hierarchical data representation
Explanation: The document model can represent hierarchical data structures.
3. Data modelling in MongoDB#
Multiple Choice Questions#
Why is data modelling important?
A) It simplifies the user interface design.
B) It helps in organizing and structuring data efficiently.
C) It reduces the need for data backups.
D) It eliminates the need for indexing.
Reveal solutions
Why is data modelling important?
Correct Answer: B) It helps in organizing and structuring data efficiently.
Explanation: Data modelling is crucial for organizing and structuring data in a way that supports efficient data access and management.
Incorrect Choices:
A) It simplifies the user interface design.
Explanation: Data modelling is focused on data organization, not user interface design.
C) It reduces the need for data backups.
Explanation: Data modelling does not eliminate the need for data backups.
D) It eliminates the need for indexing.
Explanation: Indexing is still necessary for efficient data retrieval, regardless of data modelling.
What is the key principle of data modelling in MongoDB?
A) Using a fixed schema for all collections.
B) Ensuring data is stored in a tabular format.
C) Designing the schema according to the application’s query patterns.
D) Avoiding the use of indexes.
Reveal solutions
What is the key principle of data modelling in MongoDB?
Correct Answer: C) Designing the schema according to the application’s query patterns.
Explanation: In MongoDB, the schema should be designed based on how the application queries the data to ensure efficient data retrieval.
Incorrect Choices:
A) Using a fixed schema for all collections.
Explanation: MongoDB supports flexible schemas.
B) Ensuring data is stored in a tabular format.
Explanation: MongoDB uses a document model, not a tabular format.
D) Avoiding the use of indexes.
Explanation: Indexes are important for efficient querying in MongoDB.
Which of the following is a technique to develop a data model in MongoDB?
A) Normalizing all data to the third normal form.
B) Using workload estimates to understand data access patterns.
C) Storing all data in a single collection.
D) Avoiding the use of relationships between data.
Reveal solutions
Which of the following is a technique to develop a data model in MongoDB?
Correct Answer: B) Using workload estimates to understand data access patterns.
Explanation: Workload estimates help in understanding how data will be accessed, which is crucial for designing an efficient data model.
Incorrect Choices:
A) Normalizing all data to the third normal form.
Explanation: Normalization is more relevant to relational databases.
C) Storing all data in a single collection.
Explanation: This is not an efficient approach and can lead to performance issues.
D) Avoiding the use of relationships between data.
Explanation: Relationships are important and should be considered in data modelling.
What is the difference between embedding and referencing in MongoDB?
A) Embedding stores related data in separate collections, while referencing stores related data in the same document.
B) Embedding stores related data in the same document, while referencing stores related data in separate collections.
C) Embedding is used for large datasets, while referencing is used for small datasets.
D) Embedding requires the use of indexes, while referencing does not.
Reveal solutions
What is the difference between embedding and referencing in MongoDB?
Correct Answer: B) Embedding stores related data in the same document, while referencing stores related data in separate collections.
Explanation: Embedding is used to store related data within the same document, while referencing involves storing related data in different collections and linking them.
Incorrect Choices:
A) Embedding stores related data in separate collections, while referencing stores related data in the same document.
Explanation: This is the opposite of the correct definition.
C) Embedding is used for large datasets, while referencing is used for small datasets.
Explanation: The choice between embedding and referencing is based on access patterns, not dataset size.
D) Embedding requires the use of indexes, while referencing does not.
Explanation: Both embedding and referencing can benefit from indexing for efficient querying.
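The two layouts can be compared side by side with plain dictionaries standing in for documents. The field names (`reviews`, `book_id`) and helper functions are assumptions for illustration only.

```python
# Embedding: the reviews live inside the book document.
embedded_book = {
    "_id": 1,
    "title": "Book A",
    "reviews": [{"rating": 5}, {"rating": 4}],
}

# Referencing: reviews sit in their own collection and point back by book_id.
books = [{"_id": 1, "title": "Book A"}]
reviews = [
    {"_id": 10, "book_id": 1, "rating": 5},
    {"_id": 11, "book_id": 1, "rating": 4},
    {"_id": 12, "book_id": 2, "rating": 3},
]


def ratings_embedded(book):
    """One read returns the book and its reviews together."""
    return [review["rating"] for review in book["reviews"]]


def ratings_referenced(book_id, reviews):
    """A second lookup is needed to collect the linked reviews."""
    return [review["rating"] for review in reviews if review["book_id"] == book_id]


print(ratings_embedded(embedded_book), ratings_referenced(1, reviews))
```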
When should you use embedding over referencing in MongoDB?
A) When the related data is frequently accessed together.
B) When the related data is rarely accessed.
C) When the related data is very large and needs to be split across multiple collections.
D) When the related data needs to be normalized.
Reveal solutions
When should you use embedding over referencing in MongoDB?
Correct Answer: A) When the related data is frequently accessed together.
Explanation: Embedding is beneficial when related data is frequently accessed together, as it reduces the need for multiple queries.
Incorrect Choices:
B) When the related data is rarely accessed.
Explanation: Referencing might be more appropriate when related data is rarely accessed together.
C) When the related data is very large and needs to be split across multiple collections.
Explanation: Very large related data is a case for referencing rather than embedding, since a single BSON document is limited to 16 MB.
D) When the related data needs to be normalized.
Explanation: Normalization is more relevant to relational databases, and referencing can be used to maintain normalized data in MongoDB.
4. Schema design patterns#
Multiple Choice Questions#
Which of the following is a schema design pattern that involves storing multiple related attributes in a single document?
A) Attribute pattern
B) Bucket pattern
C) Polymorphic pattern
D) Schema versioning pattern
Reveal solutions
Which of the following is a schema design pattern that involves storing multiple related attributes in a single document?
Correct Answer: A) Attribute pattern
Explanation: The attribute pattern gathers multiple related attributes of a single document into one array of key-value subdocuments, so a single index can cover them and queries across those attributes stay simple.
Incorrect Choices:
B) Bucket pattern
Explanation: The bucket pattern groups related data into a single document but is not specifically about storing multiple related attributes.
C) Polymorphic pattern
Explanation: The polymorphic pattern handles different types of related documents within the same collection.
D) Schema versioning pattern
Explanation: The schema versioning pattern manages changes to the schema over time.
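One common way to apply the attribute pattern is to move a set of similar fields into an array of key-value pairs. The sketch below does this with plain dictionaries; the `to_attribute_pattern` helper, the `attributes` field name, and the price fields are hypothetical.

```python
def to_attribute_pattern(doc, fields):
    """Move the named fields into an array of {"k": ..., "v": ...} pairs."""
    result = {key: value for key, value in doc.items() if key not in fields}
    result["attributes"] = [
        {"k": name, "v": doc[name]} for name in fields if name in doc
    ]
    return result


book = {"title": "Book A", "price_usd": 10, "price_eur": 9, "price_gbp": 8}
reshaped = to_attribute_pattern(book, ["price_usd", "price_eur", "price_gbp"])
print(reshaped)
```

With this shape, one index on the array's `k` and `v` fields can serve queries on any of the attributes, instead of one index per field.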
What is a key advantage of the bucket pattern in schema design?
A) It allows for flexible, schema-less data storage.
B) It reduces the number of documents by grouping related data.
C) It supports multiple versions of a schema.
D) It allows for different types of documents in the same collection.
Reveal solutions
What is a key advantage of the bucket pattern in schema design?
Correct Answer: B) It reduces the number of documents by grouping related data.
Explanation: The bucket pattern reduces the number of documents by grouping related data into a single document, which can improve read performance and reduce the number of queries.
Incorrect Choices:
A) It allows for flexible, schema-less data storage.
Explanation: This is a general feature of document databases, not specific to the bucket pattern.
C) It supports multiple versions of a schema.
Explanation: This is a feature of the schema versioning pattern.
D) It allows for different types of documents in the same collection.
Explanation: This is a feature of the polymorphic pattern.
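A minimal sketch of bucketing, assuming hypothetical sensor readings that are grouped into one document per sensor per hour. The field names and the `bucket_by_hour` helper are illustrative only.

```python
from collections import defaultdict

# Four hypothetical readings; "minute" counts minutes since the start of the day.
readings = [
    {"sensor_id": "s1", "minute": 0, "temp": 20.5},
    {"sensor_id": "s1", "minute": 30, "temp": 20.7},
    {"sensor_id": "s1", "minute": 60, "temp": 21.0},
    {"sensor_id": "s1", "minute": 90, "temp": 21.4},
]


def bucket_by_hour(readings):
    """Group readings into one bucket document per (sensor, hour)."""
    buckets = defaultdict(list)
    for reading in readings:
        key = (reading["sensor_id"], reading["minute"] // 60)
        buckets[key].append({"minute": reading["minute"], "temp": reading["temp"]})
    return [
        {"sensor_id": sensor, "hour": hour, "count": len(items), "measurements": items}
        for (sensor, hour), items in buckets.items()
    ]


bucket_docs = bucket_by_hour(readings)
print(len(readings), "readings ->", len(bucket_docs), "documents")
```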
Which schema design pattern is best suited for handling different types of related documents within the same collection?
A) Attribute pattern
B) Bucket pattern
C) Polymorphic pattern
D) Schema versioning pattern
Reveal solutions
Which schema design pattern is best suited for handling different types of related documents within the same collection?
Correct Answer: C) Polymorphic pattern
Explanation: The polymorphic pattern is designed to handle different types of related documents within the same collection, allowing for more flexible data modeling.
Incorrect Choices:
A) Attribute pattern
Explanation: The attribute pattern is about storing multiple related attributes in a single document.
B) Bucket pattern
Explanation: The bucket pattern groups related data into a single document but does not specifically handle different types of documents.
D) Schema versioning pattern
Explanation: The schema versioning pattern manages changes to the schema over time.
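The polymorphic pattern typically relies on a discriminator field so that application code can tell document shapes apart. In the sketch below, the `type` field, the item shapes, and the `describe` helper are assumptions chosen for illustration.

```python
# One hypothetical "catalog" collection holding two document shapes.
catalog = [
    {"type": "book", "title": "Book A", "pages": 200},
    {"type": "ebook", "title": "Book B", "file_size_mb": 2},
]


def describe(item):
    """Branch on the discriminator field to handle each shape."""
    if item["type"] == "book":
        return f"{item['title']} ({item['pages']} pages)"
    if item["type"] == "ebook":
        return f"{item['title']} ({item['file_size_mb']} MB download)"
    raise ValueError(f"unknown item type: {item['type']}")


descriptions = [describe(item) for item in catalog]
print(descriptions)
```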
What is a common use case for the schema versioning pattern?
A) Storing time-series data
B) Managing changes to the schema over time
C) Grouping related data into a single document
D) Handling different types of related documents
Reveal solutions
What is a common use case for the schema versioning pattern?
Correct Answer: B) Managing changes to the schema over time
Explanation: The schema versioning pattern is used to manage changes to the schema over time, allowing for backward compatibility and gradual schema evolution.
Incorrect Choices:
A) Storing time-series data
Explanation: This is not specific to schema versioning.
C) Grouping related data into a single document
Explanation: This is a feature of the bucket pattern.
D) Handling different types of related documents
Explanation: This is a feature of the polymorphic pattern.
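Schema versioning is often implemented with a version field that readers check before interpreting a document. The sketch below assumes a hypothetical `schema_version` field and two made-up shapes of a contact document; documents without the field are treated as version 1.

```python
# Older documents store the phone number at the top level.
v1_doc = {"name": "Ana", "phone": "555-0100"}

# Newer documents nest it under "contacts" and carry a version marker.
v2_doc = {"schema_version": 2, "name": "Ana", "contacts": {"phone": "555-0100"}}


def read_phone(doc):
    """Return the phone number regardless of which schema version is stored."""
    version = doc.get("schema_version", 1)
    if version == 1:
        return doc["phone"]
    return doc["contacts"]["phone"]


print(read_phone(v1_doc), read_phone(v2_doc))
```

Old and new documents can coexist in the same collection, and older ones can be migrated gradually.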
Which of the following is a disadvantage of the subset pattern in schema design?
A) It can lead to large document sizes.
B) It may require multiple queries to retrieve all related data.
C) It does not support schema evolution.
D) It complicates the management of different document types.
Reveal solutions
Which of the following is a disadvantage of the subset pattern in schema design?
Correct Answer: B) It may require multiple queries to retrieve all related data.
Explanation: The subset pattern can lead to the need for multiple queries to retrieve all related data, which can impact performance.
Incorrect Choices:
A) It can lead to large document sizes.
Explanation: This is more likely a disadvantage of the attribute or bucket pattern.
C) It does not support schema evolution.
Explanation: This is not specific to the subset pattern.
D) It complicates the management of different document types.
Explanation: This is more likely a disadvantage of the polymorphic pattern.
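The trade-off of the subset pattern can be sketched with plain dictionaries: the product document embeds only the most recent reviews, while the full set stays in a separate collection. All names and data below are hypothetical.

```python
# Full review history, kept in its own hypothetical collection.
all_reviews = [
    {"product_id": 1, "day": 1, "text": "ok"},
    {"product_id": 1, "day": 2, "text": "good"},
    {"product_id": 1, "day": 3, "text": "great"},
]


def build_product(product_id, name, reviews, keep=2):
    """Embed only the `keep` most recent reviews in the product document."""
    mine = [r for r in reviews if r["product_id"] == product_id]
    newest_first = sorted(mine, key=lambda r: r["day"], reverse=True)
    return {
        "_id": product_id,
        "name": name,
        "recent_reviews": newest_first[:keep],
        "review_count": len(mine),
    }


product = build_product(1, "Widget", all_reviews)

# Showing the product page needs one read; older reviews need a second lookup.
older = [
    r for r in all_reviews
    if r["product_id"] == 1 and r not in product["recent_reviews"]
]
print([r["text"] for r in product["recent_reviews"]], [r["text"] for r in older])
```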
5. CRUD operations#
Scenario-Based Fill in the Blank Questions#
You are working on a Python application that interacts with a MongoDB database to manage a collection of books. Each book document contains fields such as title, author, year, and genre. You need to perform various CRUD operations using PyMongo.
Sample Code and Questions#
Inserting a Single Document
Description: You want to add a new book to the collection with the title “To Kill a Mockingbird”, authored by Harper Lee, published in 1960, and categorized under the genre “Fiction”.
from pymongo import MongoClient
client = MongoClient('mongodb://localhost:27017/')
db = client['library']
collection = db['books']
new_book = {
    "title": "To Kill a Mockingbird",
    "author": "Harper Lee",
    "year": 1960,
    "genre": "Fiction"
}
result = collection.__________(__________)
print(f"Inserted document ID: {result.inserted_id}")
Reveal solutions
Inserting a Single Document
from pymongo import MongoClient
client = MongoClient('mongodb://localhost:27017/')
db = client['library']
collection = db['books']
new_book = {
    "title": "To Kill a Mockingbird",
    "author": "Harper Lee",
    "year": 1960,
    "genre": "Fiction"
}
result = collection.insert_one(new_book)
print(f"Inserted document ID: {result.inserted_id}")
Updating Multiple Documents
Description: You would like to change the genre of all books currently categorized as “Fiction” to “Classic Fiction”.
query = _______
new_values = {"$set": {"genre": "Classic Fiction"}}
result = collection.__________(__________, __________)
print(f"Documents updated: {result.modified_count}")
Reveal solutions
Updating Multiple Documents
query = {"genre": "Fiction"}
new_values = {"$set": {"genre": "Classic Fiction"}}
result = collection.update_many(query, new_values)
print(f"Documents updated: {result.modified_count}")
Deleting a Single Document
Description: You need to delete the book titled “To Kill a Mockingbird” from the collection.
query = _______
result = collection.__________(__________)
print(f"Documents deleted: {result.deleted_count}")
Reveal solutions
Deleting a Single Document
query = {"title": "To Kill a Mockingbird"}
result = collection.delete_one(query)
print(f"Documents deleted: {result.deleted_count}")
Finding All Documents Matching a Query
Description: You want to find all books authored by Harper Lee.
query = _______
documents = collection.__________(__________)
for doc in documents:
    print(doc)
Reveal solutions
Finding All Documents Matching a Query
query = {"author": "Harper Lee"}
documents = collection.find(query)
for doc in documents:
    print(doc)
Replacing an Entire Document
Description: You need to replace the entire document for the book titled “To Kill a Mockingbird” with updated information, changing its genre to “Classic Fiction”.
query = _______
new_document = {
    "title": "To Kill a Mockingbird",
    "author": "Harper Lee",
    "year": 1960,
    "genre": "Classic Fiction"
}
result = collection.__________(__________, __________)
print(f"Documents replaced: {result.modified_count}")
Reveal solutions
Replacing an Entire Document
query = {"title": "To Kill a Mockingbird"}
new_document = {
    "title": "To Kill a Mockingbird",
    "author": "Harper Lee",
    "year": 1960,
    "genre": "Classic Fiction"
}
result = collection.replace_one(query, new_document)
print(f"Documents replaced: {result.modified_count}")
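Replacing a document differs from updating it with `$set`: a replacement keeps only the `_id` and drops every field that the new document omits. The plain-Python sketch below emulates both behaviours on a dictionary; `emulate_set` and `emulate_replace` are hypothetical helpers, not PyMongo functions.

```python
stored = {
    "_id": 1,
    "title": "To Kill a Mockingbird",
    "author": "Harper Lee",
    "year": 1960,
    "genre": "Fiction",
}


def emulate_set(doc, changes):
    """Mimic {"$set": changes}: listed fields change, all others are kept."""
    return {**doc, **changes}


def emulate_replace(doc, replacement):
    """Mimic replace_one: only _id survives, the rest comes from replacement."""
    return {"_id": doc["_id"], **replacement}


after_set = emulate_set(stored, {"genre": "Classic Fiction"})
after_replace = emulate_replace(
    stored, {"title": "To Kill a Mockingbird", "genre": "Classic Fiction"}
)
print(after_set)
print(after_replace)
```

This is why the replacement document in the exercise repeats the title, author, and year even though only the genre changes.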
6. Aggregations in MongoDB#
Scenario#
You are working on a Python application that interacts with a MongoDB database to manage a collection of books. Each book document contains fields such as title, author, year, and genre. You need to perform various aggregation operations using PyMongo.
Sample Code and Questions#
Grouping Documents by Genre
Description: You want to group the books by their genre and count the number of books in each genre.
pipeline = [
    {"$__________": {"_id": "$__________", "count": {"$sum": 1}}}
]
result = collection.aggregate(pipeline)
for doc in result:
    print(doc)
Reveal solutions
Grouping Documents by Genre
pipeline = [
    {"$group": {"_id": "$genre", "count": {"$sum": 1}}}
]
result = collection.aggregate(pipeline)
for doc in result:
    print(doc)
Sorting Documents by Year
Description: You want to sort the books by their publication year in descending order.
pipeline = [
    {"$__________": {"__________": -1}}
]
result = collection.aggregate(pipeline)
for doc in result:
    print(doc)
Reveal solutions
Sorting Documents by Year
pipeline = [
    {"$sort": {"year": -1}}
]
result = collection.aggregate(pipeline)
for doc in result:
    print(doc)
Projecting Specific Fields
Description: You want to include only the title and author fields in the output documents.
pipeline = [
    {"$__________": {"__________": 1, "__________": 1, "_id": 0}}
]
result = collection.aggregate(pipeline)
for doc in result:
    print(doc)
Reveal solutions
Projecting Specific Fields
pipeline = [
    {"$project": {"title": 1, "author": 1, "_id": 0}}
]
result = collection.aggregate(pipeline)
for doc in result:
    print(doc)
Limiting the Number of Documents
Description: You want to limit the number of documents in the output to 5.
pipeline = [
    {"$__________": __________}
]
result = collection.aggregate(pipeline)
for doc in result:
    print(doc)
Reveal solutions
Limiting the Number of Documents
pipeline = [
    {"$limit": 5}
]
result = collection.aggregate(pipeline)
for doc in result:
    print(doc)
Combining Group, Sort, and Limit Stages
Description: You want to group the books by genre, sort the genres by the number of books in descending order, and limit the output to the top 3 genres.
pipeline = [
    {"$__________": {"_id": "$__________", "count": {"$sum": 1}}},
    {"$__________": {"__________": -1}},
    {"$__________": __________}
]
result = collection.aggregate(pipeline)
for doc in result:
    print(doc)
Reveal solutions
Combining Group, Sort, and Limit Stages
pipeline = [
    {"$group": {"_id": "$genre", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
    {"$limit": 3}
]
result = collection.aggregate(pipeline)
for doc in result:
    print(doc)
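To see what the combined pipeline computes, the sketch below reproduces the three stages in plain Python on a small in-memory list. It is an emulation for illustration, not PyMongo code, and the book data is made up.

```python
from collections import Counter

# Hypothetical books: four Fiction, three Science, two History, one Poetry.
genres = ["Fiction"] * 4 + ["Science"] * 3 + ["History"] * 2 + ["Poetry"]
books = [{"title": f"Book {n}", "genre": genre} for n, genre in enumerate(genres)]

# $group with {"$sum": 1}: count the books per genre.
counts = Counter(book["genre"] for book in books)

# $sort on count, descending.
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# $limit 3, reshaped to look like the pipeline's output documents.
top_genres = [{"_id": genre, "count": count} for genre, count in ranked[:3]]
print(top_genres)
```

Each stage consumes the output of the previous one, which is why the sort key is the computed `count` field rather than a field of the original documents.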
7. Transactions in MongoDB#
Fill in the Blank Exercise on Transactions in MongoDB#
Scenario#
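You are working on a Python application that interacts with a MongoDB database to manage a banking system. You need to implement a function to transfer funds between two accounts and ensure that the operation is performed atomically using transactions. Multi-document transactions require a replica set or sharded cluster, so the connection string in the code is assumed to point at such a deployment.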
Exercise#
Complete the following code to implement the transfer_funds function and perform a transaction to transfer $100 from account_1 to account_2.
from pymongo import MongoClient
from pymongo.errors import ConnectionFailure, OperationFailure

client = MongoClient('mongodb://localhost:27017/')
db = client['bank']
accounts_collection = db['accounts']
transactions_collection = db['transactions']

def transfer_funds(session, from_account, to_account, amount):
    # Check if the from_account has sufficient balance
    from_account_doc = accounts_collection.find_one({"account_id": from_account}, session=__________)
    if from_account_doc["balance"] < amount:
        raise ValueError("Insufficient funds")
    # Transfer funds
    accounts_collection.___________(
        {"account_id": from_account},
        {"$__________": {"balance": -amount}},
        session=__________
    )
    accounts_collection.___________(
        {"account_id": to_account},
        {"$__________": {"balance": amount}},
        session=__________
    )
    transactions_collection.___________(
        {
            "from_account": from_account,
            "to_account": to_account,
            "amount": amount,
            "status": "completed"
        },
        session=__________
    )

# Start a session
with client.start_session() as session:
    # Start a transaction
    with session.start_transaction():
        try:
            transfer_funds(__________, "account_1", "account_2", 100)
        except (ConnectionFailure, OperationFailure) as e:
            print(f"Transaction aborted due to: {e}")
            session.___________()
        else:
            session.___________()
            print("Transaction committed successfully")
Solutions#
Reveal solutions
from pymongo import MongoClient
from pymongo.errors import ConnectionFailure, OperationFailure

client = MongoClient('mongodb://localhost:27017/')
db = client['bank']
accounts_collection = db['accounts']
transactions_collection = db['transactions']

def transfer_funds(session, from_account, to_account, amount):
    # Check if the from_account has sufficient balance
    from_account_doc = accounts_collection.find_one({"account_id": from_account}, session=session)
    if from_account_doc["balance"] < amount:
        raise ValueError("Insufficient funds")
    # Transfer funds
    accounts_collection.update_one(
        {"account_id": from_account},
        {"$inc": {"balance": -amount}},
        session=session
    )
    accounts_collection.update_one(
        {"account_id": to_account},
        {"$inc": {"balance": amount}},
        session=session
    )
    transactions_collection.insert_one(
        {
            "from_account": from_account,
            "to_account": to_account,
            "amount": amount,
            "status": "completed"
        },
        session=session
    )

# Start a session
with client.start_session() as session:
    # Start a transaction
    with session.start_transaction():
        try:
            transfer_funds(session, "account_1", "account_2", 100)
        except (ConnectionFailure, OperationFailure) as e:
            print(f"Transaction aborted due to: {e}")
            session.abort_transaction()
        else:
            session.commit_transaction()
            print("Transaction committed successfully")
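The all-or-nothing behaviour that the transaction provides can be illustrated without a database. In the plain-Python sketch below, changes are made on a working copy and only adopted if every step succeeds; this `transfer_funds` is a hypothetical stand-in operating on a dictionary of balances, not the PyMongo function from the exercise.

```python
import copy


def transfer_funds(accounts, from_account, to_account, amount):
    """Return a new balances dict with the transfer applied, or raise."""
    working = copy.deepcopy(accounts)  # changes stay invisible until returned
    if working[from_account] < amount:
        raise ValueError("Insufficient funds")
    working[from_account] -= amount  # debit
    working[to_account] += amount  # credit; KeyError if the account is unknown
    return working


accounts = {"account_1": 150, "account_2": 50}

# Successful transfer: the caller "commits" by adopting the returned state.
accounts = transfer_funds(accounts, "account_1", "account_2", 100)

# Failed transfer: the credit fails after the debit, so nothing is adopted.
try:
    accounts = transfer_funds(accounts, "account_1", "missing_account", 10)
except KeyError:
    pass  # "abort": accounts still holds the last committed state

print(accounts)
```

The debit from the failed transfer never becomes visible, which mirrors what aborting a MongoDB transaction guarantees across multiple documents.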