Lecture 1: Introduction to Non-Relational Databases
Learning objectives
By the end of this lecture, students should be able to:
Understand how relational and non-relational databases differ in terms of data structure, schema, and query language.
Explain the pros and cons of each type of database and its typical use cases.
Explain the different types of non-relational databases (i.e., column-family, key-value, graph, and document).
Slides
Note
Download a PDF version here
Supplemental materials
Key differences between relational and non-relational databases
Feature | SQL Databases (Relational) | NoSQL Databases (Non-Relational) |
---|---|---|
Data Model | Structured data with predefined schema (tables, rows, columns) | Flexible schema, supports unstructured and semi-structured data |
Schema Flexibility | Rigid schema, requires predefined structure | Schema-less, allowing for dynamic and flexible data models |
Scalability | Vertical scaling (scaling up by adding more power to a single server) | Horizontal scaling (scaling out by adding more servers) |
ACID Compliance | Strong ACID (Atomicity, Consistency, Isolation, Durability) support | Often favors eventual consistency over strong ACID compliance |
Performance | Efficient for complex queries and transactions | Optimized for high-volume reads/writes and specific use cases |
Querying | Powerful and standardized SQL for complex queries | Varies by database type; may require custom query languages |
Data Relationships | Strong support for complex joins and relationships | Limited or no support for complex joins, relationships are embedded or linked |
Data Integrity | Enforces data integrity through constraints and normalization | Data integrity is managed by the application, denormalization is common |
Use Case Suitability | Best for structured data, complex transactions, and relationships | Best for big data, real-time analytics, unstructured data, and high scalability needs |
Maintenance | Requires more maintenance (e.g., schema changes, indexing, tuning) | Generally requires less maintenance, but can be complex in distributed systems |
Learning Curve | Standardized and well-documented; widely taught | Diverse models with a steeper learning curve due to lack of standardization |
Ecosystem | Mature ecosystem with a wide range of tools and support | Emerging ecosystem, tools and support vary by database type |
Cost | Can be expensive at scale due to the need for powerful hardware | Cost-effective for large-scale systems, often runs on commodity hardware |
Examples | MySQL, PostgreSQL, Oracle, SQL Server | MongoDB, Cassandra, Redis, Neo4j |
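
The schema and relationship rows of the comparison can be made concrete with a small sketch. It uses Python's built-in `sqlite3` module for the relational side and plain dictionaries as a stand-in for a document store; no real NoSQL product is involved. The `users` and `orders` tables, the sample names, and the helper functions `relational_demo` and `document_demo` are all hypothetical and chosen only for illustration.

```python
import json
import sqlite3


def relational_demo():
    """Fixed schema plus a join, using an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "user_id INTEGER NOT NULL REFERENCES users(id), "
        "item TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")
    conn.executemany(
        "INSERT INTO orders (user_id, item) VALUES (?, ?)",
        [(1, "keyboard"), (1, "monitor")],
    )

    # The schema is predefined: a column that was never declared is rejected.
    try:
        conn.execute("INSERT INTO users (id, name, nickname) VALUES (2, 'Bob', 'bobby')")
        schema_enforced = False
    except sqlite3.OperationalError:
        schema_enforced = True

    # Related data lives in separate tables and is combined with a join.
    rows = conn.execute(
        "SELECT u.name, o.item FROM users u "
        "JOIN orders o ON o.user_id = u.id ORDER BY o.id"
    ).fetchall()
    conn.close()
    return schema_enforced, rows


def document_demo():
    """Flexible schema with embedded data, using dicts as stand-in documents."""
    users = [
        # Orders are embedded in the user document instead of joined.
        {"_id": 1, "name": "Ada", "orders": [{"item": "keyboard"}, {"item": "monitor"}]},
        # A second document may carry different fields; nothing enforces a schema.
        {"_id": 2, "name": "Bob", "nickname": "bobby"},
    ]
    ada = next(doc for doc in users if doc["_id"] == 1)
    items = [order["item"] for order in ada["orders"]]
    return items, json.dumps(users[1])


if __name__ == "__main__":
    print(relational_demo())
    print(document_demo())
```

In the relational half the database itself rejects the undeclared `nickname` column, while in the document half any check on the shape of a record would have to be written in application code, which mirrors the Data Integrity row above.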
Different types of non-relational databases and their features
Type | Description | Features | Key Usage | Examples |
---|---|---|---|---|
Document Store | Stores data in documents (typically JSON or BSON format). | Flexible schema, hierarchical data structures, supports nested data. | Content management, user profiles, catalogs. | MongoDB, CouchDB, Firebase Firestore |
Key-Value Store | Stores data as key-value pairs. | Simple data model, high performance for lookups by key, easy to scale horizontally. | Session management, caching, real-time data. | Redis, DynamoDB, Riak |
Column Family Store | Stores data in columns rather than rows. | Optimized for read and write performance on large datasets, supports flexible column families. | Time-series data, analytics, real-time big data. | Apache Cassandra, HBase, ScyllaDB |
Graph Database | Stores data as nodes and edges in a graph structure. | Optimized for relationships and connections, supports complex queries on graph data. | Social networks, recommendation systems, fraud detection. | Neo4j, ArangoDB, Amazon Neptune |
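
To give a feel for how the four data models differ in shape, the sketch below mimics each one with ordinary Python data structures. This is only an illustration of the models, not the API of any listed product; the keys, sample values, and the helpers `put_column` and `reachable` are hypothetical.

```python
from collections import deque

# Key-value store: an opaque value looked up by a single key.
kv_store = {"session:42": "user=ada;cart=3"}

# Document store: a nested, self-describing record per key.
documents = {
    "user:1": {"name": "Ada", "address": {"city": "London"}, "tags": ["admin"]},
}

# Column-family store: row key -> column family -> column -> value.
# Rows in the same table may hold different sets of columns.
wide_rows = {}


def put_column(table, row_key, family, column, value):
    """Write one cell, creating the row and column family if needed."""
    table.setdefault(row_key, {}).setdefault(family, {})[column] = value


put_column(wide_rows, "sensor-7", "readings", "2024-01-01T00:00", 21.5)
put_column(wide_rows, "sensor-7", "readings", "2024-01-01T00:05", 21.7)
put_column(wide_rows, "sensor-9", "meta", "location", "roof")

# Graph database: nodes connected by directed edges (here, "follows").
follows = {"ada": ["bob"], "bob": ["cy"], "cy": [], "dee": ["ada"]}


def reachable(graph, start):
    """Return every node reachable from start, via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)
    return seen


if __name__ == "__main__":
    print(kv_store["session:42"])
    print(documents["user:1"]["address"]["city"])
    print(sorted(wide_rows["sensor-7"]["readings"]))
    print(sorted(reachable(follows, "ada")))
```

The traversal in `reachable` is the kind of relationship query a graph database is built to answer directly; in the other three models it would typically take repeated lookups driven by application code.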