Lecture 1: Introduction to Non-Relational Databases

Learning objectives

By the end of this lecture, students should be able to:

Understand how relational and non-relational databases differ in terms of data structure, schema, and query language.
Explain the pros and cons of each type of database and its typical use cases.
Explain the different types of non-relational databases (i.e., column-family, key-value, graph, and document stores).

Slides

Note

Download a PDF version here

Supplemental materials

Key differences between relational and non-relational databases

Feature	SQL Databases (Relational)	NoSQL Databases (Non-Relational)
Data Model	Structured data with predefined schema (tables, rows, columns)	Flexible schema, supports unstructured and semi-structured data
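Schema Flexibility	Rigid schema; structure must be defined before data is inserted	Schema-less or schema-on-read, allowing dynamic and flexible data models
Scalability	Primarily vertical scaling (scaling up by adding more power to a single server); horizontal scaling via replication or sharding is possible but harder to operate	Designed for horizontal scaling (scaling out by adding more servers)
ACID Compliance	Strong ACID (Atomicity, Consistency, Isolation, Durability) support	Often favors availability and eventual consistency over strong ACID guarantees, although some systems (e.g., MongoDB) also offer ACID transactions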
Performance	Efficient for complex queries and transactions	Optimized for high-volume reads/writes and specific use cases
Querying	Powerful and standardized SQL for complex queries	Varies by database type; often a database-specific query language or API (e.g., Cassandra's CQL, Neo4j's Cypher)
Data Relationships	Strong support for complex joins and relationships	Limited or no support for complex joins in most types; relationships are usually embedded or linked by reference (graph databases are the exception)
Data Integrity	Enforces data integrity through constraints and normalization	Data integrity is largely managed by the application; denormalization is common
Use Case Suitability	Best for structured data, complex transactions, and relationships	Best for big data, real-time analytics, unstructured data, and high scalability needs
Maintenance	Requires more maintenance (e.g., schema changes, indexing, tuning)	Generally requires less maintenance, but can be complex in distributed systems
Learning Curve	Standardized and well-documented; widely taught	Diverse models with a steeper learning curve due to lack of standardization
Ecosystem	Mature ecosystem with a wide range of tools and support	Younger ecosystem; tools and support vary by database type
Cost	Can be expensive at scale due to the need for powerful hardware	Cost-effective for large-scale systems, often runs on commodity hardware
Examples	MySQL, PostgreSQL, Oracle, SQL Server	MongoDB, Cassandra, Redis, Neo4j
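
To make the schema, relationship, and integrity rows of the table more concrete, the following is a minimal sketch using Python's built-in sqlite3 module for the relational side and plain dictionaries as a stand-in for a document store. The table names (customers, orders), the field names, and the sample data are all hypothetical and chosen only for illustration; a real document database would be accessed through its own driver rather than Python dictionaries.

```python
import json
import sqlite3

# Relational side: the schema is declared up front and enforced by the engine.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item        TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
    [(1, "keyboard"), (1, "monitor")],
)

# Relationships are resolved at query time with a join.
joined_rows = conn.execute(
    """
    SELECT c.name, o.item
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
    """
).fetchall()

# The engine itself rejects an order that points at a missing customer.
try:
    conn.execute("INSERT INTO orders (customer_id, item) VALUES (99, 'mouse')")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True

# Document side (stand-in): related data is embedded in one record,
# and two records in the same collection may have different fields.
customer_doc = {
    "_id": 1,
    "name": "Ada",
    "orders": [{"item": "keyboard"}, {"item": "monitor"}],
}
other_doc = {"_id": 2, "name": "Grace", "loyalty_tier": "gold"}
collection = [customer_doc, other_doc]

# No join is needed: the orders travel with the customer document.
embedded_items = [order["item"] for order in customer_doc["orders"]]

print(joined_rows)
print(constraint_enforced)
print(embedded_items)
print(json.dumps(other_doc))
```

In this sketch the relational engine does the integrity checking, while nothing stops the document collection from holding records of different shapes; any validation there would have to live in application code.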

Different types of non-relational databases and their features

Type	Description	Features	Key Usage	Examples
Document Store	Stores data in documents (typically JSON or BSON format).	Flexible schema, hierarchical data structures, supports nested data.	Content management, user profiles, catalogs.	MongoDB, CouchDB, Firebase Firestore
Key-Value Store	Stores data as key-value pairs.	Simple data model, high performance for lookups by key, easy to scale horizontally.	Session management, caching, real-time data.	Redis, DynamoDB, Riak
Column Family Store	Stores each row's data as columns grouped into column families; rows in the same table can have different columns.	Optimized for read and write performance on large datasets, supports flexible column families.	Time-series data, analytics, real-time big data.	Apache Cassandra, HBase, ScyllaDB
Graph Database	Stores data as nodes and edges in a graph structure.	Optimized for relationships and connections, supports complex queries on graph data.	Social networks, recommendation systems, fraud detection.	Neo4j, ArangoDB, Amazon Neptune
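
The four data models in the table above can be sketched with plain Python data structures. This is only an illustration of the shape of the data, not of how any of the listed products store it internally; every key, field name, and value below (session:42, sensor-1, follows, and so on) is hypothetical, as is the helper function hops_between.

```python
from collections import deque

# Key-value store: an opaque value addressed only by its key.
kv_store = {"session:42": "user=ada;expires=3600"}

# Document store: nested, self-describing records.
documents = {
    "user:1": {"name": "Ada", "address": {"city": "London"}, "tags": ["admin"]},
}

# Column-family store: row key -> column family -> columns.
# Rows in the same table may hold different columns and families.
wide_rows = {
    "sensor-1": {"readings": {"2024-01-01T00:00": 20.5, "2024-01-01T00:05": 20.7}},
    "sensor-2": {"readings": {"2024-01-01T00:00": 18.1}, "meta": {"site": "roof"}},
}

# Graph database: nodes plus directed edges, kept here as an adjacency list.
follows = {
    "ada": ["grace"],
    "grace": ["linus"],
    "linus": [],
}


def hops_between(graph, start, goal):
    """Breadth-first search: edges on a shortest path from start to goal, or None."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


print(kv_store["session:42"])                  # lookup by key only
print(documents["user:1"]["address"]["city"])  # navigate nested fields
print(sorted(wide_rows["sensor-2"]))           # families present on this row
print(hops_between(follows, "ada", "linus"))   # relationship traversal
```

The access pattern differs in each case: the key-value store is read by exact key, the document by navigating nested fields, the column-family row by family and column name, and the graph by traversing edges, which is the kind of query graph databases are optimized for.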