Lecture 1: Introduction to Non-Relational Databases

Learning objectives

By the end of this lecture, students should be able to:

  • Understand how relational and non-relational databases differ in data structure, schema, and query language.

  • Explain the pros and cons of each type of database and its typical use cases.

  • Explain the different types of non-relational databases (i.e., column-family, key-value, graph, and document).

Supplemental materials

Key differences between relational and non-relational databases

| Feature | SQL Databases (Relational) | NoSQL Databases (Non-Relational) |
|---|---|---|
| Data Model | Structured data with a predefined schema (tables, rows, columns) | Flexible schema; supports unstructured and semi-structured data |
| Schema Flexibility | Rigid schema; requires a predefined structure | Schema-less, allowing for dynamic and flexible data models |
| Scalability | Primarily vertical scaling (scaling up by adding more power to a single server) | Primarily horizontal scaling (scaling out by adding more servers) |
| ACID Compliance | Strong ACID (Atomicity, Consistency, Isolation, Durability) support | Often favors eventual consistency over strong ACID compliance |
| Performance | Efficient for complex queries and transactions | Optimized for high-volume reads/writes and specific use cases |
| Querying | Powerful and standardized SQL for complex queries | Varies by database type; may require custom query languages |
| Data Relationships | Strong support for complex joins and relationships | Limited or no support for complex joins; relationships are embedded or linked |
| Data Integrity | Enforces data integrity through constraints and normalization | Data integrity is often managed by the application; denormalization is common |
| Use Case Suitability | Best for structured data, complex transactions, and relationships | Best for big data, real-time analytics, unstructured data, and high scalability needs |
| Maintenance | Requires more maintenance (e.g., schema changes, indexing, tuning) | Generally requires less schema maintenance, but can be complex to operate in distributed systems |
| Learning Curve | Standardized and well-documented; widely taught | Diverse models with a steeper learning curve due to lack of standardization |
| Ecosystem | Mature ecosystem with a wide range of tools and support | Younger ecosystem; tools and support vary by database type |
| Cost | Can be expensive at scale due to the need for powerful hardware | Cost-effective for large-scale systems; often runs on commodity hardware |
| Examples | MySQL, PostgreSQL, Oracle, SQL Server | MongoDB, Cassandra, Redis, Neo4j |
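
To make the data model, data integrity, and data relationships rows more concrete, here is a minimal sketch that uses only the Python standard library. The relational side uses an in-memory SQLite database. The document side is just a Python dictionary standing in for a record in a document store; it is not a real NoSQL client. The table names, field names, and sample values (`users`, `orders`, `Ada`, and so on) are made up for illustration.

```python
import json
import sqlite3

# Relational side: the schema is declared up front and enforced by the engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        item    TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO orders (user_id, item) VALUES (?, ?)",
    [(1, "keyboard"), (1, "monitor")],
)

# Relationships live in separate tables and are resolved at query time with a join.
relational_rows = conn.execute(
    """
    SELECT u.name, o.item
    FROM users AS u
    JOIN orders AS o ON o.user_id = u.id
    ORDER BY o.id
    """
).fetchall()

# A row that violates a constraint (here NOT NULL) is rejected by the database.
try:
    conn.execute("INSERT INTO users (id, name) VALUES (2, NULL)")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True

# Document side (illustrative stand-in): related data is embedded in one record,
# and a new field can be added without any schema change.
user_document = {
    "_id": 1,
    "name": "Ada",
    "orders": [{"item": "keyboard"}, {"item": "monitor"}],
    "preferences": {"theme": "dark"},  # extra field; nothing had to be migrated
}

# No join is needed: the orders travel with the user record.
document_rows = [
    (user_document["name"], order["item"]) for order in user_document["orders"]
]

# Documents are typically stored and exchanged as JSON (or a binary variant).
serialized = json.dumps(user_document)

print(relational_rows)
print(document_rows)
print(constraint_enforced)
```

Both approaches produce the same name and item pairs. The difference is where the structure and the checks live: in the database for the relational version, and in the application code for the dictionary version.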

Different types of non-relational databases and their features

| Type | Description | Features | Key Usage | Examples |
|---|---|---|---|---|
| Document Store | Stores data in documents (typically JSON or BSON format). | Flexible schema, hierarchical data structures, supports nested data. | Content management, user profiles, catalogs. | MongoDB, CouchDB, Firebase Firestore |
| Key-Value Store | Stores data as key-value pairs. | Simple data model, high performance for lookups by key, easy to scale horizontally. | Session management, caching, real-time data. | Redis, DynamoDB, Riak |
| Column Family Store | Stores each row's data in column families (groups of related columns) rather than in fixed table columns. | Optimized for read and write performance on large datasets, supports flexible column families. | Time-series data, analytics, real-time big data. | Apache Cassandra, HBase, ScyllaDB |
| Graph Database | Stores data as nodes and edges in a graph structure. | Optimized for relationships and connections, supports complex queries on graph data. | Social networks, recommendation systems, fraud detection. | Neo4j, ArangoDB, Amazon Neptune |
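
The four models above can be sketched with plain Python data structures. This is only an illustration of the shape of the data in each model, assuming in-memory dictionaries and lists; it does not use or imitate the API of any of the listed products. All keys, names, and values (`user:1`, `sensor-7`, `FOLLOWS`, and so on) are hypothetical.

```python
# Document store: one self-contained, possibly nested record per identifier.
documents = {
    "user:1": {"name": "Ada", "tags": ["admin"], "address": {"city": "London"}},
}

# Key-value store: an opaque value that is retrieved only by its exact key.
key_value = {"session:abc123": b"serialized-session-bytes"}

# Column-family store: row key -> column family -> columns.
# Rows do not need to share the same set of columns.
column_family = {
    "sensor-7": {
        "readings": {"2024-01-01T00:00": 21.5, "2024-01-01T00:05": 21.7},
        "meta": {"location": "lab"},
    },
    "sensor-8": {
        "readings": {"2024-01-01T00:00": 19.0},
    },
}

# Graph database: nodes plus labelled, directed edges between them.
nodes = {
    "ada": {"label": "Person"},
    "bob": {"label": "Person"},
    "carol": {"label": "Person"},
}
edges = [("ada", "FOLLOWS", "bob"), ("bob", "FOLLOWS", "carol")]


def followees(person: str) -> set:
    """Return the people that `person` follows directly."""
    return {dst for src, rel, dst in edges if src == person and rel == "FOLLOWS"}


def friends_of_friends(person: str) -> set:
    """Return people exactly two FOLLOWS hops away from `person`."""
    direct = followees(person)
    second_hop = set()
    for friend in direct:
        second_hop |= followees(friend)
    return second_hop - direct - {person}


# Typical access pattern for each model.
city = documents["user:1"]["address"]["city"]            # navigate nested fields
session = key_value["session:abc123"]                    # lookup by exact key
latest_ts = max(column_family["sensor-7"]["readings"])   # ISO timestamps sort as text
latest_reading = column_family["sensor-7"]["readings"][latest_ts]
suggestions = friends_of_friends("ada")                  # traverse relationships

print(city, session, latest_reading, suggestions)
```

The access patterns at the end mirror the "Key Usage" column: nested lookups for documents, exact-key lookups for key-value data, per-row column scans for time-series data, and edge traversal for relationship queries.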