Database Design & SQL

Chapter 1: The Two Worlds - SQL vs NoSQL

Every application needs a place to remember things. We call this a database.

1.1 SQL (Relational) - The Strict Librarian

SQL databases (PostgreSQL, MySQL, SQL Server) are like a strict spreadsheet. You must define your columns ("Name", "Age", "Email") before you add data. Everything is connected nicely.

Analogy: An Excel sheet where you can't type text into a "Number" column.

1.2 NoSQL (Non-Relational) - The Flexible Bucket

NoSQL databases (MongoDB, DynamoDB, Redis) are like a cardboard box. You can throw anything in—documents, JSON, images. You don't need to plan ahead.

Analogy: A folder on your computer. One file can be a photo, another a text file.

Chapter 2: SQL Concepts - Tables and Keys

2.1 Tables, Rows, and Columns

  • Table: A collection of things (e.g., "Users").
  • Row: One specific thing (e.g., "User Bob").
  • Column: A property (e.g., "Email Address").

2.2 Primary and Foreign Keys

Databases need to know how data relates to each other.

🔗 Real World: Student IDs

Primary Key (PK): Every student has a unique "Student ID" (e.g., 101). No two students have the same ID. This uniquely identifies you.

Foreign Key (FK): In the "Class Roster" table, we don't write your full name again. We just write "Student ID: 101". This points back to your main record.

Chapter 3: ACID Transactions - The Safety Promise

A Transaction is a sequence of database operations that must be treated as a single unit.

💸 Real World: The Bank Transfer

You transfer $100 to your friend.

  1. Subtract $100 from Your Account.
  2. Add $100 to Friend's Account.

The Danger: What if the power goes out after Step 1? You lost money, friend got nothing!

ACID Solution: The database guarantees Atomicity. If Step 2 fails, Step 1 is UNDONE automatically. It's all or nothing.

The ACID Rules

  • A - Atomicity: All operations happen, or none happen.
  • C - Consistency: Data always follows the rules (constraints).
  • I - Isolation: My transaction doesn't mess up yours (concurrency).
  • D - Durability: Once the database says "Saved", it stays saved even if the server explodes.

Chapter 4: Indexes - Finding Needles in Haystacks

By default, databases are slow at finding specific rows. They have to read every single row to find "Bob".

📖 Real World: The Textbook Index

To find "Photosynthesis" in a 1,000-page book:

  1. Slow Way (Full Table Scan): Read page 1, page 2, page 3... until you find it.
  2. Fast Way (Index): Go to the back of the book. Look under "P". It says "Page 542". Jump directly there.

The Trade-Off

If indexes make reading fast, why not index everything?

Because indexes make WRITING slow.

When you add a new chapter to the book, you also have to go to the back and update the index. If you have 10 indexes, you have to update 10 lists!

Chapter 5: Scaling - When One Server Isn't Enough

As you get millions of users, a single database server will get overwhelmed (CPU/RAM limit).

5.1 Vertical Scaling (Scaling Up)

Concept: Buy a bigger, expensive computer.

Pros: Easy (no code changes).

Cons: Expensive. Has a limit (you can't buy infinite RAM).

5.2 Horizontal Scaling (Sharding)

Concept: Split data across multiple smaller servers.

📚 Real World: Library Sections

The library is full. You build a second building.

  • Building A: Books starting with A-M.
  • Building B: Books starting with N-Z.

You have double the capacity! But fetching "All books by Author" is now harder if you don't know the author's name.

5.3 Read Replicas

Concept: One "Master" server handles Writes. Five "Slave" servers handle Reads.

Pros: Great for read-heavy apps (Twitter, Facebook).
Cons: "Replication Lag". You might post a comment, refresh the page, and not see it immediately because the Slave hasn't updated yet.

Chapter 6: NoSQL types - Choosing the Right Tool

Type Example Use Case
Document MongoDB Product catalogs, User profiles (flexible data).
Key-Value Redis Caching, Session management (super fast).
Column-Family Cassandra Massive write volume (Chat logs, IoT sensors).
Graph Neo4j Social networks, Recommendation engines.

Chapter 7: Summary Checklist

Database Best Practices:

  • [ ] Index Foreign Keys and Search Fields. If you filter by `Email`, index it.
  • [ ] Avoid N+1 Queries. Fetch data in batches, not loops.
  • [ ] Use Transactions. Always use transactions for money or critical multi-step updates.
  • [ ] Don't store Binary Files. Store images in S3 (cloud storage), not the database. Store the URL in the DB.
  • [ ] Backup. If you don't have a tested backup, you don't have data.

Quick Review

Database design is choosing a schema and access strategy that balances integrity, query performance, and scalability, allowing us to meet correctness and latency requirements as data grows.

✅ SQL vs NoSQL

  • SQL: structured schema, relations, joins, strong consistency tools.
  • NoSQL: flexible shapes and/or massive scale patterns (document, key-value, column, graph).

✅ Keys and relationships

  • Primary key: unique identity for a row.
  • Foreign key: links one table to another (enforces integrity when used).

✅ ACID (what transactions guarantee)

  • Atomicity: all-or-nothing.
  • Consistency: rules stay true before/after.
  • Isolation: concurrent work doesn’t corrupt results.
  • Durability: committed data survives crashes.

✅ Indexes and scaling

  • Index: speeds reads, costs writes and storage (index what you filter/join on).
  • Scale up: bigger machine (simple, limited).
  • Scale out: sharding and read replicas (more capacity, more complexity).