Database Design & SQL
Chapter 1: The Two Worlds - SQL vs NoSQL
Every application needs a place to remember things. We call this a database.
1.1 SQL (Relational) - The Strict Librarian
SQL databases (PostgreSQL, MySQL, SQL Server) are like a strict spreadsheet. You must define your columns ("Name", "Age", "Email") before you add data. Everything is connected nicely.
Analogy: An Excel sheet where you can't type text into a "Number" column.
1.2 NoSQL (Non-Relational) - The Flexible Bucket
NoSQL databases (MongoDB, DynamoDB, Redis) are like a cardboard box. You can throw anything in—documents, JSON, images. You don't need to plan ahead.
Analogy: A folder on your computer. One file can be a photo, another a text file.
Chapter 2: SQL Concepts - Tables and Keys
2.1 Tables, Rows, and Columns
- Table: A collection of things (e.g., "Users").
- Row: One specific thing (e.g., "User Bob").
- Column: A property (e.g., "Email Address").
2.2 Primary and Foreign Keys
Databases need to know how data relates to each other.
🔗 Real World: Student IDsPrimary Key (PK): Every student has a unique "Student ID" (e.g., 101). No two students have the same ID. This uniquely identifies you.
Foreign Key (FK): In the "Class Roster" table, we don't write your full name again. We just write "Student ID: 101". This points back to your main record.
Chapter 3: ACID Transactions - The Safety Promise
A Transaction is a sequence of database operations that must be treated as a single unit.
💸 Real World: The Bank TransferYou transfer $100 to your friend.
- Subtract $100 from Your Account.
- Add $100 to Friend's Account.
The Danger: What if the power goes out after Step 1? You lost money, friend got nothing!
ACID Solution: The database guarantees Atomicity. If Step 2 fails, Step 1 is UNDONE automatically. It's all or nothing.
The ACID Rules
- A - Atomicity: All operations happen, or none happen.
- C - Consistency: Data always follows the rules (constraints).
- I - Isolation: My transaction doesn't mess up yours (concurrency).
- D - Durability: Once the database says "Saved", it stays saved even if the server explodes.
Chapter 4: Indexes - Finding Needles in Haystacks
By default, databases are slow at finding specific rows. They have to read every single row to find "Bob".
📖 Real World: The Textbook IndexTo find "Photosynthesis" in a 1,000-page book:
- Slow Way (Full Table Scan): Read page 1, page 2, page 3... until you find it.
- Fast Way (Index): Go to the back of the book. Look under "P". It says "Page 542". Jump directly there.
The Trade-Off
If indexes make reading fast, why not index everything?
Because indexes make WRITING slow.
When you add a new chapter to the book, you also have to go to the back and update the index. If you have 10 indexes, you have to update 10 lists!
Chapter 5: Scaling - When One Server Isn't Enough
As you get millions of users, a single database server will get overwhelmed (CPU/RAM limit).
5.1 Vertical Scaling (Scaling Up)
Concept: Buy a bigger, expensive computer.
Pros: Easy (no code changes).
Cons: Expensive. Has a limit (you can't buy infinite RAM).
5.2 Horizontal Scaling (Sharding)
Concept: Split data across multiple smaller servers.
📚 Real World: Library SectionsThe library is full. You build a second building.
- Building A: Books starting with A-M.
- Building B: Books starting with N-Z.
You have double the capacity! But fetching "All books by Author" is now harder if you don't know the author's name.
5.3 Read Replicas
Concept: One "Master" server handles Writes. Five "Slave" servers handle Reads.
Pros: Great for read-heavy apps (Twitter, Facebook).
Cons: "Replication Lag". You might post a comment, refresh the page, and not see it immediately because the Slave hasn't updated yet.
Chapter 6: NoSQL types - Choosing the Right Tool
| Type | Example | Use Case |
|---|---|---|
| Document | MongoDB | Product catalogs, User profiles (flexible data). |
| Key-Value | Redis | Caching, Session management (super fast). |
| Column-Family | Cassandra | Massive write volume (Chat logs, IoT sensors). |
| Graph | Neo4j | Social networks, Recommendation engines. |
Chapter 7: Summary Checklist
Database Best Practices:
- [ ] Index Foreign Keys and Search Fields. If you filter by `Email`, index it.
- [ ] Avoid N+1 Queries. Fetch data in batches, not loops.
- [ ] Use Transactions. Always use transactions for money or critical multi-step updates.
- [ ] Don't store Binary Files. Store images in S3 (cloud storage), not the database. Store the URL in the DB.
- [ ] Backup. If you don't have a tested backup, you don't have data.
Quick Review
Database design is choosing a schema and access strategy that balances integrity, query performance, and scalability, allowing us to meet correctness and latency requirements as data grows.
✅ SQL vs NoSQL
- SQL: structured schema, relations, joins, strong consistency tools.
- NoSQL: flexible shapes and/or massive scale patterns (document, key-value, column, graph).
✅ Keys and relationships
- Primary key: unique identity for a row.
- Foreign key: links one table to another (enforces integrity when used).
✅ ACID (what transactions guarantee)
- Atomicity: all-or-nothing.
- Consistency: rules stay true before/after.
- Isolation: concurrent work doesn’t corrupt results.
- Durability: committed data survives crashes.
✅ Indexes and scaling
- Index: speeds reads, costs writes and storage (index what you filter/join on).
- Scale up: bigger machine (simple, limited).
- Scale out: sharding and read replicas (more capacity, more complexity).