05 Databases
Databases are a critical component of many applications, and are used to store and manage data in a structured way. There are many different types of databases, each with their own strengths and weaknesses.
Data Gravity
Data gravity describes the concept that data pulls applications and services towards it. This means that applications and services tend to live where the data is, and not the other way around. This is especially relevant for cloud migrations, where moving large amounts of data can be expensive and time-consuming.
Scaling
Vertical Scaling
More resources for a single instance.
Pros: Simple
Cons: Single Point of Failure, Expensive, No Elasticity (cannot scale up and down dynamically)
Horizontal Scaling
Distribution of data across multiple instances (partitioning or sharding).
Pros: Robust, Elasticity
Cons: Complex
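As a rough illustration of how data can be spread across instances, here is a minimal sketch of hash-based sharding. The names `shard_for` and `ShardedStore` are hypothetical, and each shard is just an in-memory dictionary standing in for a separate database instance.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store; each dict stands in for one database instance."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=4)
store.put("user:1", "Ada")
store.put("user:2", "Grace")
```

Part of the complexity listed above shows up even in this sketch: with plain modulo hashing, changing `num_shards` moves most keys to a different shard, which is why real systems often use schemes such as consistent hashing instead.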
CAP Theorem
The CAP theorem states that it is impossible for a distributed system to simultaneously provide all three of the following guarantees:
- Consistency: Every read reflects the most recent write, so all nodes appear to see the same data at the same time.
- Availability: Every request receives a response about whether it was successful or failed.
- Partition Tolerance: The system continues to operate despite arbitrary message loss or failure of part of the system.
Selecting two does not necessarily mean that the third is not present, but rather that it is not guaranteed. Systems can be designed to switch between AP and CP modes depending on the situation.
The CAP theorem should be used to think about the trade-offs between consistency, availability, and partition tolerance when designing distributed systems. The choice of which two to prioritize depends on the specific requirements of the system.
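The trade-off during a network partition can be sketched with a toy replica that either refuses writes (CP) or accepts them and risks divergence (AP). This is a simplified model under stated assumptions, not how any particular database behaves; `CapReplica`, its `mode` parameter, and the `partitioned` flag are hypothetical names.

```python
class CapReplica:
    """Toy replica that behaves differently under a partition."""

    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.value = None
        self.partitioned = False  # True when the replica cannot reach its peers

    def write(self, value) -> str:
        if self.partitioned and self.mode == "CP":
            # CP: give up availability rather than accept a write
            # that the other replicas cannot see.
            raise RuntimeError("unavailable: cannot reach peers")
        # AP (or no partition): accept the write locally. Under a
        # partition, other replicas may now hold a different value.
        self.value = value
        return "ok"


cp = CapReplica("CP")
ap = CapReplica("AP")
cp.partitioned = True
ap.partitioned = True

ap_result = ap.write("x")  # accepted, possibly inconsistent with peers
try:
    cp.write("x")
    cp_result = "ok"
except RuntimeError:
    cp_result = "unavailable"
```

When there is no partition, both modes accept the write, which matches the point above that the third property is not absent, only not guaranteed.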
ACID vs BASE
ACID
- Atomicity: All or nothing.
- Consistency: Data is always in a valid state.
- Isolation: Concurrent transactions do not interfere with each other.
- Durability: Committed data is never lost.
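Atomicity and consistency can be illustrated with a small sketch using Python's built-in `sqlite3` module. The `accounts` table and the `transfer` helper are assumptions made up for this example. The transfer credits the receiver first and then debits the sender, so a failing debit shows that the earlier credit is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src: str, dst: str, amount: int) -> bool:
    """Move money between accounts; all or nothing."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn) -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A call such as `transfer(conn, "alice", "bob", 500)` fails on the debit, and the credit to `bob` is undone with it, so the data never ends up in an invalid intermediate state.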
BASE
- Basically Available: The system responds to most requests, even during partial failures, although a response may be stale or degraded.
- Soft State: The state of the system may change over time, even without new input, as replicas converge.
- Eventually Consistent: The system will eventually become consistent.
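Eventual consistency can be sketched with two replicas that accept writes independently and converge later through a synchronization step. This is a minimal model assuming last-write-wins conflict resolution with a caller-supplied version number; `LwwReplica` and `anti_entropy` are hypothetical names, and real systems use a variety of other strategies.

```python
class LwwReplica:
    """Toy replica storing key -> (version, value), last write wins."""

    def __init__(self) -> None:
        self.data = {}

    def write(self, key, value, version: int) -> None:
        current = self.data.get(key)
        # Keep the entry with the higher version; ignore older writes.
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def anti_entropy(a: LwwReplica, b: LwwReplica) -> None:
    """Exchange entries in both directions so the replicas converge."""
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)


r1, r2 = LwwReplica(), LwwReplica()
r1.write("cart", ["book"], version=1)
r2.write("cart", ["book", "pen"], version=2)

# Before synchronization the replicas disagree (soft state).
before = (r1.read("cart"), r2.read("cart"))

anti_entropy(r1, r2)

# After synchronization both hold the newest write.
after = (r1.read("cart"), r2.read("cart"))
```

Between the two writes and the call to `anti_entropy`, a client could read different values depending on which replica it reaches; that window is what "eventually" refers to.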
Types of Databases
There are two main types of databases: SQL and NoSQL.
SQL
- Relational
- Analytical (OLAP)
NoSQL
- Document
- Key-Value
- Column-Family
- Graph
Comparison
| | SQL | NoSQL |
|---|---|---|
| Model | Relational | Non-Relational |
| Data | Structured | Semi-Structured |
| Flexibility | Strict Schema | Dynamic Schema |
| Transactions | ACID | mostly BASE |
| Consistency | Strong | Eventual to Strong |
| Availability | Consistency-prioritized | Basic availability |
| Scale | Vertical | Horizontal |
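
The "Flexibility" row can be illustrated with a small sketch that contrasts a fixed relational schema with schema-free documents. It uses Python's built-in `sqlite3` and `json` modules; the `users` table, the `insert_with_unknown_column` helper, and the `documents` dictionary (standing in for a document store) are assumptions made up for this example.

```python
import json
import sqlite3

# Relational side: columns are fixed when the table is created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")


def insert_with_unknown_column(conn) -> str:
    """Try to store a field that the schema does not define."""
    try:
        conn.execute(
            "INSERT INTO users (id, name, nickname) VALUES (2, 'Bob', 'bobby')"
        )
        return "accepted"
    except sqlite3.OperationalError:
        # The table has no 'nickname' column, so the statement is rejected.
        return "rejected"


# Document side: each record carries its own structure.
documents = {}
documents["1"] = json.dumps({"name": "Ada"})
documents["2"] = json.dumps({"name": "Bob", "nickname": "bobby"})
```

In the relational case the extra field requires a schema change (for example `ALTER TABLE`) before it can be stored, while the document store accepts it immediately and leaves it to the application to handle records with differing fields.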