Databases are a critical component of many applications: they store and manage data in a structured way. There are many different types of databases, each with its own strengths and weaknesses.

Data Gravity

Data gravity describes the idea that data pulls applications and services toward it. Applications and services tend to live where the data is, not the other way around. This is especially relevant for cloud migrations, where moving large volumes of data can be expensive and time-consuming.
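A back-of-the-envelope estimate shows why moving data is time-consuming. The sketch below is only illustrative: the function name, the 100 TB dataset, and the 1 Gbit/s link are assumptions, and the calculation ignores protocol overhead, retries, and throttling.

```python
def transfer_time_hours(size_terabytes: float, bandwidth_gbit_per_s: float) -> float:
    """Estimate how long a bulk transfer takes at a sustained link speed.

    Uses decimal units (1 TB = 10**12 bytes, 1 Gbit = 10**9 bits) and ignores
    overhead, so a real transfer would take longer.
    """
    size_bits = size_terabytes * 10**12 * 8
    seconds = size_bits / (bandwidth_gbit_per_s * 10**9)
    return seconds / 3600


# Hypothetical example: 100 TB over a sustained 1 Gbit/s link.
hours = transfer_time_hours(100, 1)
print(f"{hours:.1f} hours (~{hours / 24:.1f} days)")  # 222.2 hours (~9.3 days)
```

Even under these optimistic assumptions the copy takes more than a week, which is one reason compute tends to move to the data rather than the reverse.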

Scaling

Vertical Scaling

Adding more resources (CPU, memory, storage) to a single instance.

Pros: Simple
Cons: Single Point of Failure, Expensive, No Elasticity (cannot scale up and down dynamically)

Horizontal Scaling

Distribution (partitioning) of data across multiple instances.

Pros: Robust, Elastic
Cons: Complex
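One common way to spread data across instances is hash-based partitioning (sharding). The following is a minimal sketch, not a production design: `shard_for` and `ShardedStore` are hypothetical names, and plain dictionaries stand in for separate database instances.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and would route keys inconsistently.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store that spreads keys over in-memory 'instances'."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key: str, value: object) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str) -> object:
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=4)
for user_id in ("alice", "bob", "carol", "dave"):
    store.put(user_id, {"name": user_id})

print(store.get("carol"))                      # {'name': 'carol'}
print([len(shard) for shard in store.shards])  # keys per shard
```

The sketch also hints at the "Complex" drawback: with modulo placement, changing the number of shards remaps most keys, which is why real systems often use schemes such as consistent hashing.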

CAP Theorem

The CAP theorem states that it is impossible for a distributed system to simultaneously provide all three of the following guarantees:

  • Consistency: All nodes see the same data at the same time, so every read returns the most recent write.
  • Availability: Every request received by a non-failing node gets a response, whether the operation succeeded or failed.
  • Partition Tolerance: The system continues to operate despite arbitrary message loss or failure of part of the system.

Selecting two does not necessarily mean that the third is not present, but rather that it is not guaranteed. Systems can be designed to switch between AP and CP modes depending on the situation.

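The CAP theorem should be used to think about the trade-offs between consistency, availability, and partition tolerance when designing distributed systems. Because network partitions cannot be ruled out in practice, the choice usually comes down to whether to favor consistency or availability while a partition lasts, and that depends on the specific requirements of the system.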
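The difference between CP and AP behavior can be sketched with a toy two-replica model. This is an illustration under simplifying assumptions, not a real replication protocol: the `Replica` class and its `partitioned` flag are hypothetical.

```python
class Replica:
    """Toy replica that is either connected to its peer or partitioned."""

    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.value = None
        self.partitioned = False

    def write(self, value: str, peer: "Replica") -> bool:
        """Return True if the write was accepted."""
        if not self.partitioned:
            # Healthy network: update both copies, so they stay consistent.
            self.value = value
            peer.value = value
            return True
        if self.mode == "CP":
            # Refuse the write: consistency is kept, availability is given up.
            return False
        # AP: accept locally; the peer stays stale until the partition heals.
        self.value = value
        return True


for mode in ("CP", "AP"):
    a, b = Replica(mode), Replica(mode)
    a.write("v1", b)
    a.partitioned = True  # simulate a network partition
    accepted = a.write("v2", b)
    print(mode, "accepted:", accepted, "a:", a.value, "b:", b.value)
# CP accepted: False a: v1 b: v1
# AP accepted: True a: v2 b: v1
```

In CP mode both replicas still agree but the write is rejected. In AP mode the write succeeds but the replicas diverge until they are reconciled.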

ACID vs BASE

ACID

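  • Atomicity: All or nothing. A transaction either takes effect completely or not at all.
  • Consistency: Every transaction moves the data from one valid state to another.
  • Isolation: Concurrent transactions do not interfere with each other.
  • Durability: Committed data survives crashes and restarts.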
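Atomicity and consistency can be demonstrated with Python's built-in `sqlite3` module. The sketch below is a minimal illustration: the `accounts` table and the `transfer` helper are hypothetical, and because the database is in-memory, durability and isolation are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False


print(transfer(conn, "alice", "bob", 30))   # True
print(transfer(conn, "alice", "bob", 500))  # False, the credit to bob is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
print(balances)  # {'alice': 70, 'bob': 80}
```

The failed transfer credits bob before the debit violates the constraint, yet the final balances show no trace of that credit: the transaction was rolled back as a unit.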

BASE

  • Basically Available: The system responds to requests most of the time, even if some responses are stale or degraded.
  • Soft State: The state of the system may change over time, even without new input, as replicas converge.
  • Eventually Consistent: If no new updates arrive, all replicas will eventually hold the same data.
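Eventual consistency can be sketched with two replicas that converge only after a synchronization step. This is a simplified illustration: `LwwReplica` is a hypothetical class, and the caller-supplied version numbers stand in for whatever timestamps or version vectors a real system would use.

```python
class LwwReplica:
    """Toy replica storing key -> (version, value); the highest version wins."""

    def __init__(self) -> None:
        self.data = {}

    def write(self, key, value, version):
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        """Anti-entropy step: pull every entry the other replica has."""
        for key, (version, value) in other.data.items():
            self.write(key, value, version)


a, b = LwwReplica(), LwwReplica()
a.write("cart", ["book"], version=1)
print(a.read("cart"), b.read("cart"))  # ['book'] None  (replicas disagree)

b.sync_from(a)
print(a.read("cart"), b.read("cart"))  # ['book'] ['book']  (converged)
```

Between the write and the sync, a read from `b` returns stale data. That window is the "soft state" the system accepts in exchange for staying available.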

Types of Databases

There are two main types of databases: SQL and NoSQL.

SQL

  • Relational
  • Analytical (OLAP)

NoSQL

  • Document
  • Key-Value
  • Column-Family
  • Graph
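To make the four NoSQL models more concrete, the sketch below shows how the same hypothetical user and order might be shaped in each. Plain Python structures stand in for real stores, and all names and values are illustrative assumptions.

```python
import json

# Document: one self-contained, nested record per entity.
document = {"_id": "u1", "name": "alice", "orders": [{"id": "o1", "total": 30}]}

# Key-value: an opaque value looked up by key; the store does not interpret it.
key_value = {"user:u1": json.dumps({"name": "alice"})}

# Column-family: row key -> column family -> columns; rows may differ in columns.
column_family = {
    "u1": {"profile": {"name": "alice"}, "stats": {"order_count": 1}},
}

# Graph: nodes plus explicit, typed edges between them.
nodes = {"u1": {"label": "User"}, "o1": {"label": "Order"}}
edges = [("u1", "PLACED", "o1")]

# Each model favors a different access pattern.
user_name = json.loads(key_value["user:u1"])["name"]
placed_by_u1 = [dst for src, rel, dst in edges if src == "u1" and rel == "PLACED"]
print(user_name, placed_by_u1)  # alice ['o1']
```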

Comparison

Aspect         SQL                        NoSQL
Model          Relational                 Non-Relational
Data           Structured                 Semi-Structured
Flexibility    Strict Schema              Dynamic Schema
Transactions   ACID                       mostly BASE
Consistency    Strong                     Eventual to Strong
Availability   Consistency-prioritized    Basic availability
Scale          Vertical                   Horizontal
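The difference between a strict and a dynamic schema can be illustrated with a short sketch. It uses `sqlite3` for the relational side and a plain dictionary as a stand-in for a document store. The `users` table and the `nickname` field are hypothetical.

```python
import sqlite3

# Relational side: columns are declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "alice"))

try:
    # 'nickname' was never declared, so the database rejects the statement.
    conn.execute(
        "INSERT INTO users (id, name, nickname) VALUES (?, ?, ?)",
        (2, "bob", "bobby"),
    )
    schema_error = None
except sqlite3.OperationalError as exc:
    schema_error = str(exc)
print("SQL insert rejected:", schema_error)

# Document side: each record carries its own fields, so a new one needs no migration.
documents = {}
documents[1] = {"name": "alice"}
documents[2] = {"name": "bob", "nickname": "bobby"}
print(sorted(documents[2]))  # ['name', 'nickname']
```

The relational table would need an explicit `ALTER TABLE` before it could store the new field, while the document-style store accepts it immediately. The price is that readers must handle records with missing fields.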