00 Intro

Remote APIs

Remote APIs are used to access services over a network. Over time, different types of remote APIs have been developed.

RPC (Remote Procedure Call)
Message-based APIs (synchronous and asynchronous)
Shared repositories (e.g. tuple spaces)

Challenges

Remote APIs can be challenging to work with.

Heterogeneity

Different systems have different data representations, communication protocols, and programming languages.

Byte order (big-endian, little-endian), line endings, character encodings, …
Computer hardware
Operating systems
Programming languages

Solutions:

Internet protocols mask differences in the underlying networks.
Middleware provides a common interface to different systems (also provides language interoperability).
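To make the byte-order problem concrete, here is a small sketch using Python's standard `struct` module. The value `0x12345678` is an arbitrary example; the point is only that the same integer has two different byte layouts, and that reading it with the wrong one goes unnoticed.

```python
import struct

value = 0x12345678

# ">" forces big-endian (network byte order), "<" forces little-endian.
big = struct.pack(">I", value)
little = struct.pack("<I", value)

print(big.hex())     # 12345678
print(little.hex())  # 78563412

# Decoding with the wrong byte order silently yields a different number.
misread = struct.unpack("<I", big)[0]
print(hex(misread))  # 0x78563412
```

Agreeing on one wire format (here: always big-endian) is one way a protocol or middleware layer can hide this difference from the application.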

Latency

Networks have latency, which can be significant.

Remote invocation can be orders of magnitude slower than local invocation.

Solutions:

Chunking: Split large messages into smaller chunks.
Caching: Store results of remote invocations locally.
Asynchronous communication: Send a request and continue working, receive the result later.
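A minimal sketch of the caching idea, assuming a hypothetical `fetch_remote` function that stands in for a slow remote invocation. `functools.lru_cache` from the standard library keeps results locally, so repeated requests for the same key never reach the network.

```python
import functools

call_count = 0  # counts how often the "network" is actually used


def fetch_remote(key: str) -> str:
    """Hypothetical stand-in for a slow remote invocation."""
    global call_count
    call_count += 1
    return f"value-for-{key}"


@functools.lru_cache(maxsize=128)
def fetch_cached(key: str) -> str:
    # Only the first call per key reaches fetch_remote.
    return fetch_remote(key)


first = fetch_cached("user:1")
second = fetch_cached("user:1")  # served from the local cache
other = fetch_cached("user:2")   # a new key goes to the remote side

print(call_count)  # 2
```

The price of this speed-up is that a cached result can become stale when the remote data changes.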

Error Handling

Remote APIs can fail in different ways.

Overloads, timeouts, network failures, …
Graceful disconnections

Solutions:

Corrupted messages can be detected using checksums.
Sequence numbers can be used to detect lost messages.
Idempotent operations can be retried and simplify error handling.
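The first two solutions can be sketched together. The frame layout below (4-byte sequence number, 4-byte CRC32, payload) and the helper names `encode`, `decode`, and `missing` are assumptions made for illustration, not a standard protocol.

```python
import struct
import zlib


def encode(seq: int, payload: bytes) -> bytes:
    """Build a frame: sequence number, CRC32 over (sequence number + payload), payload."""
    header = struct.pack(">I", seq)
    checksum = zlib.crc32(header + payload)
    return header + struct.pack(">I", checksum) + payload


def decode(frame: bytes) -> tuple[int, bytes]:
    """Return (seq, payload), or raise ValueError if the frame is corrupted."""
    if len(frame) < 8:
        raise ValueError("frame too short")
    seq, checksum = struct.unpack(">II", frame[:8])
    payload = frame[8:]
    if zlib.crc32(frame[:4] + payload) != checksum:
        raise ValueError("checksum mismatch")
    return seq, payload


def missing(received: list[int]) -> list[int]:
    """Sequence numbers absent between the lowest and highest one seen."""
    if not received:
        return []
    seen = set(received)
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


frame = encode(7, b"hello")
corrupted = frame[:-1] + bytes([frame[-1] ^ 0x01])  # flip one bit in the payload

print(decode(frame))              # (7, b'hello')
print(missing([1, 2, 4, 5, 7]))   # [3, 6]
```

Calling `decode(corrupted)` raises `ValueError`, and the receiver can then ask for the frame again. Retransmission is only safe without extra bookkeeping if the requested operation is idempotent.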

Security

Remote APIs are exposed to the network and can be attacked.

Confidentiality: Data should not be accessible to unauthorized parties.
Integrity: Data should not be tampered with.
Authenticity: Data should provably come from the claimed source.

Solutions:

Authentication: Prove the identity of the sender.
Authorization: Determine what the sender is allowed to do.
Encryption: Protect data from unauthorized access.
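One common building block for integrity and authenticity is a message authentication code. The sketch below uses HMAC-SHA256 from Python's standard library; the shared key and the message contents are made-up examples, and key distribution is assumed to have happened already.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical shared secret


def sign(message: bytes, key: bytes = SECRET_KEY) -> bytes:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes, key: bytes = SECRET_KEY) -> bool:
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(message, key), tag)


message = b"transfer 10 to bob"
tag = sign(message)

print(verify(message, tag))                   # True
print(verify(b"transfer 1000 to bob", tag))   # False: message was altered
print(verify(message, tag, key=b"other"))     # False: sender used a different key
```

A tag like this does not hide the message; confidentiality still requires encryption.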

Scalability

Remote APIs should be able to handle a large number of clients.

Stability: The system should remain stable under heavy load.
Cost: The system should be cost-effective to scale (cost should scale at most linearly with users).

Solutions:

Load balancing: Distribute requests across multiple servers.
Clustering: Multiple servers work together to provide a single service.
Efficient algorithms: Use efficient algorithms to handle large amounts of data.
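A minimal sketch of load balancing with a round-robin strategy, one of the simplest possible policies. The class name `RoundRobinBalancer` and the server names are hypothetical; real load balancers usually also track server health and current load.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
assignments = [balancer.pick() for _ in range(6)]
load = Counter(assignments)

print(assignments)
print(load)  # each server handled 2 of the 6 requests
```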

Concurrency

Distributed systems are concurrent by nature.

Multiple clients can access the same resource at the same time.
Access to shared resources must be synchronized.

Solutions:

Locking: Prevent multiple clients from accessing a resource at the same time.
Lock-free algorithms: Algorithms that do not require locks.
Actor model: Treats actors as the universal primitives of concurrent computation; actors share no state and communicate only by sending messages.
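Locking can be sketched with threads inside a single process, which stand in here for multiple clients accessing one resource. The `Counter` class and `worker` function are illustrative names; `threading.Lock` is the standard-library primitive.

```python
import threading


class Counter:
    """A shared counter whose updates are protected by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Read-modify-write is not atomic, so only one thread may do it at a time.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def worker(counter, times):
    for _ in range(times):
        counter.increment()


counter = Counter()
threads = [threading.Thread(target=worker, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 40000
```

Without the lock, concurrent increments could overwrite each other and the final count could be lower than expected.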

Consistency

Distributed systems can have inconsistent states.

Update consistency: Several processes access and update the same data.
Replication consistency: Data is replicated across multiple nodes.
Cache consistency: Data is cached in multiple locations.
Clock consistency: Different nodes have slightly different clocks/times.

Solutions:

Transactions: A sequence of operations that is executed as a single unit (either all of them take effect or none do).
Consensus algorithms: Algorithms that allow a group of nodes to agree on a single value.
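The all-or-nothing behaviour of a transaction can be shown with a local SQLite database from Python's standard library. This is a single-node sketch: the `accounts` table, the names, and the `transfer` helper are assumptions for illustration, and a transaction spanning several nodes would need additional coordination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount) -> bool:
    """Move money between accounts; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is rolled back

print(ok, failed, balances(conn))
```

After the failed transfer, the credit to the destination account has been rolled back as well, so the balances stay at the values left by the first transfer.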

Distributed Algorithms

Example of a distributed algorithm that calculates the greatest common divisor (GCD) of the numbers held by the nodes in a network.

Steps:

Each node sends its number to its neighbors.
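When a node receives a number x, it compares x to its own number n. If x is smaller, the node replaces its number with n = ((n - 1) % x) + 1 and sends the new value to its neighbors.
Repeat until no node changes its number. All nodes then hold the same value, the GCD.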
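The steps above can be simulated in a single process. This is a sketch under stated assumptions: the network is connected, links are bidirectional, all numbers are positive integers, and a simple queue stands in for message delivery. The function name `distributed_gcd`, the node names, and the example values are made up for illustration.

```python
from collections import deque
from functools import reduce
from math import gcd


def distributed_gcd(values, neighbors):
    """values: node -> positive int; neighbors: node -> list of adjacent nodes."""
    n = dict(values)
    # Step 1: every node sends its number to its neighbors.
    inbox = deque((dst, n[src]) for src in n for dst in neighbors[src])
    while inbox:
        node, x = inbox.popleft()
        # Step 2: only a smaller received number changes the node's own number.
        if x < n[node]:
            n[node] = (n[node] - 1) % x + 1
            # The node announces its new number to its neighbors.
            for dst in neighbors[node]:
                inbox.append((dst, n[node]))
    # Step 3: the loop ends when no messages are left, so no number changes anymore.
    return n


# Five nodes arranged in a ring (hypothetical example network).
values = {"a": 108, "b": 76, "c": 12, "d": 60, "e": 36}
ring = {
    "a": ["b", "e"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d", "a"],
}

result = distributed_gcd(values, ring)
expected = reduce(gcd, values.values())
print(result, expected)
```

The update keeps each number congruent to its old value modulo x while strictly decreasing it, which is why the process terminates and the common divisor is preserved.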
