Architecture

This document describes the context in which a P2P CDN solution might be deployed and some of the challenges involved. It then establishes goals for algorithmic and software solutions so that they can be evaluated against standalone deployments and deployments involving CDNs.

Small/Self-Hosted Use-case

For the purposes of developing the idea, a small/self-hosted use-case is considered first.

[Diagram: Small Hosting Solution Without P2P CDN. Clients 1-3 request content over HTTP (TLS) from a reverse proxy, which performs TLS termination and request forwarding. The reverse proxy forwards all requests, including those for static content, to the content server.]

Telephone Problem

There is no value to one telephone.

As with a single telephone, a p2pcdn instance running in isolation may, depending on the hardware arrangement, bring no tangible benefit and add some overhead:

[Diagram: Telephone Problem: a Single P2P CDN Instance has no Peers. Clients 1-3 request content over HTTP (TLS) from the reverse proxy, which performs TLS termination and request forwarding. The reverse proxy sends requests for dynamic content to the content server and requests for static content to the P2P CDN instance.]

With a Second Instance

With a second instance, there is potential for benefit. In the diagram below, small hosting solution 1 and small hosting solution 2 agree to deliver static content for both themselves and each other:

[Diagram: Two Small Hosting Solutions Sharing P2P CDN Instances to Pool Static Content Delivery. Small hosting solutions 1 and 2 each contain a reverse proxy (TLS termination and request forwarding) and a P2P CDN instance. Clients 1-3 request content over HTTP (TLS). Each P2P CDN instance receives requests for static content belonging to either solution, and may redirect a static request to the other instance.]

Scaling Problem Definitions

In order for either of the above two small hosting solutions to realize a benefit, either they must each have a complete replica of the other's static content, or they must share content such that they each have sufficient overlapping content to serve a worthwhile quantity of requests when load is high.
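One way to put a number on "a worthwhile quantity of requests" is to measure what share of observed traffic a peer could answer from the content it has replicated. The sketch below is illustrative only: the function name, the per-path request counts, and the set of replicated paths are all assumptions, not part of any existing p2pcdn interface.

```python
from typing import Dict, Set


def servable_fraction(request_counts: Dict[str, int], replicated_paths: Set[str]) -> float:
    """Return the share of requests a peer could serve from its replicas.

    request_counts maps a content path to how often it was requested
    (hypothetical input, e.g. taken from access logs).
    replicated_paths is the set of paths the peer holds a copy of.
    """
    total = sum(request_counts.values())
    if total == 0:
        return 0.0  # no traffic observed, so nothing can be said to be covered
    covered = sum(n for path, n in request_counts.items() if path in replicated_paths)
    return covered / total
```

For example, if a peer replicates only the one stylesheet that accounts for 90 of 100 observed requests, the function returns 0.9, which suggests that a small overlap of popular content may be enough without a complete replica.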

There must also be a load balancing element - how does each p2pcdn instance know when to serve content itself, and when to redirect a client to another peer?

There are also cross-origin considerations when redirecting clients to other domains.

Finally, as the number of peers increases, the task of vetting each peer becomes infeasible. This means that simply authenticating peers is insufficient - there is no point in being able to identify who is responsible for each of thousands of badly behaving peers when no practical action can be taken against them. Trust between peers must be an ongoing concern, and must not require any peer to behave consistently or honestly.

Replication Management

Each peer would ideally like a copy of all of its content to be held elsewhere in the network. Under extreme load, this is the state a P2P CDN peer (PCP) should aim for.

Restated, a PCP under extreme load should become a pure load balancer, round-robining requests to other PCPs that have the requested content.
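A minimal sketch of that behaviour follows, assuming the PCP already knows which peers hold each path and has some measure of its own load. The class name, the `holders` mapping, the 0.8 threshold, and the load value in the range 0 to 1 are all hypothetical choices made for illustration.

```python
from itertools import cycle
from typing import Dict, Iterator, List, Optional


class RedirectingPeer:
    """Hypothetical PCP that serves locally until a load threshold is crossed."""

    def __init__(self, holders: Dict[str, List[str]], load_threshold: float = 0.8):
        # holders maps a content path to the base URLs of peers known to hold it.
        self._load_threshold = load_threshold
        # One independent round-robin rotation per path; paths with no holders are skipped.
        self._rotations: Dict[str, Iterator[str]] = {
            path: cycle(peers) for path, peers in holders.items() if peers
        }

    def route(self, path: str, current_load: float) -> Optional[str]:
        """Return a redirect URL, or None to serve the content locally."""
        rotation = self._rotations.get(path)
        if current_load < self._load_threshold or rotation is None:
            return None  # lightly loaded, or no other peer has this content
        return next(rotation) + path
```

Below the threshold the peer answers every request itself. Above it, each request for a replicated path is redirected to the next holder in turn, so the peer's own work shrinks to issuing redirects.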

Sharing PCPs must prioritize what they want to share - a standard cache eviction policy such as LRU may be sufficient to manage this.
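As an illustration of the LRU option, the sketch below keeps content bodies in access order and evicts the least recently used entries once a byte budget is exceeded. The class name and the byte-based capacity are assumptions; the source only says that a standard policy such as LRU may be sufficient.

```python
from collections import OrderedDict
from typing import Optional


class LRUContentCache:
    """Byte-budgeted LRU cache (illustrative; not an existing p2pcdn class)."""

    def __init__(self, capacity_bytes: int):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        body = self._items.get(key)
        if body is not None:
            self._items.move_to_end(key)  # mark as most recently used
        return body

    def put(self, key: str, body: bytes) -> None:
        if len(body) > self.capacity_bytes:
            return  # can never fit, so do not evict anything for it
        old = self._items.pop(key, None)
        if old is not None:
            self.used_bytes -= len(old)
        self._items[key] = body
        self.used_bytes += len(body)
        # Evict from the least recently used end until within budget.
        while self.used_bytes > self.capacity_bytes:
            _, evicted = self._items.popitem(last=False)
            self.used_bytes -= len(evicted)
```

The entries that survive in such a cache are the ones requested most recently, which is a plausible first approximation of the content a sharing PCP would most want replicated.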

Hosting PCPs must decide which content to accept for replication. Clearly, size is a concern, but during traffic spikes, bandwidth is likely to be a more important resource.
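The trade-off between size and bandwidth could be expressed as a simple admission check. Everything in the sketch below is an assumption made for illustration: the two record types, the idea that a sharing PCP supplies an expected request rate, and the estimate of bandwidth as size multiplied by request rate.

```python
from dataclasses import dataclass


@dataclass
class ReplicationOffer:
    size_bytes: int                  # size of the content being offered
    expected_requests_per_s: float   # request rate claimed by the sharing PCP


@dataclass
class HostBudget:
    free_storage_bytes: int
    spare_bandwidth_bytes_per_s: float


def accept_offer(offer: ReplicationOffer, budget: HostBudget) -> bool:
    """Accept only if the content fits on disk and within spare bandwidth."""
    if offer.size_bytes > budget.free_storage_bytes:
        return False
    # Rough estimate: every request transfers the whole object.
    expected_bandwidth = offer.size_bytes * offer.expected_requests_per_s
    return expected_bandwidth <= budget.spare_bandwidth_bytes_per_s
```

Under this model a small but very popular file can be rejected while a larger, rarely requested one is accepted, which reflects the point that bandwidth rather than storage is likely to be the limiting resource during a spike.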

CORS and the Fetch Specification

The Fetch specification defines how browsers and other clients make HTTP requests, and makes provision for cross-origin resource sharing (CORS). Atomic Redirect Handling should isolate clients from problems arising from this.

Trust Maintenance

We use the TLS certificate chain of trust to establish initial trust between peers. The amount of trust this establishes can be characterized as "I'll recognize you if I see you again, and I know you claim to be a P2P CDN peer". Beyond this very basic level of trust, there is nothing more we can gain from simple HTTPS.

Trust beyond that basic level breaks down into two broad categories - peer reliability and peer honesty. Honest peers can nonetheless be unwise choices for caching because they have frequent downtime or limited bandwidth. Dishonest peers may serve corrupted content, fail to serve content they claim to have, or misrepresent their capabilities.

In both cases, sampling to verify peer content and a gossip protocol to share peer reputations will help to identify and isolate bad actors.
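A rough sketch of both ideas follows. It assumes the sampling peer knows the SHA-256 digest of each path it replicated, and that reputations are scores between 0 and 1. The function names, the `fetch` callback, and the weighting scheme are hypothetical. Because other peers cannot be assumed honest, the merge weights first-hand observations more heavily than gossiped ones.

```python
import hashlib
import random
from typing import Callable, Dict, List, Optional


def sample_peer(
    expected_digests: Dict[str, str],
    fetch: Callable[[str], Optional[bytes]],
    sample_size: int,
    rng: Optional[random.Random] = None,
) -> float:
    """Fetch a random sample of paths and return the fraction served intact.

    expected_digests maps path -> hex SHA-256 of the correct content.
    fetch(path) returns the peer's response body, or None if it failed to serve.
    """
    rng = rng or random.Random()
    candidates = sorted(expected_digests)
    paths = rng.sample(candidates, min(sample_size, len(candidates)))
    if not paths:
        return 0.0  # nothing could be verified
    intact = 0
    for path in paths:
        body = fetch(path)
        if body is not None and hashlib.sha256(body).hexdigest() == expected_digests[path]:
            intact += 1
    return intact / len(paths)


def merge_reputation(own_score: float, gossiped_scores: List[float], own_weight: float = 0.7) -> float:
    """Blend a first-hand score with the mean of scores reported by other peers."""
    if not gossiped_scores:
        return own_score
    gossip_mean = sum(gossiped_scores) / len(gossiped_scores)
    return own_weight * own_score + (1 - own_weight) * gossip_mean
```

A missing response counts against the peer in the same way as corrupted content, so the single score covers both failing to serve claimed content and serving it incorrectly. Separating reliability from honesty would need two scores rather than one.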