P2P CDN
This is an idea for achieving some of the common Content Delivery Network (CDN) features using a Peer-to-Peer (P2P) architecture, intended for FOSS implementation and use.
The documentation can be edited at the Codeberg documentation repo.
This is written as an exercise to flesh out the details, and to discover whether there is an important reason that such a tool has not yet been built.
CDN Features
For existing centralized services, features comprise roughly the following:
- Caching: Store copies of content closer to end-users to reduce latency and improve load times.
- Load Balancing: Distribute traffic across multiple servers to ensure no single server becomes overwhelmed.
- Scalability: Automatically adjust resources based on demand to handle traffic spikes.
- Redundancy: Maintain multiple copies of content to ensure availability in case of server failures.
- Geographical Distribution: Serve content from locations closer to users to minimize latency.
- Security: Implement measures such as DDoS protection and secure content delivery.
- Analytics and Reporting: Provide insights into traffic patterns, user behavior, and performance metrics.
P2P CDN Scope
To have some hope of achieving a useful tool from this early project, the smallest feasible subset of these features is selected:
- Scalability: Use the inherent scalability of P2P networks to handle varying traffic loads.
Note
I regard "Analytics and Reporting" as a euphemism for pathological behaviour of CDN operators, resulting from their centralized control of user data. Therefore, I do not intend to try to implement this feature.
Assumptions
- This document assumes that a software module already exists that can be deployed to provide the P2P CDN features. The software is referred to as "p2pcdn", and a deployed copy of it as an "instance".
- A content owner who uploads content to a CDN intends to make it irrevocably public; the same is assumed for a p2pcdn implementation.
- Content owners will deploy or at least control p2pcdn instances to serve their content.
- Content owners decide to publish their content explicitly, rather than having it cached implicitly.
- Any improvement on the scalability and performance of a website hosted on a single server is enough to justify the project.
- Traffic spikes hit individual domains hosting P2P CDN instances, rather than the entire network.
- A solution should be able to scale significantly beyond the capabilities of the largest individual network nodes (peers). It should therefore be assumed that each node, whatever its bandwidth and storage resources, can and probably will have an incomplete view of the content in the network.
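The last assumption can be illustrated with a minimal sketch. Everything named here is hypothetical and not part of any existing p2pcdn code: the `Instance` class, the `publish`, `lookup` and `fetch` names, the use of SHA-256 digests as content keys, and the fixed storage capacity are all assumptions made for illustration only. The point is simply that a lookup on any single node may come back empty, so callers must be prepared to ask elsewhere.

```python
import hashlib
from typing import Dict, List, Optional


class Instance:
    """Hypothetical p2pcdn instance holding a bounded subset of content."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.store: Dict[str, bytes] = {}

    def publish(self, content: bytes) -> Optional[str]:
        """Explicitly publish content; return its key, or None if it does not fit."""
        key = hashlib.sha256(content).hexdigest()  # assumed keying scheme
        if key in self.store:
            return key
        if self.used_bytes + len(content) > self.capacity_bytes:
            return None  # this node cannot hold everything
        self.store[key] = content
        self.used_bytes += len(content)
        return key

    def lookup(self, key: str) -> Optional[bytes]:
        """Return the content, or None when this node does not hold it."""
        return self.store.get(key)


def fetch(key: str, peers: List[Instance]) -> Optional[bytes]:
    """Ask each known peer in turn; any of them may lack the content."""
    for peer in peers:
        content = peer.lookup(key)
        if content is not None:
            return content
    return None
```

In this sketch no node is required to know what the others hold; how peers would actually discover one another and locate content is left open.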
Constraints
To prevent features such as spyware from encroaching on the project, and to ensure responsibility for content resides with content owners, a few provisions are made here:
- Domain names and TLS certificates are used to initiate trust between peers. The word "initiate" is used rather than "establish" because trust is a continuing concern in P2P networks.
- Supported use-cases are limited to serving static content in response to HTTP requests from arbitrary clients.
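As a rough illustration of the first constraint, the sketch below shows how one peer might open a connection to another using only the peer's domain name and its TLS certificate. It relies on Python's standard `ssl` module, whose default client context requires a valid certificate and checks that it matches the requested hostname. The function names `make_peer_context` and `open_peer_connection`, the port and the timeout are assumptions for illustration, not part of p2pcdn. This covers only the initial step; it says nothing about how trust would be maintained afterwards.

```python
import socket
import ssl


def make_peer_context() -> ssl.SSLContext:
    """Build a client-side TLS context for contacting another peer.

    The default context verifies the certificate chain against the
    system trust store and checks the certificate against the hostname.
    """
    return ssl.create_default_context()


def open_peer_connection(domain: str, port: int = 443,
                         timeout: float = 5.0) -> ssl.SSLSocket:
    """Connect to a peer identified by its domain name (hypothetical helper).

    Raises ssl.SSLCertVerificationError if the certificate is invalid or
    does not match `domain`, so an unverified peer is never talked to.
    """
    context = make_peer_context()
    raw = socket.create_connection((domain, port), timeout=timeout)
    try:
        return context.wrap_socket(raw, server_hostname=domain)
    except Exception:
        raw.close()
        raise
```

A caller would use `open_peer_connection("peer.example")` with a real peer domain in place of the placeholder, and close the returned socket when done.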