The Three Generations of CDN
Let me recap the three generations of CDN.
We are familiar with this evolution in the IT and computing services sector.
This evolution has underpinned the same trend in the broadcast services sector (albeit lagging by the time it takes to develop the applications on the hardware), as illustrated in Figure 2.4.
As we can see from the diagrams, both sectors are closely aligned. Indeed, innovation in the broadcast IT services sector today closely tracks the wider IT and computing trend, with both currently highly focused on virtualization and distribution.
Figure 2.4 Evolution of compute IT with broadcast IT compared. https://commons.wikimedia.org/wiki/File:Human-evolution-man.png
Broadcast distribution networks have historically been built on a number of proprietary network protocols, ones that were tied to very specific network technologies. Often driving innovation in the telecoms world, “broadcast telecom” was a key frontier in the broadcasters' reach. A strategic decision, opting for a particular technology choice, could fix the position of an entire nation's media for many years. With little interoperability to create competitive markets, this left much of the evolution of the media and broadcast telecoms industry at the mercy of the pace of innovation of their suppliers. They in turn became fat and lazy, and this held the sector back for many years.
IP's key market objective has been to ensure that those provisioning network technologies are no longer tied to a particular vendor. Through encouraging this competitive market we have seen IP disrupt, like no technology before it, and the pace of innovation has rocketed.
As discussed above, in the late 1990s the music industry fell victim to its own “luddite” reaction to the emergence of IP and labored in the misguided belief that the Internet would not affect its business, until it was far too late. Thanks to the time it took to deliver video-capable high bandwidth to general Internet users, broadcast video has had a little more time on its side to assess its reaction, to learn from the audio market's struggles, and to find a strong position.
Tradition in the broadcast sector broadened the reasoning behind the resistance to change, and widened the “logic” given for delay in adoption to include issues such as the immaturity of emerging video compression over IP, and the lack of service level agreements that can be formed around IP networks that function, in real-world practice, on “best effort.”
These three issues of speed of line, quality of image, and commercial guarantee both set a bar for IP to attain before it would be included as an option and create a forum for those resistant to the incoming and inexorable change to the way broadcast telecoms is provisioned.
The CDNs themselves can do little to directly drive the speed of line that is available to end users, and ultimately are somewhat at the mercy of the ISPs and the compression vendors in addressing both the speed of line and the quality of image issues.
Pureplay CDNs fall into two main architecture schisms:
- • Overlay models provision thousands of servers either in or adjacent (topologically) to as many consumer ISPs as possible. The ISP networks that connect all of these devices are beyond the CDNs' management reach, and the core network is a patchwork of multiple public and privately managed routes.
- • Managed network models provision a high-quality private managed network between their origins and edge locations, and operate fewer edge locations for this reason. However, their ability to finely optimize the core network to underpin the distribution service level agreements is much greater than an overlay.
As the fiber glut - post 2002 - kicked in, many network services companies found the economics of supply and demand changing, and this meant that there was considerable consolidation in the CDN space between 2002 and 2004. The key remaining players became increasingly myopic about competing with each other.
The sector formed a narrative focusing on where CDN architecture could deliver higher service level guarantees, and in this climate, and in the era before virtualization had taken off, they underpinned this with a “best of breed” strategy - often involving simply buying what was conventionally agreed to be the most expensive, “gold-plated” technology regardless of whether it was “overprovisioned” or not. With that commitment made to infrastructure on the basis of offering a belt-and-braces solution, it then became key to recoup that extra expense. This was very much in line with the traditional provisioning process, and enabled the CDNs to “speak the same language” as the traditional broadcasters. Despite the fact that often large sections of the IP video delivery network were not under direct management of the CDN, the focus on the measurement of key performance indicators was carefully trained on where they could be measured (namely on the links terminated with the expensive overprovisioned kit). This marketing gave the CDNs a battleground for the commercial teams.
In reality significant chunks of the CDN networks are massively overprovisioned, and the CDNs balance the risk of overprovisioning and cost of operations with the revenue they can charge, and the risk-mitigation capital they need to reserve should they have to pay out on a failure of SLA.
Any “real” SLA in the OTT world is a myth in technical terms: the layer-3 Internet is a “best-effort” network. Period! The best way to ensure delivery and increase availability for an IP connection is to ensure that multiple paths are available. To traditional broadcasters this sounds expensive. Why have several paths of “best-effort” quality when you can pay a premium for a single “very reliable” path?
The fact is that a single “very reliable” path is a single path - an unintended fault on that path could kill a live sports TV event irrecoverably. If you double up that path, you may double your cost but still have no other options if both fail.
In the IP world we approach things differently: if something fails, we have myriad options. Our contribution encoders - which “speak IP” - can connect to any IP network. The IP network itself is highly redundant - almost any part of it can fail, and the failure can be routed around automatically. The commoditization of IP means that backup routes - right down to multiple layer-2 ISP connections at the live event - can be replicated cheaply. Often commercial arrangements can be established under which links are paid for only when they are used. This makes it possible to massively overprovision those occasional links from outside broadcast contributions. This is no better exemplified than in the evolution of cellular channel-bonded 3G/4G video encoders and multiplexers (CellMuxes), which have dramatically disrupted the traditional satellite news gathering space. Instead of offering high-capacity, guaranteed, fixed-quality contribution links, such as satellite has provided for many decades, the CellMuxes use “whatever IP network they can” and adapt to deliver the “best effort” - and it turns out that best effort is generally good enough, given that CellMuxes can be bought, commissioned, and deployed for a fraction of the complexity and cost of satellite.
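To make the “whatever IP network they can” idea a little more concrete, here is a minimal Python sketch of how a bonded encoder might share one stream across several links. It is not how any particular CellMux product works: the function name, the link names, the 0.8 headroom factor, and the proportional-split rule are all assumptions chosen purely for illustration.

```python
def plan_bonded_stream(link_kbps, target_kbps, headroom=0.8):
    """Split a target bitrate across links in proportion to usable capacity.

    link_kbps: dict of link name -> currently measured throughput (kbps).
    headroom:  fraction of each link we are willing to fill (an assumption).
    Returns (achieved_kbps, allocation dict of link name -> kbps).
    """
    # Ignore links that currently measure no throughput at all.
    usable = {name: kbps * headroom for name, kbps in link_kbps.items() if kbps > 0}
    total = sum(usable.values())
    if total == 0:
        return 0.0, {}
    # "Best effort": send the target if we can, otherwise whatever fits.
    achieved = min(target_kbps, total)
    allocation = {name: achieved * cap / total for name, cap in usable.items()}
    return achieved, allocation


# Hypothetical measurements from three cellular modems, one of them down.
links = {"cell-1": 3000, "cell-2": 2000, "cell-3": 0}
achieved, allocation = plan_bonded_stream(links, target_kbps=6000)
```

In this made-up case the encoder cannot reach its 6000 kbps target, so it adapts down to what the two working links can carry between them rather than failing outright.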
So IP can actually bring high-availability benefits even with “best-effort” operations, but only if it is architected as “IP” and not as a like-for-like “replacement to circuit switched” (etc.).
In the application space, the interoperability of IP-based applications was also bringing new capabilities through virtualization. If architected properly, the virtual application's capability can be offered with much higher availability than if it is tightly tied to a single physical implementation, purely by virtue of the fact that the virtual application can be repeatedly launched on new infrastructure and in new locations. This allows high-availability architects to design extremely fault-tolerant systems and to replicate them “infinitely” - achieving SLA in a way that can never be delivered on a traditional “fixed” infrastructure.
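The arithmetic behind the multi-path argument can be sketched in a few lines of Python. The availability figures below are hypothetical, and the calculation assumes that the paths fail independently of each other, which real paths sharing ducts, exchanges, or power will not fully satisfy. Treat it as a rough illustration rather than an SLA model.

```python
def combined_availability(path_availabilities):
    """Probability that at least one of several independent paths is up."""
    all_down = 1.0
    for a in path_availabilities:
        if not 0.0 <= a <= 1.0:
            raise ValueError("availability must be between 0 and 1")
        # The service is only lost if every path is down at the same time.
        all_down *= (1.0 - a)
    return 1.0 - all_down


# Hypothetical figures, chosen only for illustration.
premium_single = combined_availability([0.999])                # one "very reliable" path
best_effort_three = combined_availability([0.99, 0.99, 0.99])  # three "best-effort" paths

print(f"single premium path: {premium_single:.6f}")
print(f"three best-effort:   {best_effort_three:.6f}")
```

Under these assumptions, three modest paths together come out ahead of the single premium path, which is the point being made above. How far this holds in practice depends on how independent the paths really are.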
As operators moved to virtualization, the video delivery networks faced two key challenges:
- • They didn't “trust” that the infrastructure would provide the performance they were used to paying a huge amount for in their current expensive (if overprovisioned) kit. They have struggled to realize that by moving to virtualization the infrastructure (as far as the application is concerned) becomes ephemeral. As capability improves, the application is migrated to the new technology. This is culturally alien to a community used to trying to nail down and account for every route and junction.
- • Their instinct was to transpose the existing architectures into a virtualized replica of the same architecture. While this is often a viable option to get started, in practice the “secret sauce” is to leverage the new availability that virtualization brings, to move capability to infrastructure, just in time and on-demand. As architects do learn this, it is leading to entirely new architecture possibilities.
And so we have now defined two of the three generations:
- • The first generation consists of those that need to tie applications to “tin” (the engineer's vernacular for physical infrastructure). It evolved from (and includes) both traditional broadcast telecoms and IP-based broadcast telecoms with a “traditional fixed infrastructure” architecture.
- • The second generation consists of those that have understood that the function of a particular “tin” unit can be replicated on a different “tin” unit. This is “virtualization” as those in the first generation group still largely think it to be. Essentially a “clone” of an entire computer is launched on typically pretty identical “tin” to the one that it was originally created on.
So let us now look at that third generation.
As mentioned above, the common “virtualization” model in practice today is to replicate the first generation workflow but abstract the infrastructure. So, given an available compute and network resource, the virtual machines that construct the workflow's end-to-end application can be run without being (very) deterministic about which computer or where on the network they are being executed. Often “which computer” and “where on the network” are overseen by some form of “orchestration” system, but in broad terms one computer image can be launched “anywhere” and it will function. This means that a failure of the computer resource or network link needn't mean a long-term period of downtime.
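The idea that an image can be launched “anywhere” and relaunched after a failure can be shown with a toy Python sketch. This is not the API of any real orchestration system: the class, its methods, the host names, and the “least-loaded host” placement rule are all hypothetical stand-ins.

```python
class ToyOrchestrator:
    """Illustrative only: places named images on hosts and relaunches them on failure."""

    def __init__(self, hosts):
        self.healthy = set(hosts)
        self.placement = {}  # image name -> host it is currently running on

    def launch(self, image):
        if not self.healthy:
            raise RuntimeError("no healthy hosts available")
        # Count how many images each healthy host already runs.
        load = {h: 0 for h in self.healthy}
        for host in self.placement.values():
            if host in load:
                load[host] += 1
        # Pick the least-loaded host; sorting first makes ties deterministic.
        host = min(sorted(load), key=lambda h: load[h])
        self.placement[image] = host
        return host

    def host_failed(self, host):
        """Mark a host as failed and relaunch its images elsewhere."""
        self.healthy.discard(host)
        displaced = sorted(img for img, h in self.placement.items() if h == host)
        for img in displaced:
            del self.placement[img]
        return {img: self.launch(img) for img in displaced}


orch = ToyOrchestrator(["host-a", "host-b", "host-c"])
for image in ["encoder", "packager", "origin"]:
    orch.launch(image)

moved = orch.host_failed("host-a")  # the encoder is relaunched on a surviving host
```

The workflow never names a specific machine. It asks for an image to be running, and the orchestration layer decides where.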
Since 2008 a new technology has appeared in the virtualization space - containers.
A container (in this context) is an extension to a base operating system that is running on the host computer, which “contains” all the specific additions to the base OS needed to deliver the computational function required.
What does this mean?
Well, one of the key benefits of Gen2 virtualization is that you can host several clients' applications on a single machine if the host is powerful enough. However, the requirement for the host to run multiple Gen2 virtual machines is that it must run a host OS - albeit typically fairly minimal - and then each virtual machine must in turn start as if it were running on its own on the underlying resource, with the machine's own host OS attempting to “abstract” the new virtual machine's OS from the underlying hardware. Once a second client's virtual machine is added, we now have a requirement for the underlying compute resource to host three operating systems, to ensure that these can share the hardware efficiently, and to ensure that the different machines cannot interfere with each other's processes and cause operational or security problems.
With a container model, since the OS is common to all the containers, it becomes possible to install all the client-specific requirements directly onto the host OS. This does mean that, unlike with Gen2 VMs, all applications must be built on the SAME OS architecture. As long as this constraint is not limiting to the overall operations, there are many significant advantages.
First and foremost, a single OS on the machine makes the resource utilization much more efficient.
Second, the containers are completely discrete from each other: there is complete isolation of one client's application from another. Arguably one of the most useful features, this means that when a container is terminated, it leaves the underlying host OS machine completely “clean” from that client's application.
The third really significant benefit is that because there is no “layering” of VM OS on top of host OS, the container is not “abstracted” from the hardware. With direct access, the container can obtain what are termed “bare metal” speeds from the hardware. This again increases the resource utilization, both in terms of the compute resources and - particularly relevant to content delivery architecture - in ensuring that maximum throughput can be obtained from network links and internal busses.
Because there is only one OS, and given that often the vast majority of a Gen2 VM image is the OS itself, this also means that in the container paradigm it is possible to launch many more containers on a given physical machine than it is possible to launch Gen2 VMs. In fact, with good architecture, while each container may appear from outside to be an independent computer, it is possible to launch almost as many containers on a single machine as you could launch applications. This ensures that the resource utilization can be made available to more “customers,” be they internal customers within a single company or third-party customers who are using a publicly available infrastructure.
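A back-of-envelope calculation shows why the density differs so much. The Python sketch below considers memory only and ignores CPU, storage, and I/O. All of the sizes are hypothetical round numbers picked for illustration, not measurements of any real hypervisor or container runtime.

```python
def instances_per_host(host_ram_gb, host_os_gb, app_gb, guest_os_gb=0.0):
    """How many instances fit in RAM once the host OS has taken its share."""
    usable = host_ram_gb - host_os_gb
    per_instance = app_gb + guest_os_gb
    if usable <= 0 or per_instance <= 0:
        return 0
    return int(usable // per_instance)


# Hypothetical sizes, in GB.
HOST_RAM_GB = 64
HOST_OS_GB = 2      # the single host OS, present in both models
APP_GB = 0.5        # the client's application itself
GUEST_OS_GB = 1.5   # the extra OS that every Gen2 VM has to carry

vms = instances_per_host(HOST_RAM_GB, HOST_OS_GB, APP_GB, GUEST_OS_GB)
containers = instances_per_host(HOST_RAM_GB, HOST_OS_GB, APP_GB)

print(f"Gen2 VMs per host:   {vms}")
print(f"containers per host: {containers}")
```

With these made-up figures the same machine holds four times as many containers as VMs, simply because each VM repeats the cost of an operating system that the containers share.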
The container boot times take only a few seconds, whereas Gen2 VMs have to effectively boot a whole OS before they launch the application. This makes container architectures extremely dynamic.
Given that a particular user may decide to build workflows by combining multiple containers, this ensures that software development can become tightly modular if desired, and isolated containers can form networks of resources that can be quickly launched in myriad configurations.
This dynamic capability is leading to a whole new application and workflow architecture.
Here it is worth noting that much the same capability can be obtained in other ways. Purely functional languages - such as the Erlang that my own company predominantly codes in - are natively discrete. Indeed, the high-level architecture of Erlang (which is 25 years mature) has, if anything, given us clear insight into how other programmers may now access these models of high availability and dynamic orchestration, and this will open up a wealth of new computing paradigms over the next few years.