/tmp/koushik

Kubernetes Is A Bloated Answer To C++ Problems

Oh So Many Languages

Before we discuss Kubernetes, we first need to understand the software landscape before it. High performance software was (and still is) written in C++. It is the best tool to get the most out of computer hardware.

But the fact is that managing a large C++ project with many developers is no easy task. Large tech companies will often have entire teams dedicated to building developer tooling and infrastructure to help with maintaining C++ code. This is not feasible for the scrappy startup, or for an organization building applications that don't need to be that performant. Other languages like Java and C# came about to help with this. These higher-level, garbage-collected languages enabled organizations to develop quickly at the cost of performance. Of course, it didn't end there. We now have many other languages that were built to make writing code easier, faster and more fun than developing in C++, all with the understanding that C++ would still be faster.

Some of the advantages these languages had over C++ were:

And so we have our current situation with oh so many programming languages. There's nothing inherently wrong with this, but there are some implications.

Lost In Translation

One side effect of so many purpose-built languages is that no two languages can reliably talk to each other. A Java project will have a hard time calling a Python function, and it's not easy for a Perl script to call a Go routine. Pick any two languages and there's a good chance that code in one cannot directly invoke code in the other without some kind of bridge.

This is not the end of the world. Operating systems let us run multiple processes on a single machine that can talk to each other over a network. There are patterns and protocols like gRPC, HTTP, tRPC, Thrift, AMQP etc. that allow us to write two programs in two different languages and let them talk to each other. Without a doubt, a function call is way faster than inter-process communication (IPC), but we can accept this cost if it lets us continue to write code in whatever language we please.
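To make the function call versus IPC difference concrete, here is a minimal sketch assuming a POSIX system. The names `add` and `add_via_ipc` are made up for illustration, and a plain pipe to a forked child stands in for a real protocol like gRPC. Both paths compute the same sum, but the second one has to create a process, copy the arguments across the process boundary, and copy the result back.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstdint>

// The "business logic". Calling this directly is an ordinary function call.
std::int64_t add(std::int64_t a, std::int64_t b) { return a + b; }

// The same computation, but performed by a child process (our stand-in for
// "a service written in another language"). Arguments and result travel
// through pipes. Returns true on success and stores the sum in `out`.
bool add_via_ipc(std::int64_t a, std::int64_t b, std::int64_t& out) {
    int request[2];
    int reply[2];
    if (pipe(request) != 0) return false;
    if (pipe(reply) != 0) {
        close(request[0]);
        close(request[1]);
        return false;
    }

    const ssize_t args_size = static_cast<ssize_t>(2 * sizeof(std::int64_t));
    const ssize_t sum_size = static_cast<ssize_t>(sizeof(std::int64_t));

    pid_t pid = fork();
    if (pid < 0) {
        close(request[0]);
        close(request[1]);
        close(reply[0]);
        close(reply[1]);
        return false;
    }

    if (pid == 0) {
        // Child: read two integers, reply with their sum, then exit.
        close(request[1]);
        close(reply[0]);
        std::int64_t args[2];
        if (read(request[0], args, sizeof args) != args_size) _exit(1);
        std::int64_t sum = add(args[0], args[1]);
        if (write(reply[1], &sum, sizeof sum) != sum_size) _exit(1);
        _exit(0);
    }

    // Parent: send the request, wait for the reply, reap the child.
    close(request[0]);
    close(reply[1]);
    std::int64_t args[2] = {a, b};
    bool ok = write(request[1], args, sizeof args) == args_size &&
              read(reply[0], &out, sizeof out) == sum_size;
    close(request[1]);
    close(reply[0]);
    waitpid(pid, nullptr, 0);
    return ok;
}
```

Calling `add(2, 3)` is a handful of instructions. Calling `add_via_ipc(2, 3, result)` involves several system calls and two data copies, and a real RPC framework would add serialization and a network stack on top of that.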

Microservices Galore

Okay, so far we've described how we can build an application using multiple languages, all communicating with each other using something like gRPC. We don't have to write a single line of C++; the garbage collectors are doing whatever they're doing, but computers are fast so everything is okay. But what happens when our application becomes too big for a single machine? Now we're talking about distributed systems, and things are about to get hairy.

Remember when we said we split our multilingual application up into multiple processes? Well, now to turn our application into a distributed system we need to convert our processes into microservices. A microservice is like a process except that it can exist on multiple machines. We can resource govern a microservice and schedule it how we please. While an end user can interact with our application from a single point of entry (e.g. a website), there could be hundreds of microservices powering different parts of the application behind the scenes.

A microservice is much more than just the code, however. We need to package it and its dependencies in a standardized way. If we want to install a Python microservice on a machine, we want to make sure the machine has the right version of Python installed, as well as all of the service's Python dependencies. Today the common way to do this is with Docker, or more generally with container images that follow the Open Container Initiative (OCI) specifications. Of course, there is overhead to running a container over running the code directly on the host.

This sounds great. Docker lets us package our microservices and Kubernetes lets us schedule and distribute them across many machines. We can continue to write our code in whatever language we want.

But At What Cost?

Remember where we started. We started with C and C++. These languages let us write code that makes the most of the computer hardware we pay for. A well-written C++ application can carefully manage all memory, I/O, CPU and external devices. It's no secret why operating systems, databases, video games and other high-performance code still use C++. This efficiency and transparency is especially important in the cloud, where a more performant application could be the difference between a profitable company and a dead one.

What I described earlier is a common architecture of a modern application using microservices packaged with Docker, deployed on Kubernetes. The application code now has no visibility into the hardware it is running on. Everything is abstracted away. The cost of this hyper-abstraction will show up on the cloud bill. "Elastic scale" is only as elastic as your wallet after all.

kubernetes.h

How did we get here? It all started when we needed so many other languages to make up for C++'s shortcomings. Kubernetes is a platform that celebrates applications written in multiple languages.

What if all code were written in C++? We would not need a bloated platform like Kubernetes. In fact, all of the Kubernetes logic could be packaged as a C++ library, distributed as a header file. We could get rid of the overhead of inter-process communication and use function calls instead. We could get rid of Docker images, since all our code could be compiled to a binary, perhaps even a statically linked one. Deployments could be as easy as scp. Our code could be written to use only the resources it needs. We could use fewer, smaller VMs, saving us potentially tens of thousands of dollars a month.
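As a thought experiment, here is what one small piece of such a header could look like. This is a hypothetical sketch, not how Kubernetes actually schedules anything: the `sketch` namespace, the `Node` and `Workload` types and the `schedule` function are all invented for illustration, and the placement policy is plain first-fit on CPU and memory. The point is only that "pick a machine for this workload" becomes an in-process function call on ordinary data structures.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// A machine with some spare capacity (hypothetical units: milli-CPUs and MB).
struct Node {
    std::string name;
    int free_cpu_millis;
    int free_mem_mb;
};

// Something we want to run, with the resources it asks for.
struct Workload {
    std::string name;
    int cpu_millis;
    int mem_mb;
};

// First-fit placement: walk the nodes in order and put the workload on the
// first one with enough spare CPU and memory, reserving those resources.
// Returns the index of the chosen node, or -1 if nothing fits.
inline int schedule(std::vector<Node>& nodes, const Workload& w) {
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        Node& n = nodes[i];
        if (n.free_cpu_millis >= w.cpu_millis && n.free_mem_mb >= w.mem_mb) {
            n.free_cpu_millis -= w.cpu_millis;
            n.free_mem_mb -= w.mem_mb;
            return static_cast<int>(i);
        }
    }
    return -1;
}

}  // namespace sketch
```

A caller would build a `std::vector<sketch::Node>` describing its machines and call `sketch::schedule(nodes, workload)` for each workload. A real scheduler also has to deal with failures, rescheduling and many more constraints, none of which this sketch attempts.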

This architecture is not new by any stretch. For example, this is how distributed databases are written. They solve all the problems of distributed applications while being written in a low-level language like C++. From my own experience working at Microsoft and Yugabyte, database products at both of these companies are built this way. Microsoft uses a system called Service Fabric. It solves a lot of the same problems Kubernetes does, but it does not hide the host machine hardware from you. The caveat is that you can only write your application in C++ or C#. But with this enforcement come all of the advantages I just explained.

The fact still remains that C++ has its problems. And so other languages will continue to exist and thrive. Perhaps one day we will get a language that does everything C++ does, but also solves its problems. And no, I'm not saying Rust will do this. In fact, I believe Zig is the language with the most potential. Perhaps the future of distributed computing will come in the form of kubernetes.zig.