/tmp/koushik

Compiling CPython with Zig

OK, so recently I have been learning about this cool programming language called Zig. If you haven't heard of it, I recommend you check out this video by Andrew Kelley, the creator of Zig. In a nutshell, Zig is a systems programming language akin to C and Rust. It takes your code and compiles it into machine code for your target OS and CPU architecture. There is no runtime and no garbage collector. It is like C in that it has a simple syntax. It is like Rust in that it has a great standard library and is easy to read and write. Zig has many cool features, and I won't go into a full comparison of Zig against other popular languages like Rust, Odin, Nim, Vimscript, Nix, Brainf**k, Yaml, Emacs, VanillaJS and punch cards.

For the longest time, I have stayed away from low-level systems programming. To me it seemed intimidating, hard to read and hard to debug, and most of all I had no real-world experience with it. I took some classes in college where we were taught how to study for midterm and final exams about pointers and memory. Our C programming project was to write a 2D game. While that was fun, there was no expectation that we had to write "good" code.

For example, in my game I ran into a problem implementing the "New Game" button. This button should reset all the variables of the game state and restart the game. Instead of actually solving this problem, I decided to define all of my variables in the main() function, and when someone clicked "New Game", I'd just recursively call main(). This meant that the whole game was shoved into the main() function, because why not? Every click added another main() frame to the call stack, so if you clicked "New Game" too many times the stack would overflow and my game would crash. But that's okay, because the autograder only clicked it once. This would never pass a code review at any company, but it totally worked in college and I got an A on the project.

I never learned how to write "good" C code, and I just accepted that. Since then, I'd have to say my favorite language to program in is Python. As I wrote more Python, I got more familiar with its internals. The Python interpreter most people run is CPython, the implementation of the Python language spec written in C; basically, it's the one that everyone uses.

One of the cool features of Zig is that Zig is also a C/C++ compiler! The Zig compiler bundles Clang, which lowers C and C++ code to LLVM IR, so it has the ability to parse and compile C code alongside Zig code into a single binary. So then I wondered, could I use Zig to compile CPython? Turns out, yes! I put up my findings at my zigpython repo. It's not perfect: for some reason the crypt module is not compiling, and I don't know where libcrypt.so is on my machine (Ubuntu 20.04 WSL). In any case, I consider this to be my first time willingly dipping my toes into systems programming. The CPython codebase is very well written (much better than the game I made in college), and I encourage anyone who doesn't have experience with C to take a look at it. The PSF folks have written a great dev guide for those who want to build and debug CPython from source.
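
To give a rough idea of what "use Zig to compile CPython" can look like, here is a minimal sketch in Python. It is not necessarily what the zigpython repo does; it assumes the usual autotools flow (./configure, then make) from the dev guide, and it assumes that pointing the CC and CXX environment variables at "zig cc" and "zig c++" is enough. The function names zig_build_env and build_cpython are made up for this example.

```python
import os
import shutil
import subprocess


def zig_build_env(base_env=None):
    """Return a copy of the environment with Zig set as the C/C++ compiler.

    Assumption: CPython's configure script honors CC and CXX, as
    autotools-based builds generally do.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CC"] = "zig cc"    # Zig's drop-in C compiler front end
    env["CXX"] = "zig c++"  # Zig's drop-in C++ compiler front end
    return env


def build_cpython(src_dir):
    """Configure and build a CPython checkout in src_dir using Zig.

    Hypothetical helper for illustration; POSIX only, since it runs
    ./configure relative to src_dir.
    """
    if shutil.which("zig") is None:
        raise RuntimeError("zig was not found on PATH")
    env = zig_build_env()
    subprocess.run(["./configure"], cwd=src_dir, env=env, check=True)
    jobs = str(os.cpu_count() or 1)
    subprocess.run(["make", "-j", jobs], cwd=src_dir, env=env, check=True)
```

Calling build_cpython("cpython") from the directory that contains the checkout would kick off the build. Extension modules that need system libraries (like crypt in my case) may still fail to compile.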

When we run the Zig-compiled CPython, we get:

➜  zigpython git:(main) ✗ ./cpython/python 
Python 3.12.0a3+ (heads/main:532aa4e4e0, Dec 18 2022, 21:02:08) [Clang 15.0.3 (git@github.com:ziglang/zig-bootstrap.git 85033a9aa569b41658404d0e on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
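
The compiler shown in square brackets in that banner can also be read from inside the interpreter. Here is a small sketch using the standard platform module; describe_build is just a name I picked for illustration. On a Zig-built interpreter I would expect the compiler string to mention Clang, since that is what the banner reports.

```python
import platform


def describe_build():
    """Collect basic facts about how the running interpreter was built."""
    return {
        "version": platform.python_version(),    # e.g. the "3.12.0a3+" part
        "compiler": platform.python_compiler(),  # the text inside [...] in the banner
        "build": platform.python_build(),        # (build number, build date) tuple
    }


info = describe_build()
for key, value in info.items():
    print(f"{key}: {value}")
```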