This started as a July status, but I kept wanting to dig just a little deeper to give a more detailed update.
As I mentioned in the June status, the core idea of Viridis is to create a zero-copy operating system with a finer-grained unit of execution than the process or thread. The most fundamental difference is that the entire operating system will exist in a single memory context, which should reduce the context switch overhead to practically nothing, although at the cost of having to self-enforce memory protection.
I won’t get into more detail about the execution model, because I’m not there yet, but with a single context it’s clear that we’ve thrown running native binaries out the window, at least for untrusted code. Because of this, the first step towards the goal is writing a compiler for a memory-safe language, which is what my focus has been for the last couple of months.
Vua is what I call this language, and as you might guess it’s based on Lua in syntax. I chose to use Lua as a base for a few reasons:
- It has relatively minimal, C-like syntax. Lua has few bells and whistles (although it does have some), it’s memory safe, and it’s based around just three basic data types (numbers, strings, tables).
- LuaJIT exists, is fast, and has a permissive license (MIT). I am still rolling my own for various reasons, but it’s nice to have a working model to look at and I have included a small amount of LuaJIT code directly (mostly bytecode definitions).
Of course, that’s not to say that Vua is Lua. In fact, they’re already incompatible with each other in some very minor ways, and the compiler/bytecode interpreter/JIT is going to need to be further tweaked to squeeze more predictability (e.g. “2” + 1 is an error, not 3) and performance out of the resulting native instructions. I have ideas on this, but they mostly come down to replacing CData with a table-esque “struct” data structure that can be read and written byte-accurately using a new long integer type, “reg”, both with specialized bytecode. The default numeric type is also an integer.
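To make the predictability point concrete, here is a minimal sketch in Python of what strict, integer-only addition could look like. It is not Vua's actual implementation; the names `VuaTypeError` and `strict_add` are hypothetical, and the only behavior taken from the text above is that adding a string to a number is an error rather than a silent coercion.

```python
class VuaTypeError(Exception):
    """Raised when an operand has the wrong type (hypothetical name)."""


def strict_add(a, b):
    """Add two integers, refusing to coerce any other type."""
    for operand in (a, b):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(operand, bool) or not isinstance(operand, int):
            raise VuaTypeError(
                f"attempt to perform arithmetic on a {type(operand).__name__} value"
            )
    return a + b
```

With this rule, `strict_add(2, 1)` returns 3, while `strict_add("2", 1)` raises instead of quietly converting the string. Knowing both operands are integers is also what would let a compiler emit a plain native add without a type check on the fast path.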
The big thing I’m concerned about at this point is how garbage collection will work, or if it’s smarter to keep memory management explicit. In a traditional system, explicit memory management is obviously better for performance, both in terms of fewer instructions and running memory usage, since programs explicitly release memory as soon as they’re done with it. Maybe that’s the end of the discussion, and honestly that would be less work in the long run because a lot of effort goes into effective garbage collection. The only thing that makes me question this position is that stale references to freed memory are security issues in this system. Traditional processes get punished for use-after-free by faulting, or by just screwing up some other piece of their own data. In this system, whatever gets placed at that address after the memory is freed could be entirely unrelated to your program. Then again, Viridis has control over the running code, either through interpretation or compilation, and we’re counting on some level of forced opt-in to safely deal with memory buffers, so maybe that’s a non-issue.
Anyway, at the moment I’m hedging my bets on this front, in the hopes that as I develop in the language (leaking memory everywhere) the best answer will become apparent. There is no GC at the moment, but the memory allocator and object headers have some space reserved for its use.
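The stale-reference worry can be shown with a toy model. The sketch below is Python, not Viridis code, and `ToyHeap` is a made-up first-fit allocator over one shared buffer; it only illustrates why use-after-free becomes a cross-program leak once everything lives in a single memory context.

```python
class ToyHeap:
    """A toy first-fit allocator over one shared bytearray (hypothetical)."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.free_list = [(0, size)]  # (offset, length) of free blocks

    def alloc(self, length):
        for i, (off, free_len) in enumerate(self.free_list):
            if free_len >= length:
                # Carve the request off the front of the free block.
                rest = (off + length, free_len - length)
                if rest[1] > 0:
                    self.free_list[i] = rest
                else:
                    del self.free_list[i]
                return off
        raise MemoryError("toy heap exhausted")

    def free(self, off, length):
        # No coalescing; the block goes to the front so it is reused first.
        self.free_list.insert(0, (off, length))


heap = ToyHeap(64)

# "Program A" allocates a buffer, frees it, but keeps the old offset around.
a_buf = heap.alloc(16)
heap.memory[a_buf:a_buf + 16] = b"A" * 16
heap.free(a_buf, 16)

# "Program B", entirely unrelated, is handed the same memory.
b_buf = heap.alloc(16)
heap.memory[b_buf:b_buf + 16] = b"secret-of-prog-B"

# A's stale reference now reads B's data.
leaked = bytes(heap.memory[a_buf:a_buf + 16])
```

In a traditional process model the second allocation would live in a different address space, so the stale offset could only damage A itself. Here `leaked` holds B's bytes, which is exactly the case a GC (or compiler-enforced buffer handling) would have to rule out.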
Thoughts on Performance
Memory tracking aside, the big issue that comes to mind with using a compiler like this is performance. JITs and other interpreter strategies have some advantages with runtime optimization, but fundamentally you’re adding a compile stage and/or interpreter overhead versus just loading a big chunk of instructions and letting the CPU go at it. It would seem that it’s impossible to beat that, but I’m hoping that this obstacle will be overcome partially through avoidance (perhaps bytecode or marked up native code could be cached instead of generated each time), partially through mitigation (making the compiler faster and better), and partially through the structure of the system favoring small, non-blocking operations on shared memory far more so than any existing operating system paradigm.
Of course all three of these strategies will have to be employed just to break even on the performance front, but if we can come close then hopefully the other benefits of using Vua at a system level will make this a net gain. After all, even using the CPU isn’t technically as fast as rendering your algorithms directly into silicon, but we put up with the abstraction because it’s fast enough and the extreme versatility of the processor (the fact that you don’t have to spend years in logic design, verification and manufacturing) more than makes up for the performance loss. No piece of software is going to represent quite that level of quantum leap, but the core conceit of Viridis (and the JVM, and pretty much all dynamic languages in general) is that a small amount of performance is worth trading for other concerns, like debuggability, portability, or expressivity.
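The "avoidance" strategy mentioned above, caching compiled output instead of regenerating it each time, can be sketched in a few lines. This is Python for illustration only; `PrototypeCache` and `fake_compile` are hypothetical names, and keying the cache on a SHA-256 hash of the source text is an assumption, not something Viridis is known to do.

```python
import hashlib


class PrototypeCache:
    """Cache compiled output keyed by a hash of the source text."""

    def __init__(self, compile_fn):
        self._compile = compile_fn
        self._cache = {}
        self.misses = 0  # how many times the compiler actually ran

    def get(self, source):
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        proto = self._cache.get(key)
        if proto is None:
            # Not seen before: pay the compile cost once and remember it.
            self.misses += 1
            proto = self._compile(source)
            self._cache[key] = proto
        return proto


def fake_compile(source):
    # Stand-in for a real compiler: the "bytecode" is just a tuple of tokens.
    return tuple(source.split())


cache = PrototypeCache(fake_compile)
first = cache.get("return 1 + 2")
second = cache.get("return 1 + 2")  # served from the cache, no recompile
```

After both calls `cache.misses` is 1 and `first is second`. A real system would also have to decide where the cache lives and who is allowed to populate it, since cached native code is only as trustworthy as whatever produced it.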
It’s always best to look at the Viridis Git to see what I’ve been doing. To summarize, Vua’s parser and bytecode generation are almost complete. They need to have flow control and table manipulation fleshed out, but the compiler is capable of generating a function prototype (compiled bytecode) with most of the same bytecode instructions as LuaJIT.
At the moment, and the reason I’m procrastinating and writing this post instead of working on it now, I’m just about to begin coding the actual virtual machine to execute the prototype.
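For a sense of what "a virtual machine to execute the prototype" involves, here is a minimal register-machine sketch in Python. It is not the Vua VM: the `Prototype` class and the tuple instruction encoding are simplifications made up for this example. The opcode names `KSHORT`, `ADDVV`, and `RET1` are borrowed from LuaJIT's instruction set, but real LuaJIT packs each instruction into a 32-bit word and its operand layout differs from the 4-tuples used here.

```python
from dataclasses import dataclass


@dataclass
class Prototype:
    code: list      # list of (opcode, a, b, c) tuples (simplified encoding)
    num_slots: int  # number of registers the frame needs


def execute(proto):
    """Run a prototype until it returns a value."""
    slots = [0] * proto.num_slots
    pc = 0
    while True:
        op, a, b, c = proto.code[pc]
        pc += 1
        if op == "KSHORT":    # slots[a] = small integer constant b
            slots[a] = b
        elif op == "ADDVV":   # slots[a] = slots[b] + slots[c]
            slots[a] = slots[b] + slots[c]
        elif op == "RET1":    # return the single value in slots[a]
            return slots[a]
        else:
            raise ValueError(f"unknown opcode {op!r}")


# Roughly what "return 40 + 2" might compile to without constant folding.
proto = Prototype(
    code=[
        ("KSHORT", 0, 40, 0),
        ("KSHORT", 1, 2, 0),
        ("ADDVV", 2, 0, 1),
        ("RET1", 2, 0, 0),
    ],
    num_slots=3,
)
result = execute(proto)
```

Here `result` is 42. The fetch, decode, dispatch loop is the interpreter overhead discussed earlier; a JIT would replace the loop with native instructions for the hot paths.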
I’ve also made a number of improvements on the kernel front, like moving from NASM to GAS and calibrating the TSC to allow for rough wall clock timing feedback, but most of my effort is focused on getting Vua up to snuff.