JavaScript in asm.js (and a little rust)
cananian

Over on Twitter, Tim Caswell mentioned, "I think high-level scripting language on top of something like rust.zero would make for an amazing OS," and that set me off a bit. Twitter isn't a great place to write a reasoned discussion of programming languages or implementation strategies, so let's take a shot at it here.

As I've written about on this blog, I've been tinkering for years with TurtleScript, a small learnable JavaScript subset in the spirit of Alan Kay. In that same Twitter conversation, David Herman mentioned rusty-turtle, my TurtleScript bytecode interpreter written in Rust. The rusty-turtle codebase includes a REPL which runs TurtleScript's tokenizer, parser, bytecode compiler, and standard library (all written in TurtleScript) through the Rust interpreter. It's quite cute, and I implemented much more of the JavaScript semantics than I strictly needed to (with the side effect that the behaviors in the JavaScript wat talk now appear quite sane and sensible to me).

I wrote rusty-turtle as a personal warm-up: I was considering taking a job with the fine folks at Mozilla (OLPC having run out of money again) and wanted to understand the technology better. In the rusty-turtle README I described a number of further research projects I thought would be interesting to pursue, including Cilk-style fork/join parallelism or transactional memory support (the latter being the subject of my thesis), and a JIT backend using Rust's LLVM library bindings.

But the true turtles-all-the-way-down approach would be to rewrite the backend using asm.js, which can be trivially JIT'ed (using LLVM bindings). Then you'd have an entire system from (pseudo-)assembly code up, all written in a consistent manner in JavaScript. To that end, I wrote a single-pass type-checker/verifier for asm.js in TurtleScript, discovering lots of issues with the spec in the process (sigh). (I justified this as more "Mozilla interview preparation"! Besides, it was fun.)
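
For readers who haven't seen asm.js, here is a minimal sketch of what that (pseudo-)assembly style looks like. The module name TinyModule and its two functions are made up for illustration; nothing here is taken from TurtleScript or rusty-turtle. The point is that types are declared through coercions like `|0`, which is roughly the information a type-checker/verifier has to track, and that the code is still ordinary JavaScript that runs unchanged in any engine.

```javascript
// Hypothetical asm.js-style module, for illustration only.
function TinyModule(stdlib, foreign, heap) {
  "use asm";

  // View the raw heap buffer as 32-bit signed integers.
  var HEAP32 = new stdlib.Int32Array(heap);

  // Add two ints; the `|0` coercions are how asm.js marks values as int.
  function add(a, b) {
    a = a | 0;
    b = b | 0;
    return (a + b) | 0;
  }

  // Sum the first n 32-bit words stored in the heap.
  function sum(n) {
    n = n | 0;
    var i = 0;
    var total = 0;
    for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {
      // Heap reads use a byte offset shifted back down to a word index.
      total = (total + (HEAP32[(i << 2) >> 2] | 0)) | 0;
    }
    return total | 0;
  }

  return { add: add, sum: sum };
}

// Usage: a 64KB heap, with the first four words filled in.
var heap = new ArrayBuffer(0x10000);
new Int32Array(heap).set([1, 2, 3, 4]);
var mod = TinyModule(globalThis, {}, heap);
var five = mod.add(2, 3);
var ten = mod.sum(4);
```

Because every value is pinned to a machine type by its coercion, an engine (or a JIT written in TurtleScript) can in principle compile this ahead of time instead of guessing types at run time.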

Tim Caswell, to finally answer your question: I think that this "JavaScript all the way" system would make an amazing OS. The Rust stuff is just a distraction (except as needed to bootstrap).

In the next post I'll rant a bit more about Rust.

ps. In the end I over-prepared (!): Mozilla's feedback was that I seemed to "know too much about Rust to work on Servo" (Mozilla's experimental web layout engine, written in Rust). Mozilla seems to have reached that awkward size where it can no longer hire smart people and find places for them to contribute; new hires need to fit into precise job descriptions a priori. That sort of place is not for me.


Rust is not fast
cananian

There are plenty of safe high-level languages in the world; JavaScript, for example. Rust is different: it's supposed to be safe and fast.

But Rust is slow. (And its type system hates you.)

Rust is slow because there is lots of hidden indirection ("smart dereferencing") and other hidden costs (ref counting, etc.). In low-level C code I can look at a line of code and know roughly how many (slow) memory accesses are present. Not so in Rust.

Further, Rust's type system leads to extra unnecessary copying, just to get your code to compile without massive refactoring of the standard library. When writing rusty-turtle I found myself having to add ~ or @ pointers to my types (forcing extra layers of dereferencing) just to work around the type system. Further, the APIs have a genericity problem: there are lots of duplicate methods, since &-pointers aren't truly generic/orthogonal. (And you will find yourself duplicating methods in your own APIs as well, in order to be able to pass in parameters with different reference types -- or else just throw up your hands and wrap an extra @ layer around everything.)

The ownership type system also fights against typical APIs like find_and_insert for maps, since you don't know (before you do the find) whether or not you will be giving up ownership of the parameter (in order to do an insert). So you just copy the inserted value, always! Cycles are cheap, right?

Rust is also slow because it is not built to be parallel. The language is concurrent, but this is a word game: in the past few years the terms have been redefined such that "concurrent" is (roughly) non-blocking cooperative multitasking (such as is implemented by node.js and GNU Pth), and "parallel" is reserved for actually doing more than one thing simultaneously (whether on separate CPUs or separate cores of a single CPU). Rust's memory model doesn't help: there is no shared memory, and ownership types make fork/join parallelism difficult. All inter-task communication is explicit message passing, with the races that entails. (Perhaps I'm spoiled: the Cilk work-stealing nano/microscheduler is my benchmark for speed.)

Some possible improvements:

  • Get rid of smart dereferencing; make it clear when performance is impacted by memory references.
  • Fix bugs with small objects/ABI limitations to avoid unnecessary parameter wrapping.
  • Make & pointers truly generic (or invent a new pointer which is) and do template expansion/method splitting to generate the proper specialized version of the method automatically (although this will exacerbate existing problems with code caching).
  • Better support fast refcounting/fast gc (update coalescing, generations).
  • Support fork/join parallelism and work-stealing.

This post is written from my experience with Rust in May 2013. Some of these problems are known, and some may eventually be fixed. But it makes me wonder what the language is really supposed to be good at. There are already plenty of slow safe languages.

