Library of The Week: Jailed

In the JavaScript world there is broad consensus about how to describe and exchange data: JSON. But some classes of application need more. In particular, you may need to describe and exchange behavior as well as data.

It isn't easy to find a simple real-world example because you tend to face this kind of problem in more complex applications. So let's accept that the following is a little artificial. Imagine you run a coffee shop portal and want to show which nearby shops are currently open. You take a first crack at a format for your REST API:

{
  "name": "Kaffeine",
  "opening_hours": {
     "monday-friday": "10:00-20:00",
     "saturday": "12:00-20:00",
     "sunday": "12:00-18:00"
  }
}

This works fine for simple cases, but then the coffee shop decides to close on every other Monday. You consider extending the schema to support this special case (e.g. "monday": {"odd": "10:00-20:00", "even": "closed"}), but you reject this idea as too inflexible: the next coffee shop will doubtless decide to stay closed on the first Monday of the month, at which point you will have to make more ad hoc schema extensions.

Instead, you hack your schema to allow particular days to be overridden:

{
  "name": "Espresso Troublemakers",
  "opening_hours": {
     "monday-sunday": "10:00-20:00",
     "2015-05-18": "closed",
     "2015-06-01": "closed",
     "2015-06-15": "closed"
  }
}

Not so bad, but it covers only a finite set of days, and the exceptions must be updated continually, something the user is bound to forget from time to time. The list of specific exceptions is generated anyway, so why not move this logic to the description? And since we are familiar with JavaScript, why not use it?

{
  "name": "Espresso Troublemakers",
  "opening_hours": {
     "monday-sunday": "10:00-20:00",
     "special": "return function(day) { \
        if (isEven(day)) { return \"closed\"; } \
     }"
  }
}

What's wrong with this approach? There are several issues, most importantly related to security. Consumers would have to execute the special code using eval, which is never a good idea. Even if the code is not intentionally malicious, it can still contain mistakes such as infinite loops. It is best to avoid this approach altogether.
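To make the risk concrete, here is a minimal sketch of what a consumer would end up doing. The appSecrets object and the rule string are invented for this illustration. new Function is used in place of eval, but the evaluated code still runs with access to the consumer's global scope.

```javascript
// Hypothetical application state that a rule should never see.
globalThis.appSecrets = { apiKey: 'not-for-plugins' };

// A "special" rule as it might arrive inside the JSON document (invented).
const special =
    'return function(day) {' +
    '  return Object.keys(globalThis.appSecrets).join(",");' +
    '}';

// Like eval, new Function runs the string as ordinary code with
// full access to the global scope of the consumer.
const rule = new Function(special)();

// Instead of opening hours, the rule reports what it found in appSecrets.
console.log(rule('monday'));
```

Nothing in the string had to look suspicious for this to work, which is why validating such code by inspection is not a realistic defence.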

In many applications, providing a mini-syntax to construct rules from a limited set of predicates can be enough, since you already know the requirements of your domain. For instance:

"opening_hours": {
  "friday": "10:00-23:00",
  "!holiday && nth_in_month(1st, monday)": "closed",
  "even(monday)": "closed",
  "odd(monday)": "10:00-20:00"
}

True, you must write an engine to evaluate the special syntax, but this way you retain total control. You can validate the expressions to ensure that you don't get any undesired behavior.
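As a rough idea of how small such an engine can be, here is a sketch that handles only plain weekday names and the even(...) and odd(...) predicates. Everything in it is an assumption made for the example: the function names, the reading of "even" as "the 2nd or 4th occurrence of that weekday in the month", and the rule that the first matching key wins. Compound expressions such as the holiday rule above are deliberately left out.

```javascript
const WEEKDAYS = ['sunday', 'monday', 'tuesday', 'wednesday',
                  'thursday', 'friday', 'saturday'];

// Assumption: "even"/"odd" refer to the occurrence of the weekday
// within the month (1st, 2nd, 3rd, ...).
function occurrenceInMonth(date) {
    return Math.ceil(date.getUTCDate() / 7);
}

// Whitelist of predicates. A Map avoids accidental hits on
// inherited properties such as "constructor".
const PREDICATES = new Map([
    ['even', (date) => occurrenceInMonth(date) % 2 === 0],
    ['odd', (date) => occurrenceInMonth(date) % 2 === 1]
]);

// Returns true if the rule applies to the date, and throws on
// anything that is not part of the mini-syntax.
function matches(rule, date) {
    const weekday = WEEKDAYS[date.getUTCDay()];
    if (WEEKDAYS.includes(rule)) {
        return rule === weekday;
    }
    const call = /^(\w+)\((\w+)\)$/.exec(rule);
    if (call && PREDICATES.has(call[1]) && WEEKDAYS.includes(call[2])) {
        return call[2] === weekday && PREDICATES.get(call[1])(date);
    }
    throw new Error('Unsupported rule: ' + rule);
}

// First matching rule wins; null means no rule applies.
function hoursFor(openingHours, date) {
    for (const rule of Object.keys(openingHours)) {
        if (matches(rule, date)) {
            return openingHours[rule];
        }
    }
    return null;
}

const hours = {
    'friday': '10:00-23:00',
    'even(monday)': 'closed',
    'odd(monday)': '10:00-20:00'
};
console.log(hoursFor(hours, new Date('2015-06-08T00:00:00Z')));
```

Because unknown rules raise an error instead of being executed, a malformed or hostile description is rejected before it can do anything.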

On the other hand, the expressive power of a fully-fledged scripting language can be priceless. Is there any way to get the benefits of Turing completeness without the risk?

One option is Jailed, which provides a sandbox for running untrusted code. The great thing is that it works both in the browser (using an iframe for the sandbox) and in Node (using a subprocess), although it also has its own pitfalls.

To create a sandbox, you can use jailed.DynamicPlugin to run code from a string, or pass a path to jailed.Plugin to read the code from a file.

var jailed = require('jailed');
var plugin = new jailed.Plugin('untrusted.js');

Imagine you want to compute something and print the result. So you run untrusted.js with the following contents:

var result = 3 * 4;
console.log(result);

which results in:

ReferenceError: console is not defined
    at untrusted.js:3:1

This is perfect since we don't want untrusted code to access global objects like console, require or window in the browser. To make specific functionality available to the plugin, we can explicitly export functions:

// Application code
var plugin = new jailed.Plugin('untrusted.js', {
    log: console.log.bind(console)
});

// untrusted.js
var result = 3 * 4;
application.remote.log(result);

If you run this code in Node, you will see that the script never ends. This is because a subprocess is used for running sandboxed code. You must quit it explicitly.

var result = 3 * 4;
application.remote.log(result);
application.disconnect();

Note also that exported objects must be functions. The function call is implemented internally by sending messages, and the recipient will use a proxy function and not the real object you passed. (So you can't pass the whole console object, for example, and then try to call application.remote.console.log.) Another consequence of proxies is that methods of application.remote always return undefined, regardless of the return value of the original function. You can work around this by using callbacks and returning values asynchronously:

// Application code
var plugin = new jailed.Plugin('untrusted.js', {
    getConstant: function(cb) { cb(42); },
    log: console.log.bind(console)
});

// untrusted.js
application.remote.getConstant(function(constant) {
    application.remote.log('Answer is ' + constant);
    application.disconnect();
});
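If the nested callbacks become unwieldy, a callback-taking function can be wrapped so that it returns a Promise. The sketch below is not part of Jailed: promisifyRemote is a hypothetical helper, and fakeRemote is a stand-in for application.remote so that the snippet runs on its own. It assumes the callback is always the last argument and that there is no error path.

```javascript
// Hypothetical helper: wrap a function that reports its result through
// a trailing callback so that it returns a Promise instead.
function promisifyRemote(fn) {
    return function (...args) {
        return new Promise(function (resolve) {
            fn(...args, resolve);
        });
    };
}

// Stand-in for application.remote, so the sketch runs without Jailed.
const fakeRemote = {
    getConstant: function (cb) {
        setTimeout(function () { cb(42); }, 0);
    }
};

const getConstant = promisifyRemote(fakeRemote.getConstant);

getConstant().then(function (constant) {
    console.log('Answer is ' + constant);
});
```

In a real plugin you would presumably wrap the functions of application.remote in the same way; whether Promise is available inside the sandbox depends on the environment the plugin runs in.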

A common use case for untrusted code is to provide a mini-library of functions. For this purpose, a mechanism is provided to export an API in the opposite direction. Look first at untrusted.js, which defines our mini-library of just one function:

application.setInterface({
    mul: function(a, b, callback) {
        application.remote.log('mul(' + a + ',' + b + ') called');
        callback(a * b);
    }
});

Again, you can't return the result directly and must use a callback.

The application code is a little more complicated:

var plugin = new jailed.Plugin('untrusted.js', { log: console.log.bind(console) });

plugin.whenConnected(function() {
    plugin.remote.mul(4, 3, function(res) {
        console.log("Result is "+res);
        plugin.disconnect();
    });
});

We must wait for the plugin code defined inside application.setInterface to be executed. This is done with plugin.whenConnected. Note also that we have to terminate the plugin from the application code.

At this point it is probably clear how to handle possibly malicious or invalid code like:

var i = 0;
while (true) { i++; }

We can easily constrain it with a timeout:

var plugin = new jailed.Plugin('malicious.js', {...});
setTimeout(plugin.disconnect.bind(plugin), 1000);
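One refinement is to cancel the timer when the plugin finishes in time, so that a pending timeout does not keep the process alive or fire on a plugin that is already gone. The following sketch assumes only that the plugin object has a disconnect() method, as shown above. disconnectAfter is a hypothetical helper, and stubPlugin stands in for a real Jailed plugin so the snippet runs on its own.

```javascript
// Hypothetical helper: disconnect `plugin` after `ms` milliseconds
// unless the returned cancel function is called first.
function disconnectAfter(plugin, ms) {
    const timer = setTimeout(function () {
        plugin.disconnect();
    }, ms);
    return function cancel() {
        clearTimeout(timer);
    };
}

// Stand-in with the same disconnect() method as a Jailed plugin.
const stubPlugin = {
    disconnected: false,
    disconnect: function () { this.disconnected = true; }
};

const cancel = disconnectAfter(stubPlugin, 1000);

// Later, once the plugin has delivered its result in time:
cancel();
```

In the application code, cancel() would be called from the callback that receives the plugin's result, right before the regular plugin.disconnect().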

Not a bad solution. The Jailed library can be a great way to pass scripting code over the wire, but it does require some effort to architect the sandboxed code and communication flow.