Re: How about a Safe Virtual Machine?

Nathaniel Borenstein (nsb@nsb.fv.com)
Sat, 1 Oct 1994 16:57:31 +0100

Excerpts from www-talk: 1-Oct-94 How about a Safe Virtual Ma.. "Daniel
W. Connolly"@hal (7745)

> Anyway... the point of this message is to explore the possibility of
> supporting multiple languages by making the safe-computing platform a
> virtual machine, rather than a programming language.

I like this idea a lot in theory. I'm not sure how feasible it is in
practice, though -- it will take some fairly serious thought and
experimentation. My biggest concern is the mixing of levels of
abstraction. It turns out that one of the most essential features in a
"safe" language is a customizable user confirmation process. The more
specifically the program can describe the action that needs to be
confirmed, the more likely it will be that the user can make an
intelligent choice. Thus, a user stands a much better chance of
answering intelligently when asked:

"This program is trying to send mail to 'president@whitehouse.gov' with
a Subject of 'Die, Commie Dog!' and a body of.....<text in scrollable
region>...... Do you want to permit this mail to be sent?"

than when asked

"This program is trying to execute the 'exec' system call with an
argument list of '/usr/lib/sendmail -t < /tmp/mm1824'. Do you want to
permit this system call to be executed?"

(Note that this is just a simplified example; I would never advocate
giving ANY such access to 'exec' or similarly broad functions, having
seen so-called safe languages that did this.) The point is that the
confirmation process is most effective when it can reflect user-level
semantics for user-level confirmation. This means that the "machine
code" for an SVM (Safe Virtual Machine) probably has to look rather
different from the "machine code" for most VMs. In particular, you
probably want atomic primitives for user-level things like sending mail,
limited file system access, and so on. This is by no means impossible
(I think), but it does mean that this is NOT your father's Virtual
Machine, not by a long shot.
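
As a rough illustration of what such user-level "instructions" might
look like, here is a minimal sketch in Python. The names SendMail,
ReadFile, and describe are hypothetical, invented for this example;
they are not taken from Safe-Tcl or any existing VM. The point it shows
is that each primitive carries enough structure to phrase its own
confirmation question in user-level terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SendMail:
    """Hypothetical SVM primitive: one whole user-level action."""
    to: str
    subject: str
    body: str

    def describe(self) -> str:
        # The question is phrased in terms the user understands,
        # not in terms of system calls.
        return (
            f"This program is trying to send mail to '{self.to}' "
            f"with a Subject of '{self.subject}'. "
            "Do you want to permit this mail to be sent?"
        )


@dataclass(frozen=True)
class ReadFile:
    """Hypothetical SVM primitive: limited file system access."""
    path: str

    def describe(self) -> str:
        return (
            f"This program is trying to read the file '{self.path}'. "
            "Do you want to permit this file to be read?"
        )
```

Note that there is deliberately no generic "exec" instruction in this
sketch; every operation names a specific user-level action.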

(As an aside for any potential implementors out there: Note that the
confirmation process can't just be a subroutine called by the program in
the safe language, because if the program has the power to take
differential action based on the result of that subroutine, it has the
power to skip the confirmation subroutine altogether, or to spoof the
question to make the action contemplated seem misleadingly benign. Thus
the confirmation process has to be invoked from WITHIN the SVM
"primitive" operation. Then, to make matters more complicated and
level-mixing still, the confirmation process must use the right user
interface paradigm for the current interaction environment, which
probably means calling back up to the outermost layers of abstraction,
but in a manner that can't be subverted by untrusted code. Safe-Tcl
actually gets all this right (I believe), but the implementation is more
complicated than you might think. Be careful!)
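
The structure described in the aside above can be sketched as follows.
This is only an illustration under stated assumptions, not how Safe-Tcl
is implemented: the class SafeVM, its send_mail method, and the confirm
callback are hypothetical names, and Python's leading-underscore
convention does not really hide anything from untrusted code the way a
real SVM boundary would have to. What the sketch does show is the
shape: the trusted host supplies the confirmation routine, the
primitive builds the question itself from its real arguments, and the
action happens only inside the primitive after the user agrees.

```python
from typing import Callable, List, Tuple


class SafeVM:
    """Hypothetical SVM whose primitives confirm from WITHIN."""

    def __init__(self, confirm: Callable[[str], bool]) -> None:
        # Installed by the trusted host, so it can use whatever user
        # interface fits the current environment (dialog, tty, ...).
        self._confirm = confirm
        # Stand-in for real mail delivery, so the sketch is testable.
        self.outbox: List[Tuple[str, str, str]] = []

    def send_mail(self, to: str, subject: str, body: str) -> bool:
        # The question is built here from the actual arguments, so the
        # untrusted program cannot substitute a more benign wording.
        question = (
            f"This program is trying to send mail to {to!r} "
            f"with a Subject of {subject!r}. "
            "Do you want to permit this mail to be sent?"
        )
        # The check cannot be skipped: it is the only path to delivery.
        if not self._confirm(question):
            return False
        self.outbox.append((to, subject, body))
        return True
```

A host would construct the machine with its own confirmation routine,
for example SafeVM(confirm=ask_in_dialog_box) where ask_in_dialog_box
is whatever the host's interface provides, and hand the untrusted
program only the primitives.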

> I think that portable source code is a myth. By its nature, source
> code is for human consumption and manipulation. A body of code is not
> really reusable until it's reached the state of a shared library or
> DLL. (The possible exception being Ada/Modula-3 generic modules,
> and to some extent, C++ template classes.)

I'd like to separate your basic proposal for an SVM from the above
claim, because I'm a big believer in portable source code. The above
claim is not, however, necessary to support the desirability of an SVM,
so I'd prefer to drop it from this thread.

> On the one hand, it seems easier to exhaustively examine the behavior
> of a bytecoded virtual machine than the behaviour of the set of
> programs expressible in some high-level language.

I think theoretically they're just the same. In practice, source code
is probably easier for a human to examine, and byte codes are probably
easier for a program.

> Hmm... the more I think about it, the less interesting it becomes to
> think about executing completely untrusted, anonymous programs. The
> interesting part is to allow programs access to exactly the set of
> resources that they are authorized to use, and to support accounting
> for the use of these resources.

Right, and this is also why you need to be able to hook the capabilities
system up to the authentication information that comes in from PGP or
PEM. The current safe-tcl interpreter has hooks for this, but those
hooks have not yet been attached to PGP or PEM to complete the
picture.
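
To make the idea concrete, here is a minimal sketch of tying a set of
permitted primitives to an authenticated identity. Everything in it is
an assumption for illustration: the POLICY table, the example address,
and the function names are invented, and the signer is presumed to have
already been verified by PGP or PEM outside this code. It does not
reflect the actual hooks in the safe-tcl interpreter.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical policy: verified signer -> primitives it may invoke.
# The key None stands for an unsigned or unverifiable program.
POLICY: Dict[Optional[str], FrozenSet[str]] = {
    None: frozenset(),
    "alice@example.com": frozenset({"send_mail", "read_file"}),
}


def capabilities_for(signer: Optional[str]) -> FrozenSet[str]:
    """Return the primitives granted to an already-verified signer."""
    # Unknown signers fall back to the anonymous (empty) grant.
    return POLICY.get(signer, POLICY[None])


def is_authorized(signer: Optional[str], primitive: str) -> bool:
    """Check one primitive against the signer's grant."""
    return primitive in capabilities_for(signer)
```

User confirmation would still apply on top of this check; the grant
only decides which primitives a program may attempt at all.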

To my mind, the biggest open question is whether an SVM is really
feasible, when you take into consideration the way that the
user-confirmation process affects levels of abstraction. When you
consider, for example, all the parameters that are needed for a "send
mail" primitive, you can see why there aren't any REAL machines out
there on which "send mail" is a primitive machine instruction. My
concern is that by the time you make the SVM complicated enough to
handle such highly structured "machine instructions", it may end up
looking a lot like a high-level language such as Safe-Tcl!

I'm taking the liberty of re-sending some of these messages to the
safe-tcl list, as I'm sure there will be interest there. -- Nathaniel