The distributed virtual machine

A typical execution of a parallel computation on the distributed virtual machine will resemble the following exchange:
    Node$_0$                    Node$_1$
    Code $\to$
    Data $\to$
    computation                 computation
                                $\gets$ Result
    Code $\to$
    Data $\to$
    computation                 computation
                                $\gets$ Result
The programmer, of course, does not even have to know that this is happening. Still, the parallelization method is important, since it is what guarantees that the system works. It also has very strong ties to the virtual machine language, as stressed earlier.

Because we bind code and data so strongly together, it is always clear exactly what code changes what data, and what code depends on what results.

We can never forget to update a variable with a result from another node, because we will be waiting for the reply message from that node before we can continue with the next calculations (the next opcodes depend on the last result).

Likewise, a result can never be changed in a node without that node knowing about it, since the node has to read the data itself.

So the two main sources of errors in MPI/PVM and threads do not exist when we choose to see the nodes in our cluster as communicating sequential processes, and the instructions executed and the data they use as messages passed back and forth between those processes.

An excellent toy language that demonstrates the power of parallelization with CSP is Concurrent ML. It is an implementation of the ML programming language with parallelization extensions. ML is a functional language, so by its nature it lends itself well to parallelism using CSP.


1999-08-09