I have made heavy use of the C++ standard template library for implementing the sets, maps, lists of sets, and other fairly complicated but efficient data structures.
There are still optimizations that should be done in the code. There may be cases where we spoil the otherwise fine algorithmic complexity of the parallelizer. The following is a completely non-academic and unproven estimate of the complexity of the sequence parallelizer. It looks as though this estimate is possible to meet, but there may well be cases in the current implementation that blows this.
In a code sequence with
instructions, where the instructions have
a total of
arguments, we should be able to parallelize the entire
sequence in
time.
is the cost of building up a balanced tree
holding
arguments (dependencies). We use balanced trees to
implement the dependency sets.
The
is the cost of looking up
dependencies
in our mapping between touched variables and the touchers, for
instructions. Again, the mapping is implemented using a balanced tree,
and it is kept up to date as we move instructions between the various
sequences.
The number of dependencies
is likely to be a small number compared
to the number of instructions
,
so effectively we may see run-time
that looks more like
.