Regent: A Language for Implicit Dataflow Parallelism

Tuesday, December 22, 2015

Programming

Regent is a research programming language which extracts implicit dataflow parallelism from code written in an imperative and (apparently) sequential semantics.

How does this work? Let’s jump right into an example:

task main()
  var a, b, c = ...
  t1(a, b)
  t2(b, c)
end

Here, t1 and t2 are tasks—think of them as functions in the usual imperative sense of the word. Tasks can even mutate their arguments, just like normal imperative code. Conceptually, Regent code runs sequentially, so you can run through the logic in your head by just reading the code top-to-bottom. Under the covers, the Regent implementation will take advantage of any parallelism it can find in the code. But any parallel execution is guaranteed to produce results identical to the original sequential semantics.

Internally, Regent discovers parallelism by computing a dataflow graph¹ for the program. If the arguments a, b and c are all distinct objects, for example, and if we suppose that t1 and t2 don’t modify b, then we’d get this dataflow graph:

Since t1 and t2 are independent in this graph (there is no path in the graph from one to the other), the two are able to run in parallel. If instead t1 were to modify its argument b, then we’d get this graph:

And the two tasks would be forced to run sequentially.

This idea—of parallelizing imperative programs via a transformation to dataflow—is actually an old idea. For example, Jade from the early 1990s used this technique to great effect on then-current hardware, and the technique itself was known prior to that. The biggest differences between Regent and previous systems are in the type system and the (multiple) ways in which Regent is able to discover parallelism.

The Three Dimensions of Parallelism

In fact, Regent has multiple ways of discovering parallelism in a program. I like to think of these each as dimensions along which a program can be parallel. Tasks only need to be independent along a single dimension in order to run in parallel—the extra dimensions provide flexibility to ensure that a variety of parallel patterns can be described in the language.

1. Privileges

As we saw in the example above, if two tasks read but do not modify an argument, they can run in parallel. This leads us to an important aspect of the Regent programming model: privileges. Privileges describe the modes of usage for a given argument (read, write, etc.). In Regent, privileges are part of the type signature of a task:

task t1(a : region(T), b : region(T))
where reads(a, b), writes(a) do
  ...
end

(Regions are just abstract containers for data, and are described in detail below.)

Privileges are checked by the type system, so if a task tries to access data it didn’t declare privileges for, the compiler will catch the mistake:

task t3(c : region(T)) where writes(c) do ... end

task t1(a : region(T), b : region(T))
where reads(a, b), writes(a) do
  t3(b) -- ERROR: Missing privilege writes(b)
  for x in b do
    @x = ... -- ERROR: Missing privilege writes(b)
  end
end

Obviously, tasks which only read their inputs can trivially run in parallel. However, it is also useful to be able to describe reductions—commutative operators (such as +, *, etc.) which can be applied in a lazy manner to a value. Reductions also allow tasks to run in parallel, as long as all tasks use the same reduction operator.

task t4(d : region(T))
where reduces+(d) do
  for x in d do
    @x += ... -- Values are saved and applied later.
  end
end

task main()
  var d = ...
  t4(d) -- Both can run in parallel.
  t4(d)
  -- Temporary values are flushed before the next read.
end

2. Fields

Often, tasks access the same objects, but use different fields in said objects. Field-sensitive privileges are an easy way to achieve task parallelism in codes with this sort of access pattern. Fields are easy to specify (just include the fields accessed in the privilege declaration), and frequently lead to unexpected gains in parallelism. This sort of parallelism can be especially tedious to express in existing programming models.

task t5(a : region(T))
where reads(a.{x, y, z}), writes(a.z) do
  ...
end

task t6(a : region(T))
where reads(a.{x, y, w}), writes(a.w) do
  ...
end

task main()
  var a = ...
  t5(a) -- Both can run in parallel.
  t6(a)
end

An added bonus of declaring field-sensitive privileges is that the compiler is able to catch operations to the wrong fields:

task t5(a : region(T))
where reads(a.{x, y, z}), writes(a.z) do
  for t in a do
    t.x = ... -- ERROR: Missing privilege writes(a.x)
  end
end

3. Regions

Privileges are easy enough to track at a type system level when everything is a local variable. But many interesting problems require the use of pointer data structures. Pointers naturally introduce the possibility of aliasing (two pointers with the same pointee)—and aliasing is arguably the bane of all static analysis.

In order to manage aliasing in a sane way, Regent uses a type system based on logical regions. Regions are similar to arrays, but allow for better static type checking. For example, pointers are explicitly typed to a specific region, and can only be dereferenced if the task has privileges on the containing region.

task t6()
  var r = region(ispace(ptr, 5), T) -- Create enough space for 5 Ts.
  var x = new(ptr(T, r)) -- Allocate an element in r.
  @x = ... -- Access is OK. Pointer x is guaranteed to be in r.
end

Regions also allow us to achieve data parallelism via partitioning. Partitions subdivide a region into some number of subregions. Subregions resulting from a partition are not separate regions, they’re just views onto the parent region. So changes made to subregions are automatically visible in the parent.

task t7()
  var r = region(ispace(ptr, 5), T)

  -- Allocate some elements in r.
  new(ptr(T, r))
  new(ptr(T, r))
  new(ptr(T, r))

  -- Assign colors to the elements of r.
  var c = 0
  for x in r do
    x.color = c
    c += 1
  end
  var colors = ispace(ptr, c, 0) -- Remember the set of colors.

  -- Partition r based on the field color.
  var p = partition(r.color, colors)

  -- Launch a task over every subregion. Because the partition is
  -- disjoint, all tasks will run in parallel.
  for i in colors do
    t5(p[i])
  end
end

Partitions in Regent can be disjoint, as above, or aliased. Using disjoint partitions guarantees that tasks on the subregions can run in parallel, but even aliased partitions are useful. For example, aliased partitions can be used with either read-only or reduce privileges to move information around the system (for example in halo exchanges).

Regions and partitions are both first-class objects. They can be passed to tasks as arguments, and stored in the heap. Note, however, that while the values are first-class, privileges are not. Ways of dealing with this limit are beyond the scope of this post (see this paper for details).

Try Regent Online

Interested in learning more? Try Regent in your browser and run the examples yourself. Or, if you want to run Regent locally, see the installation page for details.

For More Information

You might also enjoy reading the source, language reference, or paper. There is also an older paper which describes a fragment of the type system.

I should note that this graph is technically not a proper dataflow graph in the traditional sense. Since traditional dataflow languages use functional semantics, nodes are pure functions of their arguments, and it makes sense to only model data as values flowing along the edges in the graph. Because Regent is imperative, memory locations may hold multiple values over time, and thus it makes sense to explicitly model how these values change as the computation proceeds. This representation also turns out to be useful in performing compile-time optimizations.↩︎

Elliott Slaughter