next up previous
Next: Termination Detection Up: Examples of Parallel Language Previous: Policy:

  
Latency Hiding

Latency hiding is an optimization technique to eliminate time to wait for remote messages, where the basic idea is to overlap local computation and remote communication. This is usually realized by modifying programs manually, i.e., by breaking up a single thread of control into multiple threads. The problem is that the modification is not small. Figure 3 shows two versions of the function product, which are manually modified for latency hiding from Figure 2. These two versions are different in the number of method invocation requests that are sent in advance to the actual use of the data (effectively, prefetching). In (a), only a request for the element that is used in the next iteration is sent in advance to its use, while requests for all the elements are sent before the computation in (b).
  
Figure 3: ``Manual'' latency hiding versions of dot-product
(a)

(defun product (v1 v2)
(let ((sum 0.0)
(size (size-of v1))
(elm-a (future (nth-element v2 0))) ; request for the first
(elm-b nil)
(dotimes (i size)
(setf elm-b
(future (nth-element v2 (+ i 1)))) ; request for the next
(setf sum (+ sum (* (nth-element v1 i)
(touch elm-a)))) ; use of the value
(setf elm-a elm-b))
sum))

(defun product (v1 v2) (let ((sum 0.0) (size (size-of v1)) (elm-a (future (nth-element v2 0))) ; request for the first (elm-b nil) (dotimes (i size) (setf elm-b (future (nth-element v2 (+ i 1)))) ; request for the next (setf sum (+ sum (* (nth-element v1 i) (touch elm-a)))) ; use of the value (setf elm-a elm-b)) sum))


(b)

(defun product (v1 v2)
(let* ((sum 0.0)
(size (size-of v1))
(elms (make-array size)))
(dotimes (i size) ; request for all
(setf (aref elms i) ; the elements
(future (nth-element v2 i))))
(dotimes (i size)
(setf sum (+ sum (* (nth-element v1 i)
(touch (aref elms i)))))) ; use of the values
sum))

(defun product (v1 v2) (let* ((sum 0.0) (size (size-of v1)) (elms (make-array size))) (dotimes (i size) ; request for all (setf (aref elms i) ; the elements (future (nth-element v2 i)))) (dotimes (i size) (setf sum (+ sum (* (nth-element v1 i) (touch (aref elms i)))))) ; use of the values sum))



We provide a mechanism for latency hiding, in which the programmer specifies when and what method invocation should be requested for prefetching in advance to the actual use of the result of the prefetch in the expressions by means of annotations embedded in the original programs. The annotation for latency hiding has the following form.

e{prefetch ea}

This annotated expression is interpreted as follows. Before evaluating the expression e, the expression ea is evaluated. The expression ea is expected to result in a sequence of synchronous method invocations (i.e., present type messages in ABCL[19]), which are executed as asynchronous (i.e., future type) invocations. The return values of these invocations are not used in ea itself; and resulting reply boxes are stored for later use. Then, the expression e and subsequent expressions are executed. In those expressions, a method invocation form is executed as touch operation to the stored reply box, if the form results in the same invocation to one of the invocations performed during the execution of ea. Using this mechanism, the annotated latency hiding versions of product become as shown is Figure 4. Note that without annotations, these two programs are identical to the original one in Figure 2.
  
Figure 4: Latency hiding versions of the dot-product function using annotations
(a)

(defun product (v1 v2)
(let ((sum 0.0) (size (size-of v1))
(dotimes (i size)
(setf sum
(+ sum (* (nth-element v1 i)
(nth-element v2 i)))) ; performs touch here
{prefetch (nth-element v2 (+ i 1))}) ; sends a request
{prefetch (nth-element v2 0)}
sum)))

(defun product (v1 v2) (let ((sum 0.0) (size (size-of v1)) (dotimes (i size) (setf sum (+ sum (* (nth-element v1 i) (nth-element v2 i)))) ; performs touch here {prefetch (nth-element v2 (+ i 1))}) ; sends a request {prefetch (nth-element v2 0)} sum)))


(b)

(defun product (v1 v2)
(let ((sum 0.0) (size (size-of v1))
(dotimes (i size)
(setf sum
(+ sum (* (nth-element v1 i)
(nth-element v2 i))))) ; performs touch here
{prefetch (dotimes (i size) ; even iterations can be
(nth-element v2 i))} ; written in annotations
sum)))

(defun product (v1 v2) (let ((sum 0.0) (size (size-of v1)) (dotimes (i size) (setf sum (+ sum (* (nth-element v1 i) (nth-element v2 i))))) ; performs touch here {prefetch (dotimes (i size) ; even iterations can be (nth-element v2 i))} ; written in annotations sum)))




next up previous
Next: Termination Detection Up: Examples of Parallel Language Previous: Policy:
Matt Hurlbut
1998-07-14