Name
HPL_pdtrsv Solve triu( A ) x = b.
Synopsis
#include "hpl.h"
void HPL_pdtrsv( HPL_T_grid * GRID, HPL_T_pmat * AMAT );
Description
HPL_pdtrsv solves an upper triangular system of linear equations.
 
The rhs is the last column of the N by N+1 matrix A. The solve starts
in the process  column owning the  Nth  column of A, so the rhs b may
need to be moved one process column to the left at the beginning. The
routine therefore needs  a column  vector in every process column but
the one owning  b. The result is  replicated in all process rows, and
returned in XR, i.e. XR is of size nq = LOCq( N ) in all processes.
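 
As a point of reference for the data layout described above, the
following is a minimal serial sketch of the same solve. It is not the
HPL implementation: the function name trsv_upper_augmented, the
column-major storage, and the leading dimension lda are assumptions
made for illustration, and no distribution or communication is shown.

```c
#include <stddef.h>

/*
 * Serial sketch (hypothetical helper, not part of HPL).
 * Solves triu( A ) x = b where A is N by N+1, stored column-major
 * with leading dimension lda >= n, and b is the last column of A.
 * Only the upper triangle of the leading N by N block is read.
 * The solution is written to x (length n).
 * Returns 0 on success, -1 if a zero diagonal entry is found.
 */
int trsv_upper_augmented(size_t n, const double *a, size_t lda, double *x)
{
    const double *b = a + n * lda;   /* column N (0-based) holds the rhs */
    size_t i, j;

    for (i = 0; i < n; i++)
        x[i] = b[i];

    /* Column-oriented back substitution, last unknown first. */
    for (j = n; j-- > 0; ) {
        const double *col = a + j * lda;
        if (col[j] == 0.0)
            return -1;
        x[j] /= col[j];
        for (i = 0; i < j; i++)
            x[i] -= x[j] * col[i];   /* update the remaining rhs entries */
    }
    return 0;
}
```

The loop runs from the last column to the first, which mirrors the
statement that the solve starts with the Nth column of A.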
 
The algorithm uses a decreasing one-ring broadcast in process rows and
columns, implemented in terms of synchronous point-to-point
communication primitives. A lookahead of depth 1 is used to shorten
the critical path. The entire operation is essentially latency bound,
and an estimate of its running time is given by:
 
   (move rhs) lat + N / ( P bdwth ) +            
   (solve)    ((N / NB)-1) 2 (lat + NB / bdwth) +
              gam2 N^2 / ( P Q ),                
 
where  gam2   is an estimate of the   Level 2 BLAS rate of execution.
There are  N / NB  diagonal blocks. To compute the next  NB  entries
of the solution vector, one must exchange  2  messages of length NB.
The solve performs a total of N^2 floating point operations.
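 
The estimate above can be evaluated term by term. The sketch below is
illustrative only: the function name pdtrsv_time_estimate is
hypothetical, and it assumes that lat is in seconds, bdwth is in
matrix entries per second, and gam2 is expressed as seconds per
floating point operation so that every term is a time.

```c
/*
 * Hypothetical helper, not part of HPL: evaluates
 *   lat + N / ( P bdwth )                      (move rhs)
 * + ((N / NB) - 1) 2 (lat + NB / bdwth)        (solve, communication)
 * + gam2 N^2 / ( P Q )                         (solve, computation)
 * Assumed units: lat in seconds, bdwth in entries per second,
 * gam2 in seconds per floating point operation.
 */
double pdtrsv_time_estimate(double n, double nb, double p, double q,
                            double lat, double bdwth, double gam2)
{
    double move_rhs   = lat + n / (p * bdwth);
    double solve_comm = ((n / nb) - 1.0) * 2.0 * (lat + nb / bdwth);
    double solve_comp = gam2 * n * n / (p * q);

    return move_rhs + solve_comm + solve_comp;
}
```

For fixed N, a smaller NB increases the number of latency terms,
which is why the operation is described as latency bound.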
Arguments
GRID    (local input)                 HPL_T_grid *
        On entry,  GRID  points  to the data structure containing the
        process grid information.
AMAT    (local input/output)          HPL_T_pmat *
        On entry,  AMAT  points  to the data structure containing the
        local array information.
See Also
HPL_pdgesv.