Name
HPL_pdfact recursive panel factorization.
Synopsis
#include "hpl.h"
void
HPL_pdfact(
HPL_T_panel *
PANEL
);
Description
HPL_pdfact
recursively factorizes a  1-dimensional  panel of columns.
The  RPFACT  function pointer specifies the recursive algorithm to be
used, either Crout, Left- or Right looking.  NBMIN allows to vary the
recursive stopping criterium in terms of the number of columns in the
panel, and  NDIV  allow to specify the number of subpanels each panel
should be divided into. Usuallly a value of 2 will be chosen. Finally
PFACT is a function pointer specifying the non-recursive algorithm to
to be used on at most NBMIN columns. One can also choose here between
Crout, Left- or Right looking.  Empirical tests seem to indicate that
values of 4 or 8 for NBMIN give the best results.
 
Bi-directional  exchange  is  used  to  perform  the  swap::broadcast
operations  at once  for one column in the panel.  This  results in a
lower number of slightly larger  messages than usual.  On P processes
and assuming bi-directional links,  the running time of this function
can be approximated by (when N is equal to N0):                      
 
   N0 * log_2( P ) * ( lat + ( 2*N0 + 4 ) / bdwth ) +
   N0^2 * ( M - N0/3 ) * gam2-3
 
where M is the local number of rows of  the panel, lat and bdwth  are
the latency and bandwidth of the network for  double  precision  real
words, and  gam2-3  is  an estimate of the  Level 2 and Level 3  BLAS
rate of execution. The  recursive  algorithm  allows indeed to almost
achieve  Level 3 BLAS  performance  in the panel factorization.  On a
large  number of modern machines,  this  operation is however latency
bound,  meaning  that its cost can  be estimated  by only the latency
portion N0 * log_2(P) * lat.  Mono-directional links will double this
communication cost.
Arguments
PANEL   (local input/output)          HPL_T_panel *
        On entry,  PANEL  points to the data structure containing the
        panel information.
See Also
HPL_dlocmax,
HPL_dlocswpN,
HPL_dlocswpT,
HPL_pdmxswp,
HPL_pdpancrN,
HPL_pdpancrT,
HPL_pdpanllN,
HPL_pdpanllT,
HPL_pdpanrlN,
HPL_pdpanrlT,
HPL_pdrpancrN,
HPL_pdrpancrT,
HPL_pdrpanllN,
HPL_pdrpanllT,
HPL_pdrpanrlN,
HPL_pdrpanrlT.