
Numerical Results

In this section we examine how reducing the bandwidth of the Schur complement matrix affects computational efficiency. Both optimized and unoptimized orderings have been implemented in the spectral element code. We present two examples: the first is a solution to a scalar elliptic Helmholtz equation, while the second is a solution to the incompressible Navier-Stokes equations. To obtain a sensible measure of this effect we compare the CPU time for the banded direct solve and for the iterative solve of the boundary system resulting from the Helmholtz equation ($\nabla^2 u - u = f$) against the CPU time taken by the same code to perform a direct solve on the unbanded boundary system. Because of its memory requirements, however, the unbanded system becomes too large to run at expansion orders where the other two methods are still very manageable.
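
To make the notion of bandwidth concrete, the following minimal sketch computes the half-bandwidth of a symmetric system under two different numberings of the same unknowns. The coupling graph, both orderings and the function name `bandwidth` are hypothetical illustrations, not taken from the spectral element code, and the storage count assumes a symmetric banded layout of n(b+1) entries.

```python
def bandwidth(edges, order):
    """Half-bandwidth: max |pos[i] - pos[j]| over all coupled pairs (i, j)."""
    pos = {node: k for k, node in enumerate(order)}
    return max(abs(pos[i] - pos[j]) for i, j in edges)


def banded_storage(n, b):
    """Entries stored for a symmetric banded matrix (diagonal plus b bands)."""
    return n * (b + 1)


# Hypothetical coupling graph: a chain of six boundary unknowns 0-1-2-3-4-5.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

poor_order = [0, 3, 1, 4, 2, 5]  # interleaved numbering (assumed, for contrast)
good_order = [0, 1, 2, 3, 4, 5]  # numbering that follows the chain

for name, order in (("poor", poor_order), ("good", good_order)):
    b = bandwidth(edges, order)
    print(name, "half-bandwidth:", b, "stored entries:", banded_storage(len(order), b))
```

Since the cost of a banded factorization and backsolve grows with the bandwidth, a numbering that keeps coupled unknowns close together reduces both storage and solve time.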

We have performed the optimization analysis on a mesh inside a spiralling pipe, as shown in figure 4.20. In this case we solved the Helmholtz equation with a forcing function chosen so that the numerical error could be measured. For all three methods the $L_2$ and $L_\infty$ errors were identical for a given expansion order.


 
Table 4.2: Timings for the Schur complement obtained on a single node SGI R8000.
          Bandwidth         User solve time in seconds
 p    initial   final   unbanded direct   banded direct   iterative
 3        361      59              0.53            0.50        0.82
 4       1501     244              1.14            0.98        2.41
 5       3441     558              2.59            1.94        6.19
 6       6181    1001              5.7             3.70       17.4
 7       9721    1573             10.95            6.08       39.35
 8      14061    2274             20.45           10.37       76.09
 9      19201    3104                --           16.77      141.66
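
The timings in table 4.2 can be turned into speedup ratios relative to the banded direct solve. The short sketch below does this arithmetic; the timing values are copied from the table, while the function name `speedups` and the use of `None` for the missing unbanded run at p = 9 are choices made here for illustration.

```python
# (p, unbanded direct, banded direct, iterative) user solve times in seconds,
# copied from table 4.2; None marks the unbanded run that has no entry.
timings = [
    (3, 0.53, 0.50, 0.82),
    (4, 1.14, 0.98, 2.41),
    (5, 2.59, 1.94, 6.19),
    (6, 5.7, 3.70, 17.4),
    (7, 10.95, 6.08, 39.35),
    (8, 20.45, 10.37, 76.09),
    (9, None, 16.77, 141.66),
]


def speedups(rows):
    """Map p -> (unbanded/banded, iterative/banded); None where no timing exists."""
    out = {}
    for p, unbanded, banded, iterative in rows:
        vs_unbanded = None if unbanded is None else unbanded / banded
        out[p] = (vs_unbanded, iterative / banded)
    return out


for p, (vs_unbanded, vs_iterative) in speedups(timings).items():
    u_txt = "   --" if vs_unbanded is None else f"{vs_unbanded:5.2f}"
    print(f"p={p}: unbanded/banded = {u_txt}, iterative/banded = {vs_iterative:5.2f}")
```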
 

In table 4.2 we compare the timings of the backsolve routine only, for the unoptimized (unbanded) and optimized (banded) direct solves and for a diagonally preconditioned conjugate gradient iterative solve. The improvement in CPU time for the banded computations over the unbanded computations is substantial, reaching a factor of about two at p = 8, and compared with the iterative solve it approaches an order of magnitude (a factor of about eight at p = 9). The preconditioner used in the iterative solve is simply the diagonal of the matrix; in current work we are using the hierarchy of the expansion basis to obtain more effective preconditioners. The CPU savings reported here are coupled with a large reduction in storage requirements, which makes the banded direct method quite competitive with the iterative method. The iterative method will still win at very high orders p, where memory restricts any direct solve, but for the orders considered here the banded method looks superior given a reasonable memory capacity.
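
For readers unfamiliar with the iterative solver used in the comparison, the following is a minimal sketch of a diagonally (Jacobi) preconditioned conjugate gradient iteration in plain Python. It is not the implementation used to produce table 4.2: the dense list-of-lists storage, the function names and the small tridiagonal test matrix (a hypothetical symmetric positive definite stand-in for the boundary system) are all assumptions made for illustration.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def jacobi_pcg(A, b, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive definite A with a diagonal preconditioner.

    Returns the approximate solution and the number of iterations taken.
    """
    n = len(b)
    b_norm = dot(b, b) ** 0.5
    x = [0.0] * n
    if b_norm == 0.0:
        return x, 0  # zero right-hand side: the zero vector is the solution
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    r = list(b)                                  # residual for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]   # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:     # relative residual test
            return x, it
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Hypothetical test problem: tridiagonal matrix with 3 on the diagonal and -1
# on the off-diagonals, and a right-hand side built from the all-ones vector.
n = 8
A = [[3.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = matvec(A, [1.0] * n)
x, iterations = jacobi_pcg(A, b)
print("iterations:", iterations)
```

Each iteration costs one matrix-vector product, so the total work depends on how many iterations are needed. With only a diagonal preconditioner that count typically grows as the system becomes worse conditioned, which is consistent with the rapid growth of the iterative timings in table 4.2.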


  
Figure 4.20: The exponential decay in the error with increasing expansion order for the Helmholtz equation solved on the spiral domain.
\begin{figure}
\centerline{
\psfig {file=/crunch/crunch7/tcew/Thesis/Figures1/Eps/spiral3dconv.ps,width=4in}
}\end{figure}


T. Warburton
10/24/1998