An Analysis on Some Algorithms of Fast Runge-Kutta Convolution Quadrature
Efficient Numerical Solution for Time-Domain Boundary Integral Equations with Retarded Potentials using Fast Runge-Kutta Convolution Quadrature
by Seema Rani*, Ashwani Kumar,
- Published in Journal of Advances and Scholarly Researches in Allied Education, E-ISSN: 2230-7540
Volume 13, Issue No. 1, Apr 2017, Pages 334 - 340 (7)
Published by: Ignited Minds Journals
ABSTRACT
This work addresses the efficient numerical solution of time-domain boundary integral equations with retarded potentials arising in problems of acoustic and electromagnetic scattering. The convolutional form of the time-domain boundary operators allows them to be discretized with the help of Runge-Kutta convolution quadrature. We show how this property can be exploited in a recursive algorithm so that only a few matrices are constructed together with their near-field, while for the remaining matrices only the far-field is assembled. The resulting method solves the three-dimensional wave scattering problem with asymptotically almost linear complexity. The efficiency of the approach is confirmed by extensive numerical experiments.
KEYWORD
algorithms, fast Runge-Kutta convolution quadrature, numerical solution, time-domain boundary integral equations, retarded potentials, acoustic scattering, electromagnetic scattering, convolutional form, discretize, recursive algorithm, matrices, near-field, far-field, three-dimensional wave scattering problem, asymptotically almost linear complexity, numerical experiments
INTRODUCTION
Our approach is conceptually different from the one used in the study of W. Hackbusch, W. Kress, and S. A. Sauter (2007). We construct the new method based on the recursive algorithm of linear complexity rather than on back substitution of quadratic complexity. This allows us to avoid the explicit evaluation of the convolution weights, thus enabling the use of fast techniques based on analytic expansions. The computational and storage costs of the improved algorithm scale linearly, up to logarithmic factors.
FAST RUNGE-KUTTA CQ ALGORITHM
Let us come back to the recursive algorithm. Recall that this algorithm requires the construction of O(N) Galerkin discretizations of boundary single-layer operators for the Helmholtz equation with decay. A straightforward application of data-sparse techniques (i.e. the FMM and H-matrices) would on its own lead to an algorithm of almost linear complexity. A significant drawback of this approach, however, is the large constants involved in the complexity estimates. Our goal is to design a method that reduces them. The data-sparse techniques in question have two main bottlenecks:
• the costly evaluation of singular and nearly singular integrals in the near-field;
• the high matrix-vector multiplication costs of the high-frequency FMM.
We overcome the first problem by exploiting the fast decay of the convolution weights: we show that within the whole recursive algorithm only a few matrices need to be constructed together with their near-field, while for the rest we can assemble the far-field only. To motivate this strategy, at the end of this section we demonstrate that, provided the choice between H-matrix techniques and the HF FMM is made properly for the approximation of the different matrices, the problem of high matrix-vector multiplication costs of the HF FMM ceases to exist.
MOTIVATION
The evaluation of the near-field integrals is commonly done with the help of coordinate transformation techniques. Given the kernel k(x,y) of a boundary single-layer operator (which maps from H^{-1/2}(Γ) to H^{1/2}(Γ)), evaluating the Galerkin entries with an accuracy sufficient to preserve the stability and convergence of the Galerkin method requires a quadrature order that grows for singular and nearly singular integrals (i.e. when the integration panels touch or are close to each other) and stays O(1) for well-separated panels. Thus the computation of the near-field (singular and nearly singular integrals) of one matrix is costly, and within the recursive convolution quadrature algorithm this cost would in principle be incurred for each of the O(N) operator discretizations.
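To make the distance dependence of the quadrature cost concrete, the following minimal Python sketch picks a tensor-Gauss order for a pair of triangular panels from their relative separation; the thresholds and the logarithmic growth rule are illustrative assumptions, not the quadrature prescription of the paper.

    import numpy as np

    def panel_diameter(tri):
        """Largest edge length of a triangle given as a (3, 3) array of vertices."""
        return max(np.linalg.norm(tri[i] - tri[j]) for i in range(3) for j in range(i))

    def choose_quadrature_order(tri_a, tri_b, p_far=3, p_max=10):
        """Heuristic order selection: O(1) order for well-separated panels,
        growing (here logarithmically) as the panels approach each other."""
        dist = np.linalg.norm(tri_a.mean(axis=0) - tri_b.mean(axis=0))
        diam = max(panel_diameter(tri_a), panel_diameter(tri_b))
        eta = dist / diam                  # separation relative to panel size
        if eta >= 2.0:                     # far-field pair: a cheap fixed order suffices
            return p_far
        # singular / nearly singular pair: raise the order as eta -> 0
        return min(p_max, p_far + int(np.ceil(np.log2(2.0 / max(eta, 1e-3)))))

In an actual BEM code the singular cases would additionally be treated by coordinate (Duffy-type) transformations; the sketch only illustrates how the order, and hence the cost, grows in the near-field.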
The question of the efficient evaluation of singular and nearly singular integrals was addressed in recent works. In particular, such integrals were represented as functions of several parameters and efficiently computed using interpolation and tensor decomposition techniques. The effect of the application of such techniques on the full H-matrix assembly time was studied numerically. For the Laplace boundary single-layer operator on various geometries it was demonstrated that a 50%-70% reduction of the time required for the evaluation of the nearly singular and weakly singular integrals results in only a 10%-20% reduction of the total H-matrix assembly time. Given the bound r on the ranks of the H-matrix blocks, the rest of the time is spent on the evaluation of far-field integrals within the ACA+ procedure of the H-matrix construction. If the evaluation of the far-field is done in a more efficient manner, the gain can be significantly larger, and this is the case for the fast multipole methods. The precomputation time (i.e. the time needed for the construction of the translation operators) of the HF FMM scales favourably in the number of unknowns (under the usual assumption relating the wavenumber to the discretization), and the constants involved are significantly smaller than those of the H-matrix assembly. This can be seen in the experiments of D. Brunner, M. Junge, P. Rapp, M. Bebendorf, and L. Gaul (2010), where the HF FMM precomputation times were reported to be in practice 9-20 times smaller than those of the H-matrix construction. This can also be observed in the numerical experiments in Section 2.3. In the study of M. Fischer (2004), the time to compute the near-field for the HF FMM accelerated Burton-Miller formulation is compared to the time needed to construct the corresponding HF FMM translation matrices. The results show that for BEM discretizations with 10^3 and more triangular boundary elements the computation of the near-field is typically an order of magnitude slower than the assembly of the translation matrices. The actual constants, however, depend strongly on the implementation and the desired accuracy. Nevertheless, for large problems we should be able to see an improvement if we skip constructing the near-field. The asymptotic complexity estimates are improved as well. Indeed, while the application of ACA/ACA+ based H-matrix techniques requires the evaluation of four-dimensional integrals, the use of the HF FMM in the far-field requires only the evaluation of two-dimensional integrals (for the cluster basis). We perform this step not during the precomputation stage but when computing matrix-vector products; this avoids storing the cluster basis for all matrices and thus improves the memory costs. Since in the course of the recursive algorithm the matrix-vector multiplication with the same matrix block is performed multiple times, it makes sense to precompute the corresponding discretizations of the boundary integral operators and keep them in memory, rather than recompute them every time a matrix-vector multiplication is needed. For the matrices that are approximated with the help of the fast multipole method, the near-field and the translation operators can be stored. If only a small fraction of the matrices carries a near-field, the storage costs of the HF FMM approximated matrices are affected as well: for an HF FMM approximation, the storage for the far-field part (the translation matrices of the FMM) and the storage for the near-field differ only by a logarithmic term.
A noticeable improvement in the storage costs is therefore achieved only if the constants in the far-field storage estimate are so small that, even for rather large M, the far-field part does not dominate the overall storage. As our numerical experiments in Section 4 show, in practice this is often the case. The presence of decay, i.e. the case of frequencies with a large real part, facilitates the reduction of the storage costs: if the real part of the frequency is large enough, a large part of the far-field of the corresponding matrices is negligibly small; see also Figure 1.

Figure 1: Frequencies s for which discretizations of boundary single-layer operators need to be constructed; they arise as eigenvalues of the scaled Runge-Kutta characteristic matrix (here h = 1). While many frequencies are located close to the imaginary axis, a significant part of the frequencies has a large real part (the high-decay case), and a large part of the far-field of the corresponding matrices is negligibly small.
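The frequencies in Figure 1 can be reproduced from standard Runge-Kutta convolution quadrature relations: they are the eigenvalues of the characteristic matrix Delta(zeta) = (A + zeta/(1 - zeta) * 1 b^T)^(-1), divided by the time step h, evaluated at scaled roots of unity. The Python sketch below does this for the 2-stage Radau IIA method; the contour radius and the threshold for the "high-decay" regime are illustrative choices, not values taken from the paper.

    import numpy as np

    # Butcher data of the 2-stage Radau IIA method (a common choice for RK-CQ)
    A = np.array([[5.0 / 12.0, -1.0 / 12.0],
                  [3.0 / 4.0,   1.0 / 4.0]])
    b = np.array([3.0 / 4.0, 1.0 / 4.0])
    one = np.ones(2)

    def Delta(zeta):
        """RK-CQ characteristic matrix Delta(zeta) = (A + zeta/(1-zeta) * 1 b^T)^(-1)."""
        return np.linalg.inv(A + zeta / (1.0 - zeta) * np.outer(one, b))

    def cq_frequencies(N, h, radius=None):
        """Frequencies s at which Helmholtz problems with decay have to be solved:
        eigenvalues of Delta(zeta_l)/h for zeta_l on a circle of radius < 1."""
        if radius is None:
            radius = np.finfo(float).eps ** (0.5 / (N + 1))  # common heuristic choice
        s = []
        for l in range(N + 1):
            zeta_l = radius * np.exp(2j * np.pi * l / (N + 1))
            s.extend(np.linalg.eigvals(Delta(zeta_l)) / h)
        return np.array(s)

    s = cq_frequencies(N=64, h=1.0)
    print(np.sum(s.real > 5.0), "of", s.size, "frequencies lie in the high-decay regime")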
NEAR-FIELD REUSE
Auxiliary Relations on Leaves of a Block-Cluster Tree -
Before describing our strategy for dealing with the near-field, we introduce two auxiliary relations defined on the leaves of a block-cluster tree, namely near-field d-admissibility and far-field D-admissibility. Recall that for a cluster we denote, as usual, the center of its bounding box and the diameter of this bounding box.

Definition 1. Given d > 0, a leaf is called near-field d-admissible if the bounding boxes of its two clusters lie within distance d of each other.

Definition 2. Given D > 0, a leaf is called far-field D-admissible if the bounding boxes of its two clusters are separated by a distance of at least D.

Remark 1. Two elementary consequences of these definitions will be used below: near-field d-admissibility bounds the distance between the clusters of a leaf by d, and the failure of far-field D-admissibility bounds this distance by D.

Below we will work with the set of near-field d-admissible leaves of a block-cluster tree and with the set of far-field D-admissible leaves.

Remark 2. The following observation is crucial for our algorithm. Consider the set of all non-admissible block-clusters (leaves) of the block-cluster tree. Then it is possible to choose d such that
every non-admissible leaf of the block-cluster tree is near-field d-admissible.     (1)
This follows from the definition of the admissibility condition: non-admissible leaves satisfy a bound on the distance between the centers of the bounding boxes of their two clusters in terms of the diameters of these boxes. The choice
(2)
ensures that (1) holds true. Now we have all the ingredients needed to describe fast Runge-Kutta convolution quadrature. Consider the matrix-vector product, namely
(3)
After the discretization in space with the help of the Galerkin method (with trial and test basis functions), the above system of equations can be rewritten as:
(4)
where the kernel function is given by
(5)
Let d be chosen as in (2). The double integral in (4) can be split into a sum of two double integrals: one over the leaves of the block-cluster tree belonging to the set of near-field d-admissible leaves, and the other being the remainder. Namely,
(6)
(7)
Here the second term (7) does not contain the near-field, since all non-admissible block-clusters belong to the set of near-field d-admissible leaves. Since the convolution weights are matrix-valued functions, the quantities appearing in (6) and (7) are tensors.
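Since the precise admissibility inequalities of Definitions 1 and 2 did not survive in this excerpt, the following Python sketch only illustrates the kind of bounding-box predicates they describe and how such predicates induce the splitting (6)-(7); the function names, the distance-based conditions, and the data layout (clusters given as index arrays into a vertex cloud) are assumptions made for illustration.

    import numpy as np

    def bbox_distance(pts_tau, pts_sigma):
        """Distance between the axis-aligned bounding boxes of two point sets
        (zero if the boxes overlap)."""
        lo1, hi1 = pts_tau.min(axis=0), pts_tau.max(axis=0)
        lo2, hi2 = pts_sigma.min(axis=0), pts_sigma.max(axis=0)
        gap = np.maximum(0.0, np.maximum(lo1 - hi2, lo2 - hi1))
        return np.linalg.norm(gap)

    def is_near_field_admissible(pts_tau, pts_sigma, d):
        # assumed form of Definition 1: the clusters lie within distance d
        return bbox_distance(pts_tau, pts_sigma) <= d

    def is_far_field_admissible(pts_tau, pts_sigma, D):
        # assumed form of Definition 2: the clusters are at least D apart
        return bbox_distance(pts_tau, pts_sigma) >= D

    def split_leaves(leaves, vertices, d):
        """Split the leaves of a block-cluster tree into the near-field set
        entering (6) and the remainder entering (7)."""
        near, rest = [], []
        for tau, sigma in leaves:        # tau, sigma: index arrays of the clusters
            if is_near_field_admissible(vertices[tau], vertices[sigma], d):
                near.append((tau, sigma))
            else:
                rest.append((tau, sigma))
        return near, rest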
First, we demonstrate why such a splitting may improve the storage and computational costs. The decay bounds show that, for any given accuracy ε, there exists an L such that, for all indices beyond L,
(8)
We assume w.l.o.g. that
(9)
Then some of the elements of the tensors are approximately equal to zero. To see this, recall that the boundary single-layer operator for the Laplacian is continuous from H^{-1/2}(Γ) to H^{1/2}(Γ). Hence, for a constant that depends only on the geometry, it holds:
(10)
where the scaling factors are the measures of the supports of the trial and test basis functions, meas(supp(·)), i, j = 1, ..., M. Then ε can always be chosen so that, up to the desired precision, the near-field contribution can be rewritten as
(11)
where
(12)
Hence, to approximate the near-field part of the matrix of the system (3) completely, only O(L) Galerkin matrices need to be constructed. In practice we do not assemble these matrices explicitly, but rather evaluate the matrix-vector product with the near-field with the help of either of the two procedures presented below. Before describing these procedures, we would like to show that L is bounded by a constant and does not depend on the size of the system. Recall that the diameter of the non-admissible clusters is bounded by a constant multiple of the meshwidth h (this is by construction of the admissible block-cluster tree). Hence, by (2), d is bounded by a constant multiple of h as well; importantly, this constant does not depend on h. The estimate on L is obtained by fixing the accuracy ε a priori: L can then be estimated in terms of constants that are independent of the discretization, so that for a fixed accuracy L stays bounded.

Remark 3. Increasing the value of d allows a part of the far-field to be reused as well.

Remark 4. We do not address here the question of how ε has to be chosen to preserve the stability and convergence of the method. A full analysis would require combining the relevant estimates, among them those of L. Banjai and S. Sauter (2008). In particular, it has been shown that the convergence of the sparse BDF2 convolution quadrature is preserved if the convolution weights are cut off with an accuracy ε satisfying a suitable condition. We expect similar estimates to hold in our case as well, since all the errors enter linearly, and the bounds for the errors and operator norms depend on h polynomially or as powers (positive or negative) of h.

Next, the question of the efficient evaluation of a matrix-vector product with the system (11) is addressed. We suggest the use of either of two methods.
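The two procedures themselves are not reproduced in this excerpt; as a rough illustration of the idea, the Python sketch below reuses the O(L) stored near-field matrices across all time steps of a block-Toeplitz multiplication. The name multiply_nf and the data layout are hypothetical.

    import numpy as np

    def multiply_nf(nf_mats, x):
        """Near-field part of the block-Toeplitz product y_n = sum_j W_{n-j} x_j.

        nf_mats : list with the near-field parts of the first L convolution-weight
                  matrices (sparse in practice); weights with index >= L are
                  skipped, since by the decay argument their near-field is
                  negligible up to the chosen accuracy.
        x       : array of shape (N, M) holding the vectors x_0, ..., x_{N-1}.
        """
        N, M = x.shape
        L = len(nf_mats)
        y = np.zeros((N, M), dtype=x.dtype)
        for n in range(N):
            for j in range(max(0, n - L + 1), n + 1):  # only the last L terms matter
                y[n] += nf_mats[n - j] @ x[j]
        return y

For the sketch the cost is that of at most N·L near-field matrix-vector products.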
In this section we would like to address several questions on the application of data-sparse techniques, namely H-matrices and the high-frequency fast multipole method, for approximating Galerkin discretizations of boundary integral operators in the course of the recursive algorithm. Recall that the setup time (i.e. the matrix assembly time) of matrices that use expansions coming from the HF FMM is much smaller than that of H-matrices. However, the corresponding HF FMM accelerated matrix-vector multiplications are significantly slower than matrix-vector multiplications with H-matrices, even for large discretizations. The structure of the system of equations we need to solve is shown in Figure 2.
Figure 2: The structure of the matrix of the convolution quadrature system of equations.
The solution of the small triangular system of size J (in which only a few of the matrices are involved) has to be performed many times. Since this operation requires the construction of only a few matrices and the performance of many matrix-vector multiplications with them, it makes sense to approximate these matrices by H-matrix techniques. Additionally, matrix-vector multiplications with the matrix blocks at the lower levels of the recursive algorithm (in the figure these blocks are marked by T_i) need to be performed more often than those with the matrix blocks located at the higher levels of the recursive algorithm (T_0). Hence, for large problems it is reasonable to employ pure H-matrix approximations in this case, while for the rest of the Toeplitz blocks the choice between an H-matrix based and an HF FMM based approximation is made individually; see the sketch below. An advantage of the recursive algorithm is that it is easily parallelizable: the precomputation of the Galerkin discretizations of the boundary integral operators can be done independently in parallel.
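A schematic version of this block-wise choice, with purely illustrative thresholds (the paper gives no explicit ones), could look as follows.

    def choose_block_approximation(n_mat_vecs, n_dofs,
                                   many_mat_vecs=64, large_problem=100_000):
        """Hypothetical rule of thumb mirroring the strategy above: blocks that
        are multiplied very often (low recursion levels, T_i) get H-matrices,
        whose matrix-vector product is fast although the setup is expensive;
        rarely multiplied blocks of a large problem (high levels, T_0) get the
        HF FMM, whose setup is cheap."""
        if n_mat_vecs >= many_mat_vecs:
            return "H-matrix"
        if n_dofs >= large_problem:
            return "HF FMM"
        return "H-matrix"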
FAST CQ ALGORITHM AND ITS COMPLEXITY
In this section the fast Runge-Kutta convolution quadrature algorithm is described. Compared to the conventional recursive algorithm, the multiplication with the Toeplitz matrix blocks is replaced by an improved procedure built from two routines:
MultiplyNF - performs the matrix-vector multiplication with the near-field;
MultiplyFF - performs the matrix-vector multiplication with the far-field.
Let the parameter J be fixed: every system of size smaller than J is solved directly.

function Solve
  if the system size is smaller than J then
    SolveBasic
  else
    Solve         (first half of the system)
    MultiplyNF
    MultiplyFF
    Solve         (second half of the system)
  end if
end function
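The following Python sketch shows the overall recursive structure under simplifying assumptions: dense NumPy arrays stand in for the H-matrix and HF FMM approximations, and the convolution-weight matrices are assumed to be precomputed and already split into near-field and far-field parts. The function fast_cq_solve and its argument names are hypothetical; they merely mirror the Solve / SolveBasic / MultiplyNF / MultiplyFF procedures above.

    import numpy as np

    def fast_cq_solve(W_nf, W_ff, g, J):
        """Recursive solution of the block-lower-triangular Toeplitz system
        sum_{j<=n} W_{n-j} x_j = g_n, with the Toeplitz multiplications split
        into near-field and far-field parts.

        W_nf, W_ff : lists of length N; W_nf[k] is the near-field part of the
                     k-th weight matrix (None for k >= L, where it is dropped),
                     W_ff[k] is its far-field part (dense stand-ins here).
        g          : right-hand side of shape (N, M).
        J          : systems of size <= J are solved directly.
        """
        N, M = g.shape
        x = np.zeros((N, M))
        rhs = g.copy()

        def weight(k):                    # full k-th weight = near + far field
            nf = W_nf[k] if W_nf[k] is not None else 0.0
            return nf + W_ff[k]

        def solve_basic(lo, hi):          # direct forward substitution
            for n in range(lo, hi):
                acc = rhs[n] - sum(weight(n - j) @ x[j] for j in range(lo, n))
                x[n] = np.linalg.solve(weight(0), acc)

        def multiply_nf(lo, mid, hi):     # coupling via the few stored near-fields
            for n in range(mid, hi):
                for j in range(lo, mid):
                    if W_nf[n - j] is not None:
                        rhs[n] -= W_nf[n - j] @ x[j]

        def multiply_ff(lo, mid, hi):     # coupling via the far-field (FMM/H-matrix in practice)
            for n in range(mid, hi):
                for j in range(lo, mid):
                    rhs[n] -= W_ff[n - j] @ x[j]

        def solve(lo, hi):
            if hi - lo <= J:
                solve_basic(lo, hi)
            else:
                mid = (lo + hi) // 2
                solve(lo, mid)
                multiply_nf(lo, mid, hi)
                multiply_ff(lo, mid, hi)
                solve(mid, hi)

        solve(0, N)
        return x

In the actual method, multiply_ff would call the HF FMM or an H-matrix matrix-vector product and solve_basic would use the precomputed small triangular system; the naive dense operations above only fix the control flow.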
The improved procedure performs an extra block matrix-vector multiplication with the near-field matrices. The computational complexity of each matrix-vector multiplication with a near-field matrix is (almost) linear in the number of spatial unknowns; the precise estimate depends on whether the near-field matrix-vector multiplication with the diagonalization is used, see Remark 4. Multiplying this cost by the number of matrix blocks in (4) gives the total complexity of the near-field related matrix-vector multiplications. The number of matrix-vector multiplications with the far-field matrices is of the same order, and each of these multiplications has almost linear cost, with a hidden constant that depends on the accuracy of the approximation. Hence the total complexity of the algorithm is almost linear, up to logarithmic factors. The memory costs for the near-field matrices scale linearly, while for the rest of the matrices they scale almost linearly; as before, for the matrices with the far-field the hidden constant in this estimate depends on the accuracy of the approximation. The construction times for the H-matrices are governed by the complexity of the evaluation of the BEM integrals, see also the discussion above; with the quadrature technique we use, this complexity does not deteriorate. The construction times for the FMM-based far-field approximations scale more favourably. The hidden constants in these complexity estimates depend on the accuracy of the matrix approximations. Combined with the use of the data-sparse techniques and the above complexity estimates, the computational complexity of the Solve procedure is not worse than the time needed to construct the matrices, and the storage costs scale almost linearly, up to logarithmic factors.
CONCLUSION
In this work we constructed a fast recursive Runge-Kutta convolution quadrature algorithm for the solution of the wave scattering problem in three dimensions. This method requires the construction of Galerkin discretizations of boundary integral operators for the Helmholtz equation with decay. Fast Runge-Kutta convolution quadrature is based on two ingredients: the use of fast data-sparse techniques, namely the high-frequency fast multipole method and H-matrices, and the decay properties of the Runge-Kutta convolution weights. The exponentially fast decay of the convolution weights allows us to skip constructing the diagonal and near-diagonal matrix blocks for most of the boundary integral operator discretizations, thus avoiding the evaluation of many singular and nearly singular BEM integrals.
REFERENCES
Books-
M. Schanz (2001b). Wave Propagation in Viscoelastic and Poroelastic Continua: A Boundary Element Approach, vol. 2 of Lecture Notes in Applied Mechanics. Springer-Verlag, Berlin, Heidelberg, New York.
P. W. Partridge, C. A. Brebbia, and L. C. Wrobel (1992). The Dual Reciprocity Boundary Element Method. Computational Mechanics Publications, Southampton.
S. A. Sauter and C. Schwab (2013). Boundary Element Methods, vol. 39 of Springer Series in Computational Mathematics. Springer, Berlin.
Research Papers-
D. Brunner, M. Junge, P. Rapp, M. Bebendorf, and L. Gaul (2010). Comparison of the fast multipole method with hierarchical matrices for the Helmholtz-BEM. CMES: Computer Modeling in Engineering & Sciences, 58, pp. 131-160.
Ch. Lubich and A. Ostermann (1993). Runge-Kutta methods for parabolic equations and convolution quadrature. Mathematics of Computation, 60, pp. 105-131.
L. Banjai and S. Sauter (2008). Rapid solution of the wave equation in unbounded domains. SIAM J. Numerical Analysis, 47, pp. 227-249.
R. Laliena and F.-J. Sayas (2009). Theoretical aspects of the application of convolution quadrature to scattering of acoustic waves. Numer. Math., 112, pp. 637-678.
W. Hackbusch, W. Kress, and S. A. Sauter (2007). Sparse convolution quadrature for time domain boundary integral formulations of the wave equation by cutoff and panel-clustering. 29, pp. 113-134.
M. Fischer (2004). The Fast Multipole Boundary Element Method and its Application to Structure- Acoustic Field Interaction, PhD thesis, University of Stuttgart.
Corresponding Author Seema Rani*
Research Scholar of OPJS University, Churu, Rajasthan