1. SOLVING SPARSE LEAST SQUARES PROBLEMS WITH PRECONDITIONED CGLS METHOD ON PARALLEL DISTRIBUTED MEMORY COMPUTERS
Parallel Algorithms and Applications, Volume 13, Issue 4, 1999, Pages 289-305
TIANRUO YANG, HAIXIANG LIN
Abstract:
In this paper we study the parallel aspects of PCGLS, a basic iterative method that applies the preconditioned conjugate gradient method to the normal equations, combined with the Incomplete Modified Gram-Schmidt (IMGS) preconditioner, for solving sparse least squares problems on massively parallel distributed memory computers. The performance of these methods on this kind of architecture is usually limited by the global communication required for the inner products. We describe the parallelization of PCGLS and the IMGS preconditioner with two improvements: one accumulates the results of several inner products collectively, and the other creates situations where communication can be overlapped with computation. A theoretical model of the computation and communication phases is presented, which allows us to determine the number of processors that minimizes the runtime. Several numerical experiments on the Parsytec GC/PowerPlus are presented.
ISSN: 1063-7192
DOI: 10.1080/01495739908947371
Publisher: Taylor & Francis Group
Year: 1999
Source: Taylor
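
The abstract above attributes the parallel bottleneck to the inner products of the conjugate gradient iteration on the normal equations. As a rough illustration only, here is a minimal sequential sketch of plain (unpreconditioned) CGLS in pure Python. It is not the authors' PCGLS or their IMGS preconditioner, and the helper names `matvec`, `rmatvec`, `dot`, and `cgls` are hypothetical. The two `dot` calls per iteration are the operations that would each need a global reduction on a distributed memory machine.

```python
def matvec(A, v):
    # y = A v for a dense list-of-rows matrix
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def rmatvec(A, v):
    # y = A^T v without forming the transpose
    return [sum(A[i][j] * v[i] for i in range(len(A))) for j in range(len(A[0]))]

def dot(u, v):
    # Inner product; a global reduction in a distributed setting
    return sum(a * b for a, b in zip(u, v))

def cgls(A, b, iters=50, tol=1e-12):
    """Approximately minimize ||A x - b||_2 by CG on the normal equations."""
    x = [0.0] * len(A[0])
    r = list(b)               # r = b - A x with x = 0
    s = rmatvec(A, r)         # residual of the normal equations
    p = list(s)
    gamma = dot(s, s)         # inner product 1
    for _ in range(iters):
        if gamma ** 0.5 < tol:
            break
        q = matvec(A, p)
        alpha = gamma / dot(q, q)   # inner product 2
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = rmatvec(A, r)
        gamma_new = dot(s, s)
        p = [si + (gamma_new / gamma) * pi for si, pi in zip(s, p)]
        gamma = gamma_new
    return x

# Small overdetermined system whose least squares solution is (1, 2)
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 3.0]
x = cgls(A, b)
```

In this form the two inner products sit at different points of the loop, which is why grouping them, or hiding their latency behind the matrix-vector products, is the natural target for the improvements the abstract mentions.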

2. ON THE POWER OF SOME PRAM MODELS
Parallel Algorithms and Applications, Volume 13, Issue 4, 1999, Pages 307-319
SELIM G. AKL, LIN CHEN
Abstract:
The focus here is the power of some underexplored CRCW PRAMs, which are strictly more powerful than the exclusive-write PRAM but strictly less powerful than BSR. We show that some problems can be solved more efficiently, in time and/or processor bounds, on these models. For example, we show that n linearly-ranged integers can be sorted in O(log n / log log n) time with optimal linear work on the Sum CRCW PRAM. We also show that the maximum gap problem can be solved within the same resource bounds on the Maximum CRCW PRAM. Though some models can be shown to be more powerful than others, some of them appear to have incomparable powers.
ISSN: 1063-7192
DOI: 10.1080/01495739908947372
Publisher: Taylor & Francis Group
Year: 1999
Source: Taylor
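
For readers unfamiliar with the "linearly-ranged" condition in the abstract above, the following sequential sketch may help. It is a plain counting sort, not the Sum CRCW PRAM algorithm of the paper. It only shows why integers drawn from a range proportional to n can be sorted with linear total work. The function name `counting_sort` and the range constant `c` are assumptions made for illustration.

```python
def counting_sort(values, c=1):
    """Sort integers drawn from range(c * len(values)) with O(n) sequential work."""
    n = len(values)
    counts = [0] * (c * n)          # one counter per possible value
    for v in values:
        counts[v] += 1              # tally occurrences
    out = []
    for v, k in enumerate(counts):  # emit each value as often as it occurred
        out.extend([v] * k)
    return out

result = counting_sort([3, 0, 2, 3, 1])
```

The parallel result in the abstract concerns how quickly this linear amount of work can be carried out when concurrent writes to the same cell are combined by summation.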

3. EFFICIENT MAPPING REDUCTIONS USING ISO-PLANES ON THE POLYTOPE MODEL
Parallel Algorithms and Applications, Volume 13, Issue 4, 1999, Pages 321-343
TOOMAS P. PLAKS
Abstract:
This paper presents a new technique for mapping algorithms onto regular (systolic) arrays. The technique integrates the associativity and commutativity of computations into space-time transformations on the polytope model and involves three categories of transformations: (1) iso-planes: forming iso-planes of computations for algorithm representation, in contrast to the conventional technique using the data dependence graph; (2) increase in dimensionality: mapping a low-dimensional algorithm representation into a higher-dimensional version with a higher degree of parallelism; and (3) pipestructures: generating and choosing a particular partial order of computations on iso-planes for moving data around the regular array. Three operations for generating pipestructures are introduced: permutation, rotation and reversal. The method presented here increases the available degree of parallelism and thus improves the time complexity of systolic computations. Examples of developing 2-D arrays for 1-D convolution are presented.
ISSN: 1063-7192
DOI: 10.1080/01495739908947373
Publisher: Taylor & Francis Group
Year: 1999
Source: Taylor
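
The abstract above relies on the associativity and commutativity of the summation in 1-D convolution. The sketch below is not the paper's mapping technique and builds no systolic array. It only computes the same 1-D convolution with the terms of each sum accumulated in two different orders, which is the freedom such transformations exploit. The function name `convolve_valid` and its `reverse` flag are hypothetical.

```python
def convolve_valid(x, w, reverse=False):
    """1-D convolution restricted to full overlaps.

    y[i] = sum over k of w[k] * x[i + m - 1 - k], with m = len(w).
    `reverse` changes only the order in which the terms are accumulated.
    """
    m = len(w)
    ks = range(m - 1, -1, -1) if reverse else range(m)
    return [sum(w[k] * x[i + m - 1 - k] for k in ks)
            for i in range(len(x) - m + 1)]

x = [1, 2, 3, 4]
w = [1, 0, -1]
forward = convolve_valid(x, w)
backward = convolve_valid(x, w, reverse=True)
```

With integer inputs both orders give identical results. With floating-point inputs the results can differ by rounding, which is worth keeping in mind when reordering reductions in practice.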

4. A PARALLEL FINITE ELEMENT CODE FOR NONLINEAR LEAKY AQUIFER SYSTEMS
Parallel Algorithms and Applications, Volume 13, Issue 4, 1999, Pages 345-361
GIORGIO PINI, FLAVIO SARTORETTO
Abstract:
Developing parallel codes for computing the nonlinear flow in multiaquifer porous systems is an important task, both for improving model efficiency and for performing large real-life simulations. Multiaquifer systems consist of alternating sandy and clayey layers. In this paper, highly compressible multiaquifer systems are considered, where some hydraulic parameters depend on the potential head, so the flow inside some layers is governed by nonlinear equations. An effective procedure for solving these equations is developed, relying upon the partition of the solution procedure into layer-wise steps. By assigning to each processor the computation of the flow inside a suitable set of layers, the iterative solution procedure can be efficiently implemented on a parallel supercomputer. Using such a domain decomposition strategy, a satisfactory degree of parallelization is achieved when computing the flow in a realistic nonlinear multiaquifer system on a CRAY T3D massively parallel computer. In test simulations on real-life multiaquifer systems, the recorded speed-ups are as large as 1.89, 3.34, and 5.37 with 2, 4, and 8 processors, respectively. The importance of load balance and information exchange in shaping the parallel performance of the code is also analyzed.
ISSN: 1063-7192
DOI: 10.1080/01495739908947374
Publisher: Taylor & Francis Group
Year: 1999
Source: Taylor
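
The speed-ups quoted in the abstract above can be converted to parallel efficiency using the standard definition, efficiency = speed-up / number of processors. The short sketch below does that arithmetic. The speed-up figures come from the abstract, while the names `speedups` and `efficiency` are illustrative.

```python
# Speed-ups reported in the abstract, keyed by processor count
speedups = {2: 1.89, 4: 3.34, 8: 5.37}

def efficiency(speedup, processors):
    """Parallel efficiency: fraction of ideal linear speed-up achieved."""
    return speedup / processors

efficiencies = {p: efficiency(s, p) for p, s in speedups.items()}

for p, e in efficiencies.items():
    print(f"{p} processors: efficiency {e:.3f}")
```

Efficiency falls from about 0.95 on 2 processors to about 0.67 on 8, consistent with the abstract's remark that load balance and information exchange affect performance.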