This book constitutes the refereed proceedings of the Third International Euro-Par Conference, held in Passau, Germany, in August 1997.
The 178 revised papers presented were selected from more than 300 submissions on the basis of 1101 reviews. The papers are organized in accordance with the conference workshop structure in tracks on support tools and environments, routing and communication, automatic parallelization, parallel and distributed algorithms, programming languages, programming models and methods, numerical algorithms, parallel architectures, HPC applications, scheduling and load balancing, performance evaluation, instruction-level parallelism, database systems, symbolic computation, real-time systems, and an ESPRIT workshop.
Author(s): Paul Feautrier (auth.), Christian Lengauer, Martin Griebl, Sergei Gorlatch (eds.)
Series: Lecture Notes in Computer Science 1300
Edition: 1
Publisher: Springer-Verlag Berlin Heidelberg
Year: 1997
Language: English
Pages: 1382
Tags: Software Engineering/Programming and Operating Systems; Computer Systems Organization and Communication Networks; Mathematics of Computing; Algorithm Analysis and Problem Complexity; Computational Mathematics and Numerical Analysis
Basis of parallel speculative execution....Pages 1-14
Unifying theories for parallel programming....Pages 15-30
Automatic parallelization of irregular and pointer-based computations: Perspectives from logic and constraint programming....Pages 31-45
Static and dynamic data management in networks....Pages 46-56
Iterative algorithms on high performance architectures....Pages 57-71
A performance tuning approach for shared-memory multiprocessors....Pages 72-83
Workshop 01: Support tools and environments....Pages 85-88
Nova visualization for optimization of data-parallel programs....Pages 89-93
On correcting the intrusion of tracing non-deterministic programs by software....Pages 94-101
Using control and data flow analysis for race evaluation....Pages 102-109
Client server computing on message passing systems: Experiences with PVM-RPC....Pages 110-117
Exdasy — A user-friendly and extendable data distribution system....Pages 118-127
Interconnecting multiple heterogeneous parallel application components....Pages 128-139
EDPEPPS: An integrated graphical toolset for the design and performance evaluation of portable parallel software....Pages 140-149
Load balancing based on process migration for MPI....Pages 150-157
A processors management system for PVM....Pages 158-161
A full program control flow representation for real programs....Pages 162-165
Workshop 02: Routing and communication in interconnection networks....Pages 167-170
Efficient total-exchange in wormhole-routed toroidal cubes....Pages 171-175
An analysis of deflection-based wormhole routing with virtual channels....Pages 176-187
Wormhole deadlock prediction....Pages 188-195
Broadcast and associative operations on fat-trees....Pages 196-207
On the fault tolerance of fat-trees....Pages 208-217
Minimal routing in the triangular grid and in a family of related tori....Pages 218-225
Embedding complete k -ary Trees into 2-dimensional meshes and tori....Pages 226-233
Optimal gossip in store-and-forward noncombining 2-D tori....Pages 234-241
Cutwidth of the mesh of d -ary trees....Pages 242-245
Embedding and emulation results for static multichannel mesh of optical buses....Pages 246-249
Routing on asyncronous processor networks....Pages 250-257
The complexity of shortest path and dilation bounded interval routing....Pages 258-265
Finding a pair on a mesh with multiple broadcasting is hard....Pages 266-271
Routing on the PADAM: Degrees of optimality....Pages 272-279
Workshop 03: Automatic parallelization and high-performance compilers....Pages 281-284
Handling memory cache policy with integer points countings....Pages 285-293
A graphical tool for automatic parallelization and scheduling of programs on multiprocessors....Pages 294-301
Identifying critical loads in real programs for decoupled VSM systems....Pages 302-305
Runtime interprocedural data placement optimisation for lazy parallel libraries (extended abstract)....Pages 306-309
A technique for mapping sparse matrix computations into regular processor arrays....Pages 310-317
A relational approach to the compilation of sparse matrix programs....Pages 318-327
Solutions to the communication minimization problem for affine recurrence equations....Pages 328-337
Dependence-free clustering of shift-invariant data structures....Pages 338-341
Experiences in analyzing data dependences for programs with pointers and structures....Pages 342-346
Applicability of program comprehension to sparse matrix computations....Pages 347-351
Hamiltonian recurrence for ILP....Pages 352-355
Optimizing storage size for static control programs in automatic parallelizers....Pages 356-363
Optimal distribution assignment placement....Pages 364-373
Workshop 04+08+13: Parallel and distributed algorithms....Pages 375-378
Parallel merge sort on concurrent-read owner-write PRAM....Pages 379-383
Feasible models of computation: Three-dimensionality and energy consumption....Pages 384-388
Sample sort on meshes....Pages 389-398
Sorting on a massively parallel system using a library of basic primitives: Modeling and experimental results....Pages 399-408
Parallel priority Queue and list contraction: The BSP approach....Pages 409-416
Priority queue operations on EREW-PRAM....Pages 417-420
Concurrent rebalancing of AVL trees: A fine-grained approach....Pages 421-429
NC approximation algorithms for 2-connectivity augmentation in a graph....Pages 430-439
Approximating scheduling problems in parallel....Pages 440-449
A new staircase separator theorem....Pages 450-457
Tentative time warp....Pages 458-467
Synchronized DSM models....Pages 468-475
A space-efficient and self-stabilizing depth-first token circulation protocol for asynchronous message-passing systems....Pages 476-479
Distributed self-stabilizing algorithm for minimum spanning tree construction....Pages 480-487
Partly-consistent cuts of databases....Pages 488-495
Exploiting atomic broadcast in replicated databases (extended abstract)....Pages 496-503
Workshop 05+06: Programming languages and concurrent object-oriented programming....Pages 505-510
Synchronising asynchronous communications....Pages 511-520
Typechecking of Pei expressions....Pages 521-529
Functional parallel programming with explicit processes: Beyond SPMD....Pages 530-537
Testing semantics for unbounded nondeterminism....Pages 538-545
An efficient compilation framework for languages based on a concurrent process calculus....Pages 546-553
Behavioural types for a calculus of concurrent objects....Pages 554-561
Time in message sequence charts: A formal approach....Pages 562-566
Integrating an entry consistency memory model and concurrent object-oriented programming....Pages 567-571
Modeling the dynamic behavior of objects on events, messages and methods (extended abstract)....Pages 572-575
A quality design solution for object synchronization....Pages 576-580
NeXeme: A distributed scheme based on Nexus....Pages 581-590
Athapascan runtime: Efficiency for irregular problems....Pages 591-600
Optimization of out-of-core computations using chain vectors....Pages 601-608
Workshop 07: Programming models and methods....Pages 609-613
Parlists — A generalization of powerlists....Pages 614-618
Skeletons for data parallelism in p31....Pages 619-628
Embodying parallel functional skeletons: An experimental implementation on top of MPI....Pages 629-633
On dividing and conquering independently....Pages 634-637
M-Tree: A parallel abstract data type for block-irregular adaptive applications....Pages 638-649
A monadic calculus for parallel costing of a functional language of arrays....Pages 650-661
A methodology for deriving parallel programs with a family of parallel abstract machines....Pages 662-669
Parallel distributed programming with Haskell+PVM....Pages 670-677
A parallelisation approach for supporting scalable and portable computing....Pages 678-682
Workshop 09: Parallel numerical algorithms....Pages 683-687
Scalability of parallel sparse Cholesky factorization....Pages 688-699
Optimal parallel algorithms for solving tridiagonal linear systems....Pages 700-709
Robust parallel Lanczos methods for clustered eigenvalues....Pages 710-717
A fully parallel symmetric matrix transformation....Pages 718-721
Numerical experiments with a parallel fast direct elliptic solver on Cray T3E....Pages 722-725
New matrix-by-vector multiplications based on a nonoverlapping domain decomposition data distribution....Pages 726-733
A comparison between different parallelization methods on workstation clusters to solve CFD-problems....Pages 734-741
Scalable parallel SSOR preconditioning for lattice computations in gauge theories....Pages 742-749
Deteriorating convergence for asynchronous methods on linear least squares problems....Pages 750-759
Workshops 10+11+14: Parallel computer architecture and image processing....Pages 761-765
The Delft-Java engine: An introduction....Pages 766-770
Scheduling instructions with uncertain latencies in asynchronous architectures....Pages 771-778
Co-processor system design for fine-grain message handling in KUMP/D....Pages 779-788
A virtual-physical on-chip cache for shared memory multiprocessors....Pages 789-792
Shared vs. snoop: Evaluation of cache structure for single-chip multiprocessors....Pages 793-797
Morphological hough transform on the instruction systolic array....Pages 798-806
An analytical design of high-speed pixel transformation for object boundary enhancement....Pages 807-814
Karhünen-Loève transform: An exercise in simple image-processing parallel pipelines....Pages 815-819
Use of F-code as a very high level intermediate laguage for DSP....Pages 820-823
Workshop 12: Applications of high-performance computing....Pages 825-827
Experiments on using WPVM for industrial visual inspection problems....Pages 828-831
Object-oriented parallel software for radio wave propagation simulation in urban environment....Pages 832-839
A portable parallel implementation of a 3D semiconductor device simulator....Pages 840-847
A parallel sparse LU decomposition with application to semiconductor device simulation....Pages 848-851
A parallel simulation of a quantitative large-strain polycrystal deformation....Pages 852-855
Parallel genetic algorithms applied to optimum shape design in aeronautics....Pages 856-863
Parallel multidimensional calculation of steady-state and time-dependent flows with combustion....Pages 864-871
A two-level parallel strategy for rotorcraft optimization and design....Pages 872-875
Workshop 15: Scheduling and load balancing....Pages 877-881
Performance comparison of load balancing policies based on a diffusion scheme....Pages 882-886
Effectively scheduling parallel tasks and communications on networks of workstations....Pages 887-894
On linear schedules of task graphs for generalized logp-machines....Pages 895-904
Rescheduling support for mapping dynamic scientific computation onto distributed memory multiprocessors....Pages 905-912
Versatile task scheduling of binary trees for realistic machines....Pages 913-921
Load balancing issues in the prepartitioning method....Pages 922-936
Design of novel load-balancing algorithms with implementations on an IBM SP2....Pages 937-944
Repartitioning of adaptive meshes: Experiments with multilevel diffusion....Pages 945-949
On the embedding of refinements of 2-dimensional grids....Pages 950-957
Dynamic program description as a basis for runtime optimization....Pages 958-965
Workshop 16: Performance evaluation and prediction....Pages 967-970
Workload analysis of computation intensive tasks: Case study on SPEC CPU95 benchmarks....Pages 971-984
Statistical performance modeling: Case study of the NPB 2.1 results....Pages 985-992
A general performance model for multistage interconnection networks....Pages 993-1000
Simulation of a routing algorithm using distributed simulation techniques....Pages 1001-1008
Message-passing performance of parallel computers....Pages 1009-1016
Prefetching and multithreading performance in bus-based multiprocessors with Petri Nets....Pages 1017-1024
On synchronisation in fault-tolerant data and compute intensive programs over a network of workstations....Pages 1025-1029
Performance analysis of a parallel program for wave propagation simulation....Pages 1030-1033
Bounding the minimal completion time of static mappings of multithreaded solaris programs....Pages 1034-1038
Workshop 17: Instruction-level parallelism....Pages 1039-1042
The performance potential of value and dependence prediction....Pages 1043-1052
An enhanced two-level adaptive multiple branch prediction for superscalar processors....Pages 1053-1060
The effect of the speculation depth on the performance of superscalar architectures....Pages 1061-1065
Allocating lifetimes to queues in software pipelined architectures....Pages 1066-1073
Treegion scheduling for highly parallel processors....Pages 1074-1078
Modulo scheduling with cache reuse information....Pages 1079-1083
Memory address prediction for data speculation....Pages 1084-1091
A realistic study on multithreaded superscalar processor design....Pages 1092-1101
A limitation study into access decoupling....Pages 1102-1111
Workshop 18: Parallel and distributed database systems....Pages 1113-1116
Load balanced query evaluation in shared-everything environments....Pages 1117-1124
Exploring load balancing in parallel processing of recursive queries....Pages 1125-1129
Use of a semantically grained database system for distribution and control within design environments....Pages 1130-1134
Method transformations for vertical partitioning in parallel and distributed object databases....Pages 1135-1143
Benchmarking and performance tuning of multimedia servers....Pages 1144-1153
Database program mapping onto a shared-nothing multiprocessor architecture: Minimizing communication costs....Pages 1154-1158
Large join order optimization on parallel shared-nothing database machines using genetic algorithms....Pages 1159-1163
Workshop 19: Symbolic computation....Pages 1165-1168
Using the parallel Karatsuba algorithm for long integer multiplication and division....Pages 1169-1172
Towards full prolog on a distributed architecture....Pages 1173-1180
Improving distributed unification through type analysis....Pages 1181-1190
Static granularity optimization of a committed-choice language Fleng....Pages 1191-1200
Distributed arrays in the functional language concurrent clean....Pages 1201-1208
Design and implementation of parallel TRAM....Pages 1209-1216
Changing the distribution depth during a parallel tree search....Pages 1217-1220
Workshop 19: Symbolic Computation Abstract And-Parallel Machines....Pages 1221-1225
Workshop 20: Real-time systems and constraints....Pages 1227-1230
Designing an embedded hard real-time system: A case study....Pages 1231-1235
Reactive real-time programming with distributed agents....Pages 1236-1243
An Ml -like module system for the synchronous language Signal ....Pages 1244-1253
Synchronous thread management in a distributed operating system's micro kernel....Pages 1254-1261
Task-system analysis using slope-parametric hybrid automata....Pages 1262-1273
A methodology for compilation of high-integrity real-time programs....Pages 1274-1281
Schedulers for age constraint tasks and their performance evaluation....Pages 1282-1289
Analyzing schedulability of astral specifications using extended timed automata....Pages 1290-1297
Deriving annotations for tight calculation of execution time....Pages 1298-1307
Cinderella: A retargetable environment for performance analysis of real-time software....Pages 1308-1315
Esprit projects on high-performance computing and networking....Pages 1317-1320
PHASE and MICA: Application specific metacomputing....Pages 1321-1326
FRONTIER: Use of HPCN technologies....Pages 1327-1332
PINEAPL: A European project to develop a parallel numerical library for industrial applications....Pages 1333-1339
RAIN: Redundant array of inexpensive workstations for neurocomputing....Pages 1340-1345
PARSAR: Parallelisation of a chirp scaling algorithm SAR processor....Pages 1346-1350
OCEANS: Optimizing compilers for embedded applications....Pages 1351-1356
SEEDS — Simulation environment for the evaluation of distributed traffic control systems....Pages 1357-1362
EFTOS: A software framework for more dependable embedded HPC applications....Pages 1363-1368
STAMPAR: A parallel processing approach for the explicit dynamic analysis of sheet stamping problems....Pages 1369-1374