Parallel and Distributed Processing: 15 IPDPS 2000 Workshops Cancun, Mexico, May 1–5, 2000 Proceedings (Lecture Notes in Computer Science, 1800)

This volume contains the proceedings from the workshops held in conjunction with the IEEE International Parallel and Distributed Processing Symposium, IPDPS 2000, on 1-5 May 2000 in Cancun, Mexico. The workshops provide a forum for bringing together researchers, practitioners, and designers from various backgrounds to discuss the state of the art in parallelism. They focus on different aspects of parallelism, from runtime systems to formal methods, from optics to irregular problems, from biology to networks of personal computers, from embedded systems to programming environments. The following workshops are represented in this volume:

- Workshop on Personal Computer Based Networks of Workstations
- Workshop on Advances in Parallel and Distributed Computational Models
- Workshop on Par. and Dist. Comp. in Image, Video, and Multimedia
- Workshop on High-Level Parallel Prog. Models and Supportive Env.
- Workshop on High Performance Data Mining
- Workshop on Solving Irregularly Structured Problems in Parallel
- Workshop on Java for Parallel and Distributed Computing
- Workshop on Biologically Inspired Solutions to Parallel Processing Problems
- Workshop on Parallel and Distributed Real-Time Systems
- Workshop on Embedded HPC Systems and Applications
- Reconfigurable Architectures Workshop
- Workshop on Formal Methods for Parallel Programming
- Workshop on Optics and Computer Science
- Workshop on Run-Time Systems for Parallel Programming
- Workshop on Fault-Tolerant Parallel and Distributed Systems

All papers published in the workshops proceedings were selected by the program committee on the basis of referee reports. Each paper was reviewed by independent referees who judged the papers for originality, quality, and consistency with the themes of the workshops.

Author(s): Jose Rolim (editor)
Publisher: Springer
Year: 2000

Language: English
Pages: 1317

Lecture Notes in Computer Science
Parallel and Distributed Processing
Volume Editors
Foreword
Contents
Memory Management in a combined VIA/SCI Hardware
1 Motivation and Introduction
2 What are the Memory Management Considerations?
3 PCI–SCI vs. VIA discussion and comparison
3.1 Question 1: How a process' memory area is made available to the NIC and in what way main memory is protected against wrong accesses?
3.2 Question 2: At which point in the system a DMA engine is working and how are the transactions of this DMA engine validated?
3.3 Question 3: In which way memory of a process on a remote node is made accessible for a local process?
4 A new PCI–SCI Architecture with VIA Approaches
4.1 Advanced Memory Management
4.2 Operation of Distributed Shared Memory from a memory-related point of view
4.3 Operation of Protected User-Level Remote DMA from a memory-related point of view
4.4 A free choice of using either Programmed I/O or User-Level Remote DMA
5 Influence on MPI Libraries
6 State of the project (November 1999)
7 Other Works on SCI and VIA
8 Conclusions and Outlook
References
ATOLL, a new switched, high speed Interconnect in Comparison to Myrinet and SCI
1 Introduction
2 Design Space for Network Interfaces
3 NIC Hardware Layout and Design
3.1 ATOLL
3.2 Myrinet
3.3 Scalable Coherent Interface (SCI)
4 Software
4.1 Low Level API
4.2 Upper Software Layer for Communication
4.3 Communication Overhead
5 Conclusion
References
ClusterNet: An Object-Oriented Cluster Network
1 Introduction
2 ClusterNet
2.1 Functionality: Router vs. Aggregate Function Execution
2.2 Network Storage: Routing Tables vs. Network-Embedded Data Structures
2.3 Network Port Interface: I/O Queues vs. Register Interface
2.4 Software Interface: Packet vs. Direct Read and Write
3 The ClusterNet4EPP Proof-of-Concept Prototype
4. Related Research
5. Conclusions and Future Directions
References
GigaBit Performance under NT
Abstract
1. Introduction
2. Message Passing
2.1 MPI Overview
2.2 PVM Overview
3. Gigabit Ethernet
4. MPI NT Environments
4.1 MPI/PRO for Windows NT
4.2 PaTENT WMPI 4.0
4.3 WMPI
5. Performance Tests
5.1 Test Equipment
5.2 Multi-processor Benchmark - PingPong
5.2.1 MPI Version
5.2.2 PVM Version
5.3 Differences of the MPI and PVM versions of PingPong
6. Results
6.1 Introduction
6.2 Latency Results (Table 3)
6.3 Network Bandwidths
6.3.1 Shared Memory Results (Figure 2)
6.3.2 Distributed Memory
7. Summary and Conclusions
7.1 Summary
7.2 Price/Performance Considerations
7.3 Summary of Conclusions
7.4 Future Work
References
MPI Collective Operations over IP Multicast *
1 Introduction
2 IP Multicast
3 MPI Collective Operations
3.1 MPI Broadcast
3.2 MPI Barrier Synchronization
4 Experimental Results
5 Conclusions and Future Work
References
An Open Market-Based Architecture for Distributed Computing
1 Introduction
2 System Properties
3 Resource Allocation
4 System Architecture
4.1 Overview of System Components
4.2 Basic System Services and Communication
5 Supporting Distributed Computing Paradigms
5.1 The Generic Master–Slave Model
5.2 A Sample Client Application
6 Related Work
7 Discussion
8 Future Directions
References
The MultiCluster Model to the Integrated Use of Multiple Workstation Clusters
1 Introduction
2 Integrating Multiple Clusters
2.1 Hardware Aspects
2.2 Software Aspects
3 The MultiCluster Model
3.1 Hardware Platform
3.2 Software Structure
3.3 The Programming Environment—DECK
4 Related Work
5 Conclusions and Current Work
References
Parallel Information Retrieval on an SCI-Based PC-NOW
1. Introduction
2. PC Cluster-based IR System
2.1 Typical IR System on Uniprocessor
2.2 Declustering IIF
2.3 Parallel IR System Model
2.4 Experimental PC Cluster System
2.5 SCI-based DSM Programming
3. Performance of PC Cluster-based IR System
3.1 Performance Comparison between SCI-based System and MPI-based System
3.2 Effect of Declustering IIF
3.3 Performance with Various-sized IIF
3.4 Reducing IR Operation Time
4. Conclusions
References
A PC-NOW Based Parallel Extension for a Sequential DBMS
1 Introduction
2 Architecture
2.1 General Overview
2.2 The Server Module
2.3 The Calculator Module
3 Prototyping
3.1 General Overview
3.2 Implementation of the Server Module
3.3 Implementation of the Calculator Module
3.4 Communication Issues
4 Current Performance of the Extension Prototype
4.1 Underlying Hardware
4.2 Speed-up
4.3 Real Database Tests
5 Discussion
5.1 Parallel Extension vs. Porting
5.2 Toward a Generalization of the Parallel Extension Concept
6 Context
7 Summary
References
Workshop on Advances in Parallel and Distributed Computational Models
The Heterogeneous Bulk Synchronous Parallel Model
1 Introduction
2 Related Work
3 Heterogeneous BSP
4 HBSP Algorithms
4.1 Prefix Sums
4.2 Matrix Multiplication
4.3 Randomized Sample Sort
5 Conclusions and Future Directions
References
On stalling in LogP*
(Extended Abstract)
1 Introduction
2 The models
2.1 LogP’s stalling behaviour
3 Separation between -stalling LogP and stall-free LogP
References
Parallelizability of some P-complete problems*
1 Introduction
2 Parameterized convex layers
3 Lexicographically first maximal 3 sums
4 Conclusions
References
A New Computation of Shape Moments via Quadtree Decomposition *
1 Introduction
2 Basic Data Manipulation Operations
3 The Quadtree Decomposition
4 Computing Shape Moments
5 Parallel Moment Computation Algorithm
6 Concluding Remarks
References
The Fuzzy Philosophers
1. Introduction
2. A-protocol
3. Correctness of A-protocol
4. B-protocol
5. Efficiency of A-protocol
References
A Java Applet to Visualize Algorithms on Reconfigurable Mesh
1 Introduction
2 Reconfigurable Mesh
3 Specification of Software
3.1 User Interface
3.2 Programming Language
4 Execution of the JRM
4.1 Visualization of the Prefix Sums Algorithm
4.2 Some Algorithms Implemented on the JRM
5 Conclusion
References
A Hardware Implementation of PRAM and its Performance Evaluation *
1 Introduction
2 Design of the PRAM-like Computers
2.1 Steps of the PRAM Model and Our PRAM-like Computers
2.2 Architecture of the PRAM-like Computers
2.3 Memory Accessing and Synchronous Processing
2.4 The Internal Processing in the Nodes of the Units
2.5 Amount of Hardware and Theoretical Processing Time
3 Evaluation of the Implementation Method
References
A Non-Binary Parallel Arithmetic Architecture
1 Introduction
2 The shift switch logic
3 The small shift switch adder architecture
4 The larger shift switch adder architecture
5 Concluding remarks
References
Multithreaded Parallel Computer Model with Performance Evaluation *
1 Introduction
2 Multithreaded parallel computer model
3 Multithreaded parallel computer model simulator
4 PRAM algorithms implemented in the MPCM
4.1 Prefix sums algorithm
4.2 List ranking algorithm
5 Performance evaluation
References
Workshop on Parallel and Distributed Computing in Image Processing, Video Processing, and Multimedia (PDIVM2000)
Organizers
Preface
Committees
MAJC-5200: A High Performance Microprocessor for Multimedia Computing
1 Introduction
2 Architecture
3 MAJC-5200 Microprocessor
3.1 Building Blocks
3.2 MAJC CPU
4 Instruction Set
5 Performance in Multimedia Applications
6 Conclusion
References
A Novel Superscalar Architecture for Fast DCT Implementation
1. Introduction
2. Modified SIMD Architecture for Fast DCT
3. Superscalar Execution of FDCT
4. Comparison and Conclusion
REFERENCES
Computing Distance Maps Efficiently Using An Optical Bus
1 Introduction
2 The LARPBS Model
3 Algorithm Using n² Processors
4 Algorithm Using n³ Processors
5 Conclusions
References
Advanced Data Layout Optimization for Multimedia Applications
1 Introduction and Related Work
2 Example Illustration
3 Main Memory Data Layout Organization (MDO)
3.1 The General Problem
3.2 The Pragmatic Solution
4 Experimental Results
5 Conclusions
References
Parallel Parsing of MPEG Video in a Multi-threaded Multiprocessor Environment
1 Introduction
2 Scene change detection in MPEG1 video
2.1 Description of the MPEG1 video format
2.2 The motion-luminance approach
2.3 The spatio-temporal approach
3 Parallel video parsing
3.1 Parallel video parsing using the motion-luminance approach
3.2 Parallel video parsing using the spatio-temporal approach
4 Experimental Results and Conclusions
References
Parallelization Techniques for Spatial-Temporal Occupancy Maps from Multiple Video Streams
1 Introduction
2 Distributed Sensing
3 Algorithms
3.1 Image-based
3.2 Map-based
3.3 Image-level parallelism
3.4 Pixel-level parallelism
3.5 Map-level parallelism
4 Results
5 Conclusion
References
Heuristic Solutions for a Mapping Problem in a TV-Anytime Server Network*
1 Introduction
2 A Hierarchical TV-Anytime Server Network
3 The Media Mapping Problem
3.1 A Feature of the Mapping Problem
3.2 Formalizing the Mapping Problem
4 Parallel Simulated Annealing Algorithms
4.1 Initial Solution
4.2 Neighborhood Structure
4.2.1 Neighborhood Structure - Phase I
4.2.2 Neighborhood Structure - Phase II
5 Performance Evaluation
6 TV Cache - A TV-Anytime System
References
RPV: A Programming Environment for Real-time Parallel Vision - Specification and programming methodology -
1 Introduction
2 System Overview
2.1 Hardware Configuration
2.2 Software Architecture
2.3 Modules
3 RPV Programming Tool
3.1 Class RPV Connection
3.2 Function RPV Invoke
3.3 Sample Programs
4 Conclusion
Acknowledgement
References
Parallel low-level image processing on a distributed-memory system
1 Introduction
2 Low-level image processing operators
3 Integrating parallelism in an image processing library
4 Experimental results
5 Conclusions
6 Future work
References
Congestion-free Routing of Streaming Multimedia Content in BMIN-based Parallel Systems
1 Introduction
2 Folded Benes networks
3 Flow-based adaptive routing
4 The distributed modified looping algorithm
5 Simulation results
6 Conclusion
References
Performance of On-Chip Multiprocessors for Vision Tasks*
1 Introduction
2 Selected Vision Tasks
3 On-Chip Multiprocessor
4 Simulation Environment
5 Simulation Results and Analysis
6 Concluding Remarks
Acknowledgement
References
Parallel Hardware-Software Architecture for computation of Discrete Wavelet Transform using the Recursive Merge Filtering algorithm
1 Introduction
2 Formal Description of the Recursive Merge Filtering Algorithm
2.1 RMF Operator
3 DWT in terms of the RMF operator
4 RMF Algorithm Computations and Data Shifting
5 Transformation of Data Routing to Address Computation
6 Equations for Data Shifting
7 Hardware-Software Architecture
8 FPGA Implementation and Resource Use
References
Fifth International Workshop on High-level Parallel Programming Models and Supportive Environments HIPS 2000
Preface
Workshop Chair
Steering Committee
Program Committee
Acknowledgments
Accepted Papers for HIPS 2000
Pipelining Wavefront Computations: Experiences and Performance*
1 Introduction
2 Representations: MPI, HPF, and ZPL
3 Parallelization Experiences: MPI, HPF, and ZPL
3.1 MPI
3.2 HPF
3.3 ZPL
4 Performance
5 Conclusion
References
Specification Techniques for Automatic Performance Analysis Tools
1 Introduction
2 Related work
3 Overall Design of the KOJAK Cost Analyzer
4 Performance Property Specification
4.1 Data Model
4.2 Performance Properties
5 Implementation
6 Conclusion and Future Work
References
PDRS: A Performance Data Representation System*
1 Introduction
2 Design and Implementation of PDRS
2.1 Trace Data Module
2.2 Data Management Module
2.3 Performance Database
2.4 Relational Queries Module
2.5 Performance Diagnostic Agent (PDA) Module
2.6 Performance Visualization and Auralization (PVA) Module and Graphical User Interface Module
3 Summary
References
Clix* - A Hybrid Programming Environment for Distributed Objects and Distributed Shared Memory
1 Introduction
2 The Arts Platform
3 Distributed Shared Memory Abstractions
4 Object-Oriented Design of the Clix-DSM
5 Sample Implementation
6 Related Work
7 Conclusions
References
Controlling Distributed Shared Memory Consistency from High Level Programming Languages
1 Introduction
2 Implicit versus Explicit Consistency Management
3 Mome DSM Consistency Model
4 Consistency Management Optimizations
5 Experiments
5.1 Simulated Code
5.2 Data Prefetch
5.3 Consistency Management Strategy
5.4 Manager Distribution
6 Related Work
7 Conclusion and Future Work
References
Online Computation of Critical Paths for Multithreaded Languages
1 Introduction
2 Benefits of Getting Critical Path Information
3 Our Scheme: Online Longest Path Computation
3.1 Target Language
3.2 Computed Critical Paths
3.3 Instrumentation
3.4 Potential Problems and Possible Solutions
4 Experiments
5 Related Work
6 Conclusion and Future Work
References
Problem Solving Environment Infrastructure for High Performance Computer Systems
1 Introduction
2 The Proposed Model for Problem Solving Environments
2.1 The Layered Architecture
2.2 Level 0 - Infrastructure
2.3 Level 1 - Hardware Abstractions
2.4 Level 2 - Programming Model
2.5 Level 3 - Mathematics
2.6 Level 4 - Domain Specific Interface
3 Implementation
3.1 CECAAD
3.2 An Electromagnetics Environment for Cluster Computers
3.3 An Image Processing Environment for Reconfigurable Computers
4 Conclusion
References
Combining Fusion Optimizations and Piecewise Execution of Nested Data-Parallel Programs
1 Introduction
2 Related Work
3 Combining Fusion and Piecewise Execution
4 Implementation and Benchmarks
5 Conclusion and Future Work
References
Declarative concurrency in Java
1 Introduction
2 Related work
2.1 Concurrent object-oriented programming
2.2 Temporal constraints
3 Logic programs for concurrent programming
3.1 Events and constraints
3.2 Markers and events
3.3 Constraints and methods
4 Synchronization constraints
5 Implementation
6 Conclusion
References
Scalable Monitoring Technique for Detecting Races in Parallel Programs*
1 Introduction
2 Background
3 Scalable Monitoring Technique
4 Related Work
5 Conclusion
References
3rd IPDPS Workshop on High Performance Data Mining
Preface
Workshop Co-Chairs
Program Committee
Implementation Issues in the Design of I/O Intensive Data Mining Applications on Clusters of Workstations
1 Introduction
2 Implementation of I/O Intensive DM Applications
3 A Test Case DM Algorithm and its Implementation
4 Experimental Results and Conclusions
References
A Requirements Analysis for Parallel KDD Systems
1 Introduction
2 PKDD Requirements
3 Mining Methods
4 Hardware Models and Trends
5 Software Infrastructure
6 Conclusions
References
Parallel Data Mining on ATM-Connected PC Cluster and Optimization of its Execution Environments
1 Introduction
2 Our ATM-connected PC cluster and its communication characteristics
3 Parallel data mining application and its implementation on the cluster
3.1 Association rule mining
3.2 Implementation of HPA program on PC cluster
4 Optimization of transport layer protocol parameters
4.1 Broadcasting on the cluster and TCP retransmission
4.2 Total performance of HPA program using proposed method
5 Dynamic remote memory acquisition
5.1 Dynamic remote memory acquisition and its experiments
5.2 Remote update method
References
The Parallelization of a Knowledge Discovery System with Hypergraph Representation*
1 Introduction
2 Serial System INDED
2.1 Inductive Logic Programming
2.2 Serial Architecture
3 Parallelizing INDED
4 Naive Decomposition
5 Data Parallel Decomposition with Data Partitioning
5.1 Data Partitioning and Locality
5.2 Partitioning Algorithm
6 Global Hypergraph using Speculative Parallelism
7 Current Status and Results
8 Current and Future Work
References
Parallelisation of C4.5 as a Particular Divide and Conquer Computation
1 Problem statement
2 The Programming Environment
3 Related work and first experiments
4 Current results
5 Expected Results and Future work
References
Scalable Parallel Clustering for Data Mining on
1 Introduction
2 Bayesian Classification and AutoClass
3 P-AutoClass
3.1 Design of the parallel algorithm
3.1.1 Parallel update_wts
3.1.2 Parallel update_parameters
4 Experimental results
5 Related work
6 Conclusion
References
Exploiting Dataset Similarity for Distributed Mining *
1 Introduction
2 Similarity Measure
2.1 Association Mining Concepts
2.2 Similarity Metric
2.3 Sampling and Association Rules
3 Clustering Datasets
4 Experimental Analysis
4.1 Setup
4.2 Sensitivity to Sampling Rate
4.3 Synthetic Dataset Clustering
4.4 Census Dataset Evaluation
5 Conclusions
References
Scalable Model for Extensional and Intensional Descriptions of Unclassified Data
1 Introduction
2 Motivation
3 Proposed Architecture
4 ART1 Neural Network
5 Combinatorial Neural Model (CNM)
6 Ongoing Work
References
Parallel Data Mining of Bayesian Networks from Telecommunications Network Data
1 Introduction and the Global Picture
1.1 Telecommunication Fault Management, Fault Correlation and Data Mining BBNs
1.2 The Architecture
2 The Parallel Data Mining Algorithm
2.1 The Need for Parallelism
2.2 Parallel Cause And Effect Genetic Algorithm (P-CAEGA)
2.3 Results
2.4 Future Potential Research
3 Conclusion
References
IRREGULAR'00 SEVENTH INTERNATIONAL WORKSHOP ON SOLVING IRREGULARLY STRUCTURED PROBLEMS IN PARALLEL
General Chair
Program Co-chairs
Steering Committee
Program Committee
Invited Speakers
FOREWORD
Load Balancing and Continuous Quadratic Programming
1 Extended Abstract
Parallel Management of Large Dynamic Shared Memory Space: A Hierarchical FEM Application
1 Introduction
2 Efficient Irregular Memory Accesses within a Large Virtual Address Space
2.1 Understanding ccNUMA Architecture
2.2 Enhancing Virtual Pages Locality
3 Efficient Parallel Dynamic Memory Allocation
3.1 The Fragmentation Problem
3.2 Allocating Memory in Parallel
4 Conclusion
References
Efficient Parallelization of Unstructured Reductions on Shared Memory Parallel Architectures*
1 Introduction
2 Unstructured Reductions on Shared Memory Machines
3 Unstructured Reductions on Distributed Memory Machines
4 Exclusive Ownership Technique
5 Performance Results
6 Summary and Conclusion
References
Parallel FEM Simulation of Crack Propagation – Challenges, Status, and Perspectives*
1 Introduction
2 System Overview
3 Geometric Modeling and Mesh Generation
4 Equation Solving and Preconditioning
5 Adaptivity
6 Future Work
7 Conclusions
References
Support for Irregular Computations in Massively Parallel PIM Arrays, Using an Object-Based Execution Model
1 Introduction
2 Irregular Problems
3 Macroservers
4 PIM Arrays and Their Support for Irregular Computations
5 Case Study: Sparse Matrix Vector Multiply
6 Conclusion
References
Executing Communication-Intensive Irregular Programs Efficiently
1 Introduction
2 Constraints on Execution
2.1 Barrier Synchronization
2.2 Fixed Size Processor Partitions
3 Scheduling on Fixed Size Partitions
3.1 Handling Y-Irregularity
3.2 Handling X-Irregularity
4 Online Rebalancing of Threads
4.1 Costs of Imbalance and Rebalancing
4.2 Optimal Online Algorithm
4.3 Low Overhead Alternatives
5 Summary and Future Work
References
NON-MEMORY-BASED AND REAL-TIME ZEROTREE BUILDING FOR WAVELET ZEROTREE CODING SYSTEMS
1 INTRODUCTION
2 THE ARCHITECTURE FOR REARRANGING 2-STAGE 2-D DWT
2.1 Two Preliminary Devices Used in the Architecture for Rearrangement
2.2 The Proposed Architecture and the Analysis of Its Operations
3 THE DESIGN EXTENDED TO GENERAL STAGES OF DWT
4 PERFORMANCE ANALYSIS AND CONCLUSION
References
Graph Partitioning for Dynamic, Adaptive and Multi-phase Computations
A Multilevel Algorithm for Spectral Partitioning with Extended Eigen-Models
1 Introduction
2 Extended Eigen-Model
3 Subspace Algorithms for Solving Extended Eigenproblems
4 Algorithm
5 Numerical Results
References
An Integrated Decomposition and Partitioning Approach for Irregular Block-Structured Applications
1 Introduction
2 The partitioning approach
2.1 Overview
2.2 The clustering algorithm
2.3 The distribution method
3 Numerical results
3.1 The applications
3.2 Comparison with the Berger-Rigoutsos algorithm
3.3 Parallel performance
4 Conclusions
Acknowledgments
References
Ordering Unstructured Meshes for Sparse Matrix Computations on Leading Parallel Systems
1 Introduction
2 Partitioning and Linearization
2.1 Cuthill-McKee Algorithms (CM)
2.2 Self-Avoiding Walks (SAW)
2.3 Graph Partitioning (MeTiS)
3 Experimental Results
3.1 Distributed-Memory Implementation
3.2 Shared-Memory Implementation
3.3 Multithreaded Implementation
4 Work in Progress
Acknowledgements
References
A GRASP for computing approximate solutions for the Three-Index Assignment Problem
ABSTRACT
On Identifying Strongly Connected Components in Parallel*
1 Introduction
2 A Parallelizable Algorithm for Strongly Connected Components
3 Serial Complexity of Algorithm DCSC
4 Future Work
Acknowledgements
References
A Parallel, Adaptive Refinement Scheme for Tetrahedral and Triangular Grids
1 Introduction
1.1 Serial Element Adaption Scheme
2 Parallel Implementation
2.1 Data Structures
2.2 Refinement
2.3 Grid Closure
3 Groundwater Application
4 Conclusion
References
PaStiX: A Parallel Sparse Direct Solver Based on a Static Scheduling for Mixed 1D/2D Block Distributions *
1 Introduction
2 Parallel solver and block mapping
3 Numerical Experiments
4 Conclusion and Perspectives
References
Workshop on Java for Parallel and Distributed Computing
Workshop Organization
Program Co-Chairs
Program Committee
An IP Next Generation Compliant Java™ Virtual Machine
1 Introduction
2 A Quick overview of IPv6
2.1 Addressing format
2.2 Modifications of the IP stack protocols
2.3 Other new features
3 Developing an IPv6 package for Java
3.1 Architecture
3.2 Implementation of the underlying mechanisms
4 Results and extensions
4.1 A raw level and new options
4.2 An IPv6 compliant JVM
4.3 Results
5 Conclusion
References
An Approach to Asynchronous Object-Oriented Parallel and Distributed Computing on Wide-Area Systems*
1 Introduction
2 Related work
3 The Moka programming model
4 The Java API of Moka
5 Transparent vs. non-transparent distributed interactions
6 Performance evaluation
7 Conclusions
References
Performance Issues for Multi-language Java Applications
1 Introduction
2 Choices for Native Interface
3 Java Native Interface Performance
3.1 Cost of a Native Call
3.2 Cost of Accessing Data from Native Code
4 Embedding the JVM in Servers: Memory Management and Threading Interactions
5 Implementation of Fast JNI on IRIX
5.1 Fast JNI Calling Optimization in the MIPS JIT
5.2 JNI Pinning Lock Optimization in the JVM
6 Related Work
7 Conclusions
Acknowledgements
References
MPJ: A Proposed Java Message Passing API and Environment for High Performance Computing
1. Introduction
2. Some design decisions
3. Overview of the Architecture
3.1 Process creation and monitoring
3.2 The MPJ daemon
3.3 Handling MPJ aborts – Jini events
3.4 Other failures – Jini leasing
3.5 Sketch of a “Device-Level” API for MPJ
4. Conclusions and Future Work
5. References
Implementing Java consistency using a generic, multithreaded DSM runtime system
1 Introduction
2 Executing Java programs on distributed clusters
2.1 Concurrent programming in Java
2.2 The Hyperion System
3 Implementing Java consistency
3.1 DSM-PM2: a generic, multi-protocol DSM layer
3.2 Using DSM-PM2 to build a Java consistency protocol
4 Preliminary performance evaluation
5 Conclusion
References
Third Workshop on Bio-Inspired Solutions to Parallel Processing Problems (BioSP3)
Workshop Chairs
Steering Committee
Program Committee
TAKE ADVANTAGE OF THE COMPUTING POWER OF DNA COMPUTERS
1 Introduction
2 DNA Computation Model
2.1 Operations
2.2 Biological Implementation
3 NP-complete Problem Solving
3.1 One Simplified NP-complete Problem
3.2 An Advanced Problem
4 Problem Reconsideration
5 Conclusion
References
Agent surgery: The case for mutable agents
1 Introduction
2 Mutable programs
3 Mutability in Bond
4 Surgery techniques
4.1 Simple surgical operations
4.2 Replacing the strategy of a state
4.3 Splitting a transition with a state
4.4 Bypassing states
4.5 Adding and removing planes
4.6 Joining and splitting agents
4.7 Trimming agents
5 Conclusions
Acknowledgments
References
Was Collective Intelligence before Life on Earth?
1. Introduction
2. Computational Collective Intelligence
2.1 Computational model of Collective Intelligence
2.2. The Inference Model for Collective Intelligence and its measure
3. Comprehension and definition of life
4. Ordering Collective Intelligence and Life
5. Conclusions
6. References
Solving Problems on Parallel Computers by Cellular Programming
1. Introduction
2. Cellular Programming
3. A Parallel Environment for CARPET
4. Programming Examples
4.1. The wireworld program
4.2. A forest fire model
4.3 Performance results
6. Conclusion
Acknowledgements
References
Multiprocessor Scheduling with Support by Genetic Algorithms - based Learning Classifier System
1 Introduction
2 Genetic Algorithms-based Learning Classifier System
3 Multi-agent Approach to Multiprocessor Scheduling
4 An Architecture of a Classifier System to Support Scheduling
5 Experimental Results
6 Conclusions
References
Viewing Scheduling Problems through Genetic and Evolutionary Algorithms
1 Introduction
2 Genetic and Evolutionary Algorithms
2.1 Basic Concepts
2.2 GEPE: The Genetic and Evolutionary Programming Environment
3 Analysis of the Scheduling Processes
4 Approaching the JSSP with GEAs
5 Results
6 A practical example
7 Conclusions
References
Dynamic Load Balancing Model: Preliminary Assessment of a Biological Model for a Pseudo-Search Engine
1 Introduction
2 Methodologies of Genetic Programming
2.1 Overview
2.2 Application of the Genetic Operators
3 Adapting the Biological Model
3.1 Overview of the Biological Model
3.2 Description of the Computer Model
4 Computation Measures for the Pseudo-Search Engine
4.1 Overview
4.2 Computational Measures
5 Conclusion
6 Acknowledgements
References
A Parallel Co-evolutionary Metaheuristic
1 Introduction
2 The Quadratic Assignment Problem
3 CO-SEARCH: A parallel co-evolutionary metaheuristic
3.1 MTS: the search agent
3.2 The Adaptive Memory
3.3 The Diversifying GA
3.4 The intensifying kick
4 Experiments
5 Conclusion
References
Neural Fraud Detection in Mobile Phone Operations
1 Introduction
2 Neural Fraud Model Construction
2.1 Classification of Mobile Phone Users
2.2 Neural Network Model
3 Experimental Study
3.1 Environment and Implementation Issues
4 Conclusion
References
Information Exchange in Multi Colony Ant Algorithms
1 Introduction
2 Ant Algorithm for TSP
3 Parallel Ant Algorithms
4 Strategies for Information Exchange
5 Results
6 Conclusion
References
A Surface-Based DNA Algorithm for the Expansion of Symbolic Determinants
1 Introduction
2 Surface-Based Operations
2.1 Abstract Model
2.2 Biological Implementation
3 Hard Computation Problem Solving
3.1 Expansion of Symbolic Determinants Problem
3.2 Surface-Based Algorithm
4 Analysis of the Algorithm
5 Conclusion
References
Hardware Support for Simulated Annealing and Tabu Search
1 Introduction
2 Local Search
3 Hardware Support
4 Implementation
5 Results
6 Discussion
References
Eighth International Workshop on Parallel and Distributed Real-Time Systems
General Chair
Program Chairs
Publicity Chair
Steering Committee
Program Committee
Message from the Program Chairs
A Distributed Real Time Coordination Protocol
1 Introduction
2 Problem Formulation
3 Protocol Design
4 Summary and Conclusion
Acknowledgement
References
A Segmented Backup Scheme for Dependable Real Time Communication in Multihop Networks
1 Introduction
2 Spare Resource Allocation
3 Backup Route Selection
4 Failure Recovery
5 Delay and Scalability
6 Performance Evaluation
7 Conclusions
References
Real-Time Coordination in Distributed Multimedia Systems
1 Introduction
2 The Coordination Language Manifold
3 Extending Manifold with a Real-Time Event Manager
3.1 Recording Time
3.2 Expressing Temporal Relationships
4 Coordination of RT Components in a Multimedia Presentation
5 Conclusions
References
Supporting Fault-Tolerant Real-Time Applications using the RED-Linux General Scheduling Framework *
1 Introduction
2 Related Work on Fault-Tolerant and Real-Time Support
3 The RED-Linux General Scheduling Framework
4 The Design of Fault Monitors
5 The Implementation of Task Group in RED-Linux
6 Conclusions
References
Are COTS suitable for building distributed fault-tolerant hard real-time systems?*
1 Introduction
2 COTS and hard real-time constraints
2.1 Methodology
2.2 WCET analysis and COTS hardware
2.3 WCET analysis and COTS real-time operating systems
3 COTS and fault tolerance constraints
3.1 Methodology
3.2 Off-line task replication
3.3 Basic fault-tolerance mechanisms and COTS components
4 Experimental platform
5 Concluding remarks
References
Autonomous Consistency Technique in Distributed Database with Heterogeneous Requirements
1 Background
1.1 Needs in SCM
2 Approach
2.1 Assurance
2.2 Goal
3 Accelerator
3.1 Allowable Volume
3.2 System Model in SCM
3.3 Accelerator
3.4 AV management
4 Simulation
5 Conclusion
References
Real-time Transaction Processing Using Two-stage Validation in Broadcast Disks*
1 Introduction
2 Issues of Transaction Processing in Broadcast Environments
3 Protocol Design
3.1 Broadcasting of Validation Information
3.2 Timestamp Ordering
4 The New Protocol
4.1 Transaction Processing at Mobile Clients
4.2 The Server Functionality
5 Conclusions and Future Work
References
Using Logs to Increase Availability in Real-Time Main-Memory Database
1 Introduction
2 RODAIN Database
3 Log Handling in the RODAIN Database Node
4 Experimental Study
5 Conclusion
References
Components are from Mars
1 Introduction
2 Basic Component Model and Qualification
Basic component model
Implications for distributed real-time systems
3 On Disputed Issues in Component Design
Do components have state?
Are objects components?
4 Concluding Remarks
Acknowledgements
References
2 + 10 ≻ 1 + 50 !
1 Introduction
2 Fault Modelling and Analysis
3 Testability Analysis
4 Conclusion and future challenges
References
A Framework for Embedded Real-time System Design
1 Introduction
2 A Fully Automatic Approach for the Analysis of Real-time Systems
3 Conclusion
References
Best-effort Scheduling of (m,k)-firm Real-time Streams in Multihop Networks
1 Introduction
2 EDBP Scheduling Algorithm
3 Performance Study
4 Conclusions
References
Predictability and Resource Management in Distributed Multimedia Presentations
1 Introduction
2 The Proposed Language Extensions for QoS definition
3 The Proposed Runtime Environment
3.1 Determination of Task Blocking Time
4 Related Work
5 Conclusions
References
Quality of Service Negotiation for Distributed, Dynamic Real-time Systems
1 Introduction
2 QoS Negotiation Architecture and Approach
3 QoS Negotiation Algorithm and Protocol
4 Experimental Results
5 Previous Work in QoS Negotiation
6 Conclusions and Future Work
References
An Open Framework for Real-Time Scheduling Simulation
1 Introduction
2 Related Work
3 Theory of Operation
3.1 Task Model and Workload Generation
3.2 Scheduling and Dispatching
3.3 Logging and Statistics
4 Conclusions
References
5th International Workshop on Embedded/Distributed HPC Systems and Applications (EHPC 2000)
Preface
EHPC 2000 Contents
Program Committee
Advisory Committee
A Probabilistic Power Prediction Tool for the Xilinx 4000-Series FPGA
1 Introduction and Background
2 Overview of the Tool
3 Calibration of the Tool
4 Power Measurements
5 Experimental Evaluation of the Tool
6 Summary
Acknowledgements
References
Application Challenges: System Health Management for Complex Systems
1 Introduction
1.1 Challenges in system health management
1.2 Condition-Based Maintenance for Naval Ships
1.2.1 MPROS Architecture
1.2.2 Data Concentrator hardware
2 MPROS Software
2.1 PDME
2.2 Knowledge fusion
3 Validation
4 Conclusions
5 Acknowledgment
References
Accommodating QoS Prediction in an Adaptive Resource Management Framework
1 Introduction
2 Overview of RM approach
3 Software and Hardware Profiling
4 QoS and Resource Utilization Monitoring
5 Resource Selection
6 Experiments
7 Conclusions and Ongoing Work
References
Network Load Monitoring in Distributed Systems
1 Introduction
2 Load Simulator
3 Previous Work
4 Experimental Procedure
5 Conclusion
References
A Novel Specification and Design Methodology Of Embedded Multiprocessor Signal Processing Systems Using High-Performance Middleware
1 Introduction
2 The Need for Model Continuity in Specification & Design Methodologies
3 The MAGIC Specification and Design Methodology
4 Model Continuity via Middleware
5 Using VSIPL & MPI for Model Continuity
6 Conclusion
References
Auto Source Code Generation and Run-Time Infrastructure and Environment for High Performance, Distributed Computing Systems
1 Introduction
1.1 Systems and Applications Genesis Environment (SAGE)
2 Auto-Glue Code Generation and Run-Time Kernel
3 Experiments
3.1 Benchmark Applications
3.2 Target Machine
3.3 Experiments and Test Method
3.4 Results
4 Conclusions
References
Developing an Open Architecture for Performance Data Mining
1 Introduction
2 Unified Modeling Language
3 A Performance Data Mining Architecture
4 Discussion
5 Future PDMA Research
6 Conclusions
Acknowledgements
References
A 90k gate “CLB” for Parallel Distributed Computing
1 Introduction
2 ManArray Parallel Distributed Computing
3 Evaluation
4 Conclusions
References
Power-Aware Replication of Data Structures in Distributed Embedded Real-Time Systems
1 System Model
2 Numerical Results
2.1 Effect of Application Write Ratios
2.2 Impact of Per-hop Transfer Cost
2.3 Task Allocation and Network Topology
2.4 Routing Issues
2.5 Selective Replication
3 Conclusion
References
Comparison of MPI Implementations on a Shared Memory Machine
1 Introduction
2 Approach
3 Results
3.1 Platform Configuration
3.2 Sun's HPC 3.0 MPI
3.3 LAM Shared Memory MPI
3.4 MPICH
4 Conclusions
References
A Genetic Algorithm Approach to Scheduling Communications for a Class of Parallel Space-Time Adaptive Processing Algorithms
1 Introduction and Background
2 Overview of Parallel STAP
3 Genetic Algorithm Methodology
4 Numerical Results
5 Conclusion
Acknowledgements
References
Reconfigurable Parallel Sorting and Load Balancing on a Beowulf Cluster: HeteroSort
1 Introduction
1.1 Dynamic Adaptability
1.2 Beowulf Clusters
1.3 Local Knowledge and Global Processes
1.4 Related Work
2 Approach
2.1 Beowulf Clusters
2.2 Optimization of HeteroSort
3 Fault Tolerance
5.1 Future Directions
Acknowledgments
Reference
7th Reconfigurable Architectures Workshop (RAW 2000)
Workshop Chair
Steering Chair
Program Chair
Publicity Chair
Programme Committee
Preface
Programme of RAW 2000:
Run-Time Reconfiguration at Xilinx (invited talk)
JRoute: A Run-Time Routing API for FPGA Hardware
1 Introduction
2 Overview of the Virtex Routing Architecture
3 JRoute Features
3.1 Various Levels of Control
3.2 Support for Cores
3.3 Unrouter
3.4 Avoiding Contention
3.5 Debugging Features
4 JRoute versus Routing with JBits
5 Portability
6 Future Work
7 Conclusions
Acknowledgements
References
A Reconfigurable Content Addressable Memory
1 Introduction
2 A Standard CAM Implementation
3 An FPGA CAM Implementation
4 The Reconfigurable Content Addressable Memory (RCAM)
5 An RCAM Example
6 System Issues
7 Comparison to Other Approaches
8 Associative Processing
9 Conclusions
10 Acknowledgements
References
ATLANTIS – A Hybrid FPGA/RISC Based Re-configurable System
1 Introduction
2 ATLANTIS System Architecture
2.1 ATLANTIS Computing Board (ACB)
2.2 ATLANTIS I/O Board (AIB)
2.3 ATLANTIS Active Backplane (AAB)
2.4 Host CPU
2.5 CHDL Development Environment
3 Applications
3.1 High Energy Physics
3.2 Image processing
3.3 Astronomy
3.4 Measured and Estimated Performance
4 Summary and Outlook
References
The Cellular Processor Architecture CEPRA-1X and its Configuration by CDL
1 Introduction
2 Target Architectures
3 CDL, a Language for Cellular Processing
4 Transformation into a Hardware Description
5 Conclusion
References
Loop Pipelining and Optimization for Run Time Reconfiguration
1 Introduction
2 Related Work
3 Pipeline Construction
3.1 Definitions
3.2 Phase 0: Pre-processing and Mapping
3.3 Phase 1: Partitioning
3.4 Routing Considerations
3.5 Phase 2: Pipeline Segmentation
3.6 Reconfiguration of null stages
4 Results
5 Conclusions
References
Compiling Process Algebraic Descriptions into Reconfigurable Logic
1 Introduction
2 The Circal process algebra
3 Overview of compiler operation
4 A circuit model of Circal
4.1 Design outline
4.2 Process logic design
4.3 The complete process logic block
5 Mapping circuits to reconfigurable logic
6 Deriving modules from process descriptions
7 Conclusions
Acknowledgements
References
Behavioral Partitioning with Synthesis for Multi-FPGA Architectures under Interconnect, Area, and Latency Constraints
1 Introduction
2 Partitioning and Synthesis Framework
3 The FMPAR Partitioner with the Exploration Engine
3.1 The FMPAR Algorithm
4 Experimental Results
4.1 Effectiveness of Dynamic Exploration with FMPAR
4.2 Comparison of FMPAR against a Simulated Annealing Partitioner
4.3 On-Board Implementations
5 Summary
References
Module Allocation for Dynamically Reconfigurable Systems
1 Introduction
2 Problem Formulation
3 Configuration Bundling
3.1 Bundling Compatibility of Temporal Templates
3.2 Measure of Configuration Bundling
4 Configuration Bundling Driven Module Allocation Algorithm
4.1 Initial Module Allocation
4.2 Ordering and Allocating Temporal Templates
5 Experimental Results
6 Conclusions and Acknowledgments
References
Augmenting Modern Superscalar Architectures with Configurable Extended Instructions
1 Introduction
2 T1000 Architecture
2.1 Background
2.2 T1000 Details
3 Methodology
3.1 Performance evaluation
3.2 Hardware Cost
4 Potential Performance Payoff of Aggressive Instruction Selection
4.1 Performance Results Using the Greedy Selection Algorithm
5 A Selective Algorithm for Choosing Extended Instructions
5.1 Selective Algorithm Overview
5.2 Performance Improvements Using the Selective Algorithm
6 Configurable Hardware Cost
7 Prior Work
8 Conclusions and Future Work
References
Complexity Bounds for Lookup Table Implementation of Factored Forms in FPGA Technology Mapping
1 Introduction
2 Preliminaries
3 Worst Case Mapping to K-LUTs
4 Conclusion
References
Optimization of Motion Estimator for Run-Time-Reconfiguration Implementation.
1. Introduction.
2. Qualitative motion estimation in the Log-Polar space.
3. Determination of the possible number of steps for RTR implementation.
3.1. Evaluation of the possible number of steps.
3.2. Modelling and parameters determination.
4. Results.
5. Conclusion and future work.
References.
Constant-Time Hough Transform On A 3D Reconfigurable Mesh Using Fewer Processors
1 Introduction
2 The Computational Model
3 The Constant-Time Algorithm
References
Fifth International Workshop on Formal Methods for Parallel Programming: Theory and Applications (FMPPTA 2000)
Program and Organizing Chair's Message
Foreword
Programme Committee
A Method for Automatic Cryptographic Protocol Verification (Extended Abstract)
1 Introduction
2 Terms, Formulae, _-Parameterized Tree Automata
3 Messages, What Intruders Know, and Simulating Protocol Runs
4 Experimental Results
5 Conclusion
Acknowledgments
References
Verification Methods for Weaker Shared Memory Consistency Models
1 Introduction
Contribution 1: Architectural tests for Weaker Memory Models
Contribution 2: New Abstraction Methods for Architectural Tests
2 Summary of Results
3 Conclusions and Future Work
References
Models Supporting Nondeterminism and Probabilistic Choice
1 Introduction
2 Domains
3 CSP
4 The probabilistic power domain
4.1 Probabilistic CSP
5 Constructing a new model
6 Summary
References
Concurrent Specification And Timing Analysis of Digital Hardware using SDL
1 Introduction
2 Approach
3 Validation and Verification
4 Abstract Sequencing Constraints
5 A Typical Component: A Delay Flip-Flop
6 A Simple Circuit: The Single Pulser
7 A More Complex Circuit: A Bus Arbiter
8 Conclusions
Acknowledgements
References
Incorporating Non-functional Requirements into Software Architectures
1 Introduction
2 ZCL Framework
2.1 CL Model
2.2 ZCL Framework
3 Formalising and Incorporating NFRs into Dynamic Software Architectures
3.1 Formalising NFRs
3.2 Integrating NFRs into the ZCL Framework
4 Case Study: an Appointment System
5 Conclusion and Future Works
References
Automatic Implementation of Distributed Systems Formal Specifications
Introduction
Formal Description Techniques
Mondel Language
Implementation Approaches
The DRACO-PUC Environment
Setting up the Implementation Environment
Conclusions
References
Refinement based validation of an algorithm for detecting distributed termination
1 Introduction
2 Description of the Algorithm
2.1 Diffusing Computation
2.2 Termination Detection
2.3 Path Vectors
3 The UNITY Formalism
3.1 UNITY Logic Predicates
3.2 Refinements
4 Validation
4.1 Specification of the Termination
4.2 Structure of the Validation
4.3 Diffusing Computation Pattern
4.4 The Concrete Model
4.5 The Abstract Model
4.6 The Auxiliary Model
4.7 Liveness
4.8 Mechanizing the Development
5 Conclusion
References
Tutorial 1: Abstraction and Refinement of Concurrent Programs and Formal Specification: A Practical View
References
Tutorial 2: A Foundation for Composing Concurrent Objects
References
Workshop on Optics and Computer Science (WOCS 2000)
Organizers:
Preface
Program Chair
Steering Committee
Program Committee
Fault Tolerant Algorithms for a Linear Array with a Reconfigurable Pipelined Bus System
1 Introduction
2 Model Descriptions
2.1 LARPBS Model
2.2 Fault Model
3 Preprocessing Phase
4 Fault Tolerant Algorithms
5 Conclusions
References
Fast and Scalable Parallel Matrix Computations with Optical Buses (Extended Abstract)
1 Introduction
2 Scalable Parallelization
3 Optical Buses
4 Matrix Multiplication, Chain Product, and Powers
5 Inversion of Lower and Upper Triangular Matrices
6 Determinants, Characteristic Polynomials, and Ranks
7 Inversion of Arbitrary Matrices
8 Linear Systems of Equations
9 LU- and QR-Factorizations
References
Pulse-Modulated Vision Chips with Versatile-Interconnected Pixels
1 Introduction
2 Vision Chip Based on PWM
3 Vision Chip Based on PFM
4 Discussion
5 Conclusion
Acknowledgments
References
Connectivity Models for Optoelectronic Computing Systems
1 Connectivity, Dimensionality, and Rent's Rule
2 Discontinuities and the Origin of Rent's Rule
3 Free-Space Optical Interconnections
4 Fundamental Studies of Interconnections
5 Conclusion
References
Optoelectronic-VLSI Technology: Terabit/s I/O to a VLSI Chip
References
Three Dimensional VLSI-Scale Interconnects
Introduction
PIM Motivation
Optoelectronic Technologies
Summary
References
Present and Future Needs of Free-Space Optical Interconnects
1 Introduction
2 Present Status of FSOI
3 Present limitations in FSOI and future directions
4 Conclusions
References
Fast Sorting on a Linear Array with a Reconfigurable Pipelined Bus System
1 Introduction
2 Fast sorting on the LARPBS
2.1 Definitions and properties
2.2 An O(log log N) time merging algorithm on the LARPBS
2.3 The sorting algorithm
References
Architecture description and prototype demonstration of optoelectronic parallel-matching architecture
1 Introduction
2 Parallel Matching Architecture
3 Experimental prototype system
4 Conclusions
Acknowledgment
References
A Distributed Computing Demonstration System Using FSOI Inter-Processor Communication
1 Introduction
2 System Topology
2.1 Carrier Boards
2.2 System Board
3 Processor Interconnection
4 Processing Element
5 Conclusion
References
Optoelectronic Multi-Chip Modules Based on Imaging Fiber Bundle Structures
VCSEL based smart pixel array technology enables chip-to-chip optical interconnect
Run-Time Systems for Parallel Programming
Preface
A Portable and Adaptative Multi-Protocol Communication Library for Multithreaded Runtime Systems
1 Efficient Communication in Multithreaded Environments
2 The Madeleine II Multi-Protocol Communication Interface
3 Inside Madeleine II: from the Application to the Network
4 Implementation and Performances
5 Related work
6 Conclusion
References
CORBA Based Runtime Support for Load Distribution and Fault Tolerance
1 Introduction
2 Integrating Load Distribution into CORBA
3 Runtime Support for Fault Tolerance in CORBA Based Systems
4 Experimental Results
5 Conclusions
References
Run-time Support for Adaptive Load Balancing
1 Motivation and Related Work
2 Load Balancing Framework
3 Load Balancing Strategies
4 Application Performance
5 Conclusion
References
Integrating Kernel Activations in a Multithreaded Runtime System on top of Linux
1 Kernel Support for User Level Thread Schedulers
1.1 The Marcel Mixed Thread Scheduler
1.2 Better Support: Kernel Activations
2 Marcel on Top of Linux Activations
2.1 How it works
2.2 Extensions to the original proposal
2.3 Modifications to Marcel
3 Performance and Evaluation
3.1 Performance
4 Conclusion
References
DyRecT: Software Support for Adaptive Parallelism on NOWs
1 Introduction
2 High-Level Primitives
3 Low-Level Primitives
4 Performance Results
5 Conclusion
References
Fast Measurement of LogP Parameters for Message Passing Platforms
1 Introduction
2 Parameterized LogP
3 Fast parameter measurement
3.1 Limitations of the method
4 Result evaluation
5 Conclusions
Acknowledgements
References
Supporting flexible safety and sharing in multi-threaded environments
1 Introduction
2 Safe Threads package
2.1 Support for Threads
2.2 Protected domains and Permission relationships
2.3 Implementation
3 Performance Analysis
3.1 Thread Creation
3.2 Context Switch
4 Existing Safety Solutions
5 Conclusion
References
A Runtime System for Dynamic DAG Programming
1 Introduction
2 DAG and Compact DAG
3 The Incremental Execution Model
4 The Parallel Scheduling Algorithm
5 Runtime System Organization
6 Experimental Study
7 Conclusion
Acknowledgments
References
Workshop on Fault-Tolerant Parallel and Distributed Systems (FTPDS '00)
Workshop Chair
Invited speakers
Papers
Certification of system architecture dependability
Computing in the RAIN: A Reliable Array of Independent Nodes
1 Introduction
1.1 Related Work
1.2 Novel Features of RAIN
2 Communication
2.1 Fault-Tolerant Interconnect Topologies
2.2 Consistent-History Protocol for Link Failures
2.3 A Port of MPI
3 Group Membership
3.1 Novel Features
4 Data Storage
4.1 Array Codes
4.2 Distributed Store/Retrieve Operations
5 Proof-of-Concept Applications
5.1 High-Availability Video Server
5.2 High-Availability Web Server
5.3 Distributed Checkpointing Mechanism
6 Conclusions
References
Fault Tolerant Wide-Area Parallel Computing
1.0 Introduction
2.0 Related Work
3.0 Fault Tolerance Options for SPMD Applications
4.0 Performance Models
5.0 Results
5.1 Validating the Models
5.2 Head-to-head Comparison
6.0 Summary
7.0 References
Transient Analysis of Dependability/Performability Models by Regenerative Randomization with Laplace Transform Inversion
1 Introduction
2 The New Variant
2.1 Closed form solution in the Laplace transform domain
2.2 Numerical Laplace inversion
3 Analysis and Comparison
4 Conclusions
References
FANTOMAS: Fault Tolerance for Mobile Agents in Clusters
1 Introduction and Motivation
2 Related Work: Fault Tolerance for Mobile Agents
3 Concepts for a Fault Tolerance Approach for Mobile Agents
3.1 Goals and Requirements
3.2 Fault Model
3.3 Discussion of Fault Tolerance Methods
3.4 The FANTOMAS Concept
3.5 Diagnosis
4 Analytic Evaluation
5 Conclusions and Future Work
References
Metrics, Methodologies, and Tools for Analyzing Network Fault Recovery Performance in Real-Time Distributed Systems
1 Introduction
2 Network Fault Recovery Technologies
3 Network Fault Recovery Performance
3.1 Testing Model
3.2 Fault Recovery Performance Metrics
4 Testing Methodology
5 Network Fault Recovery Performance Measurement Toolset
5.1 Test Orchestration Tools
5.2 Data Collection Tools
5.3 Analysis/Visualization Tools
6 Applying the Metrics, Tools, and Testing Methodology
6.1 General Test Setup
6.2 Example FDDI Test Results
6.3 Example Fast Ethernet Test Results
7 Conclusions and Ongoing Work
8 References
Consensus Based on Strong Failure Detectors: A Time and Message-Efficient Protocol
1 Introduction
2 Asynchronous Distributed Systems, Failure Detectors and the Consensus Problem
2.1 Asynchronous Distributed System with Process Crash Failures
2.2 The Class S of Unreliable Failure Detectors
2.3 The Consensus Problem
3 The S-Based Consensus Protocol
3.1 The Protocol
3.2 Underlying Principles
3.3 Structure
3.4 Proof
4 Cost of the Protocol
5 Conclusion
References
Implementation of Finite Lattices in VLSI for Fault-State Encoding in High-speed Networks
1 Introduction
2 Lattices and Fault-Tolerance
3 Application to Selected Fault-Tolerant Routing Algorithms
4 Implementation
4.1 Simple Table-Based Method
4.2 Implementation with Boolean Lattice
4.3 Hybrid Implementations
5 Conclusion
References
Building a Reliable Message Delivery System Using the CORBA Event Service
1 Introduction
2 Log files and retry policies – Are they adequate?
3 Application-level reliability mechanism to provide resilience
3.1 Model for reliability: Resynchronization
4 Effectiveness of the reliability mechanism - Experiments
5 Summary and Concluding Remarks
References
Network Survivability Simulation of a Commercially Deployed Dynamic Routing System Protocol
INTRODUCTION
DRS ALGORITHM
DRS PROACTIVE COST
NETWORK SURVIVABILITY ANALYSIS
References
Fault-tolerant Distributed-Shared-Memory on a Broadcast-based Interconnection Network
1 Introduction
2 Fault Tolerant DSM on the SOME-bus
3 Conclusion
References
An Efficient Backup-Overloading for Fault-Tolerant Scheduling of Real-Time Tasks
1 Introduction
2 Dynamic Logical Groups
3 Performance Study
4 Conclusions
References
Mobile Agents to Automate Fault Management in Wireless and Mobile Networks
1. Introduction
2. The Fault-Tolerant Wireless Network Management Architecture
3. Overall Description of Methodology
4. Fault Management
4.1 An Example of the Steps in Fault Correction and Recovery
4.2 A High-Level View of the System
5. Conclusion
References
9th Heterogeneous Computing Workshop (HCW 2000)
Session 1-A: Grid Environment
Session 1-B: Resource Discovery and Management
Session 2-A: Communication and Data Management
Session 2-B: Modeling and Metrics
Session 3-A: Theory and Modeling
Session 3-B: Scheduling I
Session 4-A: Grid Applications
Session 4-B: Resource Management
Session 5-B: Scheduling II
Author Index