A parallel workload has extreme variability in a production environment

01/11/2018
by R. Henwood et al.

Writing data in parallel is a common operation in some computing environments and a good proxy for a number of other parallel processing patterns. The time taken to write data in large-scale compute environments can vary considerably. This variation comes from a number of sources, both systematic and transient. The result is highly complex behavior that is difficult to characterize. This paper further develops the model for parallel task variability proposed in "A parallel workload has extreme variability" (Henwood et al. 2016), namely the Generalized Extreme Value (GEV) distribution. It extends the systematic analysis that leads to the GEV model by adding a traffic congestion term. Observations of a parallel workload are presented from a High Performance Computing environment under typical production conditions, which include traffic congestion. An analysis of the workload shows that the variability tends towards GEV as the order of parallelism is increased. The results are presented in the context of Amdahl's law, and the predictive properties of a GEV model are discussed. An optimization for certain machine designs is also suggested.
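The tendency towards GEV with increasing parallelism can be illustrated with a small simulation. This is a minimal sketch and not the paper's analysis: it assumes, purely for illustration, that each task's write time is an independent unit-rate exponential, that a job finishes when its slowest task finishes, and it omits the traffic congestion term entirely. Under those assumptions the job duration, shifted by ln(n), should approach the Gumbel distribution (the GEV family with shape parameter zero) as the number of tasks n grows. The function names are hypothetical.

```python
import math
import random


def job_duration(n_tasks, rng):
    """One parallel job: the slowest of n_tasks writes.

    Assumption: task times are i.i.d. unit-rate exponentials.
    """
    return max(rng.expovariate(1.0) for _ in range(n_tasks))


def gumbel_cdf(x):
    """CDF of the standard Gumbel distribution (GEV with shape 0)."""
    return math.exp(-math.exp(-x))


def max_cdf_gap(n_tasks, n_jobs=20000, seed=0):
    """Largest gap between the empirical CDF of centred job durations
    and the Gumbel CDF; smaller means closer to the GEV limit."""
    rng = random.Random(seed)
    centred = sorted(
        job_duration(n_tasks, rng) - math.log(n_tasks) for _ in range(n_jobs)
    )
    return max(
        abs((i + 1) / n_jobs - gumbel_cdf(x)) for i, x in enumerate(centred)
    )


if __name__ == "__main__":
    # The gap should shrink as the order of parallelism increases.
    for n in (1, 4, 16, 64):
        print(n, round(max_cdf_gap(n), 3))
```

With real measurements, the exponential draw would be replaced by observed task durations, and the shape parameter of the fitted GEV would not generally be zero.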
