Transparent Serverless execution of Python multiprocessing applications

05/18/2022
by Aitor Arjona, et al.

Access transparency means that local and remote resources are accessed using identical operations. With transparency, unmodified single-machine applications could run over disaggregated compute, storage, and memory resources. Hiding the complexity of distributed systems in this way would bring significant benefits, such as scaling out local-parallel scientific applications over flexible disaggregated resources. This paper presents a performance evaluation that assesses the feasibility of access transparency over state-of-the-art Cloud disaggregated resources for Python multiprocessing applications. We have interfaced the multiprocessing module with an implementation that transparently runs processes on serverless functions and uses an in-memory data store for shared state. To evaluate transparency, we run four unmodified applications in the Cloud: Uber Research's Evolution Strategies, Baselines-AI's Proximal Policy Optimization, Pandaral·lel's dataframe, and ScikitLearn's Hyperparameter tuning. We compare the execution time and scalability of each application running over disaggregated resources with our library against the same application running with the single-machine Python libraries on a large VM. Despite the significant overheads of remote communication, we achieve comparable results, and we observe that the applications continue to scale beyond the limited resources of the VM, leading to better speedup and parallelism without changing the underlying code or application architecture.
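As a rough illustration of what access transparency means for multiprocessing code, the sketch below writes the application against the multiprocessing-style interface only and receives the backend as a parameter. It is not the paper's library: the function names are hypothetical, and the swapped-in backend is the standard library's `multiprocessing.dummy`, which exposes the same `Pool` API on top of threads. A serverless backend of the kind the abstract describes would, under this assumption, be passed in the same way.

```python
# Stdlib module that replicates the multiprocessing API on top of threads.
# It stands in here for "some other backend with the same interface".
import multiprocessing.dummy as thread_backend


def partial_sum_of_squares(bounds):
    """Sum of squares over the half-open integer range [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(mp, n, workers=4):
    """Hypothetical application code: it only uses the Pool API of `mp`,
    so it does not know whether workers are processes, threads, or remote."""
    step = max(1, n // workers)
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with mp.Pool(workers) as pool:
        return sum(pool.map(partial_sum_of_squares, chunks))


# Same application function, backend chosen by the caller.
total = parallel_sum_of_squares(thread_backend, 1000)
print(total)
```

Passing the regular `multiprocessing` module instead of `thread_backend` would run the same function on local processes (inside an `if __name__ == "__main__":` guard on platforms that spawn workers); the application function itself stays unchanged, which is the property the evaluation relies on.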
