Empirical Analysis and Offering of Various Historical Spot Instance Datasets
Public cloud service vendors offer surplus computing resources at a lower price as spot instances. The first spot instance provider, Amazon Web Services (AWS), releases the history of spot price changes so that users can estimate the availability of spot instances, which triggered a large body of research. However, a change in spot pricing policy in 2017 rendered a large portion of this spot instance availability analysis obsolete. AWS now publishes new spot instance datasets, namely the spot placement score and the interruption frequency for the previous month, but these have received far less attention than the spot price history dataset. Furthermore, the new datasets expose only current values without historical information, and a few restrictions apply when querying them. In addition, the different spot datasets quite often provide contradictory information about the spot instance availability of the same entity at the same time. In this work, we develop a web-based spot instance data-archive service that provides historical information for the various spot instance datasets. We describe the heuristics we used to overcome the limitations imposed when querying multiple spot instance datasets. We conducted a real-world evaluation that measures spot instance fulfillment and interruption events to identify which spot instance dataset is credible, especially when different datasets present contradictory information. Based on the findings of this empirical analysis, we propose SpotScore, a new spot instance metric that provides a spot recommendation score composed of spot price savings, spot placement score, and interruption frequency. The data-archive service with SpotScore is now publicly available as a web service to speed up systems research in the spot instance community and to improve spot instance usage with higher availability and cost savings.