Mishra, Nikita (The University of Chicago) “Statistical Methods for Improving Dynamic Scheduling and Resource Usage in Computing Systems” (2017)

Yuhong Liu

Advisors: Lafferty, John D. and Hoffmann, Henry

This thesis applies statistical methods to performance and power estimation, with the goal of developing better scheduling algorithms and more energy-efficient systems. In many deployments, computer systems are underutilized, meaning that applications have performance requirements that demand less than full system capacity. Ideally, one would take advantage of this underutilization by allocating system resources so that the performance requirements are met and energy is minimized. This optimization problem is complicated by the fact that the performance and power consumption of various system configurations are often application-dependent, or even input-dependent. In practice, therefore, minimizing energy under a performance constraint requires fast, accurate estimates of application-dependent performance and power tradeoffs. They propose a set of algorithms that tackle this problem in different scenarios. First, they propose LEO, a learning system based on probabilistic graphical models that provides accurate online estimates of an application’s power and performance as a function of system configuration. This work focuses mostly on performance estimation for single applications. Second, they design a system called CALOREE, which combines the learned models with a controller so that the system is robust to dynamic situations with changing resource requirements. Third, they look into estimating an application’s performance when it is co-scheduled with other applications. Applications co-scheduled on the same physical hardware interfere with one another by contending for shared resources, and predicting this interference ahead of time would be particularly valuable for job scheduling. They therefore propose an efficient technique for estimating application interference based on sparse regression. They call their approach ESP, for Estimating co-Scheduled Performance.
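
The optimization problem described above (meet a performance requirement while minimizing energy) can be illustrated with a deliberately simplified sketch. It assumes that per-configuration performance and power estimates are already available, for example from a learner such as LEO, and that the system runs in a single configuration and then idles until the deadline. The names `ConfigEstimate`, `min_energy_config`, and all numbers are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class ConfigEstimate:
    """Estimated behaviour of one application in one system configuration."""
    name: str
    perf: float   # estimated work units completed per second (must be > 0)
    power: float  # estimated power draw in watts while running


def min_energy_config(
    estimates: Iterable[ConfigEstimate],
    work: float,
    deadline: float,
    idle_power: float,
) -> Optional[Tuple[ConfigEstimate, float]]:
    """Pick the configuration that finishes `work` by `deadline` with least energy.

    Simplifying assumption: run in one configuration, then idle at
    `idle_power` for the rest of the deadline window.
    Returns (configuration, energy in joules), or None if nothing is fast enough.
    """
    best = None
    for est in estimates:
        run_time = work / est.perf
        if run_time > deadline:
            continue  # this configuration cannot meet the performance requirement
        energy = est.power * run_time + idle_power * (deadline - run_time)
        if best is None or energy < best[1]:
            best = (est, energy)
    return best


# Hypothetical estimates for three configurations of one application.
estimates = [
    ConfigEstimate("little-cores", perf=10.0, power=2.0),
    ConfigEstimate("mixed", perf=20.0, power=5.0),
    ConfigEstimate("big-cores", perf=40.0, power=14.0),
]

choice = min_energy_config(estimates, work=100.0, deadline=10.0, idle_power=0.5)
if choice is not None:
    print(choice[0].name, choice[1])
```

With a loose deadline the slowest feasible configuration tends to win. Tightening the deadline forces a faster, more power-hungry choice, which is why the quality of the performance and power estimates matters.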

LEO uses a graphical model to integrate a small number of observations of the current application with knowledge of previously observed applications, producing accurate estimates of power and performance trade-offs for the current application in all configurations. In their evaluation, LEO produces the most accurate estimates and near-optimal energy savings. These estimates can greatly improve resource allocation in static situations.

The second major challenge in real systems, however, is dynamics: performance must be maintained despite unpredictable changes in the operating environment or input. Machine learning accurately predicts the performance of complex, interacting resources, but does not address system dynamics; control theory adjusts resource usage dynamically, but struggles with complex resource interaction. They therefore propose CALOREE, a combination of learning and control that automatically adjusts resource usage to meet performance requirements with minimal energy in complex, dynamic environments. CALOREE breaks resource allocation into two sub-tasks: learning speedup as a function of resource usage, and controlling speedup to meet performance requirements. CALOREE also defines a general interface that allows different learners to be combined with a controller while maintaining control theory’s formal guarantees that performance will converge to the goal. They implement CALOREE and test its ability to deliver reliable performance on heterogeneous ARM big.LITTLE architectures in both single- and multi-application scenarios. Compared to state-of-the-art learning and control solutions, they find that CALOREE reduces deadline misses by 2–6x while reducing energy consumption by 7–10%.

Finally, an additional challenge that real systems face is performance loss due to application interference. They quantify interference as slowdown, that is, the performance loss one application experiences in the presence of co-scheduled applications. Given an accurate interference prediction, a scheduler can determine optimal assignments of applications to physical machines, leading to higher throughput in batch systems and better quality of service for latency-sensitive applications. In data centers and supercomputers, schedulers often have a great deal of accumulated data about past jobs and their interference, yet turning this data into effective interference predictors is difficult. They explore state-of-the-art regularized regression models for estimating application interference. They find that regularized linear regression methods require a relatively small number of features, but produce inaccurate models. In contrast, non-linear models that include interaction terms (i.e., that permit features to be multiplied together) are more accurate, but are extremely inefficient and not practical for online scheduling. The key insight in ESP is to split regression modeling into two parts: feature selection and model building. ESP uses linear techniques to perform feature selection, but uses quadratic techniques for model building. The result is a highly accurate predictor that is still practical and can be integrated into a real application scheduler.
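
The two-stage idea behind ESP (linear feature selection, then a quadratic model over only the selected features) can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis implementation: the abstract does not say which linear technique is used, so lasso via coordinate descent stands in for it here; the data is a synthetic grid in which only features 0 and 1 (and their product) affect the slowdown; and all function names are hypothetical.

```python
from itertools import product


def soft_threshold(z, t):
    """Shrink z toward zero by t (the lasso proximal step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_select(X, y, lam, sweeps=100):
    """Stage 1: indices of features with non-zero lasso weight.

    Coordinate descent on (1/2n)*||y_c - X_c w||^2 + lam*||w||_1, centered data.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]  # residual y_c - X_c w, starting at w = 0
    col_sq = [sum(r[j] ** 2 for r in Xc) for j in range(p)]
    w = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(r[j] * (e + w[j] * r[j]) for r, e in zip(Xc, resid))
            new_w = soft_threshold(rho, n * lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [e - delta * r[j] for r, e in zip(Xc, resid)]
                w[j] = new_w
    return [j for j in range(p) if w[j] != 0.0]


def quadratic_features(x, selected):
    """[1, x_a, ..., x_a * x_b for a <= b], over the selected features only."""
    s = [x[j] for j in selected]
    cross = [s[a] * s[b] for a in range(len(s)) for b in range(a, len(s))]
    return [1.0] + s + cross


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z


def fit_two_stage(X, y, lam=0.1, ridge=1e-8):
    """Stage 1 selects features linearly; stage 2 fits a quadratic model on them."""
    selected = lasso_select(X, y, lam)
    Phi = [quadratic_features(x, selected) for x in X]
    d = len(Phi[0])
    A = [[sum(r[a] * r[b] for r in Phi) + (ridge if a == b else 0.0)
          for b in range(d)] for a in range(d)]
    rhs = [sum(r[a] * t for r, t in zip(Phi, y)) for a in range(d)]
    return selected, solve(A, rhs)


def predict(x, selected, coef):
    return sum(c * f for c, f in zip(coef, quadratic_features(x, selected)))


# Synthetic data: 4 hypothetical co-runner features on a 3-level grid.
# Only features 0 and 1 (and their product) influence the slowdown.
X = [list(row) for row in product([-1.0, 0.0, 1.0], repeat=4)]
y = [2.0 + 0.8 * x[0] + 0.5 * x[1] + 0.6 * x[0] * x[1] for x in X]

selected, coef = fit_two_stage(X, y)
print("selected features:", selected)
print("predicted slowdown:", predict([1.0, 1.0, 0.0, 0.0], selected, coef))
```

The design point this illustrates is cost: the quadratic expansion grows with the square of the number of features, so restricting it to the few features that survive the cheap linear stage keeps the interaction terms while staying small enough for online use.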
