Classification and Ensemble Machine Learning Algorithms to Predict Memory Requirements for Compute Farm Jobs

Authors

  • Esraa Faisal Malik, School of Management, Universiti Sains Malaysia
  • Khai Wah Khaw, School of Management, Universiti Sains Malaysia
  • Xinying Chew, School of Computer Sciences, Universiti Sains Malaysia. http://orcid.org/0000-0001-5539-1959
  • Alhamzah Alnoor, Management Technical College, Southern Technical University, Basrah, Iraq
  • Mariam Al Akasheh, Department of Analytics in the Digital Era, College of Business and Economics, United Arab Emirates University, UAE

Keywords:

Chip design, Compute farm scheduler, Machine learning algorithms, Memory prediction, Resource management.

Abstract

Tasks ranging from synthesis to regression are executed within a computational farm environment during chip design. This process is managed by a compute farm scheduler, which schedules jobs based on the availability of computational resources such as central processing units (CPUs), memory, and storage. The increasing complexity of chip design over the years, combined with a growing number of cores per chip, has resulted in memory-intensive applications often being executed as compute jobs. Jobs submitted with inaccurate resource requests, especially those concerning memory, can overload a compute farm and lead to wasted resources. This study addresses this issue by using a data science-driven, machine learning-based approach to predict the memory required by a compute job at the time of its submission. Improving the accuracy of such predictions can significantly reduce overall job wait times and enable efficient use of the compute farm, lowering overall cost because fewer machines are required to complete a given set of jobs. We explored the use of the K-nearest neighbor, random forest, and ensemble methods to this end. The proposed approach yielded an accuracy of 80% in experiments, demonstrating that the memory requirements of compute jobs can be successfully predicted across a diverse suite of applications used in chip design.
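The abstract's pipeline — classify a job's memory requirement at submission time from job features, using K-nearest neighbor and an ensemble vote — can be illustrated with a minimal from-scratch sketch. The features (requested CPU slots, input size) and the toy data below are hypothetical stand-ins, not the paper's actual dataset, and the rule-based third classifier is an invented placeholder for a second model such as random forest.

```python
# Hedged sketch: predict a job's memory bucket ("low"/"high") from
# submission-time features, then combine classifiers by majority vote.
# Features and data here are illustrative assumptions only.
from collections import Counter

def knn_predict(train, query, k=3):
    """k-nearest-neighbor vote using squared Euclidean distance."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in train
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def majority_vote(predictions):
    """Ensemble step: pick the class most classifiers agree on."""
    return Counter(predictions).most_common(1)[0][0]

# Toy training set: (cpu_slots, input_size_gb) -> memory class.
train = [
    ((1, 2), "low"), ((2, 1), "low"), ((1, 1), "low"),
    ((8, 40), "high"), ((16, 64), "high"), ((12, 50), "high"),
]
query = (10, 48)  # a newly submitted job

p1 = knn_predict(train, query, k=1)
p2 = knn_predict(train, query, k=3)
p3 = "high" if query[1] > 16 else "low"  # hypothetical rule-based stand-in
print(majority_vote([p1, p2, p3]))  # → high
```

In practice the scheduler would use the predicted class (or a regression estimate) in place of the user-supplied memory request when placing the job.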


Published

04-12-2023

How to Cite

Malik, E. F., Khaw, K. W., Chew, X., Alnoor, A., & Al Akasheh, M. (2023). Classification and Ensemble Machine Learning Algorithms to Predict Memory Requirements for Compute Farm Jobs. Applications of Modelling and Simulation, 7, 190–200. Retrieved from https://www.ojs.arqiipubl.com/index.php/AMS_Journal/article/view/481
