Ensemble-Based Predictive Framework for Computational Workload Forecasting Using Google Borg Traces

Authors

  • Sanjeev Kumar, Dr. Saurabh Charaya, Dr. Rachna Mehta

DOI:

https://doi.org/10.64882/ijrt.v13.i3.482

Keywords:

Computational Workload Prediction, Google Borg Traces, Ensemble Machine Learning, Resource Utilization Forecasting, Feature Engineering and Explainability

Abstract

This paper introduces a robust framework for predicting future computational workloads using the Google Borg Traces, one of the largest publicly available records of cluster-wide resource management data. The study develops an accurate and interpretable model of CPU utilization through systematic data preparation, advanced preprocessing, feature engineering, and ensemble-based machine learning. The dataset, comprising task scheduling records, CPU and memory consumption, and system performance indicators, was thoroughly cleaned, standardized, and converted into analytically reliable formats. Mutual Information analysis was used for feature selection, and Quantile, Power, and Z-score transformations normalized the features to improve model stability. Feature engineering captured temporal, frequency-domain, and statistical dynamics, and Truncated Singular Value Decomposition (SVD) was applied for dimensionality reduction to improve computational efficiency. Three ensemble regression algorithms, Random Forest, Light Gradient Boosting Machine (LightGBM), and CatBoost, were trained and evaluated using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the Coefficient of Determination (R²). The Random Forest Regressor performed best (MAE = 0.000003, RMSE = 0.000109, R² = 0.999915), with LightGBM and CatBoost achieving very similar predictive accuracy. To improve interpretability, SHAP analysis identified the most significant predictors, including average and maximum CPU usage, which strongly influence the model's outputs. The findings confirm that ensemble-based models effectively capture the complex non-linear relationships between system metrics and resource utilization.
Altogether, the work provides a scalable, interpretable, and high-performance framework for intelligent workload prediction and optimization within large-scale computing systems, supporting improved efficiency, reliability, and predictive control in modern cloud and cluster computing environments.
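The workflow the abstract describes, feature selection by Mutual Information, quantile normalization, Truncated SVD, and an ensemble regressor scored by MAE, RMSE, and R², can be sketched with scikit-learn. This is an illustrative outline only: the synthetic data, the hyperparameters, and the choice of Random Forest as the sole model are assumptions, since the paper's actual Borg-trace preprocessing and the LightGBM/CatBoost configurations are not specified here.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.preprocessing import QuantileTransformer
from sklearn.decomposition import TruncatedSVD
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Synthetic stand-in for Borg-trace features (e.g. per-task CPU/memory stats);
# the real study loads these from the Google Borg Traces dataset.
rng = np.random.default_rng(42)
X = rng.random((2000, 20))
# Hypothetical target: a noisy CPU-utilization proxy driven by two features.
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] ** 2 + 0.05 * rng.standard_normal(2000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

pipeline = Pipeline([
    # Mutual-Information-based feature selection, as the paper reports.
    ("select", SelectKBest(mutual_info_regression, k=4)),
    # Quantile normalization (one of the transforms mentioned; Power and
    # Z-score scaling would slot in the same way).
    ("scale", QuantileTransformer(output_distribution="normal",
                                  n_quantiles=200, random_state=0)),
    # Truncated SVD for dimensionality reduction.
    ("svd", TruncatedSVD(n_components=3, random_state=0)),
    # One of the three ensemble regressors evaluated in the study.
    ("model", RandomForestRegressor(n_estimators=200, random_state=0)),
])

pipeline.fit(X_train, y_train)
pred = pipeline.predict(X_test)

# The three evaluation metrics used in the paper.
mae = mean_absolute_error(y_test, pred)
rmse = mean_squared_error(y_test, pred) ** 0.5
r2 = r2_score(y_test, pred)
print(f"MAE={mae:.4f}  RMSE={rmse:.4f}  R2={r2:.4f}")
```

Swapping `RandomForestRegressor` for `lightgbm.LGBMRegressor` or `catboost.CatBoostRegressor` reproduces the paper's three-model comparison; SHAP values for interpretability could then be computed on the fitted model with the `shap` library.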



How to Cite

Sanjeev Kumar, Dr. Saurabh Charaya, & Dr. Rachna Mehta. (2025). Ensemble-Based Predictive Framework for Computational Workload Forecasting Using Google Borg Traces. International Journal of Research & Technology, 13(3), 455–472. https://doi.org/10.64882/ijrt.v13.i3.482
