Vol. 20 No. 2 (2024)

Published: December 31, 2024

Pages: 275-283

Original Article

Understanding Power Gating Mechanism Based on Workload Classification of Modern Heterogeneous Many-Core Mobile Platform in the Dark Silicon Era

Alrudainy, Marzook, Hussein & Shafik

Abstract

The rapid progress in mobile computing necessitates energy-efficient solutions to support substantially diverse and complex workloads. Heterogeneous many-core platforms are increasingly being adopted in contemporary embedded implementations to deliver high performance at low power cost. These implementations experience diverse workloads that offer significant opportunities to improve energy efficiency. In this paper, we propose a novel per-core power gating (PCPG) approach based on workload classification (WLC) for substantial energy cost minimization in the dark silicon era. The core of our paradigm is integrated sleep mode management driven by workload classification derived from performance counters. A number of real application benchmarks from the PARSEC suite are adopted as practical examples of diverse workloads, including memory- and CPU-intensive ones. These applications are exercised on a Samsung Exynos 5422 heterogeneous many-core system, showing energy efficiency improvements of up to 37% and 110% when compared with our most recent published work and the ondemand governor, respectively. Furthermore, we present a low-complexity, low-cost runtime per-core power gating algorithm that consistently maximizes IPS/Watt across the entire state space.
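
To make the idea of counter-driven classification and per-core gating concrete, the following is a minimal illustrative sketch only, not the algorithm evaluated in the paper. The counter fields, the two thresholds, and the helper names (`CoreSample`, `classify`, `cores_to_gate`, `ips_per_watt`) are all assumptions introduced here; the abstract does not specify them. The sketch labels each core as idle, memory-intensive, or CPU-intensive from hypothetical counter readings, lists idle cores as gating candidates, and computes the IPS/Watt metric.

```python
from dataclasses import dataclass


@dataclass
class CoreSample:
    """Hypothetical per-core counter readings over one sampling interval."""
    core_id: int
    instructions: int     # instructions retired in the interval
    mem_accesses: int     # assumed memory-related counter (e.g. cache misses)
    busy_fraction: float  # fraction of the interval the core was not idle


# Illustrative thresholds (assumptions, not values from the paper).
IDLE_BUSY_THRESHOLD = 0.05
MEM_INTENSITY_THRESHOLD = 0.02  # memory accesses per instruction


def classify(sample: CoreSample) -> str:
    """Label a core's workload as 'idle', 'memory', or 'cpu'."""
    if sample.instructions == 0 or sample.busy_fraction < IDLE_BUSY_THRESHOLD:
        return "idle"
    mem_per_instr = sample.mem_accesses / sample.instructions
    return "memory" if mem_per_instr >= MEM_INTENSITY_THRESHOLD else "cpu"


def cores_to_gate(samples: list[CoreSample]) -> list[int]:
    """Return the ids of cores classified as idle (power gating candidates)."""
    return [s.core_id for s in samples if classify(s) == "idle"]


def ips_per_watt(instructions: int, interval_s: float, power_w: float) -> float:
    """Instructions per second divided by average power in watts."""
    return (instructions / interval_s) / power_w


samples = [
    CoreSample(core_id=0, instructions=1_000_000, mem_accesses=50_000, busy_fraction=0.9),
    CoreSample(core_id=1, instructions=1_000_000, mem_accesses=1_000, busy_fraction=0.8),
    CoreSample(core_id=2, instructions=0, mem_accesses=0, busy_fraction=0.0),
]
labels = [classify(s) for s in samples]
gated = cores_to_gate(samples)
```

In this sketch only idle cores are gated; a real runtime would also need to account for wake-up latency and for how memory- and CPU-intensive classes map onto the platform's heterogeneous cores, which the abstract does not detail.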
