Page 281 - 2024-Vol20-Issue2

277 |                                                                           Alrudainy, Marzook, Hussein & Shafik

tions and contributions of the most recent existing approaches. In recent years, significant research has addressed real-time energy reduction. These techniques have mostly considered single-metric optimization: either improving performance within a given power budget, or reducing power under a performance constraint [14]. For instance, a real-time dynamic voltage and frequency scaling (DVFS) control method for reducing the power of many-core embedded platforms has been proposed in [15, 21–23]. This method uses performance and user-experience constraints to obtain the minimum DVFS combinations by adopting reinforcement learning and transfer principles. Others presented a power reduction method that analyzes the workload at run time to continually maintain the core allocation and DVFS combination through predictive control using multinomial logistic regression [16]. A number of research papers have also reported analytical investigations using simulation frameworks, including McPAT and gem5. These studies used task mapping, DVFS, and offline optimization methods to significantly reduce power dissipation under workload variations [17, 24–26]. The work in [11] presented a low-complexity runtime management approach based on workload classification for heterogeneous many-core platforms. This approach covers most of the configuration space of the Odroid-XU3 platform, including core types, thread allocation, and dynamic voltage and frequency scaling.

A hardware-based load balancing scheme for homogeneous many-core systems is assessed in terms of power consumption and thermal behavior in [7]. In this scheme, power is minimized by powering off the dark silicon area. In [8], a power-gating-based sub-clock approach was implemented in an ARM Cortex-M0 processor to minimize static power consumption during the sub-clock cycle. In the same context, Charles et al. [9] performed per-core power gating (PCPG) on a contemporary homogeneous Intel Core i7 processor. They show that power gating the dark silicon area, i.e., idle cores, frees additional power headroom that can be transferred to the active cores to boost their frequency and voltage without overstepping the thermal and power envelope. Likewise, transferring the energy saved from the dark silicon area to the enabled cores was studied in [10] using a homogeneous many-core platform, the AMD Opteron 6168. The practical results of that work rely on manual adjustment of the dynamic voltage scaling (DVS) combination integrated with a per-core power gating approach.

III. SYSTEM ARCHITECTURE AND APPLICATIONS

The adoption of heterogeneous architectures, comprising two or more types of CPUs, has been increasing recently. Although these platforms provide superior performance, it is essential to ensure optimum energy consumption while exercising various types of workloads. The Odroid-XU3 board supports techniques such as core affinity, DVFS, and manual core disabling, which are commonly used to improve system operation with respect to energy consumption and performance. The Odroid-XU3 board is a small heterogeneous 8-core computing platform that can run the Android 4.4 or Ubuntu 14.04 operating systems. Its primary element is the 28 nm Exynos 5422 application processor, whose architecture is depicted in Fig. 1. This multiprocessor system on chip (MPSoC) is based on the ARM big.LITTLE heterogeneous architecture and comprises a low-power Cortex-A7 quad-core block, a high-performance Cortex-A15 quad-core block, 2 GB of LPDDR3 DRAM, and a Mali-T628 GPU. Further, the board includes four real-time current sensors that make it possible to measure the power consumption of four separate power blocks: the little (A7) CPUs, the big (A15) CPUs, the DRAM, and the GPU. In addition, there are one temperature sensor for the GPU and four temperature sensors, one for each of the A15 CPUs. The clock frequency and supply voltage (Vdd) of each power block of the Odroid-XU3 board can be adjusted over a pre-defined range of values. For example, the low-power Cortex-A7 quad-core block supports frequencies between 200 MHz and 1400 MHz in steps of 100 MHz, while the high-performance Cortex-A15 quad-core block supports frequencies between 200 MHz and 2 GHz, also in steps of 100 MHz.

The PARSEC benchmark suite of real applications supports both emerging and current workloads for multiprocessing hardware [27]. It contains a varied set of workloads from diverse domains, including systems applications and interactive animation, that mimic large-scale commercial workloads. In this paper, therefore, PARSEC applications have been adopted and exercised on the Odroid-XU3 system on chip (SoC) whose

TABLE II.
CHARACTERISTICS OF PARSEC BENCHMARK [27]

Application      Domain               Type
ferret           Similarity Search    CPU
canneal          Engineering          CPU
bodytrack        Computer Vision      CPU+mem
streamcluster    Data Mining          mem
fluidanimate     Animation            mem
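
As a small illustration of the frequency ranges quoted in this section for the two CPU clusters of the Odroid-XU3, the sketch below enumerates the nominal DVFS frequency levels of each cluster and counts the resulting (A7, A15) frequency pairs. It is not part of the original work: the names CLUSTER_RANGES_MHZ and frequency_levels are hypothetical, and the sketch assumes that every 100 MHz step between the stated limits is a valid operating point, which an actual board may not expose.

```python
# Illustrative sketch only: enumerate the nominal DVFS frequency levels
# implied by the ranges stated in the text. All names are hypothetical.

# (min MHz, max MHz, step MHz) per cluster, taken from the text:
# A7: 200-1400 MHz, A15: 200-2000 MHz, both in 100 MHz steps.
CLUSTER_RANGES_MHZ = {
    "A7": (200, 1400, 100),
    "A15": (200, 2000, 100),
}


def frequency_levels(cluster):
    """Return the list of nominal frequency levels (in MHz) for a cluster."""
    low, high, step = CLUSTER_RANGES_MHZ[cluster]
    # range() excludes its upper bound, so extend it by one step.
    return list(range(low, high + step, step))


def frequency_pairs():
    """Return every (A7 MHz, A15 MHz) pair under the stated assumption."""
    return [(a7, a15)
            for a7 in frequency_levels("A7")
            for a15 in frequency_levels("A15")]


if __name__ == "__main__":
    for name in CLUSTER_RANGES_MHZ:
        levels = frequency_levels(name)
        print(f"{name}: {len(levels)} levels, {levels[0]}-{levels[-1]} MHz")
    print(f"frequency pairs: {len(frequency_pairs())}")
```

Under these assumptions the A7 cluster has 13 levels and the A15 cluster has 19, giving 247 frequency pairs before core count and thread allocation are considered. This hints at why exhaustive exploration of the configuration space is costly at run time.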