THERMAL MANAGEMENT OF THE HYBRID MEMORY CUBE
Abstract
Main memory performance is becoming an increasingly important factor contributing
to overall system performance, especially due to the so-called memory wall. The
Hybrid Memory Cube (HMC) is a novel 3D heterogeneous DRAM architecture, designed
to improve DRAM performance, consisting of DRAM dies stacked on top of
each other with a logic die at the bottom, all interconnected by dense through-silicon
vias (TSVs). The logic die communicates with an on-chip memory controller
over high-speed links to receive and send requests. Modelling the Hybrid Memory
Cube in HotSpot has indicated that this cube has a natural temperature variation, with
the hottest layers at the bottom and the cooler layers at the top. High temperatures and
variations within a DRAM can result in reduced performance and efficiency, especially
when dynamic thermal management (DTM) schemes are used to throttle DRAM bandwidth
whenever the temperature gets too high. Hence, this dissertation attempts to reduce
the maximum temperature and also this variation by using data compression:
compression is performed in the on-chip memory controller, and the compressed
blocks are read/written using fewer bursts in the Hybrid Memory Cube, hence reducing
power dissipation. Compressed blocks are stored only in the hotter banks of the cube
to mitigate the thermal gradient in the cube. The maximum temperature was reduced by
as much as 6 °C, and since the HMC spent less time throttling when DTM schemes
were used, execution time was reduced by up to 12.5%, with an average
reduction of 2.6%.
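
The placement idea summarised above can be illustrated with a minimal sketch. It is
not the dissertation's implementation: the block size, the burst size, the
median-based split into hotter and cooler banks, and every function name below are
assumptions chosen only to show how storing compressed blocks in the hotter banks
lowers the number of bursts those banks serve.

```python
import math

BLOCK_BYTES = 64  # assumed uncompressed block size (hypothetical value)
BURST_BYTES = 16  # assumed bytes transferred per burst (hypothetical value)


def bursts_needed(size_bytes):
    """Number of bursts required to transfer size_bytes."""
    return math.ceil(size_bytes / BURST_BYTES)


def hotter_banks(bank_temps):
    """Return the ids of banks at or above the median temperature.

    bank_temps maps a bank id to a temperature estimate, for example one
    taken from a thermal model. The median split is an assumption.
    """
    ordered = sorted(bank_temps.values())
    median = ordered[len(ordered) // 2]
    return {bank for bank, temp in bank_temps.items() if temp >= median}


def access_bursts(bank, compressed_bytes, hot):
    """Bursts for one block access.

    Blocks are kept in compressed form only in the hotter banks, so only
    those banks benefit from the shorter transfer.
    """
    if bank in hot:
        return bursts_needed(min(compressed_bytes, BLOCK_BYTES))
    return bursts_needed(BLOCK_BYTES)


# Hypothetical temperatures: bank 0 is nearest the logic die and hottest.
temps = {0: 85.0, 1: 80.0, 2: 70.0, 3: 65.0}
hot = hotter_banks(temps)
for bank in sorted(temps):
    print(bank, access_bursts(bank, compressed_bytes=20, hot=hot))
```

Under these assumed sizes, a block that compresses to 20 bytes needs two bursts
instead of four when it resides in a hotter bank, while accesses to the cooler
banks are unchanged.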