Artificial intelligence has become widespread and now permeates social life. Electronic devices are connected to one another, wirelessly and via other networks, and can communicate constantly. As a result, a substantial amount of data is generated worldwide every second, and the time needed to create a given volume of data keeps shrinking. This unprecedented explosion in data has given rise to a new industry that goes beyond storing data and managing traffic worldwide to extracting valuable information and putting it to use. For example, the driving capabilities of autonomous vehicles have advanced rapidly because these systems recognize information about the surrounding environment, which is input continuously in real time, and accurately classify it into specific objects and signals. One reason for this new wave of data-centric paradigms is the development of semiconductor technology over the past few decades. The performance and cost of transistors, a representative semiconductor device, have improved through Moore's law scaling [1].
Consequently, many innovative products have been manufactured at reasonable prices, creating numerous derivative industries. More specifically, increasing the number of tiny transistors integrated into a given silicon chip allows more, and more versatile, processing and arithmetic operations to be performed per clock cycle. Memory elements based on laterally scaled and vertically stacked structures have likewise increased memory capacity significantly [2, 3].
As we advance into the big-data era, demand is growing for higher performance from computing systems, which consist primarily of these two fundamental components, i.e., central processing units (CPUs) and memories, so that they can handle the exponentially growing amount of data. However, in the conventional von Neumann computing architecture, data processed by the CPU must frequently be moved back and forth to the memory for storage, which can lead to the memory wall, or von Neumann bottleneck, as shown in Figure 2 [4].
Power-constrained computing systems are gaining further importance because all electronic devices are expected to function continuously in always-connected environments. Analyses of workloads on traditional computing systems clearly indicate that real-time applications, such as hand-tracking services and audio recognition, spend more than half of their total energy on moving and storing data rather than on performing computations [5].
These problems necessitate the development of new computing systems that overcome power inefficiency by minimizing sequential processing.