Deep Learning Hardware
For over 30 years, the evolution of general-purpose computing was impressive and fast-paced, doubling performance roughly every two years. But then, some would say not unexpectedly, it hit the metaphorical wall.

As so many have commented, process progress has started to slow down, and Moore's Law has entered its last throes. The process technology, which did most of the heavy lifting in increasing performance from generation to generation, no longer delivers the same gains.

On top of this, we hit the end of Dennard scaling, due to the decrease in our ability to reduce supply voltage when moving from one process technology to the next.
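A rough first-order sketch makes this concrete (the code below is illustrative only; the function name and scaling factors are the textbook first-order assumptions, not measured data). Dynamic switching power goes roughly as C·V²·f: as long as the supply voltage shrank along with the transistors, power per unit area stayed roughly constant, but with the voltage stuck, every shrink now raises power density.

    # Illustrative first-order model, not measured data: classical Dennard scaling
    # assumes that when feature size shrinks by a factor S, per-transistor
    # capacitance and supply voltage shrink by ~1/S, frequency rises by ~S,
    # and transistor density rises by ~S^2.
    def relative_power_density(S, voltage_scales=True):
        """Relative dynamic power per unit area after scaling by S."""
        C = 1.0 / S                              # capacitance per transistor
        V = 1.0 / S if voltage_scales else 1.0   # supply voltage: scaled vs. stuck
        f = S                                    # clock frequency
        per_transistor = C * V ** 2 * f          # dynamic power ~ C * V^2 * f
        return per_transistor * S ** 2           # times transistors per unit area

    for S in (1.0, 1.4, 2.0, 2.8):
        with_v = relative_power_density(S, voltage_scales=True)
        without_v = relative_power_density(S, voltage_scales=False)
        print(f"scale {S:.1f}x: {with_v:.2f} with voltage scaling, {without_v:.2f} without")

With voltage scaling the power density stays at 1.0 across generations; without it, it grows as S², which is exactly the budget problem described next.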
Additionally, the growth in single-thread general-purpose performance has peaked, and cooling devices have reached their "power envelope" (1). As a result, we have reached the stage where not every transistor on a die can operate simultaneously. Only a portion of the on-die functions can be fully active at any given time; this circumstance is known as "dark silicon." The same is true for multi-core chips: not all cores can operate at full speed simultaneously.
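The arithmetic behind that statement is simple. In the sketch below, the core count, per-core power, and package budget are made-up illustrative numbers, not figures from any particular chip: a fixed package power budget divided by the power of a fully active core caps how many cores can run flat out at once.

    # Purely illustrative numbers, not from any real chip: a fixed power budget
    # limits how much of the die can be "lit up" at full speed simultaneously.
    def active_fraction(num_cores, watts_per_core, package_budget_watts):
        """Fraction of cores that can run at full speed within the power budget."""
        max_active_cores = package_budget_watts / watts_per_core
        return min(1.0, max_active_cores / num_cores)

    # e.g. 64 cores at ~4 W each under a 125 W package budget -> roughly half the die
    print(f"{active_fraction(64, 4.0, 125.0):.0%} of cores can be fully active at once")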
Meanwhile, we have observed the usage of various applications in the cloud and on the edge increase exponentially. Due to the performance demands of these new applications, compute performance was quickly crowned king, with an ever-increasing demand for power (2).
“But what about GPUs?” one might ask. The GPU was a development of “generalized execution” for graphics applications. It was a step towards specialization, albeit a wide one. The very repetitive nature of graphics operations allowed the instruction overhead to be “divided” across a large number of data elements. It achieves greater efficiency than the CPU for certain kinds of applications, but it is still an ISA-based machine and, as such, it is hitting the same evolutionary wall.
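A toy cost model illustrates the amortization argument (the overhead and work figures below are arbitrary units chosen for illustration, not measurements of any GPU): if every instruction carries a fixed fetch-and-decode overhead, applying one instruction to W data elements spreads that overhead across all W of them.

    # Toy cost model with arbitrary units: one instruction's fixed overhead is
    # shared by all the data elements it operates on, so wider data-parallel
    # execution drives the per-element cost toward the useful work alone.
    def cost_per_element(overhead_per_instruction, work_per_element, width):
        """Average cost per element when one instruction covers `width` elements."""
        return work_per_element + overhead_per_instruction / width

    for width in (1, 4, 32, 1024):  # scalar, narrow SIMD, wide SIMD, GPU-like widths
        cost = cost_per_element(overhead_per_instruction=10.0, work_per_element=1.0, width=width)
        print(f"width {width:4d}: {cost:.2f} cost units per element")

The per-element cost falls toward the useful work alone, which is why the approach pays off only when the same operation really is applied to large, regular blocks of data.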
Thus, the inevitable target of our current computing environment is to increase performance per unit of power, i.e. to reach the best possible power efficiency.