Approximate Acceleration for a Post Multicore Era
Starting in 2004, the microprocessor industry shifted to multicore scaling--increasing the number of cores per die with each technology generation--as its principal strategy for continuing performance growth. This work first studies the interplay between the rise of multicore processors and the rise of managed languages, such as Java, over the past decade. This dissertation then looks to the future, studies trends in transistor scaling, and investigates whether multicore scaling can sustain the traditional performance improvements that have been the driving force for the entire computing industry over the past forty years. The results challenge the conventional wisdom that multicore scaling is a viable path for exploiting increased transistor counts and sustaining historical performance trends. Furthermore, the results show that dark silicon--the fraction of a chip that must be powered off at all times due to power constraints--may break the economics of continued silicon scaling. Our study highlights that radical departures from conventional approaches are necessary to sustain the traditional rate of performance improvement in general-purpose computing, and that such techniques must provide significant performance and energy-efficiency gains across a wide range of applications.

This dissertation then proposes a new direction for general-purpose computing that leverages approximation to address the dark silicon challenge. While conventional techniques--such as dynamic voltage and frequency scaling--trade performance for energy, general-purpose approximate computing trades error for gains in both performance and energy. We propose variable-precision architectures, a framework spanning from the ISA (Instruction Set Architecture) down to transistor-level implementations, that allows conventional von Neumann processors to trade accuracy for energy at the granularity of single instructions.
We then propose an end-to-end solution, from the programming model to the microarchitecture, that leverages an approximate algorithmic transformation to automatically convert a hot code region from a von Neumann model to a neural model. This solution and its associated algorithmic transformation enable a new class of accelerators, called Neural Processing Units (NPUs), with implementation potential in both the digital and analog domains. This work shows significant gains in both performance and energy when the abstraction of full accuracy is relaxed in general-purpose computing. Furthermore, the proposed approaches open new avenues for research in programming languages, architecture, mixed-signal circuit design, and even machine learning. The significant gains from the proposed techniques show that general-purpose approximate computing can be a path forward as the gains from conventional approaches diminish.
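The flavor of this neural transformation can be conveyed with a toy sketch (illustrative only, not code from the dissertation): a hypothetical "hot" function is mimicked by a tiny multilayer perceptron trained offline, and the network is then invoked in place of the original region, with the approximation error measured explicitly. All names here are assumptions made for illustration.

```python
import math
import random

# Hypothetical hot function a compiler might mark for neural
# approximation (illustrative; not from the dissertation).
def hot_function(x):
    return math.sin(x) * x

# Toy stand-in for the offline training phase: a 1-8-1 tanh MLP
# trained by stochastic gradient descent to mimic hot_function on [0, 2].
random.seed(0)
H = 8
w1 = [random.uniform(-1, 1) for _ in range(H)]
b1 = [0.0] * H
w2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0

def forward(x):
    # Evaluate the network; returns the output and hidden activations.
    hidden = [math.tanh(w1[i] * x + b1[i]) for i in range(H)]
    return sum(w2[i] * hidden[i] for i in range(H)) + b2, hidden

lr = 0.05
for _ in range(20000):
    x = random.uniform(0, 2)
    target = hot_function(x)
    y, hidden = forward(x)
    err = y - target
    # Backpropagate the squared-error gradient for this one example.
    for i in range(H):
        grad_h = err * w2[i] * (1 - hidden[i] ** 2)
        w2[i] -= lr * err * hidden[i]
        w1[i] -= lr * grad_h * x
        b1[i] -= lr * grad_h
    b2 -= lr * err

# "Runtime" phase: invoke the neural stand-in instead of the original
# code region, and quantify the error introduced by approximation.
xs = [i / 50 for i in range(101)]
max_err = max(abs(forward(x)[0] - hot_function(x)) for x in xs)
print(f"max abs error on [0, 2]: {max_err:.3f}")
```

The sketch mirrors the two phases the abstract describes: a training step that fits a neural model to observed input/output behavior of the hot region, and an invocation step that trades a bounded error for the opportunity to run on a fast, energy-efficient neural accelerator.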