ISPD 2020 Physical Mapping of Neural Networks on a Wafer-Scale Deep Learning Accelerator

ISPD '20: International Symposium on Physical Design, Taipei, Taiwan, September 2020

Abstract
This paper introduces a special case of the floorplanning problem for optimizing neural networks to run on a wafer-scale computing engine. From a compute perspective, a neural network can be represented as a deeply layered structure of compute kernels. During training, gradient descent is used to determine the network's weights. Each layer then uses a local weight tensor to transform "activations" and "gradients" that are shared among connected kernels according to the topology of the network. This process is computationally intensive and requires high memory and communication bandwidth. Cerebras has developed a novel computer system for this workload, powered by a 21.5 cm by 21.5 cm wafer-scale processor with 400,000 programmable compute cores. The processor is structured as a regular array of 633 by 633 processing elements, each with its own local high-bandwidth SRAM and a direct high-bandwidth connection to its neighboring elements. In addition to supporting traditional execution models for neural network training and inference, this engine has the unique capability to compile and compute every layer of a complete neural network simultaneously. Mapping a neural network in this fashion onto Cerebras' Wafer-Scale Engine (WSE) is reminiscent of the traditional floorplanning problem in physical design: each kernel ends up as a rectangle of x by y compute elements, and these are the flexible blocks that must be placed to optimize performance. This paper describes an ISPD 2020 challenge to develop algorithms and heuristics that produce compiled neural networks achieving the highest possible performance on the Cerebras WSE.
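To make the problem statement concrete, below is a minimal Python sketch of the placement formulation implied by the abstract: kernels as flexible x-by-y rectangles placed on a 633 by 633 fabric, with legality (fit on the wafer, no overlap) and a simple cost. All names here (`Kernel`, `FABRIC_SIZE`, the placement format) are illustrative assumptions, and the max-kernel-time objective is only a proxy; the contest's actual kernel shapes, routing costs, and scoring function are defined in the ISPD 2020 benchmark suite, not reproduced here.

```python
# Hypothetical sketch of the WSE kernel-placement problem (assumptions noted above).
from dataclasses import dataclass

FABRIC_SIZE = 633  # the WSE is a 633 x 633 grid of processing elements


@dataclass
class Kernel:
    name: str
    width: int    # x extent in compute elements (one of several legal shapes)
    height: int   # y extent in compute elements
    time: float   # estimated execution time for this shape (assumed given)


def is_legal(placement: dict[str, tuple[int, int]],
             kernels: dict[str, Kernel]) -> bool:
    """A placement maps kernel name -> lower-left (x, y). It is legal if
    every rectangle fits on the fabric and no two rectangles overlap."""
    rects = []
    for name, (x, y) in placement.items():
        k = kernels[name]
        if x < 0 or y < 0 or x + k.width > FABRIC_SIZE or y + k.height > FABRIC_SIZE:
            return False  # rectangle falls off the wafer
        rects.append((x, y, x + k.width, y + k.height))
    for i in range(len(rects)):
        for j in range(i + 1, len(rects)):
            ax0, ay0, ax1, ay1 = rects[i]
            bx0, by0, bx1, by1 = rects[j]
            if ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1:
                return False  # two kernels claim the same compute elements
    return True


def cost(kernels: dict[str, Kernel]) -> float:
    """Because all layers execute simultaneously, overall throughput is
    limited by the slowest kernel (a simplified stand-in objective)."""
    return max(k.time for k in kernels.values())


# Example: two kernels tiled side by side across the full fabric height.
kernels = {"conv1": Kernel("conv1", 300, 633, 1.0),
           "conv2": Kernel("conv2", 333, 633, 1.2)}
placement = {"conv1": (0, 0), "conv2": (300, 0)}
assert is_legal(placement, kernels)
print(cost(kernels))  # 1.2 -> conv2 is the bottleneck
```

A placer for this problem would search jointly over kernel shapes and positions, since shrinking one rectangle frees fabric area but typically raises that kernel's execution time.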
Keywords
Machine Learning, Physical Design, Floorplanning, Training of Neural Networks, Wafer-Scale Circuits