Optimizing Deep Learning Inference via Global Analysis and Tensor Expression

Dec 30, 2018
Jiacheng Zhao, Yisong Chang, Denghui Li, Chunwei Xia, Huimin Cui, Ke Zhang, Xiaobing Feng
Abstract
A large number of accelerators have recently been proposed to improve the performance of AI applications, which makes it a significant challenge to extend existing AI programming frameworks to support them. In this paper, we use TensorFlow to demonstrate how to port an AI programming framework to new hardware, namely FPGA and Sunway TaihuLight. These two platforms represent distinct and significant hardware architectures, and so make useful case studies for the retargeting process. We describe our retargeting process and experience on both platforms, from the source code to the compilation process. We then compare the two retargeting approaches and present preliminary experimental results.
Type
Publication
In Network and Parallel Computing