HOPE: a heterogeneity-oriented parallel execution engine for inference on mobiles

Apr 29, 2022 · Chunwei Xia, Jiacheng Zhao, Huimin Cui, Xiaobing Feng
Abstract
Efficient support for artificial intelligence (AI) applications on heterogeneous mobile platforms is important, especially the coordinated execution of a deep neural network (DNN) model across the multiple computing devices of a single mobile platform. This paper proposes HOPE, an end-to-end heterogeneous inference framework for mobile platforms that distributes the operators of a DNN model to different computing devices. The distribution problem is formalized as an integer linear programming (ILP) problem, and a heuristic algorithm is proposed to determine a near-optimal heterogeneous execution plan. Experimental results show that HOPE reduces inference latency by up to 36.2% (22.0% on average) compared with MOSAIC, by up to 22.0% (10.2% on average) compared with StarPU, and by up to 41.8% (18.4% on average) compared with μLayer.
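To make the operator placement problem concrete, here is a minimal toy sketch in Python. It is not HOPE's ILP formulation or its heuristic: it assumes a purely sequential chain of three operators, uses made-up latency numbers, charges a fixed hypothetical transfer cost whenever consecutive operators run on different devices, and finds the best plan by exhaustive search. It also ignores parallel execution across devices, which the paper targets. All names (`LATENCY`, `TRANSFER_MS`, `plan_latency`, `best_plan`) are illustrative assumptions.

```python
from itertools import product

# Hypothetical per-operator latency in ms on each device (made-up values).
LATENCY = {
    "conv1": {"cpu": 8.0, "gpu": 3.0},
    "pool1": {"cpu": 1.0, "gpu": 1.5},
    "fc1": {"cpu": 1.0, "gpu": 5.0},
}
# Assumed fixed cost in ms of moving a tensor between two devices.
TRANSFER_MS = 2.0


def plan_latency(ops, assignment):
    """Total latency of a sequential operator chain under a device assignment."""
    total = 0.0
    for i, (op, dev) in enumerate(zip(ops, assignment)):
        total += LATENCY[op][dev]
        # Pay the transfer cost when the previous operator ran elsewhere.
        if i > 0 and assignment[i - 1] != dev:
            total += TRANSFER_MS
    return total


def best_plan(ops, devices):
    """Try every assignment; only feasible for very small operator counts."""
    return min(
        product(devices, repeat=len(ops)),
        key=lambda assignment: plan_latency(ops, assignment),
    )


ops = list(LATENCY)
plan = best_plan(ops, ["cpu", "gpu"])
print(dict(zip(ops, plan)), plan_latency(ops, plan))
```

Exhaustive search grows exponentially with the number of operators, which is one reason a real system would turn to an ILP solver or a heuristic, as the abstract describes.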
Type
Publication
In 2022 High Technology Letters