Shuaijiang Li, Jiacheng Zhao, Ying Liu, Shuoming Zhang, Lei Chen, Yijin Li, Yangyu Zhang, Zhicheng Li, Runyu Zhou, Xiyu Shi, Chunwei Xia, Yuan Wen, Xiaobing Feng, Huimin Cui (2026). From Threads to Tiles: T2T, a Compiler for CUDA-to-NPU Translation via 2D Vectorization. In CGO 2026.