Deep learning (DL) technology has achieved great success in today’s artificial intelligence (AI) applications. However, executing DL algorithms is extremely resource-intensive and power-hungry because of the large-scale data and highly complex computational models involved. As a result, the computational efficiency of DL computing systems is severely constrained. This project aims to solve this challenge through a hardware and software co-optimization strategy, an approach that has been overlooked in previous research. The resulting high-performance deep learning system will be able to automatically select the best parameter configurations for advanced algorithms based on the hardware architecture of the computing system.