Popular deep learning (DL) frameworks (e.g., Caffe, TensorFlow) provide convenient APIs for developing deep neural networks (DNNs). However, each framework supports only its own interfaces, including programming languages and execution environments. We therefore developed IDLE to provide a unified development environment that is convenient for users and independent of any single framework.
IDLE has the following features.
– A web-based graphical user interface that allows users to easily configure deep neural networks and execution environments.
– An intuitive JSON-based DL model description language for representing DNN models.
– A compiler that converts models into executable files for existing DL frameworks such as Caffe, TensorFlow, and MXNet.
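The source does not specify the schema of the JSON-based model description language, but a framework-independent DNN description along these lines could look as follows. All field names and values here are hypothetical illustrations, not the actual IDLE format:

```json
{
  "model": "mnist_cnn",
  "framework": "tensorflow",
  "layers": [
    {"type": "conv2d", "filters": 32, "kernel": [3, 3], "activation": "relu"},
    {"type": "maxpool2d", "pool": [2, 2]},
    {"type": "dense", "units": 10, "activation": "softmax"}
  ],
  "training": {"optimizer": "sgd", "learning_rate": 0.01, "batch_size": 128}
}
```

A compiler as described in the list above would translate such a description into framework-specific code (e.g., a TensorFlow graph or a Caffe prototxt), so the same model file can target any supported backend.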
The performance of a DL model differs across DL frameworks. We studied a DL scheduler that automatically configures the framework and computing devices when executing large-scale DL applications. Through this scheduler, we expect to control energy consumption efficiently during execution.
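The paper does not detail the scheduler's selection policy; one plausible sketch, assuming the scheduler has profiled runtime and energy for each (framework, device) pair, is to pick the lowest-energy configuration that still meets a deadline. The function name and the profile table format are assumptions for illustration:

```python
def pick_config(profiles, deadline_s):
    """Choose a (framework, device) pair from profiled measurements.

    profiles: dict mapping (framework, device) -> (runtime_s, energy_joules)
              (hypothetical profiling data, not from the paper)
    deadline_s: maximum acceptable runtime in seconds

    Returns the feasible configuration with the lowest energy use,
    or None if no configuration meets the deadline.
    """
    # Keep only configurations fast enough to meet the deadline.
    feasible = {cfg: energy for cfg, (runtime, energy) in profiles.items()
                if runtime <= deadline_s}
    if not feasible:
        return None
    # Among feasible configurations, minimize energy consumption.
    return min(feasible, key=feasible.get)


# Example with made-up profiling numbers:
profiles = {
    ("tensorflow", "gpu"): (10.0, 500.0),
    ("caffe", "gpu"): (12.0, 450.0),
    ("tensorflow", "cpu"): (60.0, 300.0),
}
```

With a 20-second deadline, the CPU run is excluded despite its lower energy, and the scheduler picks the cheaper of the two GPU configurations.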
For synchronous training of distributed DL, the global mini-batch is divided equally among nodes. In a heterogeneous GPU cluster, the slowest node, called a straggler, increases the overall training time. Our batch-orchestration algorithm assigns a different local mini-batch size to each node according to its GPU performance, addressing the straggler problem.
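The idea of sizing local mini-batches by GPU performance can be sketched as follows. This is a minimal illustration assuming per-node throughput (samples/sec) is known; the function name and the proportional-split policy are assumptions, not the paper's exact algorithm:

```python
def orchestrate_batches(global_batch, throughputs):
    """Split a global mini-batch across heterogeneous GPUs in
    proportion to each node's measured throughput (samples/sec),
    so faster nodes receive more samples and stragglers fewer.
    """
    total = sum(throughputs)
    # Proportional share per node, truncated to an integer.
    shares = [int(global_batch * t / total) for t in throughputs]
    # Hand any leftover samples to the fastest nodes first,
    # so the sum of local batches equals the global batch size.
    leftover = global_batch - sum(shares)
    by_speed = sorted(range(len(throughputs)),
                      key=lambda i: -throughputs[i])
    for i in by_speed[:leftover]:
        shares[i] += 1
    return shares


# Three nodes: two fast GPUs and one half-speed straggler.
local_batches = orchestrate_batches(256, [100.0, 100.0, 50.0])
```

With an equal split each node would process about 85 samples and the slow GPU would dominate the step time; the proportional split instead gives the straggler roughly half the work of a fast node, so all nodes finish their local batch at about the same time.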