HPC Kit is an integrated suite of tools for measurement and analysis of program performance on computers ranging from multicore desktop systems to the nation’s largest supercomputers. By using statistical sampling of timers and hardware performance counters, HPC Kit collects accurate measurements of a program’s work, resource consumption, and inefficiency and attributes them to the full calling context in which they occur. HPC Kit supports measurement and analysis of serial codes, threaded codes (e.g., pthreads, OpenMP), MPI codes, and hybrid (MPI+threads) parallel codes. As GPUs have become prevalent as accelerators for scientific computation, we are extending HPC Kit to support applications accelerated with GPUs from several vendors. In this presentation, we discuss our recent work on supporting Intel GPUs and our experience profiling several applications being ported to Intel GPUs using Data Parallel C++.