Project Proposal:
ISPC Hardware Performance Monitoring Tool
Nipunn Koorapati
Summary
I plan to use x86 hardware performance monitoring instructions and tools to measure the performance of code generated by the ISPC compiler. I will insert hardware counter instructions into the code that the compiler emits and use the counter readings to characterize runtime performance.
Background

ISPC is a C-like language designed to expose single-program-multiple-data (SPMD) parallelism. ispc is an open source compiler for this language that produces vector instructions. Programs written in ISPC frequently see significant speedups because they exploit the parallelism of the SSE or AVX vector units.

Newer Intel processors (such as Westmere and Sandy Bridge) include hardware performance counters for measuring many interesting hardware-level performance metrics, including instructions per clock (IPC), L2 and L3 cache miss rates and associated coherency traffic, and TLB miss rates.

Challenge

The challenge is to combine Intel performance counter instructions with the ISPC compiler to obtain useful performance metrics on arbitrary programs.

Resources

I will be modifying the ISPC compiler to make use of Intel performance counters.

Matt Pharr is a Principal Engineer at Intel and the main developer of ISPC. I've emailed him about this project and plan to keep consulting him as it progresses.

Agner Fog has written code using these counters which will be useful as a reference.

gprof is a profiling tool used with the gcc compiler for profiling CPU code without any special hardware counters. Its architecture is described in a paper. I will likely design the compiler/profiling architecture in a similar way.

Goals/Deliverables

I plan to add a mode to the ISPC compiler that emits code instrumented with hardware performance counter instructions. When the instrumented program is run, it should output hardware performance statistics for that run.
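A minimal sketch of what the instrumentation runtime could look like is given here, in plain C rather than in compiler-emitted code. Everything in it is an assumption made for illustration: the names `prof_begin`, `prof_end`, `prof_report`, and `prof_table` are hypothetical, and the counter read uses the time-stamp counter (`__rdtsc`) as a stand-in, because reading programmable counters with RDPMC from user code generally requires the operating system to enable it first. The instrumented mode would presumably emit the equivalent of a `prof_begin`/`prof_end` pair around each region of interest.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
#endif

#define PROF_MAX_REGIONS 64

/* One row per instrumented region, in the spirit of gprof's per-function table. */
typedef struct {
    const char *name;   /* label supplied at the begin call */
    uint64_t start;     /* counter value at the most recent begin */
    uint64_t total;     /* accumulated ticks across all begin/end pairs */
    uint64_t calls;     /* number of completed begin/end pairs */
} prof_region;

prof_region prof_table[PROF_MAX_REGIONS];

/* Stand-in counter read: the TSC on x86, clock() elsewhere. */
uint64_t prof_read(void) {
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    return (uint64_t)clock();
#endif
}

/* Would be emitted at the entry of an instrumented region. */
void prof_begin(int id, const char *name) {
    assert(id >= 0 && id < PROF_MAX_REGIONS);
    prof_table[id].name = name;
    prof_table[id].start = prof_read();
}

/* Would be emitted at the exit of an instrumented region. */
void prof_end(int id) {
    assert(id >= 0 && id < PROF_MAX_REGIONS);
    prof_table[id].total += prof_read() - prof_table[id].start;
    prof_table[id].calls++;
}

/* Called once at program exit to print the accumulated statistics. */
void prof_report(FILE *out) {
    for (int i = 0; i < PROF_MAX_REGIONS; i++) {
        if (prof_table[i].calls) {
            fprintf(out, "%-24s calls=%llu ticks=%llu\n",
                    prof_table[i].name,
                    (unsigned long long)prof_table[i].calls,
                    (unsigned long long)prof_table[i].total);
        }
    }
}
```

Keeping the accumulation in a fixed-size table indexed by a compile-time region id means the emitted code needs no allocation or lookup on the hot path, which should limit how much the instrumentation perturbs the measurements.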

If I have time, I will try to produce a visualization tool to display the statistics across time and/or parts of the code.

Platform

The ISPC compiler is a good place to use hardware performance counters because code written in ISPC is explicitly trying to take advantage of hardware-level parallelism.

Proposed Schedule
[Please do not modify the schedule on this page after the proposal process (it is your proposed schedule, and it will be useful to compare it to your actual project logs at the end of the project). Update the schedule on your project main page if your goals or timeline change over the course of the project.]

Week            What I Plan To Do
Apr 1-7         Run x86 code using performance counters
Apr 8-14        Run/modify aobench example in ISPC
Apr 15-21       Emit x86 code in ISPC in interesting spots. Output results
Apr 22-28       Analyze results on various ISPC programs
Apr 29-May 5    Output results in visual way
May 6-11        Write example ISPC programs to show features