Solaris Application Programming, 1/e - Chapter 4 - Parent Directory


05.08.2013

When the -c flag is passed to cpustat (or cputrack), it names a pair of counters to collect. These are referred to as pic0 and pic1. More than 60 event types are available on the UltraSPARC IIICu processor, and two can be selected at once. Some event types are available on only one of the counters, so not every pairing is possible. The ,sys appended to the pair of counter descriptions indicates that the counters should also be collected during system time. When several pairs are given, the counters are collected in rotation, so each pair of counters is collected for a short period of time. The default interval is five seconds.

If the program is not in a steady state (suppose it reads some data from memory and then spends the next few seconds in intensive floating-point operations), it is quite possible that this coarse sampling will miss the interesting points: for example, by looking for cache misses during the floating-point-intensive code, and looking for floating-point operations while the data is being fetched from memory.

Example 4.32 shows the command line for cpustat to rotate through a selection of performance counters, and partial output from the command.

Example 4.32 Example of cpustat Output

    $ cpustat -c pic0=Rstall_storeQ,pic1=Re_DC_miss,sys \
    > -c pic0=EC_rd_miss,pic1=Re_EC_miss,sys \
    > -c pic0=Rstall_IU_use,pic1=Rstall_FP_use,sys \
    > -c pic0=Cycle_cnt,pic1=Re_PC_miss,sys \
    > -c pic0=Instr_cnt,pic1=DTLB_miss,sys \
    > -c pic0=Cycle_cnt,pic1=Re_RAW_miss,sys
       time cpu event      pic0      pic1
      5.005   0  tick    294199   1036736 # pic0=Rstall_storeQ,pic1=Re_DC_miss,sys
      5.005   1  tick    163596  12604317 # pic0=Rstall_storeQ,pic1=Re_DC_miss,sys
     10.005   0  tick      5485    965974 # pic0=EC_rd_miss,pic1=Re_EC_miss,sys
     10.005   1  tick     76669  11598139 # pic0=EC_rd_miss,pic1=Re_EC_miss,sys
    ...

The columns of cpustat output shown in Example 4.32 are as follows.

The first column reports the time of the sample. In this example, the samples are taken every five seconds.

The next column lists the CPU identifier. The samples are taken and reported for each CPU.

The next column lists the type of event. For cpustat, the type of event is always a tick.

The next two columns list the counts for performance counters pic0 and pic1 since the last tick event.

Finally, if cpustat is rotating through counters, the names of the counters are reported after the # sign.
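Because each rotation reuses the pic0 and pic1 columns for different events, the counts only become meaningful once they are matched to the names after the # sign. The following is a minimal Python sketch of that matching, not part of the original text. It assumes the output has exactly the five-column layout shown in Example 4.32; the function name parse_cpustat and the SAMPLE constant are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Sample lines copied from Example 4.32 (header plus four tick samples).
SAMPLE = """\
   time cpu event      pic0      pic1
  5.005   0  tick    294199   1036736 # pic0=Rstall_storeQ,pic1=Re_DC_miss,sys
  5.005   1  tick    163596  12604317 # pic0=Rstall_storeQ,pic1=Re_DC_miss,sys
 10.005   0  tick      5485    965974 # pic0=EC_rd_miss,pic1=Re_EC_miss,sys
 10.005   1  tick     76669  11598139 # pic0=EC_rd_miss,pic1=Re_EC_miss,sys
"""


def parse_cpustat(text):
    """Return {event_name: {cpu_id: count}} summed over all tick lines.

    Hypothetical helper: assumes the column layout of Example 4.32.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for line in text.splitlines():
        data, _, label = line.partition("#")
        fields = data.split()
        # Skip the header, blank lines, and anything that is not a tick sample.
        if len(fields) != 5 or fields[2] != "tick":
            continue
        cpu = int(fields[1])
        counts = {"pic0": int(fields[3]), "pic1": int(fields[4])}
        # The label names the events, e.g. "pic0=EC_rd_miss,pic1=Re_EC_miss,sys".
        # Items without "=" (such as "sys") are ignored.
        for item in label.strip().split(","):
            pic, sep, event = item.partition("=")
            if sep and pic in counts:
                totals[event][cpu] += counts[pic]
    return {event: dict(per_cpu) for event, per_cpu in totals.items()}


if __name__ == "__main__":
    for event, per_cpu in parse_cpustat(SAMPLE).items():
        print(event, per_cpu)
```

Keying the result by event name rather than by pic0 or pic1 means that an event such as Cycle_cnt, which appears in more than one pair in the command line above, is accumulated across every interval in which it was collected.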

4.4.4 Reporting Hardware Performance Counter Activity for a Single Process (cputrack)

cputrack first shipped with Solaris 8. It is another tool that reports the number of performance counter events. However, cputrack has the advantages of collecting events only for the process of interest and reporting the total number of such events at the end of the run. This makes it very useful for situations in which the application starts, does something, and then exits. The script in Example 4.33 shows one way that cputrack might be invoked on a process.

Example 4.33 Script for Invoking cputrack on an Application

    $ cputrack -c pic0=Dispatch0_IC_miss,pic1=Dispatch0_mispred,sys \
    -c pic0=Rstall_storeQ,pic1=Re_DC_miss,sys \
    -c pic0=EC_rd_miss,pic1=Re_EC_miss,sys \
    -c pic0=Rstall_IU_use,pic1=Rstall_FP_use,sys \
    -c pic0=Cycle_cnt,pic1=Re_PC_miss,sys \
    -c pic0=Instr_cnt,pic1=DTLB_miss,sys \
    -c pic0=Cycle_cnt,pic1=Re_RAW_miss,sys \
    -o \

The script in Example 4.33 demonstrates how to use cputrack to rotate through the counters and capture data about the run of an application. The same caveat applies as for cpustat: rotating through counters may miss the events of interest.

An alternative way to invoke cputrack is to give it just a single pair of counters, as Example 4.34 shows.

Example 4.34 Example of cputrack on a Single Pair of Counters

    $ cputrack -c pic0=Cycle_cnt,pic1=Re_DC_miss testcode
       time lwp      event       pic0      pic1
      1.118   1       tick  663243149  14353162
      2.128   1       tick  899742583   9706444
      3.118   1       tick  885525398   7786122
      3.440   1       exit 2735203660  33964190

The output in Example 4.34 shows a short program that runs for about three seconds. cputrack has counted the number of processor cycles consumed by the application using counter 0, and the number of data-cache miss events using counter 1. On each tick line, both numbers are counts for the preceding interval of about one second; the line marked "exit" contains the total counts over the entire run.
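The relationship between the tick lines and the exit line in Example 4.34 can be checked with a little arithmetic. The following Python sketch is an illustration only, assuming the five-column layout shown in that example; the function name summarize_cputrack and the SAMPLE constant are hypothetical. It sums the tick lines, subtracts that sum from the exit totals to recover the counts accumulated after the last tick, and derives a cycles-per-event ratio for pic1.

```python
# Sample lines copied from Example 4.34.
SAMPLE = """\
   time lwp      event       pic0      pic1
  1.118   1       tick  663243149  14353162
  2.128   1       tick  899742583   9706444
  3.118   1       tick  885525398   7786122
  3.440   1       exit 2735203660  33964190
"""


def summarize_cputrack(text):
    """Summarize single-pair cputrack output (hypothetical helper).

    Assumes one LWP and the column layout of Example 4.34.
    """
    tick_sum = [0, 0]
    total = None
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and any line that is not a tick or exit record.
        if len(fields) != 5 or fields[2] not in ("tick", "exit"):
            continue
        pic0, pic1 = int(fields[3]), int(fields[4])
        if fields[2] == "tick":
            tick_sum[0] += pic0
            tick_sum[1] += pic1
        else:
            total = (pic0, pic1)
    if total is None:
        raise ValueError("no exit line found")
    return {
        "pic0_total": total[0],
        "pic1_total": total[1],
        # Counts accumulated between the last tick and process exit.
        "pic0_tail": total[0] - tick_sum[0],
        "pic1_tail": total[1] - tick_sum[1],
        # With pic0=Cycle_cnt this is cycles per pic1 event.
        "pic0_per_pic1": total[0] / total[1],
    }


if __name__ == "__main__":
    for key, value in summarize_cputrack(SAMPLE).items():
        print(key, value)
```

For the sample data, the three tick lines account for 2,448,511,130 cycles, so 286,692,530 cycles fall in the final 0.322 seconds before exit, and the whole run averages roughly 80 cycles per Re_DC_miss event.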
The columns in the output are as follows.

