OProfile is a low overhead, system-wide performance monitoring tool. It uses the performance monitoring hardware on the processor to retrieve information about the kernel and executables on the system, such as when memory is referenced, the number of L2 cache requests, and the number of hardware interrupts received. On a Red Hat Enterprise Linux system, the oprofile RPM package must be installed to use this tool.
Many processors include dedicated performance monitoring hardware. This hardware makes it possible to detect when certain events happen (such as the requested data not being in cache). The hardware normally takes the form of one or more counters that are incremented each time an event takes place. When the counter value, essentially rolls over, an interrupt is generated, making it possible to control the amount of detail (and therefore, overhead) produced by performance monitoring.
OProfile uses this hardware (or a timer-based substitute in cases where performance monitoring hardware is not present) to collect samples of performance-related data each time a counter generates an interrupt. These samples are periodically written out to disk; later, the data contained in these samples can then be used to generate reports on system-level and application-level performance.
OProfile is a useful tool, but be aware of some limitations when using it:
Use of shared libraries — Samples for code in shared libraries are not attributed to the particular application unless the --separate=library option is used.
Performance monitoring samples are inexact — When a performance monitoring register triggers a sample, the interrupt handling is not precise like a divide by zero exception. Due to the out-of-order execution of instructions by the processor, the sample may be recorded on a nearby instruction.
opreport does not associate samples for inline functions' properly — opreport uses a simple address range mechanism to determine which function an address is in. Inline function samples are not attributed to the inline function but rather to the function the inline function was inserted into.
OProfile accumulates data from multiple runs — OProfile is a system-wide profiler and expects processes to start up and shut down multiple times. Thus, samples from multiple runs accumulate. Use the command opcontrol --reset to clear out the samples from previous runs.
Non-CPU-limited performance problems — OProfile is oriented to finding problems with CPU-limited processes. OProfile does not identify processes that are asleep because they are waiting on locks or for some other event to occur (for example an I/O device to finish an operation).
Table 41-1 provides a brief overview of the tools provided with the oprofile package.
Command | Description |
---|---|
op_help | Displays available events for the system's processor along with a brief description of each. |
op_import | Converts sample database files from a foreign binary format to the native format for the system. Only use this option when analyzing a sample database from a different architecture. |
opannotate | Creates annotated source for an executable if the application was compiled with debugging symbols. Refer to Section 41.5.3 Using opannotate for details. |
opcontrol | Configures what data is collected. Refer to Section 41.2 Configuring OProfile for details. |
opreport | Retrieves profile data. Refer to Section 41.5.1 Using opreport for details. |
oprofiled | Runs as a daemon to periodically write sample data to disk. |
Table 41-1. OProfile Commands