Monitoring GPU Usage per Engine or Application

GPUs, like any other hardware, need to be sized properly. Unused capacity is wasted money. A GPU that constantly runs at maximum utilization, on the other hand, results in a poor user experience. Sizing requires information: in this case, data about GPU usage, ideally per GPU engine and per application. uberAgent delivers.

GPU Architecture

GPUs consist of thousands of cores that run the same instructions in parallel on multiple data. This architecture was initially designed for 3D rendering but has proven useful for any kind of application whose algorithms are highly parallelizable.

Combined, a GPU’s cores are often called the 3D engine. While 3D is typically the most important engine, GPUs also have specialized engines that add capabilities like video encoding or decoding. Without those, smartphones would not be able to record HD video or play it back in real time.

Monitoring GPU Usage per Engine

GPU monitoring presents some unique challenges. Different GPU models have different capabilities, which results in different types and numbers of engines.

uberAgent is prepared for that. It dynamically detects a GPU’s engines and determines each engine’s utilization individually. When displayed in a chart over time, this allows a viewer to grasp any engine’s resource usage immediately.
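To illustrate the idea of not hard-coding the set of engines, here is a minimal Python sketch. It is not uberAgent's implementation: the sample tuples, the engine names, and the functions `series_per_engine` and `average_per_engine` are all hypothetical. The point is only that the list of engines is derived from the collected data, so a GPU model with more or fewer engines needs no code change.

```python
from collections import defaultdict

# Hypothetical samples: (timestamp in seconds, engine name, utilization in percent).
# The engine names are illustrative; nothing below depends on a fixed list of them.
SAMPLES = [
    (0, "3D", 42.0),
    (0, "VideoDecode", 10.0),
    (30, "3D", 58.0),
    (30, "VideoDecode", 0.0),
    (30, "VideoEncode", 25.0),
]


def series_per_engine(samples):
    """Group samples into one time series per engine, sorted by timestamp."""
    series = defaultdict(list)
    for timestamp, engine, utilization in samples:
        series[engine].append((timestamp, utilization))
    return {engine: sorted(points) for engine, points in series.items()}


def average_per_engine(samples):
    """Average utilization per engine over all samples seen for that engine."""
    return {
        engine: sum(value for _, value in points) / len(points)
        for engine, points in series_per_engine(samples).items()
    }


if __name__ == "__main__":
    for engine, average in sorted(average_per_engine(SAMPLES).items()):
        print(f"{engine}: {average:.1f} %")
```

Each per-engine series returned by `series_per_engine` could be fed directly into a charting tool to get one line per engine over time.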

Monitoring GPU Usage per Application

A GPU’s resources are available to all processes that are running on a machine. Being able to discern which application generates what kind of load is crucial. In some cases, similar applications differ greatly with regard to efficiency and GPU resource footprint. This applies to browsers, for example. In other cases, applications you would expect to make good use of the GPU don’t.

By providing GPU utilization metrics per process, uberAgent helps IT understand and optimize GPUs for their application set.
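As a rough sketch of how per-process metrics can be turned into a per-application view, the following Python example sums utilization across all processes that share a name and ranks applications by their load on one engine. The process names, PIDs, values, and the functions `usage_per_application` and `top_consumers` are made up for illustration and do not reflect uberAgent's data model.

```python
from collections import defaultdict

# Hypothetical samples: (process name, PID, engine name, utilization in percent).
# An application such as a browser usually consists of several processes.
SAMPLES = [
    ("browser_a.exe", 1200, "3D", 12.0),
    ("browser_a.exe", 1244, "VideoDecode", 8.0),
    ("browser_a.exe", 1260, "3D", 3.0),
    ("browser_b.exe", 2300, "3D", 30.0),
    ("cad_app.exe", 3100, "3D", 0.0),
]


def usage_per_application(samples):
    """Sum utilization per application and engine across all of its processes."""
    usage = defaultdict(lambda: defaultdict(float))
    for name, _pid, engine, utilization in samples:
        usage[name][engine] += utilization
    return {name: dict(engines) for name, engines in usage.items()}


def top_consumers(samples, engine):
    """Rank applications by their summed utilization of one engine, highest first."""
    usage = usage_per_application(samples)
    ranked = [(name, engines.get(engine, 0.0)) for name, engines in usage.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, utilization in top_consumers(SAMPLES, "3D"):
        print(f"{name}: {utilization:.1f} % 3D")
```

In this made-up data set, the two browsers show very different 3D footprints, and the CAD application that one would expect to use the GPU heavily does not use it at all. Both are the kinds of findings described above.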

Monitoring GPU Usage per Machine

In addition to the resource consumption per GPU engine and per application, uberAgent also collects the GPU usage per machine. If a machine has more than one GPU, the numbers are collected individually per GPU. This is useful for gaining an understanding of the overall GPU utilization, both in terms of GPU compute and GPU memory resources.
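A sizing report built on such data could look like the following minimal Python sketch. The `GpuSample` record, its field names, the machine name, and the `summarize` function are assumptions made for this example, not uberAgent's actual fields. Samples are kept separate per machine and GPU, and both compute and memory are summarized.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuSample:
    """One hypothetical measurement for a single GPU in a single machine."""
    machine: str
    gpu_index: int
    compute_percent: float
    memory_used_mb: int
    memory_total_mb: int


SAMPLES = [
    GpuSample("vda-01", 0, 20.0, 2048, 8192),
    GpuSample("vda-01", 0, 40.0, 4096, 8192),
    GpuSample("vda-01", 1, 90.0, 7168, 8192),
]


def summarize(samples):
    """Return average compute and peak memory usage per (machine, GPU) pair."""
    grouped = defaultdict(list)
    for sample in samples:
        grouped[(sample.machine, sample.gpu_index)].append(sample)

    summary = {}
    for key, group in grouped.items():
        summary[key] = {
            "avg_compute_percent": sum(s.compute_percent for s in group) / len(group),
            "peak_memory_percent": max(
                100.0 * s.memory_used_mb / s.memory_total_mb for s in group
            ),
        }
    return summary


if __name__ == "__main__":
    for (machine, gpu_index), values in sorted(summarize(SAMPLES).items()):
        print(machine, gpu_index, values)
```

Keeping the GPU index in the grouping key matters for machines with multiple GPUs: averaging across GPUs would hide one that is nearly saturated while another sits idle.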
