Experience Score
The experience score is a single view that shows the current and past status of all devices, users, and applications monitored by uberAgent.
The score is evaluated on every full and half hour. It ranges from zero to ten; higher is better. Scores from zero to four are highlighted in red, scores from four to seven in yellow, and scores from seven to ten in green. The experience score is currently available for uberAgent UXM.
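The color bands can be sketched as a small lookup. This is only an illustration: the text does not say which band a score of exactly four or seven belongs to, so the sketch assumes the boundary values fall into the higher band. The function name `score_color` is hypothetical.

```python
def score_color(score: float) -> str:
    """Return the highlight color for an experience score between 0 and 10."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    # Assumption: a score of exactly 4 or 7 is placed in the higher band.
    if score < 4:
        return "red"
    if score < 7:
        return "yellow"
    return "green"
```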
Experience Score Dashboard
The experience score dashboard is the new entry point of the uberAgent UXM Splunk app. It calculates and visualizes experience scores for the entire estate, breaks the data down by category and component, and highlights the components where potential issues originate.
The dashboard also provides quick access to important KPIs like logon duration, application responsiveness, or application errors.
Overall Score
The first row shows the overall score and its trend compared to yesterday on the left, and the development of the score over time on the right.
Machine, User Session, and Application Scores
The overall score derives from three categories:
- Machine score: quality indicator for machine performance and health
- User session score: quality indicator for user session performance and health
- Application score: quality indicator for application performance and health
The charts show a trend indicator for the last day as well as a sparkline for the last seven days.
Score Components
Each category score is calculated from several components, which differ per category. For example, Stop errors is a component of the machine category only, while the Protocol latency component is only part of the user session category. There are also common components, like CPU or RAM.
The categories show that there is an issue in an environment, and the components reveal its cause or causes. In the screenshot above, the low machine score is caused by a large number of stop errors.
In that case, checking the Stop Errors (Blue Screen & Power Loss) dashboard in the Machine menu shows the problematic machines.
Analyzing Individual Machines, User Sessions, and Applications
The tables for individual machines, user sessions, and applications show the lowest 20 scores seen today. These items may need attention the most. Click on an item in the table to get a drilldown.
A new chart opens showing the components over time for that item. This makes it possible to see when the issue or issues started. To analyze the item in detail, click on the troubleshoot button, which redirects to a new page.
More Details
Scores alone might not be enough to get an overview. One may want to see real numbers, for example logon times, to get a better understanding of the performance.
Click on the plus sign next to the More details title to reveal charts with more details. Click an item of interest to get a drilldown.
Score Calculation
Component scores are evaluated on every full and half hour for the preceding 30 minutes. Each calculation covers a span of three minutes, resulting in 10 sections (30 minutes/3 minutes = 10). If a section is above a threshold, the counter for that threshold is incremented.
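The counting step can be sketched as follows. This is a minimal illustration, not uberAgent's implementation: the function name, the sample CPU values, and the strict comparison are assumptions, and the sketch also assumes that each threshold is counted independently, so a section above the high severity threshold is counted against the low severity threshold too.

```python
SECTIONS_PER_WINDOW = 30 // 3  # 30-minute window split into 3-minute sections


def count_sections_above(section_values, threshold, higher_is_better=False):
    """Count the sections that violate a single threshold.

    For metrics where higher is better (e.g. network availability),
    a section violates the threshold when it falls below it.
    """
    if higher_is_better:
        return sum(1 for value in section_values if value < threshold)
    return sum(1 for value in section_values if value > threshold)


# Hypothetical CPU usage (%) for the ten sections of one window.
cpu_percent = [75, 82, 85, 91, 95, 70, 60, 88, 79, 50]

low_counter = count_sections_above(cpu_percent, 80)   # low severity threshold
high_counter = count_sections_above(cpu_percent, 90)  # high severity threshold
```

With these sample values, five sections exceed 80 % and two exceed 90 %.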
Each score has two thresholds: one for low severity and one for high severity. Each threshold has a weight.
A score is calculated as follows: 10 – (Low severity threshold counter x low severity threshold weight + high severity threshold counter x high severity threshold weight)
Example 1: three sections are above the low severity threshold, which has a weight of 0.5. The score would be: 10 – (3 x 0.5) = 8.5
Example 2: three sections above the low severity threshold (weight = 0.5) and two sections above the high severity threshold (weight = 1). The score would be: 10 – (3 x 0.5 + 2 x 1) = 6.5
Note the following: the higher the weight, the lower the score.
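The formula and the two examples above can be reproduced with a short sketch. The function name `component_score` is hypothetical, and clamping the result at zero is an assumption made here because scores are described as ranging from zero to ten; the text does not state how larger deductions are handled.

```python
def component_score(low_count, low_weight, high_count=0, high_weight=0.0):
    """Compute 10 - (low_count * low_weight + high_count * high_weight)."""
    deduction = low_count * low_weight + high_count * high_weight
    # Assumption: the score never drops below zero.
    return max(0.0, 10 - deduction)


example_1 = component_score(3, 0.5)        # three sections above the low threshold
example_2 = component_score(3, 0.5, 2, 1)  # plus two sections above the high threshold
```

Here `example_1` evaluates to 8.5 and `example_2` to 6.5, matching the examples above.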
Below is a list of default thresholds and weights. To modify the defaults, see Modifying the Score Calculation.
Machine
Threshold | Setting | Default value | Unit | Default weight |
---|---|---|---|---|
CPU usage. Low severity. | ThresholdMachineCPUPercentLowerBound | 80 | % | 0.5 |
CPU usage. High severity. | ThresholdMachineCPUPercentHigherBound | 90 | % | 1 |
RAM usage. Low severity. | ThresholdMachineRAMPercentLowerBound | 80 | % | 0.5 |
RAM usage. High severity. | ThresholdMachineRAMPercentHigherBound | 90 | % | 1 |
Disk IO usage. Low severity. | ThresholdMachineIOPercentLowerBound | 80 | % | 0.5 |
Disk IO usage. High severity. | ThresholdMachineIOPercentHigherBound | 90 | % | 1 |
Stop errors. Low severity. | ThresholdStopErrorCountLowerBound | 1 | Count | 0.7 |
Stop errors. High severity. | ThresholdStopErrorCountHigherBound | 2 | Count | 1 |
Disk usage. Low severity. | ThresholdMachineDiskUsagePercentLowerBound | 80 | % | 0.2 |
Disk usage. High severity. | ThresholdMachineDiskUsagePercentHigherBound | 90 | % | 0.5 |
Network availability. Low severity. Note: higher is better | ThresholdMachineNetworkAvailabilityPercentLowerBound | 95 | % | 0.5 |
Network availability. High severity. Note: higher is better | ThresholdMachineNetworkAvailabilityPercentHigherBound | 90 | % | 0.2 |
User session
Threshold | Setting | Default value | Unit | Default weight |
---|---|---|---|---|
CPU usage. Low severity. | ThresholdSessionCPUPercentLowerBound | 80 | % | 0.5 |
CPU usage. High severity. | ThresholdSessionCPUPercentHigherBound | 90 | % | 1 |
RAM usage. Low severity. | ThresholdSessionRAMPercentLowerBound | 80 | % | 0.5 |
RAM usage. High severity. | ThresholdSessionRAMPercentHigherBound | 90 | % | 1 |
Disk IO latency. Low severity. | ThresholdIOLatencyLowerBound | 20 | ms | 0.5 |
Disk IO latency. High severity. | ThresholdIOLatencyHigherBound | 30 | ms | 0.7 |
Logon duration. Low severity. | ThresholdLogonDurationLowerBound | 30 | s | 0.2 |
Logon duration. High severity. | ThresholdLogonDurationHigherBound | 60 | s | 0.4 |
Protocol latency. Low severity. | ThresholdSessionRpLatencyMsLowerBound | 100 | ms | 0.2 |
Protocol latency. High severity. | ThresholdSessionRpLatencyMsHigherBound | 200 | ms | 0.5 |
Application
Threshold | Setting | Default value | Unit | Default weight |
---|---|---|---|---|
CPU usage. Low severity. | ThresholdAppCPUPercentLowerBound | 80 | % | 0.5 |
CPU usage. High severity. | ThresholdAppCPUPercentHigherBound | 90 | % | 1 |
RAM usage. Low severity. | ThresholdAppRAMMBLowerBound | 1024 | MB | 0.1 |
RAM usage. High severity. | ThresholdAppRAMMBHigherBound | 2048 | MB | 0.3 |
Disk IO. Low severity. | ThresholdAppIOCountLowerBound | 200 | Count | 0.1 |
Disk IO. High severity. | ThresholdAppIOCountHigherBound | 400 | Count | 0.3 |
Network availability. Low severity. Note: higher is better | ThresholdAppNetworkAvailabilityPercentLowerBound | 95 | % | 0.5 |
Network availability. High severity. Note: higher is better | ThresholdAppNetworkAvailabilityPercentHigherBound | 90 | % | 0.2 |
Network latency. Low severity. | ThresholdAppSendLatencyMsLowerBound | 100 | ms | 0.2 |
Network latency. High severity. | ThresholdAppSendLatencyMsHigherBound | 300 | ms | 0.5 |
Application UI delay. Low severity. | ThresholdAppUIDelaySLowerBound | 5 | s | 0.2 |
Application UI delay. High severity. | ThresholdAppUIDelaySHigherBound | 10 | s | 0.5 |
Application errors. Low severity. | ThresholdApplicationErrorCountLowerBound | 1 | Count | 0.5 |
Application errors. High severity. | ThresholdApplicationErrorCountHigherBound | 2 | Count | 1 |
Modifying the Score Calculation
The default score calculations are based on experience in the field, but may not be applicable to your environment. For that reason, the calculations can be changed.
Before Modifying
Before making changes, note the following:
- The lowest weight possible is 0
- The highest weight possible is 1
- The sum of all weights doesn’t need to be 1. Each component is calculated separately.
- All components together form a total machine/user session/application score. The total score is always equal to the lowest component score.
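The last point can be illustrated with a minimal sketch: the total score of a category is the minimum of its component scores. The function names and the component values below are hypothetical and only serve as an example.

```python
def total_score(component_scores):
    """Return the total score, which equals the lowest component score."""
    if not component_scores:
        raise ValueError("at least one component score is required")
    return min(component_scores.values())


def weakest_component(component_scores):
    """Return the name of the component that determines the total score."""
    return min(component_scores, key=component_scores.get)


# Hypothetical component scores for one machine.
machine_components = {"CPU": 8.5, "RAM": 10.0, "Stop errors": 6.5}

machine_total = total_score(machine_components)
machine_cause = weakest_component(machine_components)
```

In this example the total machine score is 6.5, determined by the Stop errors component.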
Modifying
To modify the score calculation, change the following three input lookup files in $SPLUNK_HOME/etc/apps/uberAgent/lookups. See Score Calculation for calculations and settings.
- Machine: score_machine_configuration.csv
- User session: score_session_configuration.csv
- Application: score_application_configuration.csv
After Modifying
- Distribute the changed input lookup files to all search heads
- It is best to delete all previous scores as they cannot be compared to the new ones. See Deleting Scores for instructions.
New versions of uberAgent may introduce new scores or change the calculations of existing scores. As a result, your score modifications will be overwritten when updating uberAgent.
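Because an update replaces modified lookup files, it can help to keep a copy of them before updating. The following is a minimal sketch of such a backup, not a tool shipped with uberAgent: the function name `backup_lookups` and the backup location are assumptions, and the lookup directory must be passed in with `$SPLUNK_HOME` already resolved.

```python
import shutil
from pathlib import Path

# The three score configuration lookups named in the Modifying section.
LOOKUP_FILES = (
    "score_machine_configuration.csv",
    "score_session_configuration.csv",
    "score_application_configuration.csv",
)


def backup_lookups(lookup_dir, backup_dir):
    """Copy the existing score configuration lookups to backup_dir.

    Returns the names of the files that were copied.
    """
    source = Path(lookup_dir)
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in LOOKUP_FILES:
        path = source / name
        if path.is_file():
            shutil.copy2(path, target / name)  # copy2 also preserves timestamps
            copied.append(name)
    return copied
```

After the update, the saved files can be compared with the new defaults and the modifications reapplied where they still make sense.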
Score Storage
Scores are stored in two different Splunk KV stores per category: one for the current date and a historic one for the last 30 days.
The scores for the current date get aggregated at midnight (average per day) and then stored in the historic KV store.
If you want to delete the KV stores, see Deleting Scores.
Deleting Scores
Scores are stored in Splunk KV stores and can be deleted via Splunk searches.
Machine
Run the following two searches one after another.
| outputlookup lookup_score_per_machine
| outputlookup lookup_score_historic_per_machine
User session
Run the following two searches one after another.
| outputlookup lookup_score_per_session
| outputlookup lookup_score_historic_per_session
Application
Run the following two searches one after another.
| outputlookup lookup_score_per_application
| outputlookup lookup_score_historic_per_application