Predictive Analytics & Forecasting
Predictive Analytics
Predictive Analytics is the Avantra Enterprise Edition approach to “spiky”, “bursty”, or “flapping” monitoring situations in checks based on time series data. It combines machine learning algorithms that predict future trends with classic threshold-based monitoring.
This function is only available with the Avantra Enterprise Edition.
There are situations where thresholds are too static, or become too strict. The intention of thresholds is to define a long-term separator between good and not so good. Usually a situation only becomes critical if a threshold is exceeded over a longer period of time. Whenever resource usage is spiky, or there are usage bursts, thresholds can be exceeded for a short period of time, although this does not constitute a critical situation.
From a technical perspective it is tricky to decide: is a certain situation only a usage spike, or is it the beginning of a longer-term problem?
Predictive Analytics tries to resolve this dilemma. The Avantra agent applies ML-based algorithms to understand usage patterns and to predict, at any given point in time, how the usage will evolve in the near future. Whenever the agent evaluates a check, these predictions are used to determine a trend, and this trend is taken into account in addition to the defined thresholds.
If a threshold is exceeded and the predictive analytics engine projects that the situation will clear again within a short period of time, the check status is not changed. The check status is changed immediately only if the situation is projected to get worse.
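The rule described above can be illustrated with a minimal Python sketch. It is not the Avantra implementation: the function name next_status, the status strings, and the representation of the prediction as a list of projected usage values are all assumptions made for illustration. The sketch also assumes that "projected to get worse" means the projection never drops back to the threshold within the prediction horizon.

```python
from typing import Sequence


def next_status(
    previous_status: str,
    current_usage: float,
    predicted_usage: Sequence[float],
    threshold: float,
    exceeded_status: str = "WARNING",
) -> str:
    """Combine a static threshold with a short-term projection.

    predicted_usage holds hypothetical projected values for the near
    future, oldest first. All names here are illustrative assumptions.
    """
    if current_usage <= threshold:
        # Threshold not exceeded: nothing to report.
        return "OK"

    # Threshold exceeded: does the projection clear again soon?
    clears_soon = any(value <= threshold for value in predicted_usage)
    if clears_soon:
        # Treated as a spike: leave the check status unchanged.
        return previous_status

    # No projected recovery (or no projection at all): change status now.
    return exceeded_status
```

With a threshold of 90, a current value of 95, and a projection of 92, 85, 70, the status stays as it was, because the projection falls back below the threshold. With a projection of 96, 97, 99 the status changes immediately.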
Starting with Avantra 21.11, the Predictive Analytics Engine is enabled in the CPULOAD and the HDB_CPULoad checks. It will successively be rolled out to other checks based on time series data.
Forecasting
Avantra includes a feature called forecasting, which is implemented for many Avantra checks, in fact wherever it is possible. Forecasting requires that the check monitors a resource, i.e. something for which the usage can be calculated as a percentage of the maximum available resource space.
An example of such a resource-related check is the FILESYSTEMS check. A file system has a size and an amount of space that is used, so the usage in percent can be calculated. If the used space of a file system grows fast but is still below the usage thresholds, the file system may fill up very quickly, and it may be hard to react in time once the usage-threshold check finally reports a Warning status.
Forecasting has the job of informing you of heavy resource consumption before the resource is full. It uses previous usage values to extrapolate into the future and estimates how long it will take, assuming the usage continues to grow uniformly, until the available resource space is exhausted. Similar to the two Warning and Critical usage thresholds, each check has two thresholds for forecasting: a Warning (…ExWarn) threshold with a default value of 12 hours, and a Critical (…ExCrit) threshold with a default of one hour. To extrapolate, forecasting always uses the usage values from a look-back period that is half as long as the exhaustion threshold, so by default it looks 30 minutes back for the Critical threshold.
To prevent false alerts, a resource needs to have been growing for a certain time before forecasting reports it. Additionally, forecasting is inactive while the usage of a resource is below 50 %.
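The forecasting rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Avantra algorithm: the function names hours_until_full and forecast_status are hypothetical, samples are assumed to be (timestamp in seconds, usage in percent) pairs sorted by time, and a least-squares line stands in for "uniform growth". The requirement that a resource has been growing "for a certain time" is not specified in detail, so the sketch only requires at least two samples in the look-back window.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, usage in percent)


def hours_until_full(
    samples: Sequence[Sample], now: float, threshold_hours: float
) -> Optional[float]:
    """Estimate hours until 100 % usage, or None if no forecast applies."""
    # Look back half of the exhaustion threshold (e.g. 30 min for 1 h).
    lookback = threshold_hours * 3600 / 2
    window = [(t, u) for t, u in samples if now - lookback <= t <= now]
    if len(window) < 2:
        return None
    latest_usage = window[-1][1]
    if latest_usage < 50.0:
        # Forecasting is inactive below 50 % usage.
        return None

    # Least-squares slope in percent per second (assumed growth model).
    n = len(window)
    mean_t = sum(t for t, _ in window) / n
    mean_u = sum(u for _, u in window) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in window)
    if var_t == 0:
        return None
    slope = sum((t - mean_t) * (u - mean_u) for t, u in window) / var_t
    if slope <= 0:
        # Usage is flat or shrinking: the resource will not be exhausted.
        return None
    return (100.0 - latest_usage) / slope / 3600


def forecast_status(
    samples: Sequence[Sample],
    now: float,
    warn_hours: float = 12.0,  # default Warning exhaustion threshold
    crit_hours: float = 1.0,   # default Critical exhaustion threshold
) -> str:
    """Map the estimated time to exhaustion onto a check status."""
    crit = hours_until_full(samples, now, crit_hours)
    if crit is not None and crit <= crit_hours:
        return "CRITICAL"
    warn = hours_until_full(samples, now, warn_hours)
    if warn is not None and warn <= warn_hours:
        return "WARNING"
    return "OK"
```

For example, a file system that grows from 70 % to 82 % over the last 30 minutes gains 24 percentage points per hour, so the remaining 18 % would be used up in about 45 minutes, which is below the default Critical threshold of one hour.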
Forecasting is available for the following checks: ADA_DATAAREA, ADA_LOGAREA, DB2_FSLOGARCH, DB2_LOGUSAGE, DB2_TABLESPACES, FILESYSTEMS, HDB_Disks, MSS_DBUSAGE, MSS_LOGUSAGE, ORA_FSLOGARCH, ORA_TABLESPACES, SYB_DataSpaces, and SYB_LogSpaces.
Forecasting is also available with Performance data for those resources where it is useful. Whenever you extend the time range of a Performance chart into the future, the data is extrapolated accordingly. It is also available in the Service Level Reports unless you opt out during generation.