Reference: Server Checks
AGENTALIVE
Verifies Avantra Agent alive status and detects cluster switches.
AGENTALIVE is one of the most important Avantra checks. It differs from almost all other checks in that it is executed by the Avantra Master rather than by the Avantra Agent.
AGENTALIVE serves two purposes. First, it self-monitors the Avantra Agent: the Avantra Master generates it if no message arrives from the Avantra Agent within a certain period of time.
Second, it is used to detect cluster switches.
For a Physical Server, the AGENTALIVE check can only have two statuses: Ok or Critical. If the AGENTALIVE turns to Critical, all other dependent checks will turn to Unknown. A Critical AGENTALIVE check result can have several reasons:
- The server is down.
- The server is up but the Avantra Agent is not running.
- The server is up and the Avantra Agent is running but cannot connect to the Avantra Master due to network problems.
For Virtual Cluster Servers, the AGENTALIVE check can also return a Warning status:
- The check is in Ok status if the Virtual Cluster Server is active on one of the configured physical nodes.
- The check is in Warning status if the Virtual Cluster Server has switched from one physical node to the other. How long the check result remains in Warning status can be configured using the ClusterSwitchMsgDisplayDuration Monitoring Parameter.
- The check is in Critical status if the Virtual Cluster Server is not detected on any of the physical nodes, or for any of the reasons given for Physical Servers.
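The status rules for Virtual Cluster Servers can be sketched as a small decision function. This is only an illustration of the rules stated above, not Avantra's implementation: the function name, its arguments, and the way the time since the last switch is tracked are assumptions, and the unit of ClusterSwitchMsgDisplayDuration is left open (both time values simply have to use the same unit).

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    OK = "Ok"
    WARNING = "Warning"
    CRITICAL = "Critical"


def cluster_agentalive_status(
    active_node: Optional[str],
    time_since_switch: Optional[float],
    display_duration: float,
) -> Status:
    """Hypothetical helper mirroring the AGENTALIVE rules for a Virtual Cluster Server.

    active_node: physical node the cluster server was detected on, or None.
    time_since_switch: time since the last detected switch, or None if no
        switch has been seen.
    display_duration: stands in for ClusterSwitchMsgDisplayDuration
        (same unit as time_since_switch; the real unit is not assumed here).
    """
    # Not detected on any physical node: Critical.
    if active_node is None:
        return Status.CRITICAL
    # A recent switch keeps the check in Warning for the configured duration.
    if time_since_switch is not None and time_since_switch < display_duration:
        return Status.WARNING
    # Active on a configured node and no recent switch: Ok.
    return Status.OK
```

For example, `cluster_agentalive_status("node-b", 10.0, 60.0)` yields `Status.WARNING`, while the same call with `time_since_switch=120.0` yields `Status.OK`.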
Managed Object | |
Check Cycle | n/a, Check is executed by Avantra Master |
Depends on | |
Monitoring Parameters |
AgentConnect
Monitors the connection from Avantra Master to Avantra Agent.
This Avantra check is executed on the Avantra Master. It monitors whether an Avantra Agent can be reached from the Avantra Master. To decide whether the connection was successful, the Agent Configuration distribution is analyzed. If there is no connection to the agent for more than AgentConnectTimeWarn minutes (default: 10 minutes), the check status changes to Warning. If the connection cannot be established for more than AgentConnectTimeCrit minutes (default: 60 minutes), the check status changes to Critical.
If the AgentConnect check is in status Warning or Critical, no configuration could be sent from the Avantra Master to the Avantra Agent. When the check returns to status Ok, configuration can be distributed to the Avantra Agent again.
Virtual Cluster Servers do not have an AgentConnect check.
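The threshold behaviour of AgentConnect can be sketched as follows. The defaults of 10 and 60 minutes come from the description above; the function name and the idea of passing in the minutes elapsed since the last successful connection are assumptions made for illustration only.

```python
def agent_connect_status(
    minutes_without_connection: float,
    time_warn: float = 10,   # AgentConnectTimeWarn default (minutes)
    time_crit: float = 60,   # AgentConnectTimeCrit default (minutes)
) -> str:
    """Hypothetical helper: map time without a connection to a check status."""
    # "More than" is read as a strict comparison.
    if minutes_without_connection > time_crit:
        return "Critical"
    if minutes_without_connection > time_warn:
        return "Warning"
    return "Ok"
```

With the defaults, 11 minutes without a connection gives `"Warning"` and 61 minutes gives `"Critical"`.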
Managed Object | |
Check Cycle | n/a, Check is executed by Avantra Master |
Depends on | |
Monitoring Parameters | AgentConnectRoute, AgentConnectTimeCrit, AgentConnectTimeWarn, and AgentProtocolUseSSL |
CPULOAD
Verifies CPU usage of Physical Servers.
The average usage of all detected CPUs is computed over a period of time.
This Check is also available for Virtual Cluster Servers but is disabled by default.
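As a rough illustration of averaging CPU usage across all CPUs over a period of time, the sketch below keeps a fixed-size window of samples. It is not how the Avantra Agent collects data: the class, the sampling scheme, and the use of a sample count in place of CPULoadAverageTime are all assumptions.

```python
from collections import deque
from typing import Deque, Sequence


class CpuLoadWindow:
    """Hypothetical sliding window over per-sample CPU usage averages."""

    def __init__(self, window_size: int) -> None:
        # window_size stands in for the averaging period (CPULoadAverageTime),
        # expressed here as a number of samples.
        self.samples: Deque[float] = deque(maxlen=window_size)

    def add_sample(self, per_cpu_usage: Sequence[float]) -> None:
        # First average across all detected CPUs for this sample.
        self.samples.append(sum(per_cpu_usage) / len(per_cpu_usage))

    def average(self) -> float:
        # Then average the per-sample values over the window.
        return sum(self.samples) / len(self.samples)
```

The resulting average would then be compared against CPULoadWarn and CPULoadCrit.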
Managed Object | |
Check Cycle | Basic |
Depends on | |
Monitoring Parameters | CheckCycleTime, CPULoadAverageTime, CPULoadCrit, CPULoadWarn |
FILESYSTEMS
Verifies the usage of local (or remote) filesystems.
Every filesystem is checked for its usage (and inode usage on Unix-like operating systems). In addition, a forecast is computed of when the filesystem limit will be exceeded if current usage rates continue. Thresholds can be defined per filesystem for both usage and forecast.
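One simple way to picture such a forecast is a linear extrapolation from two usage measurements. The source does not say which forecasting method Avantra uses, so the linear model, the function name, and the parameters below are assumptions.

```python
from typing import Optional


def hours_until_full(
    used_earlier: float,
    used_now: float,
    hours_between: float,
    capacity: float,
) -> Optional[float]:
    """Hypothetical linear forecast of when a filesystem fills up.

    All sizes share one unit (for example GB). Returns None when usage is
    flat or shrinking, because the limit is then never reached at the
    current rate.
    """
    growth_per_hour = (used_now - used_earlier) / hours_between
    if growth_per_hour <= 0:
        return None
    return (capacity - used_now) / growth_per_hour
```

For a 100 GB filesystem that grew from 40 GB to 50 GB in 10 hours, the sketch forecasts 50 more hours until the limit is reached.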
By default, only local file systems are considered. Use Monitoring Parameter FSMonitorNetwork to enable monitoring of remote file systems.
On Unix, file systems of the following types are monitored as local file systems, as configured by Monitoring Parameter FSTypeLocal:
advfs, aix, ext2, ext3, ext4, gpfs, hfs, jfs, jfs2, minix, ntfs, ocfs2, reiser4, reiserfs, ufs, vxfs, xfs, xiafs, zfs
On Unix, the filesystems of the following types are monitored if FSMonitorNetwork is activated:
afs, dfs, nfs/nfs2/nfsv2, nfs3/nfsv3, smbfs
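The type-based selection on Unix can be sketched as a set lookup. The two sets reproduce the lists above; the function and its `monitor_network` flag (standing in for FSMonitorNetwork) are illustrative assumptions rather than the agent's actual code.

```python
# Local types, as listed for FSTypeLocal above.
LOCAL_FS_TYPES = {
    "advfs", "aix", "ext2", "ext3", "ext4", "gpfs", "hfs", "jfs", "jfs2",
    "minix", "ntfs", "ocfs2", "reiser4", "reiserfs", "ufs", "vxfs", "xfs",
    "xiafs", "zfs",
}

# Network types, considered only when FSMonitorNetwork is activated.
NETWORK_FS_TYPES = {
    "afs", "dfs", "nfs", "nfs2", "nfsv2", "nfs3", "nfsv3", "smbfs",
}


def is_monitored(fs_type: str, monitor_network: bool = False) -> bool:
    """Hypothetical helper: decide whether a filesystem type is monitored."""
    if fs_type in LOCAL_FS_TYPES:
        return True
    return monitor_network and fs_type in NETWORK_FS_TYPES
```

With the default settings, `is_monitored("nfs3")` is `False`; it becomes `True` once `monitor_network=True` is passed.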
FULLCHECK
Usually a server does not have a FULLCHECK (i.e. Daily Check). The only exception is when a CUSTOM_CHECK is defined for the server. In that case, the status and description of the check(s) are provided.
Managed Object | Server |
Check Cycle | DAILY |
Depends on | AGENTALIVE |
Monitoring Parameters | None |
MEMORY
Verifies physical memory usage of Physical Servers.
The MEMORY check verifies physical memory usage of the operating system against the configurable thresholds MemoryUsageWarn and MemoryUsageCrit.
If the percentage of memory used is above MemoryUsageCrit, the check status will be Critical. If the percentage of memory used is above MemoryUsageWarn but still below MemoryUsageCrit, the check status will be Warning.
If the monitoring parameter thresholds are configured as remaining free space in KB, MB, or GB, the check instead tests whether the amount of free memory is below the corresponding threshold.
The MEMORY check is accompanied by the PAGINGSPACE check, which covers virtual memory, i.e. swap space, as well. If MEMORY returns a Warning or Critical result but PAGINGSPACE is still Ok, the system will probably run slower but should still work correctly. If, however, PAGINGSPACE also returns Warning or even Critical, the server is in a more serious condition and you may face out-of-memory errors.
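The percentage-based threshold logic of the MEMORY check can be sketched as below, reading it as Warning above MemoryUsageWarn and Critical above MemoryUsageCrit. The function name is hypothetical, and the threshold values in the usage example are placeholders, since the source does not state the defaults.

```python
def memory_status(used_percent: float, usage_warn: float, usage_crit: float) -> str:
    """Hypothetical helper: compare memory usage (percent) with two thresholds.

    usage_warn and usage_crit stand in for MemoryUsageWarn and
    MemoryUsageCrit when both are configured as percentages.
    """
    if used_percent > usage_crit:
        return "Critical"
    if used_percent > usage_warn:
        return "Warning"
    return "Ok"
```

With placeholder thresholds of 80 and 90 percent, a usage of 85 percent gives `"Warning"` and 95 percent gives `"Critical"`.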
Managed Object | |
Check Cycle | Basic |
Depends on | |
Monitoring Parameters |
PAGINGSPACE
Verifies paging space usage of Physical Servers.
Every detected paging space (or swap space) is checked for usage. Thresholds may be defined for the total paging space usage. Depending on the operating system, paging spaces may be devices, files (or file systems), or in memory. For operating systems that use an early paging strategy, reserved paging space is considered in use.
There is also an option to check the ratio of paging space to physical memory by defining a value for PagingSpacePhysMemRatioWarn.
On Microsoft Windows operating systems, this check reports the paging space usage as reported by Windows perfmon.exe. This value may differ from PF Usage as displayed by the Windows Task Manager.
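To illustrate what a total paging space usage across several paging spaces might look like, the sketch below sums used and configured sizes before computing a percentage. How the agent actually aggregates the values is not described in the source, so the function and its input format are assumptions.

```python
from typing import Iterable, Tuple


def total_paging_usage_percent(spaces: Iterable[Tuple[float, float]]) -> float:
    """Hypothetical helper: overall usage across all paging spaces.

    spaces: pairs of (used, size) in one common unit, one pair per
    detected paging space.
    """
    pairs = list(spaces)
    total_used = sum(used for used, _ in pairs)
    total_size = sum(size for _, size in pairs)
    # Aggregate first, then convert to a percentage of the total size.
    return 100 * total_used / total_size
```

The result would then be compared against PagingSpaceWarn and PagingSpaceCrit.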
Managed Object | |
Check Cycle | Basic |
Depends on | |
Monitoring Parameters | PagingSpaceWarn, PagingSpaceCrit, CheckCycleTime, PagingSpacePhysMemRatioWarn, and TimeOutOsCalls (Solaris only) |
SelfCheck
Performs Avantra Agent self checking.
In addition to checks performed by the Avantra Agent, there are many other tasks that run in the background, for example, change or performance data collection. If something goes wrong while doing these background tasks, SelfCheck will let you know.
Managed Object | |
Check Cycle | Basic |
Depends on | |
Monitoring Parameters | n/a |
TimeOffset
Verifies server time accuracy.
The check compares the UTC server time with the Avantra Server UTC time. If the time difference exceeds TimeOffsetWarn (default is 30 seconds), the status changes to Warning, and if it exceeds TimeOffsetCrit (default 60 seconds), it changes to Critical.
For this check to provide correct results, it is crucial to keep the Avantra Server host clock properly synchronized.
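The comparison performed by TimeOffset can be sketched as follows. The 30 and 60 second defaults come from the description above; the function name and the way both timestamps are obtained are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def time_offset_status(
    server_time_utc: datetime,
    avantra_time_utc: datetime,
    warn_seconds: float = 30.0,   # TimeOffsetWarn default
    crit_seconds: float = 60.0,   # TimeOffsetCrit default
) -> str:
    """Hypothetical helper: classify the clock difference between two hosts."""
    # The direction of the drift does not matter, only its size.
    offset = abs((server_time_utc - avantra_time_utc).total_seconds())
    if offset > crit_seconds:
        return "Critical"
    if offset > warn_seconds:
        return "Warning"
    return "Ok"
```

A server clock that is 45 seconds ahead of the Avantra Server clock gives `"Warning"` with the defaults; one that is 90 seconds behind gives `"Critical"`.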
Managed Object | |
Check Cycle | 60 Minutes |
Depends on | |
Monitoring Parameters |