The ULRM supports monitoring of the following resource groups:
File systems
The ULRM raises an alert if a file system is not mounted or if the space used by a file system exceeds a threshold. It automatically clears the alert once the file system is mounted (if the file system was not previously mounted) or when the space used falls below the threshold. The administrator specifies the file systems to monitor and usage thresholds for each file system.
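The raise and clear behavior described above can be sketched as follows. This is a minimal illustration, not the ULRM implementation: the function names `probe`, `evaluate`, and `transition` are hypothetical, and the sketch assumes that usage is expressed as a percentage of total space.

```python
import os
import shutil


def probe(path):
    """Return (mounted, used_percent) for a mount point (hypothetical helper)."""
    if not os.path.ismount(path):
        return False, 0.0
    usage = shutil.disk_usage(path)
    # Guard against pseudo file systems that report a total size of zero.
    used_pct = 100.0 * usage.used / usage.total if usage.total else 0.0
    return True, used_pct


def evaluate(mounted, used_pct, threshold_pct):
    """Return the alert reason, or None when no alert should be active."""
    if not mounted:
        return "not mounted"
    if used_pct > threshold_pct:
        return "usage above threshold"
    return None


def transition(previous_reason, current_reason):
    """Return 'raise', 'clear', or None when the alert state is unchanged."""
    if current_reason and not previous_reason:
        return "raise"
    if previous_reason and not current_reason:
        return "clear"
    return None
```

A monitor loop would call `probe` once per interval for each configured file system, pass the result to `evaluate`, and act on the value returned by `transition`.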
Critical processes
The ULRM raises an alert when a critical process is in an unexpected state (‘up’ or ‘down’). It automatically clears the alert when the process returns to the expected state. The administrator specifies critical processes to monitor and the unexpected state of each process.
Long-running processes
A long-running process is a process that has accumulated more than 60 seconds of CPU time since the application started. The ULRM raises an alert when the CPU time that a long-running process accumulates during a monitoring interval exceeds a defined threshold. It automatically clears the alert when the CPU time accumulated during a monitoring interval falls below the threshold. The administrator specifies the maximum amount of CPU time that processes are expected to accumulate during a monitoring interval and defines the processes to exclude from this monitoring.
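As an illustration of the rule above, the following sketch compares the cumulative CPU time of one process at two consecutive polls. The function name `check_process` and its parameters are hypothetical; only the 60-second definition comes from the text.

```python
# CPU seconds a process must have accumulated to count as long-running
# (taken from the definition in the text).
LONG_RUNNING_CPU_SECONDS = 60.0


def check_process(total_cpu_now, total_cpu_prev, interval_threshold, excluded=False):
    """Return True when the process should be in alert for this interval.

    total_cpu_now / total_cpu_prev: cumulative CPU seconds at the current
    and previous poll. interval_threshold: CPU seconds allowed per interval.
    """
    if excluded:
        return False  # the administrator excluded this process
    if total_cpu_now <= LONG_RUNNING_CPU_SECONDS:
        return False  # not yet a long-running process
    cpu_in_interval = total_cpu_now - total_cpu_prev
    return cpu_in_interval > interval_threshold
```

For example, a process that moved from 70 to 100 cumulative CPU seconds in one interval exceeds a 20-second threshold, while one that moved from 95 to 100 does not.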
CPU
The ULRM raises an alert if CPU usage remains above the threshold value for a period of time. The time period prevents alerts for momentary spikes in CPU usage. The ULRM automatically clears the alert when CPU usage falls below the threshold.
The administrator specifies the CPU resource to monitor, either CPU usage or system CPU usage, and a usage threshold as a percent busy for the resource.
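The sustained-threshold behavior can be sketched as a small state machine. This is an assumption about how such a check might be structured, not a description of the ULRM internals; the class name `SustainedThreshold` and the `hold_seconds` parameter are hypothetical.

```python
class SustainedThreshold:
    """Raise an alert only after usage stays above a threshold for a while."""

    def __init__(self, threshold_pct, hold_seconds):
        self.threshold_pct = threshold_pct
        self.hold_seconds = hold_seconds
        self.over_since = None      # time of the first sample above threshold
        self.alert_active = False

    def sample(self, timestamp, usage_pct):
        """Record one sample; return 'raise', 'clear', or None."""
        if usage_pct > self.threshold_pct:
            if self.over_since is None:
                self.over_since = timestamp
            held = timestamp - self.over_since
            if not self.alert_active and held >= self.hold_seconds:
                self.alert_active = True
                return "raise"
        else:
            # Usage is at or below the threshold: reset the timer and
            # clear any active alert.
            self.over_since = None
            if self.alert_active:
                self.alert_active = False
                return "clear"
        return None
```

With a threshold of 80 percent and a 60-second hold, a single sample at 95 percent followed by one at 50 percent produces no alert, which is the spike suppression the text describes.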
Memory
The ULRM raises an alert if memory usage remains above the threshold value for a period of time. The time period prevents alerts for momentary spikes in memory usage. The ULRM automatically clears the alert when memory usage falls below the threshold. The administrator defines the memory usage threshold as a percent in use.
Log files
The ULRM monitors messages that are written to a log file and raises an alert if a particular phrase appears. The administrator specifies the log files to monitor and, for each log file, the phrases to search for. A phrase can be fixed text or a regular expression.
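The difference between a fixed-text phrase and a regular expression can be shown with a short sketch. The helper names `compile_phrase` and `scan_lines` are hypothetical, and Python's `re` syntax is used here only for illustration; the regular expression dialect that the ULRM accepts may differ.

```python
import re


def compile_phrase(phrase, is_regex=False):
    """Compile a phrase; fixed text is escaped so it matches literally."""
    return re.compile(phrase if is_regex else re.escape(phrase))


def scan_lines(lines, patterns):
    """Yield (line_number, line) for each line that matches any pattern."""
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in patterns):
            yield number, line.rstrip("\n")
```

Escaping fixed text matters: the fixed phrase `a.b` matches only the literal characters `a.b`, whereas the same string treated as a regular expression would also match `axb`.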
Custom actions (resources managed by user-supplied scripts or programs)
The ULRM provides custom actions as a way to extend its capabilities with scripts or programs. In the resource monitor policy, an administrator defines the name of the script or program, the arguments to pass to it, and whether to run it cyclically.
Any scripting or programming language can be used to write the scripts and programs. The ULRM starts the script or program that monitors the UNIX or Linux resource. The custom action is configured to start the script or program once or periodically (based on the interval defined in the policy). When the monitoring conditions are met, the script or program posts events to the Operations Sentinel server by writing an event report to standard output.
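A custom action script might be structured as in the following sketch. The event report layout shown here is a placeholder only: the attribute names, the `|` separator, and the helper names `format_event_report` and `run_check` are all hypothetical, and the actual event report format must be taken from the Operations Sentinel documentation.

```python
import sys


def format_event_report(attributes):
    """Join attribute/value pairs into one line (placeholder format only)."""
    return "|".join(f"{key}={value}" for key, value in attributes.items())


def run_check(queue_depth, limit, out=sys.stdout):
    """Write one report to `out` when the monitored condition is met.

    Returns True if a report was written, False otherwise.
    """
    if queue_depth <= limit:
        return False
    report = format_event_report(
        {
            "resource": "queue",  # hypothetical attribute names and values
            "severity": "major",
            "text": f"depth {queue_depth} exceeds {limit}",
        }
    )
    # The monitor reads the report from the script's standard output.
    print(report, file=out, flush=True)
    return True
```

The script would obtain `queue_depth` from whatever resource it monitors and take `limit` from the arguments defined in the policy. Flushing after each report avoids delays caused by output buffering when the script runs for a long time.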
The ULRM lets the administrator enable or disable monitoring for the entire policy (to account for scheduled maintenance) or for individual resource groups (for debugging and testing). Changes take effect immediately after the policy is distributed to the managed UNIX or Linux system. These changes apply only to the affected resource groups and do not disrupt monitoring of other resource groups.