Troubleshooting Fabric Management Platform Problems

Health Status of the Fabric Management Platform changes to “Warning” or “Critical”

Each problem below lists its symptoms first, followed by the cause and then the solution.

  • FFM Server Health status changes to “Critical” or “Warning” state.

  • Fabric Manager user interface stops functioning.

Fabric Manager services are not responding.

Recommendations

  • Allow time for the services to settle down. The health status of the Fabric Management Platform may return to OK on its own.

  • Ensure that only the minimum number of applications are running on the Fabric Management Platform.

If this does not work, then:

  1. Log on to the Fabric Management Platform as the root user using PuTTY (or use the RDP client session of the Fabric Management Platform). The root user login is root/Administer4Me.

  2. Using the following commands, restart the Fabric Manager services:

    # rcffmservices stop
    # rcffmservices start

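The stop and start steps above can be wrapped in a small helper that reports the result of each step. This is only a sketch: the function name restart_services is hypothetical, and it assumes that the service control command returns a nonzero exit status on failure, which this guide does not state.

```shell
# Run "<cmd> stop" followed by "<cmd> start" and report any nonzero status.
# $1 = service control command (this guide uses rcffmservices).
# A failed stop only produces a warning, because the services may already
# be down; a failed start makes the function return 1.
restart_services() {
    "$1" stop  || echo "warning: '$1 stop' returned $?" >&2
    "$1" start || { echo "error: '$1 start' returned $?" >&2; return 1; }
}

# On the Fabric Management Platform, as root:
#   restart_services rcffmservices
```

The start step runs even if the stop step fails, so that a platform whose services are already down is still brought back up.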
  • CPU status under FFM Server Health changes to “Critical” or “Warning” state while performing some actions.

  • CPU status under FFM Server Health changes to “Critical” or “Warning” state when the system is idle.

Excessive CPU utilization.

Recommendations

  • Close any terminal or Remote Desktop (RDP) sessions that are no longer needed.

  • Contact Unisys Support Center and generate FFM dumps for further analysis and resolution.
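Before closing sessions, it can help to see which processes are using the most CPU. The following is a minimal sketch that assumes a standard Linux ps command is available on the Fabric Management Platform; the helper name top_cpu and the row count of 5 are illustrative only.

```shell
# Sort "PID %CPU COMMAND" rows read from stdin by CPU share, highest first,
# and keep the first N rows (default 5).
top_cpu() {
    sort -k2,2nr | head -n "${1:-5}"
}

# Live use: header-less ps output piped through the helper.
ps -eo pid=,pcpu=,comm= | top_cpu 5
```

The rows that are printed identify candidate sessions or applications to close; the output can also be included when contacting the support center.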

Memory status under FFM Server Health changes to “Critical”.

Excessive utilization of the Fabric Management Platform memory. This might occur:

  1. After a file transfer activity such as an image upload.

  2. During the transfer of files between the FMP and your local system by using WinSCP.

Recommendation

A "Critical" memory status may be a temporary condition. However, if the alert persists, ensure that only the minimum number of applications are running on the Fabric Management Platform. If this does not resolve the problem, do the following:

  1. Log on to the Fabric Management Platform as the root user using PuTTY (or use the RDP client session of the Fabric Management Platform). The root user login is root/Administer4Me.

  2. Using the following command, free up the memory in the Fabric Management Platform:

    sync;echo 3 > /proc/sys/vm/drop_caches

    Note: Ensure that no other operations are performed in the Fabric Management Platform while executing this command.
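To confirm that the command had an effect, the memory figures can be compared before and after it runs. The sketch below reads /proc/meminfo, which is present on Linux systems; the helper name meminfo_kb is hypothetical and not part of the Fabric Manager software.

```shell
# Print the value (in kB) of one field from a meminfo-style file.
# $1 = field name, for example MemFree
# $2 = file to read (defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Snapshot the figures that dropping the caches is expected to change.
for field in MemFree Buffers Cached; do
    printf '%-8s %s kB\n' "$field" "$(meminfo_kb "$field")"
done
```

Run the snippet once before and once after the drop_caches command; a lower Cached value and a higher MemFree value suggest that the memory was released.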

Swap status under FFM Server Health changes to “Critical”.

Excessive utilization of the swap space.

Recommendation

Ensure that only the minimum number of applications are running on the Fabric Management Platform.

If this does not work, then:

  1. Log on to the Fabric Management Platform as the root user using PuTTY (or use the RDP client session of the Fabric Management Platform). The root user login is root/Administer4Me.

  2. Using the following command, free up the memory in the Fabric Management Platform:

    sync; echo 3 > /proc/sys/vm/drop_caches
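Dropping the caches releases page cache rather than pages that are already swapped out, so it is worth checking whether swap usage actually falls afterward. The following sketch computes swap usage from /proc/meminfo; the helper name swap_used_pct is hypothetical.

```shell
# Print swap usage as a whole-number percentage, or 0 when no swap is configured.
# $1 = meminfo-style file to read (defaults to /proc/meminfo)
swap_used_pct() {
    awk '$1 == "SwapTotal:" { total = $2 }
         $1 == "SwapFree:"  { free = $2 }
         END {
             if (total > 0) printf "%d\n", (total - free) * 100 / total
             else print 0
         }' "${1:-/proc/meminfo}"
}

echo "Swap used: $(swap_used_pct)%"
```

If the percentage stays high after the steps above, closing applications is more likely to help than repeating the command.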

Disk status under FFM Server Health changes to “Critical”.

Excessive utilization of the disk space.

You can free up the disk space by deleting the following:

  • Fabric Manager and Fabric Management Platform dumps: See the ClearPath Forward Administration and Operations Guide for more information on deleting the dump files.

  • Images, blueprints, and s-Par firmware staged in the Fabric Manager disk: See the ClearPath Forward Administration and Operations Guide for more information on how to delete blueprints, gold images, and s-Par firmware.

To determine how much disk space is currently being used, log on to the Fabric Management Platform as the root user using Remote Desktop or an SSH client such as PuTTY, and then execute the following commands in the terminal window:

  • To see the disk usage at directory level, largest first, use the command

    # du -ch | sort -h -r

  • To see the percentage of disk space used on each file system, use the command

    # df -T
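To pick out only the file systems that are close to full, the df output can be filtered. This is a minimal sketch: the helper name fs_over_limit and the 80 percent limit are illustrative and are not thresholds used by the FFM Server Health display. It uses the POSIX output format (df -P) so that each file system stays on one line.

```shell
# Read `df -P` output on stdin and print "mount-point use%" for every
# file system whose use percentage is at or above the limit given as $1.
fs_over_limit() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= limit) print $6, $5
    }'
}

df -P | fs_over_limit 80
```

No output means that no file system has reached the limit. Mount points that contain spaces are not handled by this sketch.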

Events status under FFM Server Health changes to “Critical” or “Warning”.

The following types of events (application types) are logged for the FMP:

  • IPMI SEL Events – Events from the FMP’s iDRAC (the same as the IPMI SEL events for other platforms)

  • Nagios Events – Events that are logged when the CPU, swap, memory, or disk status changes to critical or warning

Click on the number of critical or warning events to view the events in the Events & Alerts tab under Diagnostics. For more information on Events & Alerts, see Managing Diagnostic Data.