Monitoring

Overview

System monitoring is an important part of maintaining a reliable system. Many open source packages are available for this purpose. These packages can monitor usage of CPU, network, and other resources; send messages in the event of an error; or shut down the system in an emergency. Examples of industry standard systems are:

Example of a monitoring web page: https://advance.colorado.edu/computing/ClusterStatus
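
To make the overview concrete, here is a minimal sketch of the kind of check a monitoring package performs: collect a few usage figures, compare them against limits, and report anything out of range. It is not taken from any particular package. The names collect_metrics, find_alerts, and LIMITS, and the threshold values themselves, are assumptions chosen for illustration. os.getloadavg is available on Unix-like systems only.

```python
import os
import shutil

# Hypothetical thresholds, chosen only for illustration.
LIMITS = {"load_per_cpu": 1.5, "disk_used_fraction": 0.9}


def collect_metrics(path="/"):
    """Gather a few basic usage figures from the local machine."""
    load1, _, _ = os.getloadavg()  # 1-minute load average (Unix only)
    cpus = os.cpu_count() or 1
    disk = shutil.disk_usage(path)
    return {
        "load_per_cpu": load1 / cpus,
        "disk_used_fraction": disk.used / disk.total,
    }


def find_alerts(metrics, limits=LIMITS):
    """Return a message for every metric that exceeds its limit."""
    return [
        f"{name} is {value:.2f}, above limit {limits[name]:.2f}"
        for name, value in metrics.items()
        if name in limits and value > limits[name]
    ]


if __name__ == "__main__":
    # A real package would send mail or trigger a shutdown here;
    # this sketch only prints the alerts.
    for message in find_alerts(collect_metrics()):
        print("ALERT:", message)
```

Run from cron or a similar scheduler, a script like this covers the "monitor and send a message" part of the description above. Full monitoring packages add history, graphs, and escalation on top of the same basic loop.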