Administration
Introduction to system configuration
The An Najah Computing Cluster (ANCC) is designed for high-throughput computing. It consists of about 50 worker nodes and 8 data servers. To manage the system efficiently and reliably, the provisioning tool Cobbler and the configuration management tool Puppet are used. It is highly recommended to use these tools for all system administration, even simple package installation. With them, new hosts can be commissioned and running hosts managed consistently with minimal work. These tools are discussed later.
To bring the system up initially or to recover from a catastrophic failure, the essential server machines (physical or virtual) might need to be reinstalled by hand. This document addresses the low-level configuration of these machines.
System components
Bringing the system up from scratch requires installing and configuring the following machines and services:
t3ps.najah.edu - The main gateway machine.
cobbler - Virtual machine for provisioning other machines.
puppet - Virtual machine for configuration management.
With these services running, and provided there are reliable backups, one should be able to install and configure the system with minimal effort. Provisioning new hosts and managing host configurations are covered in the Cobbler and Puppet sections respectively.
t3ps.najah.edu
This machine acts as the Internet gateway, firewall, and interactive user node. It is configured with:
Network bridge for virtual machines
Routing between the external IP and the internal IP
KVM virtual machine hosting using the libvirt package
Configuring the network bridge:
Please see Class 17 - KVM: Virtual Machines from the An Najah HTC Cluster and Grid Course.
yum install bridge-utils
/etc/sysctl.d/90-cuhep-no-ipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
/etc/sysctl.d/99-router.conf
# Controls IP packet forwarding
net.ipv4.ip_forward = 1
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
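Once the two sysctl files above are in place, the settings can be loaded without a reboot by running sysctl --system as root. The following is a minimal sketch of how one might then confirm that the running kernel agrees with a file. The helper names sysctl_expected and sysctl_check are hypothetical and not part of any package; the comparison is a plain string match, which is adequate for the single-valued keys used here.

```shell
#!/bin/sh
# Hypothetical helpers (not part of any package) for checking sysctl.d files.

# Print the "key=value" pairs from a sysctl.d-style file, skipping comment
# and blank lines and removing the whitespace around "=".
sysctl_expected() {
    sed -e '/^[[:space:]]*[#;]/d' \
        -e '/^[[:space:]]*$/d' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e 's/[[:space:]]*=[[:space:]]*/=/' "$1"
}

# Compare every key in the file with the value the kernel currently reports.
sysctl_check() {
    sysctl_expected "$1" | while IFS='=' read -r key want; do
        have=$(sysctl -n "$key" 2>/dev/null)
        if [ "$have" = "$want" ]; then
            echo "OK       $key = $have"
        else
            echo "MISMATCH $key: file says '$want', kernel says '$have'"
        fi
    done
}

# Example, as root on the gateway:
#   sysctl --system
#   sysctl_check /etc/sysctl.d/99-router.conf
```

A MISMATCH line for net.ipv4.ip_forward would mean the gateway is not routing between the external and internal networks.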
/etc/sysconfig/network-scripts/ifcfg-br1
IPV6INIT=no
IPV6_AUTOCONF=no
UUID=d817b67c-ecf8-4f18-a9c7-2b7dc179e137
IPADDR="10.0.0.1"
NETMASK="255.0.0.0"
BOOTPROTO="static"
DEVICE="br1"
ONBOOT="yes"
TYPE="Bridge"
/etc/sysconfig/network-scripts/ifcfg-enp3s0
TYPE=Ethernet
BOOTPROTO=static
DEVICE=enp3s0
ONBOOT=yes
BRIDGE=br1
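The two interface files have to agree with each other: the physical interface names its bridge with BRIDGE=, and the bridge file must declare TYPE=Bridge. The sketch below checks that relationship before the network is restarted. The helper names ifcfg_get and bridge_check are hypothetical, and the script only reads the files; it does not change any network state.

```shell
#!/bin/sh
# Hypothetical helpers (not part of any package) for checking ifcfg files.

# Print the value of one KEY=value setting from an ifcfg file, with any
# double quotes removed. If the key appears more than once, the last wins,
# which matches how the file behaves when it is sourced by the shell.
ifcfg_get() {
    # $1 = file, $2 = key
    sed -n "s/^$2=//p" "$1" 2>/dev/null | tail -n 1 | tr -d '"'
}

# Check that interface $2 is attached to a bridge defined in directory $1.
bridge_check() {
    dir=$1
    iface=$2
    bridge=$(ifcfg_get "$dir/ifcfg-$iface" BRIDGE)
    if [ -z "$bridge" ]; then
        echo "ifcfg-$iface: no BRIDGE= setting"
        return 1
    fi
    if [ ! -f "$dir/ifcfg-$bridge" ]; then
        echo "ifcfg-$bridge: file not found"
        return 1
    fi
    btype=$(ifcfg_get "$dir/ifcfg-$bridge" TYPE)
    if [ "$btype" != "Bridge" ]; then
        echo "ifcfg-$bridge: TYPE is '$btype', expected 'Bridge'"
        return 1
    fi
    echo "$iface -> $bridge ($(ifcfg_get "$dir/ifcfg-$bridge" IPADDR))"
}

# Example on the gateway:
#   bridge_check /etc/sysconfig/network-scripts enp3s0
# After the network has been restarted, the live state can be inspected
# with "brctl show" (from bridge-utils) and "ip addr show br1".
```

With the files listed above, the check should report that enp3s0 is attached to br1 with address 10.0.0.1.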
Installing libvirt:
Please see Class 17 - KVM: Virtual Machines from the An Najah HTC Cluster and Grid Course.
yum install qemu-kvm qemu qemu-img virt-manager libvirt libvirt-python libvirt-client virt-install virt-viewer
systemctl start libvirtd
systemctl enable libvirtd
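KVM needs hardware virtualization support, so it can be worth confirming that the CPU advertises it before debugging libvirt itself. The following sketch looks for the vmx (Intel VT-x) or svm (AMD-V) flag. The function name cpu_virt_flag is a hypothetical helper, and the follow-up commands in the comments are the usual checks, shown here as suggestions rather than as the procedure used on this cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package).

# Print the hardware virtualization flag found in a cpuinfo file
# (default /proc/cpuinfo): "vmx" for Intel VT-x or "svm" for AMD-V.
# Prints nothing and returns non-zero if neither flag is present.
cpu_virt_flag() {
    flag=$(grep -E -o -w 'vmx|svm' "${1:-/proc/cpuinfo}" 2>/dev/null | head -n 1)
    [ -n "$flag" ] && echo "$flag"
}

# Example on the gateway:
#   cpu_virt_flag || echo "no hardware virtualization; check the BIOS settings"
# Further checks once libvirtd has been started:
#   lsmod | grep kvm               # the kvm kernel modules are loaded
#   systemctl is-active libvirtd   # the daemon is running
#   virsh list --all               # lists defined virtual machines
```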
Cobbler Installation
Please see Class 3, Class 4, and Class 5 from the An Najah HTC Cluster and Grid Course.
Puppet Installation
Please see Class 6, Class 7, and Class 8 from the An Najah HTC Cluster and Grid Course.
HTCondor Installation
The HTCondor installation has been completely puppetized. The Puppet module is called htcondor. The code is kept in /usr/local/adm/puppet/modules.
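After a recovery from backup, a quick sanity check is to confirm that the module is actually present before any agent runs against it. The sketch below assumes the htcondor module follows the standard Puppet layout, with its main class in manifests/init.pp; that layout is an assumption and is not confirmed by this document. The function name puppet_module_ok is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package).

# Check that module $2 exists under module directory $1.
# Assumes the standard Puppet layout: <moduledir>/<name>/manifests/init.pp
puppet_module_ok() {
    if [ -f "$1/$2/manifests/init.pp" ]; then
        echo "module $2: found"
    else
        echo "module $2: $1/$2/manifests/init.pp missing"
        return 1
    fi
}

# Example on the Puppet server:
#   puppet_module_ok /usr/local/adm/puppet/modules htcondor
# On a worker node, as root, a dry run shows what the agent would change
# without applying anything:
#   puppet agent --test --noop
```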