Administration

Introduction to system configuration

The An Najah Computing Cluster (ANCC) is designed for high-throughput computing. It consists of about 50 worker nodes and 8 data servers. To manage the system efficiently and reliably, the provisioning tool Cobbler and the configuration management tool Puppet are used. It is highly recommended to use these tools for all system administration, even simple package installation. With them, commissioning new hosts and consistently managing running hosts can be achieved with minimal work. These tools are discussed later.

To bring the system up initially or to recover from a catastrophic failure, the essential server machines (physical or virtual) might need to be reinstalled by hand. This document addresses the low-level configuration of these machines.

System components

Bringing up the cluster from scratch requires installing and configuring the following machines and services:

  • t3ps.najah.edu - The main gateway machine.

  • cobbler - Virtual machine for provisioning other machines.

  • puppet - Virtual machine for configuration management.

With these services running, and provided there are reliable backups, one should be able to install and configure the system with minimal effort. Provisioning new hosts and managing host configurations are covered in the Cobbler and Puppet sections, respectively.

t3ps.najah.edu

This machine acts as the Internet gateway, firewall, and interactive user node. It is configured with:

  • Network bridge for virtual machines

  • Routing between the external IP and the internal IP

  • KVM virtual machine support through the libvirt package

Configuring the network bridge:

Please see Class 17 - KVM: Virtual Machines from the An Najah HTC Cluster and Grid Course.

yum install bridge-utils

/etc/sysctl.d/90-cuhep-no-ipv6.conf

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1

/etc/sysctl.d/99-router.conf

# Controls IP packet forwarding
net.ipv4.ip_forward = 1
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
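
Once these files are in place, the settings can be loaded with `sysctl --system` and compared against the running kernel. The following is a minimal sketch of such a check and is not part of the original setup: the function names trim and check_sysctl_conf, and the optional second argument (an alternate root directory for dry runs), are hypothetical, and the sketch assumes one simple `key = value` pair per line.

```shell
#!/bin/bash
# Sketch only: function names and the ROOT argument are illustrative assumptions.

# trim STRING: print STRING without leading or trailing whitespace.
trim() {
    local s="$1"
    s="${s#"${s%%[![:space:]]*}"}"
    s="${s%"${s##*[![:space:]]}"}"
    printf '%s' "$s"
}

# check_sysctl_conf FILE [ROOT]
# Compare each "key = value" line in FILE with the live value found under
# ROOT (default /proc/sys). Returns non-zero if a key differs or is missing.
check_sysctl_conf() {
    local file="$1" root="${2:-/proc/sys}" status=0
    local key value path live
    while IFS='=' read -r key value || [ -n "$key" ]; do
        key="$(trim "$key")"
        value="$(trim "$value")"
        # Skip blank lines and comments.
        case "$key" in ''|'#'*|';'*) continue ;; esac
        # net.ipv4.ip_forward -> ROOT/net/ipv4/ip_forward
        path="$root/${key//.//}"
        if [ ! -r "$path" ]; then
            echo "MISSING $key"
            status=1
            continue
        fi
        live="$(trim "$(cat "$path")")"
        if [ "$live" = "$value" ]; then
            echo "OK      $key = $value"
        else
            echo "DIFF    $key: expected $value, running $live"
            status=1
        fi
    done < "$file"
    return "$status"
}

# Typical use on the gateway (as root), after the files above are written:
#   sysctl --system
#   check_sysctl_conf /etc/sysctl.d/99-router.conf
```

A non-zero return value indicates that at least one setting in the file is not active, which usually means the file has not been loaded yet.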

/etc/sysconfig/network-scripts/ifcfg-br1

IPV6INIT=no
IPV6_AUTOCONF=no
UUID=d817b67c-ecf8-4f18-a9c7-2b7dc179e137
IPADDR="10.0.0.1"
NETMASK="255.0.0.0"
BOOTPROTO="static"
DEVICE="br1"
ONBOOT="yes"
TYPE="Bridge"

/etc/sysconfig/network-scripts/ifcfg-enp3s0

TYPE=Ethernet
BOOTPROTO=static
DEVICE=enp3s0
ONBOOT=yes
BRIDGE=br1
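
Before restarting the network, it can help to confirm that the two ifcfg files agree with each other. The sketch below is one possible sanity check, not the procedure from the course material: the function names ifcfg_get and check_bridge_pair are hypothetical, and the sketch assumes plain `KEY=value` lines with optional double quotes.

```shell
#!/bin/bash
# Sketch only: function names are illustrative assumptions.

# ifcfg_get FILE KEY: print the value of KEY from an ifcfg file, unquoted.
# The last assignment wins, as it would when the file is sourced.
ifcfg_get() {
    local file="$1" key="$2" value
    value="$(sed -n "s/^${key}=//p" "$file" | tail -n 1)"
    value="${value%\"}"
    value="${value#\"}"
    printf '%s\n' "$value"
}

# check_bridge_pair DIR SLAVE
# Confirm that DIR/ifcfg-SLAVE names a bridge in BRIDGE= and that the
# bridge's own ifcfg file exists and declares TYPE=Bridge.
check_bridge_pair() {
    local dir="$1" slave="$2" bridge type
    if [ ! -f "$dir/ifcfg-$slave" ]; then
        echo "ifcfg-$slave not found in $dir"
        return 1
    fi
    bridge="$(ifcfg_get "$dir/ifcfg-$slave" BRIDGE)"
    if [ -z "$bridge" ]; then
        echo "ifcfg-$slave has no BRIDGE= line"
        return 1
    fi
    if [ ! -f "$dir/ifcfg-$bridge" ]; then
        echo "ifcfg-$bridge not found in $dir"
        return 1
    fi
    type="$(ifcfg_get "$dir/ifcfg-$bridge" TYPE)"
    if [ "$type" != "Bridge" ]; then
        echo "ifcfg-$bridge has TYPE=$type, expected Bridge"
        return 1
    fi
    echo "$slave is attached to $bridge ($(ifcfg_get "$dir/ifcfg-$bridge" IPADDR))"
}

# Typical use on the gateway (as root):
#   check_bridge_pair /etc/sysconfig/network-scripts enp3s0
#   systemctl restart network    # assumes the legacy network service is in use
#   brctl show br1               # enp3s0 should be listed as an interface
#   ip addr show br1             # should show 10.0.0.1
```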

Installing libvirt:

Please see Class 17 - KVM: Virtual Machines from the An Najah HTC Cluster and Grid Course.

yum install qemu-kvm qemu qemu-img virt-manager libvirt libvirt-python libvirt-client virt-install virt-viewer
systemctl start libvirtd
systemctl enable libvirtd
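
After the packages are installed, a quick check can confirm that the host is able to run KVM guests. The sketch below is an illustration rather than part of the course instructions: the function name cpu_virt_flag and its optional file argument are hypothetical, and it only looks for the vmx (Intel) or svm (AMD) CPU flag.

```shell
#!/bin/bash
# Sketch only: the function name and its argument are illustrative assumptions.

# cpu_virt_flag [CPUINFO]
# Print "vmx" (Intel VT-x) or "svm" (AMD-V) if the CPU advertises hardware
# virtualization; return non-zero if neither flag is present.
cpu_virt_flag() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    if grep -qw vmx "$cpuinfo"; then
        echo vmx
    elif grep -qw svm "$cpuinfo"; then
        echo svm
    else
        return 1
    fi
}

# Typical use on the gateway:
#   cpu_virt_flag || echo "no hardware virtualization; check the BIOS settings"
#   lsmod | grep kvm                # the kvm modules should be loaded
#   systemctl is-active libvirtd    # should print "active"
#   virsh list --all                # lists defined guests, e.g. cobbler and puppet
```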

Cobbler Installation

Please see Class 3, Class 4, and Class 5 from the An Najah HTC Cluster and Grid Course.

Puppet Installation

Please see Class 6, Class 7, and Class 8 from the An Najah HTC Cluster and Grid Course.

HTCondor Installation

The HTCondor installation has been completely puppetized. The Puppet module is called htcondor. The code is kept in /usr/local/adm/puppet/modules.