Endeavour
Endeavour was purchased as part of a full cluster from Microway (quote MWYQ11029, ~$16,000 for the base system).
It came with 13 x2 Number Smasher nodes (the total was ~$90,000).
It arrived at UNH in April 2009.
Notes on configuration status/changes and the ToDo list are at the bottom.
The Endeavour web server is active: http://endeavour.unh.edu/
It runs the Ganglia monitoring software at https://endeavour.unh.edu/ganglia
The Endeavour RAID card is reachable at http://10.0.0.199
The Endeavour switch is reachable at http://10.0.0.253
Hardware temperature monitoring: http://10.0.0.98
Cacti statistics: http://roentgen.unh.edu/cacti
RAID Setup
- Endeavour has an Areca RAID card on 10.0.0.199.
- There are 24 channels on the RAID card.
- Current setup (a quick OS-side check is sketched after this list):
  - RAID SET #00 - 9 disks @ 2 TB each = 18 TB raw, 3 volume sets
    - 0/0/0 - RAID 6 - 100 GB
    - 0/0/1 - RAID 6 - 100 GB
    - 0/0/2 - RAID 6 - 13.8 TB = data1
  - RAID SET #02 - 1 disk @ 750 GB
    - 0/0/4 - Passthrough
  - RAID SET #04 - 12 disks @ 4 TB each = 48 TB raw, 1 volume set
    - 0/0/6 - RAID 6 - 40 TB = data2
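A quick way to cross-check the large volume sets from the operating system, assuming they are mounted as /data1 and /data2 (as noted in the setup section below):

 df -h /data1 /data2    # reported sizes should roughly match the data1 and data2 volume sets
 cat /proc/scsi/scsi    # lists the disk devices the Areca card presents to the host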
Upgrading Nodes
See Upgrading the nodes.
System Usage
This section explains some of the special uses of this system.
OpenPBS = Torque = Portable Batch System
Toque, "PBS", "OpenPBS", is our batch system. Local information for Torque
Commands
- pbsnodes : Gives a quick overview of all the known nodes, whether they are up, and what their status is. "pbsnodes -a" lists everything about all the nodes, "pbsnodes -l up" shows a list of the nodes that are up, and "pbsnodes -l down" does the same for the ones that are down.
- xpbs : Graphical interface to PBS. Really old and probably not that useful.
- xpbsmon : Graphical interface for monitoring nodes. It gives a quick view of node status; use Ganglia for more sophisticated node stats.
- qsub : Command-line tool for submitting jobs to PBS (a sample job script is shown after this list).
- qstat : Information on what is in the queue.
- showq : Information on what is in the queue from the scheduler (Maui); tells you which job will run next.
- maui : Use 'sudo /etc/init.d/maui restart' to restart the scheduler if the queuing system goes down.
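For reference, a minimal sketch of a Torque/PBS job script; the job name, resource request, and program are placeholders rather than an actual job on this cluster:

 #!/bin/bash
 # Job name, one node with two processors, and a one-hour wall-clock limit (placeholder values).
 #PBS -N test_job
 #PBS -l nodes=1:ppn=2
 #PBS -l walltime=01:00:00
 # Merge stdout and stderr into a single output file.
 #PBS -j oe
 # Start in the directory the job was submitted from, then run the (placeholder) program.
 cd $PBS_O_WORKDIR
 ./my_program

Save it as, say, myjob.pbs, submit it with "qsub myjob.pbs", and watch it with qstat or showq.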
qperf Command
qperf measures bandwidth and latency between two nodes. It can work over TCP/IP as well as the RDMA transports.
Many more built-in tests (such as rc_bi_bw) are listed in the man page.
On the first node, just run:
 qperf
On the second node, run this to test both InfiniBand and Ethernet:
 qperf -t 5 node2.farm.physics.unh.edu rc_bi_bw tcp_bw
Initial Setup and Configuration
- Set the UNH IP address (endeavour.unh.edu) on eth1. [done]
  - This made the system think of itself as "endeavor" rather than "master", causing PBS to get confused. The PBS configuration in /var/spool/pbs was adjusted, and the Maui scheduler configuration in /usr/local/maui/maui.cfg was modified. [done]
- I switched the IP address on eth0 to 10.0.0.100 from 10.0.0.1 (since that is the usual gateway address, and we want to bridge the two backend networks). [done]
  - This requires ALL "hosts" files on the nodes to be modified. [done, all nodes but 25]
  - Also, the /root/.shosts, /root/.rhosts, /etc/ssh/ssh_known_hosts, and /etc/ssh/shosts.equiv files need to be copied from node2 to node*. [done, all nodes but 25]
  - The file /var/spool/pbs/server_name needs to be updated as well. [done, all nodes but 25]
  - The /etc/pam.d/system-auth-ac needs to include the ldap module. (NOT DONE only for 2,3)
- Set the root password to the standard scheme. [done, master only]
- Set up the LDAP client side. [done, master only]
- Recompiled PBS to include the xpbs and xpbsmon commands. [done]
- Configured and started the iptables firewall. [done, master only]
- Integrated the backend network with the farm backend network (bridged the network switches). [done]
- Set up automount, with the standard /net/data and /net/home. [done, master only]
  - TODO: We need a new rule that resolves /net/data/node2 for the disk in node2, etc. The nodes need to export their /scratch partition. The other partitions may not be needed, since the "rcpf" command (a foreach with rcp) can copy files in batch.
- The /etc/nodes file included the "master" node. This is too dangerous: it means that in a batch copy the file is also automatically copied back to the master, with potentially dangerous results.
- To add users to the "Microway Ganglia control" part, add them to /etc/mcms.users. The password is the login password; LDAP is honored.
- Set up and started SPLUNK - it runs its own server and forwards to pumpkin.
- Added the Endeavour disks /data1 and /data2 to the export table and the LDAP auto.data table, so "ls /net/data/endeavour{1,2}" works.
- On node2 (so far), to get users and home directories integrated (a sketch of the resulting client configuration follows this list):
  - Copy /etc/ldap.conf and /etc/openldap/* to the node from endeavour.
  - Copy /etc/nsswitch.conf to the node from endeavour.
  - Restart autofs (/etc/init.d/autofs restart).
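For reference, a minimal sketch of what the copied client configuration amounts to on a node; the exact files on endeavour are authoritative, and these lines just assume the standard files-then-LDAP lookup order:

 # Relevant /etc/nsswitch.conf entries after the copy:
 #   passwd:     files ldap
 #   shadow:     files ldap
 #   group:      files ldap
 #   automount:  files ldap
 # Quick checks once ldap.conf and nsswitch.conf are in place and autofs has been restarted
 # ("someuser" is a placeholder for any LDAP account):
 getent passwd someuser
 ls /net/home/someuser

If both commands work, the node sees the LDAP accounts and automounts the home directories.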
TO DO
- Figure out the monitoring system, Ganglia, and other Microway goodies. [partially done]
- LDAP (client) on nodes?
- Test the InfiniBand and MPICH setup (a minimal test run is sketched after this list).
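For the InfiniBand/MPICH test above, a minimal sketch of a test run; the hello-world source, machine file, and process count are placeholders, and the MPICH wrapper scripts are assumed to be on the PATH:

 # Build any MPI hello-world program with the MPICH wrapper compiler.
 mpicc -o hello hello.c
 # Run four processes across the nodes listed in the machine file (one hostname per line).
 mpirun -np 4 -machinefile machines ./hello

If the ranks report hostnames from different nodes, MPICH and the interconnect are at least basically working; the measured bandwidth can then be compared against the qperf numbers above.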
Long Term To Do
Possible long-term tasks, if manpower is available.
The long-term goal is to have Endeavour able to run as an independent system if need be.
- Run a replica LDAP server on Endeavour.
- Run a replica named (DNS) server on Endeavour and Roentgen.
- Replicate home directories for selected users (this may be too tricky, really), or else create a local copy for each user.