High Performance Computing Cluster (HPCC) Update
Summer 2002
This past spring has been a busy one for the Committee on Technical
Computing (CTC). The CTC is made up of representatives from
the Colleges of Engineering, Science, and Arts & Letters, and the
Mendoza College of Business, along with representatives
from the Office of Information Technologies (OIT). This committee
provides direction and makes decisions pertaining to the HPCC
and the "High-End" OIT clusters.
This past spring, CTC Chairman Dr. Mark Stadtherr sent an e-mail
to all University Faculty announcing the expiration of
the lease on the University's 32-processor SGI Origin 2000
machine (medusa.hpcc.nd.edu). The message solicited input
on various options for future HPCC facilities.
The responses were presented to the CTC.
Also this spring, a user survey was sent requesting input from all HPCC users.
The results were presented to the CTC and gave the committee
a better understanding of how the HPCC is used, especially which
software is used.
Thanks to the many Faculty and HPCC users who provided input for
these surveys.
In addition to gathering input from these surveys, the CTC requested vendor presentations
from IBM, Sun, Dell and SGI. All of this input was used in determining
the best solutions for upgrading the HPCC.
In summary, the CTC decided upon the following:
When the lease on the 32-processor Origin 2000 (medusa.hpcc.nd.edu)
expires in early September 2002, the machine will be returned
to SGI.
An IBM 1300 Linux cluster will be purchased and installed this summer.
The compute portion of the cluster will consist of 32 IBM
x330 compute nodes. Each node has 2 processors, thus adding
64 processors to the HPCC. A major addition to the HPCC infrastructure
is a Cisco 6509 Gigabit switch. This switch
will provide interprocessor communication and fast access
to shared disk. An interactive front-end machine will be available
for interactive use and development, similar to the Sun and
SGI interactive front ends.
Another major upgrade to the HPCC infrastructure is the addition
of an IBM FAStT500 storage server. This storage server is a
SAN which provides 1.7 terabytes (usable) of fast shared filesystem
storage to the cluster through 4 IBM x342 servers using GPFS.
This additional disk space will also be made available to
all current HPCC machines over the network, using new Gigabit
network connections added to those machines.
In addition, an 8-processor Sun V880 with 900 MHz copper processors
will be added.
More specific details of the new hardware (and pictures) will be available
in the next week on the HPCC web site: http://www.nd.edu/~hpcc
HPCC OUTAGES:
It was originally hoped that these systems would be available
for an "Early User Test Period" on July 1st. Due to delays
in ordering and shipping, we are now targeting July 15th. Watch
the "Message-of-the-Day" for updates.
In order to update hardware and software, we are planning an outage of
the current HPCC, tentatively on July 4th. Watch the HPCC
"Message-of-the-Day" for further details. The outage will
also allow us to update the batch queuing system to the latest
version in order to better support the Linux cluster. This
outage requires that no jobs be queued or running. We are
requesting that only short jobs be run until then. Any jobs
still running at that time will be killed.
Please note that the IP addresses of all the HPCC equipment will change
during this outage. The campus domain name server (DNS) will
be updated on the morning of July 4th.
If this would cause MAJOR problems, please notify us as soon as possible
by sending e-mail to hpcc@nd.edu.
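Users who are unsure whether the address change affects them may want to check their own scripts and configuration files for numeric IP addresses, since anything that refers to a machine by host name (such as medusa.hpcc.nd.edu) should keep working once DNS is updated. The following is only a rough sketch of such a check in Python. The function name find_hardcoded_ipv4 is made up for illustration, and 192.0.2.10 is a placeholder from a range reserved for documentation, not an actual HPCC address.

```python
import ipaddress
import re

# Candidate dotted-quad pattern; each match is validated with ipaddress below.
_CANDIDATE = re.compile(r"(?<![\d.])\d{1,3}(?:\.\d{1,3}){3}(?![\d.])")


def find_hardcoded_ipv4(text):
    """Return (line_number, address) pairs for IPv4 literals found in text.

    Hypothetical helper: it only reports candidates, it does not know
    which of them belong to HPCC machines.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in _CANDIDATE.finditer(line):
            try:
                ipaddress.IPv4Address(match.group())
            except ValueError:
                continue  # looks like an address but is not valid, e.g. 999.1.1.1
            hits.append((lineno, match.group()))
    return hits


if __name__ == "__main__":
    # Placeholder script contents; the numeric address is not a real HPCC address.
    sample = "ssh user@192.0.2.10\nscp results.dat medusa.hpcc.nd.edu:\n"
    for lineno, address in find_hardcoded_ipv4(sample):
        print(f"line {lineno}: replace {address} with a host name")
```

Note that this simple pattern will also flag four-part version numbers such as 1.2.3.4, so each reported line should be reviewed by hand.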
Page modified 12/12/02