Computing In PHENIX
By Carla Vale
New desktop computers arrive with one or two gigabytes (GB) of memory. Mobile phones and iPods store a few GB of songs and photos. During the last heavy-ion data-taking period, the RHIC experiments recorded nearly a million GB of data (Figure 1). It is an enormous amount of information, which must be stored, retrieved, processed and analyzed before it yields any of its physics secrets.
Our data’s trip to the RHIC Computing Facility (RCF) begins with a 3 km journey over a 10 Gbit/sec fiber optic network. While RHIC is running, PHENIX routinely sends raw data at 300 MB/sec over this link. Once in the RCF, the data files are cached on an array of disks and then stored on tape in a pair of StorageTek robotic tape libraries. Each library is filled with about 6000 tape cartridges and each can store over 2 petabytes (PB, one petabyte = 1 million gigabytes) of data. The interior of each robot looks a bit like the music collection of a 1970s audiophile-gone-mad, with thousands of neatly organized cartridges on shelves and a trio of robotic arms zipping back and forth grabbing cartridges, stuffing them into and retrieving them from an array of tape drives.
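The numbers above can be checked with a little back-of-envelope arithmetic. The sketch below uses only figures quoted in the text (the 10 Gbit/sec link, the 300 MB/sec raw-data rate, and the 6000-cartridge, 2 PB tape libraries); the derived quantities are illustrative, not official specifications.

```python
# Back-of-envelope check of the data-path numbers quoted in the text.
GB = 1e9                      # gigabyte (decimal, as used for storage)
PB = 1e6 * GB                 # petabyte = 1 million gigabytes

link_capacity_MBps = 10e9 / 8 / 1e6   # 10 Gbit/s fiber link -> MB/s
raw_rate_MBps = 300                   # routine PHENIX raw-data rate

utilization = raw_rate_MBps / link_capacity_MBps
print(f"Link capacity: {link_capacity_MBps:.0f} MB/s, "
      f"utilization at 300 MB/s: {utilization:.0%}")

cartridges = 6000
library_capacity_PB = 2
per_cartridge_GB = library_capacity_PB * PB / cartridges / GB
print(f"Implied capacity per cartridge: ~{per_cartridge_GB:.0f} GB")
```

At 300 MB/sec the experiment uses roughly a quarter of the nominal link bandwidth, leaving headroom for bursts, and each tape cartridge holds on the order of a few hundred GB.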
Soon after the RHIC run ends, once the necessary software and calibration constants have been determined and tested, all the data are retrieved from tape and sent to the Central Reconstruction Server (CRS), a farm of over 1000 rack-mounted PCs (roughly half in use by PHENIX), for event reconstruction. The main job of event reconstruction is to perform the CPU-intensive task of pattern recognition, combining distinct pieces of detector information into coherent particle trajectories. It takes about one second on a high-end computer to reconstruct a full PHENIX Au+Au event (2/3 for the central arm, 1/3 for the muon arms). The full reconstruction of the 5.4 billion events collected in Run-7 took about 10 months (Figure 2).
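The scale of the Run-7 reconstruction can also be estimated from the figures in the text: 5.4 billion events at roughly one CPU-second each, on roughly 500 of the CRS farm's 1000+ nodes. The sketch below assumes one job per node and perfect efficiency, which is of course an idealization.

```python
# Rough lower bound on the Run-7 reconstruction wall time, using the
# figures in the text: 5.4e9 events at ~1 CPU-second each, spread over
# the ~500 farm nodes available to PHENIX (idealized: one job per
# node, no overheads).
events = 5.4e9
sec_per_event = 1.0
nodes = 500

cpu_seconds = events * sec_per_event
ideal_days = cpu_seconds / nodes / 86400
print(f"Idealized wall time: {ideal_days:.0f} days "
      f"(~{ideal_days / 30:.1f} months)")
```

The idealized figure comes out to about four months; the fact that the real campaign took about ten months reflects tape staging, calibration passes, failures and re-runs, and shared use of the farm.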
The technology of distributing files of events to a large farm of computers for processing isn't new; it has been done for years at SLAC and Fermilab. But large operations like these require very careful supervision to keep them working smoothly, and several things complicate the effort at RHIC. There is great variation in the complexity of the events we process, from low-multiplicity p+p events to massive Au+Au events. Often, from one RHIC run to the next, new detectors are added, and very soon both STAR and PHENIX will go through major upgrades, which will require significant updates to the reconstruction software.
Since 2005, PHENIX has processed the majority of the data for its spin physics program (polarized p+p collisions) at a remote computing center in Wako, Japan. The data for that effort are sent in real time from the experiment over the network to Japan at nearly 100 MB/sec for a period of several weeks. We have also sent smaller data sets for reconstruction to other computing facilities in France, at Vanderbilt University and at Oak Ridge.
The analysis of the processed data is as large an operation as the initial event reconstruction. As much as possible, analyses that use the same data as input are lumped together and fed the full stream of reconstructed data in an organized procedure we call the “analysis train”. The motivation for doing this is to minimize the fetching of data from tape or disk, which is the principal bottleneck for the dozens of different analyses running at any given time. PHENIX uses a storage management system called dCache to organize the thousands of individual hard drives attached to the farm nodes into what appears to be a single, very large filesystem of hundreds of TB of disk, where the reconstructed data are stored and served to the train.
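The core idea of the analysis train is that the expensive operation, reading reconstructed events from storage, happens once, while every attached analysis sees every event. The sketch below illustrates that pattern in miniature; the module names and event format are invented for illustration, and the real PHENIX train runs compiled analysis code over files served by dCache.

```python
# Minimal sketch of the "analysis train" pattern: events are read
# from storage exactly once, and every registered analysis module is
# handed each event in turn.  Modules and event layout are
# hypothetical examples, not PHENIX code.

def spectra_module(event, results):
    # Toy analysis: count all tracks seen across the data set.
    results.setdefault("n_tracks", 0)
    results["n_tracks"] += len(event["tracks"])

def multiplicity_module(event, results):
    # Toy analysis: record the highest per-event track multiplicity.
    results.setdefault("max_mult", 0)
    results["max_mult"] = max(results["max_mult"], len(event["tracks"]))

def run_train(events, modules):
    """Single pass over the data; one read feeds every analysis."""
    results = {}
    for event in events:            # each event fetched from disk once
        for module in modules:
            module(event, results)
    return results

# Toy data standing in for reconstructed events:
events = [{"tracks": [1, 2, 3]}, {"tracks": [4]}, {"tracks": [5, 6]}]
print(run_train(events, [spectra_module, multiplicity_module]))
# -> {'n_tracks': 6, 'max_mult': 3}
```

Running dozens of independent analyses this way turns many separate passes over the tape and disk systems into one, which is exactly the bottleneck the train is designed to avoid.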
After all these steps, the billions of collision events recorded by the detector are finally ready to be examined. The analysis of a data set goes on for years, much longer than the time that it took to collect or reconstruct it. And with new data rolling in year after year, RHIC will surely keep computers and physicists occupied for a very long time.