Parallel Heisenberg Spin Model on Supercomputer Architectures

The Heisenberg spin model we studied in [1] is used to describe magnetic materials. The model can be extended to a large number of atoms in order to compare with the bulk properties of magnetic materials, whose measurement can involve millions or billions of atoms. In the future we will link our study of the Heisenberg model with density functional calculations of its parameters. An important application area is high-density storage on nanomagnetic materials.

We measured the performance of a parallel Heisenberg spin model using the Monte Carlo method and the Metropolis algorithm on various supercomputer architectures: IBM BlueGene/L, the PSC Quadrics cluster, SGI Altix and QCDOC. This set of systems covers a variety of supercomputer approaches, including shared memory, Linux clusters and specialty machines. BlueGene/L, although originally envisioned as a computer for biomolecular simulation, has an efficient implementation of MPI and can be applied to a variety of problems. The PSC Quadrics cluster is a Linux cluster with a Quadrics interconnect and is also multipurpose. SGI Altix is a shared-memory machine with a NUMAflex interconnect that can also be applied to distributed problems. QCDOC was originally built as a specialized machine for lattice QCD, but it has an efficient message-passing library called QMP and can also be applied to a wide range of applications, including computational biology [2] and nanoscience.

The Heisenberg spin model of magnetism is defined by the energy

$$E = -J \sum_{nn} \mathbf{S}_i \cdot \mathbf{S}_j$$

where $\mathbf{S}_i$ is a three-component spin at lattice site $i = (i_1, i_2, i_3)$, the sum over nn is over nearest-neighbor lattice sites, and $J$ is the nearest-neighbor coupling (written here in the common sign convention in which $J > 0$ favors aligned spins). The number of lattice sites, or spins, along a given direction is $L$. We parallelized the Heisenberg model by using domain decomposition on large lattices of up to 16,777,216 atomic grid points.
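
The Metropolis update for this model can be illustrated with a minimal serial sketch. It is not the parallel code used in the study. It assumes the common energy form E = -J Σ S_i · S_j, a cubic lattice of L × L × L sites with periodic boundaries, and a proposal that draws a fresh spin direction uniformly on the unit sphere. The function names, the proposal rule and the boundary conditions are illustrative assumptions.

```python
import math
import random

# Offsets to the six nearest neighbors on a cubic lattice.
NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def random_unit_vector(rng):
    """Draw a three-component spin uniformly from the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def neighbor_sum(spins, L, x, y, z):
    """Vector sum of the six nearest-neighbor spins (periodic boundaries)."""
    total = [0.0, 0.0, 0.0]
    for dx, dy, dz in NEIGHBORS:
        s = spins[(x + dx) % L][(y + dy) % L][(z + dz) % L]
        for k in range(3):
            total[k] += s[k]
    return total


def metropolis_sweep(spins, L, J, beta, rng):
    """One Monte Carlo step: one proposed update per lattice site.

    Returns the number of accepted proposals.
    """
    accepted = 0
    for x in range(L):
        for y in range(L):
            for z in range(L):
                h = neighbor_sum(spins, L, x, y, z)
                old = spins[x][y][z]
                new = random_unit_vector(rng)
                # Energy change of replacing old by new, for E = -J sum S_i . S_j.
                dE = -J * sum((new[k] - old[k]) * h[k] for k in range(3))
                # Metropolis rule: always accept downhill, uphill with exp(-beta dE).
                if dE <= 0.0 or rng.random() < math.exp(-beta * dE):
                    spins[x][y][z] = new
                    accepted += 1
    return accepted


def total_energy(spins, L, J):
    """Total energy, counting each nearest-neighbor bond once."""
    e = 0.0
    for x in range(L):
        for y in range(L):
            for z in range(L):
                s = spins[x][y][z]
                forward = (spins[(x + 1) % L][y][z],
                           spins[x][(y + 1) % L][z],
                           spins[x][y][(z + 1) % L])
                for nb in forward:
                    e -= J * sum(s[k] * nb[k] for k in range(3))
    return e
```

In a domain-decomposed version, each processor would hold one sub-block of the lattice and exchange the spins on its block faces with neighboring processors before `neighbor_sum` is evaluated on boundary sites. For scale, a cubic lattice with L = 256 has 256³ = 16,777,216 sites.
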

The number of Monte Carlo steps invoked in the Metropolis algorithm is important in reducing the error in the computation, so we wanted to study the number of Monte Carlo steps per second that can be achieved on supercomputer architectures. Previous studies of the parallel Monte Carlo algorithm for the Ising model were performed in [3], where formulas for the number of Monte Carlo steps per second were obtained.

Two of the main properties of parallel computation we studied were strong scaling and weak scaling. Strong scaling means that we fix the problem size, vary the number of processors, and measure the speedup. This is very important for Monte Carlo simulations because one way of reducing the error is to increase the number of Monte Carlo steps, and that can be done in a reasonable time frame only by increasing the number of Monte Carlo steps per second. Weak scaling means that we vary the problem size and the number of processors together such that the execution time stays the same. In this way we can obtain higher lattice resolution in the same amount of wall-clock time.

Results from a weak scaling study of QCDOC as a function of the number of processors are shown in Figure 1. Performance on the vertical axis is measured in the number of Monte Carlo steps per second. The results are in excellent agreement with weak scaling.
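
The two scaling measures can be made concrete with a short sketch. The function names and the timings in the usage note are hypothetical and are not measurements from this study. They only show how speedup and efficiency would be computed from wall-clock times.

```python
def strong_scaling(times):
    """Speedup and parallel efficiency for a fixed problem size.

    times maps processor count -> wall-clock seconds. Results are
    reported relative to the run with the fewest processors.
    """
    p0 = min(times)
    t0 = times[p0]
    result = {}
    for p in sorted(times):
        speedup = t0 / times[p]
        efficiency = speedup * p0 / p  # 1.0 means ideal strong scaling
        result[p] = (speedup, efficiency)
    return result


def weak_scaling_efficiency(times):
    """Weak-scaling efficiency for a fixed problem size per processor.

    times maps processor count -> wall-clock seconds. Ideal weak scaling
    keeps the time constant, which gives an efficiency of 1.0.
    """
    p0 = min(times)
    t0 = times[p0]
    return {p: t0 / times[p] for p in sorted(times)}
```

For example, with made-up timings `{1: 100.0, 2: 50.0, 4: 30.0}`, `strong_scaling` reports a speedup of 2.0 on two processors and an efficiency of about 0.83 on four.
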
Using the data from the study, we were able to fit a three-term Laurent expansion in P, where P is the number of processors. One way to understand this formula is that the first term represents the time spent in computation and the other two represent time spent in communication between processors. This form is general enough to include the Ising model parallel performance derived in [3] as well as the Heisenberg model. Note that the equation is consistent with weak scaling.
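
A fit of this kind could be carried out with an ordinary least-squares sketch such as the one below. The function name, the exponents (-1, 0, 1) in the usage note, and the synthetic data are placeholders chosen only to show the mechanics. They are not the authors' fitted form or values.

```python
def fit_laurent(P_values, y_values, exponents):
    """Least-squares fit of y(P) = sum_k c_k * P**exponents[k].

    Solves the normal equations with Gaussian elimination and returns
    the coefficients c_k in the same order as exponents.
    """
    n = len(exponents)
    P = [float(p) for p in P_values]
    # Normal equations A c = b for the basis functions P**e.
    A = [[sum(p ** (ei + ej) for p in P) for ej in exponents] for ei in exponents]
    b = [sum(y * p ** ei for p, y in zip(P, y_values)) for ei in exponents]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            f = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= f * A[col][k]
            b[row] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = b[row] - sum(A[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = s / A[row][row]
    return coeffs
```

As a usage example, synthetic data generated from `y = 3.0 / P + 0.5 + 0.01 * P` at P = 1, 2, 4, 8, 16, 32 and fitted with `exponents=(-1, 0, 1)` should return coefficients close to 3.0, 0.5 and 0.01.
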
References
Last Modified: January 31, 2008