Story by Helen Hill for MGHPCC

MIT’s Joint Program on the Science and Policy of Global Change (JPSPGC) seeks to combine scientific research on changes to land, air, and water with innovative policy analysis to confront the global climate challenge. Underpinning its scientific research are complex computer models of many kinds, models that until a few weeks ago were running on machines housed in increasingly unsatisfactory accommodation on the MIT campus. Now, though, those same machines have a handsome new home: JPSPGC’s Svante HPC cluster has taken advantage of the opportunity to relocate to the MGHPCC’s state-of-the-art high performance computing facility.
With typically 100 active user accounts, the Svante cluster is the principal large-scale computing resource for students, postdocs, and research staff working under the Joint Program umbrella – a loose association combining personnel from MIT’s Center for Energy and Environmental Policy Research (CEEPR), the Engineering Systems Division (ESD), the Center for Global Change Science (CGCS), and the Department of Earth, Atmospheric and Planetary Sciences (EAPS), as well as collaborators from other institutions. “Managing an HPC cluster for such a varied user base presents a major challenge,” says Dr. Jeffery Scott, a research scientist in EAPS studying the ocean’s role in climate change and manager of the Svante cluster. “Some people are running 3D Earth system models, which require a significant number of fast, interconnected processor cores. Others are running economic models on a single processor, yet need to run hundreds to thousands of simulations. Others write complex programs for data analysis that requires vast amounts of RAM. Another group runs big multi-core MATLAB jobs to solve complex optimization problems. And yet another group spends much of its time downloading and analyzing data from the Coupled Model Intercomparison Project (CMIP) archives, which requires significant storage capacity. In a nutshell, our user base is using Svante for everything beyond the capacity of a typical laptop or desktop machine – and of course, with a problem as big as climate science, that’s pretty much everything.”
“We had previously been housing Svante in a neighboring building, but the last few years became increasingly difficult as our system evolved from a generic Beowulf-type cluster into the mid-sized HPC cluster it is today. We had the air-conditioning capacity upgraded, but eventually we were running at full room power capacity, tapping every electrical outlet around the perimeter. Intermittent cooling and power failures were increasingly putting the hardware at risk, as well as interfering with user productivity. Basically, any further upgrade of Svante was going to be problematic: it was even getting difficult to manage all the hardware from a physical and connection standpoint, since we lacked the tailored resources of a computing facility to neatly manage cabling. There was simply no more space, no more power, and no more cooling capacity.”