By Helen Hill for MGHPCC News
The MGHPCC Supercloud, operated by the MIT Lincoln Laboratory Supercomputing Center (LLSC), is a facility intended for people who need to move from a desktop workstation to a high-performance computing cluster in order to solve larger or more complex problems, but who do not have the time to gain in-depth specialized knowledge of parallelizing application software, running batch jobs, optimizing startup times for large collections of simultaneous jobs, and the other details that end users must otherwise master in a Linux cluster environment.
Managed by MIT, but as a separate enterprise dealing with security-related (classified) research, Lincoln Laboratory began as a center for radar and early warning system research and has a long history of advanced technologies in sensors (signals intelligence, image recognition, etc.) and associated computing. Today the Laboratory (now the largest US Department of Defense federally funded research and development center) is home to numerous unclassified open research activities, such as the development of the imaging devices and lenses for TESS and of LIDAR-based tools now providing FEMA with automated post-disaster operational damage analysis.
The MGHPCC Supercloud project grew out of a desire to strengthen collaboration between Lincoln Laboratory and regional partners in the MGHPCC Consortium (BU, Harvard, MIT, Northeastern, and UMass). LLSC, having already demonstrated success in providing a similar cloud service to Lincoln Laboratory researchers, opened its doors to offer the service to members of the MGHPCC Consortium.
The Supercloud supports most of the desktop computing environments commonly used in the research and engineering community today, including MATLAB, RStudio, Jupyter Notebooks, and machine learning frameworks such as TensorFlow, Caffe, and Theano.
"If there's something a user needs and we don't already have it or something equivalent to it we will work with the user to help them get set up with what they need," says Lauren Milechin, who is in charge of new-user liaison.
In particular, the MGHPCC Supercloud provides a software platform that allows users to launch large-scale interactive compute jobs from their desktops and makes it easy to share large volumes of project data. The Supercloud enables reference datasets to be pre-positioned in databases, while also providing access to software modules and training to reduce user ramp-up time, all within a responsive, interactive supercomputing environment.
Simple command-line commands allow users to tailor hardware selection (i.e., from a single processor core to a whole node), carry out language-agnostic parallel data analysis using LLMapReduce (an implementation of native MapReduce), or run Parallel MATLAB. Users are also able to couple Supercloud resources with web-based interactive development environments like the popular Jupyter Notebook, or manage dynamic databases from a simple GUI.
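For readers unfamiliar with the map-reduce pattern mentioned above, the following is a minimal sketch in plain Python. It is not LLMapReduce and does not reproduce its interface; the function names `mapper`, `reducer`, and `map_reduce` and the sample strings are hypothetical, chosen only to illustrate the idea of applying one function independently to many inputs and then combining the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def mapper(text):
    """Map step: count the words in one input chunk (a stand-in for one input file)."""
    return Counter(text.lower().split())


def reducer(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def map_reduce(chunks, workers=4):
    """Run the mapper on every chunk in parallel, then fold the results together."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(mapper, chunks))
    return reduce(reducer, partials, Counter())


# Hypothetical inputs; on a cluster each chunk would typically be a separate file.
chunks = ["radar signal radar", "signal image", "image image radar"]
totals = map_reduce(chunks)
print(dict(totals))
```

Because each mapper call touches only its own input, the map step can be spread over as many cores as are available, which is what makes the pattern attractive for users without a parallel computing background.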
"While you can run traditional big MPI Fortran code on the MGHPCC Supercloud, our really big success has been our ability to get really large numbers of people with little if any prior parallel computing background effectively working on these systems in very short order," says Jeremy Kepner, a Lincoln Laboratory Fellow and Founding Head of the LLSC.
High-performance data analysis (HPDA) requires not only high-level programming environments but also rapid interaction and fast turnaround. By optimizing every aspect of its HPDA system, the LLSC has demonstrated that its TX-Green system, running on 32 000 cores (512 x 64-core Xeon nodes) with the same setup and software stack as the MGHPCC Supercloud, can launch 260 000+ data analytics in just 40 seconds, i.e., 6000+ launches per second, or a 1000x speed-up over standard approaches.
As another measure, consider analytic applications written exclusively for Microsoft Windows. Standard approaches based on virtual machines or Windows HPC can take hours to launch across thousands of cores. MGHPCC Supercloud launch system optimizations enable the launch of 16 000+ Microsoft Windows environments (running WINE) on 16 000+ cores (256 x 64-core Xeon nodes) in just five minutes, i.e., 50+ launches per second, or a 100x speed-up over standard approaches.
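The launch rates quoted above follow directly from the stated totals and times. As a quick check, this short Python sketch redoes the arithmetic using only the figures given in the article; the variable and function names are illustrative.

```python
def launch_rate(launches, seconds):
    """Average launches per second over the whole launch window."""
    return launches / seconds


# TX-Green: 512 nodes x 64 cores, 260 000 launches in 40 seconds.
tx_green_cores = 512 * 64
tx_green_rate = launch_rate(260_000, 40)

# Windows (WINE) environments: 256 nodes x 64 cores, 16 000 launches in 5 minutes.
windows_cores = 256 * 64
windows_rate = launch_rate(16_000, 5 * 60)

print(f"TX-Green: {tx_green_cores} cores, {tx_green_rate:.0f} launches/s")
print(f"Windows:  {windows_cores} cores, {windows_rate:.1f} launches/s")
```

The results (32 768 and 16 384 cores, about 6500 and 53 launches per second) agree with the rounded "32 000 cores", "16 000+ cores", "6000+" and "50+" figures in the text.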