Example Collaborative Projects at the MGHPCC.

ATLAS Northeast Tier 2 Center “NET2”
Commonwealth Computational Cloud for Data Driven Biology (C3DDB)
Center for Data Science
Mass Open Cloud
MGHPCC Supercloud
Northeast Cyberteam Project
Northeast Storage Exchange

The projects listed above are just a few among the many collaborative projects utilizing the MGHPCC. If you would like your collaborative project listed here please contact

The U.S. ATLAS Northeast Tier 2 Center
Contact: Saul Youssef

The bulk of the computing resources available to U.S. physicists working on the ATLAS experiment at the Large Hadron Collider is found at a “Tier 1” center at Brookhaven National Laboratory and at four “Tier 2” centers spread around the United States. One of these four is the Northeast Tier 2 Center (NET2). NET2 is located at MGHPCC and is operated as a collaboration between Boston University and Harvard University. NET2 currently has approximately 10,000 cores of worker nodes and approximately 6 petabytes of storage. NET2 is used by consortium physicists and about 1000 others spread around the world. NET2 also shares its resources via the Open Science Grid so that projects like LIGO can compute on NET2 during times of off-peak demand. Large Hadron Collider proton-proton collisions reach higher energies than have ever been seen before. This is new territory for fundamental physics, where new discoveries become possible, as has already happened with the ATLAS discovery of the long-sought Higgs boson in 2012.

Center for Data Science
Contact: Brant Cheikes

The Center for Data Science (CDS) in the College of Information and Computer Sciences (CICS) at the University of Massachusetts Amherst (UMass Amherst) is the campus’s leading interdisciplinary hub for data science education, research, and industry collaboration. Best known for their foundational work in machine learning, natural language processing, and computer vision, the nearly 40 CDS-affiliated computer scientists are internationally renowned researchers and educators across many data science specialty areas. Their software runs in Fortune 500 companies; their research is among the most cited in its field, and their graduates have become leaders in business and academia. CDS promotes closer collaboration among the 150+ affiliated faculty across the UMass campus and the regional Five College Consortium and fosters new educational programs in data science at all degree levels. Powered by its industry affiliates program and numerous industry-engagement events throughout the academic year, CDS builds intellectually strong, mutually beneficial relationships with industrial partners and entrepreneurs.

CDS has installed the Gypsum cluster at this location. Gypsum is a large cluster of computers, each of which contains four NVIDIA GPU cards. There are 75 nodes with four NVIDIA Titan-X GPUs, and 25 nodes with four NVIDIA M40 GPUs. The work proposed here will be performed in part using high-performance computing equipment obtained under a grant from the Collaborative R&D Fund managed by the Massachusetts Technology Collaborative.

Commonwealth Computational Cloud for Data Driven Biology
Contact: Chris Hill

The C3DDB project brings together major research universities, established commercial organizations and new startups in Massachusetts to accelerate life-science research and innovation, helping these entities gain a competitive advantage in a life-science world where data generation is rapidly outpacing the capacity to analyze it. The project’s three main objectives are:

  • Boost individual and collaborative research in big-data enhanced life science research among the university partners of the MGHPCC.
  • Provide on-demand, big-data enhanced life science capabilities to the life science start‐up and enterprise community across the Commonwealth.
  • Work proactively to connect commercial enterprises and the research community and encourage exchange of ideas and innovation in big-data enhanced life science discovery.

Contact: Chris Hill

The cluster is called engaging1 (eo for short) because one of the project’s goals is to develop more interactive and dynamic approaches to computational science, a concept that has been referred to as “engaging supercomputing” or computational science 2.0. The cluster consists of several head or login nodes, hundreds to thousands of compute nodes, and a very large central Lustre storage system.

Mass Open Cloud
Contacts: Peter Desnoyers, Orran Krieger

The Massachusetts Open Cloud (MOC) is a public cloud being developed on the model of an Open Cloud Exchange (OCX), in which many stakeholders, rather than just a single provider, participate in implementing and operating the cloud. Hosted at Boston University and housed at the Hariri Institute for Computing, the project is a unique collaborative effort between higher education, government, non-profit entities and industry. The academic partners are Boston University, Northeastern, MIT, Harvard, and the University of Massachusetts. Public sector partners are the Massachusetts Technology Collaborative (MTC)/Commonwealth of Massachusetts and the United States Air Force (USAF) at Hanscom Air Force Base. The current core industry partners are Cisco, Intel, NetApp, Red Hat and Two Sigma.

Primary goals are:

  • To create an inexpensive and efficient at-scale production cloud utility suitable for sharing and analyzing massive data sets and supporting a broad set of applications.
  • To create and deploy the OCX model, enabling a healthy marketplace for industry to participate at all levels in the cloud and profit from doing so.
  • To create a testbed for research in and prototyping of cloud technology, empowering a broad community of researchers, open source developers and companies to develop new cloud computing technologies.

The MGHPCC Supercloud

The MGHPCC Supercloud is an extension of the MIT Supercloud, operated by the MIT Lincoln Laboratory Supercomputing Center (LLSC).

The Supercloud is particularly useful for people who need to transition from their desktop workstation to a high performance computing cluster in order to solve larger or more complex problems, but do not want to take the time to gain specialized knowledge about parallelizing application software, running batch jobs, optimizing startup times for large collections of simultaneous jobs, and other details that end users must otherwise master in a Linux cluster environment.

The Supercloud supports most of the desktop computing environments that are commonly used in the research and engineering community today, including MATLAB, RStudio, Jupyter Notebooks, and machine learning frameworks such as TensorFlow, Caffe, and Theano.

An account signup form can be found at

Northeast Cyberteam Project
Contact: Julie Ma

Originating in May 2017, the Northeast Cyberteam Initiative is a 3-year NSF-funded effort to build a regional pool of Research Computing Facilitators to support researchers at small and mid-sized institutions in Maine, Massachusetts, New Hampshire and Vermont. Research Computing Facilitators (RCFs) are experts at figuring out how to match the right compute resources to the task at hand, something that can stymie researchers who are, for example, sifting through billions of records to find a specific pattern of genes that correlates with a particular form of cancer; or examining massive quantities of sensor data to understand movements on the sea floor. The RCF’s job is to help researchers make use of local, regional, and national high-performance computing resources when computing needs exceed the capacity of the scientist’s desktop.

RCFs can often be found in the research computing groups at large universities and corporations, but are scarce at smaller institutions. Recognizing that promising research can be stopped in its tracks when high-performance computing is needed but unavailable, the Northeast Cyberteam Program was created to fill the gap. Over the three-year period, the program will support 42 compute-intensive projects with RCFs-in-training (“students”), each paired with a mentor, to facilitate research computing needs for a 3-month period. RCF students will also have the opportunity to work on a live help desk with a mentor, honing their consultative skills while getting exposure to a broad range of research computing topics. As part of the program, RCF students will join a community of facilitators that has up-to-the-minute visibility into research computing projects and programs taking place in the region. Stipends ranging from $3,000 to $6,000 are available for students participating in the program.

Northeast Storage Exchange
Contact: Saul Youssef

If one wants to create a regional shared cyberinfrastructure for the Northeastern part of the U.S., it is hard to imagine a better place to start than MGHPCC, where we already find major facilities for Boston University, Harvard University, MIT, Northeastern University and UMass, conveniently located a few feet from each other. One major step in the direction of shared storage is a proposal by all the MGHPCC consortium institutions to create, operate, and maintain a large storage facility at MGHPCC with multi-100Gb/s access to any consortium institution and multi-100Gb/s access to the wide area research networks. The storage will be a “data lake” providing POSIX, object store, block storage for clouds and virtual machines, Globus endpoints and possibly other storage types in the future, all commonly managed and shared as a regional resource. This project, called the Northeast Storage Exchange (NESE), was funded by the National Science Foundation and is about to make its first major deployment as of February 2018. The basic economics of storage indicate that NESE has great potential to grow and to be an important resource for consortium institutions, for research in the Northeast, and for national and international research projects.