April 12, 2016
In this newsletter:
Welcome, Steve Girvin, Deputy Provost for Research
"The Yale Center for Research Computing is off to a great start in its new location on St. Ronan Street. The foundation is now laid for key initiatives on which the center will build for years ahead. The faculty co-directors are working to set the academic direction of the center, and the YCRC Steering Group, a governance board including faculty from a broad range of disciplines, is helping guide the direction of the initiatives. We look forward to involving even more faculty and researchers in the center's activities in the coming months."
Current Goals for the YCRC, Kiran Keshav, Executive Director of the Center
With the creation of the Yale Center for Research Computing (YCRC), FY16 marked the beginning of a new era in research technology support at Yale across the Faculty of Arts and Sciences (FAS) and the Yale School of Medicine (YSM). In response to the needs of the research community, the center is focused on advancements in four main areas: cyber-infrastructure, advanced support, education & training, and community development.
In FY16, the YCRC oversaw a number of investments in enhanced cyber-infrastructure and services, including a High Performance Computing (HPC) cluster refresh, expansion of the science network, new storage offerings, and the installation of the first HIPAA-aligned cluster. The YCRC continued to provide dedicated, advanced support for the Yale Center for Genome Analysis, as well as increased levels of support for several departments and disciplines, including Geology & Geophysics, Astrophysics and Computational Psychology.
FY16 saw the center’s first set of boot camps and courses, and YCRC members, in cooperation with the Center for Teaching and Learning, helped facilitate the development of key teaching tools, including an Orbital Simulator and Mathematica tools for analysis of genetic drift. Finally, the recent co-location of YCRC staff into a creative space on Science Hill will help drive community engagement using the center’s new auditorium and collaboration space.
FY17 is anticipated to bring continued development in these foundational areas, with activities driven by both the big data needs on campus and the need to support collaborative science.
A Message from our Faculty Co-Directors, Daisuke Nagai and Harlan Krumholz
"The YCRC strives to address some of the outstanding challenges facing the emerging areas of scientific computing and data science," remarked Daisuke Nagai, Associate Professor of Physics and Astronomy. "In addition to offering new and extended services on computing, networking and data storage, the YCRC is committed to bringing researchers and students up to speed on emerging hardware and software technologies through a variety of boot camps and workshops. I am hopeful that the Center’s new home on St. Ronan Street will serve as a hub of computational scientists and engineers and foster a variety of interdisciplinary research and collaborations in the coming months."
"The advent of massive databases and access to remarkable computational power is transforming research across all fields," commented Harlan Krumholz, Harold H. Hines Jr. Professor of Medicine. "For Yale to enable faculty and students to fulfill their promise and to contribute new insights, it is essential that we develop exceptional computer infrastructure and capacity – and create an environment that fosters interdisciplinary collaboration and community-wide skill building. We need to share best practices across campus – and find ways to take advantage of the immense talent around us. This Center should be a vibrant space where we bring new tools to research and make it possible to do the work that was once inconceivable."
Two new High Performance Computing clusters were installed early this year. The first, Milgram, is named after the social psychologist Stanley Milgram and is dedicated to work in the Psychology department on the link between behavior and modern brain imaging techniques (see Computation and Brain Function below). Milgram's deployment is a key milestone for Yale research infrastructure, as it represents the first HPC cluster specifically designed for the privacy needs of human subjects research. This small, 12-node cluster with roughly 150 terabytes of storage is serving as our pilot for HIPAA alignment. Lessons from this implementation should be useful in thinking about larger implementations in coming years.
The second cluster is Ruddle, named after the noted Yale biologist Frank Ruddle. This cluster is a refresh of BulldogN, which serves the Yale Center for Genome Analysis. Ruddle has significantly enhanced computational power as compared to BulldogN, adding 400 cores and roughly a petabyte of additional high performance storage. This larger capability is being delivered using reduced physical and power resources, thus meeting greater demand for genomic computing at a lower operating cost.
In addition, the YCRC has been partnering with researchers in the Computer Science department and with Yale ITS to expand the Yale Science Network, a 100 Gb network specifically designed for large data transfer. By the end of FY16, five new locations will have installed the science network, bringing the total to 16 buildings and covering a significant portion of the science research being done on campus. As the network infrastructure expands in FY17, the project will also aim to create an intelligent network using software. The development of this software-defined network, led by Yale researcher Richard Yang, will allow the current campus network to interface seamlessly with the science network and, as the science network grows and becomes more widely used, will allow for intelligent utilization of the network to keep data transfers speedy and efficient.
New and Expanded Services
Storage: We have been broadening service offerings by extending the storage options available to HPC users. Shortly, we will announce the availability of two additional storage tiers on the high performance computing clusters. The first tier provides a redundant, highly reliable storage solution for project files. Data can be moved between all clusters and this network-attached central storage device, which is based on the standard (not high performance) tier of Yale ITS’ Storage@Yale service. Another storage tier will be based on the Storage@Yale archive tier service. This solution will also be accessible on all clusters, and is intended for long term storage of data.
Science Research Software Core: The Science Research Software Core (SRSC) is now part of the YCRC’s service offerings. The SRSC was implemented previously by the Provost in recognition of the demand for cutting edge software for use in analytical and experimental research, design and analysis. The software supported by the Core is specifically applicable to the fields of Physics, Applied Physics, Chemistry, Computer Science, Geology & Geophysics and Engineering, but it is available to all faculty and students. A full list of supported software titles is available on the YCRC website.
Support and consultation are a key component of the Core. Misha Guy is the lead contact for the Core and has many years of experience in scientific software with formal training and an extensive background in mathematics and physics. Misha is available to provide technical support for all software offered by the Science Research Software Core, as well as consultations to assess which software package might best suit a researcher's computing needs. To learn more about the SRSC, please visit the YCRC website and select the SRSC service.
Research Support Focus, Computation and Brain Function
Expanding research support is a key goal for the YCRC. In this newsletter we are focusing on research in brain function and how the YCRC is helping researchers in this area. Brain function research is in the midst of a remarkable phase of development. The convergence of new imaging technologies, methods for online and remote behavioral collection, as well as a field-wide emphasis on open access data, have provided the opportunity for data-driven discovery science. Although these advances have the potential to provide deep insights into human brain function, the field is now facing a vast river of high-dimensional data, taxing the limits of traditional laboratory-based computing solutions. In this regard, a key factor slowing the pace of discovery is the availability of computational resources that can handle the exponential growth of scientific datasets.
Unraveling the relations that link the brain’s biological processes to cognition and behavior is perhaps one of the most complicated scientific endeavors humanity has ever faced. Yale researchers tackling scientific questions in this arena are actively benefiting from the support of the Yale Center for Research Computing (YCRC) faculty and staff. To highlight one example, researchers in the laboratory of Avram Holmes are focused on discovering the fundamental organization of large-scale human brain networks. The postdoctoral fellows and graduate students in this group seek to identify how individual variability in the brain’s core functional and network properties might serve to influence behavior, and potentially psychiatric illness risk in vulnerable populations. Due to the sensitive nature of aspects of the group's data, including the collection and analysis of protected health information (e.g., measures of brain structure and function, genome-wide genetic data, history of psychiatric treatment, etc.), they are unable to utilize existing open access high-performance compute clusters. To address this unique need, the YCRC has worked to develop and support the Milgram cluster, a HIPAA-aligned server solution that allows for both computationally intense analyses and secure high performance storage.
The YCRC and the Milgram cluster provide the Holmes laboratory with the hardware infrastructure and support needed to pursue their research goals. The development of these resources at Yale facilitates computationally sophisticated, well-powered discovery science. Now when researchers in the Holmes laboratory test innovative ideas about how brain networks are organized, or how brain function links to behavior, their only limit is their scientific vision.
As another example, the research division named Neurocognition, Neurocomputation, and Neurogenetics (N3) focuses on systems neuroscience in psychiatry. Alan Anticevic, co-Director of N3, and his team, along with other researchers at N3, combine cognitive, computational and genetic approaches as well as pharmacological and neuroimaging genomic techniques to bridge multiple levels of experimental analysis and link genes, circuits, and behavior.
Both the Anticevic Lab and the N3 division rely on the state-of-the-art high performance computing resources provided by the Yale Center for Research Computing. The Anticevic Lab harnesses the combination of multi-modal imaging techniques, including task-based, resting-state, structural and pharmacological functional neuroimaging. The lab combines such approaches with computational modeling techniques, leveraging the computational infrastructure provided at Yale, to understand neural circuit dysfunction in psychiatric disorders.
YCRC has a new /home
For the past year, members of the Yale Center for Research Computing have been working from various locations across campus, including 25 Science Park, the Arthur K. Watson Building, West Campus, the Kline Geology Lab and others. The team has co-located to 160 St. Ronan Street near Science Hill. In addition to office space, the new location has its own auditorium, a conference room and collaboration space, all specifically designed for extended community engagement and training. The site is conveniently located next to the Pierson Sage garage and is on both the Blue and Red Yale Shuttle lines so that access by the Yale Community, and research computing communities beyond Yale, is easy.
Education and Training
HPC Bootcamp – Scripting with Python
On April 13th, 2016, HPC Support Specialist Steve Weston and Senior Research Scientist Rob Bjornson will give a two-hour introductory course on Python, a modern interpreted scripting language that is easy to learn and allows you to quickly automate tasks.
The course will cover:
- Basic Python syntax and data types
- Basic tasks like reading and writing files
- Using Python modules for additional functionality
- Best practices and gotchas
- Walk-through examples of useful Python scripting for HPC
This is not a hands-on course.
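For readers curious about the kind of task the outline above points to, here is a minimal sketch that is not taken from the course materials. It assumes a hypothetical usage file with `user` and `hours` columns, and the function names `total_hours_by_user` and `write_report` are invented for illustration. It reads the data, totals it with standard-library modules, and writes a short report.

```python
import csv
import io
import sys
from collections import defaultdict


def total_hours_by_user(lines):
    """Sum the 'hours' column per 'user' from CSV-formatted lines."""
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row["user"]] += float(row["hours"])
    return dict(totals)


def write_report(totals, out):
    """Write a header plus one 'user,total_hours' row per user, sorted by name."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["user", "total_hours"])
    for user in sorted(totals):
        writer.writerow([user, f"{totals[user]:.1f}"])


if __name__ == "__main__":
    # Hypothetical sample data standing in for a real usage file.
    sample = io.StringIO("user,hours\nalice,1.5\nbob,2.0\nalice,0.5\n")
    write_report(total_hours_by_user(sample), sys.stdout)
```

To run the same functions against a real file, pass an open file object (for example, one returned by `open(path, newline="")`) in place of the in-memory sample.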
Location and registration information is available on the training page.
XSEDE OpenMP Training
On May 10th, 2016, Yale will host XSEDE satellite training for OpenMP in our 160 St. Ronan Street Auditorium. This workshop will be very hands-on and will use the foremost available platforms and technology. The OpenMP workshop is part of a series of XSEDE remote workshops held across many host sites. Please visit the YCRC website for student enrollment information on this and other XSEDE workshops.
GIS Training for Research
We are very excited that, later this year, we will be able to offer a workshop on Geo-Computation analysis for spatially complex datasets using Yale’s HPC clusters. The workshop will be run by Giuseppe Amatulli, a forest scientist and spatial modeler with expertise in computer science. More information on this workshop will be available later this summer.
Amazon Cloud Storage and Compute: The YCRC has teamed with Amazon to help researchers make use of Amazon’s Cloud Credits for Research program, a grant program that provides limited free cloud compute and storage resources to research teams on campus. To find out more about the program, visit the Amazon program site.
NVIDIA's GPU Education and Research Programs: The Yale Center for Research Computing is delighted to announce that NVIDIA, the world leader in visual computing and GPUs, has selected the YCRC both as a GPU Education Center and as a GPU Research Center. GPU-accelerated computing leverages the parallel processing capabilities required for advanced graphics processing to enable dramatic increases in software performance.
The YCRC’s designation as a GPU Education Center is based on Yale’s demonstrated commitment to advanced education on parallel computing using GPUs and CUDA C/C++. In particular, Yale’s Department of Computer Science offers two courses (CPSC 424 and CPSC 524) that focus on a variety of parallel computing techniques including the use of GPU accelerators and the CUDA framework. Both classes are taught by Andrew Sherman of the YCRC.
Several research groups at Yale have already included or will include GPU computing in their computational work. Collectively these research groups represent a range of disciplines and include research on such topics as the mechanical function of the heart (Campbell group in Biomedical Engineering); computational studies of water splitting in photosystem II (Batista group in Chemistry and the Yale Energy Sciences Institute); structural and functional MRI analysis (Duncan, Anticevic, Pelphrey and Staib groups in the Yale Medical School, Psychology and Biomedical Engineering); reconstructing heterogeneous protein structure from cryogenic EM (Tagare and Sigworth groups in the Yale School of Medicine and Biomedical Engineering); and new algorithms for extremely large linear algebraic equations (Rokhlin group in Computer Science).
Because of the vision, quality and impact of this research, NVIDIA has also designated the YCRC a GPU Research Center. As a GPU Research Center, the YCRC will have pre-release access to NVIDIA GPU hardware and software, opportunities to attend exclusive events and to participate in training sessions, and support from a designated NVIDIA technical liaison.