Scalable Software Services for Life Science


High-Performance Computing for the Life Sciences

The ScalaLife Competence Centre facilitates computational Life Science research by establishing common ground for community work in this domain. The community includes developers of Life Science applications, research groups that use these applications, and hardware providers.

The Competence Centre welcomes other Life Science software developers and users to join!

Joint projects

ScalaLife participates in several collaborative projects with European research groups, and many scientific publications have been produced with the help of the project.

1. Introducing flexibility in protein-protein docking by discrete molecular dynamics simulations

(collaboration with Institute for Research in Biomedicine (IRB), Barcelona)

Protein-protein interactions are responsible for the transfer of information inside the cell and currently represent one of the most interesting research fields in structural biology. However, experimental approaches have difficulties in providing 3D structures for the hundreds of thousands of specific interactions formed between the different proteins in a living organism. The use of theoretical approaches like docking aims to complement experimental efforts, but current methods have limitations due to problems of sampling of the protein-protein conformational space and the lack of accuracy of available force-fields. Cases that are especially difficult for prediction are those in which complex formation implies a non-negligible change in the conformation of the interacting proteins, i.e. those cases where protein flexibility plays a key role in protein-protein docking. In this work we present a new approach to introduce flexibility in docking by global structural relaxation based on ultra-fast discrete molecular dynamics.

Publication: Agusti Emperador, Albert Solernou, Pedro Sfriso, Carles Pons, Josep Lluis Gelpi, Juan Fernandez-Recio and Modesto Orozco, "Efficient Relaxation of Protein–Protein Interfaces by Discrete Molecular Dynamics Simulations",

J. Chem. Theory Comput., 2013, 9 (2), pp 1222–1229, DOI: 10.1021/ct301039e.

2. Protein Flexibility and Function

(collaboration with Technische Universität München, Fakultät für Informatik, München)

Proteins are intrinsically flexible molecules, so function is often associated with flexibility. Experimental methods to determine protein flexibility are expensive and often time-consuming. Over the past few years, molecular dynamics (MD) simulations have increasingly proved to be an efficient complementary method and a powerful tool for obtaining information on protein dynamics. In MD methods, successive conformations of a protein are calculated by integrating Newton’s equations of motion. The result is a trajectory that describes how the positions and velocities of all atoms vary with time. In this way, important observations can be made that help to better understand proteins, mutations and, eventually, associated diseases.
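To illustrate how a trajectory arises from Newton's equations of motion, the following is a minimal sketch of the velocity Verlet scheme, a common MD integrator, applied to a single particle in a one-dimensional harmonic potential. The function names, the toy potential and the reduced units are assumptions made for illustration only; production MD codes such as GROMACS treat many interacting atoms with full force fields.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equations of motion for one particle in 1D.

    Returns a list of (time, position, velocity) tuples, i.e. a trajectory.
    """
    trajectory = [(0.0, x, v)]
    a = force(x) / mass
    for step in range(1, n_steps + 1):
        # Advance the position using the current velocity and acceleration.
        x = x + v * dt + 0.5 * a * dt * dt
        # Recompute the acceleration at the new position.
        a_new = force(x) / mass
        # Advance the velocity using the average of old and new accelerations.
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        trajectory.append((step * dt, x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Toy force F = -k x (hypothetical stand-in for a real force field)."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy harmonic system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


# Start at x = 1 with zero velocity and integrate for 1000 steps of dt = 0.01.
trajectory = velocity_verlet(x=1.0, v=0.0, force=harmonic_force,
                             mass=1.0, dt=0.01, n_steps=1000)
```

Each entry of `trajectory` holds the time, position and velocity at one step. Checking that `total_energy` stays close to its initial value along the trajectory is a simple sanity test for the chosen time step.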

Lately, MD simulations have been shown to be powerful tools to infer knowledge beyond the static picture given by X-ray crystallography, so the two methods complement each other. In a large-scale study we are now trying to learn how and to what extent MD simulations can help us understand the effect of Single Nucleotide Polymorphisms (SNPs) on protein flexibility and thus on function. One main objective is to learn about the advantages of additional structural information compared to sequence-based predictions. For this project we have created a comprehensive dataset including synonymous and non-synonymous SNPs that we have mapped to known PDB structures at different resolutions and sequence identity cutoffs. The largest dataset created comprises 16,000 mutations distributed among 970 high-quality protein structures, of which 26% are found in Homo sapiens and the majority are enzymes.

Running multiple MD simulations of these mutant and wild-type (WT) proteins for at least 10 ns each will help to further our understanding of the connection between protein function and flexibility. Furthermore, the large scale of this project will make it possible to infer statistical information about MD that has hitherto not been available.

The results from this study will help to further our understanding in two different areas. First, they will clarify the relation between protein structure, flexibility and (mal)function. Second, they will reveal the caveats of MD simulations with respect to statistical validity. Finally, the data created will be of high interest to the scientific community and will be freely available through web download.

Data creation is expected to be finished by the end of 2012, using the new SuperMUC computer. A publication based on the data is likely to be finished by mid-2013.

3. Multiscale modeling of fibril-indicator molecules

(collaboration with Department of Theoretical Chemistry and Biology, KTH, Stockholm)

Amyloid fibril formation has been reported to be responsible for a number of neurodegenerative diseases such as Alzheimer's disease and Parkinson's disease. Early-stage detection of such deposits in the brain and in biostructures, together with quantitative estimation of fibrils, will help in clinical drug trial phases. Many molecules, such as thioflavin-T (THT) and Congo red, can be used for identifying and estimating such fibrils colorimetrically. However, properties such as their solubility and lipophilicity have to be tuned for in vivo imaging applications. We use a multiscale approach that includes docking, molecular dynamics, hybrid QM/MM molecular dynamics and QM/MM response techniques to study the mode of binding of THT to the fibril, the various binding sites of fibrils, the molecular conformation of THT in the fibril and its absorption spectra in the fibril. The aim of this project is to understand the experimentally observed red shift in the absorption spectra of THT in the presence of fibrils and to explain the fibril-induced enhancement in the fluorescence intensity of THT.

In the multiscale approach, structure modelling using hybrid QM/MM molecular dynamics and property modelling using the QM/MM response technique are the computationally demanding steps. ScalaLife is helping to enable more efficient usage of DALTON and GROMACS for performing the simulations. Benchmarking analysis is also needed to establish the optimal number of compute nodes for various QM system sizes and simulation patterns.
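One common way to choose a node count from benchmark timings is to compute speedup and parallel efficiency and then pick the largest node count that still meets an efficiency threshold. The sketch below assumes this criterion; the function names, the 70% threshold and the timing values are hypothetical placeholders for illustration, not measurements from DALTON or GROMACS.

```python
def parallel_efficiency(timings):
    """Return {nodes: (speedup, efficiency)} relative to the smallest run.

    `timings` maps a node count to a wall-clock time in seconds.
    """
    base_nodes = min(timings)
    base_time = timings[base_nodes]
    results = {}
    for nodes, seconds in sorted(timings.items()):
        speedup = base_time / seconds
        # Efficiency is the speedup divided by the ideal speedup.
        efficiency = speedup / (nodes / base_nodes)
        results[nodes] = (speedup, efficiency)
    return results


def largest_efficient_node_count(timings, min_efficiency=0.7):
    """Largest node count whose efficiency is at least `min_efficiency`."""
    results = parallel_efficiency(timings)
    eligible = [n for n, (_, eff) in results.items() if eff >= min_efficiency]
    return max(eligible)


# Placeholder timings (seconds), invented purely to exercise the functions.
example_timings = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 170.0, 16: 130.0}
best_nodes = largest_efficient_node_count(example_timings, min_efficiency=0.7)
```

In practice the same analysis would be repeated for each QM system size and simulation pattern, since the point at which efficiency drops off generally depends on the problem size.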

4. Poisson solver extension in GROMACS

(collaboration with National Center for Supercomputing Applications, Sofia)

The group uses GROMACS extensively for molecular dynamics simulations, which are performed on a BlueGene/P machine. On this architecture, GROMACS scales to only about 5000 cores with the standard PME algorithms. Peicho Petkov is working on implementing a Poisson solver in GROMACS to be used in place of the PME method. The goal is to achieve better performance and scalability on BlueGene machines. Electrostatic field calculations using a Poisson solver will also alleviate some boundary-condition artefacts that arise from other methods.
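As a rough illustration of what solving the Poisson equation in real space involves, the sketch below solves the one-dimensional problem -φ''(x) = ρ(x) with φ = 0 at both ends, using second-order finite differences and the Thomas algorithm for the resulting tridiagonal system. This is a toy example in reduced units (permittivity set to 1) with hypothetical function names; it does not represent the solver being implemented in GROMACS, whose method is not described here.

```python
import math


def solve_poisson_1d(rho, h):
    """Solve -phi'' = rho on a uniform grid with phi = 0 at both boundaries.

    `rho` holds the charge density at every grid point (including the two
    boundary points) and `h` is the grid spacing. Returns phi on the same grid.
    """
    n = len(rho) - 2  # number of interior unknowns
    if n < 1:
        raise ValueError("need at least three grid points")
    # Finite differences give: -phi[i-1] + 2*phi[i] - phi[i+1] = h^2 * rho[i].
    # Forward sweep of the Thomas algorithm (sub/super diagonal -1, diagonal 2).
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = -1.0 / 2.0
    d_prime[0] = h * h * rho[1] / 2.0
    for i in range(1, n):
        denom = 2.0 + c_prime[i - 1]
        c_prime[i] = -1.0 / denom
        d_prime[i] = (h * h * rho[i + 1] + d_prime[i - 1]) / denom
    # Back substitution; phi[0] and phi[n + 1] stay at the boundary value 0.
    phi = [0.0] * (n + 2)
    for i in range(n - 1, -1, -1):
        phi[i + 1] = d_prime[i] - c_prime[i] * phi[i + 2]
    return phi


# Test problem with a known solution: rho = pi^2 sin(pi x) gives phi = sin(pi x).
n_points = 101
h = 1.0 / (n_points - 1)
grid = [i * h for i in range(n_points)]
rho = [math.pi ** 2 * math.sin(math.pi * x) for x in grid]
phi = solve_poisson_1d(rho, h)
max_error = max(abs(p - math.sin(math.pi * x)) for p, x in zip(phi, grid))
```

Because the boundary values are set explicitly, a real-space solver of this kind is not tied to periodic boundaries, which is one reason such solvers can avoid some of the artefacts mentioned above.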

ScalaLife provides support for proper integration of the routines in the GROMACS source code, code analysis, profiling and performance benchmarking.

5. Code analysis and optimization of MUSIC

(collaboration with INCF, Stockholm)

MUSIC, the multi-simulation co-ordinator, is a standard API and communication framework that allows multiple neuronal network simulators and other tools to exchange data on-line, forming a multi-simulation. It solves problems such as spatial alignment (communicating the right data to the right processes in a cluster) and temporal alignment (when and how often to send data to which process), including when simulators are connected in complex patterns that include loops. MUSIC currently scales to a few thousand cores. Within the framework of the ScalaLife project, our goals are to analyze MUSIC performance and to identify weak spots, possible improvements and obstacles to better scaling, with the aim of scaling to at least 10,000 cores.

6. Code analysis and benchmarking of XMIPP

(collaboration with Instruct Image Processing Center and the Biocomputing Unit at the Spanish National Center for Biotechnology, CNB-CSIC, Madrid)

7. Code analysis and benchmarking of software for cellular biosystem simulations

(collaboration with ICM, Warsaw)

8. Code analysis and benchmarking of ERGO

(collaboration with Uppsala University, Sweden)

9. Multi-level/multi-scale simulations with GROMACS/DALTON using MAPPER frameworks

(collaboration with MAPPER project)

10. Implementation of the multi-scale GROMACS/DALTON workflow into GridBeans

(collaboration with MMM project)

User Community

Marc Offman -
Technische Universität München, Fakultät für Informatik, Germany
Research Description

Ekaterina Elts -
Thermodynamics and Energy Technology, University of Paderborn, Germany
Research Description

Sarah Rouse -
Structural Bioinformatics and Computational Biochemistry Unit, Department of Biochemistry, University of Oxford, UK

Tyler Reddy -
Structural Bioinformatics and Computational Biochemistry Unit, Department of Biochemistry, University of Oxford, UK
Research Description

Antreas Kalli -
Structural Bioinformatics and Computational Biochemistry Unit, Department of Biochemistry, University of Oxford, UK

Joseph Goose -
Structural Bioinformatics and Computational Biochemistry Unit, Department of Biochemistry, University of Oxford, UK

Guillem Portella -
Molecular Modelling & Bioinformatics Group, Institute for Research in Biomedicine (IRB Barcelona), Spain
Research Description

Aatto Laaksonen -
Division of Physical Chemistry, Arrhenius Laboratory, Stockholm University, Sweden

Alexander Lyubartsev -
Division of Physical Chemistry, Arrhenius Laboratory, Stockholm University, Sweden
Research Description

N. Arul Murugan -
Department of Theoretical Chemistry and Biology, School of Biotechnology, Royal Institute of Technology, Sweden
Research Description

Christoph Scheurer -
Lehrstuhl für Theoretische Chemie, Technische Universität München, Germany

Clemens Woywod -
Centre for Theoretical and Computational Chemistry, Tromsø, Norway
Research Description

Johannes Weber -
Department of Chemistry, University of Munich (LMU), Germany
Research Description

Luca Frediani -
Theoretical Chemistry, Department of Chemistry, University of Tromsø, Norway

Juan Fernández-Recio -
Life Sciences Department, Barcelona Supercomputing Center, Spain
Research Description