The core technologies common to all the subprojects proposed here are molecular dynamics (MD) and hybrid Monte Carlo methods. MD for biological systems poses three numerical challenges: the large size of the systems, the presence of multiple and long time scales, and the chaotic nature of the systems of differential equations that have to be solved. This proposal addresses each of these challenges. System size will be tackled by efficient parallel implementations of all algorithms described here, including fast force summation methods. Multiple and long time scales, and the chaotic nature of MD, will be handled by constructing multiscale symplectic integrators and solvers.
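As an illustration of what a symplectic integrator offers, the following minimal sketch applies the standard velocity Verlet scheme to a one-dimensional harmonic oscillator. It is not PROTOMOL code and not one of the multiscale integrators proposed here; the function names, the test system, and the step size are assumptions chosen for the example.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance position x and velocity v by n_steps of velocity Verlet."""
    f = force(x)
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass   # half kick with the old force
        x += dt * v                # full drift
        f = force(x)               # force at the new position
        v += 0.5 * dt * f / mass   # half kick with the new force
    return x, v


def harmonic_force(x, k=1.0):
    """Hypothetical test system: a spring with stiffness k."""
    return -k * x


def energy(x, v, mass=1.0, k=1.0):
    """Total energy of the harmonic oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


# Illustrative parameters (assumptions, not values from the proposal).
x0, v0 = 1.0, 0.0
e0 = energy(x0, v0)
x1, v1 = velocity_verlet(x0, v0, harmonic_force, mass=1.0, dt=0.05, n_steps=10000)
relative_energy_error = abs(energy(x1, v1) - e0) / e0
```

For this test system the energy error stays bounded over the 10,000 steps rather than growing steadily, which is the long-time behavior that makes symplectic schemes attractive for MD.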
The results of this research will be available to the scientific community in the form of collaboratory open-source software called PROTOMOL. This software will have intuitive scripting and web-based interfaces that allow users to easily prototype their own methods and to customize the high-performance components provided. In particular, PROTOMOL will hide parallelism from method developers, so that they can test sequential implementations first and parallelize successful algorithms later if desired. An initial release of PROTOMOL already provides many of these features. Components from this infrastructure will form the kernel of problem-solving environments for (bio)molecular modeling, protein folding, and molecular design.
This project will enable several natural collaborations that will enhance its scientific value. The framework for these collaborations is being set locally, nationally, and internationally, with universities, research laboratories, and industry, as documented in the proposal. In computer science, we plan to: (a) Collaborate with petaflop computing groups trying to map protein folding computations onto a million-node parallel computer made of Processor-In-Memory nodes. We have already obtained encouraging simulation and analytical results about the feasibility of this mapping; and (b) Compare different parallelization and data distribution approaches for cluster computing.
In the area of computational chemistry and biology, this project will foster collaboration in: (a) The use of parallel multiscale symplectic integrators for protein simulations; (b) Computer simulations to study the effectiveness and design of anti-cancer drugs; (c) Pharmacokinetics, in particular the study of the quality of binding between drugs and macromolecules using hybrid Monte Carlo free energy calculations; and (d) Flexible molecular docking of both receptor and ligand for structure-based drug design, incorporating dynamical and geometric methods. In the area of nanotechnology, we will use MD simulations to screen proposed molecular quantum-dot cellular automata.
Finally, this research activity is being linked to undergraduate and graduate education through the following initiatives: (a) Computational science education for undergraduate and graduate students, emphasizing collaboration in significant applications; (b) Research experiences for undergraduates (REU); (c) Workshops on computational issues in (bio)molecular modeling and related applications; (d) Courses related to (bio)molecular modeling, parallelism, and numerical methods, cross-listed with departments in the College of Science; (e) Use of the collaboratory infrastructure that is part of the problem-solving environments proposed here as an educational tool for hands-on experience, and development of supplementary online materials for undergraduate data structures and graduate scientific computing courses; and (f) Co-advising science students.
PROJECT DESCRIPTION: I. RESEARCH PLAN