The 2009 International Conference on High Performance Computing & Simulation (HPCS'09)
June 21-24, 2009, Leipzig, Germany

HPCS 2009 DEMO SESSIONS

 DEMO IV

 The LIBI Grid Problem Solving Environment for Bioinformatics

M. Mirto1, M. Passante1, D. Tartarini1, A. Negro2,
I. Epicoco1,2, G. Aloisio1,2

1 University of Salento
2 Euro-Mediterranean Centre for Climate Change (CMCC)
Italy
maria.mirto@unile.it

ABSTRACT
This demonstration presents the main features of the LIBI Grid Problem Solving Environment, a virtual laboratory for bioinformatics in which e-scientists can run biological tools and visualize the results via a Grid Portal. Using a web interface, applications such as PSI-Blast (iterative protein similarity search), MrBayes (Bayesian phylogenetic inference) and Gromacs (molecular dynamics) are submitted to remote grid resources and monitored, and their results are visualized.

OUTLINE
One of the key aspects in life science applications is the execution of a high number of jobs for large-scale analysis. The Grid is a powerful tool for solving this kind of problem, but in a distributed environment, bioinformatics applications need to be re-engineered in order to provide better performance. In addition, a tool for orchestrating the execution of different kinds of applications and the access to the needed data is fundamental, because different and heterogeneous grid infrastructures have to coexist.

The solution adopted within the LIBI infrastructure is an easily extensible Meta Scheduler able to support different middleware. The development builds on our GRB technology and includes a set of grid libraries, the Meta Scheduler and the LIBI grid portal. The LIBI grid portal provides an integrated approach to grid resource management through a user-friendly web GUI and a back-end GSI-enabled Meta Scheduler Web Service, developed using gSOAP and our GSI plugin for gSOAP.
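The extensibility described above can be illustrated with a minimal Python sketch of a plug-in registry. This is not the LIBI implementation (which is exposed as a gSOAP Web Service); the names Job, MiddlewarePlugin, EchoPlugin and MetaScheduler are hypothetical, and the stand-in plug-in only fabricates an identifier instead of contacting a grid.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Job:
    """Hypothetical job description; the real LIBI job schema is not given here."""
    name: str
    executable: str
    middleware: str  # e.g. "glite", "unicore" or "globus"


class MiddlewarePlugin(ABC):
    """Interface that each middleware adapter would implement (illustrative)."""

    @abstractmethod
    def submit(self, job: Job) -> str:
        """Submit the job and return a middleware-specific job identifier."""


class EchoPlugin(MiddlewarePlugin):
    """Stand-in adapter: makes no grid call, only numbers the submissions."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.counter = 0

    def submit(self, job: Job) -> str:
        self.counter += 1
        return f"{self.prefix}-{self.counter}"


class MetaScheduler:
    """Routes each job to the plug-in registered for its middleware."""

    def __init__(self) -> None:
        self._plugins: Dict[str, MiddlewarePlugin] = {}

    def register(self, middleware: str, plugin: MiddlewarePlugin) -> None:
        # Supporting a new middleware only requires registering one more plug-in.
        self._plugins[middleware] = plugin

    def submit(self, job: Job) -> str:
        if job.middleware not in self._plugins:
            raise ValueError(f"no plug-in registered for {job.middleware!r}")
        return self._plugins[job.middleware].submit(job)


def demo() -> List[str]:
    scheduler = MetaScheduler()
    for name in ("glite", "unicore", "globus"):
        scheduler.register(name, EchoPlugin(name))
    jobs = [
        Job("search", "psiblast", "glite"),
        Job("tree", "mrbayes", "unicore"),
        Job("search-2", "psiblast", "glite"),
    ]
    return [scheduler.submit(job) for job in jobs]
```

In this sketch the scheduler itself contains no middleware-specific code, so the set of supported infrastructures can grow without changing the submission path.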

RESULTS AND IMPACT
The LIBI virtual laboratory for bioinformatics is based on a high-performance, distributed infrastructure supporting access to large datasets and the execution of single or complex jobs. One of the main goals of the LIBI project has been the development of a Grid Problem Solving Environment, built on top of the EGEE, DEISA and SPACI infrastructures, to allow the submission and monitoring of jobs mapped to complex experiments in bioinformatics. A set of enhanced and novel services for resource and data management has been implemented on top of the gLite, Unicore and Globus core services.

For resource management, a Meta Scheduler service embeds several plug-ins for accessing different grid middlewares and acts as an engine not only for batch, MPI and parameter sweep jobs, but also for complex jobs described by a workflow. A workflow editor has also been implemented to compose different application runs on different middlewares. In parallel, we have developed several case studies and verified the effectiveness of the LIBI platform in supporting experiments in a number of real-world bioinformatics application scenarios.
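Two of the job types mentioned above, parameter sweep jobs and workflow jobs, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the LIBI engine: the function names, the command template (a made-up "analysis" tool with made-up options) and the dictionary-based workflow format are all hypothetical.

```python
from collections import deque
from itertools import product
from typing import Dict, List


def expand_sweep(template: str, params: Dict[str, List[str]]) -> List[str]:
    """Expand a parameter sweep into one command line per parameter combination."""
    names = sorted(params)  # fixed order so the expansion is reproducible
    return [
        template.format(**dict(zip(names, values)))
        for values in product(*(params[name] for name in names))
    ]


def execution_order(dependencies: Dict[str, List[str]]) -> List[str]:
    """Order workflow steps so each one follows its dependencies (Kahn's algorithm).

    Every step, including steps without dependencies, must appear as a key.
    """
    pending = {step: len(deps) for step, deps in dependencies.items()}
    dependents: Dict[str, List[str]] = {step: [] for step in dependencies}
    for step, deps in dependencies.items():
        for dep in deps:
            dependents[dep].append(step)

    ready = deque(sorted(step for step, count in pending.items() if count == 0))
    order: List[str] = []
    while ready:
        step = ready.popleft()
        order.append(step)
        for nxt in sorted(dependents[step]):
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(dependencies):
        raise ValueError("workflow contains a cycle")
    return order


# Hypothetical sweep: the "analysis" command and its options are placeholders.
commands = expand_sweep(
    "analysis --seed {seed} --ngen {ngen}",
    {"seed": ["1", "2"], "ngen": ["1000"]},
)

# Hypothetical workflow: each step lists the steps it depends on.
workflow = {
    "search": [],
    "align": ["search"],
    "tree": ["align"],
    "report": ["tree", "search"],
}
steps = execution_order(workflow)
```

A real engine would additionally assign each step to a middleware and track its state; the sketch only shows how a sweep becomes independent jobs and how dependencies constrain the run order.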

A number of BioGrid projects are underway, including myGrid (Stevens, 2004), GeneGrid (Kelly, 2005) and Proteus (Cannataro, 2004). These primarily focus on the sharing of computational resources, large-scale data movement and replication for simulations, remote instrumentation steering or high throughput sequence analysis.

In the last year of the project, the goals of the LIBI virtual laboratory will be to integrate the services developed in order to test the entire platform and provide a production infrastructure.

Several improvements are planned in different areas. Regarding resource management, we intend to test the workflow management system by composing different applications and running them on different grid middlewares. In addition, we plan to develop a client library for the Web Service applications. Finally, other applications such as the R package will be integrated into the PSE.

REFERENCES
  1. M. Mirto, I. Epicoco, S. Fiore, M. Cafaro, A. Negro, D. Tartarini, D. Lezzi, O. Marra, A. Turi, A. Ferramosca, V. Zara, G. Aloisio, et al., "The LIBI Grid Platform for Bioinformatics", in Mario Cannataro (Ed.), Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine and Healthcare, IGI Global (to appear).
  2. M. Mirto, S. Vicario, D. Tartarini, I. Epicoco, C. Saccone, G. Aloisio, "Bayesian Phylogenetic Inference in the LIBI Grid platform: a tool to explore large data sets", IEEE Proceedings of the International Symposium on Parallel and Distributed Processing and Applications (ISPA 2008) - December 10-12, 2008, pp. 855-860, Sydney, Australia.
  3. M. Mirto, I. Rossi, I. Epicoco, S. Fiore, P. Fariselli, R. Casadio, G. Aloisio, "High Throughput Protein Similarity Searches in the LIBI Grid Problem Solving Environment", Proceedings of the 5th International Symposium on Parallel and Distributed Processing and Applications (ISPA07), pp. 414-423, Niagara Falls, Canada.
  4. LIBI Web Site: www.libi.it. The applications involved in the case studies are available through the LIBI platform at http://www.libi.it/biotools.

PRESENTER
Dr. Maria Mirto,
University of Salento, Italy

BIOGRAPHY
Maria Mirto received her Laurea degree in Computer Engineering from the University of Lecce, Italy, in 2002 and her Ph.D. degree in Computer Engineering from the ISUFI-University of Lecce in 2006. Since 2006, she has held a postdoctoral position in the LIBI project at the University of Salento. Since 2003 she has been the Principal Investigator of the ProGenGrid project. Her research interests include High Performance, Distributed and Grid Computing, Bioinformatics and Workflow. She received the best paper award at the ITCC 2003 conference. She is co-author of more than 45 papers in refereed journals on parallel computing, grid computing and bioinformatics.