Research Interests
The main focus of my research is the optimization of distributed-memory programs (MPI). We combine static analysis and program transformations with dynamic optimizations to improve the performance (or power efficiency) of parallel programs. Machine learning techniques are used to make these optimizations feasible and effective for the target architectures.
- Parallel Programming Models
- Optimizations & Program transformations
- Compilers
- Multi-/many-core Architectures
- Message-passing
- Machine Learning Techniques
Education
- Master's Degree in Software Engineering, University of Bologna, 2006. Topic: "Generic types support in the Java Virtual Machine: Design and prototype implementation"
News
EuroPVM/MPI 2009 Report (7-10 Sept. 2009, Espoo)
Several talks pointed out the leadership of the MPI programming model in High Performance Computing (HPC), which, as of now, is able to scale efficiently up to a million processors. Furthermore, MPI is looking ahead to petascale and future exascale systems. As the number of cores per node increases, the MPI community is also addressing the problem of interoperability with other models (shared memory), thus making hybrid programming (e.g. MPI+OpenMP, MPI+PGAS) simpler. However, in my opinion, MPI must catch up with the current state of parallel programming. The C/Fortran interface is too old, somewhat difficult and error-prone. Today, other models (like the actor model) are gaining more and more interest (take a look at Scala and Erlang), as they promise distributed-memory programming in an easier form. However, these languages provide only a subset of the functionality already present in MPI, and their implementations are not able to take full advantage of the underlying hardware (e.g. InfiniBand) as MPI does. Unfortunately, there is no effort to simplify the usage of MPI (from the user's point of view), and the impression is that MPI is destined to remain a "niche" interface for the HPC world, where a thousand or more nodes are involved. In my opinion this is a pity, as message-passing implicitly encourages data locality in its model. In a future where the number of cores per chip is rapidly increasing and cache coherency systems are breaking down, some flavour of distributed-memory programming on a single chip will probably have to be faced again!
Test
It works! ...and it's cool!

