Title Student(s) Supervisor Description
Windows support for a Parallel Runtime system 1 Thomas Fahringer details here
Code Region Instrumentation for Energy Consumption Measurements 1 Radu Prodan, Vlad Nae details here
Using Amazon EC2 Cluster Compute instances for scientific computing 1 Radu Prodan details here
Using Amazon EC2 Cluster GPU instances for scientific computing 1 Radu Prodan details here
A Visualization Tool for Code-Performance Association 1 Thomas Fahringer details here
Insieme Compiler C++ Integration 1 Thomas Fahringer details here
Automated test data generation 1 Thomas Fahringer details here
Task parallelism using Insieme 1 Thomas Fahringer details here
Support for recursive data types in the Insieme Runtime 1 Thomas Fahringer details here
Insieme INSPIRE property deduction 1 Thomas Fahringer details here
Automated Characterizing of OpenCL Devices Using Micro Benchmarks 1 Thomas Fahringer details here
OpenCL Host Code Frontend for the Insieme Compiler Environment 1 Thomas Fahringer details here
Out-of-core Interactive Visualization of Massive Models 1 Biagio Cosenza details here
A Performance Predictor for Heterogeneous Computation 1-2 John Thomson details here
Visualization of LIC (Line Integral Convolution) by using OpenCL 1-2 Biagio Cosenza (DPS), Werner Benger (Astro UIBK, and LSU) details here
Using Amazon EC2 spot instances for scientific computing 1 Radu Prodan details here
Fortran Frontend for Insieme 1 Hans Moritsch details here
Parallel sorting algorithms in OpenCL 1 Radu Prodan details here

  

 

Title Windows support for a Parallel Runtime system
Number of students 1
Language English
Supervisors Thomas Fahringer
Description

The goal of this master thesis is to make the Insieme runtime system compatible with Windows.
Insieme (http://insieme-compiler.org) is a combined compiler/runtime framework for parallel program optimization research. The runtime component implements a high-performance user-level tasking system on top of POSIX interfaces, in particular pthreads.

Tasks
  • Familiarize yourself with the Insieme runtime (with support from its developers)
  • Port its various POSIX calls to equivalent Windows APIs
  • Particular attention should be paid to threading:
                        ~ try both the win32 pthread port and native Windows threading and compare their performance
                        ~ performance analysis should be performed on at least 2 different versions of Windows (e.g. 7 and XP) and compared to Linux on the same system

 

Theoretical skills Parallel Computing, Operating Systems
Practical skills
  • Good C programming skills
  • Familiarity with POSIX threads and/or the Windows thread API is an advantage
Additional information  

 

 

Title Code Region Instrumentation for Energy Consumption Measurements
Number of students 1
Language English
Supervisors Radu Prodan, Vlad Nae
Description The goal of this thesis is the design, implementation and evaluation of a library enabling energy consumption measurements for program code regions.
Tasks
  • study state-of-the-art research in the fields of program code instrumentation and energy efficiency;
  • familiarize yourself with the power measuring device and its data collection interfaces;
  • implement a utility library for connecting to the power measuring device;
  • design and implement the program code instrumentation library for power measurements;
  • implement a test suite for accurate evaluation of power consumption for small program code regions.
Theoretical skills Basic knowledge of energy consumption in computing systems
Practical skills C programming
Additional information The provided power measuring device is an accurate device from a professional range and offers multiple methods of collecting data which makes it easy to work with.
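
As an illustration of what instrumenting a code region could look like, here is a minimal sketch in Python. It is not the library the thesis asks for: `read_power_watts` is a hypothetical stand-in for the power measuring device's data interface, and the energy estimate uses only two power samples (trapezoidal rule), which is an assumption made to keep the example short.

```python
import time
from contextlib import contextmanager


def read_power_watts():
    """Hypothetical stand-in for the power meter interface (constant draw)."""
    return 50.0


@contextmanager
def energy_region(name, results, power_source=read_power_watts,
                  clock=time.perf_counter):
    """Estimate the energy (in joules) consumed while the region runs."""
    start = clock()
    power_at_start = power_source()
    try:
        yield
    finally:
        elapsed = clock() - start
        power_at_end = power_source()
        # Trapezoidal estimate from two samples: mean power times duration.
        results[name] = 0.5 * (power_at_start + power_at_end) * elapsed


results = {}
with energy_region("sum_of_squares", results):
    sum(i * i for i in range(100_000))
print(results)
```

A real implementation would sample the device continuously, since two samples cannot capture power fluctuations inside a small code region.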

 

Title
Using Amazon EC2 Cluster Compute instances for scientific computing
Number of students  1
Language English
Supervisor Radu Prodan
Description The goal of this thesis is to investigate the use of Amazon EC2 Cluster Compute instances for scientific computing. A budget of about 1000 Euro for running EC2 experiments is available.
Tasks
  • study the Amazon EC2 API and create a tutorial;
  • analyse the cluster compute instances using well-known benchmarks such as latency measurement and HPL;
  • based on the benchmark analysis, simulate the execution of large traces from the Parallel Workflow Archive and Grid Workflow Archive;
  • quantify time and cost-wise the suitability of these instances for scientific computing.
Theoretical skills Parallel computing
Practical skills C, Virtualisation, Java
Additional information Amazon EC2

 

Title
Using Amazon EC2 Cluster GPU instances for scientific computing
Number of students  1
Language English
Supervisor Radu Prodan
Description The goal of this thesis is to investigate the use of Amazon EC2 Cluster GPU instances for scientific computing. A budget of 1000 Euro for running EC2 experiments is available.
Tasks
  • study the Amazon EC2 API and create a tutorial;
  • analyse the cluster GPU instances using well-known benchmarks such as latency measurement and HPL;
  • based on the benchmark analysis, simulate the execution of large traces from the Parallel Workflow Archive and Grid Workflow Archive;
  • quantify time and cost-wise the suitability of these instances for scientific computing.
Theoretical skills Parallel GPU computing
Practical skills C, CUDA/OpenCL, Virtualisation, Java
Additional information Amazon EC2

 

Title
A Visualization Tool for Code-Performance Association
Number of students 1
Language English
Supervisor Thomas Fahringer
Description The aim of this master thesis is to develop a graphical user interface tool that visualizes performance data alongside the source code of parallel programs. When doing performance analysis, one usually measures performance data (run time, cache misses, retired instructions, …) for several code regions of the program. The tool developed in the course of this thesis should visualize this performance data and associate it with the code (e.g. show performance bars with numbers and units next to the code).
Tasks
  • Familiarize yourself with existing editors and graphics libraries that can be extended
  • Implement an editor or extend an existing one to support visualizing performance data next to program code.
Practical skills
  • Programming in C/C++ or Java
  • No skills or background in parallel programming necessary
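
To make the idea of "performance bars next to the code" concrete, the following text-only sketch in Python pairs each source line with a bar scaled to the slowest line. The function name `annotate`, the per-line runtime dictionary and the millisecond unit are assumptions for illustration; the thesis itself targets a graphical tool.

```python
def annotate(source_lines, runtimes_ms, bar_width=20):
    """Return one text row per source line: bar, value with unit, code."""
    peak = max(runtimes_ms.values(), default=0.0)
    rows = []
    for number, line in enumerate(source_lines, start=1):
        value = runtimes_ms.get(number)
        if value is None:
            bar, label = "", ""          # no measurement for this line
        else:
            filled = round(bar_width * value / peak) if peak > 0 else 0
            bar, label = "#" * filled, f"{value:.1f} ms"
        rows.append(f"{bar:<{bar_width}} {label:>10} | {line}")
    return rows


code = ["for (i = 0; i < n; i++)", "    a[i] = b[i] * c[i];"]
for row in annotate(code, {1: 1.5, 2: 12.0}):
    print(row)
```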

 

Title
Insieme Compiler C++ Integration
Number of students 1
Language English or German
Supervisor Thomas Fahringer
Description The Insieme compiler system is a high level source to source compiler for parallel programs. It currently supports C input and output with parallel constructs in OpenMP, MPI or OpenCL. All input is transformed to the INSPIRE intermediate representation (IR), a semi-functional language with a strong type system and a number of unified built-in parallel constructs. Transformations are performed on this representation and in the end a C program with calls to the Insieme runtime system is generated.

The goal of this thesis is to add support for C++ programs to Insieme. This includes investigating which features of C++ can be supported without any changes, and, if this does not include all relevant features, proposing extensions to the IR to fully support all required features. It also includes implementation work on the frontend, and potentially the backend and runtime system to add this support. Finally, the performance and analysis impact of using various C++ features compared to standard C should be investigated.
Tasks all in cooperation with the Insieme team:
  • Familiarize yourself with Insieme and INSPIRE
  • Gain in-depth knowledge of C++ language features and requirements
  • Draft a transformation plan from C++ to INSPIRE (discuss any changes with the Insieme team)
  • Implement the frontend extensions (and, if required, backend features)
  • Investigate the performance and analysis impact of using C++
Theoretical skills
  • Compiler Construction
  • Distributed and Parallel Systems
Practical skills
  • Strong C++ skills
Additional information  

 

Title
Automated test data generation
Number of students 1
Language English
Supervisor Thomas Fahringer
Description One goal of the Insieme project is the development of a compiler capable of optimizing parallel programs. Iterative compilation will be one of the major concepts. This approach is based on the repeated compilation of a given code fragment and collection of information regarding the dynamic execution. The results are used to steer the compilation process, especially the selection of code transformations to be applied to the programs.

To enable the compiler to handle larger applications, sub-regions of the code have to be handled individually. Unfortunately, this makes it harder to apply iterative compilation, since for this approach executable programs equipped with representative input data sets are required.

In the Insieme compiler, every program is represented using INSPIRE - the Insieme parallel intermediate representation - which is a compact, semi-functional, strongly typed formal language for parallel applications. Within this representation, candidates for sub-regions can be easily identified and converted into executables.

The goal of this thesis is to automatically provide valid input data for those executables covering sub-regions. For instance, if the extracted region is a function processing two vectors of a given type, matching input data should be generated and passed to the executable. The generation process should also be extended to support the generation of sequences of input values for more extensive investigations of the sub-regions in the context of machine learning driven optimizations. Particularly, the generated input values may be required to satisfy certain statistical properties.
Tasks all in cooperation with the Insieme team:
  • Familiarize yourself with the Insieme project, in particular INSPIRE
  • Investigate the potential data types occurring as input for sub-regions
  • Formally limit the potential input types based on the INSPIRE types
  • Investigate literature regarding the automated generation of test data
  • Implement a test data generator
  • Integrate the test data generation into the Insieme project by offering a module for generating an executable based on a sub-region and the generated data
Theoretical skills  
Practical skills
  • C/C++ skills (the compiler is implemented using C++0x, the target code is C99)
Additional information  
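
The two-vector example from the description can be sketched as follows in Python. This is only an illustration under stated assumptions: the type names `"int"` and `"real"`, the function names, and the choice of a normal distribution as the "statistical property" are all hypothetical and are not taken from INSPIRE.

```python
import random


def generate_vector(length, elem_type, mean=0.0, stddev=1.0, rng=None):
    """Generate one vector whose values follow a normal distribution."""
    if rng is None:
        rng = random.Random()
    values = [rng.gauss(mean, stddev) for _ in range(length)]
    if elem_type == "int":
        return [round(v) for v in values]
    if elem_type == "real":
        return values
    raise ValueError(f"unsupported element type: {elem_type}")


def generate_inputs(signature, length, seed=0):
    """Generate one vector per parameter type listed in the signature."""
    rng = random.Random(seed)  # fixed seed keeps the data reproducible
    return [generate_vector(length, t, rng=rng) for t in signature]


# Input data for a region that processes two real-valued vectors.
a, b = generate_inputs(["real", "real"], length=4, seed=42)
print(a, b)
```

Seeding the generator makes repeated runs of the same sub-region comparable, which matters when the measurements feed an iterative compilation loop.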

 

Title
Task parallelism using Insieme
Number of students 1
Language English
Supervisor Thomas Fahringer
Description The Insieme project aims to develop an optimizing programming tool chain for parallel programs including a compiler capable of static analysis and transformations and a runtime managing and tuning the execution of programs within shared and distributed memory systems. The system should be able to cover basic parallel paradigms like data or functional parallelism as well as task based work distribution.

The goal of this thesis is to verify and improve the system's ability to handle task based parallelism. Therefore, a certain number of task parallel test cases close to real-world examples should be implemented using a format which can be handled by the Insieme compiler (e.g. OpenMP). The test cases should cover a wide range of potential task-based parallel algorithms, and at least some of them should be parameterized for benchmarking.

The processing of these input codes should be investigated throughout the various compiler stages and, if possible, improved. Finally, the handling and mapping of the individual tasks within the runtime should be examined and refined.
Tasks all in cooperation with the Insieme team:
  • Familiarize yourself with the Insieme project, in particular INSPIRE and the runtime process model
  • Find and implement 3-5 real world examples for task parallelism fitting the test case requirements
  • Investigate the processing of the examples throughout the various compiler stages and the runtime
  • Derive proposals for improving Insieme's capability of handling tasks => Specification
  • Implement improvements to verify expected effects
Theoretical skills
  • Parallel Systems
Practical skills
  • C/C++ skills (the compiler is implemented using C++0x, the target code and runtime system is C99)
Additional information  

 

Title
Support for recursive data types in the Insieme Runtime
Number of students 1
Language English
Supervisor Thomas Fahringer
Description The core of the Insieme Compiler project is formed by a compact, semi-functional, strongly typed formal language for parallel programs called INSPIRE. Within the backend of the compiler, this representation is converted into equivalent C code using functionality offered by the Insieme Runtime. This runtime is capable of processing units of work within shared and distributed memory systems. However, to allow the runtime to control the distribution of data among its managed environment, the compiler has to provide the necessary meta information.

The INSPIRE language allows the specification of recursive data types, that is, types including their own definition as an element. For instance, a linked list defined by a struct including a value element and a next pointer to the same struct is a recursive type. The goal of this thesis is to investigate possibilities of mapping such recursive types to the virtual environment offered by the runtime such that recursive data like lists or trees can be effectively managed.
Tasks all in cooperation with the Insieme team:
  • Familiarize yourself with the Insieme project, in particular INSPIRE and the runtime process model
  • Investigate the concept of recursive data types and propose potential beneficial modifications
  • Investigate and propose a specification on how to map recursive types to the runtime
  • Implement your proposal
Theoretical skills
  • Parallel Systems
  • Functional Programming
Practical skills
  • C/C++ skills (the compiler is implemented using C++0x, the target code and runtime system is C99)
Additional information  
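
The linked-list example from the description, together with one conceivable way of mapping it to a flat layout that a runtime could move between memories, is sketched below in Python. The flattening scheme (a value array plus an index array, with -1 marking the end) is an assumption chosen for illustration, not the mapping used or planned by Insieme, and it assumes the list is acyclic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Recursive type: a node refers to another value of its own type."""
    value: int
    next: Optional["Node"] = None


def flatten(head):
    """Map an acyclic list to (values, links); links[i] indexes the successor."""
    values, links = [], []
    node = head
    while node is not None:
        values.append(node.value)
        # The successor, if any, is stored at the next array position.
        links.append(len(values) if node.next is not None else -1)
        node = node.next
    return values, links


def rebuild(values, links):
    """Inverse of flatten: recreate the pointer structure from the arrays."""
    nodes = [Node(v) for v in values]
    for node, link in zip(nodes, links):
        if link != -1:
            node.next = nodes[link]
    return nodes[0] if nodes else None


values, links = flatten(Node(1, Node(2, Node(3))))
print(values, links)
```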

 

Title
Insieme INSPIRE property deduction
Number of students 1
Language English
Supervisor Thomas Fahringer
Description The core of the Insieme Compiler project is formed by a compact, semi-functional, strongly typed formal language for parallel programs called INSPIRE. This language is the basis for all analysis and transformations within the compiler.

For this language, a generic property deduction framework is required. For instance, the framework should allow the identification of dead code regions or replicated code as well as live variable, alias and exit analysis. The goal of this thesis is the design, specification and implementation of such a framework and the demonstration of its abilities by realizing one or two well known analyses.
Tasks all in cooperation with the Insieme team:
  • Familiarize yourself with the Insieme project, in particular INSPIRE
  • Familiarize yourself with Control Flow Graphs and related analyses
  • Derive a proposal for a property deduction system based on INSPIRE
  • Implement your system
  • Demonstrate its abilities by realizing 1-2 concrete analyses
Theoretical skills
  • functional programming
  • compiler construction
  • interest in logic and program analyses
Practical skills
  • C++ skills (the compiler is implemented using C++0x)
Additional information  
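
As a reminder of what one of the named analyses computes, here is a small Python sketch of live variable analysis on a control flow graph, using the standard backward data-flow equations in[b] = use[b] ∪ (out[b] − def[b]) and out[b] = ∪ in[s] over all successors s. The block names and the dictionary-based graph encoding are assumptions for illustration and have nothing to do with the INSPIRE data structures.

```python
def live_variables(blocks, successors):
    """blocks maps name -> (use set, def set); returns (live_in, live_out)."""
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:  # iterate until a fixed point is reached
        changed = False
        for b, (use, define) in blocks.items():
            out = set()
            for s in successors.get(b, ()):
                out |= live_in[s]
            new_in = use | (out - define)
            if out != live_out[b] or new_in != live_in[b]:
                live_out[b], live_in[b] = out, new_in
                changed = True
    return live_in, live_out


# B1: x = 1; y = 2     B2: z = x + 1     B3: return z
blocks = {
    "B1": (set(), {"x", "y"}),
    "B2": ({"x"}, {"z"}),
    "B3": ({"z"}, set()),
}
successors = {"B1": ["B2"], "B2": ["B3"]}
live_in, live_out = live_variables(blocks, successors)
print(live_out["B1"])
```

In this example `y` is defined in B1 but never live afterwards, which is the kind of fact a dead code detection could build on.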

  

Title
Automated Characterizing of OpenCL Devices Using Micro Benchmarks
Number of students 1
Language English
Supervisor Thomas Fahringer
Description uCLbench is a benchmark suite to determine the most important characteristics of an OpenCL device such as a GPU or CPU. The goal of this thesis is to further develop this benchmark suite in terms of scope of operation and ease of use. New benchmarks characterizing important properties should be implemented and added to this benchmark suite. Furthermore, the output of all benchmarks should be formatted in a common, machine-readable format.
Tasks
  • Familiarize yourself with uCLbench
  • Extend the suite with new benchmarks
  • Bring the output into a machine-readable form
Theoretical skills Basic knowledge of OpenCL device characteristics (mainly GPUs)
Practical skills Programming C and OpenCL
Additional information OpenCL: http://www.khronos.org/opencl/
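
What a "common and machine readable format" might look like is sketched below in Python, using JSON as one possible choice. The record fields (`benchmark`, `unit`, `repetitions`, `min`, `mean`) and the function `run_benchmark` are hypothetical and do not reflect the actual uCLbench output; the timed function here is plain host code rather than an OpenCL kernel.

```python
import json
import time


def run_benchmark(name, fn, repetitions=5):
    """Time fn several times and return a JSON-serializable record."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return {
        "benchmark": name,
        "unit": "s",
        "repetitions": repetitions,
        "min": min(timings),
        "mean": sum(timings) / len(timings),
    }


record = run_benchmark("sum_1e5", lambda: sum(range(100_000)))
print(json.dumps(record, indent=2))
```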

  

Title
OpenCL Host Code Frontend for the Insieme Compiler Environment
Number of students 1
Language English
Supervisor Thomas Fahringer
Description The Insieme compiler frontend aims to translate several C dialects, such as C with OpenMP/MPI or OpenCL, to a common intermediate representation (called INSPIRE). The goal of this thesis is to implement a frontend for OpenCL host code. The OpenCL host language consists of many pre-implemented functions for allocating/copying memory, compiling OpenCL kernels at runtime, etc. The first task is to port the semantics of these pre-implemented functions to INSPIRE. The second task is to implement mechanisms to acquire the OpenCL kernel code at compile time and add it to the program.
Tasks
  • Familiarize yourself with the Insieme project, in particular INSPIRE and the runtime process model
  • Read up on the OpenCL standard
  • Implement a subset of the pre-implemented OpenCL host functionalities in INSPIRE
Theoretical skills Basic knowledge of compiler construction
Practical skills Programming C++
Additional information OpenCL: http://www.khronos.org/opencl/
Insieme: http://www.dps.uibk.ac.at/insieme/

 

Title
Out-of-core Interactive Visualization of Massive Models
Number of students 1
Language English
Supervisor Biagio Cosenza
Description

Traditional render systems provide interactive performance but require all the data to be loaded in the RAM. Out-of-core (OOC) rendering techniques are necessary for viewing large disk-resident data sets that do not fit in memory. In order to reach interactive performance, OOC techniques should carefully handle visibility-culling, Level-of-Detail (LOD), and memory management.

The aim of the thesis is to interactively render a massive model having millions of triangles.

The target model is a Boeing 777 (courtesy of David Kasik, Boeing Corporation) and is supplied in a compressed version filling 10 CDs (or 2 DVDs). Students must agree to Boeing's NDA in order to use this model.

Tasks
  • Set up a rendering system for in-core models (small models that fit in main memory)
  • Study a LOD approach based on Spherical Harmonics functions
  • Implement a pre-process algorithm in order to apply the LOD technique to the model
  • Implement a LOD-based algorithm to visualize the model interactively
  • Use the proposed framework with the Boeing 777 model
Theoretical skills Basic knowledge of parallel computing
Practical skills
  • C/C++
  • Computer graphics and/or visualization background is a plus
Additional information

 

Title
A Performance Predictor for Heterogeneous Computation
Number of students 1
Language English
Supervisor John Thomson
Description

The current trend in microprocessor design is towards heterogeneous computation, that is, different types of processing units that perform well at different tasks. However, there is often a penalty for moving code between processors, and it is often unclear how code will perform on computation platforms as diverse as an Intel i7, an Nvidia 480 and an IBM Cell SPE. For this project, you should design and build a performance predictor for heterogeneous processors, which analyses benchmark code and predicts the execution time of code without running it.

The predictor will use machine learning to predict the result, using past executions of similar programs to construct a model of the system. Accurate performance prediction is of great use in the systems world, for scheduling and code optimisation.

Tasks  
Theoretical skills  Compiler theory, computer architecture, machine learning, basic statistics
Practical skills  Programming
Additional information  
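
The idea of predicting execution time from past executions of similar programs can be illustrated with a very simple learner. The sketch below, in Python, uses k-nearest-neighbour regression, which is only one of many possible machine learning methods and is chosen here for brevity; the feature vectors and runtimes are made-up illustration values, not measurements.

```python
import math


def predict_runtime(features, history, k=3):
    """Average the runtimes of the k past programs closest in feature space."""
    if not history:
        raise ValueError("at least one past execution is required")
    nearest = sorted(history, key=lambda rec: math.dist(features, rec[0]))[:k]
    return sum(runtime for _, runtime in nearest) / len(nearest)


# Hypothetical history: (static code features, measured runtime in seconds).
history = [
    ((0.8, 0.1), 10.0),
    ((0.7, 0.2), 12.0),
    ((0.1, 0.9), 95.0),
]
print(predict_runtime((0.75, 0.15), history, k=2))
```

A predictor for heterogeneous systems would keep one such model per target device, since the same code features map to different runtimes on a CPU, a GPU and a Cell SPE.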

    

Title
Visualization of LIC (Line Integral Convolution) by using OpenCL
Number of students 1-2
Language English
Supervisor Biagio Cosenza (DPS), Werner Benger (Astro UIBK, and LSU)
Description Line Integral Convolution (LIC) is a widely used visualization technique for vector fields on surfaces. There are many interesting extensions of LIC, including 3D volume rendering, 2D tensor fields, and volumes with non-uniform data sets.
OpenCL is an open standard for cross-platform, parallel programming of modern processors, allowing portability between platforms (e.g. GPU and multi-core).
The aim of this project is to write an efficient parallel OpenCL implementation of the LIC algorithm and further extensions.
The implementation should be made available in the context of the VISH visualization shell and it will be used for medical visualization, computational fluid dynamics, and astrophysics.
This work is a collaboration with Institut für Astro- und Teilchenphysik and the Louisiana State University.
Tasks
  • Familiarize yourself with OpenCL
  • Study the Line Integral Convolution algorithm and familiarize yourself with the VISH framework
  • Write a parallel OpenCL implementation of the LIC algorithm
  • Study the OpenCL optimization guide and optimize the code to several architectures
  • Possible extension: 3D LIC volume, tensor LICHyper, Clifford transformation, other 3d volume rendering extensions
Theoretical skills  
Practical skills
  • C/C++
  • Basic knowledge of parallel computation
  • Computer graphics and/or visualization background is a plus
Additional information
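
For orientation, a basic sequential form of LIC on a 2D grid can be sketched as follows in Python: each output pixel averages an input noise texture along the streamline passing through it. This is a simplified variant under stated assumptions (unit-length Euler steps, box filter, nearest-neighbour sampling), not the implementation expected from the thesis and not the VISH API.

```python
import math


def lic(vx, vy, noise, length=10):
    """Average the noise texture along the streamline through each pixel."""
    h, w = len(noise), len(noise[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = noise[y][x], 1
            for direction in (1.0, -1.0):      # trace forward and backward
                px, py = x + 0.5, y + 0.5      # start at the pixel centre
                for _ in range(length):
                    u, v = vx[int(py)][int(px)], vy[int(py)][int(px)]
                    norm = math.hypot(u, v)
                    if norm == 0.0:
                        break                  # critical point: stop tracing
                    px += direction * u / norm
                    py += direction * v / norm
                    if not (0 <= px < w and 0 <= py < h):
                        break                  # streamline left the domain
                    total += noise[int(py)][int(px)]
                    count += 1
            out[y][x] = total / count
    return out
```

Every output pixel is computed independently of the others, so the two outer loops are the natural candidates for mapping to one OpenCL work-item per pixel.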

          

Title
Using Amazon EC2 spot instances for scientific computing
Number of students 1
Language English or German
Supervisor Radu Prodan
Description

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers. Spot Instances enable you to bid for unused Amazon EC2 capacity. Instances are charged the Spot Price, which is set by Amazon EC2 and fluctuates periodically depending on the supply of and demand for Spot Instance capacity. To use Spot Instances, you place a Spot Instance request, specifying the instance type, the Region desired, the number of Spot Instances you want to run, and the maximum price you are willing to pay per instance hour. To determine how that maximum price compares to past Spot Prices, the Spot Price history is available via the Amazon EC2 API and the AWS Management Console. If your maximum price bid exceeds the current Spot Price, your request is fulfilled and your instances will run until either you choose to terminate them or the Spot Price increases above your maximum price (whichever is sooner).

The goal of this thesis is to investigate the use of EC2 spot instances for scientific applications with respect to price, fault tolerance, and Quality of Service.

A budget of 1000 Euro for running EC2 experiments is available.

Tasks  
Theoretical skills  Distributed and parallel systems
Practical skills  Java
Additional information  
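
The bidding rule described above can be turned into a small simulation to reason about price and fault tolerance. The Python sketch below makes several simplifying assumptions that are not part of the description: prices are given as one value per hour, a job needs a fixed number of instance hours, completed hours are never lost on interruption, and the price history is made-up illustration data rather than real Spot Price history.

```python
def simulate_spot_run(price_history, max_price, hours_needed):
    """Replay an hourly price trace against one spot request."""
    cost = 0.0
    hours_done = 0
    interruptions = 0
    running = False
    for price in price_history:
        if hours_done == hours_needed:
            break
        if running and price > max_price:
            running = False            # price rose above the maximum price
            interruptions += 1
        elif not running and max_price > price:
            running = True             # request is fulfilled
        if running:
            cost += price              # charged the Spot Price, not the bid
            hours_done += 1
    return {
        "completed": hours_done == hours_needed,
        "hours_done": hours_done,
        "cost": cost,
        "interruptions": interruptions,
    }


print(simulate_spot_run([0.10, 0.12, 0.30, 0.11, 0.10], 0.20, 3))
```

Running the same trace with different maximum prices shows the trade-off between cost and the number of interruptions a fault tolerance mechanism has to absorb.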

  

Title
Fortran Frontend for the INSIEME Compiler System
Number of students 1
Language English or German
Supervisor Hans Moritsch
Description

The DPS research group is developing a compiler system for parallel programs. The input programs are written in C++ and contain various kinds of parallel constructs (MPI, OpenMP, OpenCL). As a first step, they are translated into a language-independent intermediate representation (IR). The goal of this thesis is to generate this intermediate representation from Fortran programs as well. A Fortran frontend is to be built, consisting essentially of a Fortran parser and an IR generator.
The Fortran parser analyzes the input program and builds a Fortran-based abstract syntax tree from it. The IR generator traverses this tree and produces the required intermediate representation. Existing Fortran parsers and compiler construction tools are to be evaluated for their suitability for this purpose and used accordingly.

Tasks  
Theoretical skills  
Practical skills  
Additional information  

  

Title
Parallel sorting algorithms in OpenCL
Number of students 1
Supervisor Radu Prodan
Language German, English
Description Open Computing Language (OpenCL) is a framework for writing task and data parallel programs that execute across heterogeneous multicore platforms consisting of multicore devices such as CPUs, Graphical Processing Units (GPU) accelerators, and Cell Broadband co-processors. OpenCL includes a language (based on C99) for writing so called kernels which are SIMD functions that execute on OpenCL devices, plus APIs that are used to define and then control the platforms.

The objective of this thesis is to investigate the potential of using hybrid multicore architectures consisting of CPUs, GPUs, and Cell processors for parallelising list sorting algorithms such as bubble sort, quick sort, rank sort, bucket sort, selection sort, merge sort, etc.
Tasks
  • Implementation of several (3-4) parallel sorting algorithms in OpenCL
  • Scheduling and optimisation of the parallel sorting algorithms on hybrid multiprocessor platforms
  • Speedup and efficiency analysis
  • Overhead analysis
Theoretical skills Compiler construction, Parallel systems, Computer architecture
Practical skills C, C++
Additional information OpenCL
INSIEME
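
Of the algorithms listed, rank sort shows most directly why sorting can be data parallel: the final position of each element is computed independently of all others. The Python sketch below is a sequential illustration of that structure together with the usual speedup and efficiency formulas; the function names are chosen for this example, and an OpenCL version would assign one work-item to each iteration of the outer loop.

```python
def rank_sort(values):
    """Place each element at the index given by its rank."""
    n = len(values)
    out = [None] * n
    for i in range(n):  # iterations are independent of each other
        rank = sum(
            1
            for j in range(n)
            # Ties are broken by original position to keep ranks unique.
            if values[j] < values[i] or (values[j] == values[i] and j < i)
        )
        out[rank] = values[i]
    return out


def speedup(t_serial, t_parallel):
    """Ratio of serial to parallel execution time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Speedup divided by the number of processors used."""
    return speedup(t_serial, t_parallel) / processors


print(rank_sort([3, 1, 2, 3, 0]))
```

Rank sort performs a quadratic number of comparisons, so its appeal lies in the regular, independent work per element rather than in its total operation count.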