Overview
DIANE is a lightweight job execution control framework for parallel scientific applications. DIANE improves the reliability and efficiency of job execution by providing automatic load balancing, fine-grained scheduling and failure recovery.
DIANE provides an environment in which existing applications may be more easily ported to heterogeneous computing environments such as the Grid, batch farms or interactive clusters.
The default scheduling plugin algorithms are suited for bag-of-tasks applications and data-parallel problems with no inter-task communication. However, the framework is designed to make it easy to plug in other scheduling algorithms for more complex task synchronization patterns and workflows. For example, the DAG4DIANE plugin provides support for directed acyclic graph (DAG) applications, and the MOTEUR plugin provides support for workflow applications.
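To make the difference between a bag of tasks and a DAG application concrete, the following minimal sketch groups tasks into "waves" that could run in parallel once their dependencies have finished. It uses only the Python standard library (graphlib, Python 3.9 or later). The function name dag_waves and the task names are hypothetical and chosen for illustration; this is not the DAG4DIANE interface.

```python
from graphlib import TopologicalSorter


def dag_waves(dependencies):
    """Group DAG tasks into waves of mutually independent tasks.

    `dependencies` maps each task name to the set of tasks it depends on.
    Hypothetical helper for illustration only; not part of DIANE.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        # Every task returned here has all of its dependencies completed.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


# Hypothetical example: two simulations share a setup step and feed a merge.
example = {
    "merge": {"sim_a", "sim_b"},
    "sim_a": {"setup"},
    "sim_b": {"setup"},
}
waves = dag_waves(example)
```

In a bag-of-tasks application every task would land in a single wave, which is why the simpler default schedulers are sufficient for that case.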
The DIANE communication model is based on a master-worker architecture. This approach is also known as agent-based computing or pilot jobs: a set of worker agents controls the resources. Resource allocation is independent of application execution control and can therefore be easily adapted to various use cases. DIANE uses the Ganga interface to allocate resources by submitting worker agent jobs, so the system supports a large number of computing backends: LSF, PBS, SGE, Condor and the LCG/EGEE Grid.
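The master-worker idea can be sketched in a few lines of Python. In this toy version, threads stand in for worker agents and an in-memory queue stands in for the master's task pool. Workers pull tasks whenever they are free, which gives load balancing for free, and a failed task is put back in the pool for a limited number of retries. The function run_master_worker and its parameters are assumptions made for illustration and do not reflect DIANE's actual implementation or network protocol.

```python
import queue
import threading


def run_master_worker(tasks, execute, n_workers=4, max_retries=2):
    """Run execute(task) for every task using pull-based workers.

    Hypothetical helper for illustration only; not part of DIANE.
    Returns (results, failed), both keyed by task index.
    """
    pending = queue.Queue()
    for task_id, task in enumerate(tasks):
        pending.put((task_id, task, 0))  # (id, payload, attempts so far)

    results = {}
    failed = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task_id, task, attempts = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to pull, so this worker retires
            try:
                value = execute(task)
            except Exception as exc:
                if attempts < max_retries:
                    # Failure recovery: return the task to the pool.
                    pending.put((task_id, task, attempts + 1))
                else:
                    with lock:
                        failed[task_id] = exc
            else:
                with lock:
                    results[task_id] = value

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, failed


squares, errors = run_master_worker(range(10), lambda x: x * x)
```

Because each worker asks for work only when it is idle, faster workers naturally process more tasks than slower ones, with no static partitioning decided up front.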
Unlike standard message passing libraries such as MPI, the DIANE framework takes care of all synchronization, communication and workflow management details on behalf of the application. The execution of a job is fully controlled by the framework, which decides when and where tasks are executed. Existing applications are therefore simple to interface as Python plugin modules. Application plugin modules contain only the essential code directly related to the application itself, with no networking details.
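As a rough illustration of this separation of concerns, the sketch below shows what an application-only module might look like: it knows how to split the problem into tasks, how to run one task, and how to merge the results, and nothing else. The class name, the method names (make_tasks, run_task, merge) and the run_locally driver are all hypothetical. They are not DIANE's real plugin interface, which should be taken from the DIANE documentation.

```python
class SumOfSquaresApp:
    """Hypothetical application plugin: application logic only, no networking."""

    def __init__(self, n, chunk_size):
        self.n = n
        self.chunk_size = chunk_size

    def make_tasks(self):
        # Split range(n) into independent chunks (a bag of tasks).
        return [
            (start, min(start + self.chunk_size, self.n))
            for start in range(0, self.n, self.chunk_size)
        ]

    def run_task(self, task):
        # Process one chunk; this is the code a worker would execute.
        start, stop = task
        return sum(i * i for i in range(start, stop))

    def merge(self, partial_results):
        # Combine the per-task outputs into the final answer.
        return sum(partial_results)


def run_locally(app):
    """Sequential stand-in for the framework, which in practice would
    decide when and where each task runs."""
    return app.merge(app.run_task(task) for task in app.make_tasks())


total = run_locally(SumOfSquaresApp(n=10, chunk_size=3))
```

The same application object could be handed to a parallel driver without modification, since it holds no assumptions about how tasks are distributed.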
History
The DIANE R&D Project was started at CERN in 2000. Initially it was intended to be specific to distributed data analysis for High Energy Physics. Later the scope was extended and the tool was successfully used for Monte Carlo simulations based on the Geant4 toolkit. As the EGEE project involved more and more scientific communities, DIANE also gained popularity in various user communities. Over the years the initial prototype, developed as the 1.x release series, was patched and stretched in all directions, way beyond the original concept. The 2.0 release was therefore intended as a complete rewrite and simplification of the code, capitalizing on the previous experience.