Data Mining Grid



Data mining tools and services for grid computing environments.


Future and emerging complex problem-solving environments are characterised by increasing amounts of digital data and rising demands for coordinated resource sharing across geographically dispersed sites. Next-generation grid technologies promise to provide the infrastructure needed for seamless sharing of computing resources in such environments. Currently, no coherent framework exists for developing and deploying data mining applications on the grid. The DataMiningGrid project will address this gap by developing generic, sector-independent data mining tools and services for the grid.

The main objectives of the DataMiningGrid project are:
  • to develop grid interfaces that allow data mining tools and data sources to interoperate within distributed grid computing environments. A user-friendly workflow editor will be provided to facilitate the configuration of analysis tasks;
  • to develop grid-based text mining and ontology-learning services and interfaces for knowledge discovery in texts and ontology learning;
  • to develop a testbed consisting of several demonstrator applications from a diverse set of sectors, including the bioinformatics, healthcare, and automotive industries;
  • to align and integrate these technologies with emerging grid standards and infrastructures.
The key technologies developed in the project include:
  • distributed data mining tools, facilitating novel approaches to mining data;
  • grid-aware data mining interfaces and services (e.g. data sources, algorithms);
  • workflow-based management of mining in grid environments;
  • standardisation of data mining within grid computing environments.
To demonstrate the technology developed, the project will implement a range of demonstrator applications in e-science and e-business (see picture). Other results include promotion of accessible data mining and the facilitation of industrial take-up.



Because of the increasing role of data in many sectors, the project's impact will be significant: it is an important step towards more effective and efficient exploitation of available data and information resources. In the long run, the project will contribute to new business and R&D opportunities in the European market and to an increase in quality of life. It will also contribute to efforts to standardise grid and data mining technologies.

Type of project
Specific targeted research project
Project coordinator
University of Ulster, Northern Ireland
Contact person
Prof. Werner Dubitzky,
School of Biomedical Sciences,
Cromore Road, BT52 1SA Coleraine
Northern Ireland
Project website
www.datamininggrid.org
Maximum Community contribution to project
EUR 1,883,000
Project start date
1 September 2004
Duration
24 months