Towards exascale distributed data management

ALOISIO, Giovanni
2009-01-01

Abstract

Exascale eScience infrastructures will face critical challenges from both computational and data perspectives. Increasingly complex and parallel scientific codes will produce huge amounts of data. The volume of data, and the time needed to locate, access, analyze and visualize it, will strongly affect the productivity of scientists and researchers in several domains. Significant improvements in data management will therefore increase research productivity in solving complex scientific problems. The exascale scenario will involve large amounts of distributed data, available internationally across several countries, and several application domains will produce large volumes of data. For instance, in the climate change domain, hundreds of exabytes of data (distributed across several centers) are expected to be available through heterogeneous storage resources (located in data centers as well as in external environments such as data grids and data clouds) for access, analysis, post-processing and other scientific activities. Collections of data will be stored at different sites and made available to users for further analysis and study. The same will happen in other domains, where exascale high-performance computing applications will generate data at very high rates (terabytes/s) on millions of cores. This paper presents and discusses the main challenges that must be taken into account in the exascale context.

Use this identifier to cite or link to this document: https://hdl.handle.net/11587/363248