Please use this identifier to cite or link to this item:
http://dx.doi.org/10.25673/86229
Title: | Dissecting self-describing data formats to enable advanced querying of file metadata |
Author(s): | Duwe, Kira; Kuhn, Michael |
Issue Date: | 2021 |
Type: | Conference object |
Language: | English |
URN: | urn:nbn:de:gbv:ma9:1-1981185920-881815 |
Subjects: | Information systems; Hierarchical storage management; Computer systems organization; Client-server architectures; Distributed storage |
Abstract: | In times of continuously growing data sizes, performing insightful analysis is increasingly difficult. I/O libraries such as NetCDF and ADIOS2 offer options to manage additional metadata to make the data retrieval more efficient. However, queries on this metadata are difficult as it is currently stored inside the corresponding self-describing data formats. By replacing the file system underneath with the storage framework JULEA, we can use dedicated backends for key-value and object stores, as well as databases. Splitting the BP file content into file metadata and file data enables novel and highly efficient data management techniques without creating redundancy. We have kept our approach transparent to the application layer by implementing a custom ADIOS2 engine. Moreover, our data analysis interface allows speeding up metadata queries by a factor of up to 60,000 in comparison to the ADIOS2 API and data formats. |
URI: | https://opendata.uni-halle.de//handle/1981185920/88181 http://dx.doi.org/10.25673/86229 |
Open Access: | Open access publication |
License: | (CC BY 4.0) Creative Commons Attribution 4.0 |
Sponsor/Funder: | Transformative agreement |
Publisher: | Association for Computing Machinery |
Publisher Place: | New York |
Original Publication: | 10.1145/3456727.3463778 |
Appears in Collections: | Fakultät für Informatik (OA) |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Duwe et al._Dissecting self-describing_2021.pdf | Secondary publication | 1.01 MB | Adobe PDF |