Managing the High Cost of Data Movement Through Rich Metadata

15 Mar 2021
16:10 - 16:30

When combining traditional HPC and cloud resources for a single workflow, different cost metrics come into play.

In many cases, moving data into a cloud is inexpensive or free, but pulling it back out later can be costly. This is part of what makes the cloud sticky. Rather than moving full data sets from one environment (cloud or HPC) to the other for more detailed analysis, judiciously subselecting the relevant data through rich, deep queries can reduce both time and cost.

The EMPRESS system can hold custom, rich metadata at various levels, from the entire run down to a region of a single variable. Because the rich query interface returns the proper data subsets, only the data needed on the alternative platform has to be moved.
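
As a rough illustration of the idea, the sketch below queries a small in-memory catalog of per-region metadata and compares the bytes (and egress cost) of moving the whole data set against moving only the matching regions. It does not use the EMPRESS API: the `RegionMetadata` record, the `select_regions` and `egress_cost` helpers, the catalog contents, and the per-GB price are all hypothetical and chosen only to make the arithmetic concrete.

```python
from dataclasses import dataclass

GB = 10**9  # decimal gigabyte, used for the illustrative pricing below


@dataclass(frozen=True)
class RegionMetadata:
    """Hypothetical metadata record for one region of one variable."""
    run: str
    variable: str
    region: tuple      # ((x0, x1), (y0, y1)) index bounds; assumed layout
    size_bytes: int    # size of the underlying data for this region
    max_value: float   # example of a custom, per-region attribute


def select_regions(catalog, variable, min_max_value):
    """Return records for `variable` whose max_value is at least the threshold."""
    return [
        r for r in catalog
        if r.variable == variable and r.max_value >= min_max_value
    ]


def egress_cost(size_bytes, price_per_gb):
    """Estimated transfer cost for `size_bytes` at a flat per-GB price."""
    return size_bytes / GB * price_per_gb


# Made-up catalog: four 40 GB regions from a single run.
catalog = [
    RegionMetadata("run-001", "temperature", ((0, 512), (0, 512)), 40 * GB, 310.0),
    RegionMetadata("run-001", "temperature", ((512, 1024), (0, 512)), 40 * GB, 455.0),
    RegionMetadata("run-001", "temperature", ((0, 512), (512, 1024)), 40 * GB, 298.0),
    RegionMetadata("run-001", "pressure", ((0, 512), (0, 512)), 40 * GB, 101.3),
]

PRICE_PER_GB = 0.09  # assumed flat egress price, for illustration only

# Query the metadata first, then move only the regions that match.
selected = select_regions(catalog, "temperature", min_max_value=400.0)
full_bytes = sum(r.size_bytes for r in catalog)
subset_bytes = sum(r.size_bytes for r in selected)

print(f"full transfer:   {full_bytes / GB:.0f} GB, "
      f"cost {egress_cost(full_bytes, PRICE_PER_GB):.2f}")
print(f"subset transfer: {subset_bytes / GB:.0f} GB, "
      f"cost {egress_cost(subset_bytes, PRICE_PER_GB):.2f}")
```

The query touches only the metadata, so the decision about what to move can be made before any bulk data leaves the platform it currently resides on.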