Data Integration for Business

Keywords: Big data

Affiliation: University of Vienna

Area of Application

When assessing data quality in business analytics, metadata describing substantive properties of the data are of utmost importance. In particular, one often needs information on how representative the data are or on the methods of data collection, knowledge that goes beyond the information in the database schema. By combining ideas from statistical metadata management and business workflow management, DIBA offers an environment for computing metadata for new data in a warehouse that result from a data integration activity.

Abstract

The basic idea of the approach is to process metadata simultaneously with the data: alongside database operations such as joins, DIBA defines corresponding metadata operations that update the data description. Besides defining the populations represented by the integrated data, an important topic is keeping track of missing values and documenting those that arise in the course of the operations.
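
The idea of pairing a database operation with a metadata operation can be illustrated with a minimal Python sketch. It is not DIBA's actual implementation: the names Metadata, Dataset, count_missing and left_join, the metadata fields, and the sample data are all hypothetical and chosen only for illustration. The sketch assumes that the join key is unique in the right-hand table.

```python
from dataclasses import dataclass, field


@dataclass
class Metadata:
    # Hypothetical metadata record; the fields are illustrative assumptions.
    population: str                               # population the rows represent
    collection_method: str                        # how the data were collected
    missing: dict = field(default_factory=dict)   # column -> number of missing values
    log: list = field(default_factory=list)       # documentation of applied operations


@dataclass
class Dataset:
    rows: list        # list of dicts, one dict per record
    meta: Metadata


def count_missing(rows):
    """Count None values per column."""
    counts = {}
    for row in rows:
        for col, val in row.items():
            if val is None:
                counts[col] = counts.get(col, 0) + 1
    return counts


def left_join(left, right, key):
    """Left join on `key` that updates the metadata together with the data."""
    # Assumption: `key` is unique in `right`; duplicates would be overwritten here.
    right_index = {r[key]: r for r in right.rows}
    right_cols = [c for c in (right.rows[0] if right.rows else {}) if c != key]

    joined, unmatched = [], 0
    for row in left.rows:
        match = right_index.get(row[key])
        if match is None:
            # The join itself introduces missing values for unmatched rows.
            unmatched += 1
            extra = {c: None for c in right_cols}
        else:
            extra = {c: match[c] for c in right_cols}
        joined.append({**row, **extra})

    # Metadata operation corresponding to the join.
    meta = Metadata(
        population=f"({left.meta.population}) left-joined with "
                   f"({right.meta.population}) on '{key}'",
        collection_method=f"{left.meta.collection_method}; "
                          f"{right.meta.collection_method}",
        missing=count_missing(joined),
        log=left.meta.log + right.meta.log + [
            f"left_join on '{key}': {unmatched} row(s) without match, "
            f"missing values introduced in {right_cols}"
        ],
    )
    return Dataset(joined, meta)


# Hypothetical sample data.
customers = Dataset(
    rows=[{"id": 1, "region": "AT"}, {"id": 2, "region": None}, {"id": 3, "region": "DE"}],
    meta=Metadata("all registered customers", "CRM export"),
)
orders = Dataset(
    rows=[{"id": 1, "revenue": 100}, {"id": 3, "revenue": None}],
    meta=Metadata("customers with at least one order", "order system extract"),
)

result = left_join(customers, orders, "id")
print(result.meta.population)
print(result.meta.missing)
print(result.meta.log[-1])
```

In this sketch the joined data set carries a description of the population it now represents, per-column counts of missing values, and a log entry documenting how many missing values the join itself introduced, as opposed to those already present in the sources.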