Reconstructing databases: instance-based structure discovery using reconstructability analysis

Details

Serval ID
serval:BIB_8226470FC95E
Type
Inproceedings: an article in a conference proceedings.
Collection
Publications
Institution
Title
Reconstructing databases: instance-based structure discovery using reconstructability analysis
Title of the conference
CASCON '17 Proceedings of the 27th Annual International Conference on Computer Science and Software Engineering
Author(s)
Andritsos P.
Publisher
IBM Corp. Riverton, NJ, USA ©2017
Address
Markham, Ontario, Canada
Publication state
Published
Issued date
06/11/2017
Peer-reviewed
Yes
Pages
279-284
Language
English
Abstract
Exploring database tables, small or large, can be challenging when no proper structure or constraints exist. The usual textbook way to impose structure on relational databases is to define functional dependency constraints and apply decomposition theory to obtain smaller, more concise, and semantically meaningful relations without losing any of the original information. This procedure requires that functional dependencies exist or be defined, mostly at design time. Consequently, all data instances must adhere to these constraints. However, functional dependencies are not always available or easy to deduce. In this position paper, we explore a novel way to perform decomposition, or reconstruction, of database tables based on their instances and their information content. We present a technique from Systems Theory called Reconstructability Analysis (RA) and discuss how it can be used to decompose relations in a fully unsupervised way, without any pre-existing constraints. RA quantifies the information content of a database relation and searches for sub-relations that retain this information while admitting a more concise description than the original. After defining RA, we show its potential, discuss its advantages and disadvantages, and propose problems worth exploring by the database community.
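The idea in the abstract, measuring the information in a relation instance and checking whether smaller sub-relations retain it, can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a toy relation stored as Python tuples, uses Shannon entropy of the tuple distribution as a stand-in for "information content", and tests a candidate decomposition by joining the projections back together. The function names (entropy, project, join, is_lossless) and the sample data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(rows):
    """Shannon entropy (in bits) of the empirical tuple distribution."""
    counts = Counter(rows)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

def project(rows, attrs):
    """Set projection of the rows onto the attribute positions in attrs."""
    return {tuple(row[i] for i in attrs) for row in rows}

def join(left, left_attrs, right, right_attrs):
    """Natural join of two projections on their shared attribute positions."""
    shared = [a for a in left_attrs if a in right_attrs]
    all_attrs = sorted(set(left_attrs) | set(right_attrs))
    result = set()
    for l in left:
        lmap = dict(zip(left_attrs, l))
        for r in right:
            rmap = dict(zip(right_attrs, r))
            if all(lmap[a] == rmap[a] for a in shared):
                merged = {**lmap, **rmap}
                result.add(tuple(merged[a] for a in all_attrs))
    return result

def is_lossless(rows, left_attrs, right_attrs):
    """True if joining the two projections rebuilds exactly the original rows.

    Assumes left_attrs and right_attrs together cover every column position.
    """
    rebuilt = join(project(rows, left_attrs), left_attrs,
                   project(rows, right_attrs), right_attrs)
    return rebuilt == set(rows)

# Hypothetical instance with columns (employee, department, city).
rows = [
    ("ann", "sales", "paris"),
    ("bob", "sales", "paris"),
    ("cat", "it", "rome"),
    ("dan", "hr", "rome"),
]

print(entropy(rows))                         # information in the full relation
print(is_lossless(rows, (0, 1), (1, 2)))     # split on department
print(is_lossless(rows, (0, 2), (1, 2)))     # split on city
```

In this toy instance, splitting on department rebuilds the original rows, while splitting on city produces spurious tuples because two departments share a city. No functional dependency was declared in advance; the check relies only on the data instance, which is the setting the abstract describes.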
Create date
28/02/2018 11:02
Last modification date
21/08/2019 6:16