Ranking and aggregation queries are widely used in data exploration, data analysis, and decision-making scenarios. While most of the currently proposed ranking and aggregation techniques focus on deterministic data, several emerging applications...
Correlated pattern mining in quantitative databases
Yiping Ke, James Cheng, Wilfred Ng
Article No.: 14
We study mining correlations from quantitative databases and show that this is a more effective approach than mining associations to discover useful patterns. We propose the novel notion of quantitative correlated pattern (QCP), which is...
Sketching techniques provide approximate answers to aggregate queries both for data-streaming and distributed computation. Small space summaries that have linearity properties are required for both types of applications. The prevalent method for...
Confidence bounds for sampling-based group by estimates
Fei Xu, Christopher Jermaine, Alin Dobra
Article No.: 16
Sampling is now a very important data management tool, to such an extent that an interface for database sampling is included in the latest SQL standard. In this article we reconsider in depth what at first may seem like a very simple...
Workload-aware anonymization techniques for large-scale datasets
Kristen LeFevre, David J. DeWitt, Raghu Ramakrishnan
Article No.: 17
Protecting individual privacy is an important problem in microdata distribution and publishing. Anonymization algorithms typically aim to satisfy certain privacy definitions with minimal impact on the quality of the resulting data. While much of...
Hierarchical synopses with optimal error guarantees
Panagiotis Karras, Nikos Mamoulis
Article No.: 18
Hierarchical synopsis structures offer a viable alternative in terms of efficiency and flexibility in relation to traditional summarization techniques such as histograms. Previous research on such structures has mostly focused on a single model,...
Efficient online index construction for text databases
Nicholas Lester, Alistair Moffat, Justin Zobel
Article No.: 19
Inverted index structures are a core element of current text retrieval systems. They can be constructed quickly using offline approaches, in which one or more passes are made over a static set of input data, and, at the completion of the process,...