A common problem in many types of databases is retrieving the most similar matches to a query object. Finding these matches in a large database can be too slow to be practical, especially in domains where objects are compared using computationally...
Optimized stratified sampling for approximate query processing
Surajit Chaudhuri, Gautam Das, Vivek Narasayya
Article No.: 9
The ability to approximately answer aggregation queries accurately and efficiently is of great benefit for decision support and data mining tools. In contrast to previous sampling-based studies, we treat the problem as an optimization problem...
Several studies have demonstrated the effectiveness of the Haar wavelet decomposition as a tool for reducing large amounts of data down to compact wavelet synopses that can be used to obtain fast, accurate approximate answers to user...
Pseudo-random number generation for sketch-based estimations
Florin Rusu, Alin Dobra
Article No.: 11
The exact computation of aggregate queries, like the size of the join of two relations, usually requires large amounts of memory (constrained in data-streaming) or communication (constrained in distributed computation) and large processing times. In...
Estimating the selectivity of approximate string queries
Arturas Mazeika, Michael H. Böhlen, Nick Koudas, Divesh Srivastava
Article No.: 12
Approximate queries on string data are important due to the prevalence of such data in databases and various conventions and errors in string data. We present the VSol estimator, a novel technique for estimating the selectivity of approximate...
Out-of-core coherent closed quasi-clique mining from large dense graph databases
Zhiping Zeng, Jianyong Wang, Lizhu Zhou, George Karypis
Article No.: 13
Due to the ability of graphs to represent more generic and more complicated relationships among different objects, graph mining has played a significant role in data mining, attracting increasing attention in the data mining community. In...