ACM Transactions on Database Systems (TODS), Volume 36 Issue 3, August 2011

Efficient similarity joins for near-duplicate detection
Chuan Xiao, Wei Wang, Xuemin Lin, Jeffrey Xu Yu, Guoren Wang
Article No.: 15
DOI: 10.1145/2000824.2000825

With the increasing amount of data and the need to integrate data from multiple data sources, one of the challenging issues is to identify near-duplicate records efficiently. In this article, we focus on efficient algorithms to find a pair...

Differential dependencies: Reasoning and discovery
Shaoxu Song, Lei Chen
Article No.: 16
DOI: 10.1145/2000824.2000826

The importance of difference semantics (e.g., “similar” or “dissimilar”) has been recently recognized for declaring dependencies among various types of data, such as numerical values or text values. We propose a...

Embedding-based subsequence matching in time-series databases
Panagiotis Papapetrou, Vassilis Athitsos, Michalis Potamias, George Kollios, Dimitrios Gunopulos
Article No.: 17
DOI: 10.1145/2000824.2000827

We propose an embedding-based framework for subsequence matching in time-series databases that improves the efficiency of processing subsequence matching queries under the Dynamic Time Warping (DTW) distance measure. This framework partially...

The Monte Carlo database system: Stochastic analysis close to the data
Ravi Jampani, Fei Xu, Mingxi Wu, Luis Perez, Chris Jermaine, Peter J. Haas
Article No.: 18
DOI: 10.1145/2000824.2000828

The application of stochastic models and analysis techniques to large datasets is now commonplace. Unfortunately, in practice this usually means extracting data from a database system into an external tool (such as SAS, R, Arena, or MATLAB), and...

A survey on representation, composition and application of preferences in database systems
Kostas Stefanidis, Georgia Koutrika, Evaggelia Pitoura
Article No.: 19
DOI: 10.1145/2000824.2000829

Preferences have been traditionally studied in philosophy, psychology, and economics and applied to decision-making problems. Recently, they have attracted the attention of researchers in other fields, such as databases where they capture soft...