Publication

Random Manhattan Indexing

QasemiZadeh, Behrang
Handschuh, Siegfried
Citation
Behrang QasemiZadeh and Siegfried Handschuh (2014). Random Manhattan Indexing. 25th International Workshop on Database and Expert Systems Applications.
Abstract
Vector space models (VSMs) are mathematically well-defined frameworks that have been widely used in text processing. In these models, high-dimensional, often sparse vectors represent text units. In an application, the similarity of vectors, and hence of the text units that they represent, is computed by a distance formula. The high dimensionality of the vectors, however, is a barrier to the performance of methods that employ VSMs. Consequently, a dimensionality reduction technique is employed to alleviate this problem. This paper introduces a new method, called Random Manhattan Indexing (RMI), for the construction of L1-normed VSMs at reduced dimensionality. RMI combines the construction of a VSM and dimension reduction into an incremental, and thus scalable, procedure. To attain this goal, RMI employs sparse Cauchy random projections.
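The incremental procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names DIM, DENSITY, index_vector, ReducedVSM, observe and distance are hypothetical, and the parameter values are arbitrary. Each context is assumed to receive a sparse index vector whose few non-zero entries are drawn from a standard Cauchy distribution, and a text unit's reduced vector is assumed to be the running sum of the index vectors of the contexts it occurs with. The distance function uses the sample median of absolute differences, the classic estimator for dense Cauchy projections, as a stand-in; the paper may use a different estimator.

```python
import math
import random
from statistics import median

DIM = 256      # reduced dimensionality (assumed value)
DENSITY = 0.1  # chance that an index-vector entry is non-zero (assumed value)


def index_vector(context, dim=DIM, density=DENSITY):
    """Sparse random index vector for a context, as {position: value}.

    Seeding the generator with the context name makes the vector
    reproducible, so it never has to be stored.
    """
    rng = random.Random(f"ctx:{context}")
    vec = {}
    for i in range(dim):
        if rng.random() < density:
            # Standard Cauchy sample via the inverse CDF.
            vec[i] = math.tan(math.pi * (rng.random() - 0.5))
    return vec


class ReducedVSM:
    """Builds reduced-dimension vectors one observation at a time."""

    def __init__(self, dim=DIM):
        self.dim = dim
        self.vectors = {}

    def observe(self, unit, context, weight=1.0):
        # Adding the context's index vector is the whole update step:
        # the full high-dimensional vector is never constructed.
        row = self.vectors.setdefault(unit, [0.0] * self.dim)
        for i, value in index_vector(context, self.dim).items():
            row[i] += weight * value

    def distance(self, a, b):
        # Assumption: median of absolute differences as an L1 estimate.
        # With very sparse index vectors and few observations this can
        # collapse to zero, so treat it as illustrative only.
        u, v = self.vectors[a], self.vectors[b]
        return median(abs(x - y) for x, y in zip(u, v))


vsm = ReducedVSM()
for unit, context in [("cat", "purr"), ("cat", "pet"), ("dog", "pet")]:
    vsm.observe(unit, context)
print(vsm.distance("cat", "dog"))
```

Because every update is a vector addition, observations can arrive in any order and new text units or contexts can be added without recomputing earlier vectors, which is the sense in which such a construction is incremental.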
Publisher
Publisher DOI
Rights
Attribution-NonCommercial-NoDerivs 3.0 Ireland