Publication

Investigating Context Parameters in Technology Term Recognition

QasemiZadeh, Behrang
Handschuh, Siegfried
Citation
QasemiZadeh, Behrang; Handschuh, Siegfried (2014) Investigating Context Parameters in Technology Term Recognition. In: Adam Meyers, Yifan He and Ralph Grishman (eds.), COLING Workshop on Synchronic and Diachronic Approaches to Analyzing Technical Language, Dublin, Ireland, 2014-08-24.
Abstract
We propose and evaluate the task of technology term recognition: a method to extract technology terms at a synchronic level from a corpus of scientific publications. The proposed method is built on the principles of terminology extraction and distributional semantics. It is realized as a regression task in a vector space model. In this method, candidate terms are first extracted from text. Subsequently, using the random indexing technique, the extracted candidate terms are represented as vectors in a Euclidean vector space of reduced dimensionality. These vectors are derived from the frequency of co-occurrences of candidate terms and words in windows of text surrounding the candidate terms in the input corpus (context windows). The constructed vector space and a set of manually tagged technology terms (reference vectors) are then used in a k-nearest neighbours regression framework to identify terms that signify technology concepts. We examine a number of factors that affect the performance of the proposed method, namely the configuration of context windows, the selection of neighbourhood size (k), and reference vector size.
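The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`index_vector`, `context_vectors`, `knn_score`), the default dimensionality, the number of non-zero elements, the window size, and the treatment of each candidate term as a single token are all assumptions made for illustration, and the averaging of reference labels is one plain form of k-nearest neighbours regression.

```python
import math
import random


def index_vector(word, dim=64, nonzero=4, seed=0):
    """Sparse ternary random index vector, deterministic for a given word.

    dim, nonzero and seed are illustrative defaults, not values from the paper.
    """
    rng = random.Random(f"{seed}:{word}")
    vec = [0.0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def context_vectors(sentences, candidates, window=2, dim=64):
    """Sum the index vectors of words inside the context window of each
    candidate term occurrence (candidates are single tokens in this sketch)."""
    vectors = {term: [0.0] * dim for term in candidates}
    for tokens in sentences:
        for i, token in enumerate(tokens):
            if token not in vectors:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue  # skip the candidate term itself
                for d, value in enumerate(index_vector(tokens[j], dim)):
                    vectors[token][d] += value
    return vectors


def knn_score(query, references, k=3):
    """k-nearest neighbours regression: mean label of the k reference
    vectors closest to the query under Euclidean distance.

    references is a list of (vector, label) pairs, where label is 1.0 for a
    manually tagged technology term and 0.0 otherwise.
    """
    nearest = sorted(references, key=lambda ref: math.dist(query, ref[0]))[:k]
    return sum(label for _, label in nearest) / len(nearest)
```

A candidate term would then be scored by passing its context vector and the labelled reference vectors to `knn_score`, and accepted as a technology term when the score exceeds a chosen threshold. Varying `window`, `k`, and the number of reference vectors corresponds to the three factors the abstract says are examined.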
Rights
Attribution-NonCommercial-NoDerivs 3.0 Ireland