Domain adaptive extraction of topical hierarchies for Expertise Mining

Bordea, Georgeta
In this age of pervasive internet access, we have become accustomed to relying on web search for our most basic information needs. But complex queries in knowledge-intensive organisations, as well as in the academic environment, are still best answered by direct interaction with domain experts. Experts produce large amounts of text in their daily activities, which can be analysed to automatically map expertise and to provide services that allow users to search for experts instead of documents. Current approaches to expert finding are based on keyphrase search, relying on exact string matches to identify experts. What is needed instead is support for exploratory search and discovery of expertise topics and experts, together with in-depth measures of expertise, which can be provided by extracting expertise topics and the relations between them. This dissertation examines methods for extracting knowledge structures from text and their application to expert search. Towards this goal, we introduce a novel methodology called Expertise Mining, which provides solutions for expertise topic extraction, expert profiling and expert finding through text analysis. In particular, we propose a term extraction approach that considers the level of specificity of a term within a domain as a solution for expertise topic extraction. We investigate relations between expertise topics, proposing a high-coverage method for topical hierarchy construction based on a global generality measure and a graph-based algorithm. We show that topical hierarchies can be used to improve expert finding by measuring how well an individual covers the subtopics of a field. Additionally, automatically extracted expertise topics are used to construct expert profiles that provide context for the expertise of a person.
This work has been part of the Saffron project at the Digital Enterprise Research Institute (DERI), NUI Galway. The Saffron system currently provides insight into different Computer Science domains and was deployed at several conferences as a tool for finding collaborators.
Attribution-NonCommercial-NoDerivs 3.0 Ireland