Open knowledge base canonicalization: Techniques and challenges
Yang, Yang ; Curry, Edward
Files
Paper_1_Yang.pdf
Adobe PDF, 219.88 KB
Publication Date
2024-05-26
Type
conference paper
Citation
Yang, Yang, & Curry, Edward. (2024). Open knowledge base canonicalization: Techniques and challenges. Paper presented at the third International Workshop on Knowledge Graph Generation from Text, co-located with Extended Semantic Web Conference (ESWC), Hersonissos, Greece, 26-30 May.
Abstract
Curated knowledge bases (CKBs) play a fundamental role in both academia and industry, but they require significant human involvement to pre-define the ontology and cannot quickly adapt to new domains and new data. To address this problem, open information extraction (OIE) methods are used to automatically extract structure, in the form of non-canonicalized triples, from unstructured text, which makes it possible to create large open knowledge bases (OKBs). However, noun phrases and relation phrases in such OKBs are not canonicalized, which results in scattered and redundant facts. To disambiguate these phrases and eliminate the redundancy, the task of OKB canonicalization clusters synonymous noun phrases and relation phrases into the same group and assigns each group a unique identifier. This task is challenging because OKBs are highly sparse and carry limited information. In this paper, we provide an overview and analysis of the techniques used by the main frameworks and discuss the challenges of this task.
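As a rough illustration of what canonicalization means in practice, the sketch below clusters the noun phrases of a few toy triples and assigns each cluster an identifier. It is not a method from the paper: the token-overlap (Jaccard) similarity, the 0.5 threshold, the `E0`-style identifiers, the function names, and the example triples are all assumptions chosen for brevity. Approaches in this area generally rely on richer signals than surface overlap.

```python
from itertools import combinations


def tokens(phrase):
    """Lowercased word set of a phrase (toy normalization, an assumption)."""
    return set(phrase.lower().replace(".", "").split())


def jaccard(a, b):
    """Token-overlap similarity between two phrases, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def canonicalize(phrases, threshold=0.5):
    """Group phrases whose similarity reaches the threshold (transitively)
    and map every phrase to the identifier of its group."""
    phrases = sorted(set(phrases))
    parent = {p: p for p in phrases}  # union-find forest

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for a, b in combinations(phrases, 2):
        if jaccard(a, b) >= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the alphabetically smaller root so results are deterministic.
                parent[max(ra, rb)] = min(ra, rb)

    roots = sorted({find(p) for p in phrases})
    group_id = {root: f"E{i}" for i, root in enumerate(roots)}
    return {p: group_id[find(p)] for p in phrases}


# Hypothetical OIE output: (subject, relation phrase, object) triples.
triples = [
    ("Barack Obama", "was born in", "Honolulu"),
    ("Obama", "was born in", "Honolulu"),
    ("President Obama", "lives in", "Washington"),
]

noun_phrases = [np for s, _, o in triples for np in (s, o)]
ids = canonicalize(noun_phrases)

# Rewrite the triples over identifiers; duplicates collapse in the set.
canonical = {(ids[s], r, ids[o]) for s, r, o in triples}
```

Here the three surface forms of the same person end up in one group, so the first two triples become a single fact. Relation phrases could be passed through the same function, although surface overlap is a weak signal for them (for example, "was born in" and "is a native of" share no tokens).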
Publisher
CEUR Workshop Proceedings
Rights
CC BY