Triplifying Wikipedia's tables
Muñoz, Emir ; Hogan, Aidan ; Mileo, Alessandra
Publication Date
2013
Type
Workshop paper
Citation
MUÑOZ, Emir, HOGAN, Aidan and MILEO, Alessandra, 2013. Triplifying Wikipedia’s Tables. In: Proceedings of the Linked Data for Information Extraction Workshop (LD4IE 2013) at ISWC 2013. Sydney, Australia: CEUR-WS.org. CEUR Workshop Proceedings.
Abstract
We are currently investigating methods to triplify the content of Wikipedia's tables. We propose that existing knowledge-bases can be leveraged to semi-automatically extract high-quality facts (in the form of RDF triples) from tables embedded in Wikipedia articles (henceforth called "Wikitables"). We present a survey of Wikitables and their content in a recent dump of Wikipedia. We then discuss some ongoing work on using DBpedia to mine novel RDF triples from these tables: we present methods that automatically extract 24.4 million raw triples from the Wikitables at an estimated precision of 52.2%. We believe this precision can be (greatly) improved through machine learning methods and sketch ideas for features that should help classify (in)correct triples.
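The abstract does not spell out the extraction procedure, so the following is only a minimal sketch of one way an existing knowledge base might be leveraged to propose new triples from a table. It assumes that table cells have already been linked to entity identifiers, and that a predicate the knowledge base already holds between two columns in several rows is a candidate for the remaining rows. The function name `suggest_triples`, the `min_support` threshold, the `ex:` identifiers, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import permutations


def suggest_triples(rows, kb, min_support=2):
    """Propose (subject, predicate, object) triples missing from `kb`.

    rows: list of table rows, each a list of entity ids (None if unlinked).
    kb:   set of known (subject, predicate, object) triples.
    Hypothetical heuristic: a predicate seen between columns i and j in at
    least `min_support` rows is proposed for the other rows of that pair.
    """
    # Index the knowledge base by (subject, object) for fast lookup.
    by_pair = {}
    for s, p, o in kb:
        by_pair.setdefault((s, o), set()).add(p)

    n_cols = max((len(r) for r in rows), default=0)
    suggestions = []
    for i, j in permutations(range(n_cols), 2):
        # Keep only the rows where both cells were linked to an entity.
        pairs = [(r[i], r[j]) for r in rows
                 if i < len(r) and j < len(r) and r[i] and r[j]]
        # Count how often each predicate already links column i to column j.
        support = Counter(p for pair in pairs for p in by_pair.get(pair, ()))
        for p, count in sorted(support.items()):
            if count < min_support:
                continue
            for s, o in pairs:
                if (s, p, o) not in kb:
                    suggestions.append((s, p, o))
    return suggestions


# Toy data (hypothetical identifiers, not real DBpedia content).
rows = [
    ["ex:PlayerA", "ex:Club1"],
    ["ex:PlayerB", "ex:Club1"],
    ["ex:PlayerC", "ex:Club2"],
]
kb = {
    ("ex:PlayerA", "ex:team", "ex:Club1"),
    ("ex:PlayerB", "ex:team", "ex:Club1"),
}
result = suggest_triples(rows, kb)
```

Triples proposed this way are raw candidates only; as the abstract notes, a further classification step would be needed to separate correct from incorrect ones.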
Publisher
CEUR-WS.org
Rights
Attribution-NonCommercial-NoDerivs 3.0 Ireland