Abstract:
Over the last decade, extensive research on the automatic construction of knowledge graphs from Web resources has produced a number of large-scale knowledge graphs such as YAGO, DBpedia, BabelNet, and Wikidata. Although some of these knowledge graphs are multilingual, they contain little or no linked data in Persian and provide no tools for extracting knowledge from Persian information sources. FarsBase is the first Persian multi-source knowledge graph, designed specifically to support Persian knowledge in semantic search engines. FarsBase uses a diverse set of hybrid and flexible techniques to extract and integrate knowledge from various sources, such as Wikipedia, Web tables, and unstructured texts. It also supports entity linking, which allows integration with other knowledge graphs. To maintain high accuracy for triples, we adopt a low-cost mechanism in which human experts verify candidate knowledge, with the candidates for human verification prioritized using different heuristics. FarsBase is currently used as the semantic-search system of a Persian search engine and efficiently answers hundreds of semantic queries per second.