Abstract:
Domain ontologies are foundational for knowledge graphs, yet their construction remains a bottleneck. Current LLM-based ontology construction is limited by rigid, predefined schemas that fail to "let the data speak" and therefore miss real-world semantics. In addition, the lack of a structured decomposition paradigm leads to hallucinations and poor relationship extraction, which hinders the development of reliable domain ontologies. To address these limitations, we propose LLM4onto, a framework that automates domain ontology construction directly from raw text. LLM4onto uses affinity propagation clustering to discover latent entity types autonomously, without predefined constraints, and introduces the Head-Tail-Relation-Ontology paradigm, which summarizes relationships between ontologies in a triplet format. Decomposing the holistic construction task into simpler units reduces the cognitive load on the model and thereby reduces hallucinations. Extensive empirical evaluations show that LLM4onto achieves superior performance across diverse datasets, and validation on cybersecurity corpora underscores its ability to construct specialized domain ontologies.
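
As a rough illustration of the triplet format described above, the following minimal sketch lifts entity-level triples to type-level (head type, relation, tail type) triples. It is an assumption-laden toy, not the LLM4onto implementation: the `Triple` class, the `lift_to_type_level` function, the `min_support` parameter, and all entity and type names are hypothetical. The entity-to-type mapping is taken as given here; in the framework it would come from the clustering step.

```python
from collections import Counter
from typing import NamedTuple


class Triple(NamedTuple):
    """A (head, relation, tail) triple; hypothetical helper type."""
    head: str
    relation: str
    tail: str


def lift_to_type_level(entity_triples, entity_type, min_support=1):
    """Aggregate entity-level triples into type-level triples.

    entity_type maps each entity to its discovered type (assumed to be
    produced by a prior clustering step). Returns a dict that maps each
    type-level triple to the number of entity-level triples supporting it.
    """
    counts = Counter()
    for t in entity_triples:
        head_type = entity_type.get(t.head)
        tail_type = entity_type.get(t.tail)
        if head_type is None or tail_type is None:
            continue  # skip triples whose entities have no assigned type
        counts[Triple(head_type, t.relation, tail_type)] += 1
    return {triple: n for triple, n in counts.items() if n >= min_support}


# Made-up toy data in a cybersecurity flavour (purely illustrative).
entity_type = {
    "malware_a": "Malware",
    "malware_b": "Malware",
    "vuln_x": "Vulnerability",
    "vuln_y": "Vulnerability",
    "product_p": "Software",
}
entity_triples = [
    Triple("malware_a", "exploits", "vuln_x"),
    Triple("malware_b", "exploits", "vuln_y"),
    Triple("vuln_x", "affects", "product_p"),
    Triple("malware_b", "exploits", "unknown_entity"),  # untyped tail, skipped
]

ontology_triples = lift_to_type_level(entity_triples, entity_type)
for triple, support in sorted(ontology_triples.items()):
    print(triple.head, triple.relation, triple.tail, support)
```

Keeping a support count per type-level triple is one possible design choice: raising `min_support` filters out relations backed by only a few extracted triples, which may help suppress spurious extractions.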