Ontology Learning from Italian Legal Texts

Ontologies capture our knowledge of a specific domain. Because this knowledge is often encoded in unstructured natural language text, it is essential we develop reliable techniques that automatically extract ontologies from such sources. One logical area of application is the legal domain, where valuable knowledge is dispersed across thousands of documents.

In their paper “Ontology learning from Italian legal texts”, Alessandro Lenci, Simonetta Montemagni, Vito Pirrelli and Giulia Venturi discuss a light-weight method for automatic ontology learning. Their method proceeds in several steps. After a pre-processing phase, they first extract target terms. These target terms can be either single words (identified on the basis of their frequency) or multi-word units (identified on the basis of the statistically significant co-occurrence of their parts). Next, these terms are organized. One algorithm, based on an analysis of their head-modifier structure, identifies hierarchical relationships, while another, based on distributional patterns, identifies semantically similar terms. A comparison with established legal ontologies shows that this method is able to model a significant body of legal knowledge.
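To make the two organizing steps more concrete, here is a minimal Python sketch of the general idea. It is not the authors' implementation. The function names, the toy terms and the context counts are all my own assumptions. I assume that terms are head-initial (as Italian noun phrases usually are), so that a hypernym is simply a shorter term that forms a token prefix of its hyponym, and I use plain cosine similarity over context counts as a stand-in for whatever distributional measure the paper actually uses.

```python
from collections import Counter
from math import sqrt


def hypernym_pairs(terms):
    """Pair each term with the longest shorter term that shares its head.

    Assumption: terms are head-initial, so a hypernym is a proper
    token prefix of its hyponym.
    """
    tokenized = {term: tuple(term.split()) for term in terms}
    pairs = []
    for term, tokens in tokenized.items():
        # Candidate hypernyms: other terms whose tokens are a proper prefix.
        candidates = [
            other for other, other_tokens in tokenized.items()
            if len(other_tokens) < len(tokens)
            and tokens[:len(other_tokens)] == other_tokens
        ]
        if candidates:
            # Keep only the most specific (longest) hypernym.
            best = max(candidates, key=lambda t: len(tokenized[t]))
            pairs.append((term, best))
    return pairs


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical target terms, invented for illustration.
terms = [
    "contratto",
    "contratto di lavoro",
    "contratto di lavoro a tempo determinato",
    "legge",
]
pairs = hypernym_pairs(terms)

# Hypothetical counts of verbs that each term co-occurs with.
contexts = {
    "decreto": Counter({"emanare": 3, "approvare": 2}),
    "legge": Counter({"emanare": 2, "approvare": 3}),
    "contratto": Counter({"firmare": 4}),
}
sim_decreto_legge = cosine(contexts["decreto"], contexts["legge"])
sim_decreto_contratto = cosine(contexts["decreto"], contexts["contratto"])
```

On this toy input, "contratto di lavoro" is attached to "contratto" and the longer fixed-term variant is attached to "contratto di lavoro", while "decreto" and "legge" come out as distributionally close and "decreto" and "contratto" do not. The output is exactly the kind of structure discussed here: same-head hyponym-hypernym pairs and sets of similar words, nothing richer.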

Frankly, Lenci et al.’s claims left me a bit confused. In their abstract, they talk about a “fully implemented ontology learning system”, while their experiments produce nothing more than a collection of hyponym-hypernym pairs with the same heads, and loose collections of semantically similar words. Surely ontologies are richer than that? Even taking the difficulties of automatic knowledge modeling into account, ontology learning has already come much further than this paper would lead one to suspect. It’s true that many systems suffer from poor precision and/or recall and are still a far cry from production-ready implementations. Still, that’s what research is for. To be sure, Lenci et al. take a first step toward ontology learning in the legal field, but it’s the next steps I’m looking forward to.
