In this paper, we present a comparative analysis of the leading rule-based information extraction systems in both research and industry, focusing on their main characteristics and their performance. Our evaluation was performed on a dataset of text documents describing financial products, taken from a real-world application scenario. We show that, while the considered tools are similar in the expressiveness of their extractors and produce results of comparable quality, the implementation choices of their engines have a substantial impact on overall execution time. Moreover, we highlight that some of the considered tools offer seamless support for writing extraction rules, addressing one of the common challenges associated with rule-based approaches.
Publication details
2023, Rules and Reasoning. RuleML+RR 2023, pages 157-165 (volume: 14244)
Comparing State of the Art Rule-Based Tools for Information Extraction (04b Conference paper in volume)
Scafoglieri Federico