Unlocking Financial Insights with AI

Author: Denis Avetisyan


New research explores how artificial intelligence can automatically extract key relationships from complex financial documents.

This review details methods for Large Language Model-based triplet extraction from financial reports, focusing on schema adaptation and faithfulness verification without manual labeling.

Extracting structured knowledge from corporate filings remains challenging due to a lack of labeled data for reliable evaluation. The paper ‘LLM-based Triplet Extraction from Financial Reports’ introduces a semi-automated pipeline leveraging Large Language Models to extract Subject-Predicate-Object triplets, employing ontology-driven metrics to bypass the need for ground truth. Results demonstrate that automatically induced ontologies achieve perfect schema conformance, and that a hybrid verification strategy, combining regex with an LLM-as-a-judge, reduces subject hallucination rates from 65.2% to 1.6%. Given observed asymmetries in subject and object hallucinations linked to financial prose style, can these findings inform strategies for improving knowledge graph construction and reasoning in other domains with similar linguistic characteristics?


Extracting Meaning from the Unstructured

The modern business landscape necessitates data-driven strategies, and organizations are finding crucial intelligence increasingly resides not in neatly organized databases, but within vast repositories of unstructured data. Corporate annual reports, alongside sources like news articles, market research, and internal memos, offer a rich, albeit complex, source of information regarding performance, risk, and future outlook. This reliance stems from the recognition that traditional, structured data often paints an incomplete picture; nuanced details and forward-looking statements are frequently embedded within these textual formats. Consequently, the ability to effectively leverage these unstructured sources is becoming a key differentiator, enabling more informed decision-making, improved risk assessment, and a deeper understanding of competitive dynamics.

The conversion of raw, unstructured data into actionable intelligence increasingly relies on the construction of Knowledge Graphs (KGs). These KGs represent information as interconnected entities and relationships, offering a far more nuanced understanding than traditional databases. Central to this process is Triplet Extraction (TE), which identifies subject-predicate-object relationships within the text – for example, “Apple acquired Beats Electronics”. Accurate TE is paramount; errors or omissions in these triplets directly impact the KG’s integrity and, consequently, the validity of any insights derived from it. Sophisticated natural language processing techniques, including named entity recognition and relation classification, are employed to automate TE, but challenges remain in handling ambiguity, context, and the sheer volume of data present in real-world documents. Ultimately, the effectiveness of a KG hinges on its ability to accurately reflect the underlying information, making robust and precise TE a foundational component.
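The subject-predicate-object structure at the heart of Triplet Extraction can be sketched as a minimal data type. This is a hypothetical illustration of the representation, not the paper's implementation:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triplet:
    """A subject-predicate-object fact extracted from text."""
    subject: str
    predicate: str
    obj: str

# The sentence "Apple acquired Beats Electronics" yields one triplet:
fact = Triplet(subject="Apple", predicate="acquired", obj="Beats Electronics")
```

A knowledge graph is then simply a collection of such triplets, with shared subjects and objects forming the graph's nodes.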

The Peril of Untruthfulness in Knowledge Construction

Faithfulness in knowledge graph (KG) construction refers to the extent to which asserted triples – subject, relation, and object – are verifiably supported by the source text used for extraction. A faithful KG accurately reflects the information present in the source documents, avoiding the introduction of unsupported or fabricated claims. Quantifying faithfulness is challenging, as it requires assessing the semantic alignment between extracted facts and the supporting evidence within the text. Low faithfulness directly impacts the reliability and usability of the KG, leading to inaccurate inferences and potentially flawed downstream applications. Therefore, faithfulness is a critical metric for evaluating the quality of any KG built through automated information extraction processes.

Knowledge graph (KG) construction is susceptible to “hallucinations,” which represent inaccuracies where extracted information lacks support from the source text and thus compromise faithfulness. These hallucinations manifest in three primary forms: Subject Hallucination (SH), where entities mentioned in the extracted triples do not appear in the source text; Object Hallucination (OH), where the values associated with relations are unsupported by the source; and Relation Hallucination (RH), where the stated relationship between entities is not present or inferable from the source text. The presence of any of these hallucination types directly degrades the quality and reliability of the constructed knowledge graph, impacting downstream applications reliant on accurate information.
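A naive, surface-level version of these three checks might look like the following sketch. Substring matching is only illustrative here; genuine verification requires semantic comparison, especially for relations:

```python
def hallucination_flags(triplet, source_text):
    """Crude surface-level flags for the three hallucination types.
    Real systems need semantic matching; substring tests are a
    lower-bound illustration only."""
    subject, predicate, obj = triplet
    return {
        "SH": subject not in source_text,    # subject unsupported
        "OH": obj not in source_text,        # object unsupported
        "RH": predicate not in source_text,  # relation unsupported (crude)
    }

flags = hallucination_flags(
    ("Apple", "acquired", "Spotify"),
    "Apple acquired Beats Electronics.",
)
# "Spotify" never appears in the source, so only OH is flagged.
```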

Pattern matching techniques, while effective for identifying explicit relationships within text, struggle with nuanced errors in knowledge extraction that compromise faithfulness. These methods typically rely on predefined templates or regular expressions to locate and extract triples, and therefore fail when hallucinations – such as the introduction of entities or relations not present in the source – occur with slight variations or paraphrasing. Because pattern matching focuses on surface-level textual cues, it cannot assess the semantic validity of extracted information against the original source, leading to the acceptance of incorrect or unsupported facts. Consequently, reliance on pattern matching alone results in a high rate of false positives and undermines the reliability of the constructed knowledge graph.
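The brittleness described above is easy to reproduce: a template that captures one explicit phrasing fails the moment the same fact is paraphrased. A minimal sketch:

```python
import re

# A template that only captures the explicit phrasing "X acquired Y".
pattern = re.compile(r"(?P<subj>[A-Z][\w ]+?) acquired (?P<obj>[A-Z][\w ]+)")

explicit = "Apple acquired Beats Electronics"
paraphrase = "Beats Electronics was bought by Apple"

hit = pattern.search(explicit)    # matches: subj="Apple"
miss = pattern.search(paraphrase) # None: the paraphrase defeats the template
```

The paraphrase states the same fact, yet the pattern extracts nothing; covering every surface variation by enumeration does not scale.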

Establishing Integrity: Ontology and Validation Techniques

A defined ontology serves as a formal representation of knowledge, establishing a vocabulary of entities, relations, and their properties, which is crucial for validating the accuracy of extracted triples. This validation process mitigates Relation Hallucination (RH), where a knowledge graph incorrectly asserts relationships not supported by the source data or the defined schema. By comparing extracted triples against the ontology, inconsistencies and unsupported relations can be identified and flagged, ensuring that the resulting knowledge graph adheres to a consistent and verifiable structure. The ontology effectively acts as a constraint on the extracted information, preventing the introduction of spurious or inaccurate relationships and thus improving the overall trustworthiness of the knowledge base.
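As an illustration of this constraint check, a toy ontology can map each predicate to allowed subject and object types. The predicate and type names below are hypothetical, not the paper's schema:

```python
# Hypothetical mini-ontology: each predicate is constrained to a
# (domain type, range type) pair.
ONTOLOGY = {
    "acquired": ("Company", "Company"),
    "reported_revenue": ("Company", "MonetaryAmount"),
}

ENTITY_TYPES = {
    "Apple": "Company",
    "Beats Electronics": "Company",
    "$3.0 billion": "MonetaryAmount",
}

def conforms(subject, predicate, obj):
    """Flag triples whose relation or argument types fall outside the schema."""
    if predicate not in ONTOLOGY:
        return False  # unknown relation -> candidate Relation Hallucination
    domain, rng = ONTOLOGY[predicate]
    return (ENTITY_TYPES.get(subject) == domain
            and ENTITY_TYPES.get(obj) == rng)
```

A triple like ("Apple", "acquired", "$3.0 billion") is rejected because its object type violates the predicate's range, even though every token appears in a plausible filing.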

Ontologies, serving as structured knowledge frameworks, can be constructed through two primary approaches: manual curation and automated generation. Manual Ontology construction involves expert knowledge engineers defining concepts and relationships, offering high precision but proving time-consuming and potentially lacking scalability. Conversely, Automatic Ontology generation leverages text analysis techniques to dynamically create ontologies from unstructured data, enabling rapid development and adaptation to evolving knowledge domains. However, automatically generated ontologies may exhibit lower precision than their manually crafted counterparts, requiring careful evaluation and refinement. The choice between these methods depends on the specific application requirements, balancing the need for accuracy against the demands of speed and scalability.

Verification of extracted knowledge relies on techniques such as Regex Matching and utilizing Large Language Models (LLMs) as evaluators. Regex Matching establishes entity existence through pattern recognition within the source text, providing a rule-based confirmation. LLM-as-a-Judge assesses the Faithfulness of extracted triples – that is, whether the relationship expressed in the triple is logically supported by the source text – offering a more nuanced evaluation of semantic correctness. These methods function as critical checks against both Subject Hallucination (SH) and Relation Hallucination (RH) by independently confirming the validity of entities and relationships before they are incorporated into the knowledge base.
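A minimal sketch of such a hybrid verifier, assuming the LLM judge is any caller-supplied callable returning True or False (stubbed here rather than a real model call):

```python
import re

def regex_supported(entity, source):
    """Rule-based check: does the entity's surface form appear verbatim?"""
    return re.search(re.escape(entity), source, re.IGNORECASE) is not None

def verify_subject(subject, source, llm_judge=None):
    """Hybrid check: accept on a regex hit, otherwise defer to an
    LLM judge (any callable; a stand-in, not a real API)."""
    if regex_supported(subject, source):
        return True
    return llm_judge(subject, source) if llm_judge else False

source = "Apple Inc. acquired Beats Electronics."
# A stand-in judge that resolves one known alias (assumption for the demo).
judge = lambda subj, src: subj == "the Company"
```

The cheap regex pass handles the common case; the judge is only consulted on misses, which keeps LLM calls, and their cost, to a minimum.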

Ontology Conformance (OC) is a quantifiable metric used to assess the alignment between extracted knowledge and a pre-defined ontology, effectively gauging knowledge quality. In our research, the automatic ontology strategy, denoted 𝒪Auto, consistently achieved 100% Ontology Conformance across all experimental configurations. This indicates that the information extracted by our system, when utilizing 𝒪Auto, fully adheres to the established schema and constraints defined within the ontology, providing a high degree of confidence in the validity and structure of the extracted knowledge.
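The metric itself reduces to a simple ratio. A sketch, assuming the schema is given as a set of allowed predicates (the triples and figures below are illustrative, not from the paper):

```python
def ontology_conformance(triples, allowed_predicates):
    """Fraction of extracted triples whose predicate belongs to the schema."""
    if not triples:
        return 1.0
    valid = sum(1 for _, pred, _ in triples if pred in allowed_predicates)
    return valid / len(triples)

schema = {"acquired", "reported_revenue"}
triples = [
    ("Apple", "acquired", "Beats Electronics"),
    ("Apple", "reported_revenue", "$383 billion"),  # illustrative figure
    ("Apple", "synergized_with", "Beats"),          # off-schema predicate
]
score = ontology_conformance(triples, schema)  # 2 of 3 triples conform
```

Under this definition, 100% OC means every extracted predicate was drawn from the induced schema, which is what constrained generation against 𝒪Auto achieves.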

Research findings indicate zero instances of Subject Hallucination (SH) were observed when utilizing the 𝒪Auto ontology strategy in conjunction with model ℳ1, demonstrating a high degree of faithfulness in extracted subject information. Furthermore, across all evaluated models, the implementation of 𝒪Auto resulted in 0% Relation Hallucination (RH), signifying substantial improvement in adherence to the defined knowledge schema and a reduction in the generation of unsupported relationships between entities. These results collectively suggest that the 𝒪Auto strategy effectively constrains the output of knowledge extraction models, promoting the generation of verifiable and schema-consistent triplets.

The hybrid verification method demonstrated a substantial reduction in falsely identified Subject Hallucinations (SH) compared to a strict regular expression (Regex) baseline. This improvement stems from the method’s ability to leverage multiple verification signals, allowing it to more accurately discern true positives from false positives. Specifically, the hybrid approach corrected a majority of the false positives generated by the Regex baseline, indicating a more robust and nuanced evaluation of subject entity validity. This enhanced performance suggests the hybrid method offers a more reliable approach to mitigating SH and improving the overall faithfulness of extracted knowledge triples.
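The correction described above can be illustrated with a toy comparison, in which a strict surface match flags a filing-style alias as a hallucination and the hybrid check recovers it. The alias "the registrant" and the judge's knowledge of it are hypothetical:

```python
import re

source = ("Apple Inc. acquired Beats Electronics. "
          "The Company paid $3.0 billion.")

def strict_regex(subject):
    # Baseline: case-sensitive surface match only.
    return re.search(re.escape(subject), source) is not None

def hybrid(subject, judge):
    # Regex first; on a miss, ask a semantic judge about aliases/coreference.
    return strict_regex(subject) or judge(subject, source)

# Toy judge that knows filings call the issuer "the registrant" (assumption).
judge = lambda subj, src: subj == "the registrant"

candidates = ["Apple Inc.", "the registrant", "Alphabet"]
strict_flags = [s for s in candidates if not strict_regex(s)]
hybrid_flags = [s for s in candidates if not hybrid(s, judge)]
# Strict regex flags two subjects as SH; hybrid clears the alias,
# leaving only the genuinely unsupported "Alphabet" flagged.
```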

The pursuit of knowledge graph construction from financial reports, as detailed in the study, necessitates a ruthless prioritization of information. The core challenge lies not in adding complexity through expansive schema induction, but in distilling financial data into fundamental, verifiable triplets. As Brian Kernighan observed, “Complexity is vanity.” This sentiment perfectly encapsulates the study’s focus on minimizing hallucination and maximizing faithfulness. The research actively seeks to remove extraneous data and potential errors, aligning with the principle that true intelligence resides in simplicity, not in elaborate, potentially misleading constructions. The study demonstrates that a focus on essential relationships, rather than exhaustive detail, yields a more robust and reliable knowledge representation.

Future Directions

The pursuit of automated knowledge graph construction from financial text, as demonstrated, reveals less a solved problem and more a refinement of existing challenges. The minimization of ‘hallucination’ (a polite term for confabulation) remains paramount. Current schema adaptation, while promising, operates as damage control, not prevention. The field must confront the underlying issue: Large Language Models are, at their core, stochastic parrots. Faithfulness metrics offer post-hoc assessment, but proactive constraint, forcing the model to reason within verifiable boundaries, is the logical, if difficult, extension.

Ontology induction, currently treated as a black box, demands increased transparency. The model doesn’t merely discover relationships; it imposes them, guided by internal biases and training data artifacts. Future work should prioritize methods for auditing this inductive process – identifying why a particular triple was extracted, not simply that it was. Unnecessary complexity in this area is violence against attention; simpler, more interpretable models will ultimately prove more valuable.

The long-term trajectory suggests a shift from purely data-driven extraction to a hybrid approach. Integrating symbolic reasoning with Large Language Models (a controlled synthesis of statistical power and logical rigor) offers a potential, though arduous, path. Density of meaning is the new minimalism; the goal is not more data, but more truth distilled from it.


Original article: https://arxiv.org/pdf/2602.11886.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
