DOI: 10.24818/jamis.2026.01003
Vol. 25, No. 1, pp. 70-105, 2026
© 2026. This work is openly licensed via CC BY 4.0.
Author(s): Florina G. Hutter (a) and Juliane Wutzler (1, b)
a Hasso Plattner Institute, Germany
b Worms University of Applied Sciences, Germany
1 Corresponding author: Juliane Wutzler, Area of Tourism and Transportation, Worms University of Applied Sciences, Erenburgerstraße 19, 67549 Worms, Germany, Tel: +49 6241 509118, email address: wutzler@hs-worms.de.
Keywords: generative artificial intelligence, large language models, prompting, data extraction, accounting systems
JEL codes: M49
Abstract
Research Question: Can Large Language Models (LLMs) be used as low-cost tools to efficiently and effectively extract data from heterogeneous sources fed into accounting systems and processes?
Motivation: Accounting departments use a variety of data from a wide range of sources and feed them as inputs into their accounting systems and processes. Extracting such data often requires manual effort. Large Language Models may be a low-cost way to extract data without substantial upfront investments. Prior literature documents the potential of LLMs for data extraction in other domains or for long and largely semantic accounting documents. While the prompting guidelines from this literature may be transferable to semantic data feeding into accounting systems, such as order emails, they are likely not directly transferable to non-semantic, semi-structured data sources, such as invoices.
Idea: In our proof-of-concept, we test whether general prompting guidelines from prior literature apply to both non-semantic but semi-structured data sources (i.e., invoices) and semantic but unstructured data sources (i.e., order emails) used as inputs into the accounting system. We then identify the issues that arise when applying these guidelines and derive guidelines for practical use.
Data: We created a synthetic dataset consisting of 46 heterogeneous PDF invoices and 10 order emails in Outlook format. Synthetic data allow challenging variations to be included explicitly and eliminate data privacy concerns.
Tools: Following design science research, we test and improve LLM (Mixtral-8x7B) prompts for Named Entity Recognition derived from prior literature to establish accounting-specific prompt guidelines.
Findings: Using Large Language Models to extract data that can be used as inputs into accounting processes requires case-specific adjustments to general prompting guidelines derived from the literature. We develop solutions to problems resulting from general prompting guidance and define transferable strategies for creating prompts that allow the extraction of data from semantic and non-semantic accounting data sources.
Contribution: We provide guidance on how LLMs can extract data for use in accounting systems. The data sources we examine differ substantially from those in prior literature.
