Fine-tuning LLMs for answer set programming
Erica Coppolillo et al.
Abstract
Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of natural language processing tasks, including code generation. While substantial progress has been made in adapting LLMs to generate code for various imperative programming languages, their effectiveness in handling declarative paradigms, such as Answer Set Programming (ASP), remains largely underexplored. This paper takes a step toward bridging that gap by investigating the potential of LLMs for ASP code generation. We begin with a systematic evaluation of several foundational LLMs, progressing to state-of-the-art models. We show that, despite their extensive training, large parameter counts, and significant computational backing, older models perform poorly at generating syntactically and semantically correct ASP programs, whereas the most recent ones achieve impressive results. To avoid the need for such computational power, however, we introduce LLASP, a fine-tuned, lightweight model specifically trained to encode ASP programs. We extensively explore the effectiveness of fine-tuning by curating several dedicated datasets suitable for ASP encoding with increasing levels of complexity. First, we show that LLASP is effective in encoding template-based core problems in ASP; second, that the training strategy can be pushed further to remove the need for templating and make the generation prompt-invariant; and lastly, that even complex problems beyond the core tasks can be encoded effectively. Experimental results also show that LLASP significantly outperforms both its non-fine-tuned counterparts and most general-purpose LLMs, particularly in terms of semantic correctness, achieving a good trade-off between accuracy and resource efficiency. Experimental code is publicly available at: https://github.com/EricaCoppolillo/LLASP .
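For readers unfamiliar with the paradigm, the kind of declarative program targeted by ASP code generation can be illustrated with a standard guess-and-check encoding of graph 3-coloring (an illustrative textbook example, not a program taken from the paper or its datasets):

```
% Facts describing an example graph instance.
node(1). node(2). node(3).
edge(1,2). edge(2,3).
color(r). color(g). color(b).

% Guess: assign exactly one color to each node.
1 { assign(N,C) : color(C) } 1 :- node(N).

% Check: adjacent nodes must not share a color.
:- edge(N,M), assign(N,C), assign(M,C).
```

Unlike imperative code, such a program states what a solution must satisfy rather than how to compute it; an ASP solver (e.g., clingo) then enumerates the answer sets, which correspond to valid colorings. Generating this guess-and-check structure correctly is precisely where syntactic and semantic correctness of LLM output is evaluated.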