Grammar-obeying program synthesis: A novel approach using large language models and many-objective genetic programming

Research output: Contribution to a Journal (Peer & Non Peer)Articlepeer-review

5 Citations (Scopus)

Abstract

Program synthesis is an important challenge that has attracted significant research interest, especially in recent years with advancements in Large Language Models (LLMs). Although LLMs have demonstrated success in program synthesis, there remains a lack of trust in the generated code due to documented risks (e.g., code with known and risky vulnerabilities). Therefore, it is important to restrict the search space and avoid bad programs. In this work, pre-defined restricted Backus–Naur Form (BNF) grammars are utilised, which are considered ‘safe’, and the focus is on identifying the most effective technique for grammar-obeying program synthesis, where the generated code must be correct and conform to the predefined grammar. It is shown that while LLMs perform well in generating correct programs, they often fail to produce code that adheres to the grammar. To address this, a novel Similarity-Based Many-Objective Grammar Guided Genetic Programming (SBMaOG3P) approach is proposed, leveraging the programs generated by LLMs in two ways: (i) as seeds following a grammar mapping process and (ii) as targets for similarity measure objectives. Experiments on a well-known and widely used program synthesis dataset indicate that the proposed approach successfully improves the rate of grammar-obeying program synthesis compared to various LLMs and the state-of-the-art Grammar-Guided Genetic Programming. Additionally, the proposed approach significantly improved the solution in terms of the best fitness value of each run for 21 out of 28 problems compared to G3P.

Original languageEnglish
Article number103938
JournalComputer Standards and Interfaces
Volume92
DOIs
Publication statusPublished - Mar 2025

Keywords

  • Grammar
  • Grammar guided GP
  • LLMs
  • Multi-objective
  • Program synthesis

Fingerprint

Dive into the research topics of 'Grammar-obeying program synthesis: A novel approach using large language models and many-objective genetic programming'. Together they form a unique fingerprint.

Cite this