Title:
Discrete prompt optimization using genetic algorithm for secure Python code generation.
Authors:
Tony, Catherine (1) (AUTHOR) catherine.tony@tuhh.de; Pintor, Maura (2) (AUTHOR) maura.pintor@unica.it; Kretschmann, Max (1) (AUTHOR) max.kretschmann@tuhh.de; Scandariato, Riccardo (1) (AUTHOR) riccardo.scandariato@tuhh.de
Source:
Journal of Systems & Software. Feb 2026, Vol. 232, pN.PAG-N.PAG. 1p.
Database:
Business Source Elite

Further Information

Highlights:
• A discrete prompt optimization pipeline leveraging a genetic algorithm, equipped with security-focused scoring and mutation functions for secure code generation.
• Introduces two new security-specific, LLM-assisted prompt mutation techniques, called self-guided and feedback-guided mutation, to generate more security-oriented variations of code generation prompts.
• The security-specific mutation techniques led to prompts with richer security cues than prompts mutated using only generic techniques such as paraphrasing, back translation, and cloze transformation.
• Prompts optimized using a combination of security-specific and generic mutation techniques led to the greatest improvement in the security of the LLM-generated code.
• Prompts optimized on one LLM showed a lack of transferability to other models, indicating the importance of model-specific optimization.

Abstract:
Large language models (LLMs) have become powerful tools that enable novice developers to generate production-level code. However, research has highlighted the security risks associated with such code generation, due to the high volume of vulnerabilities in the generated software. Recent studies have explored various techniques for automatically optimizing prompts to elicit desired responses from LLMs. Among these methods, Genetic Algorithms (GAs), which search for optimal solutions by evolving an initial population of candidates through iterative mutations, have gained attention as a lightweight and effective prompt optimization approach that requires neither large datasets nor access to model weights. However, their potential has not yet been examined in the context of secure code generation. In this paper, we use a GA to develop a discrete prompt optimization pipeline specifically designed for secure code generation. We introduce two domain-specific prompt mutation techniques and assess how incorporating these security-focused mutations alongside general-purpose techniques, such as back translation and paraphrasing, affects the security of Python code generated by LLMs. Results demonstrate that our security-specific mutation techniques led to prompts with richer security context compared to the generic mutation techniques. Furthermore, combining these techniques with generic mutations substantially reduced the number of security weaknesses in the LLM-generated code. We also observed that prompts optimized for a particular LLM tend to perform best on that same model, highlighting the importance of model-specific prompt optimization. [ABSTRACT FROM AUTHOR]
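To make the evolutionary loop described in the abstract concrete, below is a minimal Python sketch of a GA-based prompt optimization pipeline with security-focused scoring and the two security-specific mutation operators alongside a generic one. It is an illustration under assumed interfaces, not the authors' implementation: the llm.generate and analyzer.scan calls, the mutation instructions, the population size, and the weakness-counting fitness function are all hypothetical.

```python
import random

POP_SIZE = 8        # candidate prompts per generation (assumed value)
GENERATIONS = 10    # number of evolution rounds (assumed value)

def score_security(prompt, llm, analyzer):
    """Security-focused fitness: generate code from the prompt and count
    the weaknesses a static analyzer flags (fewer findings = higher score)."""
    code = llm.generate(prompt)
    return -len(analyzer.scan(code))

def mutate_self_guided(prompt, llm, analyzer):
    """Security-specific mutation: the LLM itself rewrites the prompt to
    carry explicit security cues. (analyzer unused here.)"""
    return llm.generate(
        "Rewrite this code-generation prompt so the resulting code "
        f"follows secure coding practices:\n{prompt}")

def mutate_feedback_guided(prompt, llm, analyzer):
    """Security-specific mutation: analyzer findings on the current
    output are fed back into the prompt rewrite."""
    findings = analyzer.scan(llm.generate(prompt))
    return llm.generate(
        "Rewrite this prompt so the generated code avoids these "
        f"weaknesses: {findings}\n{prompt}")

def mutate_paraphrase(prompt, llm, analyzer):
    """Generic mutation: plain paraphrase, no security guidance.
    (analyzer unused here.)"""
    return llm.generate(f"Paraphrase this prompt:\n{prompt}")

def optimize(seed_prompt, llm, analyzer):
    """Evolve a population of prompts, keeping the most secure candidates."""
    population = [seed_prompt] * POP_SIZE
    operators = [mutate_self_guided, mutate_feedback_guided, mutate_paraphrase]
    for _ in range(GENERATIONS):
        # Mutate each candidate with a randomly chosen operator.
        children = [random.choice(operators)(p, llm, analyzer)
                    for p in population]
        # Elitist selection over the combined parent and child pool.
        pool = population + children
        pool.sort(key=lambda p: score_security(p, llm, analyzer), reverse=True)
        population = pool[:POP_SIZE]
    return population[0]

if __name__ == "__main__":
    # Trivial stubs so the sketch runs end to end; real use would plug in
    # an actual LLM client and a security analyzer such as Bandit or CodeQL.
    class StubLLM:
        def generate(self, prompt):
            return f"# generated for: {prompt[:40]}"
    class StubAnalyzer:
        def scan(self, code):
            return []  # pretend no weaknesses were found
    print(optimize("Write a Python login handler.", StubLLM(), StubAnalyzer()))
```

Elitist selection over the merged parent and child pool is just one simple design choice for the sketch; the paper's actual scoring, selection, and mutation-scheduling strategies may differ.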

Copyright of Journal of Systems & Software is the property of Elsevier B.V. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)