Refactoring Python Code with LLM-Based Multi-Agent Systems: An Empirical Study in ML Software Projects
Refactoring is essential for improving software maintainability, yet it often remains a validation-intensive, developer-guided task, particularly in Python projects shaped by fast-paced experimentation and iterative workflows, as is common in the machine learning (ML) domain. Recent advances in large language models (LLMs) have opened new possibilities for automating refactoring, but many existing approaches rely on single-model prompting and lack structured coordination or task specialization. This study presents an empirical evaluation of a modular LLM-based multi-agent system (LLM-MAS), orchestrated through the MetaGPT framework, which enables sequential coordination and reproducible communication among specialized agents for static analysis, refactoring strategy planning, and code transformation. The system was applied to 1,719 Python files drawn from open-source ML repositories, and its outputs were compared against both the original and the human-refactored versions using eight static metrics related to complexity, modularity, and code size. Results show that the system consistently produces more compact and modular code, with measurable reductions in function length and structural complexity. However, the absence of a validation agent led to 281 syntactically invalid outputs, reinforcing the importance of incorporating semantic and syntactic verification to ensure transformation correctness and build trust in automated refactoring. These findings highlight the potential of LLM-based multi-agent systems to automate structural code improvements and establish a foundation for future domain-aware refactoring in ML software.
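
To make the described pipeline concrete, the following is a minimal, illustrative sketch of a sequential three-agent refactoring flow (static analysis, strategy planning, code transformation) followed by the kind of syntactic validation step the study identifies as missing. It is not the authors' MetaGPT implementation: the agent classes, the `query_llm` helper, and the prompts are hypothetical placeholders, and only Python's standard `ast` module is used for analysis and validation.

```python
# Illustrative sketch of a sequential LLM-MAS refactoring pipeline (assumed design,
# not the authors' MetaGPT code). query_llm() is a hypothetical backend call.
import ast
from dataclasses import dataclass


def query_llm(prompt: str) -> str:
    """Placeholder for a call to whichever LLM backend the system uses."""
    raise NotImplementedError


@dataclass
class Message:
    sender: str
    content: str


class StaticAnalysisAgent:
    def run(self, source: str) -> Message:
        # Collect simple structural facts that downstream agents can reason over.
        tree = ast.parse(source)
        funcs = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
        longest = max((n.end_lineno - n.lineno + 1 for n in funcs), default=0)
        return Message("analysis", f"{len(funcs)} functions; longest spans {longest} lines")


class PlanningAgent:
    def run(self, source: str, analysis: Message) -> Message:
        plan = query_llm(
            "Propose a refactoring strategy for this Python file.\n"
            f"Static analysis: {analysis.content}\n\n{source}"
        )
        return Message("plan", plan)


class TransformationAgent:
    def run(self, source: str, plan: Message) -> Message:
        code = query_llm(
            f"Apply this refactoring plan and return only code.\n{plan.content}\n\n{source}"
        )
        return Message("code", code)


def validate_syntax(code: str) -> bool:
    """The verification step the study found missing: reject outputs that do not parse."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


def refactor(source: str) -> str:
    # Sequential hand-off: analysis -> plan -> transformation -> validation.
    analysis = StaticAnalysisAgent().run(source)
    plan = PlanningAgent().run(source, analysis)
    result = TransformationAgent().run(source, plan)
    # Fall back to the original file if the transformed output is syntactically invalid,
    # avoiding the 281 invalid outputs reported in the study.
    return result.content if validate_syntax(result.content) else source
```

In this sketch, the validation check simply discards invalid transformations; a dedicated validation agent could instead feed the parse error back to the transformation agent for another attempt.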