Journal Article

An Empirical Study of Code Smells in Transformer-based Code Generation Techniques

Authors

Mohammed Latif Siddiq, Shafayat Hossain Majumder, Maisha R. Mim, Sourov Jajodia, …

Author Affiliations

University of Notre Dame, Bangladesh University of Engineering and Technology

Year: 2022

Citations: 58

DOI: 10.1109/scam55253.2022.00014

Abstract

Prior work has developed transformer-based language learning models to automatically generate source code for a task without compilation errors. The datasets used to train these techniques include samples from open-source projects, which may not be free of security flaws, code smells, and violations of standard coding practices. Therefore, we investigate to what extent code smells are present in the datasets of code generation techniques and verify whether they leak into the output of these techniques. To conduct this study, we used Pylint and Bandit to detect code smells and security smells in three widely used training sets (CodeXGlue, APPS, and Code Clippy). We observed that Pylint caught 264 code smell types, whereas Bandit located 44 security smell types in…
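As an illustration of the kind of tally the abstract describes (counting distinct smell types reported by Pylint and Bandit), the following is a minimal sketch and not the authors' pipeline. It assumes the reports were produced with `pylint --output-format=json` and `bandit -f json`, and that each Pylint message carries a `symbol` field and each Bandit finding a `test_id` field. The function names and the sample reports are hypothetical and hand-written; they are not data from the paper.

```python
import json
from collections import Counter


def count_pylint_smells(pylint_json: str) -> Counter:
    """Tally Pylint messages by symbol (e.g. 'unused-import')."""
    messages = json.loads(pylint_json)  # Pylint's JSON report is a list of messages
    return Counter(msg["symbol"] for msg in messages)


def count_bandit_smells(bandit_json: str) -> Counter:
    """Tally Bandit findings by test ID (e.g. 'B101')."""
    report = json.loads(bandit_json)  # Bandit's JSON report keeps findings under 'results'
    return Counter(result["test_id"] for result in report.get("results", []))


# Hypothetical reports for illustration only (assumed shape, not real results).
SAMPLE_PYLINT = json.dumps([
    {"type": "warning", "symbol": "unused-import", "message-id": "W0611", "line": 1},
    {"type": "convention", "symbol": "missing-module-docstring", "message-id": "C0114", "line": 1},
    {"type": "warning", "symbol": "unused-import", "message-id": "W0611", "line": 2},
])
SAMPLE_BANDIT = json.dumps({
    "results": [
        {"test_id": "B101", "test_name": "assert_used", "line_number": 4},
        {"test_id": "B105", "test_name": "hardcoded_password_string", "line_number": 7},
    ]
})

pylint_counts = count_pylint_smells(SAMPLE_PYLINT)
bandit_counts = count_bandit_smells(SAMPLE_BANDIT)

# The number of distinct keys is the number of smell *types*, which is the
# quantity the abstract reports; the values are per-type occurrence counts.
print(len(pylint_counts), "distinct Pylint smell types")
print(len(bandit_counts), "distinct Bandit smell types")
```

Counting distinct keys separately from occurrences keeps the "how many types" question apart from the "how often" question, so the same counters could in principle be compared between a training set and generated output.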

Fields & Keywords

Physical Sciences; Computer Science; Information Systems; Software Engineering Research; Advanced Malware Detection Techniques; Software Reliability and Analysis Research; Computer security; Programming language; Electrical engineering; Statistics