Security Weaknesses of Copilot Generated Code in GitHub
arXiv (2023)
Abstract
Modern code generation tools, utilizing AI models like Large Language Models
(LLMs), have gained popularity for producing functional code. However, their
usage presents security challenges, often resulting in insecure code being
merged into the codebase. Evaluating the quality of generated code, especially its
security, is crucial. While prior research explored various aspects of code
generation, the focus on security has been limited, mostly examining code
produced in controlled environments rather than real-world scenarios. To
address this gap, we conducted an empirical study, analyzing code snippets
generated by GitHub Copilot from GitHub projects. Our analysis identified 452
snippets generated by Copilot, revealing a high likelihood of security issues,
with 32.8% of the snippets affected. These issues
span 38 different Common Weakness Enumeration (CWE) categories, including
significant ones like CWE-330: Use of Insufficiently Random Values, CWE-78: OS
Command Injection, and CWE-94: Improper Control of Generation of Code. Notably,
eight CWEs are among the 2023 CWE Top-25, highlighting their severity. Our
findings confirm that developers should be careful when adding code generated
by Copilot and should run appropriate security checks as they accept the
suggested code. They also show that practitioners should cultivate the
corresponding security awareness and skills.
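To illustrate the kind of weakness the study flags, here is a hypothetical Python sketch of CWE-330 (Use of Insufficiently Random Values) together with a safer alternative. It is not taken from the paper's dataset; the function names are invented for illustration:

```python
import random
import secrets
import string

ALPHABET = string.ascii_letters + string.digits

def weak_token(length: int = 16) -> str:
    # CWE-330: the random module uses a Mersenne Twister PRNG,
    # whose output is predictable and unsuitable for secrets
    # such as session tokens or password-reset codes.
    return "".join(random.choices(ALPHABET, k=length))

def strong_token(length: int = 16) -> str:
    # Safer: the secrets module draws from the OS's
    # cryptographically secure random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

A security check of the kind the paper recommends (e.g. a static analyzer) would typically flag the first function and accept the second.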