Veracode: 45% of AI-generated code snippets contain security flaws

Serge Bulaev

Recent research suggests that 45% of AI-generated code snippets may contain security flaws. While AI tools can help teams create features quickly, proving that the code is safe and reliable often takes much longer. Some studies show that AI assistance leads to more completed tasks without lowering code quality, but others report more bugs and duplicated code. Experts recommend treating all AI-generated code as untrusted and using careful review and testing to find problems. Many teams are now adding security checks early, having experts review risky changes, and being careful about where and how they use AI in critical systems.

Recent studies show that many AI-generated code snippets contain security flaws, which creates a new bottleneck for development teams. Large language models (LLMs) can generate code in minutes, but proving that the code is safe and reliable can take significantly longer, so the time saved on generation is often spent on manual assurance.

Scale of the security gap

Studies of AI-assisted coding report conflicting quality outcomes. One analysis of repositories adopting AI assistants found that higher AI adoption correlated with more duplicated lines, less refactoring, and increased code churn (GitClear). In contrast, industry reports from major corporate Copilot rollouts indicate significant increases in completed tasks with no observable decline in code quality. These mixed signals suggest that outcomes depend heavily on a team's specific workflow and review processes.

Verification: the new bottleneck in AI-assisted development

Experts advise treating all AI-generated code as untrusted by default. A secure, evidence-driven review process involves several key stages:

  • Define clear requirements and identify security-critical paths before generating code.
  • Automate static analysis (SAST) and dependency scans for all AI-generated outputs.
  • Manually prioritize the review of high-risk logic, such as authentication, authorization, and data handling.
  • Require verifiable proof, like executable tests or logs for edge cases, instead of simple explanations.
  • Maintain detailed logs of prompts, model versions, and review decisions for auditability.
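
As an illustration of the last step, an audit entry could be captured in a small structured record. The following is a minimal sketch, not a standard format: the `ReviewRecord` schema, its field names, and the helper functions are all hypothetical and would need to be adapted to a team's own tooling.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ReviewRecord:
    """One audit entry for a piece of AI-generated code (hypothetical schema)."""
    prompt: str
    model_version: str
    generated_code: str
    reviewer: str
    decision: str  # for example "approved", "rejected", "needs-changes"
    timestamp: str


def make_record(prompt, model_version, generated_code, reviewer, decision, now=None):
    """Build a record, stamping it with the current UTC time unless `now` is given."""
    now = now or datetime.now(timezone.utc)
    return ReviewRecord(prompt, model_version, generated_code,
                        reviewer, decision, now.isoformat())


def to_log_line(record):
    """Serialize a record as one JSON line, adding a SHA-256 hash of the code
    so a later audit can check that the reviewed code matches what shipped."""
    payload = asdict(record)
    payload["code_sha256"] = hashlib.sha256(
        record.generated_code.encode("utf-8")
    ).hexdigest()
    return json.dumps(payload, sort_keys=True)
```

Appending each line to a write-once log gives reviewers a way to trace a deployed change back to the prompt, model version, and decision that produced it.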

The underlying principle is to demand verifiable evidence of security: code that merely compiles should not be trusted until it has passed rigorous misuse-case testing.
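
To make the idea of misuse-case testing concrete, the sketch below checks a path-handling helper against inputs an attacker might supply. It is an illustrative example only: the function `resolve_upload_path`, the list `MISUSE_CASES`, and the directory `/srv/uploads` are assumptions chosen for the example, and the code assumes a POSIX-style filesystem.

```python
from pathlib import Path


def resolve_upload_path(base_dir: str, user_supplied: str) -> Path:
    """Resolve a user-supplied relative path, rejecting anything outside base_dir."""
    base = Path(base_dir).resolve()
    candidate = (base / user_supplied).resolve()
    try:
        # relative_to raises ValueError when candidate is not under base.
        candidate.relative_to(base)
    except ValueError:
        raise ValueError("path escapes the upload directory") from None
    return candidate


# Hypothetical hostile inputs: each one tries to leave the upload directory.
MISUSE_CASES = [
    "../secrets.txt",
    "../../etc/passwd",
    "/etc/passwd",
    "reports/../../outside.txt",
]


def run_misuse_cases(base_dir: str) -> list:
    """Return the hostile inputs that were wrongly accepted (empty means all rejected)."""
    accepted = []
    for case in MISUSE_CASES:
        try:
            resolve_upload_path(base_dir, case)
        except ValueError:
            continue
        accepted.append(case)
    return accepted
```

A reviewer can then ask for the result of `run_misuse_cases("/srv/uploads")` as evidence, rather than accepting a prose explanation that the helper "handles traversal".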

Productivity findings require context

The impact of AI assistants on developer productivity is not straightforward. While they reduce time spent typing, overall throughput gains vary. For instance, industry reports observe significant rises in weekly commits and substantial jumps in compile counts, suggesting faster iteration cycles. However, other studies found little change in overall cycle time alongside notable increases in bugs when AI-generated code was not thoroughly checked.

This suggests that short-term speed-ups can be offset by long-term maintenance costs from insecure or duplicated code. Ultimately, the depth of code verification, not the speed of generation, is the key factor determining net productivity gains.

What teams are doing now

To mitigate these risks, leading development teams are adopting several key strategies:

  1. Integrating security scans directly into the IDE to catch flaws early in the development lifecycle.
  2. Implementing mandatory secondary reviews by a domain expert for high-risk code changes.
  3. Archiving complete prompt-and-response histories to support auditing and incident analysis.
  4. Restricting the use of AI assistants for critical modules like cryptography and payments until robust testing protocols are established.
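
Strategies 2 and 4 can be enforced mechanically, for example as a pre-merge check. The sketch below is one possible shape for such a gate, under stated assumptions: the path patterns, the `ai_assisted` flag, and the three outcome labels are hypothetical, and a real team would define its own.

```python
from fnmatch import fnmatchcase

# Hypothetical path patterns; note that fnmatch's "*" also matches "/".
RESTRICTED_PATTERNS = ["src/crypto/*", "src/payments/*"]
HIGH_RISK_PATTERNS = ["src/auth/*", "src/data/*"]


def matches_any(path, patterns):
    """True if the path matches at least one shell-style pattern."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def review_requirement(changed_paths, ai_assisted):
    """Decide what a change needs before merging.

    Returns "blocked" for AI-assisted changes to restricted modules,
    "domain-expert-review" for any change touching high-risk or restricted
    paths, and "standard-review" otherwise.
    """
    touches_restricted = any(
        matches_any(p, RESTRICTED_PATTERNS) for p in changed_paths
    )
    touches_high_risk = any(
        matches_any(p, HIGH_RISK_PATTERNS) for p in changed_paths
    )
    if ai_assisted and touches_restricted:
        return "blocked"
    if touches_restricted or touches_high_risk:
        return "domain-expert-review"
    return "standard-review"
```

Keeping the patterns in version control means that loosening a restriction, once testing protocols for a module are in place, is itself a reviewed change.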

These practices all underscore a central theme: generating code has become easy, but proving that it is secure remains difficult. As AI coding tools continue to evolve, verification will likely remain the primary bottleneck. Overcoming it requires a strategic investment in both automated security gates and disciplined, expert human oversight.