OpenAI uses Codex AI to migrate 600 petabytes in two months

OpenAI reportedly used Codex AI to help move 600 petabytes of data in two months in early 2025. Engineers used Codex agents to break down the migration into many small, checkable steps, which may have saved months of manual work. The process still needed humans to supply missing details, supervise, and approve actions, since the AI could miss important context or make mistakes. Reports suggest this hybrid approach lets teams automate more, but there may still be challenges like code errors or cost overruns if teams rely only on the AI.

In one large project, OpenAI used an internal AI data agent with Codex-based enrichment to make more than 70,000 datasets and 600 petabytes of data searchable. According to internal reports, OpenAI engineers used AI agents to automate many data management tasks, shortening a timeline that would otherwise have required extensive manual effort. The project reflects a growing trend in data platform management: using large language models to break complex data operations into small, verifiable, automatable steps.

AI-powered data management: making massive datasets searchable

OpenAI used AI agents to automate much of its data management work. The agents opened numerous pull requests that scripted, reviewed, and executed granular tasks such as re-pointing data access layers and orchestration configurations. This broke the project into small, verifiable steps, each supervised by humans.

The process followed OpenAI's published playbook, a five-step AI adoption framework: Align, Activate, Amplify, Accelerate, and Govern. According to OpenAI's guidance, success depends more on high-quality context and strict validation than on raw code output. In practice, AI agents generated many pull requests to update task definitions and data access layers. A release agent then staged these changes, monitored performance metrics, and flagged the changes for human approval. Forbes coverage notes that a similar agent-based system manages Apache Spark updates, creating incremental changes and halting automatically if telemetry deviates from the norm.
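
The stage, monitor, and halt loop described above can be illustrated with a minimal Python sketch. It is not OpenAI's release agent: the names Change, within_tolerance, and staged_rollout, the 5% tolerance, and the idea of comparing metrics against a fixed baseline are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Change:
    """One small, reviewable change (a hypothetical stand-in for a pull request)."""
    name: str


def within_tolerance(baseline: Dict[str, float],
                     observed: Dict[str, float],
                     tolerance: float) -> bool:
    """Return True if every baseline metric is present and close to its baseline."""
    for metric, expected in baseline.items():
        value = observed.get(metric)
        if value is None:
            return False  # missing telemetry is treated as a deviation
        # Relative deviation; fall back to absolute when the baseline is 0.
        scale = abs(expected) or 1.0
        if abs(value - expected) / scale > tolerance:
            return False
    return True


def staged_rollout(changes: List[Change],
                   apply_change: Callable[[Change], None],
                   read_metrics: Callable[[], Dict[str, float]],
                   baseline: Dict[str, float],
                   tolerance: float = 0.05) -> Dict[str, object]:
    """Apply changes one at a time and stop at the first telemetry deviation.

    Changes that pass are only queued for human approval; nothing here
    approves a change automatically.
    """
    awaiting_approval: List[str] = []
    for change in changes:
        apply_change(change)
        if not within_tolerance(baseline, read_metrics(), tolerance):
            # Halt immediately; later changes are never applied.
            return {"awaiting_approval": awaiting_approval, "halted_on": change.name}
        awaiting_approval.append(change.name)
    return {"awaiting_approval": awaiting_approval, "halted_on": None}
```

In this sketch the caller supplies apply_change and read_metrics, so the same loop could sit in front of any deployment tool or metrics source. Halting on missing telemetry is a deliberately conservative choice: an agent that cannot see a metric has no evidence that the change is safe.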

Technical obstacles and how teams mitigated them

The team encountered several significant technical challenges:
- Managing legacy features without direct equivalents in new environments.
- Detecting silent failures where AI-generated code appeared correct but failed on edge cases.
- Overcoming siloed metadata that hindered AI's ability to understand pipeline dependencies.
- Managing rollback risks during large data operations.
- Handling region-specific compliance checks that required manual intervention.

To mitigate these risks, OpenAI emphasized clear, practical governance that avoided unnecessary manual compliance reviews, and it built strict validation, shadow testing, and acceptance thresholds into the rollout. Because models can produce syntactically valid but functionally incorrect code, engineers kept live parity dashboards running throughout the process to verify data integrity.
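
As a rough illustration of shadow testing against an acceptance threshold, the Python sketch below compares the output of a legacy job with the output of its AI-generated replacement, keyed by record ID. The function names parity_report and accept, the dictionary-of-tuples data shape, and the default threshold of 1.0 (exact parity) are assumptions for this example, not details from OpenAI's tooling.

```python
from typing import Dict, List, Tuple

Row = Tuple[object, ...]


def parity_report(legacy_rows: Dict[str, Row],
                  shadow_rows: Dict[str, Row]) -> Dict[str, object]:
    """Compare two outputs keyed by record ID and summarize the differences."""
    keys = set(legacy_rows) | set(shadow_rows)
    mismatched: List[str] = sorted(
        key for key in keys
        if key not in legacy_rows          # extra record in the shadow output
        or key not in shadow_rows          # record dropped by the shadow job
        or legacy_rows[key] != shadow_rows[key]  # same key, different values
    )
    total = len(keys)
    match_rate = 1.0 if total == 0 else (total - len(mismatched)) / total
    return {"total": total, "mismatched": mismatched, "match_rate": match_rate}


def accept(report: Dict[str, object], threshold: float = 1.0) -> bool:
    """Pass only if the match rate meets the acceptance threshold."""
    return report["match_rate"] >= threshold


# Example: the shadow job turns a NULL into an empty string, a silent
# edge-case failure that still produces a row of the right shape.
legacy = {"a": (1, "x"), "b": (2, "y"), "c": (3, None)}
shadow = {"a": (1, "x"), "b": (2, "y"), "c": (3, "")}
report = parity_report(legacy, shadow)
print(report["mismatched"], accept(report))
```

A report like this is the kind of figure a parity dashboard could plot over time. The list of mismatched keys matters as much as the rate, because it gives a human reviewer specific records to inspect.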

Why the model needed human supervision

Although AI significantly accelerated the data management process, human supervision remained critical. OpenAI's own materials emphasize that the AI relies on complete metadata and clearly defined ownership to prevent regressions. Team leads confirmed that engineers had to manually annotate pipelines where data lineage was unclear, giving the AI agent the context it needed to proceed safely. The project underscores that data operations assisted by large language models (LLMs) are a hybrid effort: the AI automates implementation, while humans provide context, validate actions, and authorize key decisions such as rollbacks.
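
One way to picture this division of labor is a gate that lets an agent touch a pipeline only when its metadata is complete and routes everything else to a human for annotation. The Python sketch below is hypothetical: the Pipeline fields, the missing_context and triage helpers, and the choice of owner and upstream lineage as the required metadata are assumptions, not a description of OpenAI's system.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Pipeline:
    """Minimal, hypothetical metadata record for one data pipeline."""
    name: str
    owner: Optional[str] = None
    # None means lineage is unknown; [] means a known source with no upstream.
    upstream: Optional[List[str]] = None


def missing_context(pipeline: Pipeline) -> List[str]:
    """List the metadata gaps that should block automated changes."""
    gaps = []
    if not pipeline.owner:
        gaps.append("owner")
    if pipeline.upstream is None:
        gaps.append("lineage")
    return gaps


def triage(pipelines: List[Pipeline]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split pipelines into agent-ready ones and ones needing human annotation."""
    ready: List[str] = []
    needs_annotation: Dict[str, List[str]] = {}
    for pipeline in pipelines:
        gaps = missing_context(pipeline)
        if gaps:
            needs_annotation[pipeline.name] = gaps
        else:
            ready.append(pipeline.name)
    return ready, needs_annotation


pipelines = [
    Pipeline("events_daily", owner="data-platform", upstream=["raw_events"]),
    Pipeline("legacy_export", owner="finance"),       # lineage unknown
    Pipeline("orphan_job", upstream=[]),              # no owner recorded
]
ready, needs_annotation = triage(pipelines)
print(ready, needs_annotation)
```

Distinguishing an unknown lineage (None) from a known empty one (an empty list) keeps the gate from blocking genuine source pipelines while still stopping the agent wherever a human has not yet confirmed the dependencies.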

Broader context

This data management approach reflects broader industry trends. While LLM automation promises to compress data operation timelines significantly, experts caution against treating AI output as infallible, citing risks of downtime and cost overruns. OpenAI's success was rooted in a methodology that acknowledges these risks, incorporating incremental changes, constant human-in-the-loop feedback, and persistent project auditing to ensure a reliable outcome.