Meta Acquires 49% of Scale AI for $14.8 Billion to Boost LLM Data

Serge Bulaev

Meta has bought a 49% non-voting stake in Scale AI for $14.8 billion, which may help Meta get more data for its AI models. This deal gives Meta special access to Scale's data and talent, including Scale's founder, Alexandr Wang, who now leads Meta's Superintelligence Labs. Some reports suggest that other AI companies like OpenAI and Google might stop using Scale and switch to other data providers, which could help Scale's rivals. The deal appears to avoid full antitrust review, but regulators might still look into how Meta uses its new position. It is unclear if this move will give Meta a big advantage or change how other companies get their training data.

Meta agreed to buy a 49% non-voting stake in Scale AI for about $14.8 billion to strengthen its AI and data capabilities, while Scale remains independent. The unusual non-voting structure, detailed in a June 2025 Reuters report, gives Meta privileged access to the data-labeling leader and brings founder Alexandr Wang into its executive ranks. It also appears designed to sidestep a full antitrust review.

What Meta Wanted Most: Data, Feedback, and Talent

Meta acquired a 49% stake in Scale AI to secure a stable supply of high-quality, human-curated training data. This strategic investment aims to bolster its Reinforcement Learning from Human Feedback (RLHF) capabilities, which are critical for developing more advanced large language models and staying competitive.

To remain at the forefront of LLM research, Meta identified "human curated training data at massive scale" as a critical capability, especially for reinforcement learning from human feedback (RLHF) workflows. Scale's reputation for high-quality labeling and evaluation pipelines directly addresses this need. The deal provides Meta with:

  • Guaranteed access to the high-volume data labeling capacity that other frontier labs rely on.
  • An industry-proven RLHF stack to integrate into its existing model-training infrastructure.
  • Key senior talent, led by Wang, who now serves as Chief AI Officer and head of Meta Superintelligence Labs.

Wang's Remit One Year In

As head of Meta's Superintelligence Labs, Alexandr Wang now directs the company's frontier model development. His first major project, Muse Spark, is a proprietary foundational model designed for tight integration across Meta's product ecosystem, including Facebook, Instagram, and Ray-Ban Meta glasses. In public appearances, Wang continues to frame Meta's long-term goal as achieving "personal superintelligence."

Wang's reported focus includes accelerating Meta's next model cycles after Llama, embedding multimodal models into Meta products, and applying those models to health-related queries and applications. The sources do not, however, establish consumer health advice as a declared primary focus area.

Competitive Ripple Effects in the Data-Labeling Market

The investment immediately triggered competitive shifts in the data-labeling market. Citing concerns over neutrality, major AI labs including OpenAI, Google, and xAI reportedly paused or reduced their reliance on Scale AI. According to reports in Business Insider and Computerworld, this created a significant market opening for independent providers.

Key beneficiaries of this market disruption include:

  • Surge AI: A specialist in RLHF and high-quality NLP data.
  • Appen: A generalist provider whose CEO described the event as a "huge opportunity."
  • Labelbox and Mercor: Platform-focused competitors reporting a significant increase in client inquiries.

This shift indicates a growing market preference for data vendors that can guarantee independence from major tech giants while offering sophisticated RLHF tooling.

Regulatory and Ecosystem Watchpoints

The deal's structure could still draw regulatory scrutiny. While the 49% stake was likely designed to avoid a mandatory merger probe, regulators may still investigate the partnership's competitive implications. Reuters framed the investment as a "latest test of AI partnerships," highlighting concerns that Meta could gain privileged access to competitors' data workflows through its influence over Scale.

For the broader AI ecosystem, the key questions remain: Will leading AI labs permanently shift their data contracts away from Scale? And will Meta's vertical integration strategy yield a measurable advantage in its upcoming LLM benchmarks? Industry observers are closely tracking model performance, the commercial adoption of Muse Spark, and any regulatory responses that could redefine the boundaries of strategic investments in critical AI infrastructure.