Meta Updates AI Content Rules, Emphasizes Labeling Over Takedowns

Serge Bulaev

Meta is changing how it handles AI-made content on Facebook, Instagram, and Threads. Instead of deleting these posts, Meta will now label them "Made with AI" to help people know what is real and what is not. This matters especially in a big election year, because fake images, videos, and voices could trick voters. Meta hopes the labels, plus down-ranking of false posts, will curb confusion, but some worry that people might ignore too many labels or that fake content will still slip through.

Meta is updating its AI content rules across Facebook, Instagram, and Threads, shifting from removing manipulated media to labeling it. This major policy change, which took effect in May 2024, introduces "Made with AI" labels for synthetic content. The move prioritizes transparency over takedowns, a crucial distinction in a year with over 50 national elections, where experts fear AI-driven misinformation could sway voters.

From Takedowns to Transparency: The Key Policy Shifts

Meta's new AI content policy shifts from automatically removing manipulated media to applying "Made with AI" labels. This change, driven by user feedback and Oversight Board recommendations, prioritizes transparency. Content is now only removed if it also violates other community standards like voter interference or harassment.

The policy overhaul was driven by stakeholder feedback, including the Oversight Board's conclusion that the previous 2020 rule was too narrow. After consulting 120 organizations across 34 countries, Meta expanded the policy beyond manipulated speech to cover AI-generated photos, audio, and video depicting people doing things they never did. Citing survey data showing that 82% of users prefer labels over removal, Meta made transparency the default, as detailed in its announcement on about.fb.com. If independent fact-checkers rate content "False" or "Altered," it receives a more prominent overlay and is demoted in users' feeds.

Challenges in Detection, Trust, and 'Label Fatigue'

Experts warn of a persistent arms race between deepfake generators and detection tools. While automated detectors struggle to keep pace with new synthesis methods, human detection is even less reliable, especially with video. Meta's policy relies on a multi-layered approach to identification:

  • Technical detection via C2PA metadata and invisible watermarks (see the sketch after this list)
  • Creator self-disclosure through platform tools
  • Manual review from nearly 100 fact-checking partners
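
To make the first layer concrete, here is a minimal Python sketch that heuristically checks whether an image file carries a C2PA/JUMBF manifest. It only scans for the tell-tale byte signatures; real verification must parse the JUMBF box structure and validate the manifest's cryptographic signature against the C2PA specification. The `has_c2pa_manifest` helper is our own illustration, not Meta's detection code.

```python
# Heuristic C2PA presence check: a minimal sketch, not full validation.
# Real pipelines parse the JUMBF boxes and verify the manifest signature;
# this only spots the byte signatures that C2PA tooling embeds.

def has_c2pa_manifest(path: str) -> bool:
    """Return True if the file appears to contain a C2PA/JUMBF payload."""
    with open(path, "rb") as f:
        data = f.read()
    # "jumb" is the JUMBF superbox type; "c2pa" labels the manifest store.
    return b"jumb" in data and b"c2pa" in data

if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        status = "C2PA manifest found" if has_c2pa_manifest(path) else "no manifest"
        print(f"{path}: {status}")
```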

However, any delay in this process could allow deceptive content to go viral before being labeled. Furthermore, policy experts raise concerns about 'label fatigue,' where users may begin to ignore the constant "Made with AI" warnings, a risk analyzed by Tech Policy Press (techpolicy.press).


What exactly is changing in Meta's AI-content rules?

Starting in May 2024, Meta replaced its 2020 "remove first" rule for deepfakes with a label-first model. Videos, photos, and audio flagged as AI-made now stay online but carry a visible "Made with AI" or "Imagined with AI" label. Content is removed only if it also violates a separate policy such as voter interference or harassment.

How does Meta decide which posts get an AI label?

The company uses a three-step check (sketched in code after the list):
1. Industry-shared metadata or watermarks embedded by tools like DALL-E
2. User self-disclosure when they upload AI-made ads or organic posts
3. Fact-checker review: when one of nearly 100 independent partners rates something "False" or "Altered," Meta down-ranks the post and adds an overlay instead of deleting it
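
As a hedged sketch of how those three signals might combine, the Python below encodes the decision order Meta describes publicly: other policy violations still trigger removal, fact-checker ratings trigger an overlay and down-ranking, and metadata or self-disclosure triggers a label. The `Post` fields and the `moderate` function are illustrative assumptions; Meta has not published its internal logic.

```python
# Illustrative model of the label-first flow described above.
# Field names and return strings are assumptions for demonstration only.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Post:
    has_ai_metadata: bool                     # step 1: C2PA/IPTC metadata or watermark hit
    creator_disclosed_ai: bool                # step 2: uploader self-disclosure
    fact_check_rating: Optional[str] = None   # step 3: "False", "Altered", or None
    violates_other_policy: bool = False       # e.g. voter interference, harassment

def moderate(post: Post) -> str:
    if post.violates_other_policy:
        return "remove"                       # takedowns still apply to other violations
    if post.fact_check_rating in ("False", "Altered"):
        return "prominent overlay + down-rank"
    if post.has_ai_metadata or post.creator_disclosed_ai:
        return 'label: "Made with AI"'
    return "no action"

print(moderate(Post(has_ai_metadata=True, creator_disclosed_ai=False)))
# prints: label: "Made with AI"
```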

Why did Meta shift from takedowns to labels?

Meta's own Oversight Board said the old policy was "too restrictive" and risked silencing legitimate speech. Public surveys across multiple countries showed 82% of respondents wanted context labels rather than removal, prompting the change.

Will these rules really protect elections?

Key safeguards include:
- Prominent warning labels for any AI content that could deceive voters on "matters of public importance"
- Special election integrity teams that can still remove posts if they break other rules
- Cryptographically signed C2PA metadata baked into images from major creation tools, making it harder for fakes to slip through (illustrated in the sketch below)

Still, the system is reactive: content spreads instantly while detection can take hours of forensic work, and legal remedies only apply after harm is done.
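
To see why signed provenance is tamper-evident, consider the simplified sketch below: the manifest records a hash of the asset bytes at creation time, so any later edit breaks verification. Real C2PA manifests sign "hard bindings" over specific byte ranges rather than one whole-file SHA-256; the `tamper_evident` helper illustrates the hashing idea only.

```python
# Simplified view of C2PA-style hash binding. Assumption: we model the
# manifest as a single whole-file SHA-256; real manifests sign hard
# bindings over asset byte ranges with a COSE signature.
import hashlib

def tamper_evident(asset_bytes: bytes, manifest_hash_hex: str) -> bool:
    """Return True if the asset still matches the hash recorded at creation."""
    return hashlib.sha256(asset_bytes).hexdigest() == manifest_hash_hex

original = b"...image bytes..."
recorded = hashlib.sha256(original).hexdigest()       # written when the image is created

assert tamper_evident(original, recorded)             # untouched asset verifies
assert not tamper_evident(original + b"!", recorded)  # any edit breaks the binding
print("hash binding demo passed")
```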

How do Meta's rules compare to other major platforms?

OpenAI already bans custom chatbots that impersonate candidates and plans to cryptographically sign every DALL-E image. Apple and X say they will "consider" similar detection tags, but no universal standard exists yet. Industry observers warn that the "liar's dividend" (where real footage is dismissed as fake) may grow until every major network adopts the same tamper-evident metadata chain.