NIST Updates AI Risk Framework, Targets Third-Party Model Security

Serge Bulaev

The updated NIST AI Risk Management Framework now includes third-party AI models in its inventory and classification step, so outside models should be vetted as carefully as in-house systems. Hidden security risks such as covert telemetry can sometimes be surfaced with specialized model-level and hardware-level checks. Regulations and policies, including the EU AI Act and Treasury Department guidelines, push organizations to monitor, review, and control these models regularly, especially those affecting customer decisions. Because legal protection for AI outputs remains weak, contracts increasingly prohibit model copying and grant audit rights when problems arise. Technical safeguards like sandboxing and encryption can help keep foreign AI models contained while letting companies continue to adopt new tools.

The updated NIST AI Risk Management Framework now targets third-party AI model security, signaling that external code must be audited with the same rigor as in-house systems. As enterprises integrate foreign large language models into core workflows, they face rising legal, compliance, and security risks. One example is the bypass of geographic restrictions, which creates significant blind spots for security teams.

A practical review of current red flags, controls, and contractual levers follows.

Covert telemetry and location checks

Key risks in using third-party AI models stem from hidden security vulnerabilities like covert telemetry, failure to comply with new regulations such as the EU AI Act, and intellectual property theft via model distillation. These threats necessitate a multi-layered mitigation strategy combining technical, contractual, and governance controls.

Hidden telemetry is moving from speculation to observable pattern. Microsoft researchers reported that backdoored models exhibit a distinctive "Double Peak" attention signature when a secret trigger fires. Their latent-space scanner shakes the model with adversarial perturbations until that brittle path becomes visible, then kills execution. For teams without in-house interpretability experts, hardware-level anomaly detection offers a fallback: one study projects telemetry into a principal-component space and raises an alert when Mahalanobis distance spikes past the 99th percentile. These methods sit well in a sandbox that blocks outbound traffic until the model passes inspection.
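The hardware-level fallback described above can be sketched in a few lines. This is a minimal illustration, not the cited study's implementation: it assumes NumPy is available, that telemetry arrives as fixed-length numeric vectors, and that a clean baseline has been recorded in the sandbox. The function names, the choice of two principal components, and the synthetic baseline are all hypothetical.

```python
import numpy as np

def fit_detector(baseline, n_components=2, percentile=99.0):
    """Fit a PCA projection on clean telemetry and derive an alert threshold."""
    baseline = np.asarray(baseline, dtype=float)
    mean = baseline.mean(axis=0)
    centered = baseline - mean
    # Eigen-decomposition of the covariance matrix gives the principal axes.
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]
    components, variances = eigvecs[:, order], eigvals[order]
    # In PCA space the covariance is diagonal, so the Mahalanobis distance
    # reduces to a variance-scaled Euclidean distance on the scores.
    scores = centered @ components
    distances = np.sqrt((scores ** 2 / variances).sum(axis=1))
    return {
        "mean": mean,
        "components": components,
        "variances": variances,
        "threshold": float(np.percentile(distances, percentile)),
    }

def mahalanobis_score(detector, sample):
    """Distance of one telemetry vector from the baseline, in PCA space."""
    scores = (np.asarray(sample, dtype=float) - detector["mean"]) @ detector["components"]
    return float(np.sqrt((scores ** 2 / detector["variances"]).sum()))

def is_anomalous(detector, sample):
    """Alert when the score exceeds the baseline's percentile threshold."""
    return mahalanobis_score(detector, sample) > detector["threshold"]

# Synthetic stand-in for baseline telemetry (assumption: 4 counters per sample).
rng = np.random.default_rng(0)
baseline = rng.normal(size=(2000, 4))
detector = fit_detector(baseline)
```

One design caveat: deviations that fall entirely along discarded components do not register, so the number of retained components trades noise suppression against coverage.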

Continuous governance frameworks

Voluntary guidance is converging with regulation. The NIST Generative AI Profile (NIST.AI.600-1) recommends continuous monitoring and drift detection using various metrics, including Kolmogorov-Smirnov (KS) tests, but does not prescribe a specific monitoring architecture. On the regulatory side, the EU AI Act entered into force in August 2024; obligations for general-purpose AI models began applying in August 2025, and most high-risk system requirements, including conformity assessments, are scheduled to apply from August 2026. Sector rules add more depth: financial institutions face evolving regulatory guidance that includes third-party risk considerations for AI systems.
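As an illustration of KS-based drift detection, the sketch below compares a reference sample of some model metric (for example, output confidence scores) against a live sample. It is a standard-library-only approximation, not something NIST prescribes: the function names and the 0.05 significance level are assumptions, and the critical value uses the usual large-sample formula, which is rough for small samples.

```python
import bisect
import math

def ks_statistic(reference, live):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    ref, cur = sorted(reference), sorted(live)
    n, m = len(ref), len(cur)
    d = 0.0
    # The gap between two step functions peaks at an observed data point.
    for x in ref + cur:
        f_ref = bisect.bisect_right(ref, x) / n
        f_cur = bisect.bisect_right(cur, x) / m
        d = max(d, abs(f_ref - f_cur))
    return d

def drift_detected(reference, live, alpha=0.05):
    """Flag drift when the KS statistic exceeds the asymptotic critical value."""
    n, m = len(reference), len(live)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, live) > critical

# Hypothetical metric samples: a stable window and a shifted one.
reference = [i / 100 for i in range(100)]
shifted = [x + 1.0 for x in reference]
```

A production monitor would more likely call a vetted statistics library and correct for repeated testing across many metrics. The sketch only shows the comparison itself.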

According to industry reports, enterprises are running a growing number of GenAI applications. This suggests inventories must expand beyond formal procurements to capture "shadow AI" tools adopted by individual teams. Risk tiering becomes essential: low-impact chatbots may merit quarterly reviews, while any model that touches customer credit decisions needs real-time drift dashboards.
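A risk-tiered inventory of this kind can be represented as a small data structure. The sketch below is purely illustrative: the record fields, tier names, and the monthly interval for the middle tier are assumptions, and only the quarterly and real-time cadences come from the text above.

```python
from dataclasses import dataclass
from datetime import timedelta

@dataclass(frozen=True)
class ModelRecord:
    """One inventory entry; field names are hypothetical."""
    name: str
    vendor: str
    third_party: bool
    affects_customer_decisions: bool
    formally_procured: bool

def risk_tier(record):
    """Assign a tier: customer-impacting models first, then shadow AI."""
    if record.affects_customer_decisions:
        return "high"
    if record.third_party and not record.formally_procured:
        return "medium"  # shadow AI adopted outside formal procurement
    return "low"

# timedelta(0) stands for continuous (real-time) monitoring.
REVIEW_INTERVAL = {
    "high": timedelta(0),
    "medium": timedelta(days=30),  # assumed cadence, not from the source
    "low": timedelta(days=90),     # quarterly review
}

inventory = [
    ModelRecord("credit-scorer", "VendorA", True, True, True),
    ModelRecord("team-summarizer", "VendorB", True, False, False),
    ModelRecord("faq-chatbot", "VendorC", True, False, True),
]
schedule = {r.name: REVIEW_INTERVAL[risk_tier(r)] for r in inventory}
```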

Contractual safeguards against distillation and data leakage

Copyright protection for model outputs remains weak. Legal scholars note that distillation extracts functional patterns, not expressive content, so breach-of-contract claims are the primary recourse. Major vendors already embed anti-distillation language: Anthropic prohibits users from utilizing inputs and outputs to train other AI models for commercial or competing purposes and reserves the right to terminate accounts. OpenAI uses similar wording that prohibits using their service to develop competing models.

Policy discussions around AI security continue to evolve. Enterprises procuring offshore models now face questions about export-control exposure and vendor provenance. Clauses worth adding to master service agreements include:

  • Audit rights covering source weights and training data
  • Explicit bans on distillation, model scraping, or competing development
  • Notification triggers for breaches that expose inference logs or user prompts
  • Termination rights tied to regulatory non-compliance (EU AI Act, sector RMFs)

Technical deployment controls

Sandboxing remains the first line of defense. Teams can route initial calls through a controlled egress layer that strips location metadata and enforces rate limits. Watermarking methods that rely on packet timing alone can help spot rogue instances even when traffic is encrypted. For production, homomorphic encryption combined with federated learning allows limited analytics on encrypted data, which preserves confidentiality and can support checks that models operate only inside approved regions.
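The controlled egress layer can be approximated as a small pre-processing step in front of the model client. The following is a minimal sketch under stated assumptions: the list of location-bearing field names is hypothetical, payloads are assumed to be JSON-like dictionaries, and the sliding-window limiter is one of several reasonable rate-limiting choices.

```python
import time
from collections import deque

# Hypothetical field names; adjust to what your client actually sends.
LOCATION_FIELDS = {"ip", "geo", "latitude", "longitude", "timezone", "locale"}

def strip_location(payload):
    """Return a copy of the payload with location-bearing keys removed."""
    if isinstance(payload, dict):
        return {
            key: strip_location(value)
            for key, value in payload.items()
            if str(key).lower() not in LOCATION_FIELDS
        }
    if isinstance(payload, list):
        return [strip_location(item) for item in payload]
    return payload

class SlidingWindowLimiter:
    """Allow at most max_calls within any window_seconds interval."""

    def __init__(self, max_calls, window_seconds):
        self.max_calls = max_calls
        self.window = window_seconds
        self.calls = deque()

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True

def prepare_outbound(payload, limiter, now=None):
    """Gate a request: enforce the rate limit, then scrub location metadata."""
    if not limiter.allow(now):
        raise RuntimeError("rate limit exceeded")
    return strip_location(payload)
```

In practice this logic would sit in a proxy or gateway rather than in application code, so that no call can reach the external model without passing through it.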

By combining architectural scans for sleeper agents, framework-aligned monitoring, and contracts that anticipate IP misuse, security and legal officers gain an integrated playbook for vetting foreign AI suppliers without blocking innovation.


What specific updates did NIST make to address third-party AI model risks?

NIST has expanded its AI Risk Management Framework with guidance that specifically targets risks from vendor-supplied models and large language models. The framework now recommends that organizations inventory and classify all AI systems, explicitly including vendor-provided ones, shadow AI tools, and embedded ML components. For technical controls, NIST provides guidance on continuous monitoring and drift detection using various metrics, plus considerations for vendor models that change over time.


Why is third-party model security now considered a national security priority?

Concern centers on systematic capability extraction from frontier AI systems. Policy discussions on this topic have intensified, and various legislative proposals under consideration would strengthen protections against unauthorized model replication and misuse. Industry leaders have specifically asked legislators to strengthen export controls so foreign adversaries cannot harvest and repackage U.S. frontier AI capabilities.


What contractual protections should enterprises include when procuring external AI models?

Given that copyright law offers little protection against model distillation, contractual clauses have become the primary enforcement mechanism. Enterprises should require:

  • Explicit anti-distillation language prohibiting use of outputs to train competing models (following Anthropic's and OpenAI's lead)
  • Clear documentation of data-handling practices, consent mechanisms, and limitations on data reuse
  • Audit rights for technical verification of model provenance
  • Vendor attestation of holistic controls, ideally with AI-focused addenda to SOC 2 reports

Many enterprises now run significant numbers of GenAI applications, making centralized vendor scrutiny essential.


How can technical teams detect hidden telemetry or unauthorized location checks in third-party models?

Engineering teams should implement proactive architectural auditing rather than reactive behavioral monitoring:

Detection Method | Purpose | Key Technique
Double Peak Pattern Analysis | Detect sleeper agents/backdoors | Latent Adversarial Training on attention heads
Sparse Autoencoders (SAEs) | Secret elicitation | TF-IDF-style weighting on residual activations
Packet Timing Analysis | Imposter detection | Sequence-of-length metadata extraction
Homomorphic Encryption | Location verification | CKKS scheme for math on encrypted data

Microsoft's research found that backdoored models exhibit a distinct "Double Peak" attention pattern in internal attention heads when they encounter secret triggers, which enables real-time "neural firewall" detection.


What governance framework should enterprises adopt to manage third-party AI risks comprehensively?

Organizations should align with converging standards:

  1. NIST AI Risk Management Framework - foundational voluntary base with generative AI guidance for vendor-specific considerations
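  2. EU AI Act - regulatory mandate that entered into force in August 2024, with general-purpose AI model obligations applying from August 2025 and most high-risk system requirements, including conformity assessments, scheduled to apply from August 2026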
  3. ISO/IEC 42001 - international standard for AI Management Systems providing frameworks for documenting third-party model behaviors, training data sources, and known limitations

Financial institutions face additional evolving regulatory guidance that includes dedicated third-party risk considerations for AI systems.