Azure AI workloads protected with RHEL confidential virtual machines
In this blog, we will look at how to protect Azure AI workloads with Red Hat Enterprise Linux (RHEL) confidential virtual machines.
Artificial Intelligence (AI) is reshaping numerous sectors—including healthcare, finance, national defense, and autonomous technologies. As AI continues to expand its footprint, many organizations are migrating AI operations to the cloud to benefit from its elasticity, speed of deployment, and lower costs. However, this shift introduces new concerns around privacy, proprietary information protection, and regulatory compliance.
Limitations of Traditional Virtual Machines
While virtual machines (VMs) offer isolation between workloads, they fall short when it comes to protecting sensitive data from privileged system components. System administrators, hypervisors, or anyone with physical access to the hardware can potentially expose sensitive data, even when it’s processed in a VM. This is especially risky for AI applications that manage private information like patient records, financial transactions, or proprietary models.
Addressing the Security Gap with Confidential Virtual Machines
Confidential Virtual Machines (CVMs) help close this security loophole. Built on hardware-based memory encryption and isolation technologies, CVMs ensure that even cloud service providers cannot access sensitive in-memory data. This makes CVMs ideal for securely running AI models in cloud environments.
This article examines how CVMs enhance the security of AI workloads in public cloud settings, using a fraud detection scenario with sensitive financial data as a practical example.
To learn more, visit: [Red Hat Confidential Virtual Machines](https://www.redhat.com)
What Are Confidential Virtual Machines?
CVMs are enhanced VMs built on confidential computing technologies. Unlike standard VMs, which rely primarily on software-level isolation and trust in the cloud infrastructure, CVMs use hardware-level safeguards to ensure data confidentiality and workload integrity.
Confidential computing specifically focuses on safeguarding data while it’s actively being processed in memory—a state known as data in use.
There are three states of data protection:
- Data at rest: Saved on storage media, typically safeguarded through encryption solutions such as LUKS (Linux Unified Key Setup).
- Data in transit: Moving through networks, secured via protocols such as TLS (commonly implemented with libraries like OpenSSL).
- Data in use: Actively residing in system memory and being handled by the CPU; this is the state that confidential computing aims to protect.
Even if storage and network layers are encrypted, an attacker who gains access to memory could compromise unprotected data in use. Confidential computing solves this by ensuring that even memory is protected.
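To make the three states concrete, here is a minimal Python sketch. It is illustrative only: the cryptography and requests libraries and the endpoint URL are assumptions, not something RHEL or Azure prescribes.

```python
# A minimal sketch of the three data states; the cryptography and
# requests libraries and the endpoint URL are illustrative assumptions.
from cryptography.fernet import Fernet
import requests

key = Fernet.generate_key()  # in practice, keys live in a key-management system
f = Fernet(key)

# Data at rest: only ciphertext touches the disk (LUKS plays this role
# for whole block devices).
with open("dataset.enc", "wb") as fh:
    fh.write(f.encrypt(b"card=4111...;amount=250.00"))

# Data in transit: TLS protects the payload on the wire.
with open("dataset.enc", "rb") as fh:
    requests.post("https://storage.example.com/upload", data=fh.read())

# Data in use: to process the data, plaintext must exist in memory.
# This is the state that confidential computing protects.
with open("dataset.enc", "rb") as fh:
    plaintext = f.decrypt(fh.read())  # unencrypted in RAM from here on
```

Notice that the first two states are covered by ordinary tooling; only the final step, where the plaintext sits in RAM, is left exposed without confidential computing.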
Trusted Execution Environments (TEEs)
Confidential Virtual Machines (CVMs) operate within Trusted Execution Environments (TEEs)—dedicated hardware-based zones designed to offer memory encryption, ensure runtime integrity, and support attestation mechanisms. The encryption keys used are managed by the CPU and cannot be accessed externally, ensuring that no third party, including hypervisors or cloud admins, can view or manipulate the data.
Widely used hardware technologies for confidential computing include **AMD SEV-SNP** and **Intel TDX**.
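As a quick aside, you can check from inside a Linux guest whether such a TEE is active. A minimal sketch, assuming a current kernel that exposes the standard SNP/TDX guest device nodes:

```python
# A minimal sketch: detect, from inside a Linux guest, whether a
# confidential-computing guest driver is loaded. /dev/sev-guest and
# /dev/tdx_guest are the device nodes current kernels expose.
import os

def detect_tee() -> str:
    if os.path.exists("/dev/sev-guest"):
        return "AMD SEV-SNP guest"
    if os.path.exists("/dev/tdx_guest"):
        return "Intel TDX guest"
    return "no TEE guest device found"

print(detect_tee())
```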
Comparing Isolation and Confidentiality
We can break down three different types of isolation:
- Workload-to-workload isolation: Provided by traditional VMs to separate workloads from each other.
- Workload-to-host isolation: Traditional VMs protect the host from potentially malicious workloads.
- Host-to-workload isolation: Unique to CVMs, this protects sensitive workloads from being accessed or tampered with by the underlying infrastructure.
CVMs make it possible for organizations to confidently run sensitive AI workloads in shared cloud environments, aligning with regulatory frameworks like the Digital Operational Resilience Act (DORA).
AI Workloads and the Public Cloud
Public cloud platforms are becoming the default environment for AI development due to their computational capabilities, access to GPUs, and scalable storage options. Utilizing them for AI offers several major advantages:
- Rapid prototyping and scaling: Instantly test and iterate on AI models without investing in physical infrastructure.
- Cost efficiency: On-demand resources and auto-scaling features reduce operational costs.
- Resiliency: Geographic redundancy and disaster recovery tools built into cloud platforms.
However, public clouds present a critical challenge—you don’t own the hardware. This raises the risk of privileged users or malicious actors exploiting memory-level vulnerabilities.
Confidential computing mitigates this risk by encrypting data while in use, ensuring secure execution even on shared cloud infrastructure.
Deploying Confidential Virtual Machines on Microsoft Azure with Red Hat Enterprise Linux
Starting with RHEL 9.6, Red Hat supports confidential virtual machines on Microsoft Azure, making it easy to deploy CVMs via the Azure Marketplace.
Key features include:
- Comprehensive support for Red Hat's Unified Kernel Image (UKI), including features like FIPS compliance and kdump functionality.
- Works seamlessly with both AMD SEV-SNP and Intel TDX technologies.
- Includes native support for the Trustee attestation client.
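For a rough idea of what deploying such a CVM looks like programmatically, here is a sketch using the Azure SDK for Python (azure-mgmt-compute). The resource group, network interface, credentials, and image reference are placeholders; look up the actual RHEL CVM image URN in the Azure Marketplace.

```python
# A sketch of creating a confidential VM with the Azure Python SDK.
# All names in angle brackets are placeholders; Standard_DC2as_v5 is
# one of Azure's AMD SEV-SNP-capable confidential VM sizes.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

client = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

poller = client.virtual_machines.begin_create_or_update(
    "<resource-group>",
    "rhel-cvm-demo",
    {
        "location": "westeurope",
        "properties": {
            "hardwareProfile": {"vmSize": "Standard_DC2as_v5"},
            # This block is what distinguishes a CVM from a standard VM.
            "securityProfile": {
                "securityType": "ConfidentialVM",
                "uefiSettings": {"secureBootEnabled": True, "vTpmEnabled": True},
            },
            "storageProfile": {
                "imageReference": {  # placeholder: use the RHEL CVM Marketplace URN
                    "publisher": "RedHat",
                    "offer": "<rhel-cvm-offer>",
                    "sku": "<sku>",
                    "version": "latest",
                },
                "osDisk": {
                    "createOption": "FromImage",
                    "managedDisk": {
                        "securityProfile": {"securityEncryptionType": "VMGuestStateOnly"}
                    },
                },
            },
            "osProfile": {
                "computerName": "rhel-cvm-demo",
                "adminUsername": "azureuser",
                "adminPassword": "<admin-password>",
            },
            "networkProfile": {"networkInterfaces": [{"id": "<nic-resource-id>"}]},
        },
    },
)
vm = poller.result()  # blocks until provisioning completes
```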
CVMs on Azure use Trustee, an attestation framework from the Confidential Containers project under CNCF, to validate trust in the computing environment before unlocking sensitive data.
Understanding Trustee: Attestation in Action
Trustee is a remote attestation system responsible for confirming that a CVM is running in a trusted state before allowing access to sensitive information. It consists of several components:
- Attestation Agent (AA): Runs inside the CVM and gathers integrity evidence.
- Attestation Service (AS): Verifies the trust evidence presented by the CVM.
- Key Broker Service (KBS): Delivers secrets to the CVM once the AS confirms the evidence is valid.
These components interact to ensure that decryption keys or sensitive data are only shared with trusted, verified environments.
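The flow can be modeled conceptually like this. The following Python sketch is not Trustee's real API; the class and method names are invented to show how the three components gate secret release on verification.

```python
# Conceptual model of the Trustee attestation flow (names invented).
class AttestationAgent:
    """Runs inside the CVM; gathers hardware-signed evidence."""
    def collect_evidence(self) -> dict:
        # In reality: a signed SNP/TDX report carrying launch measurements.
        return {"platform": "snp", "measurement": "abc123", "signature": "..."}

class AttestationService:
    """Verifies evidence against known-good reference values."""
    def verify(self, evidence: dict) -> bool:
        return evidence["measurement"] == "abc123"  # expected reference value

class KeyBrokerService:
    """Releases secrets only after the AS approves the evidence."""
    def __init__(self, attestation_service: AttestationService):
        self.attestation_service = attestation_service
        self.secrets = {"model-key": b"decryption-key-bytes"}

    def request_secret(self, evidence: dict, secret_id: str) -> bytes:
        if not self.attestation_service.verify(evidence):
            raise PermissionError("attestation failed: secret withheld")
        return self.secrets[secret_id]

# The CVM's agent asks the KBS for a key; the KBS consults the AS first.
kbs = KeyBrokerService(AttestationService())
key = kbs.request_secret(AttestationAgent().collect_evidence(), "model-key")
```

The key design point is that the KBS never hands out a secret on trust alone: every release is conditioned on a fresh, hardware-rooted verdict from the AS.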
AI Use Cases Secured by Trustee
- Securing AI Models: Trustee ensures that AI models are only accessed within a trusted CVM. Before decrypting or loading the model, the CVM must pass attestation by providing verified measurements (e.g., kernel, firmware checksums).
- Securing Training and Inference Data: Training and inference data remain encrypted and are only unlocked following a successful attestation process. This ensures that only verified CVMs can access and process the data.
- Interactive Inference Scenarios: For real-time inference, a secure connection is established post-attestation, allowing the user to send data to the AI model running inside a verified CVM.
Demonstration: Protecting Fraud Detection Inference Data
To illustrate, consider a use case involving a fraud detection AI model:
- Model Function: Analyze transactions based on geolocation, amount, and method of payment.
- Environment: RHEL CVM running on Azure.
Workflow (a simplified code sketch follows these steps):
1. Credit card information is encrypted within a trusted environment.
2. The encryption keys are stored in Trustee.
3. The encrypted datasets are uploaded to secure cloud storage (Azure Blob, AWS S3).
4. The CVM downloads the datasets but cannot decrypt them yet.
5. The CVM initiates an attestation process with the Trustee service.
6. Upon successful verification, Trustee releases the decryption key.
7. The data is decrypted in memory and used by the model for inference tasks.
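Here is a condensed Python sketch of that workflow. The storage connection string and KBS details are placeholders, and the attestation step is represented by a hypothetical helper; on RHEL, the Trustee attestation client performs the actual evidence exchange.

```python
# Sketch of the fraud detection workflow; placeholders are marked inline.
from cryptography.fernet import Fernet
from azure.storage.blob import BlobClient

# --- Data-owner side: encrypt the dataset; the key is registered in Trustee ---
key = Fernet.generate_key()  # this key would be stored in the Trustee KBS
ciphertext = Fernet(key).encrypt(b"txn_id,geo,amount,method\n1,DE,250.00,card")
BlobClient.from_connection_string(
    "<storage-connection-string>", "datasets", "transactions.enc"
).upload_blob(ciphertext)

# --- CVM side: download the data, attest, then decrypt only in memory ---
encrypted = BlobClient.from_connection_string(
    "<storage-connection-string>", "datasets", "transactions.enc"
).download_blob().readall()

def fetch_key_after_attestation(kbs_url: str, resource: str) -> bytes:
    """Hypothetical helper standing in for the Trustee attestation client.
    The KBS releases the key only if the CVM's measurements verify."""
    raise NotImplementedError("performed by the Trustee attestation client")

key = fetch_key_after_attestation("https://kbs.example.com", "fraud-model/dataset-key")
records = Fernet(key).decrypt(encrypted)  # plaintext exists only in CVM memory
# ...run fraud detection inference on `records`...
```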
This setup ensures security across all three states of data: at rest, in transit, and in use.
Conclusion
Running AI workloads in the public cloud brings scalability and cost benefits but introduces concerns around control and data exposure. Confidential Virtual Machines provide a robust solution by safeguarding data in use through hardware-based encryption and isolation. When combined with attestation mechanisms like Trustee, organizations can confidently secure their AI pipelines—even in untrusted environments.
By adopting confidential computing, enterprises can comply with strict regulations, protect intellectual property, and securely deploy AI models in the cloud.