Privyy AI Technical Whitepaper
A Defense-in-Depth Architecture for Secure AI Conversations
1. Executive Summary
As Large Language Models (LLMs) become integral to daily productivity, the privacy of the data fed into them has become a critical concern. Standard architectural patterns, which rely on transport layer security (TLS) and transparent data encryption (TDE), are insufficient for protecting sensitive user contexts against modern threats like database leaks, cloud insider attacks, or account compromises.
Privyy.io introduces a "Process-then-Protect" architecture powered by Grimlock, a custom cryptographic module. Data must exist in plaintext during inference, but only ephemerally; once persisted, it is cryptographically isolated from the infrastructure that hosts it.
This paper details the engineering decisions behind Grimlock, the dual-stack implementation (Go/TypeScript), and the threat model under which stored conversations stay confidential even in the event of a total database compromise.
2. The Privacy Problem in Modern AI
In traditional AI applications, user data typically resides in three states:
- In-Transit: Protected by TLS (HTTPS).
- In-Process: Plaintext in RAM during inference.
- At-Rest: Often stored in plaintext within the database, protected only by the storage provider’s disk encryption.
The Vulnerability: If the database credentials are compromised, or if a "rogue admin" at the cloud provider accesses the live database instance, the entire conversation history is exposed. "Encryption at Rest" provided by AWS/Azure usually manages keys transparently, meaning the database engine decrypts the data for any authenticated query. It protects against stolen disks, not stolen credentials.
The Privyy Solution: We move the encryption boundary up the stack. By encrypting data at the Application Layer before it ever touches the database driver, we ensure the storage layer holds only mathematical noise (ciphertext), not information.
3. Architecture Overview
Privyy.io utilizes a Stateless Inference / Encrypted Persistence model.
3.1 The Data Lifecycle
- Transport: The client sends a request over TLS 1.3.
- Ephemeral Inference (The "Open" Window): The server receives the request. The payload exists in volatile memory (RAM) strictly for the duration of the LLM inference call.
- Constraint: We utilize enterprise-tier "Zero-Retention" APIs where the inference provider is contractually bound to discard data immediately after processing.
- Immediate Locking (Grimlock): Post-inference, the conversation context is immediately passed to the Grimlock module.
- Secure Persistence: Grimlock encrypts the payload using AES-256-GCM with a context-specific key.
- Storage: Only the resulting ciphertext is written to the database.
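The locking and storage steps above can be sketched in Go using only the standard library. This is a minimal illustration, not Grimlock's actual code: the function names `seal` and `open`, the `nonce || ciphertext || tag` record layout, and the randomly generated stand-in key are all assumptions made for the example.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD from a 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal returns nonce || ciphertext || tag. In this sketch that byte string
// is the only value that would be handed to the database layer.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends to its first argument, so the nonce becomes the prefix.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal and returns an error if the record was modified.
func open(key, record []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(record) < gcm.NonceSize() {
		return nil, errors.New("record too short")
	}
	nonce, ciphertext := record[:gcm.NonceSize()], record[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	key := make([]byte, 32) // hypothetical stand-in for a context-specific key
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	record, err := seal(key, []byte("user: hello\nassistant: hi"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("bytes written to storage: %d\n", len(record))

	plaintext, err := open(key, record)
	if err != nil {
		panic(err)
	}
	fmt.Printf("recovered: %q\n", plaintext)
}
```

Because the key never appears in the record, a dump of the stored bytes is not enough to recover the conversation.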
4. The Grimlock Cryptographic Module
Grimlock is the kernel of our security architecture. It is a versioned, cross-platform library implemented with identical logic in Go (for backend services) and TypeScript (for client-side operations).
4.1 Cryptographic Primitives
We rely on a suite of modern, non-proprietary algorithms chosen for their resistance to specific attack vectors:
| Component | Algorithm | Purpose |
|---|---|---|
| Symmetric Encryption | AES-256-GCM | Authenticated encryption for conversation history. Ensures confidentiality and integrity. |
| Key Exchange | X25519 (ECDH) | Establishing secure shared secrets for key rotation or multi-device sync. |
| Key Derivation | HKDF-SHA512 | Deriving context-specific encryption keys from master secrets. |
| Passcode Hashing (Key Wrapping) | Argon2id | Memory-hard derivation of the key-encryption key that wraps master keys, protecting them against GPU brute-force attacks. |
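As an illustration of the key-exchange row, the following sketch shows two devices arriving at the same X25519 shared secret with Go's standard `crypto/ecdh` package (Go 1.20 or later). The helper names `newKeyPair` and `sharedSecret` and the two-device scenario are assumptions for the example; how Grimlock actually transports and authenticates public keys is not described here.

```go
package main

import (
	"bytes"
	"crypto/ecdh"
	"crypto/rand"
	"fmt"
)

// newKeyPair generates a fresh X25519 private key (the public key is derived from it).
func newKeyPair() (*ecdh.PrivateKey, error) {
	return ecdh.X25519().GenerateKey(rand.Reader)
}

// sharedSecret combines a local private key with a peer's public key.
func sharedSecret(priv *ecdh.PrivateKey, peerPub *ecdh.PublicKey) ([]byte, error) {
	return priv.ECDH(peerPub)
}

func main() {
	deviceA, err := newKeyPair()
	if err != nil {
		panic(err)
	}
	deviceB, err := newKeyPair()
	if err != nil {
		panic(err)
	}

	// Only public keys cross the wire. A received key arrives as raw bytes
	// and is parsed back into a PublicKey before use.
	pubB, err := ecdh.X25519().NewPublicKey(deviceB.PublicKey().Bytes())
	if err != nil {
		panic(err)
	}

	secretA, err := sharedSecret(deviceA, pubB)
	if err != nil {
		panic(err)
	}
	secretB, err := sharedSecret(deviceB, deviceA.PublicKey())
	if err != nil {
		panic(err)
	}
	fmt.Println("secrets match:", bytes.Equal(secretA, secretB))
}
```

The raw shared secret is usually not used as an encryption key directly; it is typically passed through a key-derivation function such as HKDF first.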
4.2 Context Binding (AAD)
Grimlock uses the Additional Authenticated Data (AAD) feature of AES-GCM. Every encrypted message is cryptographically bound to its metadata (e.g., user_id, conversation_id).
- Benefit: This prevents ciphertext-substitution ("Confused Deputy") attacks. If an attacker copies an encrypted blob from User A's database record to User B's, decryption fails because the AAD no longer matches and the GCM authentication tag does not verify.
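The effect of context binding can be shown with a short Go sketch. The AAD layout produced by `contextAAD` (a version tag plus length-prefixed fields) and the function names are assumptions for illustration; the source only states that metadata such as user_id and conversation_id is bound.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// contextAAD encodes the metadata a record is bound to. Length prefixes keep
// ("ab", "c") and ("a", "bc") from producing the same bytes. Hypothetical layout.
func contextAAD(userID, conversationID string) []byte {
	return []byte(fmt.Sprintf("v1|%d:%s|%d:%s",
		len(userID), userID, len(conversationID), conversationID))
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealBound encrypts plaintext and authenticates the context alongside it.
func sealBound(key, plaintext []byte, userID, conversationID string) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, contextAAD(userID, conversationID)), nil
}

// openBound succeeds only when the caller supplies the same context used at seal time.
func openBound(key, record []byte, userID, conversationID string) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(record) < gcm.NonceSize() {
		return nil, errors.New("record too short")
	}
	nonce, ciphertext := record[:gcm.NonceSize()], record[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, contextAAD(userID, conversationID))
}

func main() {
	key := make([]byte, 32) // stand-in key for the example
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	record, err := sealBound(key, []byte("private note"), "user-a", "conv-1")
	if err != nil {
		panic(err)
	}
	// The same blob, presented under another user's context, is rejected.
	_, err = openBound(key, record, "user-b", "conv-1")
	fmt.Println("open as user-b failed:", err != nil)
}
```

The AAD itself is not stored in the ciphertext; the reader must reconstruct it from the record's own metadata, which is what makes a relocated blob fail.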
4.3 Cross-Platform Parity
A key feature of Grimlock is its Bidirectional Compatibility. We maintain a rigorous test suite where:
- A key derived in TypeScript (Client) produces the exact same bit-sequence as in Go (Server).
- Messages encrypted in the browser can be decrypted by the server (and vice versa) if authorization permits.
- Verification: 7/7 critical vectors are tested bi-directionally in CI/CD pipelines to prevent implementation drift.
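One way such a parity check could be structured is a shared vector file that each implementation must reproduce byte for byte. The sketch below is an assumption about the general shape, not the actual Grimlock suite: the `vector` JSON fields and the `produce` and `verify` names are hypothetical, and `produce` stands in for the other implementation (for example the TypeScript side) emitting the vector.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"encoding/json"
	"errors"
	"fmt"
)

// vector is one known-answer test case. []byte fields are base64 in JSON.
type vector struct {
	Name       string `json:"name"`
	Key        []byte `json:"key"`
	Nonce      []byte `json:"nonce"`
	AAD        []byte `json:"aad"`
	Plaintext  []byte `json:"plaintext"`
	Ciphertext []byte `json:"ciphertext"` // ciphertext || tag
}

// encrypt runs AES-GCM with the vector's fixed inputs, so the output is deterministic.
func encrypt(v vector) ([]byte, error) {
	block, err := aes.NewCipher(v.Key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(v.Nonce) != gcm.NonceSize() {
		return nil, errors.New("nonce has wrong length")
	}
	return gcm.Seal(nil, v.Nonce, v.Plaintext, v.AAD), nil
}

// verify recomputes the ciphertext locally and compares it to the recorded bytes.
func verify(raw []byte) error {
	var v vector
	if err := json.Unmarshal(raw, &v); err != nil {
		return err
	}
	got, err := encrypt(v)
	if err != nil {
		return fmt.Errorf("vector %q: %w", v.Name, err)
	}
	if !bytes.Equal(got, v.Ciphertext) {
		return fmt.Errorf("vector %q: ciphertext mismatch", v.Name)
	}
	return nil
}

// produce emits a vector. The all-zero key and nonce are acceptable only
// because this is test data; real encryption must never reuse a nonce.
func produce(name string, plaintext, aad []byte) ([]byte, error) {
	v := vector{Name: name, Key: make([]byte, 32), Nonce: make([]byte, 12),
		AAD: aad, Plaintext: plaintext}
	ct, err := encrypt(v)
	if err != nil {
		return nil, err
	}
	v.Ciphertext = ct
	return json.Marshal(v)
}

func main() {
	raw, err := produce("basic", []byte("hello"), []byte("user-a|conv-1"))
	if err != nil {
		panic(err)
	}
	fmt.Println("vector verified:", verify(raw) == nil)
}
```

In a real pipeline the vector file would be committed to the repository and checked by both language implementations in CI, so a change in either one surfaces as a mismatch.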
5. Trust Boundaries & Threat Model
We assume a "Breach-Inevitable" environment for our storage infrastructure. Below is our analysis of risk and mitigation.
5.1 Threat Matrix
| Attack Vector | Risk | Mitigation |
|---|---|---|
| Database Leak (SQL Dump) | Critical | ✅ Solved: Attacker retrieves only AES-256 ciphertext. Keys are not in the DB. |
| Cloud Insider (Rogue Admin) | High | ✅ Solved: Provider sees encrypted blobs. They lack the application-layer keys. |
| Inference Provider (LLM) | Medium | ⚠️ Managed: Mitigated via "Zero-Retention" contracts and ephemeral processing. |
| Man-in-the-Middle | High | ✅ Solved: TLS 1.3 + Certificate Pinning. |
| Brute-Force (Password) | Critical | ⚠️ Managed: Argon2id makes each password guess computationally and memory expensive, which slows offline brute-forcing. |
5.2 The Inference Trust Boundary
It is important to be transparent: We do not use Homomorphic Encryption (as it is currently too slow for LLMs). Therefore, the LLM provider does see the plaintext for as long as it takes to generate a response.
Our Defense:
- Statelessness: We treat the inference provider as a stateless calculator. We do not use "Chat History" APIs that store context on their side. We send context, get the answer, and the provider discards it.
- Memory Hygiene: On our own servers, the Grimlock Go implementation employs secure memory wiping (e.g., memguard patterns) to zero-out plaintext buffers immediately after encryption is complete.
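The memory-hygiene idea can be sketched with plain Go slices. This is a simplified illustration and not the memguard library's API: `wipe` and `withPlaintext` are hypothetical helpers, and a manual overwrite like this does not cover copies the runtime may have made (for example through string conversions or slice growth), which is one reason dedicated libraries exist.

```go
package main

import "fmt"

// wipe overwrites every byte of buf with zero.
func wipe(buf []byte) {
	for i := range buf {
		buf[i] = 0
	}
}

// withPlaintext runs fn on the buffer and wipes it afterward,
// including when fn returns an error or panics.
func withPlaintext(plaintext []byte, fn func([]byte) error) error {
	defer wipe(plaintext)
	return fn(plaintext)
}

func main() {
	buf := []byte("conversation context")
	err := withPlaintext(buf, func(p []byte) error {
		// Encryption of p would happen here in a real service.
		fmt.Println("processing", len(p), "bytes")
		return nil
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("buffer after use:", buf)
}
```

Keeping plaintext in a `[]byte` rather than a `string` matters in this pattern, because Go strings are immutable and cannot be overwritten in place.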
6. Key Management & Recovery
6.1 Key Storage
- Master Keys: Never stored in plaintext. They are wrapped (encrypted) under a key derived from the user's passcode with Argon2id.
- Session Keys: Derived on-the-fly using HKDF and exist only in volatile memory.
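On-the-fly derivation of session keys might look like the following sketch, which implements HKDF (RFC 5869) with HMAC-SHA512 by hand so that it depends only on the standard library. The `deriveKey` name, the zero-filled stand-in master key, and the `conversation:<id>` info strings are assumptions for illustration; production code would normally rely on a maintained HKDF package rather than a hand-rolled one.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha512"
	"errors"
	"fmt"
)

// hkdfExtract condenses the input keying material into a pseudorandom key.
func hkdfExtract(salt, ikm []byte) []byte {
	if len(salt) == 0 {
		salt = make([]byte, sha512.Size) // RFC 5869 default: HashLen zero bytes
	}
	mac := hmac.New(sha512.New, salt)
	mac.Write(ikm)
	return mac.Sum(nil)
}

// hkdfExpand stretches prk into length bytes, bound to the info string.
func hkdfExpand(prk, info []byte, length int) ([]byte, error) {
	if length < 0 || length > 255*sha512.Size {
		return nil, errors.New("hkdf: invalid output length")
	}
	var okm, prev []byte
	for counter := byte(1); len(okm) < length; counter++ {
		mac := hmac.New(sha512.New, prk)
		mac.Write(prev)            // T(n-1), empty on the first round
		mac.Write(info)            // context label
		mac.Write([]byte{counter}) // single-byte block counter
		prev = mac.Sum(nil)
		okm = append(okm, prev...)
	}
	return okm[:length], nil
}

// deriveKey produces a context-specific key from a master secret.
func deriveKey(master, salt []byte, context string, length int) ([]byte, error) {
	return hkdfExpand(hkdfExtract(salt, master), []byte(context), length)
}

func main() {
	master := make([]byte, 32) // stand-in for an unwrapped master key
	k1, err := deriveKey(master, []byte("example-salt"), "conversation:conv-1", 32)
	if err != nil {
		panic(err)
	}
	k2, err := deriveKey(master, []byte("example-salt"), "conversation:conv-2", 32)
	if err != nil {
		panic(err)
	}
	fmt.Println("key length:", len(k1))
	fmt.Println("keys differ per context:", string(k1) != string(k2))
}
```

Since the derivation is deterministic, the same master secret and context always yield the same session key, so the derived key never needs to be stored.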
6.2 Account Recovery
If a user loses their passcode, they lose access to the wrapped Master Key. To prevent data loss, Grimlock implements a Recovery Flow:
- A 32-byte recovery key (convertible to a BIP39 mnemonic) is generated at account creation.
- This key can independently derive the encryption keys, bypassing the passcode requirement.
Note: We do not store this recovery key. If the user loses both their passcode and their recovery key, the data is mathematically unrecoverable. This is a deliberate design choice to prioritize privacy over convenience.
7. Conclusion
Privyy.io and Grimlock represent a shift from "trusting the infrastructure" to "trusting the mathematics." By implementing robust Application-Layer Encryption, we decouple data security from database security.
While the necessity of plaintext inference remains a constraint of current AI technology, our architecture ensures that this exposure is fleeting and that the persisted conversation history, the most attractive target for attackers, remains opaque and secure.