Autonomous Operation Safety

Rule

Never combine private data access, untrusted content processing, and external communication in a single operation.

Context

During autonomous operations when handling data, processing content, or communicating externally.

Detection

Observable signals that indicate lethal trifecta risk:

Operation involves reading private/sensitive data
Processing user-submitted or external content
About to send data via HTTP, email, webhook, or other channels
Multiple security boundaries being crossed
Handling secrets or credentials with external systems

Pattern

Isolate operations to avoid lethal trifecta:

# Safe combinations (never all three together)
READ_ONLY = {
    "Private data + Analysis": Safe (no external communication)
    "Public research + Communication": Safe (no private data)
    "Trusted content + Communication": Safe (no untrusted content)
}

# DANGEROUS: All three elements (lethal trifecta)
read_private_emails()        # 1. Private data
process_user_input()         # 2. Untrusted content
send_http_request(data)      # 3. External communication
# → Perfect prompt injection vulnerability

# Safe: Separate operations by context
if has_private_data and has_untrusted_content:
    disable_external_communication()

Outcome

Following this pattern prevents:

Prompt injection attacks: Can't leak private data through untrusted content
Data exfiltration: Private data not exposed to external systems
Security breaches: Operations isolated by security boundaries
Audit failures: Clear separation enables tracking

Benefits:

Safe autonomous operation
Defense against adversarial prompts
Clear security boundaries
Auditable operations

Safe Operation Patterns - GREEN/YELLOW/RED classification
Escalation vs Autonomy - When to escalate to human
Simon Willison's "lethal trifecta" framework - the theoretical basis for this pattern

Autonomous Operation Safety

Autonomous Operation Safety

Rule

Context

Detection

Pattern

Outcome

Related

Match Keywords