The Uncomfortable Truth About "Bulletproof"
Let's be clear from the start: "bulletproof" is not a marketing term. It's not something you declare after writing a few constraints. It's not achieved by enabling RLS on every table and calling it a day.
In PostgreSQL, bulletproof means something specific and falsifiable:
- A single bug in application code cannot cause cross-tenant data exposure or corruption. Not "probably won't" — cannot.
- A single misconfiguration of privileges or RLS cannot silently degrade into "everyone can read/write everything." The system fails loudly, not quietly.
- The system remains predictable under load and growth. Performance doesn't fall off a cliff when you hit 10x your current data volume.
- Invariants are enforceable by the engine and verifiable by CI. Drift is detected before it reaches production — not discovered during an incident.
- Operational reality is accounted for. Migrations, backfills, vacuum, retention, and incident response are part of the design, not afterthoughts.
The Five Pillars of Database Design Doctrine
After years of building, breaking, and rebuilding production systems, I've distilled what actually works into five non-negotiable pillars. These aren't suggestions — they're doctrine.
Pillar 1: Isolation is Engine-Enforced
Goal: No cross-tenant leaks, even if RLS is wrong or the application is compromised.
The Core Principle
If your tenant isolation relies on application code being correct, you've already lost. Applications have bugs. Developers make mistakes. Attackers find exploits. The database engine doesn't have these problems — it enforces what you tell it to enforce, every single time.
Choose Your Tenancy Model Explicitly
Before writing a single table definition, decide:
| Model | Isolation Strength | Operational Complexity | Best For |
|---|---|---|---|
| Shared tables + tenant keys | Medium | Low | Most SaaS applications |
| Schema-per-tenant | High | High | Regulated industries, enterprise contracts |
| Hybrid | Configurable | Medium | Mixed workloads with varying sensitivity |
Most teams choose shared tables. That's fine — but then you must be rigorous about what follows.
The Tenant Path Pattern
This is the pattern I've seen work at scale across dozens of production systems:
Every tenant-scoped row carries a mandatory tenant root key:
-- Organization is the tenant root
CREATE TABLE organizations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
created_at TIMESTAMPTZ DEFAULT now()
);
-- Sites belong to organizations
CREATE TABLE sites (
id UUID NOT NULL,
organization_id UUID NOT NULL REFERENCES organizations(id),
name TEXT NOT NULL,
-- The critical part: composite uniqueness
PRIMARY KEY (id),
UNIQUE (id, organization_id) -- Enables composite FK from children
);
-- Projects belong to sites (and transitively to organizations)
CREATE TABLE projects (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
site_id UUID NOT NULL,
organization_id UUID NOT NULL,
name TEXT NOT NULL,
-- Composite FK enforces the path
FOREIGN KEY (site_id, organization_id)
REFERENCES sites(id, organization_id)
);The Anti-Patterns That Will Burn You
I've seen all of these cause production incidents:
- "We'll filter by tenant_id in queries." What happens: A developer forgets the WHERE clause. Data leaks.
- "RLS will handle it." What happens: RLS is disabled during a migration. Data leaks.
- "We'll validate in application code." What happens: A new endpoint skips validation. Data leaks.
- Polymorphic references without tenant-path enforcement. What happens:
entity_type = 'comment', entity_id = 123points to the wrong tenant's comment. Data leaks.
Notice a pattern? Every one of these "solutions" leaks data eventually. Composite foreign keys don't.
Pillar 2: Access Control is Layered, Minimal, and Verifiable
Goal: No "surprise" escalation paths. No accidental broad privileges.
The Defense-in-Depth Model
Think of access control as concentric rings:
┌─────────────────────────────────────────┐
│ Ring 1: GRANTS │ ← Who can access objects at all?
│ ┌─────────────────────────────────┐ │
│ │ Ring 2: RLS │ │ ← Which rows can they see?
│ │ ┌─────────────────────────┐ │ │
│ │ │ Ring 3: Functions │ │ │ ← Controlled write paths
│ │ │ ┌─────────────────┐ │ │ │
│ │ │ │ Ring 4: App │ │ │ │ ← Business logic validation
│ │ │ └─────────────────┘ │ │ │
│ │ └─────────────────────────┘ │ │
│ └─────────────────────────────────┘ │
└─────────────────────────────────────────┘Each ring provides independent protection. If one fails, the others hold.
The Privilege Doctrine
Default: DENY.
-- Start by revoking everything from public
REVOKE ALL ON ALL TABLES IN SCHEMA public FROM PUBLIC;
REVOKE ALL ON ALL FUNCTIONS IN SCHEMA public FROM PUBLIC;
-- Grant explicitly and minimally
GRANT SELECT ON customers TO authenticated;
-- Note: No INSERT, UPDATE, DELETE. Those go through functions.Sensitive writes go through purpose-built functions, not direct table access:
-- Bad: Direct access
GRANT INSERT, UPDATE ON contracts TO authenticated;
-- Good: Controlled write path
GRANT EXECUTE ON FUNCTION create_contract(params) TO authenticated;The SECURITY DEFINER Doctrine
SECURITY DEFINER functions run with the privileges of the function owner, not the caller. This is powerful and dangerous. Here are the hard rules:
Every SECURITY DEFINER function must:
1. Lock search_path explicitly
CREATE FUNCTION sensitive_operation()
RETURNS void
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = public, pg_temp -- Locked. Always.
AS $$
BEGIN
-- function body
END;
$$;2. Avoid dynamic SQL unless absolutely necessary
-- Dangerous: SQL injection vector
EXECUTE format('SELECT * FROM %I', user_input);
-- If you must use dynamic SQL, audit it heavily and validate inputs3. Use explicit permission checks
CREATE FUNCTION delete_organization(org_id UUID)
RETURNS void
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = public, pg_temp
AS $$
BEGIN
-- Explicit check before any action
IF NOT is_organization_owner(auth.uid(), org_id) THEN
RAISE EXCEPTION 'Permission denied';
END IF;
-- Proceed with deletion
DELETE FROM organizations WHERE id = org_id;
END;
$$;4. Be owned by a trusted owner from an allowlist
Your CI should verify that all SECURITY DEFINER functions are owned by postgres or supabase_admin — never by application roles.
The RLS Doctrine
Row-Level Security is for filtering, not for preventing writes. This distinction matters.
Use RLS for:
- Read filtering (which rows can a user see?)
- Simple write filtering where direct writes are genuinely allowed
Avoid RLS for:
- Complex cross-table logic that the query planner can't optimize
- As a substitute for proper privilege management
- Heavy subqueries on hot tables
The WITH CHECK rule:
Every INSERT and UPDATE policy needs a WITH CHECK clause. Without it, you're only filtering reads:
-- Incomplete: Only filters SELECT
CREATE POLICY "Users see own data" ON user_data
FOR ALL
USING (user_id = auth.uid());
-- Complete: Filters reads AND validates writes
CREATE POLICY "Users manage own data" ON user_data
FOR ALL
USING (user_id = auth.uid())
WITH CHECK (user_id = auth.uid()); -- Prevents inserting data for other usersPillar 3: Data Invariants are Explicit, Minimal, and Enforced Where They Belong
Goal: Integrity isn't optional, and it isn't duplicated in five places.
The Constraint Hierarchy
Use the strongest constraint that matches your invariant:
| Invariant Type | Enforcement Mechanism | Performance Impact |
|---|---|---|
| Referential integrity | Foreign keys | Very low |
| Local value rules | CHECK constraints | Very low |
| Uniqueness | Unique indexes | Low |
| Complex business rules | Triggers | Medium-High |
| Cross-table invariants | Triggers or functions | High |
Always prefer declarative constraints over triggers. Triggers are more powerful but more expensive and harder to reason about.
The State Machine Pattern
For entities with lifecycles (orders, subscriptions, tickets), enforce valid transitions:
-- Define the valid states
CREATE TYPE order_status AS ENUM (
'draft',
'submitted',
'approved',
'fulfilled',
'cancelled'
);
-- Define valid transitions
CREATE TABLE order_status_transitions (
from_status order_status NOT NULL,
to_status order_status NOT NULL,
PRIMARY KEY (from_status, to_status)
);
INSERT INTO order_status_transitions VALUES
('draft', 'submitted'),
('draft', 'cancelled'),
('submitted', 'approved'),
('submitted', 'cancelled'),
('approved', 'fulfilled'),
('approved', 'cancelled');
-- Enforce transitions
CREATE FUNCTION enforce_order_transition()
RETURNS TRIGGER
LANGUAGE plpgsql AS $$
BEGIN
IF OLD.status IS DISTINCT FROM NEW.status THEN
IF NOT EXISTS (
SELECT 1 FROM order_status_transitions
WHERE from_status = OLD.status AND to_status = NEW.status
) THEN
RAISE EXCEPTION 'Invalid status transition: % → %',
OLD.status, NEW.status;
END IF;
END IF;
RETURN NEW;
END;
$$;The Immutability Pattern
Some data must never change: audit logs, legal evidence, financial records. Enforce this at the database level:
-- 1. Remove modification privileges
REVOKE UPDATE, DELETE ON audit_log FROM authenticated;
-- 2. Add a trigger as defense in depth
CREATE FUNCTION prevent_audit_modification()
RETURNS TRIGGER
LANGUAGE plpgsql AS $$
BEGIN
RAISE EXCEPTION 'Audit records are immutable';
END;
$$;
CREATE TRIGGER audit_immutability
BEFORE UPDATE OR DELETE ON audit_log
FOR EACH ROW
EXECUTE FUNCTION prevent_audit_modification();
-- 3. Partition for growth and retention
CREATE TABLE audit_log (
id UUID DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
action TEXT NOT NULL,
payload JSONB,
created_at TIMESTAMPTZ DEFAULT now()
) PARTITION BY RANGE (created_at);
-- Monthly partitions
CREATE TABLE audit_log_2024_01 PARTITION OF audit_log
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');Pillar 4: Performance is a First-Class Constraint
Goal: Predictable latency under scale. No "it works in dev" surprises.
The Query Plan Doctrine
Every hot-path query must have:
- A deliberate index strategy
- A measured execution plan (
EXPLAIN ANALYZE) - A regression gate in CI
-- Example: Ensure tenant queries use indexes
EXPLAIN ANALYZE
SELECT * FROM projects
WHERE organization_id = '123e4567-e89b-12d3-a456-426614174000'
AND status = 'active';
-- Expected: Index Scan using idx_projects_org_status
-- Not acceptable: Seq Scan on projectsRLS Performance Reality
RLS policies are executed per row. A "simple" policy can destroy performance:
-- Looks innocent, performs terribly at scale
CREATE POLICY "Team members see team data" ON documents
USING (
team_id IN (
SELECT team_id FROM team_memberships
WHERE user_id = auth.uid()
)
);
-- Better: Use a materialized membership check
CREATE POLICY "Team members see team data" ON documents
USING (
EXISTS (
SELECT 1 FROM team_memberships tm
WHERE tm.team_id = documents.team_id
AND tm.user_id = auth.uid()
)
);
-- Even better if team_memberships has: INDEX (user_id, team_id)EXPLAIN your RLS-protected queries. The planner might not be able to push predicates through your policy functions.The JSONB Doctrine
JSONB is one of PostgreSQL's best features. It's also a performance trap if misused.
JSONB is appropriate for:
- Payloads and unstructured content
- Rarely-filtered metadata
- Feature flags and configuration
- Fields that genuinely vary by record
JSONB is risky for:
- Fields used in WHERE clauses
- Join keys
- Permission-determining attributes
- Frequently-indexed columns
If you query a JSONB field often, extract it:
-- Instead of: WHERE metadata->>'status' = 'active'
-- Add a generated column:
ALTER TABLE events ADD COLUMN status TEXT
GENERATED ALWAYS AS (metadata->>'status') STORED;
CREATE INDEX idx_events_status ON events(status);Partition Strategy for Growth
High-write tables need partitioning before they become problems:
-- Partition by time for append-only data
CREATE TABLE events (
id UUID DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
event_type TEXT NOT NULL,
payload JSONB,
created_at TIMESTAMPTZ DEFAULT now(),
PRIMARY KEY (id, created_at) -- Include partition key
) PARTITION BY RANGE (created_at);
-- Partition by tenant for multi-tenant workloads
CREATE TABLE large_tenant_data (
id UUID DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
data JSONB,
PRIMARY KEY (id, tenant_id)
) PARTITION BY HASH (tenant_id);Pillar 5: "Bulletproof" Requires Falsification
Goal: Drift and regressions get caught automatically, before production.
The CI Gate Philosophy
You are not bulletproof until your CI can fail the build for:
- Privilege drift — Someone granted INSERT to
authenticatedon a sensitive table - SECURITY DEFINER drift — Wrong owner, unlocked search_path, dynamic SQL
- Missing tenant enforcement — Composite FKs absent where required
- Missing WITH CHECK — Direct-write tables without insert/update validation
- Query plan regressions — Critical queries showing Seq Scans or cost explosions
- Version mismatch — Required features missing in target PostgreSQL version
Minimum CI Checks (Non-Negotiable)
# Example CI pipeline checks
database_checks:
- name: "Schema ownership allowlist"
query: |
SELECT tablename, tableowner
FROM pg_tables
WHERE schemaname = 'public'
AND tableowner NOT IN ('postgres', 'supabase_admin')
expect: empty
- name: "SECURITY DEFINER search_path locked"
query: |
SELECT proname, prosecdef, proconfig
FROM pg_proc
WHERE prosecdef = true
AND NOT ('search_path=public, pg_temp' = ANY(proconfig))
expect: empty
- name: "No public CREATE for app roles"
query: |
SELECT grantee, privilege_type
FROM information_schema.role_table_grants
WHERE table_schema = 'public'
AND grantee = 'authenticated'
AND privilege_type = 'CREATE'
expect: empty
- name: "Tenant keys NOT NULL"
query: |
SELECT table_name, column_name
FROM information_schema.columns
WHERE column_name IN ('organization_id', 'tenant_id', 'site_id')
AND is_nullable = 'YES'
expect: emptyFalsification Tests
These tests should attempt to break your invariants and verify they hold:
-- Test: Cross-tenant insert must fail
DO $$
BEGIN
-- Attempt to create a project pointing to wrong org's site
INSERT INTO projects (site_id, organization_id, name)
VALUES (
(SELECT id FROM sites WHERE organization_id = 'org-1' LIMIT 1),
'org-2', -- Wrong organization!
'Malicious Project'
);
RAISE EXCEPTION 'TEST FAILED: Cross-tenant insert should have been rejected';
EXCEPTION
WHEN foreign_key_violation THEN
RAISE NOTICE 'TEST PASSED: Cross-tenant insert correctly rejected';
END;
$$;
-- Test: Direct immutable table modification must fail
DO $$
BEGIN
UPDATE audit_log SET action = 'tampered'
WHERE id = (SELECT id FROM audit_log LIMIT 1);
RAISE EXCEPTION 'TEST FAILED: Audit modification should have been blocked';
EXCEPTION
WHEN OTHERS THEN
RAISE NOTICE 'TEST PASSED: Audit modification correctly blocked';
END;
$$;Practical Patterns to Standardize
Pattern 1: The Tenant Path Standard
Define your tenant hierarchy once and enforce it everywhere:
Organization (root)
└── Site
└── Project
└── [Entity]Every entity carries the minimal path required:
- Organization-scoped:
organization_id - Site-scoped:
organization_id + site_id - Project-scoped:
organization_id + site_id + project_id
Pattern 2: The Sensitive Write Path
For tables containing contracts, billing, evidence, or PII:
-- App roles: SELECT only (via RLS-filtered views)
GRANT SELECT ON contracts_view TO authenticated;
-- Writes: Functions only
GRANT EXECUTE ON FUNCTION create_contract TO authenticated;
GRANT EXECUTE ON FUNCTION sign_contract TO authenticated;
-- No direct table access
REVOKE ALL ON contracts FROM authenticated;Pattern 3: The Evidence/Audit Standard
For compliance-critical append-only data:
- Partition by time (monthly or weekly)
- Immutable privileges (no UPDATE/DELETE)
- Immutability trigger (defense in depth)
- Retention policy (detach and archive old partitions)
- Indexed query paths (tenant + time range)
The Graduation Criteria
You can call your database design "bulletproof" only when all of the following are true:
| Criterion | Verification |
|---|---|
| Engine-enforced tenant isolation | Composite FKs make cross-tenant corruption impossible |
| Principle of least privilege | App roles cannot write sensitive tables directly |
| SECURITY DEFINER hardened | Locked search_path, trusted owners, no dynamic SQL |
| RLS is measured | Policies proven by EXPLAIN; WITH CHECK on all write policies |
| Growth plan implemented | Partitioning + retention for append-only and high-volume tables |
| CI gates active | Build fails on drift, plan regressions, or privilege mistakes |
| Load tests exist | Part of release discipline, not a one-off |
The Hard Truth
A database schema becomes bulletproof when it is designed so that your future self cannot accidentally break your invariants.
This requires:
- Engine-level constraints that don't depend on code correctness
- Minimal trusted surfaces that limit blast radius
- Verifiable CI gates that catch drift before production
- Performance proofs under load that prevent surprise degradation
Everything else — the blog posts, the conference talks, the "best practices" documents — is decoration.
