ADR-002: PostgreSQL as Primary Storage
| Status | Date | Deciders |
|---|---|---|
| Accepted | 2024-01-15 | Core Team |
Context
go-mink needs a reliable, production-ready storage backend for events. The storage must support:
- ACID Transactions: Atomic event appends with version checks
- Optimistic Concurrency: Detect concurrent modifications
- Efficient Queries: Load events by stream, position, or time
- Subscriptions: Support for change notifications
- Scalability: Handle millions of events
- Operational Maturity: Well-understood operations, backup, monitoring
We need to choose between:
- Dedicated event stores (EventStoreDB)
- Relational databases (PostgreSQL, MySQL)
- Document databases (MongoDB)
- Custom solutions
Decision
We will use PostgreSQL as the primary storage backend with a custom schema optimized for event sourcing.
Schema Design
The adapter uses a two-table design: a streams table that tracks the
current version (and category) of each stream, plus an events table that
stores the immutable event records. The events table uses global_position
(a BIGSERIAL) as its primary key, which gives both a compact row identity
and a total global ordering for catch-up subscriptions. The unique
(stream_id, version) constraint enforces optimistic concurrency. The schema is
configurable via WithSchema; the core table names are streams and events.
CREATE TABLE streams (
id BIGSERIAL PRIMARY KEY,
stream_id VARCHAR(500) NOT NULL UNIQUE,
category VARCHAR(250) NOT NULL,
version BIGINT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_streams_category ON streams (category);
CREATE TABLE events (
global_position BIGSERIAL PRIMARY KEY,
stream_id VARCHAR(500) NOT NULL,
version BIGINT NOT NULL,
event_id UUID NOT NULL DEFAULT gen_random_uuid(),
event_type VARCHAR(500) NOT NULL,
data JSONB NOT NULL,
metadata JSONB,
timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (stream_id, version)
);
CREATE INDEX idx_events_stream ON events (stream_id, version);
CREATE INDEX idx_events_type ON events (event_type);
CREATE INDEX idx_events_timestamp ON events (timestamp);
The adapter also creates snapshots and checkpoints tables as part of the
core schema. Outbox, idempotency, and saga tables are created separately by
their respective sub-stores.
Key Features Used
- JSONB: Flexible event
data/metadatastorage with indexing - BIGSERIAL primary key: Global ordering and row identity via
global_position - UNIQUE Constraint: Optimistic concurrency via
(stream_id, version)onevents streamstable: Authoritative current version per stream for fast version checks- Polling subscriptions: Catch-up via
global_position(see below)
Concurrency Control
Appends run inside a transaction that checks the stream's current version in the
streams table, inserts the new rows into events, and updates the stream
version — all atomically. A mismatch between the expected version and the
stored version aborts the transaction with a concurrency conflict:
-- Inside a single transaction:
-- 1. Read and lock the current version.
SELECT version FROM streams WHERE stream_id = $1 FOR UPDATE;
-- 2. If it matches the expected version, insert the new event(s).
INSERT INTO events (stream_id, version, event_type, data, metadata)
VALUES ($1, $2, $3, $4, $5)
RETURNING global_position;
-- 3. Bump the stream's version.
UPDATE streams SET version = $2, updated_at = NOW() WHERE stream_id = $1;
Consequences
Positive
- Familiar Technology: Most teams already know PostgreSQL
- Operational Maturity: Proven backup, replication, monitoring
- JSONB Power: Flexible schema with query capabilities
- Transaction Support: Can combine event writes with other operations
- Cloud Availability: Available on all major cloud providers
- No Vendor Lock-in: Open source, portable
- Cost Effective: No additional licensing for event store
- Ecosystem: Rich tooling (pgAdmin, pg_dump, etc.)
Negative
- Not Purpose-Built: Less optimized than dedicated event stores
- Subscription Limitations: LISTEN/NOTIFY has message size limits
- Partitioning Complexity: Manual partitioning for very large datasets
- No Built-in Projections: Must implement projection engine ourselves
Neutral
- Schema Management: Need to handle migrations
- Connection Pooling: Standard PostgreSQL considerations apply
Alternatives Considered
Alternative 1: EventStoreDB
Description: Purpose-built event store database.
Pros:
- Optimized for event sourcing
- Built-in subscriptions
- Projections in JavaScript
Rejected because:
- Additional infrastructure component
- Less familiar to most teams
- Licensing considerations for enterprise features
- Smaller community in Go ecosystem
Alternative 2: MongoDB
Description: Document database with change streams.
Pros:
- Flexible schema
- Good Go driver
- Change streams for subscriptions
Rejected because:
- Weaker transaction guarantees (improved in recent versions)
- Less suitable for strict ordering requirements
- JSONB in PostgreSQL provides similar flexibility
Alternative 3: Apache Kafka
Description: Distributed event streaming platform.
Pros:
- Excellent for high-throughput
- Built-in partitioning
- Strong ecosystem
Rejected because:
- Complex infrastructure
- Not designed for entity-level streams
- Retention policies complicate event sourcing
- Overkill for many use cases
Alternative 4: Custom File-Based Store
Description: Append-only log files.
Rejected because:
- Would need to build everything from scratch
- No query capabilities
- Complex replication/backup
- Not suitable for production
Implementation Notes
Connection Pooling
We use pgx for connection pooling:
config, _ := pgxpool.ParseConfig(connStr)
config.MaxConns = 25
config.MinConns = 5
pool, _ := pgxpool.NewWithConfig(ctx, config)
Subscription via Polling
For reliable subscriptions, we use polling with checkpoints rather than LISTEN/NOTIFY:
func (s *Subscriber) Poll(ctx context.Context, fromPosition uint64) ([]StoredEvent, error) {
return s.db.Query(ctx, `
SELECT * FROM events
WHERE global_position > $1
ORDER BY global_position
LIMIT 100
`, fromPosition)
}