Relational, Document, or Platform?
A pragmatic guide to choosing your database in 2026 — with a four-layer decision framework, real-world case studies, and a 20-question interactive checklist
01
Define the Options
Most “database comparison” articles pretend you are choosing between two products. You are not. You are choosing across three layers: data model, database engine, and operational platform. Conflating them is how teams end up with the wrong database for the right reasons.
The three layers of a database decision
01 Data Model: how you structure your data
02 Database Engine: the software that stores and queries it
03 Operational Platform: the operational wrapper around the engine
The next three subsections unpack each layer — starting from the innermost. Data model first, then engine, then platform.
Relational Databases (MySQL / PostgreSQL)
The relational model organizes data into tables with rows and columns, linked by foreign keys and governed by a schema enforced at write time. Normalization eliminates redundancy. Joins reassemble data at read time. Constraints — NOT NULL, UNIQUE, CHECK, FOREIGN KEY — push data integrity into the engine itself.
ACID transactions are native: every statement runs inside a transaction by default. Decades of tooling, SQL standardization, and optimizer research make relational databases the most predictable choice for workloads with well-defined relationships and consistent access patterns.
PostgreSQL and MySQL are not interchangeable. PostgreSQL supports advanced data types (JSONB, arrays, ranges, hstore), has a rich extension ecosystem (pgvector, PostGIS, TimescaleDB, Citus), and implements the SQL standard more faithfully. MySQL prioritizes simplicity, replication ease, and read throughput — it remains the backbone of WordPress, Shopify, and large-scale web applications.
Where PostgreSQL is blurring the lines: PostgreSQL’s JSONB support (available since 9.4, 2014) lets you store, index, and query semi-structured data with GIN indexes — effectively giving you document-style flexibility inside a relational database. With PostgreSQL 17 (September 2024), the engine delivers up to 2× write throughput improvements under high-concurrency workloads, 20× vacuum memory reduction, incremental backups, and JSON_TABLE() for converting JSON to relational rows. The “just use Postgres” argument has never been stronger.
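A minimal sketch of that escape hatch, assuming a hypothetical products table with a free-form attributes column: a GIN index accelerates containment queries, and JSON_TABLE() (PostgreSQL 17) projects nested JSON back into rows. Table, column, and path names are illustrative, not part of this article's running example.

```sql
-- Illustrative only: a JSONB column holding variable product attributes
CREATE TABLE products (
    id         SERIAL PRIMARY KEY,
    name       TEXT NOT NULL,
    attributes JSONB NOT NULL DEFAULT '{}'
);

-- GIN index to accelerate containment queries over the document column
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);

-- Containment query: find products whose attributes include this shape
SELECT id, name
FROM products
WHERE attributes @> '{"connectivity": "bluetooth"}';

-- PostgreSQL 17: JSON_TABLE() projects JSON into relational rows
SELECT p.name, spec.*
FROM products p,
     JSON_TABLE(
       p.attributes, '$.ports[*]'
       COLUMNS (port_type TEXT PATH '$.type', port_count INT PATH '$.count')
     ) AS spec;
```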
Engine profiles at a glance

PostgreSQL: the extensible relational powerhouse
- Schema: strict DDL + JSONB escape hatch
- Transactions: full ACID, serializable isolation
- Scaling: vertical; Citus / partitioning for horizontal
- Query model: SQL standard — JOINs, CTEs, window functions
- Sweet spot: complex relationships, analytics, extensions

MongoDB: the document model for flexible data
- Schema: flexible documents + optional JSON Schema validation
- Transactions: multi-document ACID (4.0+); single-document atomic by design
- Scaling: native horizontal sharding, auto-balancing
- Query model: aggregation pipeline — $match, $group, $lookup
- Sweet spot: nested data, polymorphic entities, rapid iteration
MongoDB (the Document Model)
MongoDB stores data as BSON documents — rich, nested structures that map directly to objects in your application code. Instead of normalizing into tables, you embed related data inside a single document when it is accessed together, or reference it with an ObjectId when it lives independently.
This is not schema-less. It is schema-flexible. You define the shape of your data in your application code (or with MongoDB’s built-in JSON Schema validation). The database does not reject a document because it has an extra field — but well-designed MongoDB applications have rigorous, versioned schemas enforced at the application layer.
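Here is a sketch of that optional database-level guard, using MongoDB's built-in $jsonSchema validator. The collection and field names are illustrative; the validation settings reject non-conforming writes.

```js
// Sketch: server-side validation with MongoDB's $jsonSchema
db.createCollection("courses", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["title", "slug", "priceCents", "status"],
      properties: {
        title:      { bsonType: "string" },
        slug:       { bsonType: "string" },
        priceCents: { bsonType: "int", minimum: 0 },
        status:     { enum: ["draft", "published"] }
      }
    }
  },
  validationLevel: "strict",  // validate every insert and update
  validationAction: "error"   // reject non-conforming documents
});
```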
The aggregation framework replaces SQL’s SELECT … JOIN … GROUP BY with a pipeline of stages ($match, $group, $lookup, $unwind, $project). It is powerful, composable, and natively understands nested data.
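As an illustration, the pipeline below would compute revenue per course from a hypothetical enrollments collection. The collection and field names are assumptions for the example, not part of the article's running schema.

```js
db.enrollments.aggregate([
  { $match: { status: "paid" } },        // filter first, like WHERE
  { $lookup: {                           // left outer join to courses
      from: "courses",
      localField: "courseId",
      foreignField: "_id",
      as: "course"
  } },
  { $unwind: "$course" },                // one output document per matched course
  { $group: {                            // GROUP BY course title
      _id: "$course.title",
      revenueCents: { $sum: "$amountCents" },
      students: { $sum: 1 }
  } },
  { $sort: { revenueCents: -1 } }
]);
```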
Multi-document ACID transactions have been available since MongoDB 4.0 (2018) for replica sets and 4.2 for sharded clusters. They work. The 60-second lifetime is a configurable default, not a hard ceiling. The 1,000-document figure is a recommendation, not an engine limit. MongoDB’s philosophy is clear: design your schema so that single-document atomicity covers most operations, and use transactions for the exceptions — including Temenos running core banking at 150K TPS (see Case Studies, Section 05).
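For the exceptional multi-document case, here is a sketch using the Node.js driver's withTransaction helper. The connection string, database, and field names are illustrative, and a replica set is assumed (transactions require one).

```typescript
import { MongoClient, ObjectId } from "mongodb";

const client = new MongoClient("mongodb://localhost:27017/?replicaSet=rs0");

// Move a balance between two accounts atomically across two documents
async function transfer(fromId: ObjectId, toId: ObjectId, amountCents: number) {
  const session = client.startSession();
  try {
    // withTransaction handles commit, abort, and retry of transient errors
    await session.withTransaction(async () => {
      const accounts = client.db("app").collection("accounts");
      await accounts.updateOne({ _id: fromId }, { $inc: { balanceCents: -amountCents } }, { session });
      await accounts.updateOne({ _id: toId }, { $inc: { balanceCents: amountCents } }, { session });
    });
  } finally {
    await session.endSession();
  }
}
```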
With MongoDB 8.0 (October 2024), the engine delivers up to 32% overall throughput improvement and 200%+ faster time-series aggregations (vendor benchmarks using YCSB and Timescale TSBSuite respectively), queryable encryption with range query support, and sharding that distributes data up to 50× faster at 50% lower starting cost.
MongoDB Atlas (the Platform)
Here is the distinction most articles miss: MongoDB Atlas is not “MongoDB in the cloud.” It is a managed data platform that bundles the MongoDB database engine with services you cannot get from Community or Enterprise Server alone.
Atlas generated 71% of MongoDB’s $2.01B FY2025 revenue, growing 24% year-over-year. The company’s business has fundamentally shifted to this platform model. Atlas Vector Search lets you store operational data and vector embeddings in the same database — as explored in What Are Vector Embeddings?, RAG From the Ground Up, and Distance Metrics Explained. Atlas Stream Processing enables event-driven architectures without Kafka/Flink complexity — see From Hours to Milliseconds.
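A sketch of what that co-location looks like at query time, assuming an Atlas Vector Search index named embedding_index over an embedding field on a products collection; the names, filter, and dimensions are illustrative.

```js
db.products.aggregate([
  { $vectorSearch: {
      index: "embedding_index",
      path: "embedding",
      queryVector: queryEmbedding,   // array produced by your embedding model
      numCandidates: 200,            // ANN candidates considered before ranking
      limit: 10,
      filter: { category: "audio" }  // pre-filter on ordinary operational fields
  } },
  { $project: { name: 1, score: { $meta: "vectorSearchScore" } } }
]);
```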
The trade-off is real: Atlas introduces vendor lock-in to MongoDB Inc.’s pricing and feature roadmap. Production dedicated clusters start at ~$57/month (M10) and climb to $3,000+/month for M200 instances — before data transfer, backup, and advanced feature charges.
Self-managed MongoDB vs. Atlas — what you actually get
- MongoDB Engine: document model, aggregation, transactions
- Operational Automation: zero-downtime management and security
- Atlas Platform Services: integrated capabilities beyond the database
When someone says “we use MongoDB,” ask: Community Server? Enterprise? Atlas? The answer changes the entire capability matrix.
The Expanding Landscape
The choice is no longer a binary between relational and document. Supabase wraps PostgreSQL with auth, storage, real-time subscriptions, and pgvector. Neon offers serverless PostgreSQL with scale-to-zero and database branching (acquired by Databricks, May 2025). PlanetScale built distributed MySQL on Vitess and now offers PostgreSQL support (GA late 2025). CockroachDB delivers distributed SQL with serializable isolation and automatic geo-replication.
The database landscape — schema flexibility × operational scope
The Hyperscaler Platform Play
The landscape above includes independent databases. But the biggest competitors to Atlas are not other databases — they are cloud platform ecosystems. The core architectural trade-off: Atlas is a vertically integrated platform — deep capabilities, single query language, single console. Hyperscalers are horizontally integrated — broad capabilities across many services that require configuration and connection. Atlas wins on developer simplicity for MongoDB workloads. Hyperscalers win on breadth for heterogeneous stacks.
Hyperscaler database platforms

AWS: purpose-built databases (15+ engines)
- Aurora: 3–5× standard PostgreSQL throughput. DSQL (GA May 2025): 99.999% multi-region availability.
- MongoDB compatibility: DocumentDB is ~58% MongoDB compatible — wire protocol only, missing aggregation stages and change streams.

Google Cloud: AI-first, PostgreSQL-centric strategy
- AlloyDB embeds Google’s ScaNN vector index. Spanner: 99.999% global SLA.
- MongoDB compatibility: Firestore with MongoDB compatibility (GA August 2025) — use existing MongoDB drivers against Google’s serverless document DB.

Microsoft Azure: multi-model flexibility
- Cosmos DB: 5 tunable consistency levels, multiple APIs (NoSQL, MongoDB, Gremlin, Table, PostgreSQL).
- MongoDB compatibility: Cosmos DB’s MongoDB API is ~32% compatible. Azure DocumentDB (GA November 2025): vCore-based, open-source engine under the Linux Foundation.
The analyst perspective is instructive: Gartner’s 2025 Cloud DBMS Magic Quadrant places nine vendors in the Leaders quadrant — AWS, Google, Microsoft, Oracle, Databricks, Snowflake, MongoDB, IBM, and Alibaba Cloud. MongoDB holds its position alongside the hyperscalers, while CockroachDB, Couchbase, and Redis sit in the Niche Players quadrant. Independent database vendors compete, but the Leaders quadrant is dominated by cloud platforms.
The lock-in trade-off is real but different in kind. Atlas introduces vendor lock-in to MongoDB Inc.’s pricing and feature roadmap, but lets you deploy across AWS, Azure, and GCP — even multi-cloud simultaneously. Hyperscalers lock you to a single cloud provider but often offer open-source compatible engines (PostgreSQL, increasingly MongoDB-compatible). You are not choosing between lock-in and freedom. You are choosing between vendor lock-in and cloud lock-in.
02
What the Code Looks Like
Before entering a decision framework, see what these databases look like in practice. Same domain, two models — a course platform with instructors, modules, lessons, and enrollments.
Relational Model (PostgreSQL DDL)
CREATE TABLE instructors (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL
);
CREATE TABLE courses (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
slug TEXT UNIQUE NOT NULL,
instructor_id INTEGER REFERENCES instructors(id),
price_cents INTEGER NOT NULL CHECK (price_cents >= 0),
status TEXT CHECK (status IN ('draft','published'))
);
CREATE TABLE modules (
id SERIAL PRIMARY KEY,
course_id INTEGER REFERENCES courses(id) ON DELETE CASCADE,
title TEXT NOT NULL,
position INTEGER NOT NULL,
UNIQUE (course_id, position)
);
CREATE TABLE lessons (
id SERIAL PRIMARY KEY,
module_id INTEGER REFERENCES modules(id) ON DELETE CASCADE,
title TEXT NOT NULL,
video_url TEXT,
position INTEGER NOT NULL
);
-- Fetch a course page: multi-table JOIN
SELECT c.title, m.title AS module, l.title AS lesson
FROM courses c
JOIN modules m ON m.course_id = c.id
JOIN lessons l ON l.module_id = m.id
WHERE c.slug = 'database-fundamentals'
ORDER BY m.position, l.position;

Gain: Referential integrity, constraints, normalized storage, ad-hoc JOINs.
Give up: Single page requires multi-table JOIN. Schema changes need ALTER TABLE + migration.
Document Model (MongoDB + TypeScript)
interface Course {
_id: ObjectId;
title: string;
slug: string;
instructor: {
_id: ObjectId;
name: string; // denormalized
};
priceCents: number;
status: "draft" | "published";
modules: Array<{
title: string;
position: number;
lessons: Array<{
title: string;
videoUrl?: string;
position: number;
resources?: Array<{ // easy to add later
name: string;
url: string;
}>;
}>;
}>;
}
// Fetch a course page: single document read
db.courses.findOne({ slug: "database-fundamentals" });
// Update denormalized instructor name
db.courses.updateMany(
{ "instructor._id": instructorId },
{ $set: { "instructor.name": "New Name" } }
);

Gain: Single read fetches entire page. Add fields without migration. Document mirrors API response.
Give up: Denormalized name needs updateMany. No database-enforced foreign keys. 16MB document limit.
Product Catalog (Polymorphic Data)
Different product categories have fundamentally different attributes. In relational: table-per-type (migration for every new category), EAV pattern (query nightmare), or JSONB (reasonable middle ground). In MongoDB: one collection, different shapes per category, zero migration for new types. For polymorphic data, the document model is more natural. PostgreSQL’s JSONB is a reasonable middle ground if you need relational features elsewhere.
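A sketch of that flexibility in document form: the two product shapes below are invented for illustration and share nothing but a name, a price, and a category discriminator.

```js
// One collection, two shapes — adding a new category requires no migration
db.products.insertMany([
  {
    category: "laptop",
    name: "AeroBook 14",
    priceCents: 129900,
    specs: { cpu: "8-core", ramGb: 16, screenInches: 14 }
  },
  {
    category: "coffee",
    name: "Single-Origin Ethiopia",
    priceCents: 1850,
    roast: "light",
    tastingNotes: ["blueberry", "jasmine"],
    bagSizes: [{ grams: 250 }, { grams: 1000 }]
  }
]);

// Category-specific attributes remain queryable with ordinary operators
db.products.find({ category: "laptop", "specs.ramGb": { $gte: 16 } });
```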
03
The Decision Framework
How do you navigate this? Not with a single flowchart. Enterprise architecture research — from the SEI’s ATAM method to Azure’s Architecture Center — converges on a consistent message: technology selection is a layered process, not a binary choice. Our framework has four layers. Each narrows the field.
Layer 1: Access Pattern Classification
Start with your workload, not your database preference. Ask: is your data highly relational with complex joins? Or naturally hierarchical, nested, or polymorphic? If mixed, which access pattern dominates? The 80/20 rule applies: optimize for the queries that handle 80% of your traffic.
If your data has rigid, well-known relationships → lean relational (PostgreSQL/MySQL). If it’s naturally nested or polymorphic → lean document (MongoDB). Need integrated vector search, stream processing, or full-text search? → Atlas is the integrated play. Need global distribution with strong consistency? → CockroachDB or Aurora Global.
This gets you from “I have no idea” to “I have 2–3 candidates.” It is a starting point, not the answer.
Layer 2: Tradeoff Analysis
Every database decision involves tensions — improving one quality attribute degrades another. The SEI’s ATAM calls these tradeoff points: properties that affect multiple quality attributes simultaneously. The purpose of this layer is not to score databases — that is Layer 3’s job. The purpose is to force you to articulate what you are willing to give up. If nothing in your profile feels uncomfortable, you have not been honest enough.
Layer 2 — Tension map: position each axis where your workload actually sits. Every position has a cost.
Layer 3: Weighted Scoring Matrix
Once you have 2–3 finalists, score them systematically. Research on structured decision-making shows that weighted scoring matrices consistently outperform unstructured deliberation when comparing 3+ options on 4+ criteria.
Rules: 4–8 criteria (consolidate related items). Never weight everything equally — this is the most common mistake. Use a 1–5 scoring scale. Have multiple stakeholders score independently, then average.
Layer 3 — Weighted scoring example
Content management platform scenario. Each cell shows the 1–5 score with its weighted contribution (weight × score) in parentheses. If you changed the weights, the outcome would shift — that is the point.
| Criterion | Weight | PostgreSQL | MongoDB | Atlas | Aurora | AlloyDB | Cosmos DB |
|---|---|---|---|---|---|---|---|
| Polymorphic data handling | 0.25 | 3 (0.75) | 5 (1.25) | 5 (1.25) | 3 (0.75) | 3 (0.75) | 3 (0.75) |
| Full-text search quality | 0.20 | 3 (0.60) | 2 (0.40) | 5 (1.00) | 3 (0.60) | 3 (0.60) | 2 (0.40) |
| Operational burden | 0.20 | 3 (0.60) | 2 (0.40) | 5 (1.00) | 4 (0.80) | 4 (0.80) | 3 (0.60) |
| Team expertise | 0.15 | 4 (0.60) | 3 (0.45) | 3 (0.45) | 4 (0.60) | 3 (0.45) | 2 (0.30) |
| Total cost (3-yr TCO) | 0.10 | 4 (0.40) | 4 (0.40) | 2 (0.20) | 3 (0.30) | 3 (0.30) | 2 (0.20) |
| Vendor independence | 0.10 | 5 (0.50) | 4 (0.40) | 2 (0.20) | 2 (0.20) | 2 (0.20) | 1 (0.10) |
| Weighted total | 1.00 | 3.45 | 3.30 | 4.10 (winner) | 3.25 | 3.10 | 2.35 |
Atlas wins because the team weighted full-text search and operational simplicity highly. Raise vendor independence to 0.25 and the outcome shifts. The matrix makes your priorities explicit and debatable.
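To make "raise the weight and the outcome shifts" concrete, here is a small sketch of the arithmetic. The numbers mirror the table above; the criterion keys are invented for the example.

```typescript
type Scores = Record<string, number>; // criterion -> 1..5 score

const weights: Record<string, number> = {
  polymorphicData: 0.25, fullTextSearch: 0.20, operationalBurden: 0.20,
  teamExpertise: 0.15, threeYearTco: 0.10, vendorIndependence: 0.10,
};

const candidates: Record<string, Scores> = {
  PostgreSQL: { polymorphicData: 3, fullTextSearch: 3, operationalBurden: 3,
                teamExpertise: 4, threeYearTco: 4, vendorIndependence: 5 },
  Atlas:      { polymorphicData: 5, fullTextSearch: 5, operationalBurden: 5,
                teamExpertise: 3, threeYearTco: 2, vendorIndependence: 2 },
};

// Weighted total = sum of (weight x score) across criteria
const weightedTotal = (scores: Scores): number =>
  Object.entries(weights).reduce((sum, [criterion, w]) => sum + w * scores[criterion], 0);

for (const [name, scores] of Object.entries(candidates)) {
  console.log(name, weightedTotal(scores).toFixed(2)); // PostgreSQL 3.45, Atlas 4.10
}
```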
Total Cost of Ownership: The Dimension Most Articles Ignore
Sticker price is not cost. Enterprise TCO includes DBA headcount (1 senior FTE = ~$120K–180K/yr in the US), training, migration dual-running periods, integration infrastructure, security tooling, opportunity cost (engineers doing ops instead of shipping features), and downtime losses.
The managed service is not “cheaper.” It shifts cost from people to platform fees. For small teams (< 5 engineers), the shift almost always makes sense — the opportunity cost of a senior engineer doing DBA work dwarfs the platform premium. For large organizations with dedicated infrastructure teams, self-managed can be more economical.
| Cost Category | Self-Managed PG | Self-Managed MongoDB | MongoDB Atlas | Hyperscaler Managed |
|---|---|---|---|---|
| Compute | You provision | You provision | Instance-based | Instance or serverless |
| DBA headcount | 1–2 FTEs | 1–2 FTEs | 0 (platform) | 0 (platform) |
| Monitoring | Build or buy | Build or buy | Included | Included (per-service) |
| Search / Vector | Separate infra | Separate infra (or Atlas Search) | Included | Separate service |
| Backups / DR | Manual setup | Manual setup | Included | Included |
| Vendor lock-in risk | Low | Low (open source) | Medium (MongoDB Inc.) | Medium (cloud provider) |
Hidden costs most teams miss
- Knowledge concentration risk — one DBA is a bus factor of one. When they leave, you lose institutional knowledge of vacuum tuning, replication topology, and failover runbooks.
- Upgrade burden — major version upgrades every 2–3 years require testing, dual-running periods, and rollback plans. Managed services handle this; self-managed teams absorb it.
- Multi-region replication — custom setup with streaming replication vs. a checkbox in Atlas or Aurora Global. The engineering effort difference is weeks, not hours.
- Data transfer and egress charges — every cloud provider charges for data leaving their network. Multi-region and cross-service architectures multiply this cost.
- Lock-in switching cost — not just the database migration itself, but rewriting application code, ops tooling, monitoring dashboards, and team training.
Layer 4: Decision Documentation
You have a winner. Now document why — for the teammates who will inherit this decision in two years. The Y-statement format compresses a database decision into a single sentence: in the context of <use case>, facing <concern>, we decided for <option> and against <alternatives>, to achieve <benefit>, accepting <downside>. If you cannot fill in every blank, you have not finished the evaluation.
04
Technology Selection Anti-Patterns
Before committing to a database, name the ways teams get this decision wrong. A 2021 ICSE study surveying 591 software professionals found that 82% of developers believed using trending technology makes them more attractive to employers, while 60% of hiring managers admitted trends influence their job postings. This self-reinforcing loop — resume-driven development — is just one of seven anti-patterns.
- Resume-Driven Development. "Let's use the hot new DB — it'll be great for hiring." Remedy: evaluate fitness for the workload, not for resumes. 82% of developers believe trending tech makes them more employable (ICSE-SEIS 2021).
- Cargo Culting. "Netflix uses it, so should we." Remedy: understand your actual scale and constraints. Netflix operates at 1,000× the scale of most companies making that argument.
- Polyglot Fragmentation. Six databases for eight microservices. Remedy: consolidate where access patterns overlap. Each database needs its own monitoring, backups, upgrades, and DR runbooks.
- Lock-In Fear Paralysis. Self-managing everything to stay portable. Remedy: calculate actual switching cost vs. operational cost. Application-level coupling creates more lock-in than the deployment model.
- Analysis Paralysis. A 6-month evaluation of 15 databases. Remedy: timebox to 2–4 weeks and narrow to 2–3 finalists early. Pinterest evaluated 3 candidates (YugabyteDB, CockroachDB, TiDB), not 15.
- Silver Bullet Thinking. "This database solves all our problems." Remedy: no database is optimal for all access patterns. If you can't name what your chosen database is bad at, you haven't evaluated it.
- Hype-Driven Development. Adopting tech based on conference talks alone. Remedy: require a proof-of-concept with your actual data. Conference benchmarks rarely match your production workload characteristics.
Recognizing these patterns is half the battle. The cargo culting counter-argument deserves airtime: following the crowd is a rational heuristic when your team lacks deep experience. The problem is when crowd-following replaces analysis rather than supplementing it.
05
Real-World Case Studies
Theory is useful. Evidence is better. Six companies whose database journeys — documented in public engineering blogs and investor filings — reveal patterns that repeat across the industry.
Discord went from MongoDB to Cassandra to ScyllaDB. Operational pain (JVM garbage collector pauses, on-call toil) drove the migration, not benchmarks. P99 read latency dropped from 40–125ms to 15ms. Uber moved from PostgreSQL to a custom MySQL-based Schemaless — growth urgency forced a pragmatic choice. Pinterest ran sharded MySQL for a decade, added HBase, then replaced HBase with TiDB after evaluating just three finalists. Shopify and Airbnb both chose “boring technology” and invested in operational tooling rather than novel databases. Temenos — the world’s largest banking software vendor (950+ banks) — moved core banking from relational to MongoDB Atlas, benchmarking 150,000 TPS on a composed workload of payments, financial crime mitigation, and digital channels.
Database evolution — six companies, 2010–2026
- Discord: operational pain drives migration, not benchmarks.
- Uber: chose familiarity (MySQL) over novelty, then iterated to a custom layer.
- Pinterest: start boring, evolve later — evaluated only 3 finalists.
- Shopify: innovation capital is finite — spend it on your product.
- Airbnb: deliberately chose the largest talent pool and tooling ecosystem.
- Temenos: the world's largest banking software vendor chose documents over tables for core banking.
06
When Each Platform Wins
No platform wins everywhere. The right choice depends on what your team already knows, where your infrastructure lives, and which tradeoffs you can absorb.
Universal caveats
07
Closing Checklist
Before you commit to a database, answer these twenty questions grouped across five categories. Your answers map directly to the decision dimensions covered in this guide. The final group — Decision Hygiene — does not shift the tally. Instead, it surfaces process warnings drawn from the anti-patterns in Section 04: are you cargo-culting, paralyzed by analysis, or skipping TCO modeling?
A note on PostgreSQL extensions: if your PostgreSQL strategy relies on extensions (pgvector, PostGIS, TimescaleDB, Citus), verify availability on your target managed platform. Aurora supports ~90 extensions but not Citus. AlloyDB supports pgvector natively but not TimescaleDB. Supabase and Neon have curated extension lists. Self-managed PostgreSQL has no restrictions but you own the operational burden.
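One low-effort verification step is to ask the target platform itself: on any PostgreSQL-compatible service you can query the catalog of installable extensions (note that pgvector's extension name is vector).

```sql
-- Which of the extensions this design depends on can this platform actually install?
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name IN ('vector', 'postgis', 'timescaledb', 'citus')
ORDER BY name;

-- If a required extension is available, enabling it is a single statement
CREATE EXTENSION IF NOT EXISTS vector;
```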
Decision checklist
Answer each question. Two independent tallies track which engine fits your data and how you should deploy it. Decision Hygiene questions (Q17–Q20) surface process warnings without shifting the tally.
A — Data Model & Access Patterns
Do most reads require joining data across multiple distinct entity types?
Do your entities contain variable-shape sub-objects or deep nesting (e.g. product variants, content blocks, permission trees)?
Will your schema evolve frequently with new entity types?
Do most writes affect a single entity (user, order, device)?
Do most writes update multiple entity types atomically (e.g. transfer funds between accounts, reserve inventory + create order)?
B — Scale & Capabilities
Do users search your data by meaning or relevance — not just exact filters (e.g. semantic product search, similar-item recommendations, faceted full-text search)?
Do you need to react to data changes in real-time — triggering workflows, syncing derived views, or feeding downstream systems as records update?
Do you handle high-frequency writes — IoT telemetry, clickstreams, event logging, or similar ingestion-heavy workloads?
Do you need multi-region data locality or multi-cloud deployment?
C — Operations & Deployment
Does your team lack dedicated database operations expertise (patching, replication, failover runbooks)?
Do you need sub-minute automated failover and continuous backups with point-in-time recovery?
Are zero-downtime upgrades, auto-scaling, and performance advisors worth a managed platform premium?
Is your team deeply invested in one cloud provider's ecosystem (AWS, GCP, or Azure)?
Does your compliance framework require field-level encryption, audit logs, or queryable encryption?
D — Cost & Portability
Is budget a primary constraint with predictable workloads?
Is cloud portability or exit strategy important?
E — Decision Hygiene
Have you modeled TCO at 3× current scale including DBA headcount, egress, and migration costs?
Are you choosing this database because your team knows it and it fits — not because it’s trending?
Have you timeboxed your evaluation to 2–4 weeks with 2–3 finalists?
Have you confirmed a simpler option can’t handle this before reaching for something complex?
Write Your Y-Statement
Now write the decision down. Use the Y-statement template from Section 03. If you cannot fill in every blank, you have not finished the evaluation. Go back to the layer where you got stuck.
Define Your Fitness Functions
A database choice is not a permanent commitment. It is a hypothesis that needs ongoing validation (Ford, Parsons, Kua, Building Evolutionary Architectures, 2nd ed., 2023). Pinterest started with MySQL and evolved to TiDB. Discord started with MongoDB and evolved to ScyllaDB. The best database is the one that is right now, with a clear signal for when it stops being right.
Define automated metrics that signal when your database choice needs revisiting:
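Typical candidates are p95 read latency against its SLO, replication lag, storage growth, and cost per thousand requests. Below is a minimal sketch, assuming a monitoring API you can query for these numbers; fetchMetric, the metric names, and the thresholds are all illustrative placeholders.

```typescript
interface FitnessCheck {
  name: string;
  threshold: number;
  current: () => Promise<number>; // pulled from your monitoring system
}

// fetchMetric is a placeholder for whatever your monitoring stack exposes
declare function fetchMetric(metricName: string): Promise<number>;

const checks: FitnessCheck[] = [
  { name: "p95 read latency (ms)", threshold: 50, current: () => fetchMetric("db.read.p95_ms") },
  { name: "replication lag (s)", threshold: 5, current: () => fetchMetric("db.replication.lag_s") },
  { name: "cost per 1K requests ($)", threshold: 0.4, current: () => fetchMetric("db.cost.per_1k_requests") },
];

// Run on a schedule (e.g. a nightly job) and alert on failures
async function evaluateFitness(): Promise<void> {
  for (const check of checks) {
    const value = await check.current();
    if (value > check.threshold) {
      console.warn(`FITNESS FAIL: ${check.name} = ${value} (threshold ${check.threshold})`);
    }
  }
}
```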
When a fitness function consistently fails, that is your signal to revisit the decision — not to panic, but to run through this framework again with updated data.