MongoDB
A document database with flexible schemas, horizontal scaling, and a polished managed offering.
document nosql json sharding
What it is
MongoDB is a document-oriented database that stores BSON (a binary, typed extension of JSON). It was first released in 2009 by 10gen (now MongoDB Inc.). The pitch is schema-flexible storage with horizontal scaling built in: every record is a document, every collection can have heterogeneous documents, and sharding is a first-class operation.
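To make "binary, typed extension of JSON" concrete, here is a minimal sketch of the BSON wire layout for a flat document. It handles only strings and 32-bit integers, and the function names `encode_element` and `encode_document` are illustrative, not part of any MongoDB library. In practice the `bson` package that ships with PyMongo does this work.

```python
import struct


def encode_element(name: str, value) -> bytes:
    """Encode one field as: type byte, NUL-terminated field name, value."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # 0x02 = UTF-8 string: int32 byte length (counting the NUL), then bytes
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    if isinstance(value, int) and not isinstance(value, bool):
        # 0x10 = 32-bit signed integer, little-endian (int32 range only)
        return b"\x10" + key + struct.pack("<i", value)
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def encode_document(doc: dict) -> bytes:
    """Encode a flat dict as: int32 total size, elements, trailing NUL."""
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    total = 4 + len(body) + 1  # size prefix + elements + terminator
    return struct.pack("<i", total) + body + b"\x00"


encoded = encode_document({"hello": "world"})
print(len(encoded), encoded)
```

Each value carries an explicit type tag, and the field name is written into the document itself rather than into a shared schema.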
In 2018, MongoDB moved from AGPL to SSPL (Server Side Public License), which is not OSI-approved as open source. This affects which organizations can ship or host it commercially.
Why people use it
- Schema flexibility. No migrations to add a field. Documents in the same collection can have different shapes. For rapidly evolving applications, this is faster than versioned schema changes.
- JSON-shaped storage. Documents map naturally to REST API responses and to nested data structures in most modern languages.
- Horizontal scaling. Sharding is built-in. Pick a shard key, register shards, and writes route automatically. (Picking the right shard key is still a hard problem.)
- Replication via replica sets. Three-node replica sets are the standard HA pattern, with automatic failover.
- Aggregation pipeline. Powerful framework for transformations and analytics, with stages like $match, $group, $lookup, $unwind.
- MongoDB Atlas. The managed offering is genuinely excellent: global clusters, online schema changes, integrated full-text search and vector search.
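An aggregation pipeline is an ordered list of stage documents, each feeding its output to the next. The sketch below builds one as plain Python data. It assumes a hypothetical orders collection whose documents have status, customer_id, and an items array with price and qty fields; none of these names come from a real schema.

```python
def revenue_by_customer_pipeline(status: str, limit: int) -> list:
    """Build a pipeline that ranks customers by revenue for one order status."""
    return [
        # Keep only orders in the requested status
        {"$match": {"status": status}},
        # Emit one document per element of the items array
        {"$unwind": "$items"},
        # Sum price * qty per customer
        {"$group": {
            "_id": "$customer_id",
            "revenue": {"$sum": {"$multiply": ["$items.price", "$items.qty"]}},
        }},
        # Highest revenue first, then cap the result size
        {"$sort": {"revenue": -1}},
        {"$limit": limit},
    ]


pipeline = revenue_by_customer_pipeline("shipped", 10)
for stage in pipeline:
    print(stage)

# With PyMongo and a running server, the list would be passed as-is:
#   db.orders.aggregate(pipeline)
```

Placing $match first is a common choice because it shrinks the working set before the more expensive $unwind and $group stages run.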
When to use MongoDB
- Document-shaped data where each document is more or less self-contained.
- Apps where the schema evolves quickly and you don’t want migration overhead.
- Workloads that naturally fit the document model: user profiles, product catalogs, content management, event logs.
- Multi-region apps where Atlas Global Clusters fit the access pattern.
- When you want a managed database with strong DX and integrated extras (Atlas Search, Atlas Vector Search, App Services).
When not to use MongoDB
- Highly relational data with frequent joins. Postgres handles this better; Mongo’s $lookup works but is slower than relational joins.
- Strong cross-document consistency. Multi-document transactions exist (since 4.0 for replica sets, 4.2 for sharded clusters) but come with performance costs.
- Workloads where data is fundamentally structured. Postgres with JSONB gives you structure by default and flexibility in the few columns that need it.
- When SSPL is a blocker. Some organizations and cloud providers can’t ship SSPL code. Atlas is the workaround; self-hosting requires legal review.
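For a sense of what a join looks like in the document model, the sketch below expresses a simple orders-to-customers join as a $lookup pipeline. The collection and field names (orders, customers, customer_id, total, name) are assumptions made up for illustration.

```python
def orders_with_customer_pipeline() -> list:
    """Rough equivalent of:
    SELECT o._id, o.total, c.name
    FROM orders o JOIN customers c ON c._id = o.customer_id
    """
    return [
        # Attach matching customers documents as an array field named "customer"
        {"$lookup": {
            "from": "customers",
            "localField": "customer_id",
            "foreignField": "_id",
            "as": "customer",
        }},
        # Flatten the array; orders with no matching customer are dropped,
        # which gives inner-join behaviour
        {"$unwind": "$customer"},
        # Keep only the columns the SQL version selects
        {"$project": {"_id": 1, "total": 1, "customer_name": "$customer.name"}},
    ]


join_pipeline = orders_with_customer_pipeline()
print(join_pipeline)

# With PyMongo and a running server: db.orders.aggregate(join_pipeline)
```

Three stages stand in for one JOIN clause, and each additional joined collection needs its own $lookup stage.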
Notable trade-offs
- SSPL license. Not OSI-approved as open source. AWS and others have responded by building compatible APIs (Amazon DocumentDB) on different engines.
- Schema flexibility is a double-edged sword. Without discipline, you end up with documents in the same collection that have different shapes, silently. JSON schema validation helps but is opt-in.
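- Storage overhead. BSON stores every field name inside every document, so long names on small documents add up. WiredTiger’s block compression (snappy by default) recovers much of this on disk. Postgres JSONB also stores keys per value, so any space advantage there comes from moving shared fields into regular columns.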
- Aggregation pipeline syntax is unique. Skills don’t transfer to other databases. This is real lock-in.
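- Default consistency was historically weak. Until late 2012, the official drivers defaulted to unacknowledged writes, and only since MongoDB 5.0 (2021) has the default write concern been majority. Modern versions default sensibly, but legacy apps may still carry outdated write-concern settings.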
- Cost. MongoDB Atlas is excellent but priced accordingly. Self-hosting at scale requires real ops investment.
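Since schema validation is opt-in, it has to be attached to a collection explicitly. The sketch below builds a $jsonSchema validator and the collMod command that would apply it to an existing collection. The users collection and its fields (email, created_at, age) are hypothetical, and the chosen validation level and action are one reasonable setting rather than a recommendation.

```python
def user_validator() -> dict:
    """A $jsonSchema validator for a hypothetical users collection."""
    return {
        "$jsonSchema": {
            "bsonType": "object",
            "required": ["email", "created_at"],
            "properties": {
                "email": {"bsonType": "string"},
                "created_at": {"bsonType": "date"},
                # Optional field, but typed and bounded when present
                "age": {"bsonType": "int", "minimum": 0},
            },
        }
    }


def coll_mod_command(collection: str) -> dict:
    """Build the command document; the command name must be the first key."""
    return {
        "collMod": collection,
        "validator": user_validator(),
        # "moderate" skips updates to existing documents that are already invalid
        "validationLevel": "moderate",
        # "error" rejects invalid writes; "warn" only logs them
        "validationAction": "error",
    }


command = coll_mod_command("users")
print(command)

# With PyMongo and a running server: db.command(command)
```

Starting with the "warn" action on an existing collection is a gentler way to find out how many documents already violate the schema before rejecting writes.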
Ecosystem
- MongoDB Atlas. The dominant managed offering, with global clusters, serverless, full-text search, and vector search.
- MongoDB Community Edition. Self-host under SSPL.
- Amazon DocumentDB. MongoDB-compatible API on a different engine. Compatibility is partial, so verify that your specific queries and operators work.
- Azure Cosmos DB. Offers a Mongo API option among others.
- FerretDB. Open-source (Apache 2) Mongo-compatible proxy backed by Postgres. Genuinely permissive license; compatibility is improving over time.