Row vs Column Store

01 Aug, 2025

Source material:

Row-oriented	Column-oriented
MySQL, PostgreSQL	BigQuery, RedShift, Clickhouse
---	---
OLTP (transactional) use cases	OLAP (analytical) use cases
---	---
Software engineers concerned with throughput, latency, consistency (ACID properties), concurrency, performance of CRUD operations, HA, scalability	Data scientists and analysts who do complex queries on large datasets "offline" are concerned with how efficiently data is stored and how long their queries have to run
---	---
Optimized for reading and writing entire records/rows	Optimized for aggregating data from a few columns
---	---
Data is stored contiguously in the same disk block(s) - Fast retrieval of an entire record since all data is physically co-located	Data is stored across a number of disk blocks, each disk location represents values from a single column
---	---
Inserting a new record is easy (append to a single disk block)	Inserting a new record is costly (have to write to multiple disk locations)
---	---
If `SELECT`ing for a single field, all fields (the entire row) must be loaded into memory because rows are stored contiguously on blocks and the entire block needs to be loaded into memory	Slow to add new data or update existing data because it requires writing to multiple disk locations
---	---
More difficult to compress a disk block because records have varying data types	Columns can be compressed more efficiently because compression algorithms like data of the same type.