ADR 0001: Use DuckDB as the storage and execution engine¶

Status: Accepted
Supersedes: none
Superseded by: none

Context¶

bqemulator needs an embedded SQL engine that can execute analytical queries locally with no external dependencies. Candidates considered:

Engine	Pros	Cons
DuckDB	Columnar analytical engine; excellent SQL coverage; ARRAY/STRUCT/JSON native; Arrow interop; tier-1 Python bindings; stable on-disk format	Single-writer model; some BQ features absent (GEOGRAPHY without ext)
SQLite	Ubiquitous; embedded	Row-oriented; weak analytics; no native ARRAY/STRUCT; goccy's chosen engine and source of its SQL gaps
In-memory custom	Maximum fidelity	Enormous implementation burden
ClickHouse / Databend	Columnar, analytical	Not embedded; requires a server process

Decision¶

Use DuckDB. It is the only embedded engine that matches BigQuery's analytical character, and it brings first-class ARRAY/STRUCT/JSON types that map cleanly to GoogleSQL semantics.

Consequences¶

Positive: mature, fast, embedded, no subprocess; Arrow interop makes Storage Read API implementation straightforward; spatial extension gives us GEOGRAPHY; active development and a stable v1.0+ format.
Negative: a single writer at a time — writes serialize on an asyncio lock. Acceptable for emulator workloads; documented characteristic.
Negative: a few BigQuery features (time travel, materialized views, partitioning) have no native DuckDB equivalent and must be built as a layer on top. Accepted; see ADRs 0006, 0009.