Usage¶
pgcarter has two subcommands. Full flag references live in the
CLI reference.
index — schema extraction¶
Connect to PostgreSQL and produce executable SQL, JSON metadata, and template-driven documentation.
pgcarter index \
--host localhost \
--port 5432 \
--database mydb \
--user postgres \
--password secret \
--output-dir ./inventory \
--templates-dir ./templates
Purpose:
- database schema extraction (tables, columns, indexes, constraints, functions, triggers, sequences, roles, privileges, relationships)
- SQL generation — deterministic DDL plus a single ordered
apply.sql - JSON inventory — structured metadata for every asset
The connection never writes: it issues SET default_transaction_read_only = on.
Output structure¶
inventory/
├── sql/
│ ├── apply.sql # full schema, dependency-ordered, one run
│ ├── database.sql extensions.sql roles.sql
│ └── schemas/<schema>/{tables,indexes,functions,views,triggers,sequences,permissions}/
├── json/
│ ├── database.json schemas.json tables.json indexes.json
│ ├── views.json functions.json triggers.json sequences.json
│ ├── extensions.json roles.json permissions.json relationships.json
│ ├── relationships.dot # Graphviz relationship graph
│ ├── schemas/<schema>.json tables/<table>.json
├── docs/
│ └── index.md database.md roles.md permissions.md schemas/<schema>/…
└── report.json # extracted / skipped / warnings / errors
The json/ directory is the input source for offline analysis (below).
analyze — shape analysis & profiling¶
Turn the inventory into an understanding of the database's shape: table sizes, column characteristics, data-quality signals, relationships, and likely design issues. It runs in two modes.
Offline mode (structure only, no database)¶
Analyze an existing JSON inventory with no connection. Identifies possible checks from structure, column names, types, constraints, relationships, and indexes — and records the exact read-only SQL each check would run online.
Online mode (connect and profile)¶
Connect to PostgreSQL and enrich the structural analysis with row counts, null statistics, cardinality estimates, value distributions, freshness checks, and size metrics.
Purpose:
- dataset profiling — row counts, sizes, null rates, cardinality, distributions
- table statistics — preferring PostgreSQL's own
pg_class.reltuplesandpg_total_relation_sizeover scanning - quality checks — duplicate keys, missing FK indexes, unused indexes, suspicious columns, orphaned relationships, and more
Output structure¶
analysis/
├── report.json # full analysis (tables, metrics, checks, warnings)
├── report.md # human-readable summary (rendered from a template)
├── warnings.json # every non-informational finding
├── tables/<table>.json # per-table metrics, columns, checks
├── run-report.json # run summary (extracted counts / errors)
└── docs/analysis/ # index.md, warnings.md, tables/<table>.md
See Configuration for enabling/disabling checks and tuning thresholds, sampling, and timeouts.
Quick start¶
# 1. Extract a schema inventory
pgcarter index --database mydb --output-dir ./inventory
# 2. Profile it offline (no DB needed)
pgcarter analyze --input ./inventory/json --output ./analysis
# 3. Or profile online with sampling and a per-query timeout
pgcarter analyze --database mydb --output ./analysis \
--sample-size 10000 --statement-timeout 15000
Logging¶
Logs are structured JSON on stdout by default. For local development, add
--pretty (or set LOG_PRETTY=true) for colourised console output. See
Configuration → Logging.