# Testing
Verification has two layers:

- local implementation verification in this repository
- cross-repository contract verification with `oas-cli-conformance` and schemas from `oas-cli-spec`
## Baseline commands
From the repo root:

```shell
go test ./...
go build ./cmd/ocli ./cmd/oclird
```
Repository convenience target:

```shell
make verify
```

Currently, `make verify` runs:

```shell
gofmt -w $(find . -name '*.go' -print)
go test ./...
go build ./cmd/ocli ./cmd/oclird
```
## Docs verification
For documentation changes:

```shell
cd website
npm ci
npm run build
```
That verifies Markdown, sidebars, generated routes, and Docusaurus config. CI runs this build alongside the root `make verify`, but `make verify` itself remains Go-only.
## Cross-project contract verification
`spec/` and `conformance/` live in this repository as first-class subprojects, so contract verification runs in-repo without checking out external repositories:
```shell
make verify-spec         # validate spec examples against schemas in spec/schemas/
make verify-conformance  # run conformance fixtures; uses spec/schemas/ automatically
make verify-all          # fmt + test + build + verify-spec + verify-conformance
```
`verify-conformance` uses `spec/schemas/` as the default schema root via the `OCLI_SCHEMA_ROOT` fallback in `conformance/scripts/run_conformance.py`. You can override that with an explicit `--schema-root` flag if needed.
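The resolution order can be sketched in shell terms. The precedence below (explicit flag, then environment variable, then in-repo default) is an assumption about how the runner behaves; the authoritative logic lives in `run_conformance.py`, and `resolve_schema_root` is a hypothetical helper name:

```shell
# Sketch: resolve the schema root roughly the way the conformance runner does.
# Assumed precedence: --schema-root flag, then OCLI_SCHEMA_ROOT, then the
# in-repo default spec/schemas/.
resolve_schema_root() {
  flag_value="$1"   # value passed via --schema-root, empty if omitted
  if [ -n "$flag_value" ]; then
    echo "$flag_value"
  else
    echo "${OCLI_SCHEMA_ROOT:-spec/schemas}"
  fi
}
```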
If you change config semantics, catalog output, schema-facing behavior, or anything else that affects the public contract, run `make verify-all` to confirm the Go implementation, the spec examples, and the conformance fixtures all agree.
## Product tests
`product-tests/` holds end-to-end tests that exercise the CLI against real infrastructure (REST API, OAuth stub, MCP servers). They are separate from `go test ./...` because they require Docker and `npx`.
The live Authentik runtime-auth slice is explicitly opt-in: plain `go test ./...` skips it unless `OCLI_RUN_AUTHENTIK_TESTS=1` is set. Use `cd product-tests && make test-runtime-auth-authentik` for the real container-backed run; CI uses that dedicated target in its own job.
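The opt-in switch is a standard environment gate. A minimal illustrative sketch (the real skip call lives in the Go test code; `authentik_gate` is a hypothetical name for the decision):

```shell
# Sketch of the opt-in gate; the authoritative check is in the Go tests.
# Only the exact value "1" enables the live slice.
authentik_gate() {
  if [ "${OCLI_RUN_AUTHENTIK_TESTS:-}" = "1" ]; then
    echo "run"
  else
    echo "skip"
  fi
}
```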
The broader fleet-based validation program is documented in Fleet Validation. Use that page for the capability matrix, live proof inventory, and rubric artifact workflow.
### Smoke vs full harness
| Mode | Command | What it does | When to run |
|---|---|---|---|
| Smoke | `make product-test-smoke` | Checks prerequisites (Go, Docker, npx) and validates all docker compose config files without starting services | Every PR; runs in CI automatically |
| Full | `make product-test-full` | Validates smoke prerequisites/configs, starts services, then tears them down (capability test runs are a placeholder; no test targets execute yet) | Before merging changes that touch product behavior or infra configs |
### Running product tests locally
```shell
# Quick sanity check — no services started
make product-test-smoke

# Full suite — requires Docker and outbound npm access on first run
make product-test-full
```
From inside `product-tests/` you can also target individual capability groups:
```shell
cd product-tests

# MCP stdio transport (uses npx, no Docker service needed)
make test-mcp-stdio

# MCP remote / streamable-http transport (starts a Docker container)
make test-mcp-remote

# Reproducible fleet lanes
make fleet-matrix-ci

# Dedicated MCP remote fleet lane
make fleet-matrix-mcp-remote

# Explicit control
make check-prereqs
make services-up
go test ./tests/... -run TestCapabilityCatalog -count=1 -v
make services-down
```
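When scripting the explicit-control sequence, it is easy to leave services running after a failed test step. One way to guarantee teardown is a shell `trap`; the sketch below uses `echo` stand-ins for the real `make` targets so the pattern itself stays runnable, and `run_with_services` is a hypothetical helper name:

```shell
# Run a command between setup and guaranteed teardown. The EXIT trap in the
# subshell fires whether "$@" succeeds or fails, so teardown always happens.
run_with_services() {
  (
    trap 'echo "make services-down"' EXIT   # stand-in for: make services-down
    echo "make services-up"                 # stand-in for: make services-up
    "$@"
  )
}
```

In the real harness the stand-ins become `make services-up` / `make services-down`, and the wrapped command would be the `go test ./tests/... -run TestCapabilityCatalog -count=1 -v` invocation shown above.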
### What smoke validates in CI
CI runs `make product-test-smoke` on every push and pull request. That target:
- Checks that Go, Docker, and `npx` are available in the runner.
- Runs `docker compose config --quiet` against `product-tests/docker-compose.yml`.
- Runs `docker compose config --quiet` against `product-tests/mcp/remote/docker-compose.yml`.
No containers are started and no network traffic occurs. If either compose file is malformed the job fails immediately.
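The prerequisite half of that target can be sketched as a small shell function (`check_prereqs` is a hypothetical helper name; the compose validation is shown only as comments because it needs Docker and the repo's compose files):

```shell
# Sketch: fail fast when a required tool is missing from PATH.
check_prereqs() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing prerequisite: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# The smoke target then validates the compose files without starting
# anything, roughly:
#   docker compose -f product-tests/docker-compose.yml config --quiet
#   docker compose -f product-tests/mcp/remote/docker-compose.yml config --quiet
```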
## Useful targeted test entry points
- config merge/validation: `go test ./pkg/config -run TestLoadEffective`
- CLI runtime resolution and embedded mode: `go test ./cmd/ocli -run TestRootCommand`
- runtime HTTP API and auth/policy flows: `go test ./internal/runtime -run TestServer`
- discovery and catalog integration: `go test ./pkg/catalog -run TestBuild`
## What to test for common changes
### Config changes
- schema validation errors
- merge behavior across scopes
- cross-reference validation
### Discovery/catalog changes
- source provenance
- normalized tool IDs and command names
- workflow binding errors
- overlay effects
### Execution/auth changes
- header/query rendering
- retries
- approval gates
- secret resolution behavior
### Runtime HTTP API changes
- success payloads
- plain-text error behavior
- audit side effects
When you add or change behavior, prefer tests in the owning package instead of only end-to-end coverage.