Skip to content

Architecture

Junior is an AI review agent that runs in CI pipelines (GitLab CI, GitHub Actions) or locally as a CLI tool. It is a small review-runbook framework: a runbook collects domain context deterministically, hands it to a schema-agnostic LLM harness for analysis, and publishes the structured result. The built-in runbooks review code diffs (MRs/PRs), but the framework itself knows nothing about git.

Junior has exactly two independent extension points — Runbook and Harness. There is no separate “platform” selector — platform (local / GitHub / GitLab) is an internal detail of a runbook’s own collect/publish, not a globally selectable knob.

flowchart LR
    subgraph A1["RUNBOOK (per domain)"]
        direction TB
        P1["local_review"]
        P2["github_pr_review"]
        P3["gitlab_pr_review"]
        P4["bitbucket_pr_review"]
        P5["…a fork's runbook"]
    end

    subgraph A2["LLM HARNESS (shared, schema-agnostic)"]
        direction TB
        E1["claudecode"]
        E2["codex"]
        E3["pydantic"]
        E4["deepagents"]
        E5["pi"]
    end

    A1 -->|"render → output_schema"| A2
    A2 -->|"validated result"| A1
  • A Runbook (src/junior/runbook/base.py, generic over Context C and Result R) owns one domain. It brings its own context_model / result_model schemas and the domain logic: collect → render → system_prompt → publish. The platform variants live inside the runbook family.
  • A Harness (same file) is shared across all runbooks. It knows nothing about diffs, MRs, or Jira: complete() takes a system prompt, a user message, and an output_schema parameter, calls the model, and returns an LLMResult whose .output is a validated instance of that schema. One set of harnesses serves every runbook.

The key move: the output schema is a parameter of the harness, not a hard-coded type. That is what lets five harnesses serve every runbook. The code-review runbooks ask for LLMReviewOutput; a fork’s Jira runbook would ask the same harnesses for its own TicketAssessment.

flowchart TD
    A([junior CLI]) --> SEL{"--runbook NAME\n(explicit, required)"}
    SEL --> B

    subgraph PL["Runbook (domain-owned)"]
        B["runbook.collect(settings)\n→ Context (C)"] --> R["runbook.render(context,\nfile_access)\n→ user message"]
        R --> SP["runbook.system_prompt(settings)\n(role + BASE_RULES + AGENT.md)"]
    end

    SP --> ENG
    R --> ENG

    subgraph RUN["run_runbook (generic runner)"]
        ENG{"HARNESS?\n(settings.llm.harness)"}
        ENG -->|claudecode| Q2["claude -p subprocess\n--json-schema=output_schema"]
        ENG -->|codex| Q["codex exec subprocess\nstrict schema from output_schema"]
        ENG -->|pydantic| P["pydantic-ai\noutput_type=output_schema"]
        ENG -->|deepagents| D["LangChain agent\nsubmit_tool=output_schema"]
        ENG -->|pi| PI["pi --mode json subprocess\nschema embedded in prompt"]
        Q2 & Q & P & D & PI --> LR([LLMResult\n.output / .usage / .errors])
    end

    LR --> OUT

    subgraph PB["output (per --publish)"]
        OUT{"--publish?"}
        OUT -->|"no"| RAW["render_output() → stdout / -o\n(raw result, default JSON)"]
        OUT -->|"yes"| PUB["runbook.publish()"]
        PUB -->|github_pr_review| X[PR review comments]
        PUB -->|gitlab_pr_review| W[MR note + inline threads]
        PUB -->|bitbucket_pr_review| BB[PR comment + inline comments]
        PUB -->|local_review| MD[pretty Markdown locally]
        PUB -->|weather_advice| RICH[Rich panel]
    end

[!NOTE] run_runbook (src/junior/runbook/runner.py) is given an already-collected context — collection is the caller’s concern, so junior run --from-file ctx.json bypasses collect() entirely. The runner only does harness.complete(...) → runbook.publish(...).

junior run --publish-file out.md skips collect and the LLM entirely. Junior reads the .md, wraps it in a ReviewResult(pre_formatted=...), and calls the runbook’s publish_prepared to post it to the platform. Requires a publishing runbook (github_pr_review / gitlab_pr_review / bitbucket_pr_review) plus the platform token and CI variables — see CI Setup.

Every harness module exposes a module-level HARNESS instance implementing Harness:

class Harness(ABC):
name: ClassVar[str] # matches the HarnessKind member, e.g. "codex"
file_access: ClassVar[bool] # True if the harness reads repo files itself
@abstractmethod
def complete(
self, *,
system_prompt: str,
user_message: str,
output_schema: type[BaseModel], # ← the runbook's result_model
settings: Settings,
) -> LLMResult: ...

LLMResult is an envelope: .output (validated output_schema instance), .usage (Usage token counts), .errors (partial-failure notes, e.g. one sub-agent died). The only schema plumbing differs per harness — claudecode passes the schema via --json-schema, codex builds a strict JSON schema with openai’s to_strict_json_schema, pydantic-ai uses output_type=, deepagents builds a submit tool from it, pi embeds the schema in the system prompt and validates the reply.

file_access lets a runbook avoid inlining a full diff for harnesses that read files themselves (claudecode, codex, piTrue; pydantic, deepagents get the diff inline → False).

class Runbook(ABC, Generic[C, R]):
name: ClassVar[str]
context_model: ClassVar[type[BaseModel]] # C — for JSON round-trip (--from-file)
result_model: ClassVar[type[BaseModel]] # R — the schema handed to the harness
needs_git: ClassVar[bool] = False # preflight requires a .git repo?
@abstractmethod
def collect(self, settings) -> C: ... # may dispatch gitlab/github/local
@abstractmethod
def render(self, context, settings, *, file_access) -> str: ...
def system_prompt(self, settings) -> str: ... # default: ""
@abstractmethod
def publish(self, settings, result, usage, *, errors) -> None: ... # only with --publish
def render_output(self, result) -> str: ... # no --publish output (default JSON)
def validate(self, settings, *, publish_enabled) -> list[str]: ... # runbook-specific checks
def is_blocking(self, result) -> bool: ... # exit-code policy (default False)
def is_empty(self, context) -> bool: ... # skip the LLM (default False)
def summary(self, result) -> dict: ... # final "done" log line

ABCs are used deliberately: this is a forkable framework, and a forgotten method should fail loudly at instantiation (TypeError: Can't instantiate abstract class …) rather than surface as an obscure AttributeError later.

# config.py — enum value = harness module path
class HarnessKind(_ModulePathEnum):
PYDANTIC = "junior.harnesses.pydantic"
CODEX = "junior.harnesses.codex"
CLAUDECODE = "junior.harnesses.claudecode"
DEEPAGENTS = "junior.harnesses.deepagents"
PI = "junior.harnesses.pi"
# registry.py — import the module, read its HARNESS instance
def get_harness(kind: HarnessKind) -> Harness:
module = importlib.import_module(kind.value)
return module.HARNESS

Short names work via _missing_: HarnessKind("codex")HarnessKind.CODEX. New harness = one file exposing HARNESS + one enum member.

Runbook selection is explicit — there is no token auto-detection. Choose it with --runbook NAME, config runbook:, or env RUNBOOK; with none set, the run exits 2 — there is no implicit default.

The registry (src/junior/runbook/registry.py) merges runbooks from four sources:

  1. Built-in — every subpackage of junior.runbooks is auto-discovered via a pkgutil scan (no hardcoded list); importing it runs its registration.
  2. External plugin — a pip-installed package declaring a junior.runbooks entry point (name = "pkg.module:ClassName").
  3. Direct path--runbook "pkg.module:ClassName" loads a Runbook subclass directly.
  4. Repo-local — with local_runbooks: true (opt-in; executes repo code), registry.load_local_runbooks() loads <project>/.junior/runbooks/*.py (@register_runbook classes) and YAML manifests (driving a ScriptRunbook).

See Adding runbooks & harnesses and the runbook framework deep-dive.

The built-in code-review runbooks all review a git diff with the same CollectedContext in and LLMReviewOutput out. They share a base class CodeReviewRunbook (src/junior/runbooks/code_review/base.py) holding render, system_prompt, result_model, is_blocking, summary, and a template-method publish that delegates to _post_to_platform (run only with --publish). The variants differ only in collect() and _post_to_platform(). Without --publish, all of them emit the raw LLMReviewOutput as JSON to stdout/-o:

Runbookcollect--publish does
local_reviewlocal git diffrenders pretty Markdown locally
github_pr_reviewGitHub PR via APIposts PR review comments
gitlab_pr_reviewGitLab MR via APIposts MR note + inline threads
bitbucket_pr_reviewBitbucket DC PR via APIposts PR comment + inline comments

The collector/publisher implementations live in src/junior/collect/{local,gitlab,github,bitbucket}.py (collect(settings) -> CollectedContext) and src/junior/publish/{local,gitlab,github,bitbucket}.py (post_review(settings, result)). Each runbook variant imports the one it needs directly — there is no central resolved_collector/resolved_publisher dispatch.

Settings split into three frozen groups plus two top-level scalars:

GroupHolds
settings.context (ContextSettings)source/diff inputs, project dir, context.prompts (the task layer)
settings.llm (LLMSettings)harness (the LLM driver), model, API keys, system_prompt (role layer), limits
settings.output (OutputSettings)output file, platform tokens/IDs, publish flag
settings.runbookwhich runbook to run (registry name or module:ClassName)
settings.log_levellogging verbosity

[!NOTE] settings.llm was previously settings.review (ReviewSettings). system_prompt is the role layer the runner merges on top of the runbook’s own system_prompt(); for code review the task layer is context.prompts.

Validation is split: generic checks (context files, LLM API key) live in Settings.preflight; runbook-specific checks (e.g. a publishing runbook needs GITHUB_TOKEN) live in Runbook.validate.

src/junior/
cli/ ← Typer CLI surface (split by concern)
app.py ← @app.callback + subcommands (run, dry-run, list, init, config)
options.py ← reusable Annotated option types + panel constants
settings_builder.py ← Settings construction from config + CLI flags
observability.py ← structlog setup + startup logging
config_show.py ← `config show` (effective config + status) + `config env`
listing.py ← `junior list` / `config list`: runbooks + harnesses (+ readiness)
console.py ← shared Rich consoles: stdout (content) + stderr (status/errors)
actions.py ← orchestration: collect, run, emit raw output / publish
interactive/ ← wizard for `init` + `run -i`
config.py ← Settings groups (Context / LLM / Output) + runbook + log_level + local_runbooks
init_config.py ← `junior config init` saves to ~/.config/junior/settings.yaml
models.py ← deprecated shim → runbooks/code_review/models.py
run_record.py ← write_run_record(): secret-free .junior/output/{ts}.json
prompt_loader.py ← inline text + file:// URIs → Prompt objects
runbook/ ← the framework (domain-agnostic core)
base.py ← Runbook + Harness ABCs, LLMResult, Usage
runner.py ← run_runbook(): complete → publish only when --publish
registry.py ← get_runbook (built-in + entry-point + path + repo-local), get_harness
harnesses/ ← shared LLM drivers (each exposes HARNESS)
claudecode.py ← claude -p subprocess, --json-schema
codex.py ← codex exec subprocess, strict JSON schema
pydantic.py ← pydantic-ai, single structured call (output_type)
deepagents.py ← LangChain agent, submit_tool from output_schema
pi.py ← pi --mode json subprocess, schema embedded in prompt
runbooks/ ← built-in runbooks (auto-discovered)
code_review/
base.py ← CodeReviewRunbook (shared render/prompt/publish)
models.py ← domain models: CollectedContext, LLMReviewOutput, ReviewResult, … (frozen)
local.py github.py gitlab.py bitbucket.py ← per-platform collect + _post_to_platform
render.py ← build_user_message()
instructions.py ← build_review_prompt() + BASE_RULES (AGENT.md/CLAUDE.md)
weather/ ← example non-code-review runbook (live weather → what to wear)
script/ ← ScriptRunbook: manifest-driven (prompt + optional schema/collect/publish)
collect/ ← deterministic context collection (imported by runbooks)
local.py gitlab.py github.py bitbucket.py
core/ ← collect_base, diff parsing, base-SHA resolve
publish/ ← result posting (imported by runbooks)
local.py gitlab.py github.py bitbucket.py
core/ ← markdown formatter, format_inline_comment

The harnesses are schema-agnostic and never import a runbook; the runbooks own all prompt/context/platform logic and import the collect/publish helpers they need. The runner ties runbook and harness together generically.