Junior — Architecture¶
Overview¶
Junior is an AI code review agent that runs in CI pipelines (GitLab CI, GitHub Actions) or locally as CLI tool. It collects MR/PR context deterministically, delegates analysis to AI agents, and publishes structured review comments.
Review Flow¶
flowchart TD
A([junior CLI]) --> B
subgraph P1["Phase 1 — Collect (deterministic)"]
B[git diff + parse changed files] --> C[commit messages]
C --> D["extra context\n--context (text) + --context-file (files)"]
D --> H{platform token?}
H -->|GITLAB_TOKEN| I[fetch MR metadata\nfrom GitLab API]
H -->|GITHUB_TOKEN| J[fetch PR metadata\nfrom GitHub API]
H -->|none| K[local only]
I & J & K --> L([CollectedContext])
end
L --> M
subgraph P2["Phase 2 — AI Review"]
M[load prompts/*.md] --> N[build user message]
N --> O{AGENT_BACKEND?}
O -->|claudecode| Q2[claude -p subprocess\nreads files via tools]
O -->|pydantic| P[parallel sub-agents\nvia asyncio.gather\n+ summary agent]
O -->|codex| Q[codex exec subprocess\nreads files via sandbox]
O -->|deepagents| R[LLM orchestrator\n+ subagents]
P & Q2 & Q & R --> S([ReviewResult])
end
S --> T
subgraph P3["Phase 3 — Output"]
T[format markdown] --> U[stdout / -o file]
U -->|"--publish"| V{platform?}
V -->|GITLAB_TOKEN| W[MR note +\ninline threads]
V -->|GITHUB_TOKEN| X[PR comment +\nreview comments]
end
--publish FILE shortcut¶
When --publish receives a file path, the entire pipeline is skipped. Junior reads the .md file, wraps it in a ReviewResult(pre_formatted=...), and publishes directly to the platform. Requires a platform token and CI variables — see CI Setup.
Pipeline (text)¶
junior --prompts common
│
├─ Phase 1: COLLECT (deterministic, no AI)
│ ├─ git diff → parse changed files → commit messages
│ ├─ extra context: --context (text) AND --context-file (files)
│ └─ platform enrichment: GitLab/GitHub API → MR/PR metadata (if token set)
│
├─ Phase 2: REVIEW (AI)
│ ├─ load prompts from prompts/*.md
│ ├─ build user message (context_builder.py)
│ └─ dispatch to agent backend:
│ ├─ pydantic → parallel sub-agents + summary agent
│ ├─ claudecode → claude -p subprocess (reads files via tools)
│ ├─ codex → codex exec subprocess (reads files via sandbox)
│ └─ deepagents → LLM orchestrator + subagents
│
└─ Phase 3: OUTPUT
├─ always: format markdown → stdout or -o file
└─ if --publish: also post to GitLab MR notes / GitHub PR comments
Backend Dispatch Pattern¶
All three components (collect, agent, publish) are interfaces — each has multiple implementations that are interchangeable at runtime. The implementation is selected by enum value, which is a Python module path.
Contracts¶
Each backend module must export one function with a fixed signature:
| Component | Function | Signature |
|---|---|---|
| Collector | collect() |
(settings: Settings) -> CollectedContext |
| Agent | review() |
(context: CollectedContext, settings: Settings, prompts: list[Prompt]) -> ReviewResult |
| Publisher | post_review() |
(settings: Settings, result: ReviewResult) -> None |
Implementations vary widely — subprocess calls (claudecode, codex), async SDK (pydantic), LLM orchestrator (deepagents) — but all conform to the same interface.
Dispatch¶
# config.py — enum value = module path
class AgentBackend(str, Enum):
PYDANTIC = "junior.agent.pydantic"
CLAUDECODE = "junior.agent.claudecode"
CODEX = "junior.agent.codex"
DEEPAGENTS = "junior.agent.deepagents"
# agent/__init__.py — dispatch
module = importlib.import_module(backend.value)
return module.review(context, settings, prompts)
New backend = one file + one enum member. See Adding backends.
Short names work via _missing_: AgentBackend("pydantic") → AgentBackend.PYDANTIC.
Platform Auto-Detection¶
Collector and publisher are auto-detected from token presence: