Architecture
Seven stages, one job, zero prompt engineering.
Praxa converts what a real person does into a custom agent that does it. Each stage is auditable; nothing is magic.
Audit log on every action
input → reasoning → considered alternatives → confidence → output
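As a rough illustration, an audit entry carrying those five fields could be modeled as a small immutable record. This is a sketch only: the class name `AuditRecord`, the field names, and the JSON serialization are assumptions made for the example, not Praxa's actual log schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit entry mirroring the five fields named above."""
    input: str
    reasoning: str
    considered_alternatives: tuple
    confidence: float
    output: str

    def to_json(self) -> str:
        # asdict keeps declaration order, so every log line has the same shape.
        return json.dumps(asdict(self))


record = AuditRecord(
    input="PR opened: fix null check in parser",
    reasoning="Change is small and covered by existing tests.",
    considered_alternatives=("request changes", "escalate to a human"),
    confidence=0.92,
    output="approve",
)
line = record.to_json()
```

Freezing the record is a design choice for the sketch: an audit entry that can be edited after the fact is not much of an audit entry.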
- 01
Observe
Praxa connects via OAuth + a Praxa-provisioned bot user — never by scraping a real person's credentials. It reads commits, PRs, tickets, threads, calendar events, and decisions in the systems you already use. Output: a structured profile of the role, with examples, edge cases, escalation patterns, and tool usage.
- 02
Distill
The observation pipeline converts raw artifacts into a job specification: a north-star description, concrete examples, acceptance criteria, and out-of-scope examples. This document is what the Skill is held accountable to.
- 03
Match
Praxa picks the closest open-source Skill template from the catalog — PR-Reviewer, Issue-Triager, Sentry-Investigator, etc. — and uses it as the architectural base.
- 04
Generate
The agent-builder pipeline (the IP) layers the distilled job spec on top of the template, producing a custom Skill manifest. The manifest contains LOCKED fields (job spec, trigger source, tool allowlist, memory namespace, action policy, confidence threshold) and CONFIGURABLE fields (additional context, filters).
- 05
Eval
A tailored eval suite is generated alongside the manifest. Test cases come from the real artifacts Praxa observed, plus synthetic edge cases the builder generates. The Skill must pass a published threshold (typically 85%+) before it can deploy.
- 06
Review
A human reviews the manifest in the dashboard before deploy. Locked fields are immutable; configurable fields can be tuned without redeploying. The reviewer can request changes — the agent-builder regenerates and re-runs evals.
- 07
Deploy
Once approved, the Skill goes live. The runtime middleware enforces every locked field on every action: trigger sources outside the allowlist are dropped, tool calls outside the allowlist are blocked, memory writes to other namespaces are refused, sub-agent dispatches are blocked. The middleware is the contract.
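The split between locked and configurable fields (stage 04) and the eval gate (stage 05) can be pictured with a short sketch. Everything here is assumed for illustration: the class names `LockedFields` and `SkillManifest`, the example values, the tool names, and the `can_deploy` helper with its 0.85 default are hypothetical, not the real manifest format.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class LockedFields:
    """Hypothetical locked section; frozen, so attribute assignment raises."""
    job_spec: str
    trigger_source: str
    tool_allowlist: frozenset
    memory_namespace: str
    action_policy: str
    confidence_threshold: float


@dataclass
class SkillManifest:
    locked: LockedFields
    # Configurable fields stay mutable so they can be tuned after deploy.
    additional_context: str = ""
    filters: list = field(default_factory=list)


def can_deploy(passed: int, total: int, threshold: float = 0.85) -> bool:
    """Eval gate: pass rate must meet the threshold (0.85 is an assumed default)."""
    if total == 0:
        return False  # no evals run means nothing was demonstrated
    return passed / total >= threshold


manifest = SkillManifest(
    locked=LockedFields(
        job_spec="Review pull requests for the payments service",
        trigger_source="github.pull_request.opened",
        tool_allowlist=frozenset({"read_diff", "post_review_comment"}),
        memory_namespace="pr-reviewer",
        action_policy="comment-only",
        confidence_threshold=0.8,
    )
)

manifest.filters.append("label:needs-review")  # configurable: allowed

try:
    manifest.locked.confidence_threshold = 0.5  # locked: rejected
    locked_field_changed = True
except FrozenInstanceError:
    locked_field_changed = False
```

A frozen dataclass only models immutability inside one process. In the architecture described above, the guarantee comes from the runtime middleware checking every action, not from the object itself.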
Runtime guarantees
What "one Skill = one job" actually buys you.
Trigger isolation
A PR-Reviewer Skill bound to github.pull_request.opened cannot process a Slack message, a Jira ticket, or a Sentry alert. The runtime drops events outside the allowlist before the agent ever sees them.
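A minimal sketch of that filter, assuming events arrive as dictionaries with a `type` key. The `admit` function and the Slack and Sentry event names are hypothetical; only `github.pull_request.opened` comes from the text above.

```python
ALLOWED_TRIGGERS = frozenset({"github.pull_request.opened"})


def admit(event: dict, allowed: frozenset = ALLOWED_TRIGGERS) -> bool:
    """True only when the event's type is in the Skill's trigger allowlist."""
    return event.get("type") in allowed


incoming = [
    {"type": "github.pull_request.opened", "id": 1},
    {"type": "slack.message.posted", "id": 2},  # hypothetical event name
    {"type": "sentry.alert.fired", "id": 3},    # hypothetical event name
]

# Filter before anything reaches the agent; dropped events are never handed over.
delivered = [event for event in incoming if admit(event)]
```

The point of filtering at this layer is that the agent has no opportunity to be talked into handling an off-scope event, because it never receives one.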
Tool allowlisting
Each Skill ships with a tool allowlist capped at 20 tools. The runtime blocks calls to anything else with a typed ToolNotAllowedError. No "give the agent every tool and pray" antipattern.
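One way such a gate might look, as a sketch under assumptions: `ToolNotAllowedError` is named in the text, but its constructor, the `ToolGate` wrapper, the registry shape, and the tool names `read_diff` and `delete_branch` are all invented for the example.

```python
from typing import Any, Callable, Dict, Iterable

MAX_TOOLS = 20  # cap taken from the text above


class ToolNotAllowedError(Exception):
    """Raised when a Skill calls a tool outside its allowlist (shape assumed)."""

    def __init__(self, tool_name: str) -> None:
        super().__init__(f"tool not allowed: {tool_name}")
        self.tool_name = tool_name


class ToolGate:
    """Hypothetical wrapper that checks every call against the allowlist."""

    def __init__(self, allowlist: Iterable[str],
                 registry: Dict[str, Callable[..., Any]]) -> None:
        allowed = frozenset(allowlist)
        if len(allowed) > MAX_TOOLS:
            raise ValueError(f"allowlist has {len(allowed)} tools; the cap is {MAX_TOOLS}")
        self._allowed = allowed
        self._registry = registry

    def call(self, name: str, *args: Any, **kwargs: Any) -> Any:
        # Check the allowlist first, so a blocked tool is never looked up or run.
        if name not in self._allowed:
            raise ToolNotAllowedError(name)
        return self._registry[name](*args, **kwargs)


registry = {
    "read_diff": lambda pr: f"diff for PR {pr}",
    "delete_branch": lambda branch: f"deleted {branch}",
}
gate = ToolGate(allowlist=["read_diff"], registry=registry)

allowed_result = gate.call("read_diff", 42)
try:
    gate.call("delete_branch", "main")
    blocked = False
except ToolNotAllowedError as err:
    blocked = True
    blocked_tool = err.tool_name
```

A typed error, rather than a silent no-op, lets the caller log the refusal and tell it apart from an ordinary tool failure.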
Memory namespacing
Every Skill writes only to its own memory namespace; reads also cover the workspace-shared namespace. The fifth Skill starts with what the first four learned, but it can't corrupt their state.
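The write-own, read-own-plus-shared rule could be sketched as follows. The `NamespacedMemory` class, the `NamespaceError` exception, and the namespace names are hypothetical. The text does not say how entries reach the shared namespace, so this sketch simply seeds it at construction time.

```python
from typing import Dict, Optional

SHARED = "workspace-shared"


class NamespaceError(Exception):
    """Raised when a Skill touches a namespace it has no right to."""


class NamespacedMemory:
    """Hypothetical store: write to your own namespace, read own plus shared."""

    def __init__(self, shared: Optional[Dict[str, str]] = None) -> None:
        # Assumption: shared entries are provided up front.
        self._data: Dict[str, Dict[str, str]] = {SHARED: dict(shared or {})}

    def write(self, skill_ns: str, target_ns: str, key: str, value: str) -> None:
        if target_ns != skill_ns:
            raise NamespaceError(f"{skill_ns} cannot write to {target_ns}")
        self._data.setdefault(target_ns, {})[key] = value

    def read(self, skill_ns: str, target_ns: str, key: str) -> Optional[str]:
        if target_ns not in (skill_ns, SHARED):
            raise NamespaceError(f"{skill_ns} cannot read {target_ns}")
        return self._data.get(target_ns, {}).get(key)


memory = NamespacedMemory(shared={"style": "prefer small PRs"})
memory.write("pr-reviewer", "pr-reviewer", "last_repo", "payments")

own_value = memory.read("pr-reviewer", "pr-reviewer", "last_repo")
shared_value = memory.read("issue-triager", SHARED, "style")

try:
    memory.write("issue-triager", "pr-reviewer", "last_repo", "overwritten")
    write_refused = False
except NamespaceError:
    write_refused = True
```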
Self-validation
Every action runs through a generate → critique → revise chain. Below the configured confidence threshold, the Skill escalates instead of acting.
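A sketch of that chain with the escalation check at the end. The `Draft` type, the `self_validate` function, and the three stand-in steps are assumptions for illustration; in a real system each step would presumably be a model call, and how confidence is computed is not specified in the text.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Draft:
    text: str
    confidence: float


def self_validate(
    task: str,
    generate: Callable[[str], Draft],
    critique: Callable[[Draft], str],
    revise: Callable[[Draft, str], Draft],
    threshold: float,
) -> Tuple[str, str]:
    """Run generate -> critique -> revise, then act or escalate on confidence."""
    draft = generate(task)
    feedback = critique(draft)
    final = revise(draft, feedback)
    if final.confidence < threshold:
        return ("escalate", final.text)
    return ("act", final.text)


# Stand-in steps with fixed behavior, so the flow is easy to follow.
def generate(task: str) -> Draft:
    return Draft(text=f"Approve: {task}", confidence=0.6)


def critique(draft: Draft) -> str:
    return "Mention the missing test."


def revise(draft: Draft, feedback: str) -> Draft:
    return Draft(text=f"{draft.text} ({feedback})", confidence=draft.confidence + 0.25)


decision, text = self_validate("PR 42", generate, critique, revise, threshold=0.8)
strict_decision, _ = self_validate("PR 42", generate, critique, revise, threshold=0.9)
```

With the same revised draft, a threshold of 0.8 lets the Skill act while 0.9 forces an escalation, which is why the threshold sits among the locked fields rather than the tunable ones.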
Skip the demo, read the ADRs.
Architecture decision records are public. Engineering teams learn more from reading them than they would from a sales call.