Architecture

Seven stages, one job, zero prompt engineering.

Praxa converts what a real person does into a custom agent that does it. Each stage is auditable; nothing is magic.

01 Observe

02 Distill

03 Match

04 Generate

05 Eval

06 Review

07 Deploy

Audit log on every action

input → reasoning → considered alternatives → confidence → output
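
To make the chain above concrete, here is a minimal sketch of what one audit entry could look like. The class name, field names, and sample values are assumptions for illustration, not Praxa's actual schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    """One audit-log entry. Field names are hypothetical; they mirror the chain above."""

    input: str
    reasoning: str
    considered_alternatives: tuple[str, ...]
    confidence: float
    output: str

    def to_json(self) -> str:
        # Sorted keys keep serialized entries stable and easy to diff.
        return json.dumps(asdict(self), sort_keys=True)


# Illustrative values only.
record = AuditRecord(
    input="pull request opened (illustrative event)",
    reasoning="Diff touches auth middleware, so a closer look is warranted.",
    considered_alternatives=("approve", "request changes"),
    confidence=0.91,
    output="request changes",
)
print(record.to_json())
```

Freezing the record is a design choice in this sketch: an audit entry that can be edited after the fact is not much of an audit entry.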

01 Observe

Praxa connects through OAuth and a Praxa-provisioned bot user, never by reusing a real person's credentials. It reads commits, PRs, tickets, threads, calendar events, and recorded decisions in the systems you already use. Output: a structured profile of the role, with examples, edge cases, escalation patterns, and tool usage.

02 Distill

The observation pipeline converts raw artifacts into a job specification: a north-star description, concrete examples, acceptance criteria, and out-of-scope examples. This document is what the Skill is held accountable to.

03 Match

Praxa picks the closest open-source Skill template from the catalog — PR-Reviewer, Issue-Triager, Sentry-Investigator, etc. — and uses it as the architectural base.

04 Generate

The agent-builder pipeline (Praxa's core IP) layers the distilled job spec on top of the template, producing a custom Skill manifest. The manifest contains LOCKED fields (job spec, trigger source, tool allowlist, memory namespace, action policy, confidence threshold) and CONFIGURABLE fields (additional context, filters).
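
One way to picture the locked/configurable split is a frozen record for the locked fields next to a mutable one for the configurable fields. This is a sketch under assumptions: the class names, tool names, and values below are hypothetical, and the real manifest format is not shown in this page.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class LockedFields:
    """The six locked fields named above; immutable once generated."""

    job_spec: str
    trigger_source: str
    tool_allowlist: frozenset[str]
    memory_namespace: str
    action_policy: str
    confidence_threshold: float


@dataclass
class ConfigurableFields:
    """The configurable fields named above; safe to tune after review."""

    additional_context: str = ""
    filters: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class SkillManifest:
    locked: LockedFields
    configurable: ConfigurableFields = field(default_factory=ConfigurableFields)


# All values are illustrative, including the tool names and the threshold.
manifest = SkillManifest(
    locked=LockedFields(
        job_spec="Review pull requests against team conventions.",
        trigger_source="github.pull_request.opened",
        tool_allowlist=frozenset({"read_diff", "post_review_comment"}),
        memory_namespace="pr-reviewer",
        action_policy="comment_only",
        confidence_threshold=0.8,
    )
)

# Configurable fields can be tuned in place.
manifest.configurable.filters.append("label:needs-review")

# Locked fields reject writes.
try:
    manifest.locked.confidence_threshold = 0.5
    locked_write_rejected = False
except FrozenInstanceError:
    locked_write_rejected = True

print(locked_write_rejected)
```

In this sketch the manifest itself is frozen too, so the locked block cannot be swapped out wholesale; only the contents of the configurable block change.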

05 Eval

A tailored eval suite is generated alongside the manifest. Test cases come from real artifacts captured during Observe, plus synthetic edge cases the builder generates. The Skill must pass a published threshold (typically 85%+) before it can deploy.
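
The deploy gate described above reduces to a pass-rate check. The sketch below assumes the threshold is a simple fraction of passing cases; the names `EvalCase`, `pass_rate`, and `may_deploy` are hypothetical, and real scoring may well be weighted differently.

```python
from dataclasses import dataclass

PASS_THRESHOLD = 0.85  # the "typically 85%+" figure from the text, as a fraction


@dataclass(frozen=True)
class EvalCase:
    name: str
    source: str  # "observed" (from real artifacts) or "synthetic" (builder-generated)
    passed: bool


def pass_rate(cases: list[EvalCase]) -> float:
    """Fraction of cases that passed; an empty suite counts as 0.0."""
    if not cases:
        return 0.0
    return sum(case.passed for case in cases) / len(cases)


def may_deploy(cases: list[EvalCase], threshold: float = PASS_THRESHOLD) -> bool:
    """True only when the suite meets or exceeds the threshold."""
    return pass_rate(cases) >= threshold


# Illustrative suite: 17 observed cases pass, 1 of 3 synthetic cases passes.
cases = [EvalCase(f"observed-{i}", "observed", True) for i in range(17)] + [
    EvalCase("synthetic-0", "synthetic", True),
    EvalCase("synthetic-1", "synthetic", False),
    EvalCase("synthetic-2", "synthetic", False),
]

print(pass_rate(cases), may_deploy(cases))
```

Treating an empty suite as a failure is deliberate here: a Skill with no evals should not clear the gate by default.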

06 Review

A human reviews the manifest in the dashboard before deploy. Locked fields are immutable; configurable fields can be tuned without redeploying. If the reviewer requests changes, the agent-builder regenerates the manifest and re-runs the evals.

07 Deploy

Once approved, the Skill goes live. The runtime middleware enforces every locked field on every action: trigger sources outside the allowlist are dropped, tool calls outside the allowlist are blocked, memory writes to other namespaces are refused, sub-agent dispatches are blocked. The middleware is the contract.

Runtime guarantees

What "one Skill = one job" actually buys you.

Trigger isolation

A PR-Reviewer Skill bound to github.pull_request.opened cannot process a Slack message, a Jira ticket, or a Sentry alert. The runtime drops events outside the allowlist before the agent ever sees them.
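
A minimal sketch of the drop-before-dispatch behavior, assuming events carry a dotted source string. `Event`, `TriggerGate`, and the `slack.message` source name are hypothetical; only `github.pull_request.opened` comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str
    payload: dict


class TriggerGate:
    """Forwards events from allowed sources; drops everything else."""

    def __init__(self, allowed_sources, handler):
        self._allowed = frozenset(allowed_sources)
        self._handler = handler
        self.dropped = []  # sources of dropped events, kept for auditing

    def dispatch(self, event: Event) -> bool:
        if event.source not in self._allowed:
            # The handler (the agent) never sees this event.
            self.dropped.append(event.source)
            return False
        self._handler(event)
        return True


seen = []  # stands in for the agent's inbox
gate = TriggerGate({"github.pull_request.opened"}, seen.append)

gate.dispatch(Event("github.pull_request.opened", {"title": "Fix typo"}))
gate.dispatch(Event("slack.message", {"text": "can you look at this?"}))

print(len(seen), gate.dropped)
```

The check sits in front of the handler rather than inside it, so the filtering does not depend on the agent behaving well.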

Tool allowlisting

Each Skill ships with a max-20 tool allowlist. The runtime blocks calls to anything else with a typed ToolNotAllowedError. No "give the agent every tool and pray" antipattern.
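
Here is one way the allowlist check could look. `ToolNotAllowedError` and the 20-tool cap are named in the text; the `ToolRegistry` class, its methods, and the tool names are assumptions for illustration.

```python
MAX_TOOLS = 20  # the "max-20" cap from the text


class ToolNotAllowedError(Exception):
    """Raised when a Skill calls a tool outside its allowlist."""

    def __init__(self, tool_name: str):
        super().__init__(f"tool {tool_name!r} is not in this Skill's allowlist")
        self.tool_name = tool_name


class ToolRegistry:
    """Holds the allowlisted tools for one Skill (hypothetical class)."""

    def __init__(self, tools: dict):
        if len(tools) > MAX_TOOLS:
            raise ValueError(f"allowlist has {len(tools)} tools; the cap is {MAX_TOOLS}")
        self._tools = dict(tools)

    def call(self, name: str, *args, **kwargs):
        # Only the lookup is guarded, so errors raised by the tool itself pass through.
        try:
            tool = self._tools[name]
        except KeyError:
            raise ToolNotAllowedError(name) from None
        return tool(*args, **kwargs)


# Hypothetical tool names.
registry = ToolRegistry({"read_diff": lambda pr: f"diff for {pr}"})
result = registry.call("read_diff", "example-pr")

try:
    registry.call("delete_repo")
    blocked = None
except ToolNotAllowedError as err:
    blocked = err.tool_name

print(result, blocked)
```

A typed error lets callers tell "blocked by policy" apart from "tool failed", which matters when the block itself should be logged.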

Memory namespacing

Every Skill writes only to its own memory namespace; reads also cover workspace-shared memory. The fifth Skill starts with access to what the first four learned, but it can't corrupt their state.

Self-validation

Every action runs through a generate → critique → revise chain. Below the configured confidence threshold, the Skill escalates instead of acting.
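
A rough sketch of the chain, assuming the critique step is what produces the confidence score and that revision is retried a bounded number of times. Both are assumptions, as are the function names and the stub model calls; the text above only names the three steps and the escalation rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Critique:
    confidence: float
    feedback: str


@dataclass(frozen=True)
class Outcome:
    action: str  # "act" or "escalate"
    draft: str
    confidence: float


def self_validate(task, generate, critique, revise, threshold, max_revisions=2):
    """Run generate -> critique -> revise, then act or escalate on confidence."""
    draft = generate(task)
    review = critique(task, draft)
    for _ in range(max_revisions):
        if review.confidence >= threshold:
            break
        draft = revise(task, draft, review.feedback)
        review = critique(task, draft)
    action = "act" if review.confidence >= threshold else "escalate"
    return Outcome(action, draft, review.confidence)


# Stubs standing in for model calls; confidence rises with each revision.
SCORES = [0.5, 0.7, 0.9]


def generate(task):
    return f"draft: {task}"


def critique(task, draft):
    revisions = draft.count("[revised]")
    return Critique(SCORES[min(revisions, 2)], "cite the relevant diff lines")


def revise(task, draft, feedback):
    return draft + " [revised]"


acted = self_validate("summarize PR", generate, critique, revise, threshold=0.8)
escalated = self_validate("summarize PR", generate, critique, revise, threshold=0.95)
print(acted.action, escalated.action)
```

Capping revisions keeps the loop from grinding on a task it cannot get right; once the budget runs out below the threshold, the sketch escalates instead of acting.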

Skip the demo, read the ADRs.

Architectural decision records are public. Engineering teams reading them learn faster than they would from a sales call.
