Evaluation License

Choose this if you want to keep evaluating on your own

Evaluation is the right next step when Community has proven the basics and you now need more realistic limits, approval behavior, simulation, and audit depth without committing to a partnership process.

Best for platform teams, security reviews, staging environments, and regulated internal evaluations
Still self-hosted, still reversible, still safe to run inside your environment
If you want direct architecture support and full enterprise rollout help instead, use Design Partner

Start here if you want to evaluate quickly

Getting Started if you still need to bring the stack up locally
Regulated Environment Evaluation if security, audit, and deployment boundaries are central to the review
Community vs Evaluation vs Enterprise if you are deciding which landing zone fits your team

Most teams reach this page after they already know AxonFlow fits their use case and now want a more realistic validation run.

Evaluation vs Community -- more capacity, plus exclusive features

Feature	Community	Evaluation (Free)
Tenant Policies	20	50
Org-Wide Policies	0	5
Connectors with Custom Policies	2	5
Audit Retention	3 days	14 days
LLM Providers	2	3
Execution History	50	500
Concurrent Executions	5	25
MAP Plans	25	100
Versions per Plan	10	25
SSE Connections	5	25
Cost Estimates / Day	10	100
Pending Execution Approvals	5	25
Media Analyzers	2	2
HITL Approval Gates	—	100 pending, 24h expiry
MAP Plane-Scoped HITL Approve/Reject	—	Available (v7.4.0)
Cross-Plane Response Parity (WCP + MAP)	—	Available (v7.4.0)
MAP Plane-Scoped Pending Approvals (with plan_id filter)	—	Available (v7.4.0)
Risk-Tiered Approval Routing	Severity metadata only (no queue)	Severity + queue filter
Session Overrides	Context only	Create, 60m default / 24h cap
Decision Explainability	7-day retention	30-day retention
Workflow Checkpoint Resume	List only	Resume from last
Policy Simulation	—	300/day
Evidence Export	—	14-day, 3/day
Retry Context on Step Gates (wire-level)	Available (per-workflow)	Available (per-workflow)
Idempotency Keys on Step Gates (wire-level)	Available (per-workflow)	Available (per-workflow)
Retry-Aware Policy Conditions	—	Author policies on step.gate_count, step.prior_completion_status, step.idempotency_key, and other step.* fields
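
As a rough planning aid, the numeric limits in the table can be checked against a planned workload. The Python sketch below copies a handful of limits from the table; the dictionary keys and the exceeded_limits helper are hypothetical names chosen for illustration and are not part of AxonFlow.

```python
# Numeric limits copied from the comparison table above.
# Key names are hypothetical; AxonFlow does not necessarily use them.
LIMITS = {
    "community": {
        "tenant_policies": 20,
        "org_wide_policies": 0,
        "concurrent_executions": 5,
        "map_plans": 25,
        "sse_connections": 5,
    },
    "evaluation": {
        "tenant_policies": 50,
        "org_wide_policies": 5,
        "concurrent_executions": 25,
        "map_plans": 100,
        "sse_connections": 25,
    },
}


def exceeded_limits(tier, planned):
    """Return the sorted names of limits that the planned usage would exceed."""
    limits = LIMITS[tier]
    return sorted(name for name, needed in planned.items() if needed > limits[name])


planned = {"tenant_policies": 30, "org_wide_policies": 2, "concurrent_executions": 4}
print(exceeded_limits("community", planned))
print(exceeded_limits("evaluation", planned))
```

A workload that returns an empty list for Evaluation but not for Community is a reasonable signal that the Evaluation license is the right next step.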

Concurrent Executions -- MAP and WCP executions running at the same time per tenant

Pending Execution Approvals -- executions waiting for human approval in MAP confirm/step mode or WCP queues

MAP Plans -- multi-agent plans that break complex tasks into coordinated steps

Versions per Plan -- how many revisions of a single MAP plan are retained

SSE Connections -- server-sent event connections for streaming execution progress in real time

Cost Estimates / Day -- number of LLM cost estimation requests allowed per day

Execution History -- completed execution records kept for review and audit

Media Analyzers -- concurrent image analysis modules (OCR, content safety, face detection) per request

HITL Approval Gates -- human-in-the-loop checkpoints requiring explicit approval before execution proceeds

MAP Plane-Scoped HITL Approve/Reject -- plan-scoped approve / reject endpoints (POST /api/v1/plans/{id}/steps/{step_id}/approve|reject). Previously Enterprise-only; v7.4.0 lowered them to Evaluation so reviewer integrations hit the same tier gate on both planes (WCP and MAP). The response shape matches the WCP workflow-scoped endpoints byte-for-byte, apart from one added plan_id field.
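
A reviewer integration might build the plan-scoped URLs along these lines. This is a minimal sketch: the path template comes from the description above, while the base URL, the step_decision_url helper, and the example identifiers are assumptions for illustration. Request bodies and authentication headers are not described on this page, so the sketch only constructs the URL.

```python
from urllib.parse import quote

# Assumption: a locally reachable AxonFlow instance; adjust to your deployment.
BASE_URL = "http://localhost:8080"


def step_decision_url(plan_id, step_id, action, base_url=BASE_URL):
    """Build the URL for a plan-scoped approve or reject call (sent as POST)."""
    if action not in ("approve", "reject"):
        raise ValueError("action must be 'approve' or 'reject'")
    # Percent-encode the identifiers so unusual characters cannot alter the path.
    plan = quote(plan_id, safe="")
    step = quote(step_id, safe="")
    return f"{base_url}/api/v1/plans/{plan}/steps/{step}/{action}"


print(step_decision_url("plan-1", "step-2", "approve"))
```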

Cross-Plane Response Parity (WCP + MAP) -- both approve/reject endpoints now return the same rich response shape: decision, retry_context, approval_id, approver metadata, policies_matched. One projection helper feeds both endpoints, so any field added to the response shape surfaces on both planes by default. A CI contract test locks the two responses together (v7.4.0)
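
The kind of contract check described above can be approximated on the client side. The sketch below compares the top-level keys of two already-parsed responses and treats plan_id as the only field allowed to differ. The parity_gaps helper, the "approver" key name, and the sample payloads are hypothetical; this is not the CI test AxonFlow ships.

```python
# Assumption: plan_id is the only field that appears on the MAP plane alone.
MAP_ONLY_FIELDS = {"plan_id"}


def parity_gaps(wcp_response, map_response):
    """Return top-level field names present on one plane but not the other."""
    wcp_keys = set(wcp_response)
    map_keys = set(map_response) - MAP_ONLY_FIELDS
    return wcp_keys ^ map_keys  # symmetric difference: empty set means parity


# Hypothetical sample payloads using the fields listed in the text.
wcp = {
    "decision": "approved",
    "retry_context": {},
    "approval_id": "a-1",
    "approver": {"id": "u-1"},
    "policies_matched": [],
}
map_plane = dict(wcp, plan_id="plan-1")

print(parity_gaps(wcp, map_plane))
```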

MAP Plane-Scoped Pending Approvals (with plan_id filter) -- reviewer-tool convenience endpoint at GET /api/v1/plans/approvals/pending. Lists steps awaiting approval across MAP-backed workflows (confirm / step execution mode) with plan_id populated on every entry -- the one intentional asymmetry with the WCP-plane listing, mirroring the approve/reject response parity. Optional ?plan_id= filter scopes to a single plan. Same tier gate as the plane-scoped approve/reject endpoints (v7.4.0)
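
A reviewer tool could build the listing URL, with or without the optional filter, roughly as follows. The path and the plan_id query parameter come from the description above; the pending_approvals_url helper and the base URL in the example are assumptions for illustration.

```python
from urllib.parse import urlencode


def pending_approvals_url(base_url, plan_id=None):
    """Build the GET URL for MAP-plane pending approvals, optionally for one plan."""
    url = f"{base_url}/api/v1/plans/approvals/pending"
    if plan_id is not None:
        # urlencode escapes the value so it is safe inside a query string.
        url += "?" + urlencode({"plan_id": plan_id})
    return url


print(pending_approvals_url("http://localhost:8080"))
print(pending_approvals_url("http://localhost:8080", plan_id="plan-1"))
```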

Risk-Tiered Approval Routing -- approval requests carry a severity (critical / high / medium / low) derived from the triggering policy or the evaluation risk score; Evaluation adds a queue that can be filtered by severity (Community has the metadata but no queue surface to filter)
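
To make the idea of a severity-filtered queue concrete, here is a small sketch that filters an in-memory list of approval requests. The four severity values come from the text; the record layout and the filter_queue helper are hypothetical and do not reflect how AxonFlow stores or exposes its queue.

```python
SEVERITIES = ("critical", "high", "medium", "low")


def filter_queue(requests, severities):
    """Keep only approval requests whose severity is in the requested set."""
    unknown = set(severities) - set(SEVERITIES)
    if unknown:
        raise ValueError(f"unknown severities: {sorted(unknown)}")
    wanted = set(severities)
    return [r for r in requests if r["severity"] in wanted]


# Hypothetical queue entries.
queue = [
    {"approval_id": "a-1", "severity": "low"},
    {"approval_id": "a-2", "severity": "critical"},
    {"approval_id": "a-3", "severity": "high"},
]

urgent = filter_queue(queue, {"critical", "high"})
print([r["approval_id"] for r in urgent])
```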

Session Overrides -- time-bounded, audit-logged policy overrides for a specific developer session; every override requires a free-text justification and hits a hard 24-hour cap
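
The constraints on a session override (a required justification, a 60-minute default, and a 24-hour cap) can be expressed as a simple validation step. The sketch below assumes that a request over the cap is rejected; the product might clamp it instead, and the validate_override helper is a hypothetical name.

```python
from datetime import timedelta

DEFAULT_DURATION = timedelta(minutes=60)  # "60m default" from the table
MAX_DURATION = timedelta(hours=24)        # hard cap from the table


def validate_override(justification, duration=None):
    """Return the effective override duration, or raise ValueError if invalid."""
    if not justification or not justification.strip():
        raise ValueError("a free-text justification is required")
    if duration is None:
        duration = DEFAULT_DURATION
    if duration <= timedelta(0):
        raise ValueError("duration must be positive")
    if duration > MAX_DURATION:
        # Assumption: over-cap requests are rejected rather than clamped.
        raise ValueError("duration exceeds the 24-hour cap")
    return duration


print(validate_override("Debugging a denied staging call"))
```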

Decision Explainability -- per-decision endpoint returning matched policies, risk level, override availability, and the rolling 24-hour hit count for the same rule

Workflow Checkpoint Resume -- every step gate evaluation records a governance-aware checkpoint; Evaluation can resume an interrupted workflow from the last checkpoint (Community can list them but not resume)

Policy Simulation -- dry-run policy evaluations to test rules before deploying to production

Evidence Export -- export audit evidence packages for compliance review and regulatory submissions

Retry Context on Step Gates (wire-level) -- every step-gate response carries explicit retry state (gate count, prior completion status, last decision, optional prior output) so a resumed agent can tell a first attempt apart from a retry without guessing. Available to clients on every tier

Idempotency Keys on Step Gates (wire-level) -- optional caller-supplied business-level key (invoice number, wire transfer ID, content hash) pinned to a step; a repeat gate or complete with a different key is rejected with 409 before downstream side-effects fire. Same-workflow enforcement on every tier; cross-workflow prevention is a planned future enhancement
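
The same-workflow enforcement described above can be modeled with a small in-memory registry. This is an illustrative sketch only: the StepKeyRegistry class is hypothetical, the handling of calls without a key is an assumption, and the real service presumably persists keys rather than holding them in a dictionary.

```python
class StepKeyRegistry:
    """Pins the first idempotency key seen for each (workflow, step) pair."""

    def __init__(self):
        self._pinned = {}

    def check(self, workflow_id, step_id, key=None):
        """Return 200 if the call may proceed, 409 if the key conflicts."""
        if key is None:
            # Assumption: keys are optional, so calls without one are not checked.
            return 200
        pinned = self._pinned.setdefault((workflow_id, step_id), key)
        return 200 if pinned == key else 409


registry = StepKeyRegistry()
print(registry.check("wf-1", "pay", "INV-1001"))  # first gate pins the key
print(registry.check("wf-1", "pay", "INV-1001"))  # repeat with the same key
print(registry.check("wf-1", "pay", "INV-2002"))  # repeat with a different key
print(registry.check("wf-2", "pay", "INV-2002"))  # other workflow is not checked
```

Because the registry is keyed by workflow and step, the last call succeeds, which mirrors the note that cross-workflow prevention is not yet enforced.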

Retry-Aware Policy Conditions -- Evaluation and Enterprise tiers can author dynamic policies whose conditions read step-retry state (step.gate_count, step.prior_completion_status, step.idempotency_key, step.first_attempt_age_seconds, step.last_decision, step.completion_count, step.prior_output_available). Patterns like "require approval on attempt 3", "block on rapid retries within 30s", or "escalate severity when a step keeps hitting gated_not_completed" become declarative instead of custom code. Creating a policy with any step.* condition on a Community license is rejected at create time
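
Two of the patterns mentioned above can be written out as plain conditions over step-retry state. The sketch uses the field names gate_count and first_attempt_age_seconds from the list above; the evaluate_step function, the outcome strings, the rule ordering, and the reading of "rapid retry" as a repeat gate within 30 seconds of the first attempt are all assumptions. This page does not show AxonFlow's policy syntax, so this is plain Python rather than a real policy definition.

```python
def evaluate_step(step):
    """Apply two illustrative retry-aware rules to a step-state dictionary."""
    gate_count = step["gate_count"]
    age = step["first_attempt_age_seconds"]

    # "Block on rapid retries within 30s": a repeat gate soon after the first.
    if gate_count > 1 and age < 30:
        return "block"
    # "Require approval on attempt 3" (applied here to attempt 3 and later).
    if gate_count >= 3:
        return "require_approval"
    return "allow"


print(evaluate_step({"gate_count": 1, "first_attempt_age_seconds": 0}))
print(evaluate_step({"gate_count": 2, "first_attempt_age_seconds": 10}))
print(evaluate_step({"gate_count": 3, "first_attempt_age_seconds": 120}))
```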

When to request this

You have a working local setup and want to move toward an MVP
You need organization-wide policies across multiple teams
You want 14-day audit trails for compliance visibility
You're running larger tests and need more execution visibility
You want HITL approval gates, policy simulation, or evidence export
You need session overrides so developers can self-grant a time-bounded bypass on a policy that would otherwise deny during an evaluation sprint
You want to resume an interrupted workflow from its last governance-aware checkpoint, not replay from scratch

Community is fully functional for local exploration. Evaluation unlocks the capacity and features you need when you're ready to move past tinkering into intentional evaluation. The license is free, valid for 90 days, and renewable. AxonFlow runs fully self-hosted, and anonymous telemetry can be disabled for regulated evaluations.

Typical evaluation flow

Install AxonFlow locally (Docker or binary)
Run SDK examples with your LLM provider
Test with internal agent workflows and policies
Scale to staging or production-like environments

Most teams complete their initial evaluation within a few days.

What happens after you submit

Your evaluation license is sent directly to your inbox
You add it to your existing self-hosted deployment and restart AxonFlow
You keep testing in the same environment and with the same integration path
If you hit a blocker, you can reply directly to the email for help

The goal is to help you finish a serious evaluation safely and quickly, not push you into a sales motion.

What helps teams get value fastest

Use the same workflow, connector, and provider path you already validated in Community
Keep the evaluation bounded to one or two meaningful workflows instead of spreading it across the org
Use Evaluation to answer rollout questions, not to restart discovery from scratch

First name

Last name

Company

AxonFlow ORG_ID

Check the ORG_ID value in your docker-compose.yml. The default is local-dev-org. The value you enter here must match your deployed ORG_ID, or old and new data will diverge.
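
One way to look up the value before filling in the form is to scan the compose file for an ORG_ID entry. The sketch below is a simple text match, not a YAML parser: it assumes ORG_ID appears either as "ORG_ID: value" or as "- ORG_ID=value", it does not resolve ${...} variable interpolation, and the find_org_id helper is a hypothetical name. Falling back to local-dev-org when no entry is found is also an assumption based on the stated default.

```python
import re

# Matches "ORG_ID: value" and "- ORG_ID=value", with optional quotes.
ORG_ID_PATTERN = re.compile(
    r"^\s*-?\s*ORG_ID\s*[:=]\s*['\"]?([^'\"\s#]+)", re.MULTILINE
)


def find_org_id(compose_text, default="local-dev-org"):
    """Return the first ORG_ID value found in the text, or the default."""
    match = ORG_ID_PATTERN.search(compose_text)
    return match.group(1) if match else default


example = """
services:
  axonflow:
    environment:
      - ORG_ID=acme-staging
"""
print(find_org_id(example))
```

In practice you would pass the contents of your own docker-compose.yml, for example by reading the file with pathlib.Path("docker-compose.yml").read_text().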

Work email

We'll send your license key and setup instructions here

Primary use case

Where is this running?

Enable 5-business-day priority support (free during evaluation). Get faster responses on setup issues and integration questions.

Free. No credit card. Valid for 90 days, renewable. No sales call. The license is sent directly to your inbox.

Check your email

Your license key and setup instructions have been sent to your email. Check your inbox (and spam folder).

Didn't receive it? Email hello@getaxonflow.com and we'll sort it out.

Want help getting started? Totally optional.

Book 15 min with the founder

Recommended next reads: evaluation rollout guide, regulated-environment evaluation, tier comparison

What's the one thing that would make you not adopt AxonFlow?


Need enterprise workflows, direct rollout help, or a longer partnership motion? Apply for the Design Partner Program instead.

Questions? Email hello@getaxonflow.com
