Start a Safe Evaluation

For teams that already have AxonFlow running locally and want a serious, self-hosted evaluation path. We send a free 90-day Evaluation license directly to your inbox. No sales call required.

Choose this if you want to keep evaluating on your own

Evaluation is the right next step when Community has proven the basics and you now need more realistic limits, approval behavior, simulation, and audit depth without committing to a partnership process.

Start here if you want to evaluate quickly

Most teams reach this page after they already know AxonFlow fits their use case and now want a more realistic validation run.

Evaluation vs Community -- more capacity, plus exclusive features

| Feature | Community | Evaluation (Free) |
| --- | --- | --- |
| Tenant Policies | 20 | 50 |
| Org-Wide Policies | 0 | 5 |
| Connectors with Custom Policies | 2 | 5 |
| Audit Retention | 3 days | 14 days |
| LLM Providers | 2 | 3 |
| Execution History | 50 | 500 |
| Concurrent Executions | 5 | 25 |
| MAP Plans | 25 | 100 |
| Versions per Plan | 10 | 25 |
| SSE Connections | 5 | 25 |
| Cost Estimates / Day | 10 | 100 |
| Pending Execution Approvals | 5 | 25 |
| Media Analyzers | 2 | 2 |
| HITL Approval Gates | - | 100 pending, 24h expiry |
| MAP Plane-Scoped HITL Approve/Reject | - | Available (v7.4.0) |
| Cross-Plane Response Parity (WCP + MAP) | - | Available (v7.4.0) |
| MAP Plane-Scoped Pending Approvals (with plan_id filter) | - | Available (v7.4.0) |
| Risk-Tiered Approval Routing | Severity metadata only (no queue) | Severity + queue filter |
| Session Overrides | Context only | Create, 60m default / 24h cap |
| Decision Explainability | 7-day retention | 30-day retention |
| Workflow Checkpoint Resume | List only | Resume from last |
| Policy Simulation | - | 300/day |
| Evidence Export | - | 14-day, 3/day |
| Retry Context on Step Gates (wire-level) | Available (per-workflow) | Available (per-workflow) |
| Idempotency Keys on Step Gates (wire-level) | Available (per-workflow) | Available (per-workflow) |
| Retry-Aware Policy Conditions | - | Author policies on `step.gate_count`, `step.prior_completion_status`, `step.idempotency_key`, etc. |

Concurrent Executions -- MAP and WCP executions running at the same time per tenant
Pending Execution Approvals -- executions waiting for human approval in MAP confirm/step mode or WCP queues
MAP Plans -- multi-agent plans that break complex tasks into coordinated steps
Versions per Plan -- how many revisions of a single MAP plan are retained
SSE Connections -- server-sent event connections for streaming execution progress in real time
Cost Estimates / Day -- number of LLM cost estimation requests allowed per day
Execution History -- completed execution records kept for review and audit
Media Analyzers -- concurrent image analysis modules (OCR, content safety, face detection) per request
HITL Approval Gates -- human-in-the-loop checkpoints requiring explicit approval before execution proceeds
MAP Plane-Scoped HITL Approve/Reject -- plan-scoped approve / reject endpoints (POST /api/v1/plans/{id}/steps/{step_id}/approve|reject). Previously Enterprise-only; v7.4.0 lowered them to Evaluation so reviewer integrations hit the same tier gate on both planes (WCP and MAP). The response shape matches the WCP workflow-scoped endpoints byte-for-byte plus a plan_id field
Cross-Plane Response Parity (WCP + MAP) -- both approve/reject endpoints now return the same rich response shape: decision, retry_context, approval_id, approver metadata, policies_matched. One projection helper feeds both endpoints, so any field added to the response shape surfaces on both planes by default. A CI contract test locks the two responses together (v7.4.0)
MAP Plane-Scoped Pending Approvals (with plan_id filter) -- reviewer-tool convenience endpoint at GET /api/v1/plans/approvals/pending. Lists steps awaiting approval across MAP-backed workflows (confirm / step execution mode) with plan_id populated on every entry -- the one intentional asymmetry with the WCP-plane listing, mirroring the approve/reject response parity. Optional ?plan_id= filter scopes to a single plan. Same tier gate as the plane-scoped approve/reject endpoints (v7.4.0)
Risk-Tiered Approval Routing -- approval requests carry a severity (critical / high / medium / low) derived from the triggering policy or the evaluation risk score; Evaluation adds a queue that can be filtered by severity (Community has the metadata but no queue surface to filter)
Session Overrides -- time-bounded, audit-logged policy overrides for a specific developer session; every override requires a free-text justification and hits a hard 24-hour cap
Decision Explainability -- per-decision endpoint returning matched policies, risk level, override availability, and the rolling 24-hour hit count for the same rule
Workflow Checkpoint Resume -- every step gate evaluation records a governance-aware checkpoint; Evaluation can resume an interrupted workflow from the last checkpoint (Community can list them but not resume)
Policy Simulation -- dry-run policy evaluations to test rules before deploying to production
Evidence Export -- export audit evidence packages for compliance review and regulatory submissions
Retry Context on Step Gates (wire-level) -- every step-gate response carries explicit retry state (gate count, prior completion status, last decision, optional prior output) so a resumed agent can tell a first attempt apart from a retry without guessing. Available to clients on every tier
Idempotency Keys on Step Gates (wire-level) -- optional caller-supplied business-level key (invoice number, wire transfer ID, content hash) pinned to a step; a repeat gate or complete call that supplies a different key is rejected with HTTP 409 before downstream side-effects fire. Enforcement is same-workflow on every tier; cross-workflow prevention is a planned future enhancement
Retry-Aware Policy Conditions -- Evaluation and Enterprise tiers can author dynamic policies whose conditions read step-retry state (step.gate_count, step.prior_completion_status, step.idempotency_key, step.first_attempt_age_seconds, step.last_decision, step.completion_count, step.prior_output_available). Patterns like "require approval on attempt 3", "block on rapid retries within 30s", or "escalate severity when a step keeps hitting gated_not_completed" become declarative instead of custom code. Creating a policy with any step.* condition on a Community license is rejected at create time
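The three quoted patterns can be pictured as simple predicates over step-retry state. The following is a rough local model in Python, not AxonFlow's policy syntax (which this page does not show): the `StepRetryState` class, the `decide` function, the decision strings, the rule ordering, and the assumption that `gate_count` is 1 on the first attempt are all hypothetical. Only the field names come from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepRetryState:
    """Hypothetical local mirror of a few step.* fields named above."""
    gate_count: int                          # step.gate_count (assumed 1 on first attempt)
    prior_completion_status: Optional[str]   # step.prior_completion_status
    first_attempt_age_seconds: float         # step.first_attempt_age_seconds


def decide(state: StepRetryState) -> str:
    """Return an illustrative decision for one step gate.

    Rule order and return values are assumptions made for this sketch.
    """
    is_retry = state.gate_count > 1
    # "block on rapid retries within 30s"
    if is_retry and state.first_attempt_age_seconds < 30:
        return "block"
    # "require approval on attempt 3" (applied to later attempts too, an assumption)
    if state.gate_count >= 3:
        return "require_approval"
    # "escalate severity when a step keeps hitting gated_not_completed"
    if is_retry and state.prior_completion_status == "gated_not_completed":
        return "allow_escalated"
    return "allow"


if __name__ == "__main__":
    print(decide(StepRetryState(1, None, 0.0)))
    print(decide(StepRetryState(3, "completed", 120.0)))
```

In a real deployment these rules would presumably be authored as policy conditions rather than application code; the sketch only shows why having retry state on the wire makes such rules declarative.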

When to request this

Community is fully functional for local exploration. Evaluation unlocks the capacity and features you need when you're ready to move past tinkering into intentional evaluation. Free, 90 days, renewable. AxonFlow runs fully self-hosted, and anonymous telemetry can be disabled for regulated evaluations.

Typical evaluation flow

Most teams complete their initial evaluation within a few days.

What happens after you submit

The goal is to help you finish a serious evaluation safely and quickly, not push you into a sales motion.

What helps teams get value fastest

Check the ORG_ID value in your docker-compose.yml. The default is local-dev-org. It must match the ORG_ID of your deployment, or old and new data will diverge.
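One way to run this check before requesting a license is a small script that scans the compose file for ORG_ID entries. This is a minimal sketch, assuming ORG_ID appears in the usual Compose forms (`ORG_ID: value` or `- ORG_ID=value`); the function names are hypothetical, and variable interpolation such as `${ORG_ID:-local-dev-org}` is returned as written rather than resolved.

```python
import re
from typing import List

# Matches "ORG_ID: value", "- ORG_ID=value", and quoted variants, one per line.
_ORG_ID_LINE = re.compile(
    r"""^[ \t]*-?[ \t]*["']?ORG_ID[ \t]*[:=][ \t]*["']?([^"'\s#]+)""",
    re.MULTILINE,
)


def find_org_ids(compose_text: str) -> List[str]:
    """Return every ORG_ID value found in the compose file text, in order."""
    return _ORG_ID_LINE.findall(compose_text)


def org_id_mismatches(compose_text: str, deployed_org_id: str) -> List[str]:
    """Return the ORG_ID values that differ from the deployed one."""
    return [v for v in find_org_ids(compose_text) if v != deployed_org_id]


if __name__ == "__main__":
    with open("docker-compose.yml", encoding="utf-8") as fh:
        text = fh.read()
    print("ORG_ID values found:", find_org_ids(text))
    print("Mismatches vs local-dev-org:", org_id_mismatches(text, "local-dev-org"))
```

An empty mismatch list means every ORG_ID entry in the file agrees with the value passed in.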

Free. No credit card. Valid for 90 days, renewable. No sales call. The license is sent directly to your inbox.

Want help getting started? Totally optional.

Book 15 min with the founder
Recommended next reads: evaluation rollout guide, regulated-environment evaluation, tier comparison
Need enterprise workflows, direct rollout help, or a longer partnership motion? Apply for the Design Partner Program instead.
Questions? Email hello@getaxonflow.com