# Axiom Observability

Monitor Arial runs with Axiom for production-grade observability. Get structured logs, timing metrics, and insights into your AI workstreams.
## Overview
Arial integrates with Axiom to provide centralized logging and observability for your workstream executions. When enabled, Arial automatically sends structured events to Axiom, giving you:
- Real-time visibility into run progress
- Historical data for debugging failed runs
- Timing metrics for performance analysis
- Centralized logs across multiple machines/CI runs
The integration is optional and non-blocking. If Axiom is unavailable, Arial continues running normally.
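The optional, non-blocking behavior can be sketched as a best-effort wrapper (the `AxiomLogger` class below is illustrative, not Arial's actual implementation): when no token is configured logging is skipped, and any ingest error is printed rather than raised, so the run never stalls.

```python
import os

class AxiomLogger:
    """Best-effort event logger: never raises, never blocks the run."""

    def __init__(self):
        # Logging is enabled only when a token is configured.
        self.enabled = bool(os.environ.get("AXIOM_TOKEN"))
        self.buffer = []

    def log(self, event: dict) -> None:
        if not self.enabled:
            return  # Axiom not configured; skip silently
        try:
            self.buffer.append(event)  # a real client would ingest here
        except Exception as exc:
            # Swallow errors so the run continues regardless
            print(f"Axiom ingest failed: {exc}")
```

The key design choice is that observability failures degrade to console warnings instead of propagating into the run itself.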
## Setup

### 1. Create an Axiom Account

Sign up at axiom.co if you don't have an account. Axiom offers a generous free tier.

### 2. Create a Dataset

In the Axiom dashboard:

- Go to **Datasets**
- Click **New Dataset**
- Name it `arial` (or your preferred name)

### 3. Create an API Token

- Go to **Settings > API Tokens**
- Click **New API Token**
- Give it a name like "Arial CLI"
- Grant it **Ingest** permission for your dataset
- Copy the token (starts with `xaat-`)
### 4. Configure Environment Variables

```bash
export AXIOM_TOKEN=xaat-your-token-here
export AXIOM_DATASET=arial
```

For multi-org accounts, also set:

```bash
export AXIOM_ORG_ID=your-org-id
```

### 5. Run Arial

```bash
arial run
```

You'll see confirmation that Axiom is enabled:

```
Axiom logging enabled
```

## Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| `AXIOM_TOKEN` | Yes | - | API token with ingest permissions |
| `AXIOM_DATASET` | No | `arial` | Dataset to send events to |
| `AXIOM_ORG_ID` | No | - | Organization ID (for multi-org accounts) |
## Event Types

Arial sends structured events to Axiom. Each event includes:

- `_time`: ISO timestamp
- `service`: Always `"arial"`
- `type`: Event type identifier
- Event-specific fields
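Assembling an event with these common fields can be sketched as follows (the `make_event` helper is illustrative, not part of Arial):

```python
from datetime import datetime, timezone

def make_event(event_type: str, **fields) -> dict:
    """Build a structured event carrying the common fields."""
    return {
        # ISO-8601 UTC timestamp with a trailing "Z"
        "_time": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
        "service": "arial",   # constant on every event
        "type": event_type,   # e.g. "workstream.started"
        **fields,             # event-specific fields
    }

event = make_event("workstream.failed", workstreamId="api", retryCount=0)
```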
### Run Events
| Event | Fields | Description |
|---|---|---|
| `run.started` | `specsDir`, `workstreamCount` | Emitted when `arial run` begins |
| `run.completed` | `duration`, `success` | Emitted when all workstreams finish |
### Workstream Events
| Event | Fields | Description |
|---|---|---|
| `workstream.started` | `workstreamId`, `branch` | Agent begins execution |
| `workstream.completed` | `workstreamId`, `duration` | Agent finishes successfully |
| `workstream.failed` | `workstreamId`, `error`, `retryCount` | Agent encountered an error |
### Merge Events
| Event | Fields | Description |
|---|---|---|
| `merge.started` | `workstreamId`, `branch` | Branch merge begins |
| `merge.completed` | `workstreamId` | Branch merged successfully |
| `merge.conflict` | `workstreamId`, `retryCount` | Merge conflict detected |
### Other Events
| Event | Fields | Description |
|---|---|---|
| `context.added` | `workstreamId` | Context from another workstream added |
## Example Events

### Run Started

```json
{
  "_time": "2024-01-15T10:00:00.000Z",
  "service": "arial",
  "type": "run.started",
  "specsDir": "./specs",
  "workstreamCount": 3
}
```

### Workstream Completed

```json
{
  "_time": "2024-01-15T10:02:15.000Z",
  "service": "arial",
  "type": "workstream.completed",
  "workstreamId": "auth",
  "duration": 135000
}
```

### Workstream Failed

```json
{
  "_time": "2024-01-15T10:01:30.000Z",
  "service": "arial",
  "type": "workstream.failed",
  "workstreamId": "api",
  "error": "Agent exceeded context limit",
  "retryCount": 0
}
```

## Querying Data
### Axiom APL Examples

All events from the last hour:

```
['arial']
| where _time > ago(1h)
```

Failed workstreams:

```
['arial']
| where type == "workstream.failed"
| project _time, workstreamId, error
```

Average workstream duration:

```
['arial']
| where type == "workstream.completed"
| summarize avg(duration) by bin(_time, 1h)
```

Run success rate:

```
['arial']
| where type == "run.completed"
| summarize
    total = count(),
    successful = countif(success == true)
| extend success_rate = (successful * 100.0) / total
```

Workstreams with merge conflicts:

```
['arial']
| where type == "merge.conflict"
| summarize conflicts = count() by workstreamId
| order by conflicts desc
```

## Creating Dashboards
Build an Arial monitoring dashboard with these widgets:

### Run Overview

- **Success Rate**: Pie chart of `run.completed` by `success`
- **Runs Over Time**: Time series of `run.started` events

### Workstream Health

- **Status Distribution**: Count of `workstream.completed` vs `workstream.failed`
- **Average Duration**: Line chart of workstream durations
- **Failure Reasons**: Table of `workstream.failed` grouped by `error`

### Merge Analytics

- **Conflict Rate**: Percentage of workstreams with `merge.conflict` events
- **Retry Distribution**: Histogram of `retryCount` values
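As a sanity check outside Axiom, the merge aggregates can also be computed directly from a list of events. A minimal sketch (the helper names are illustrative, not an Axiom API):

```python
from collections import Counter

def conflict_rate(events: list[dict]) -> float:
    """Fraction of started workstreams that hit at least one merge conflict."""
    started = {e["workstreamId"] for e in events if e["type"] == "workstream.started"}
    conflicted = {e["workstreamId"] for e in events if e["type"] == "merge.conflict"}
    return len(conflicted & started) / len(started) if started else 0.0

def retry_histogram(events: list[dict]) -> Counter:
    """Distribution of retryCount values across merge.conflict events."""
    return Counter(e["retryCount"] for e in events if e["type"] == "merge.conflict")
```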
## CI/CD Integration

### GitHub Actions

```yaml
- name: Run Arial
  env:
    AXIOM_TOKEN: ${{ secrets.AXIOM_TOKEN }}
    AXIOM_DATASET: arial-ci
  run: arial run
```

### GitLab CI

```yaml
run-arial:
  script:
    - arial run
  variables:
    AXIOM_TOKEN: $AXIOM_TOKEN
    AXIOM_DATASET: arial-ci
```

Use a different dataset (e.g., `arial-ci`) to separate CI runs from local development.
## Troubleshooting

### "Axiom logging enabled" Not Shown

Verify your token is set:

```bash
echo $AXIOM_TOKEN
```

Ensure it starts with `xaat-`.
### Events Not Appearing in Axiom

- **Check dataset name**: Ensure `AXIOM_DATASET` matches your Axiom dataset
- **Verify token permissions**: The token needs **Ingest** permission
- **Wait for flush**: Events are buffered and sent every 5 seconds
- **Check for errors**: Look for `Axiom ingest failed` messages in console output
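The buffered-flush behavior can be sketched as a timed batch buffer (the `EventBuffer` class is illustrative; only the 5-second cadence comes from the documented behavior):

```python
import time

class EventBuffer:
    """Accumulate events and ship them in batches on a fixed interval."""

    FLUSH_INTERVAL = 5.0  # seconds, matching the documented cadence

    def __init__(self, send):
        self.send = send  # callable that ships one batch of events
        self.pending = []
        self.last_flush = time.monotonic()

    def add(self, event: dict) -> None:
        self.pending.append(event)
        # Flush only once enough time has elapsed since the last flush
        if time.monotonic() - self.last_flush >= self.FLUSH_INTERVAL:
            self.flush()

    def flush(self) -> None:
        if self.pending:
            self.send(self.pending)
            self.pending = []
        self.last_flush = time.monotonic()
```

This is why a short-lived process can appear to "lose" its last events in Axiom: they sit in the buffer until the next flush.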
### Multi-Org Authentication Issues

If your token belongs to a specific organization, set `AXIOM_ORG_ID`:

```bash
export AXIOM_ORG_ID=your-org-id
```

Find your org ID in Axiom under **Settings > Organization**.
### Network Errors

Arial buffers events and retries failed requests. If network issues persist, buffered events may be lost. Arial continues executing regardless of Axiom availability.
## Best Practices

- Use separate datasets for development, staging, and production
- Set up alerts in Axiom for `workstream.failed` events
- Monitor duration trends to catch performance regressions
- Review merge conflicts to identify specs that need better isolation
- Archive old data periodically to manage costs
## Security

- Store `AXIOM_TOKEN` securely (environment variable, secrets manager)
- Use tokens with minimal permissions (ingest-only)
- Create separate tokens for different environments
- Rotate tokens periodically