Open Source · v0.1.0 · Python 3.11+ · 10 Providers · Async-native

Route AI workloads
to any edge.

Deploy inference to GPU clusters, TEE nodes, and major cloud providers — all through one async Python interface. No lock-in. No rewriting.

Axon is to edge compute what httpx is to HTTP — one client, any backend.

Get started → View on GitHub
$ pip install axon

From OpenAI to the edge
in two lines of Python

Your existing OpenAI SDK code works unchanged. Swap the base_url and api_key — Axon routes requests automatically to the cheapest available provider across io.net, Akash, Acurast, AWS and more.

  • Drop-in with the openai Python package, LangChain, LlamaIndex, DSPy
  • Async-native with httpx — compatible with FastAPI, Django Async, Starlette
  • Automatic failover, latency & cost routing, circuit-breaker protection
  • Pydantic v2 models throughout — fully typed, IDE-friendly
  • ~$0.40/hr A100 spot vs. $0.06/1K tokens on GPT-4
See the router → GitHub →
import asyncio
from axon import AxonClient
from axon.types import DeploymentConfig

async def main():
    async with AxonClient(
        provider="ionet",
        secret_key="your-key"
    ) as client:

        # Deploy AI workload to edge GPU
        deployment = await client.deploy(
            DeploymentConfig(
                name="my-inference-worker",
                entry_point="worker.py",
                memory_mb=4096,
                replicas=2,
            )
        )

        # Stream results from the worker
        client.on_message(lambda msg:
            print(f"Result: {msg.payload}")
        )

        await client.send(
            deployment.id,
            {"prompt": "Summarise this article..."}
        )

asyncio.run(main())
from openai import AsyncOpenAI
import os

# ── Before: OpenAI ──────────────────────────
client = AsyncOpenAI(
    api_key=os.getenv("OPENAI_API_KEY")
)

# ── After: Axon edge routing ─────────────────
# Change just these two lines — nothing else
client = AsyncOpenAI(
    base_url="http://localhost:8787/v1",
    api_key=os.getenv("AXON_SECRET_KEY")
)

# Your existing code works unchanged ↓
response = await client.chat.completions.create(
    model="llama-3-8b",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
async for chunk in response:
    print(chunk.choices[0].delta.content, end="")

Live provider health

All 10 providers operational — updated every 5 minutes.

io.net
Akash
Acurast
Fluence
Koii
AWS
GCP
Azure
Cloudflare
Fly.io

One interface, any network

Deploy to GPU clusters or major cloud platforms without changing your code. Axon routes each workload to the fastest or cheapest available option, depending on your strategy.

Edge & Private compute

io.net
GPU · Live
GPU clusters — A100, H100, RTX 4090 spot compute at a fraction of cloud pricing. Best for large model inference and training jobs.
🖥️ GPU clusters 🔑 API key ⚡ ~$0.40/hr A100
Akash
Live
Open cloud marketplace where providers bid to run containerised workloads. Pay per use with no lock-in and no vendor overhead.
☁️ Docker containers 🔑 BIP-39 mnemonic ⚡ nodejs / docker
Acurast
TEE · Live
237,000+ smartphone nodes running inside hardware Trusted Execution Environments. Your code and data remain private — even from device owners.
📱 TEE-based 🔑 P256 auth ⚡ nodejs / wasm
Fluence
Live
Serverless compute built on a peer-to-peer network. Deploy functions globally with no cold starts and no central point of failure.
🌐 P2P relay 🔑 Ed25519 key ⚡ nodejs
Koii
Live
Community-owned compute with Solana-compatible task nodes. Ideal for recurring data tasks, oracles, and persistent agent workloads.
🔗 Task nodes 🔑 Solana keypair ⚡ nodejs

Cloud Providers Live

AWS
Lambda · Fargate
Lambda zip deployments with automatic Function URLs, or ECS Fargate for containerised workloads. Uses boto3 — no extra config boilerplate.
⚡ Lambda · Fargate 🔑 IAM credentials 🐍 pip install axon[aws]
Google Cloud
Cloud Run · Functions
Cloud Run v2 containers or 2nd-gen Cloud Functions via GCS source upload. ADC support — works with service accounts, Workload Identity, and Cloud Shell.
⚡ Cloud Run · Functions 🔑 Service account / ADC 🐍 pip install axon[gcp]
Azure
ACI · Functions
Container Instances or Azure Functions v2 via Kudu zip deploy. Managed identity, service principal, and DefaultAzureCredential all supported.
⚡ ACI · Functions 🔑 Service principal / MI 🐍 pip install axon[azure]
Cloudflare Workers
Edge · 300+ PoPs
Deploy Workers scripts to 300+ edge locations worldwide via the Workers REST API. Sub-millisecond cold starts, global by default. No extra dependencies.
⚡ Workers 🔑 CF API token 🐍 pip install axon
Fly.io
Machines
Fly Machines — fast-booting Docker containers placed close to your users. Deploy replicas globally with the Machines REST API. No extra dependencies.
⚡ Fly Machines 🔑 flyctl auth token 🐍 pip install axon

Up and running in minutes

One pip install, one config, one async client — regardless of which provider you deploy to.

1

Install Axon

Core install covers all edge & cloud providers that use only httpx. Add optional extras for AWS, GCP, or Azure SDKs.

2

Set your credentials

Copy .env.example to .env and fill in your provider keys, then run axon auth to validate them.

3

Deploy your workload

Point Axon at your Python or Node.js entry point. It handles bundling, upload, and registration automatically.

4

Route with the AxonRouter

Add multiple providers and let the router pick the fastest or cheapest automatically — with circuit-breaker failover built in.

$ pip install axon

# Optional cloud SDK extras
$ pip install "axon[aws]"
$ pip install "axon[gcp]"
$ pip install "axon[azure]"

# Or everything at once
$ pip install "axon[all]"

# Initialise a new project
$ axon init my-worker

# Validate credentials
$ axon auth

# Deploy to ionet
$ axon deploy --provider ionet

# Check deployment status
$ axon status
import asyncio
from axon.router import AxonRouter
from axon.types import DeploymentConfig, RoutingStrategy

async def main():
    router = AxonRouter(
        providers=["ionet", "akash", "aws"],
        secret_key="your-axon-key",
        strategy=RoutingStrategy.LATENCY,
    )

    config = DeploymentConfig(
        name="inference-worker",
        entry_point="worker.py",
        memory_mb=2048,
    )

    async with router:
        # Connects all providers concurrently,
        # tolerates individual failures
        estimates = await router.estimate_all(config)
        for e in sorted(estimates,
                        key=lambda e: e.usd_estimate):
            print(f"{e.provider}: ${e.usd_estimate:.4f}/hr")

        # Deploy — router picks the best provider
        deployment = await router.deploy(config)
        print(f"Deployed on {deployment.provider}")

asyncio.run(main())

Intelligent multi-provider routing

The AxonRouter connects to every provider concurrently, runs a background health loop, and routes deployments based on your chosen strategy — with automatic circuit-breaker failover.

  • LATENCY Routes to the provider with the lowest measured response time. Re-ranks after every health check.
  • COST Always picks the cheapest available provider based on live estimate() calls before each deployment.
  • ROUND_ROBIN Distributes workloads evenly across all healthy providers for balanced resource usage.
  • FAILOVER Primary provider first, automatic fallback down a ranked list if the primary is unavailable.

Circuit breaker states

CLOSED Normal operation — requests pass through
OPEN Provider skipped after 5 consecutive failures
HALF-OPEN Probe request sent after 60s recovery timeout
circuit_breaker.py
import asyncio
from axon.router import AxonRouter
from axon.types import DeploymentConfig, RoutingStrategy

async def main():
    # Failover routing with a 30s background health loop
    router = AxonRouter(
        providers=["ionet", "akash", "aws", "fly"],
        secret_key="your-key",
        strategy=RoutingStrategy.FAILOVER,
        health_check_interval=30.0,
    )

    config = DeploymentConfig(
        name="inference-worker",
        entry_point="worker.py",
        memory_mb=2048,
    )

    async with router:
        # Inspect circuit state per provider
        for name, slot in router._slots.items():
            cb = slot.circuit
            print(f"{name}: {cb.state.value} "
                  f"(failures: {cb.failure_count})")

        # estimate_all() fetches cost from every live provider
        estimates = await router.estimate_all(config)

        # Router auto-skips OPEN circuits
        deployment = await router.deploy(config)

        # health() returns ProviderHealth with latency_ms
        health = await router.health()
        for h in health:
            print(f"{h.provider}: {h.latency_ms:.0f}ms")

asyncio.run(main())
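
The three states listed above follow the standard circuit-breaker pattern. As a rough illustration only — this is not Axon's actual CircuitBreaker class — the transition logic looks like this, using the defaults described above (5 failures to open, 60s recovery timeout):

```python
import time

class CircuitSketch:
    """Illustrative circuit breaker — not Axon's implementation."""

    def __init__(self, failure_threshold=5, recovery_timeout=60.0):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.state = "CLOSED"
        self._opened_at = 0.0

    def record_success(self):
        # Any success closes the circuit and resets the counter
        self.failure_count = 0
        self.state = "CLOSED"

    def record_failure(self):
        self.failure_count += 1
        if self.failure_count >= self.failure_threshold:
            self.state = "OPEN"
            self._opened_at = time.monotonic()

    def allow_request(self) -> bool:
        if self.state == "OPEN":
            # After the recovery timeout, let one probe request through
            if time.monotonic() - self._opened_at >= self.recovery_timeout:
                self.state = "HALF-OPEN"
                return True
            return False
        return True
```

A provider in the OPEN state is simply skipped by the router until a probe succeeds, which is what prevents cascading failures.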

Every operation from the terminal

Built with Typer and Rich — coloured output, spinners, and interactive prompts out of the box. Install the extras for the full interactive experience.

axon init <name> Scaffold a new project — generates axon.json, .env.example, and entry point
axon auth Validate credentials for all configured providers and report health status
axon deploy Bundle, upload, and register your workload. Supports --provider and --config flags
axon status List all active deployments and their live health across every configured provider
axon send <id> Send a JSON payload to a running processor and print the response inline
axon --help Full command reference with examples, generated automatically from source
$ pip install "axon[cli]"

Built for production AI workloads

Everything you need to route inference at scale — security, observability, and resilience built in.

Fully async

Built on httpx and asyncio throughout. All providers are async context managers — connect, deploy, send, and disconnect without blocking your event loop.

🔒

SSRF protection

All endpoint and IPFS URLs are validated against a private IP regex before any request is made — blocking 169.254.x.x, 10.x.x.x, 172.16–31.x.x, 192.168.x.x, and localhost.
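
As a rough sketch of the idea — the actual regex Axon uses may differ — a private-host check covering the ranges listed above looks like:

```python
import re

# Illustrative SSRF guard matching the ranges described above
# (link-local, RFC 1918 blocks, localhost) — not Axon's actual regex.
PRIVATE_HOST = re.compile(
    r"^("
    r"localhost"
    r"|169\.254\.\d{1,3}\.\d{1,3}"
    r"|10\.\d{1,3}\.\d{1,3}\.\d{1,3}"
    r"|172\.(1[6-9]|2\d|3[01])\.\d{1,3}\.\d{1,3}"
    r"|192\.168\.\d{1,3}\.\d{1,3}"
    r")$"
)

def is_private_host(host: str) -> bool:
    """Return True if the host should be rejected before any request."""
    return bool(PRIVATE_HOST.match(host.lower()))
```

Rejecting these hosts before the request is made blocks the classic SSRF targets, such as the cloud metadata endpoint at 169.254.169.254.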

🛡️

Secret filtering

Environment variables ending in _KEY, _SECRET, _TOKEN, _PASSWORD, or _MNEMONIC are automatically stripped before any value reaches a cloud runtime.
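
A rough sketch of that filtering — the suffix list comes from the description above, but this is not Axon's actual code:

```python
# Suffixes described above; any variable ending in one is dropped
# before the environment is shipped to a cloud runtime.
BLOCKED_SUFFIXES = ("_KEY", "_SECRET", "_TOKEN", "_PASSWORD", "_MNEMONIC")

def filter_env(env: dict[str, str]) -> dict[str, str]:
    """Drop any variable whose name ends in a sensitive suffix."""
    return {
        name: value
        for name, value in env.items()
        if not name.upper().endswith(BLOCKED_SUFFIXES)
    }
```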

🔄

Circuit breaker

Per-provider circuit breakers with configurable failure thresholds and recovery timeouts. Unhealthy providers are automatically skipped and retried — no cascading failures.

📐

Pydantic v2 models

All config, deployment, cost, and health objects are fully typed Pydantic v2 models. IDE completion, runtime validation, and JSON serialisation all included.

🧩

Pluggable providers

Implement IAxonProvider ABC and register in PROVIDER_REGISTRY. Any custom backend — private cloud, on-prem, exotic hardware — slots in automatically.
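
The real IAxonProvider signature isn't reproduced here, so the sketch below uses a stand-in ABC and registry purely to illustrate the plugin pattern — the method names and registration mechanism are assumptions, not Axon's actual API:

```python
from abc import ABC, abstractmethod

# Stand-in registry illustrating the pattern described above.
# The real IAxonProvider / PROVIDER_REGISTRY may differ.
PROVIDER_REGISTRY: dict[str, type] = {}

class ProviderSketch(ABC):
    @abstractmethod
    async def deploy(self, config: dict) -> str:
        """Deploy a workload and return its deployment id."""

def register(name: str):
    """Class decorator that adds a provider class to the registry."""
    def wrap(cls):
        PROVIDER_REGISTRY[name] = cls
        return cls
    return wrap

@register("onprem")
class OnPremProvider(ProviderSketch):
    async def deploy(self, config: dict) -> str:
        # e.g. call your internal scheduler's API here
        return f"onprem-{config['name']}"
```

Once registered, a custom backend is selectable by name like any built-in provider.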

💰

Cost estimation

Call estimate() before deploying to get a USD breakdown per provider. Compare across all 10 providers in one estimate_all() call.

🌐

OpenAI-compatible inference

Install axon[inference] for a FastAPI server exposing /v1/models and /v1/chat/completions — drop-in replacement for OpenAI's endpoint.

Also available for TypeScript & Node.js

The @axonsdk monorepo brings the same provider-agnostic interface to the JavaScript ecosystem — with packages for Node.js, CLI, OpenAI-compatible inference, and React Native mobile.

  • @axonsdk/sdk — core client for Node.js and edge runtimes
  • @axonsdk/inference — OpenAI-compatible endpoint for Express / Next.js
  • @axonsdk/mobile — React Native hooks for iOS & Android
  • @axonsdk/cli — same axon commands for JS projects
View @axonsdk on GitHub →
index.ts
import { AxonClient } from '@axonsdk/sdk';

const client = new AxonClient({
  provider: 'ionet',
  secretKey: process.env.AXON_SECRET_KEY,
});

await client.connect();

client.onMessage((msg) => {
  console.log('Result:', msg.payload.result);
});

await client.send('worker-id', {
  prompt: 'Summarise this article…',
});

await client.disconnect();

Start routing AI workloads today

One pip install. Any provider. Zero lock-in. MIT licensed and fully open source.

$ pip install axon