HTTP API

ACI Inference API

Use this surface when ACI Inference sits inside a shared service and the job is tenant updates, rollback, deletion, and operator control after launch.

Tenant-scoped control plane

The HTTP surface is organized around a registered model plus an isolated tenant engine. The stable targeting tuple is `model_id`, `tenant_id`, and `namespace`.

Structured-task cloud starting point

For current structured-task cloud deployments, create tenants and keep memory off (`write_memory: false` on bind, `use_memory: false` on infer) until a recall-heavy evaluation shows a measured lift.

Typed request validation

Requests are schema-checked at the boundary. IDs are bounded, text inputs are capped, and mutation endpoints reject unknown fields.
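A client can mirror the "reject unknown fields" rule before a request leaves the process. The sketch below is illustrative only: `check_bind_payload` is a hypothetical helper, the allowed field set is copied from the bind example on this page and may be narrower than the real schema, and `MAX_ID_LEN` is a made-up bound because the service's actual limits are not stated here.

```python
# Field names taken from the bind example on this page; the real schema may accept more.
ALLOWED_BIND_FIELDS = {"model_id", "tenant_id", "namespace", "write_memory", "items"}
MAX_ID_LEN = 128  # hypothetical bound, not the service's documented limit


def check_bind_payload(payload):
    """Raise ValueError for unknown fields or out-of-bounds IDs; return the payload otherwise."""
    unknown = set(payload) - ALLOWED_BIND_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # Treating all three IDs as required strings is an assumption of this sketch.
    for field in ("model_id", "tenant_id", "namespace"):
        value = payload.get(field)
        if not isinstance(value, str) or not 0 < len(value) <= MAX_ID_LEN:
            raise ValueError(f"{field} must be a non-empty string of at most {MAX_ID_LEN} characters")
    return payload
```

The server remains the authority on what is valid. A local check like this only turns an avoidable rejected request into an earlier, cheaper error.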

Explicit mutation receipts

Mutating calls return an accepted envelope with receipt and hash metadata rather than hiding state changes behind side effects.

Operational health endpoints

Use `/livez`, `/readyz`, and `/health` for service state. Use `/metrics` with an admin-scoped key for Prometheus-formatted metrics.
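A deployment script might poll `/readyz` before sending tenant traffic. The following is a minimal sketch under two assumptions that this page does not confirm: that HTTP 200 from `/readyz` means "ready", and that a simple fixed-delay retry is acceptable. `is_ready` and `wait_until_ready` are hypothetical names.

```python
import time
import urllib.error
import urllib.request


def is_ready(base_url, timeout=2.0):
    """Return True if GET /readyz answers with HTTP 200 (assumed here to mean ready)."""
    try:
        with urllib.request.urlopen(base_url.rstrip("/") + "/readyz", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response all count as "not ready".
        return False


def wait_until_ready(check, attempts=10, delay=1.0, sleep=time.sleep):
    """Call check() up to `attempts` times, sleeping `delay` seconds between failed tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Usage against the local flow would look like `wait_until_ready(lambda: is_ready("http://127.0.0.1:8000"))`. The check is passed in as a callable so the retry loop can be exercised without a running service.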

Base URL and authentication

The base URL is deployment-specific. The shipped Python client defaults to `http://127.0.0.1:8000`, which matches the local `aci serve` flow.

Auth uses the `X-ACI-API-Key` header. Use tenant-scoped keys for automated traffic and reserve admin-scoped keys for model registration, tenant lifecycle, and metrics access.

export ACI_BASE_URL="http://127.0.0.1:8000"
export ACI_API_KEY="your-key"

curl "$ACI_BASE_URL/livez"
curl "$ACI_BASE_URL/models" -H "X-ACI-API-Key: $ACI_API_KEY"
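The same header-based auth can be reproduced from Python with only the standard library. This is a sketch, not the shipped client: `build_request` is a hypothetical helper, and it assumes the `ACI_BASE_URL` and `ACI_API_KEY` environment variables used in the shell snippet above.

```python
import os
import urllib.request


def build_request(path, base_url=None, api_key=None):
    """Build a GET request for `path`, attaching the API key header when a key is set."""
    base = (base_url or os.environ.get("ACI_BASE_URL", "http://127.0.0.1:8000")).rstrip("/")
    key = api_key if api_key is not None else os.environ.get("ACI_API_KEY")
    req = urllib.request.Request(base + path, method="GET")
    if key:
        # The key travels only in the header, never in the URL or the body.
        req.add_header("X-ACI-API-Key", key)
    return req
```

Sending is then `urllib.request.urlopen(build_request("/models"), timeout=5)` against a running service. Passing `api_key=""` builds an unauthenticated request, which matches the `/livez` call above.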

Machine-readable contract

  • The machine-readable schema is available from FastAPI at `/openapi.json` on a running service.
  • Auth is header-based with `X-ACI-API-Key`. Do not place API keys in query strings or payload fields.
  • The current structured-task cloud default leaves memory off for the first baseline.
  • Mutation endpoints return `status`, `namespace`, receipt or hash metadata, optional certificates, and per-call metrics.
  • Inference returns the result in an `output` field.
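Once `/openapi.json` has been fetched, the list of operations can be read straight from its `paths` object. The sketch below assumes a standard OpenAPI document; `list_operations` is a hypothetical helper, and the sample is a hand-written fragment for illustration, not output from a real service.

```python
import json

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def list_operations(schema):
    """Return sorted (METHOD, path) pairs from a parsed OpenAPI document."""
    ops = []
    for path, item in schema.get("paths", {}).items():
        for method in item:
            # Path items can also hold non-method keys such as "parameters"; skip those.
            if method.lower() in HTTP_METHODS:
                ops.append((method.upper(), path))
    return sorted(ops)


# Hand-written sample; a real document would be loaded from <base URL>/openapi.json.
sample = json.loads('{"paths": {"/bind": {"post": {}}, "/models": {"get": {}, "post": {}}}}')
print(list_operations(sample))
```

Comparing this list against the routes documented on this page is a quick way to notice drift between a deployed service and its documentation.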

Obtain and deploy

ACI Inference is a service product. It ships in three delivery forms: local evaluation through the wheel, private hosted rollout through the shipped service assets, and managed delivery that preserves the same public API and operator model.

Local evaluation

Install `aci-engine`, run `aci serve`, and point the HTTP client or CLI at `http://127.0.0.1:8000` for contract testing and integration work.

Private service deployment

Use the shipped container and Cloud Run assets, keep PostgreSQL-backed shared state for multi-instance serving, and preserve the documented health and metrics surface.

Managed delivery

If ACI Inference is delivered as a managed service, the buyer should still see the same API contract, tenant lifecycle, health endpoints, and operator-scoped access model described here.

Primary verbs

These calls act on an existing tenant engine, addressed by the same targeting tuple: `model_id`, `tenant_id`, and `namespace`.

POST /bind

Bind supervised items into a tenant.

POST /adapt

Adapt from transition-style payloads.

POST /constrain

Apply a typed rule capsule.

POST /infer

Run inference against a tenant engine.

POST /unbind

Remove specific contributions by ID.

GET /audit/{model_id}/{tenant_id}

Return tenant audit status and certificates.

POST /consolidate

Run consolidation with optional caps.

Admin and health surface

These routes support provisioning and operator workflows. They should not be treated as the same trust surface as tenant automation.

POST /models

Register a model backbone.

GET /models

List models visible to the current key.

POST /models/{model_id}/tenants

Create a tenant engine.

GET /models/{model_id}/tenants

List tenants for a model.

GET /models/{model_id}/tenants/{tenant_id}/status

Inspect tenant readiness and counts.

DELETE /models/{model_id}/tenants/{tenant_id}

Delete a tenant engine.

GET /metrics

Prometheus metrics. Admin-scoped API key required.

GET /livez | /readyz | /health

Liveness, readiness, and health checks.

Request examples

Mutation and inference calls take explicit JSON payloads; tenant creation and audit take path and query parameters. Below are representative calls for the most common flows.

Create tenant

curl -X POST "$ACI_BASE_URL/models/support-assistant/tenants?tenant_id=tenant-a&namespace=default&target_dim=4" \
  -H "X-ACI-API-Key: $ACI_API_KEY"
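When tenant IDs come from user data, building the query string by hand invites encoding mistakes. A small sketch using `urllib.parse` follows; `tenant_create_url` is a hypothetical helper, and the parameter names are the ones shown in the curl example above.

```python
from urllib.parse import quote, urlencode


def tenant_create_url(base_url, model_id, tenant_id, namespace, target_dim):
    """Build the POST URL for tenant creation with a properly encoded path and query."""
    query = urlencode({"tenant_id": tenant_id, "namespace": namespace, "target_dim": target_dim})
    # safe="" also escapes "/" so a model ID cannot introduce extra path segments.
    return f"{base_url.rstrip('/')}/models/{quote(model_id, safe='')}/tenants?{query}"


print(tenant_create_url("http://127.0.0.1:8000", "support-assistant", "tenant-a", "default", 4))
```

The resulting URL is then sent as a POST with the `X-ACI-API-Key` header, exactly as in the curl call.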

Bind

curl -X POST "$ACI_BASE_URL/bind" \
  -H "Content-Type: application/json" \
  -H "X-ACI-API-Key: $ACI_API_KEY" \
  -d '{
    "model_id": "support-assistant",
    "tenant_id": "tenant-a",
    "namespace": "default",
    "write_memory": false,
    "items": [
      {
        "id": "faq-1",
        "input": "Refunds require proof of purchase.",
        "target": {"label": "policy_refund"},
        "tags": ["policy"]
      }
    ]
  }'
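For callers that prefer Python to curl, the bind call above can be assembled with the standard library. This is a sketch rather than the `ACIInferenceClient` API: `bind_request` is a hypothetical helper, and the field names are copied from the JSON payload shown above.

```python
import json
import urllib.request


def bind_request(base_url, api_key, model_id, tenant_id, namespace, items, write_memory=False):
    """Build (but do not send) a POST /bind request carrying the given supervised items."""
    body = {
        "model_id": model_id,
        "tenant_id": tenant_id,
        "namespace": namespace,
        "write_memory": write_memory,  # memory stays off unless explicitly enabled
        "items": items,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/bind",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-ACI-API-Key": api_key},
        method="POST",
    )
```

Calling `urllib.request.urlopen(req, timeout=10)` on the result sends it; the response body is the mutation envelope described later on this page.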

Infer

curl -X POST "$ACI_BASE_URL/infer" \
  -H "Content-Type: application/json" \
  -H "X-ACI-API-Key: $ACI_API_KEY" \
  -d '{
    "model_id": "support-assistant",
    "tenant_id": "tenant-a",
    "namespace": "default",
    "input": "What is the refund policy?",
    "use_memory": false
  }'

Audit

curl "$ACI_BASE_URL/audit/support-assistant/tenant-a?namespace=default" \
  -H "X-ACI-API-Key: $ACI_API_KEY"

Mutation envelope

`bind`, `adapt`, `constrain`, `unbind`, and `consolidate` return the common accepted envelope.

{
  "status": "accepted",
  "job_mode": "sync",
  "namespace": "default",
  "state_verified": true,
  "receipt_id": "...",
  "certificates": [],
  "metrics": {"latency_ms": 12.4}
}
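A caller will usually want to fail loudly when a mutation was not accepted. The sketch below reads the envelope fields shown above; `check_mutation` is a hypothetical helper, and treating `state_verified: false` as an error is a policy assumption of this example, not a documented rule.

```python
def check_mutation(envelope):
    """Return the receipt ID of an accepted, verified mutation; raise RuntimeError otherwise."""
    status = envelope.get("status")
    if status != "accepted":
        raise RuntimeError(f"mutation not accepted: status={status!r}")
    # Assumption: an unverified state should stop the caller rather than pass silently.
    if envelope.get("state_verified") is not True:
        raise RuntimeError("mutation accepted but state was not verified")
    return envelope.get("receipt_id")
```

Storing the returned receipt ID alongside the request that produced it gives operators something concrete to reference during a later audit or rollback.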

Inference envelope

Inference returns the result in an `output` field; in the example below it is a four-element numeric array.

{
  "output": [0.12, 0.88, 0.0, 0.0]
}
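Reading the inference envelope is a matter of pulling out `output` and validating its shape. In the sketch below, `read_output` and `top_index` are hypothetical helpers. Taking the index of the largest value is only meaningful if the application treats positions as class-like scores, which this page does not state.

```python
def read_output(envelope):
    """Return the `output` array as a list of floats; raise if it is missing or empty."""
    output = envelope.get("output")
    if not isinstance(output, list) or not output:
        raise ValueError("inference envelope has no non-empty 'output' array")
    return [float(x) for x in output]


def top_index(values):
    """Index of the largest value (assumes positions are comparable scores)."""
    return max(range(len(values)), key=values.__getitem__)


scores = read_output({"output": [0.12, 0.88, 0.0, 0.0]})
print(scores, top_index(scores))
```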

Audit envelope

Audits report pass or fail and return any certificates generated by the audit path.

{
  "passed": true,
  "certificates": [
    {
      "cert_id": "cert_123",
      "type": "rollback",
      "verified": true
    }
  ]
}
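An operator script can reduce the audit envelope to a single pass or fail decision. The sketch below uses only the fields shown above; `unverified_certs` and `audit_ok` are hypothetical helpers, and requiring every certificate to be verified is a stricter policy than this page prescribes.

```python
def unverified_certs(audit):
    """Return the cert_id of every certificate that is not marked verified."""
    return [c.get("cert_id") for c in audit.get("certificates", []) if not c.get("verified")]


def audit_ok(audit):
    """True only if the audit passed and no certificate is left unverified."""
    return audit.get("passed") is True and not unverified_certs(audit)
```

Keeping the list of unverified certificate IDs separate from the boolean lets a caller log exactly which certificates need attention.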

Prefer a thin client to raw HTTP calls? `ACIInferenceClient` wraps this surface directly, and the CLI and MCP server sit on top of the same contract.

Keep secrets in environment variables or host secret storage. Keep operator admin paths separate from agent traffic.