Command Line
Use the ACI CLI when the workflow belongs in a terminal: local serving, shared-service inference operations, native edge-runtime packaging, or MCP startup.
Run ACI Inference locally with `aci serve` for development, test harnesses, and packaging validation.
For cloud deployments of structured tasks, create tenants first and keep memory disabled in the initial baseline.
The inference subcommands call the same HTTP contract exposed by `ACIInferenceClient` and the API docs surface.
Mutation verbs accept payloads from `--json`, `--file`, or `--stdin`, which keeps scripts and CI paths explicit.
The same binary also exposes native runtime packaging helpers and stdio MCP startup.
Installing the Python package provides `aci`, `aci-mcp`, and `aci-inference-mcp`. The CLI itself is the operator and integration wrapper around the documented public surfaces.
pip install aci-engine
aci --version
aci serve --host 127.0.0.1 --port 8000
aci-mcp
aci-inference-mcp
ACI_BASE_URL: Default base URL for inference and MCP flows.
ACI_API_KEY: Default API key for inference and MCP flows.
ACI_TIMEOUT_SECS: Default request timeout in seconds.
ACI_MCP_ALLOW_ADMIN: Default admin-tool toggle for MCP startup.
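Wrapper scripts often want to honor the same environment defaults the CLI documents. A minimal sketch of that resolution; the precedence below (explicit value, then environment variable, then a hard-coded fallback) is an illustrative assumption, not the CLI's documented resolution order:

```python
import os

# Sketch: resolve connection defaults for a wrapper script from the
# documented ACI_* environment variables. The fallback values and the
# precedence (explicit arg > env var > fallback) are assumptions made
# for illustration, not behavior guaranteed by the `aci` binary.
def resolve_defaults(base_url=None, api_key=None, timeout_secs=None):
    return {
        "base_url": base_url or os.environ.get("ACI_BASE_URL", "http://127.0.0.1:8000"),
        "api_key": api_key or os.environ.get("ACI_API_KEY", ""),
        "timeout_secs": float(
            timeout_secs if timeout_secs is not None
            else os.environ.get("ACI_TIMEOUT_SECS", "30")
        ),
    }

os.environ["ACI_BASE_URL"] = "https://aci.example.internal"
os.environ["ACI_TIMEOUT_SECS"] = "10"
print(resolve_defaults())
# Explicit arguments override the environment.
print(resolve_defaults(base_url="http://127.0.0.1:8000", timeout_secs=5))
```

The same dictionary can then feed `--base-url`, `--api-key`, and timeout flags when shelling out to `aci`.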
The CLI is not a separate product. It is the operator and packaging surface that comes with the wheel, manages the service locally, and exports the native edge artifacts.
The CLI arrives with the `aci-engine` wheel and gives operators one entry point for local service startup, health checks, and tenant lifecycle work.
Use the CLI as the build and export surface for native runtime artifacts; the final shipped edge runtime is the compiled package it produces, not the Python CLI itself.
The CLI also starts the stdio MCP server, but that process still sits on top of an existing ACI Inference deployment rather than replacing it.
`aci serve` launches the FastAPI app locally. It is the fastest path for contract testing, local integration work, and MCP attachment against a development instance.
aci serve --host 0.0.0.0 --port 8000 --log-level info
curl http://127.0.0.1:8000/livez
curl http://127.0.0.1:8000/readyz
Inference subcommands mirror the public HTTP surface: model registration, tenant lifecycle, operational verbs, and health checks. The legacy `aci platform ...` path remains supported as an alias.
aci inference models list --base-url "$ACI_BASE_URL" --api-key "$ACI_API_KEY"
aci inference models register \
  --base-url "$ACI_BASE_URL" \
  --api-key "$ACI_API_KEY" \
  --model-id support-assistant \
  --device cpu
aci inference tenants create \
  --base-url "$ACI_BASE_URL" \
  --api-key "$ACI_API_KEY" \
  --model-id support-assistant \
  --tenant-id tenant-a \
  --target-dim 4
aci inference bind --base-url "$ACI_BASE_URL" --api-key "$ACI_API_KEY" --file bind.json
aci inference infer \
--base-url "$ACI_BASE_URL" \
--api-key "$ACI_API_KEY" \
--json '{"model_id":"support-assistant","tenant_id":"tenant-a","namespace":"default","input":"hello","use_memory":false}'
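Hand-writing inline JSON gets brittle once scripts parameterize the request. A helper can assemble the infer payload from the fields shown above instead; this is a sketch (the helper name and defaults are ours, not part of ACI):

```python
import json

# Sketch: build the infer payload shown above from script variables
# instead of hand-writing inline JSON. Field names mirror the documented
# example; nothing here calls the service, it only produces the string
# you would pass to `--json`.
def infer_payload(model_id, tenant_id, text, namespace="default", use_memory=False):
    return json.dumps({
        "model_id": model_id,
        "tenant_id": tenant_id,
        "namespace": namespace,
        "input": text,
        "use_memory": use_memory,
    })

print(infer_payload("support-assistant", "tenant-a", "hello"))
```

The printed string can be passed directly as the `--json` argument, or written to a file for `--file`.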
aci inference audit \
--base-url "$ACI_BASE_URL" \
--api-key "$ACI_API_KEY" \
--model-id support-assistant \
--tenant-id tenant-a
aci inference livez --base-url "$ACI_BASE_URL"
aci inference readyz --base-url "$ACI_BASE_URL"

The mutation commands `bind`, `adapt`, `constrain`, `infer`, `unbind`, and `consolidate` require exactly one payload source.
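The exactly-one rule can be sketched as the check a wrapper script might perform before shelling out to `aci`. The function and its error wording are illustrative, not the CLI's own validation:

```python
import json
import sys

# Sketch: enforce "exactly one payload source" (--json, --file, or --stdin)
# the way a wrapper around the mutation verbs might. The error message is
# ours; the real CLI's wording may differ.
def load_payload(json_arg=None, file_arg=None, use_stdin=False):
    sources = [s for s in (json_arg, file_arg, use_stdin or None) if s]
    if len(sources) != 1:
        raise ValueError("provide exactly one of --json, --file, or --stdin")
    if json_arg is not None:
        return json.loads(json_arg)
    if file_arg is not None:
        with open(file_arg) as fh:
            return json.load(fh)
    return json.load(sys.stdin)

print(load_payload(json_arg='{"model_id": "m1"}'))
```

Zero sources and multiple sources both fail fast, which keeps script and CI failures explicit rather than silently preferring one input.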
# Inline JSON
aci inference constrain --api-key "$ACI_API_KEY" --json '{"model_id":"m1", ... }'
# JSON file
aci inference bind --api-key "$ACI_API_KEY" --file bind.json
# STDIN
cat infer.json | aci inference infer --api-key "$ACI_API_KEY" --stdin

The CLI also exposes native build helpers and the stdio MCP launcher, so packaging and agent attachment stay inside the same public binary surface.
aci edge-runtime build-native --output-dir build/native
aci edge-runtime package-embedder --output-dir build/embedder
aci mcp serve \
  --base-url "$ACI_BASE_URL" \
  --api-key "$ACI_API_KEY" \
  --allow-admin
# equivalent branded entry point
aci-inference-mcp