Maniac Docs

Maniac Integration

How Maniac Desktop vendors mach, spawns mach-serve, and exposes Settings and env overrides.

Maniac Desktop runs mach out of process as a Python sidecar; it is not embedded in the Electron renderer. The main process spawns mach-serve from a managed venv and routes chat completions to its OpenAI-compatible HTTP API.
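As a rough illustration of what routing a chat completion to an OpenAI-compatible endpoint involves, the sketch below builds the request for POST /v1/chat/completions. The helper name buildChatCompletionRequest, the base URL, and the model name are assumptions made for this example; they are not taken from the desktop source.

```typescript
// Hypothetical helper: builds the HTTP request a client could send to a local
// OpenAI-compatible server. Host, port, and model name are assumptions.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildChatCompletionRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
): HttpRequest {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/v1/chat/completions`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, messages, stream: false }),
  };
}
```

The returned object can be handed to any HTTP client; keeping the builder free of I/O makes it easy to unit test without a running sidecar.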

Most users interact through Settings → Inference. Direct mach-serve remains the power-user path documented in Serving.

Vendored snapshot

The engine source is a committed snapshot at vendor/local_moe_engine/ — not a git submodule. Release builds cannot clone the private upstream repo with the repo-scoped GITHUB_TOKEN, so the tree is inlined.

| File | Purpose |
| --- | --- |
| vendor/local_moe_engine/ | Full engine source at pinned revision |
| vendor/local_moe_engine.pin.json | Machine-readable upstream commit pin |
| vendor/local_moe_engine/VENDORED.md | Provenance and refresh instructions |

Do not hand-edit the vendored tree. Refresh with:

pnpm run vendor:local-moe          # refresh to latest upstream main
pnpm run vendor:local-moe:check    # report drift (exit 1 if behind)

Implementation: scripts/vendor-local-moe-engine.mjs.

Managed Python venv

managedModelsVenv.ts installs:

vendor/local_moe_engine[dev,dflash,native]

into the desktop local-models venv when Local MoE is enabled. It builds lme_mlx_pread_ext if the extension is missing, matching Installation.

LocalMoeEngine.ts

src/main/localModels/engines/LocalMoeEngine.ts implements the LocalModelEngine contract:

  1. Resolves checkpoint and draft paths from catalog + env overrides
  2. Adjusts the serving mode when expert_sidecar/ is missing: a stacked checkpoint falls back to --no-streaming, while a sliced full-expert checkpoint (e.g. k256) keeps --streaming under the desktop's diagnostic fallback policy
  3. Spawns mach-serve <checkpoint> --streaming|--no-streaming with runtime options from buildLocalMoeServerArgv
  4. Polls GET /v1/models until ready (longer startup timeout than mlx-lm — first expert streams can take tens of seconds)

The upstream CLI owns the production streaming path; the desktop only passes the public mode selector and monitors HTTP readiness.
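The mode adjustment in step 2 and the argv shape in step 3 can be sketched as a pair of pure functions. This is a minimal sketch, not the code in LocalMoeEngine.ts: the names resolveServingMode and sketchServerArgv are hypothetical, and the choice of --streaming when the sidecar is present is an assumption, since the steps above only describe the missing-sidecar cases.

```typescript
type CheckpointLayout = "stacked" | "sliced";
type ServingFlag = "--streaming" | "--no-streaming";

interface ModeDecision {
  flag: ServingFlag;
  // True when the desktop keeps streaming without a sidecar and relies on
  // the diagnostic fallback policy.
  diagnosticFallback: boolean;
}

// Hypothetical helper mirroring step 2.
function resolveServingMode(
  layout: CheckpointLayout,
  hasExpertSidecar: boolean,
): ModeDecision {
  if (hasExpertSidecar) {
    // Assumption: with a sidecar present, streaming is used as-is.
    return { flag: "--streaming", diagnosticFallback: false };
  }
  if (layout === "stacked") {
    // Stacked checkpoint without a sidecar: serve without streaming.
    return { flag: "--no-streaming", diagnosticFallback: false };
  }
  // Sliced full-expert checkpoint without a sidecar: keep streaming.
  return { flag: "--streaming", diagnosticFallback: true };
}

// Hypothetical stand-in for the argv shape in step 3:
// mach-serve <checkpoint> --streaming|--no-streaming [runtime options]
function sketchServerArgv(
  checkpoint: string,
  decision: ModeDecision,
  runtimeOptions: string[] = [],
): string[] {
  return [checkpoint, decision.flag, ...runtimeOptions];
}
```

Returning a decision object rather than a bare flag keeps the fallback policy visible to callers that want to log why a mode was chosen.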

Compared to MlxLmEngine:

  • Longer startup timeout (600s) for sliced MoE cold start
  • Sidecar-aware serving mode resolution instead of hard desktop preflight failure
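Readiness polling with a generous deadline can be sketched as follows. This is an illustrative loop under stated assumptions, not the engine's implementation: waitUntilReady, Probe, and WaitOptions are hypothetical names, and the probe is injected so the loop can be exercised without a running server.

```typescript
// A probe resolves to true once the server answers successfully.
type Probe = () => Promise<boolean>;

interface WaitOptions {
  timeoutMs: number;
  intervalMs: number;
  // Injectable clock and sleep, so the loop can be tested deterministically.
  now?: () => number;
  sleep?: (ms: number) => Promise<void>;
}

async function waitUntilReady(probe: Probe, opts: WaitOptions): Promise<boolean> {
  const now = opts.now ?? (() => Date.now());
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const deadline = now() + opts.timeoutMs;

  while (true) {
    try {
      if (await probe()) return true;
    } catch {
      // A refused connection while the server is still booting is expected;
      // treat it as "not ready yet" and retry.
    }
    // Give up if the next attempt would start after the deadline.
    if (now() + opts.intervalMs > deadline) return false;
    await sleep(opts.intervalMs);
  }
}
```

In practice the probe could be something like `() => fetch(baseUrl + "/v1/models").then((r) => r.ok)`, with timeoutMs set to 600_000 to match the 600s startup timeout noted above; baseUrl here is a placeholder for wherever the sidecar listens.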

Environment variables

| Variable | Scope | Purpose |
| --- | --- | --- |
| MANIAC_LOCAL_MOE_ENABLED | Main process | Gate engine registry (1 to enable) |
| VITE_ENABLE_LOCAL_MOE | Renderer build | UI exposure for Local MoE catalog |
| MANIAC_LOCAL_MOE_PATH | Main / venv | Override path to engine root (default: vendor/local_moe_engine) |
| MANIAC_LOCAL_MOE_CHECKPOINT_PATH | Runner | Explicit checkpoint when catalog path lacks sidecar |
| MANIAC_LOCAL_MOE_DRAFT_PATH | Runner | Explicit DFlash draft directory |
| MANIAC_LOCAL_MOE_DEFAULT_CATALOG_SLUG | Catalog | Pin default model row in UI |

Example .env fragment:

MANIAC_LOCAL_MOE_ENABLED=1
MANIAC_LOCAL_MOE_PATH=/absolute/path/to/local_moe_engine
# MANIAC_LOCAL_MOE_CHECKPOINT_PATH=/path/to/stacked-or-sliced-checkpoint

See .env.example and resources/local-models/README.md for friend-machine runbooks.
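To show how the main-process and runner variables might be read in one place, here is a small sketch. The function readLocalMoeEnv and the LocalMoeEnv shape are hypothetical; only the variable names, the "1 to enable" rule, and the default engine path come from the table above. Treating blank values as unset is an assumption.

```typescript
interface LocalMoeEnv {
  enabled: boolean;
  enginePath: string;
  checkpointPath?: string;
  draftPath?: string;
  defaultCatalogSlug?: string;
}

// Hypothetical reader for the variables documented above.
function readLocalMoeEnv(env: Record<string, string | undefined>): LocalMoeEnv {
  // Assumption: empty or whitespace-only values count as unset.
  const nonEmpty = (value: string | undefined): string | undefined => {
    const trimmed = value?.trim();
    return trimmed ? trimmed : undefined;
  };

  return {
    // The table documents "1 to enable"; any other value is treated as off.
    enabled: env["MANIAC_LOCAL_MOE_ENABLED"] === "1",
    enginePath: nonEmpty(env["MANIAC_LOCAL_MOE_PATH"]) ?? "vendor/local_moe_engine",
    checkpointPath: nonEmpty(env["MANIAC_LOCAL_MOE_CHECKPOINT_PATH"]),
    draftPath: nonEmpty(env["MANIAC_LOCAL_MOE_DRAFT_PATH"]),
    defaultCatalogSlug: nonEmpty(env["MANIAC_LOCAL_MOE_DEFAULT_CATALOG_SLUG"]),
  };
}
```

In a Node or Electron main process this could be called as readLocalMoeEnv(process.env). VITE_ENABLE_LOCAL_MOE is left out because it is consumed at renderer build time rather than read at runtime.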

Dev workflow

scripts/dev.mjs auto-enables Local MoE defaults when a usable engine path exists:

MANIAC_LOCAL_MOE_ENABLED=1 pnpm run dev

To disable the fast path:

MANIAC_LOCAL_MOE_ENABLED=0 pnpm run dev

User-facing surfaces

SurfaceBehavior
Settings → InferencePick Local MoE catalog model; desktop spawns sidecar
Local model catalogRows tagged engineId: local-moe via looksLocalMoeCompatible
Direct mach-servePower users / benchmarks — bypasses desktop spawn

Checkpoint resolution

resolveLocalMoeRunnerCheckpointPath.ts maps catalog slugs to paths under MANIAC_LOCAL_MOE_PATH, with fallbacks to MANIAC_LOCAL_MOE_CHECKPOINT_PATH and default pipeline artifacts when the catalog entry lacks expert_sidecar/.

Draft path resolution mirrors this in resolveLocalMoeRunnerDraftPath.ts.
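The fallback chain described above can be sketched as a pure function. This is not the code in resolveLocalMoeRunnerCheckpointPath.ts: the name resolveCheckpoint, its input shape, the ordering of the env override ahead of the default artifact, and the final return of the catalog path are all assumptions made for illustration.

```typescript
type CheckpointSource = "catalog" | "env" | "default";

interface ResolveInput {
  // Path the catalog slug maps to.
  catalogPath: string;
  // Value of MANIAC_LOCAL_MOE_CHECKPOINT_PATH, if set.
  envCheckpointPath?: string;
  // Default pipeline artifact location, if one exists.
  defaultArtifactPath?: string;
  // Injected check for an expert_sidecar/ directory under a checkpoint.
  hasSidecar: (checkpointDir: string) => boolean;
}

interface ResolvedCheckpoint {
  path: string;
  source: CheckpointSource;
}

function resolveCheckpoint(input: ResolveInput): ResolvedCheckpoint {
  // Prefer the catalog entry when it ships its own sidecar.
  if (input.hasSidecar(input.catalogPath)) {
    return { path: input.catalogPath, source: "catalog" };
  }
  // Otherwise fall back to the explicit override, then the default artifact.
  if (input.envCheckpointPath) {
    return { path: input.envCheckpointPath, source: "env" };
  }
  if (input.defaultArtifactPath) {
    return { path: input.defaultArtifactPath, source: "default" };
  }
  // Assumption: with no fallback available, return the catalog path so the
  // engine can still adjust the serving mode instead of failing outright.
  return { path: input.catalogPath, source: "catalog" };
}
```

Injecting hasSidecar keeps the function free of filesystem access; a real caller would back it with a directory existence check.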

Related pages

  • Installation: extras the venv installs
  • Serving: flags the desktop forwards
  • CLI: building engine-format artifacts for local dev
  • Troubleshooting: slow path when sidecar or native ext is missing
