Maniac Integration

How Maniac Desktop vendors mach, spawns mach-serve, and exposes Settings and env overrides.

Maniac Desktop runs mach out of process as a Python sidecar; it is not embedded in the Electron renderer. The main process spawns mach-serve from a managed venv and routes chat completions to its OpenAI-compatible HTTP API.
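As a rough illustration of the routing step, the sketch below builds a chat completion request for an OpenAI-compatible server. It assumes mach-serve exposes the standard `/v1/chat/completions` path; the base URL, `ChatMessage`, and `buildChatCompletionRequest` are hypothetical names for illustration, not the desktop's actual code.

```typescript
// Hypothetical sketch: ChatMessage, PreparedRequest and buildChatCompletionRequest
// are illustrative names, not part of Maniac Desktop or mach.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildChatCompletionRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
): PreparedRequest {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    // Assumption: the sidecar serves the standard OpenAI chat completions path.
    url: `${root}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}
```

The returned `url` and `init` can be handed to `fetch(url, init)` once the sidecar reports ready.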

Most users interact through Settings → Inference. Direct mach-serve remains the power-user path documented in Serving.

Vendored snapshot

The engine source is a committed snapshot at vendor/local_moe_engine/ — not a git submodule. Release builds cannot clone the private upstream repo with the repo-scoped GITHUB_TOKEN, so the tree is inlined.

File	Purpose
`vendor/local_moe_engine/`	Full engine source at pinned revision
`vendor/local_moe_engine.pin.json`	Machine-readable upstream commit pin
`vendor/local_moe_engine/VENDORED.md`	Provenance and refresh instructions

Do not hand-edit the vendored tree. Refresh with:

pnpm run vendor:local-moe          # refresh to latest upstream main
pnpm run vendor:local-moe:check    # report drift (exit 1 if behind)

Implementation: scripts/vendor-local-moe-engine.mjs.
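The drift check can be pictured as a comparison between the pinned commit and the upstream head. The sketch below is only an illustration: the `commit` field and both function names are assumptions, and the real schema of `local_moe_engine.pin.json` and the logic in the vendoring script may differ.

```typescript
// Hypothetical pin shape: the real pin file may use different field names.
interface VendorPin {
  commit: string;
}

function parsePin(json: string): VendorPin {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("pin must be a JSON object");
  }
  const commit = (data as Record<string, unknown>)["commit"];
  if (typeof commit !== "string" || commit.length === 0) {
    throw new Error("pin is missing a commit");
  }
  return { commit };
}

// Exit code a drift check could return: 0 when the pin matches upstream, 1 otherwise.
// This sketch treats any mismatch as drift; it does not verify ancestry.
function driftExitCode(pin: VendorPin, upstreamHead: string): 0 | 1 {
  return pin.commit === upstreamHead ? 0 : 1;
}
```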

Managed Python venv

managedModelsVenv.ts installs:

vendor/local_moe_engine[dev,dflash,native]

into the desktop local-models venv when Local MoE is enabled. It also builds the lme_mlx_pread_ext extension if it is missing, matching the steps in Installation.

LocalMoeEngine.ts

src/main/localModels/engines/LocalMoeEngine.ts implements the LocalModelEngine contract:

Resolves checkpoint and draft paths from catalog + env overrides
Adjusts the serving mode when expert_sidecar/ is missing: stacked checkpoints fall back to --no-streaming; sliced full-expert checkpoints (e.g. k256) keep --streaming, with a diagnostic fallback policy on desktop
Spawns mach-serve <checkpoint> --streaming|--no-streaming with runtime options from buildLocalMoeServerArgv
Polls GET /v1/models until ready (longer startup timeout than mlx-lm — first expert streams can take tens of seconds)
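The serving mode adjustment described in the list above can be sketched as a small pure function. All names here (`resolveServingMode`, `sketchServeArgv`, `CheckpointLayout`) are hypothetical, and the sketch assumes the requested mode passes through unchanged when the sidecar is present; the actual logic lives in LocalMoeEngine.ts and buildLocalMoeServerArgv.

```typescript
// Hypothetical sketch of the sidecar-aware mode selection; not the desktop's real code.
type ServingFlag = "--streaming" | "--no-streaming";
type CheckpointLayout = "stacked" | "sliced";

interface ServingDecision {
  flag: ServingFlag;
  // True when streaming is kept without a sidecar and diagnostics should be surfaced.
  diagnosticFallback: boolean;
}

function resolveServingMode(
  requested: ServingFlag,
  layout: CheckpointLayout,
  hasExpertSidecar: boolean,
): ServingDecision {
  // Assumption: with the sidecar present, the requested mode is used as-is.
  if (hasExpertSidecar) return { flag: requested, diagnosticFallback: false };
  // Stacked checkpoints without a sidecar fall back to non-streaming.
  if (layout === "stacked") return { flag: "--no-streaming", diagnosticFallback: false };
  // Sliced full-expert checkpoints keep the requested mode and flag diagnostics.
  return { flag: requested, diagnosticFallback: requested === "--streaming" };
}

// Builds the argument list that follows the mach-serve executable name.
function sketchServeArgv(
  checkpoint: string,
  decision: ServingDecision,
  extra: string[] = [],
): string[] {
  return [checkpoint, decision.flag, ...extra];
}
```

Keeping the decision separate from the argv makes the fallback easy to unit test without spawning a process.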

The upstream CLI owns the production streaming path; the desktop only passes the public mode selector and monitors HTTP readiness.

Compared to MlxLmEngine:

Longer startup timeout (600s) for sliced MoE cold start
Sidecar-aware serving mode resolution instead of hard desktop preflight failure
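A readiness poll with a long deadline might look like the following sketch. The probe is injected so the loop stays independent of any HTTP client; `waitUntilReady`, the polling interval, and the example URL are assumptions for illustration, not the engine's actual implementation.

```typescript
// Hypothetical sketch of HTTP readiness polling with a startup deadline.
type Probe = () => Promise<boolean>;

async function waitUntilReady(
  probe: Probe,
  timeoutMs: number,
  intervalMs: number,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      if (await probe()) return true;
    } catch {
      // The server may refuse connections while it is still starting; retry.
    }
    // Stop if another sleep would overshoot the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Example usage (assumed base URL), polling GET /v1/models for up to 600 s:
//   await waitUntilReady(
//     async () => (await fetch("http://127.0.0.1:8080/v1/models")).ok,
//     600_000,
//     500,
//   );
```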

Environment variables

Variable	Scope	Purpose
`MANIAC_LOCAL_MOE_ENABLED`	Main process	Gate engine registry (`1` to enable)
`VITE_ENABLE_LOCAL_MOE`	Renderer build	UI exposure for Local MoE catalog
`MANIAC_LOCAL_MOE_PATH`	Main / venv	Override path to engine root (default: `vendor/local_moe_engine`)
`MANIAC_LOCAL_MOE_CHECKPOINT_PATH`	Runner	Explicit checkpoint when catalog path lacks sidecar
`MANIAC_LOCAL_MOE_DRAFT_PATH`	Runner	Explicit DFlash draft directory
`MANIAC_LOCAL_MOE_DEFAULT_CATALOG_SLUG`	Catalog	Pin default model row in UI

Example .env fragment:

MANIAC_LOCAL_MOE_ENABLED=1
MANIAC_LOCAL_MOE_PATH=/absolute/path/to/local_moe_engine
# MANIAC_LOCAL_MOE_CHECKPOINT_PATH=/path/to/stacked-or-sliced-checkpoint

See .env.example and resources/local-models/README.md for friend-machine runbooks.
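To show how these variables might be consumed together, here is a minimal sketch that reads them from an environment map. `LocalMoeEnvConfig` and `readLocalMoeEnv` are hypothetical names; the only facts taken from the table above are the variable names, the `1` gate, and the default engine path.

```typescript
// Hypothetical sketch: the desktop's real env handling may differ.
type Env = Record<string, string | undefined>;

interface LocalMoeEnvConfig {
  enabled: boolean;
  enginePath: string;
  checkpointPath: string | undefined;
  draftPath: string | undefined;
  defaultCatalogSlug: string | undefined;
}

// Treats unset and whitespace-only values the same way.
function nonEmpty(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

function readLocalMoeEnv(env: Env): LocalMoeEnvConfig {
  return {
    // Only the literal "1" enables the engine, per the table above.
    enabled: env["MANIAC_LOCAL_MOE_ENABLED"] === "1",
    enginePath: nonEmpty(env["MANIAC_LOCAL_MOE_PATH"]) ?? "vendor/local_moe_engine",
    checkpointPath: nonEmpty(env["MANIAC_LOCAL_MOE_CHECKPOINT_PATH"]),
    draftPath: nonEmpty(env["MANIAC_LOCAL_MOE_DRAFT_PATH"]),
    defaultCatalogSlug: nonEmpty(env["MANIAC_LOCAL_MOE_DEFAULT_CATALOG_SLUG"]),
  };
}
```

In the main process this would be called as `readLocalMoeEnv(process.env)`.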

Dev workflow

scripts/dev.mjs auto-enables Local MoE defaults when a usable engine path exists. To enable it explicitly:

MANIAC_LOCAL_MOE_ENABLED=1 pnpm run dev

To disable the fast path:

MANIAC_LOCAL_MOE_ENABLED=0 pnpm run dev
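One plausible reading of this behavior is that an explicit value always wins and auto-detection only applies when the variable is unset. The sketch below encodes that assumption; `resolveLocalMoeEnabled` is a hypothetical name and scripts/dev.mjs may decide differently.

```typescript
// Hypothetical sketch of the dev-time default; the precedence shown is an assumption.
function resolveLocalMoeEnabled(
  explicit: string | undefined,
  enginePathUsable: boolean,
): boolean {
  if (explicit === "1") return true; // explicit opt-in
  if (explicit === "0") return false; // explicit opt-out
  // Unset or unrecognized: enable only when a usable engine path was found.
  return enginePathUsable;
}
```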

User-facing surfaces

Surface	Behavior
Settings → Inference	Pick Local MoE catalog model; desktop spawns sidecar
Local model catalog	Rows tagged `engineId: local-moe` via `looksLocalMoeCompatible`
Direct `mach-serve`	Power users / benchmarks — bypasses desktop spawn

Checkpoint resolution

resolveLocalMoeRunnerCheckpointPath.ts maps catalog slugs to paths under MANIAC_LOCAL_MOE_PATH, with fallbacks to MANIAC_LOCAL_MOE_CHECKPOINT_PATH and default pipeline artifacts when the catalog entry lacks expert_sidecar/.

Draft path resolution mirrors this in resolveLocalMoeRunnerDraftPath.ts.
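The fallback chain can be sketched as an ordered search. The order used here (catalog path, then the explicit override, then a default pipeline artifact) is an assumption read from the prose above, and `pickCheckpoint`, `CheckpointCandidates`, and the injected `hasSidecar` check are hypothetical names rather than the resolver's real API.

```typescript
// Hypothetical sketch of checkpoint resolution; the fallback order is an assumption.
interface CheckpointCandidates {
  catalogPath: string;
  envOverride: string | undefined; // e.g. MANIAC_LOCAL_MOE_CHECKPOINT_PATH
  defaultArtifact: string | undefined; // default pipeline output, if any
}

interface CheckpointChoice {
  path: string;
  source: "catalog" | "env" | "default";
}

function pickCheckpoint(
  candidates: CheckpointCandidates,
  hasSidecar: (path: string) => boolean,
): CheckpointChoice {
  // Prefer the catalog entry when it already has expert_sidecar/.
  if (hasSidecar(candidates.catalogPath)) {
    return { path: candidates.catalogPath, source: "catalog" };
  }
  // Otherwise honor an explicit override without second-guessing it.
  if (candidates.envOverride !== undefined) {
    return { path: candidates.envOverride, source: "env" };
  }
  // Then try a default pipeline artifact that has a sidecar.
  if (candidates.defaultArtifact !== undefined && hasSidecar(candidates.defaultArtifact)) {
    return { path: candidates.defaultArtifact, source: "default" };
  }
  // Nothing better found: return the catalog path and let mode resolution adapt.
  return { path: candidates.catalogPath, source: "catalog" };
}
```

Injecting `hasSidecar` keeps the function free of filesystem access, so the same shape could serve draft path resolution with a different predicate.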

Installation — extras the venv installs
Serving — flags the desktop forwards
CLI — building engine-format artifacts for local dev
Troubleshooting — slow path when sidecar or native ext is missing
