BRICKS Foundation can offload LLM, MLX, and speech-to-text inference to a BRICKS Buttress server on the same LAN. The brick’s generator API stays the same; the device transparently delegates the work to the server when it would run faster (or fit at all) there.

Requirements

How it works

  1. When the device’s workspace has at least one bound Buttress server, the launcher starts a discovery manager and obtains a workspace-scoped JWT.
  2. The manager listens for UDP announcements on port 8089 and pools every server whose serverId is on the workspace’s bound list.
  3. Each LLM, MLX, or STT brick reads its Buttress (Remote Inference) group and either picks a server automatically or uses a manually configured URL.
  4. The capability comparison recommends running locally, remotely, or either; the strategy you pick on the brick decides how that recommendation is interpreted.

If the launcher can’t reach a server (no LAN, server offline, workspace mismatch), what happens next depends on the brick’s Fallback setting: use-local runs the work on the device, while the default no-op does nothing locally.
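
The pooling in step 2 can be pictured as a filter over incoming announcements. The sketch below is illustrative only: the announcement shape (a dict with `serverId` and `url` keys), the example addresses, and the function name `pool_servers` are assumptions, not the launcher's actual wire format or API.

```python
from typing import Iterable


def pool_servers(announcements: Iterable[dict], bound_ids: set[str]) -> dict[str, dict]:
    """Keep only announcements whose serverId is on the workspace's bound list.

    The announcement layout is hypothetical; the real UDP payload may differ.
    """
    pool: dict[str, dict] = {}
    for ann in announcements:
        server_id = ann.get("serverId")
        if server_id in bound_ids:
            # A later announcement from the same server replaces the earlier one.
            pool[server_id] = ann
    return pool


announcements = [
    {"serverId": "srv-a", "url": "ws://10.0.0.5:2080"},
    {"serverId": "srv-x", "url": "ws://10.0.0.9:2080"},  # not bound to this workspace
]
pooled = pool_servers(announcements, bound_ids={"srv-a", "srv-b"})
```

Here only `srv-a` ends up in the pool: `srv-x` is announcing on the LAN but is not bound to the workspace, and `srv-b` is bound but has not announced itself.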

Configure offloading on a brick

In BRICKS Controller > Config Editor, open any LLM or STT brick. The Buttress (Remote Inference) property group appears under Connection.

| Field | Default | Effect |
| --- | --- | --- |
| Enabled | false | Turn Buttress on for this brick |
| Auto-discover | Auto | Auto finds a server via UDP; Manual uses the URL field |
| URL | empty | WebSocket URL when Auto-discover is Manual (e.g. ws://buttress.lan:2080) |
| Strategy | prefer-buttress | How the device picks between local and remote |
| Fallback | no-op | What to do when Buttress is enabled but unavailable |
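
To make the field interactions concrete, here is a minimal sketch that mirrors the property group as a Python dataclass. The class name, attribute names, and the `problems` check are hypothetical; the Config Editor may store and validate these values differently.

```python
from dataclasses import dataclass


@dataclass
class ButtressSettings:
    """Hypothetical mirror of the Buttress (Remote Inference) property group."""
    enabled: bool = False
    auto_discover: str = "Auto"        # "Auto" or "Manual"
    url: str = ""                      # only used when auto_discover is "Manual"
    strategy: str = "prefer-buttress"
    fallback: str = "no-op"

    def problems(self) -> list[str]:
        """Return human-readable issues with this combination of fields."""
        issues = []
        if self.enabled and self.auto_discover == "Manual" and not self.url:
            issues.append("Manual discovery needs a WebSocket URL")
        if self.url and not self.url.startswith(("ws://", "wss://")):
            issues.append("URL should be a WebSocket URL (ws:// or wss://)")
        return issues


defaults = ButtressSettings()
manual = ButtressSettings(enabled=True, auto_discover="Manual")
```

With the defaults nothing is flagged, while `manual.problems()` reports the missing URL, which is the one combination in the table that cannot work as configured.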

Strategies

| Strategy | Behavior |
| --- | --- |
| prefer-buttress | Always send work to Buttress when a server is available. Local hardware is not probed. |
| prefer-local | Run locally if the device has enough memory; otherwise fall back to Buttress. |
| prefer-best | Compare scores and run on whichever side is faster. |

prefer-buttress is the default: Foundation devices that opt into Buttress almost always do so because the local hardware is not the fastest path.
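
The three strategies can be read as a small decision function. This is a sketch under stated assumptions, not the device's actual logic: it assumes a Buttress server is reachable, that a higher score means faster, and that ties go to the device. The function name and parameters are hypothetical.

```python
def choose_target(strategy: str, local_has_memory: bool,
                  local_score: float, remote_score: float) -> str:
    """Return "local" or "remote" for one request.

    Assumes a Buttress server is reachable; the unreachable case is
    governed by the brick's Fallback setting instead.
    """
    if strategy == "prefer-buttress":
        # Local hardware is not probed at all.
        return "remote"
    if strategy == "prefer-local":
        return "local" if local_has_memory else "remote"
    if strategy == "prefer-best":
        # Assumption: higher score means faster; ties stay on the device.
        return "remote" if remote_score > local_score else "local"
    raise ValueError(f"unknown strategy: {strategy}")
```

For example, `choose_target("prefer-local", False, 1.0, 5.0)` returns `"remote"` because the device lacks memory, regardless of the scores.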

Fallback

| Fallback | Behavior when Buttress is enabled but unreachable |
| --- | --- |
| no-op (default) | The brick does nothing locally: no model is downloaded, no completion runs |
| use-local | The brick falls back to local execution exactly as if Buttress was off |

Pick use-local if you want the brick to keep working when the LAN drops; pick no-op if you would rather see a clear failure than silently consume battery on a model the device can’t handle.
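
A minimal sketch of how the Enabled and Fallback fields combine when no server answers. The function name and the returned labels are illustrative assumptions, not identifiers from the BRICKS runtime.

```python
def plan_execution(enabled: bool, server_reachable: bool, fallback: str) -> str:
    """Describe what a brick does for one request (hypothetical labels)."""
    if not enabled:
        return "local"            # Buttress is off: normal local execution
    if server_reachable:
        return "apply-strategy"   # the Strategy setting takes over from here
    if fallback == "use-local":
        return "local"            # behave exactly as if Buttress was off
    if fallback == "no-op":
        return "nothing"          # no model download, no completion
    raise ValueError(f"unknown fallback: {fallback}")
```

With the defaults from the table (`fallback="no-op"`), enabling Buttress on a device that cannot reach a server yields `"nothing"`, which is why a brick can appear to stop working after Buttress is switched on.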

When the workspace changes

If the device’s workspace changes — for example, an admin reassigns it from BRICKS Controller — the launcher:
  1. Stops the active Buttress manager and closes any open WebSocket connections.
  2. Discards the cached access token.
  3. Starts a new manager with the new workspace’s bound-server list and a freshly issued token.

In-flight generators that were authenticated against the old workspace error out cleanly rather than entering an infinite reconnect loop.
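
The ordering of those three steps can be sketched as follows. `Launcher`, `FakeManager`, and the `issue_token` callable are hypothetical stand-ins used only to show the sequence; they are not the launcher's real classes.

```python
class FakeManager:
    """Stand-in for the discovery manager; records whether it was stopped."""

    def __init__(self, workspace_id, token):
        self.workspace_id = workspace_id
        self.token = token
        self.running = True

    def stop(self):
        # The real manager would also close open WebSocket connections here.
        self.running = False


class Launcher:
    def __init__(self, issue_token):
        self._issue_token = issue_token   # callable: workspace_id -> token
        self.manager = None
        self.token = None

    def set_workspace(self, workspace_id):
        if self.manager is not None:
            self.manager.stop()                        # 1. stop the active manager
        self.token = None                              # 2. discard the cached token
        self.token = self._issue_token(workspace_id)   # 3. fresh token ...
        self.manager = FakeManager(workspace_id, self.token)  # ... and a new manager
        return self.manager


launcher = Launcher(issue_token=lambda ws: f"token-for-{ws}")
old = launcher.set_workspace("ws-1")
new = launcher.set_workspace("ws-2")
```

After the second call the old manager is stopped and the new one holds a token issued for `ws-2`, so nothing keeps using credentials from the previous workspace.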

Audio uploads (STT)

Speech-to-text transcription needs the audio file on the server. The brick uploads to POST /buttress/upload over HTTPS, and the server stores the file in the temp directory configured by [server] temp_file_dir (default <os-tmpdir>/.buttress). After transcription, the file is auto-cleaned along with the rest of the session’s temp files.
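
As an illustration of the `<os-tmpdir>/.buttress` default, the following sketch shows how that path could be resolved in Python terms. The function name is hypothetical, and the server's own implementation and language may differ.

```python
import os
import tempfile


def resolve_temp_file_dir(configured=None):
    """Return the configured directory, or <os-tmpdir>/.buttress when unset."""
    if configured:
        return configured
    return os.path.join(tempfile.gettempdir(), ".buttress")
```

An explicit `temp_file_dir` value wins; otherwise the result depends on the operating system's temp directory.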

Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Brick logs `no LAN provider is registered` | Device is not associated with a workspace, or no servers are bound to that workspace | Bind the device, or run `bricks buttress bind` to pair a server |
| Brick logs `no '<type>' endpoint yet` | UDP discovery hasn’t returned a server within 10 seconds | Check that the server is on the same subnet and the [autodiscover] block isn’t disabled |
| WebSocket closes with code 1008 | The device’s token doesn’t match the server’s bound workspace | Confirm both sides are on the same workspace; restart the launcher |
| Brick “stops working” after enabling Buttress | Fallback is set to no-op and no server is reachable | Switch fallback to use-local, or fix LAN connectivity |
| `Unknown generator id` error | Server restarted or evicted the loaded model | The brick recovers automatically by re-initializing the generator on the next call |
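
The recovery described for the Unknown generator id error can be pictured as forgetting the stale id so that the next call initializes again. The sketch below is one possible reading, not the brick's actual code: `UnknownGeneratorId`, `RecoveringGenerator`, and `FakeServer` are hypothetical names, and whether the failing call itself surfaces an error is an assumption.

```python
class UnknownGeneratorId(Exception):
    """Hypothetical error type standing in for the server's response."""


class FakeServer:
    """Toy server that hands out generator ids and can forget them."""

    def __init__(self):
        self.loaded = set()
        self.next_id = 0

    def init(self):
        self.next_id += 1
        gen_id = f"gen-{self.next_id}"
        self.loaded.add(gen_id)
        return gen_id

    def run(self, gen_id, prompt):
        if gen_id not in self.loaded:
            raise UnknownGeneratorId(gen_id)
        return f"{gen_id}:{prompt}"


class RecoveringGenerator:
    def __init__(self, init_remote, run_remote):
        self._init = init_remote
        self._run = run_remote
        self._gen_id = None

    def complete(self, prompt):
        if self._gen_id is None:
            self._gen_id = self._init()
        try:
            return self._run(self._gen_id, prompt)
        except UnknownGeneratorId:
            self._gen_id = None   # forget the stale id; the next call re-initializes
            raise


server = FakeServer()
gen = RecoveringGenerator(server.init, server.run)
first = gen.complete("hi")      # initializes gen-1
server.loaded.clear()           # simulate a restart or model eviction
try:
    gen.complete("again")
    failed = False
except UnknownGeneratorId:
    failed = True
second = gen.complete("again")  # re-initializes as gen-2
```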

Buttress overview

What Buttress is, when to use it, and how the system fits together.

Workspace binding

How servers and devices end up on the same workspace.