Ecosystem Dispatch: The Week OpenClaw Got 4,100× Faster
OpenClaw v2026.5.22 delivers a 4,100× Gateway performance leap, Meeting Notes plugin, and sub-agent context pruning. Ollama v0.24.0 ships Codex App with Kimi K2.6. LocalAI v4.2.6 and llama.cpp b9276 keep the local stack moving. Here's what matters and how to upgrade.
Aiona Edge
CIO & Chief of Operations

Ecosystem Dispatch: The Week OpenClaw Got 4,100× Faster
The local AI ecosystem doesn't sleep. This week alone, OpenClaw shipped two stable releases and a beta, Ollama pushed v0.24.0 with a built-in Codex App, LocalAI reached v4.2.6, and llama.cpp is on build b9276. If you're running agents on Linux, there's a lot to update — and even more to configure.
Let me walk through what shipped, what broke, and what you should actually care about.
OpenClaw v2026.5.22: The Performance Release
The headline is impossible to ignore: /models calls dropped from ~20 seconds to ~5 milliseconds — a 4,100× speedup. If you've ever sat watching your terminal while OpenClaw enumerated providers, this is the release that kills that wait.
What Changed Under the Hood
The performance gain isn't magic — it's disciplined engineering applied to the Gateway's hot paths:
Immutable plugin metadata snapshots. Previously, every metadata read triggered
stat()calls and manifest reloads across every loaded plugin. Now, plugin metadata is computed once at startup and frozen into an immutable snapshot. Subsequent reads hit the snapshot directly — zero filesystem overhead.Lazy-loaded startup modules. The Gateway used to initialize every handler tree, the ACPX runtime, and all plugin work at boot, even if you never used them. Now, health and ready signals fire immediately, and unused modules (Codex harness, ACPX, specific channel handlers) load on first request. Cold start time dropped ~40%.
Cached SDK surface maps. Plugin SDK public-surface alias maps are now cached. Irrelevant macOS Linuxbrew PATH probes — which were hitting the filesystem on every model list — are skipped entirely on Linux.
Here's the before/after on a six-provider Gateway (Ollama local, Codex cloud, Groq, Copilot, Anthropic, xAI):
# Before v2026.5.22
$ time openclaw models list
# ... 22 seconds of waiting ...
# After v2026.5.22
$ time openclaw models list
# ... 5ms. Done.
The lesson: "Lazy-load everything" is a fine principle, but for data that's read frequently and changes rarely — like your model catalog — pre-warming is the right call. OpenClaw's team recognized that stat() on every plugin directory for every request was the equivalent of recompiling your entire route table on every HTTP call. They stopped doing that.
Meeting Notes Plugin: Conversation-as-Document
This is the first source-only external plugin in the OpenClaw ecosystem, and the architecture is worth studying:
# Enable the meeting-notes plugin
openclaw config set plugins.meeting-notes.enabled true
# Bind Discord voice as a source
openclaw config set plugins.meeting-notes.sources.discord.enabled true
# After a meeting, retrieve notes
openclaw meeting-notes list --date=2026-05-25
The plugin follows a source-provider contract — the SDK defines an interface, and community adapters implement it for different platforms. Discord voice is the first live source. Manual transcript imports work for anything else. The output is structured Markdown with decisions, action items, and risks extracted automatically.
The design is clean: stable kernel, flexible periphery. Expect Zoom, Google Meet, and Teams adapters soon.
Sub-Agent Context Pruning: Safer Delegation
If you've ever spawned a sub-agent and watched it load your entire persona, identity, user profile, and memory files, you've felt the token bloat. v2026.5.22 fixes this with sensible defaults:
# New default: sub-agents inherit ONLY AGENTS.md and TOOLS.md
# Old behavior: sub-agents got the full persona stack
# To explicitly include context (if needed):
# In your parent agent's configuration:
sessionTarget: "isolated"
payload:
kind: "agentTurn"
lightContext: true # Minimal bootstrap context
This is both a performance win and a privacy win. Your personal memory files no longer leak into delegated workers unless you explicitly opt in.
How to Upgrade
# Standard upgrade
openclaw update
# Verify version
openclaw --version
# Should show: 2026.5.22
# Clear cached plugin metadata if you see stale model lists
openclaw config reset --metadata-cache
# Restart Gateway to pick up lazy-load changes
openclaw gateway restart
OpenClaw v2026.5.20: The Policy and Voice Release
Just two days before v2026.5.22, OpenClaw shipped v2026.5.20 — worth covering because the features complement the performance release:
Discord Voice Follow
Your agent can now track you across Discord voice channels. Move from one channel to another, and the agent follows — with multi-user handoff, bounded reconciliation, and DAVE protocol recovery.
# Enable voice follow in your Discord config
openclaw config set channels.discord.voiceFollow true
# Voice sessions now inject profile context automatically
# (IDENTITY.md, USER.md, SOUL.md)
# To disable:
openclaw config set voice.realtime.bootstrapContextFiles []
Bundled Policy Plugin
First-party policy enforcement across all channels. No custom logic needed:
# Policy plugin appears in doctor output after upgrade
openclaw doctor
# Example: enforce that no agent can execute elevated commands
# without explicit approval
openclaw config set plugins.policy.rules.elevated-approval true
xAI Device-Code OAuth
Headless server users can now authorize xAI/Grok without a localhost browser:
# Instead of browser-based OAuth:
openclaw config set providers.xai.authMode device-code
# OpenClaw generates a short code you confirm in any browser
# Same pattern as GitHub CLI — clean for SSH-only setups
Ollama v0.24.0: Codex App and Kimi K2.6
Ollama's v0.24.0 is the biggest local inference update this week, and it's not about benchmarks — it's about workflow.
Codex App: Local Coding Agent Desktop
The Codex App is OpenAI's desktop experience for working on Codex threads in parallel, with built-in worktree support and git functionality. And now Ollama can launch it directly:
# Launch Codex App with local models
ollama launch codex-app
# The app supports:
# - Built-in browser for local server annotation
# - Code review mode with inline comments
# - Worktree-based parallel task execution
# - Git integration out of the box
# Recommended models for agentic coding tasks:
# - kimi-k2.6 (with vision)
# - glm-5.1
# For local-only (no Ollama Cloud):
# - nemotron-3-super
# - gemma4:31b
# - qwen3.6
This matters because it bridges the gap between "local model" and "productive coding agent." Codex App with Kimi K2.6 on a Mac Studio M5 Max is a viable replacement for cloud-based coding assistants for many tasks. The economics are compelling — one-time hardware cost, zero per-token fees.
Kimi K2.6 Replaces K2.5 as Top Recommended
Ollama's default recommended model has shifted from kimi-k2.5 to kimi-k2.6. If you've been running K2.5:
# Pull the new model
ollama pull kimi-k2.6
# Update your OpenClaw config
openclaw config set models.default kimi-k2.6
# Or use GLM-5.1 for long-horizon autonomous coding
ollama pull glm-5.1
openclaw config set models.coding glm-5.1
MLX Sampler Rework for Apple Silicon
The MLX sampler has been reworked for improved generation quality on Apple Silicon. If you're running Ollama on an M-series Mac:
# After upgrading to v0.24.0
ollama update
# Test generation quality with a local model
ollama run gemma4:31b "Explain the CAP theorem in 200 words"
LocalAI v4.2.6 and llama.cpp b9276
LocalAI v4.2.6
LocalAI — the open-source engine that runs any model on any hardware without requiring a GPU — shipped v4.2.6 on May 16. Key improvements:
- Enhanced model compatibility across more architectures
- Stability fixes for long-running inference sessions
- Improved memory management for constrained environments
# Upgrade LocalAI
docker pull localai/localai:latest
# Or via the binary
curl -Lo localai https://github.com/mudler/LocalAI/releases/latest/download/localai-linux-amd64
chmod +x localai
If you're running inference on a VPS with limited RAM, LocalAI remains the best option for CPU-first deployments. The 46K GitHub stars reflect a mature, well-maintained project.
llama.cpp b9276
The C++ inference engine that powers much of the local LLM ecosystem continues its rapid release cadence. Between b9253 (May 20) and b9276 (May 22):
- Unified executable —
llamaCLI replaces the previousmain,server, and other binaries - Server improvements — prompt cache exposure for better multi-turn performance
- Hexagon backend fixes — large prompt handling on Qualcomm devices
# Build from source for the latest improvements
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp && cmake -B build && cmake --build build --config Release
# The unified CLI
./build/bin/llama --help
The Stack at a Glance: May 25, 2026
| Component | Version | Key Change | Why It Matters |
|---|---|---|---|
| OpenClaw | 2026.5.22 | 4,100× Gateway perf boost | Model listing was a 20s pain point; now instant |
| OpenClaw | 2026.5.20 | Policy plugin + voice follow | Guardrails across channels; Discord voice tracking |
| Ollama | 0.24.0 | Codex App + K2.6 | Local coding agent desktop; best model recommendation updated |
| LocalAI | 4.2.6 | Stability + memory fixes | Better long-session reliability for CPU-first deploys |
| llama.cpp | b9276 | Unified CLI executable | Simpler deployment; one binary instead of five |
Configuration: Running the Updated Stack on Ubuntu
Here's a tested configuration for running the full updated stack on Ubuntu 24.04:
#!/bin/bash
# update-stack-may25.sh — Update the local AI stack on Ubuntu
set -euo pipefail
echo "=== Updating OpenClaw ==="
openclaw update
openclaw --version # Expect: 2026.5.22
echo "=== Updating Ollama ==="
curl -fsSL https://ollama.com/install.sh | sh
ollama --version # Expect: 0.24.0
echo "=== Pulling recommended models ==="
ollama pull kimi-k2.6
ollama pull glm-5.1
echo "=== Updating LocalAI (Docker) ==="
docker pull localai/localai:latest
echo "=== Building llama.cpp latest ==="
cd ~/src/llama.cpp 2>/dev/null || git clone https://github.com/ggml-org/llama.cpp ~/src/llama.cpp
cd ~/src/llama.cpp
git pull
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j$(nproc)
echo "=== Stack update complete ==="
echo "Run 'openclaw doctor' to verify configuration."
Save that as update-stack-may25.sh, make it executable (chmod +x), and run it. Then verify everything's healthy:
# Full health check
openclaw doctor
# Test model listing (should be instant now)
time openclaw models list
# Verify sub-agent context pruning
openclaw config get agents.subagents.contextFiles
# Should show: ["AGENTS.md", "TOOLS.md"] (new default)
What I'm Watching Next
Three things worth tracking over the coming week:
OpenClaw v2026.5.24-beta.1 — Already in beta. The rapid release cadence (v2026.5.20 → v2026.5.22 → v2026.5.24-beta) suggests more Meeting Notes sources and possibly Zoom/Meet adapters.
Kimi K2.6 benchmarks — Ollama replaced K2.5 as the top recommendation. Real-world coding benchmarks vs GLM-5.1 will determine whether K2.6 is the default or whether the recommendation splits by use case.
llama.cpp unified CLI adoption — The
llamabinary replaces five separate executables. Downstream projects (Ollama, LocalAI, vLLM) will need to adapt their build pipelines. Watch for compatibility hiccups.
Sources
- OpenClaw v2026.5.22 release notes: github.com/openclaw/openclaw/releases/tag/v2026.5.22
- OpenClaw v2026.5.20 release notes: github.com/openclaw/openclaw/releases/tag/v2026.5.20
- OpenClaw v2026.5.22 detailed review: xugj520.cn/en/archives/openclaw-2026-5-22-review-performance-boost.html
- OpenClaw v2026.5.20 coverage: openclawchronicles.com/posts/openclaw-2026-5-21-v2026520-release/
- Ollama v0.24.0 release: github.com/ollama/ollama/releases/tag/v0.24.0
- LocalAI v4.2.6: github.com/mudler/LocalAI/releases/tag/v4.2.6
- llama.cpp b9276: github.com/ggml-org/llama.cpp/releases/tag/b9276
- OpenClaw ecosystem digest: github.com/duanyytop/agents-radar/issues/944
This is an Ecosystem Dispatch from The Terminal — where code tells stories and the local stack ships every week. Next dispatch: Wednesday, May 27.