# Deep Closeout: Discord Reliability Overhaul + CLI Mirror

**Date:** 2026-04-06 to 2026-04-07
**Duration:** ~3 hours
**Repos touched:** centralDiscord (12 commits), buying-assistant (2 commits), agentGuidance (1 commit), privateContext (1 commit)

## Context & Motivation

A buying guide request via Discord failed to read a Gemini share link (JS-rendered SPA). Investigation revealed a cascade of quality issues explaining why Discord sessions feel “dumber” than CLI sessions. The user asked for a root cause analysis, fixes, and ultimately a CLI-like interactive experience through Discord.

## What Was Built

### Phase 1: Root Cause Analysis & URL Pre-Fetching

**Problem:** WebFetch silently fails on JS-rendered SPAs (Gemini, React apps). Agents proceed without the context, producing lower-quality results.

**Fixes across 3 repos:**
- `contextFetcher.js`: Auto-detects external URLs in request text, pre-fetches via page-reader (headless Chromium), injects content into prompts
- `contextFetcher.js`: `stripBotOutput()` for retry detection in #buying-guides
- `executor.js` + `jobRequest.js`: Page-reader fallback instruction in all Discord directives
- `routeClassifier.js`: URL-dominated messages (>40% URL chars) classify as TASK, skip debate
- `buying-assistant/CLAUDE.md`: Explicit page-reader fallback instructions
- `agentGuidance/agent.md`: Page-reader fallback as a Core Principle
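The URL detection and the 40% rule can be sketched as below. This is a minimal illustration, not the code in `contextFetcher.js` or `routeClassifier.js`: the helper names, the URL regex, the skip-list hostnames, and the choice to measure the ratio over non-whitespace characters are all assumptions.

```javascript
// Hypothetical helpers: only the 40% threshold comes from the write-up.
const URL_PATTERN = /https?:\/\/[^\s<>()]+/g;

// Assumed hostnames for the domains the pre-fetcher skips.
const SKIP_DOMAINS = ["discord.com", "youtube.com", "github.com"];

// Return every http(s) URL found in the text (empty array if none).
function extractUrls(text) {
  return text.match(URL_PATTERN) ?? [];
}

// True when the URL parses and its host is not on the skip list.
function shouldPrefetch(url) {
  let host;
  try {
    host = new URL(url).hostname;
  } catch {
    return false; // not a parseable URL
  }
  return !SKIP_DOMAINS.some((d) => host === d || host.endsWith("." + d));
}

// Fraction of non-whitespace characters that belong to URLs.
function urlCharRatio(text) {
  const total = text.replace(/\s/g, "").length;
  if (total === 0) return 0;
  const urlChars = extractUrls(text).reduce((n, u) => n + u.length, 0);
  return urlChars / total;
}

// A message counts as URL-dominated when URLs exceed the threshold share.
function isUrlDominated(text, threshold = 0.4) {
  return urlCharRatio(text) > threshold;
}
```

A message such as `read https://gemini.google.com/share/abc123` is mostly URL and would be routed as a TASK, while a paragraph-length question with one short link would not.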

### Phase 2: Pre-Job Repo Sync

**Problem:** Pushing CLAUDE.md or agent.md changes from local had no effect on Discord agents because VM repos were never pulled.

**Fix:** `preJobSync()` in executor.js runs `git pull --ff-only` on agentGuidance and the job’s working directory before every Claude spawn. 8s timeout, fire-and-forget.

Also synced page-reader on VM (was 5 commits behind, missing `--stealth`).
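A minimal sketch of the pre-job sync, assuming the 8s timeout applies per repository and that a failed pull should be recorded but never block the job. The helper `pullFastForward`, the result shape, and the injectable `run` parameter are illustrative, not taken from executor.js.

```javascript
const { execFile } = require("node:child_process");

// Run `git pull --ff-only` in one directory. Resolves to a result object and
// never rejects, so a failed or slow pull cannot break the job that follows.
function pullFastForward(dir, timeoutMs = 8000, run = execFile) {
  return new Promise((resolve) => {
    run("git", ["pull", "--ff-only"], { cwd: dir, timeout: timeoutMs }, (err) => {
      resolve({ dir, ok: !err, error: err ? err.message : null });
    });
  });
}

// Sync several repos in parallel (for example the guidance repo and the
// job's working directory). `run` is injectable so the function can be tested
// without touching git.
function preJobSync(dirs, timeoutMs = 8000, run = execFile) {
  return Promise.all(dirs.map((dir) => pullFastForward(dir, timeoutMs, run)));
}
```

Because `--ff-only` refuses to create merge commits, a VM checkout that has diverged fails the pull and is left untouched instead of being merged silently.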

### Phase 3: Interactive Sessions ([WAITING_FOR_INPUT])

**Problem:** Discord sessions are single-shot with `clarifyAmbiguous: 'best-effort'`. Agents can never ask clarifying questions.

**Solution:** Pause-and-resume protocol:
1. Agent outputs `[WAITING_FOR_INPUT]` + question, then stops
2. Bot detects marker, posts question to Discord, parks session
3. User replies with answer
4. Bot resumes with `--resume` + answer
5. 30-minute timeout auto-resumes with “proceed with best judgment”

**Files:** sessionPool.js (waiting state management), claudeReply.js (detection + handling in all 3 execution paths), index.js (reply routing for waiting answers), jobRequest.js + executor.js (directive update)
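The pause-and-resume steps can be sketched as follows. This is an illustration under assumptions, not the code in sessionPool.js or claudeReply.js: the class name, the keying by thread ID, and the callback signature are hypothetical. It uses a plain string search instead of a regex, which sidesteps multiline matching for questions that span several lines.

```javascript
const WAITING_MARKER = "[WAITING_FOR_INPUT]";

// Returns the question that follows the marker, or null when the agent
// finished normally. The question may span several lines.
function parseWaitingMarker(output) {
  const idx = output.lastIndexOf(WAITING_MARKER);
  if (idx === -1) return null;
  return output.slice(idx + WAITING_MARKER.length).trim();
}

// In-memory registry of parked sessions, keyed by a thread or channel ID.
class WaitingSessions {
  constructor(timeoutMs = 30 * 60 * 1000) {
    this.timeoutMs = timeoutMs;
    this.parked = new Map();
  }

  // Park a session. onResume(sessionId, answer) fires at most once, either
  // with the user's reply or with the fallback text when the timeout expires.
  park(key, sessionId, onResume) {
    const previous = this.parked.get(key);
    if (previous) clearTimeout(previous.timer); // replace any stale entry
    const timer = setTimeout(() => {
      this.parked.delete(key);
      onResume(sessionId, "proceed with best judgment");
    }, this.timeoutMs);
    this.parked.set(key, { sessionId, onResume, timer });
  }

  // Route a user reply to the parked session. Returns false if nothing waits.
  answer(key, text) {
    const entry = this.parked.get(key);
    if (!entry) return false;
    clearTimeout(entry.timer);
    this.parked.delete(key);
    entry.onResume(entry.sessionId, text);
    return true;
  }
}
```

In this sketch the `onResume` callback is where the bot would spawn Claude again with `--resume` and the answer as the next prompt.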

### Phase 4: CLI Mirror Channel

**Problem:** Even with [WAITING_FOR_INPUT], Discord sessions lack the real-time visibility of the CLI.

**Solution:** New `#cli-mirror` channel with streaming output:
- Thread per conversation
- Live message edited every 1.5s with text + tool call indicators (`> Read package.json`)
- Messages freeze at 1800 chars, continue in new messages
- Session continuity via thread replies (`--resume`)
- `[WAITING_FOR_INPUT]` support for mid-task questions

**Files:** New `cliMirror.js` (250 lines), executor.js (configurable `pollIntervalMs`/`progressIntervalMs`, enriched `onProgress` with `textDelta`/`toolEvents`)
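The freeze-and-continue behavior can be sketched as a pure function that splits accumulated output into frozen messages plus one live tail. The function names, the newline-preferring cut, and the `{ tool, target }` event shape are assumptions for illustration; only the 1800-character limit and the `> Read package.json` indicator style come from the description above.

```javascript
const FREEZE_LIMIT = 1800; // from the write-up; Discord's hard cap is 2000 characters

// Split accumulated output into frozen messages plus one live tail.
// Prefers to break at a newline so tool-call lines are not cut in half.
function splitForDiscord(text, limit = FREEZE_LIMIT) {
  const frozen = [];
  let rest = text;
  while (rest.length > limit) {
    let cut = rest.lastIndexOf("\n", limit);
    if (cut <= 0) cut = limit; // no usable newline: hard cut at the limit
    frozen.push(rest.slice(0, cut));
    rest = rest.slice(cut).replace(/^\n/, ""); // drop the newline we cut at
  }
  return { frozen, live: rest };
}

// Render a tool event as the quoted indicator line shown in the live message.
function formatToolEvent(event) {
  return `> ${event.tool} ${event.target}`.trimEnd();
}
```

On each poll tick, the bot would edit the live message with `live` and post any newly frozen chunks as separate messages, which leaves 200 characters of headroom under Discord's limit.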

### Phase 5: Bug Fixes Found During Testing

1. **`fetchAttachments` return type mismatch** (from auto-merged PR #134): Early return `[]` vs normal return `{ results, mediaTempDir }`. Every text-only message in #requests was crashing. Fixed the early return.

2. **Raw NDJSON posted to Discord**: `local-worker.sh` had `2>&1` merging stderr into the NDJSON stream, corrupting JSON parsing. `extractFinalText()` returned empty, fallback posted raw JSON. Fixed: `2>/dev/null` + removed raw fallback.

3. **CLI mirror sessions lost on bot restart**: In-memory session pool wiped on restart. Fixed: session ID embedded in footer messages + thread message scanning fallback.

4. **`runClaudeRemote` missing poll interval params**: CLI mirror sessions routed to local worker used default 5s/15s intervals instead of 1.5s. Fixed: forwarded params through `runClaudeRemote`.

5. **Error serialization**: Pino’s `logger.error('msg:', err)` didn’t serialize error objects. Fixed with `logger.error({ err, stack }, msg)`.
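The first bug in the list above is easy to reproduce in isolation: destructuring `[]` as `{ results, mediaTempDir }` yields `undefined` for both fields, so any later use of `results` throws. The sketch below shows a consistent return shape; the function body, the `download` parameter, and the temp path are hypothetical stand-ins, and only the `{ results, mediaTempDir }` shape comes from the description.

```javascript
// Hypothetical stand-in for the real function: only the return shape matters.
async function fetchAttachments(attachments, download = async (a) => a.url) {
  // The early return uses the same shape as the normal path, so callers that
  // destructure `{ results, mediaTempDir }` also work for text-only messages.
  if (!attachments || attachments.length === 0) {
    return { results: [], mediaTempDir: null };
  }
  const results = await Promise.all(attachments.map(download));
  return { results, mediaTempDir: "/tmp/media-example" }; // placeholder path
}
```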

## Decisions Made

### Decision: Server-side URL pre-fetching + agent-side fallback (belt and suspenders)
- **Rationale:** Pre-fetching handles URLs in the initial request. Directive + CLAUDE.md handle URLs discovered during execution.
- **Trade-off:** 3-20s latency per pre-fetched URL. Acceptable for the quality improvement.

### Decision: [WAITING_FOR_INPUT] as text marker, not tool interception
- **Alternatives:** Stdin pipe injection, MCP server bridge, AskUserQuestion tool interception
- **Rationale:** Claude in `-p` mode finishes its turn and exits normally. No process killing, no stream interception, no architecture change. Just detect the marker in output, park the session, resume on reply.

### Decision: Enhanced polling for CLI mirror (not pipe-based streaming)
- **Alternatives:** Non-detached process with pipe-based streaming (true real-time)
- **Rationale:** Keeps the detached process model (survives bot restarts). 1.5s poll interval is imperceptible in Discord. Upgrade path to pipes exists for v2 if needed.

### Decision: Fix `local-worker.sh` stderr with `2>/dev/null` not `2>errfile`
- **Rationale:** Stderr from Claude sessions is mostly hook noise and debug output. The VM bot captures stderr separately via its own file descriptors. Discarding it on the local side is safe.
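As a defensive complement to discarding stderr, a line-tolerant parser can skip anything that is not valid JSON instead of failing the whole stream. This is a sketch, not the project's `extractFinalText()`: it assumes the final text arrives in an event with `type: "result"` and a string `result` field. It only handles noise that occupies whole lines; stderr interleaved in the middle of a JSON line would still be lost, which is why keeping stderr out of the stream remains the primary fix.

```javascript
// Assumed event shape: the last line with type "result" carries the final text.
function extractFinalText(ndjson) {
  let finalText = "";
  for (const line of ndjson.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    let event;
    try {
      event = JSON.parse(trimmed);
    } catch {
      continue; // stray stderr or a partial line: skip it, keep parsing
    }
    if (event && event.type === "result" && typeof event.result === "string") {
      finalText = event.result;
    }
  }
  return finalText; // empty string when no result event was found
}
```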

## Architecture

## Learnings Captured

## Commits (This Session)

### centralDiscord
| Commit | Description |
|---|---|
| `c86983c` | URL pre-fetch, retry detection, page-reader fallback, route classifier |
| `144b251` | context.md update |
| `d62f04c` | Pre-job repo sync |
| `de44ffe` | Pre-fetch timing logs |
| `b7e695f` | Interactive sessions: [WAITING_FOR_INPUT] |
| `0706525` | Fix WAITING_FOR_INPUT multiline regex |
| `b76c48b` | Fix fetchAttachments early return type |
| `130f95a` | Improve error serialization |
| `86625d7` | CLI mirror: streaming interactive sessions |
| `a1863a4` | Forward poll intervals through runClaudeRemote |
| `7d8ae06` | Fix raw NDJSON posted to Discord |
| `37ce053` | Fix CLI mirror session loss on restart |

### buying-assistant
| Commit | Description |
|---|---|
| `31f1f92` | Page-reader fallback in CLAUDE.md |
| `178493f` | Hard wax oil buying guide |

### agentGuidance
| Commit | Description |
|---|---|
| `902cf41` | Page-reader fallback as core principle |

## Open Items

1. **Test [WAITING_FOR_INPUT] end-to-end in #requests** — the fetchAttachments bug was blocking all tests. Now fixed, needs a clean test.
2. **CLI mirror streaming latency** — currently 1.5s poll. Could go to pipe-based for sub-second in v2.
3. **CLI mirror thread cleanup** — threads auto-archive after 24h but old sessions pile up. Consider a janitor sweep for CLI mirror threads.
4. **SKIP_DOMAINS tuning** — URL pre-fetch skips Discord/YouTube/GitHub. May need adjustment based on real usage.
5. **Buying guide: hard wax oil** — complete guide written from the originally-failed Gemini link. Best pick: General Finishes HWO 8oz ($26).

## Key Files
