I started using Claude Code CLI extensively on February 27, 2026. Four weeks and roughly 400 commits later, I have an autonomous agent system that surveys my repos every 30 minutes, picks improvements, creates PRs, and posts the results to Discord for me to approve. Here is how that happened, week by week.
Week 1: Just a Coding Assistant (Feb 24–Mar 1)
Eight commits. I used Claude Code the way most people start: fixing specific bugs. A double basePath issue in one project, a pickling error in another. Surgical, one-off fixes. Nothing systematic.
But two days in, I created a repo called agentGuidance. The idea was simple: if I am going to give Claude instructions every session, those instructions should be version-controlled and consistent. That repo would become the backbone of everything that followed.
Week 2: The Infrastructure Explosion (Mar 2–8)
101 commits across 20+ repos. This was the week I stopped using Claude Code as a tool and started building a system around it.
The first move was propagating a SessionStart hook to every repo I own. Each Claude Code session now fetches centralized rules from agentGuidance on startup. Same guardrails everywhere, maintained in one place.
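The hook itself can be small. A minimal sketch, assuming the rules live in a raw file on the agentGuidance repo and that this script is wired into each repo's .claude/settings.json as a SessionStart command hook (the URL, owner, and paths are hypothetical):

```shell
#!/usr/bin/env sh
# SessionStart hook sketch: fetch the shared rules, fall back to the last
# cached copy when offline, and print them so they enter the session context.
load_rules() {
  url="$1"; cache="$2"
  mkdir -p "$(dirname "$cache")"
  # curl -f fails on errors, so the cache is only replaced on success.
  if curl -sf "$url" -o "$cache.tmp"; then
    mv "$cache.tmp" "$cache"
  fi
  cat "$cache" 2>/dev/null
}

# Hypothetical invocation from the hook:
#   load_rules "https://raw.githubusercontent.com/OWNER/agentGuidance/main/rules.md" \
#              "$HOME/.cache/agent-guidance/rules.md"
```

The cache fallback matters: without it, a GitHub outage would strip the guardrails from every session at once.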
That same week I built centralDiscord from scratch: a Discord bot that dispatches Claude Code jobs, manages a queue, streams output to channels, and coordinates multiple agents. It crashed on memory issues by day three, got rebuilt with proper fault tolerance, and by the end of the week had streaming progress, a kill command, and metrics tracking.
I also launched three new projects that week: a prompt library, a job scraper, and a cross-LLM context tool. The pattern was becoming clear. I was not just coding with Claude; I was building tools for working with Claude.
Key learning: Discord beat WordPress as the reporting channel. I tried auto-posting sessions as narrative blog posts, but the feedback loop was too slow. Discord gives real-time visibility into what every agent is doing.
Week 3: Stabilization (Mar 9–15)
47 commits. Half the velocity, but the work shifted from building to hardening.
The centralDiscord bot got reliability fixes, metrics backup, and failed request handling. More importantly, I started encoding post-mortems as code. After accumulating stale PRs that caused cascading merge conflicts, I wrote branch hygiene rules directly into agentGuidance. Every time something broke, the fix went into the instruction set so it would not break the same way again.
This is the week I learned that autonomous code generation creates maintenance debt fast. You need guardrails before you need features. Every failure should become a rule.
Week 4: The Autonomous Era (Mar 17–23)
217 commits. The biggest week by far, and a fundamental shift from “developer using Claude Code” to “developer orchestrating multiple Claude Code instances.”
I completed the Local Worker Bridge: an SSH reverse tunnel from my GCP VM through Windows OpenSSH into WSL, letting the Discord bot dispatch Claude jobs to my local PC. Seven distinct bugs in the tunnel chain were found and fixed in one session: Git Bash vs WSL conflicts, Windows authorized_keys locations, and WSL localhost pointing to the wrong network interface.
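Stripped to its essentials, the bridge is two ssh invocations; a sketch with hypothetical hosts, users, and ports:

```shell
# 1) On the Windows PC: hold open a reverse tunnel so the GCP VM can reach
#    the PC's Windows OpenSSH (port 22) via the VM's own localhost:2222.
ssh -N -R 2222:localhost:22 agent@gcp-vm.example.com

# 2) On the VM: the Discord bot dispatches a job back through the tunnel,
#    hopping from Windows OpenSSH into WSL to run Claude Code in print mode.
ssh -p 2222 winuser@localhost "wsl -e bash -lc 'claude -p \"run the queued job\"'"
```

Each hop in that chain is a place for quoting, key-location, and interface bugs to hide, which is where the seven fixes went.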
Then came autonomousDev: a standalone agent on a 30-minute cron that surveys all repos, picks the highest-impact improvement, executes it, creates a PR, and posts results to Discord. In its first day it produced work across six repos: security fixes, npm audit patches, 59 new tests, physics bug fixes, regex corrections, and vulnerability patches.
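The scheduler behind a setup like autonomousDev can be plain cron. A hypothetical crontab entry (paths are illustrative), with flock making sure a slow cycle never overlaps the next one:

```shell
# m    h  dom mon dow   command
*/30   *  *   *   *     flock -n /tmp/autonomous-dev.lock /home/agent/autonomousDev/run-cycle.sh >> /home/agent/autonomousDev/cron.log 2>&1
```

Appending to one log file gives the Discord reporter a single place to read from.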
I built claude-bakeoff for A/B testing different instruction environments against each other. I added a #tasks channel with parameterized templates (!task pr-review, !task deploy). I added a #prompts channel that logs every prompt from every source. The daily roundup analyzes prompt patterns and suggests new task templates. The system watches its own usage and improves itself.
By the end of the week, the autonomous dev agent had completed 68 runs. It was also causing problems: two agents finishing the same work simultaneously, PRs merging without my approval. So I added collision detection and an approval gate. The system is learning its own failure modes, just like I learned mine in weeks 2 and 3.
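Collision detection does not need much machinery when agents share a filesystem. A minimal sketch of an atomic claim marker (names and paths are illustrative):

```shell
#!/usr/bin/env sh
# claim_task: hypothetical collision guard. mkdir is atomic, so only one
# agent can create the claim directory for a given task id; every other
# agent sees the existing claim and skips the task.
claim_task() {
  task_id="$1"
  mkdir -p /tmp/agent-claims
  if mkdir "/tmp/agent-claims/$task_id" 2>/dev/null; then
    echo "claimed $task_id"
  else
    echo "skip $task_id: already claimed"
    return 1
  fi
}
```

The approval gate is the same idea one level up: instead of a directory, the merge waits on a human reaction in Discord.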
What I Have Learned
Deduplication is essential. Autonomous agents will redo work unless you give them a shared memory of what has been done. I maintain a completed-work.md that every session checks before starting.
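The check itself can be a single grep. A sketch, assuming one finished task id per line in completed-work.md (the helper names are hypothetical):

```shell
#!/usr/bin/env sh
# Hypothetical dedup helpers around a shared completed-work.md.
WORK_LOG="${WORK_LOG:-completed-work.md}"

already_done() {
  # -x whole-line match, -F literal string, -q exit status only
  grep -qxF "$1" "$WORK_LOG" 2>/dev/null
}

mark_done() {
  echo "$1" >> "$WORK_LOG"
}

# A session preamble then reduces to something like:
#   already_done "npm-audit-centralDiscord" && exit 0
```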
Every failure becomes a rule. Restart storms, divergent branches, WSL localhost misrouting: all encoded in the instruction set as prevention rules. The system gets more reliable over time because its rules grow from real incidents.
Reporting solves coordination. I have six Discord channels dedicated to different aspects of agent activity. The Stop hook ensures every session reports its work. Without visibility, autonomous agents are a black box.
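The reporting step can be a single webhook call: Discord webhooks accept a JSON body with a content field. A sketch with a hypothetical helper and environment variable:

```shell
#!/usr/bin/env sh
# Hypothetical Stop hook body: post a one-line session summary to a Discord
# channel webhook. No JSON escaping here, so keep summaries to plain text.
build_payload() {
  printf '{"content": "%s"}' "$1"
}

# DISCORD_WEBHOOK_URL is an assumption: a webhook on the reporting channel.
post_summary() {
  curl -sf -H "Content-Type: application/json" \
    -d "$(build_payload "$1")" \
    "$DISCORD_WEBHOOK_URL"
}
```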
Slim instructions beat dense ones. agent.md went from 511 lines to 178 in the final week. Dense guidance causes agents to miss rules. Modular files with clear scoping work better.
PR-based deploys over SSH commands. After race conditions with raw git commands, everything flows through GitHub PRs. Safer, more auditable, and the bot can manage merge and deploy as separate approval steps.
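With the GitHub CLI, the agent's side of that flow is short; a sketch with an illustrative branch name, keeping merge as a separate, human-approved step:

```shell
# The agent never pushes to main; every change rides a branch and a PR.
git checkout -b agent/npm-audit-fix
git commit -am "Apply npm audit fixes"
git push -u origin agent/npm-audit-fix
gh pr create --base main --title "npm audit fixes" \
  --body "Automated fix; awaiting approval in Discord"

# Merge (and then deploy) happen later, gated on the Discord approval:
#   gh pr merge <number> --squash
```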
The Current Setup
In under four weeks, I went from 8 commits fixing a basePath bug to 393 commits across 34 repos with autonomous agents.
The stack: centralized guidance fetched on every session start, Stop hooks posting to Discord, autonomous agents on 30-minute cron, a Discord bot routing jobs between VM and local PC, A/B testing for instruction quality, and a job pipeline that scrapes, filters, and generates application materials daily.
It is not done. The approval gates need tightening, the Cowork reporting pipeline does not work from the Chrome extension yet, not all of the Discord reporting works as intended, and the autonomous agent still occasionally picks up work that is not worth doing. But the foundation is solid, and each session makes the system a little smarter than the last one.