
My Current AI-Assisted Development Setup (March 2026)

A walkthrough of the tools and workflows I use day-to-day for AI-assisted development, including Beads for issue tracking, Codex and Claude Code for agentic work, and the practices that keep it all coherent.

Where I've Landed

As I invest more time in working seriously with AI coding agents, my setup has stabilized into something that actually holds together. It took a while to get here: lots of abandoned experiments, tools that sounded great but created more friction than they solved, and a few hard lessons about what agents actually need to stay oriented.

This is what I'm running today (very subject to change).


The Memory Problem: Agents Forget Everything

AI agents are powerful and genuinely useful for complex, multi-step development tasks. But they have a fundamental limitation: every session starts fresh. There's no persistent memory of what was decided, what was completed yesterday, or what's blocked. Steve Yegge, who built Beads, calls this the "50 First Dates" problem. Every morning, the agent wakes up with no idea what happened.

This is the thing that was killing my productivity. I'd come back to a project and spend 20 minutes re-orienting an agent on where we left off. Or worse, the agent would confidently start re-doing work that was already done, because it had no way to know otherwise.


Beads: Issue Tracking That Lives in the Repo

The biggest shift in my workflow has been adopting Beads, a git-based issue tracker designed from the ground up for AI-assisted development.

The thing I value most about it: the issue tracker lives inside the repository. No external service, no API credentials, no integration to maintain. You git clone a project and the entire task history comes with it. You switch machines and nothing breaks. You hand off a repo to someone else and they have full context immediately.

This is a different approach from how most teams track work, using systems like GitHub Issues, Jira, or Linear. My historical choice was GitHub Issues: I loved that my version control and my issues shared a provider, the depth of connection that gave me, and how powerful and easy the GitHub CLI is. Unfortunately, when I started engaging with my code in cloud environments with more restricted default configurations, the GitHub CLI wasn't available and agents couldn't work with our issue tracker at all. Beads doesn't have that problem: the issues are in the repo directly, versioned alongside the code.

Portability matters to me because I work across multiple machines and multiple projects, and I want my tools to work the same way everywhere without setup cost.

A few other things that make Beads well-suited to AI-assisted work:

Dependency-aware task ordering. Beads tracks dependencies between issues. Running bd ready shows only the tasks that are currently unblocked, so instead of dumping the entire backlog on an agent, you hand it exactly what it can work on right now.

Hash-based IDs. Issues get IDs like bd-a1b2, not sequential integers. This prevents collisions when multiple agents or branches are creating tasks simultaneously.

Agent-friendly output. JSON output by default, structured for programmatic consumption. Agents don't have to parse prose to understand the issue queue.
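Put together, a short tracker interaction looks roughly like this. bd init and bd ready appear earlier in this post; the create command and its dependency flag are a sketch from my usage and may differ across Beads versions:

```
# One-time setup: create the tracker inside the repo
bd init

# List only unblocked issues; output is structured JSON by default
bd ready

# Hypothetical: file an issue that is blocked by an existing one
# (exact flag names may vary between Beads versions)
bd create "Wire auth into the API layer" --deps bd-a1b2
```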

How I use it

I initialize Beads at the start of any non-trivial project. I add a note to my agent config files, AGENTS.md and CLAUDE.md (the project-level instruction files for Codex and Claude Code respectively), telling the agent to use bd for task tracking.

From there, the workflow looks like:

  1. I create issues for the work I want done, sometimes via CLI, sometimes by asking the agent to break down a feature and create the issues for me
  2. When starting a session, I ask the agent to run bd ready and pick up from there
  3. The agent marks issues complete as it works, creating a running record of what's been done
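The loop above, as a rough command transcript. bd ready is real; the close syntax here is a sketch and may not match the actual CLI:

```
# Session start: the agent orients itself from the tracker, not from memory
bd ready

# ...work happens on the chosen issue...

# Session end: record completion so the next session starts with real context
# (hypothetical syntax)
bd close bd-a1b2
```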

The result is that every new session starts with real context—not from trying to remember, but from a structured, queryable record of the project's state.


Codex and Claude Code: Where the Work Gets Done

My primary agents are Codex from OpenAI and Claude Code from Anthropic. I've tried others, but so far nothing else works as well for me. I use both equally, partly because they each have their own strengths, and partly because it forces me to keep my workspace and processes agent-agnostic, which aligns with my portability principle.

The project scaffold

After setting up enough new projects, I got tired of recreating the same structure every time. So I extracted it into project-init, a template repository I clone at the start of anything non-trivial.

Here's what's in it and why each piece earns its place:

CLAUDE.md and AGENTS.md are both thin wrappers that point to docs/agent-instructions.md. Two entrypoints, one file to actually maintain. CLAUDE.md is what Claude Code reads on startup and AGENTS.md is what Codex reads. Keeping them both as stubs means I never have to remember which file I last updated.
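A stub along these lines is all either file needs to contain (the wording here is mine, not a required format):

```
<!-- CLAUDE.md and AGENTS.md carry the same two lines -->
Read docs/agent-instructions.md before starting any work.
All working principles, repo boundaries, and project-specific notes live there.
```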

docs/agent-instructions.md is the real instructions file: working principles, repo boundaries, issue tracking guidance, and validation commands. When I start a project, I fill in the Project-Specific Notes section at the bottom with the package manager, dev commands, and anything else the agent needs to not ask me constantly.

docs/architecture.md is a narrative walkthrough of how the system works — not a file catalog, but a connected story about how data flows through the app. I add this to any project with architecture that needs explaining using a pattern I call the Linear Walkthrough. The idea is that an agent reading it should come away with genuine understanding of the system, not just a list of directories. I use it with both Claude Code and Codex — it's one of the most reliably useful things I do at the start of a project.

skills/ holds repo-local agent skills. These are instructions for workflows that are specific to this codebase—things that don't make sense as global skills because they reference project structure, naming conventions, or tooling choices that vary by repo. Local skills take precedence over global ones when both apply.

.temp/ is scratch space. Agents use it for notes, plans, research artifacts, intermediate files and anything they need to produce during a session that isn't meant to live permanently in the repo. The directory is tracked in git so it exists on clone, but its contents are gitignored. This keeps the working tree clean without forcing agents to drop everything into memory or lose it between steps.
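The tracked-directory-with-ignored-contents arrangement is a standard git pattern, done with a placeholder file and two ignore rules:

```
# .gitignore — keep the directory, drop everything agents leave inside it
.temp/*
!.temp/.gitkeep
```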

Repository boundaries. I have found that agents will wander if you let them. Without clear instructions, an agent working on a project might modify files outside the repo, touch global config, or leave artifacts scattered wherever it happened to need them. None of that is catastrophic, but it adds up to a working environment that's hard to reason about and works against my core principle of portability.

.beads/ is where Beads keeps the issue tracker. It shows up here because I run bd init as part of any real project setup, and the tracker needs a home. The configuration and a JSONL task snapshot are version-controlled; the live Dolt database is local-only.

Human-in-the-loop as a hard requirement. I review everything before it ships. Agents move fast, and fast + unreviewed is a recipe for technical debt at scale. The goal is meaningful leverage, not replacing judgment.


Portability and Agent Interoperability

There's a thread running through every decision above: does this work if I switch agents tomorrow?

I use Claude Code and Codex regularly, and they have different strengths. Claude Code is better for long, context-heavy sessions with complex reasoning. Codex handles parallel tasks and isolated cloud execution well. I want to move between them without losing anything — no re-explaining the project, no orphaned state, no context trapped in one agent's memory system.

That constraint shapes the whole setup:

Instructions are agent-agnostic. docs/agent-instructions.md is the canonical source. CLAUDE.md and AGENTS.md are both thin stubs that point to it. Both agents get identical guidance from one file I maintain in one place.

Task tracking is in the repo. Beads issues live in .beads/. Any local agent with the bd CLI can read the backlog, claim work, and close issues. Cloud agents without bd use committed READY_TASKS snapshots as a fallback. There's no external service and no agent-specific integration to break.
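A snapshot can be as simple as committing the JSON that bd ready already emits. The READY_TASKS filename comes from my setup, and the redirect is a sketch, not a documented Beads feature:

```
# Refresh the fallback snapshot before handing work to a cloud agent
bd ready > READY_TASKS.json
git add READY_TASKS.json
git commit -m "chore: refresh ready-task snapshot"
```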

Memory is in the repo. Both Claude Code and Codex have native memory systems — Claude Code writes to ~/.claude/projects/<project>/memory/, Codex writes to <codex_home>/memories/. Both are machine-local and invisible to the other agent. I disable Claude Code's at the project level and instruct both to use bd remember instead. One memory system, in git, readable by any tool.

Skills are in the repo. Reusable workflow instructions live in skills/ alongside the code, not in ~/.claude/skills/ or a cloud-side agent config. Any agent can read them.

Scratch space is neutral ground. .temp/ is gitignored and available to any agent. Nothing important lives there, but it gives every agent a place to work without making a mess.

The underlying principle: if it matters, it lives in the repo. Git is the one thing every agent, every machine, and every environment shares. Anything stored outside the repo is effectively invisible to half my workflow.

This also future-proofs the setup to some degree. The agent ecosystem is moving fast. A workflow that only works with one tool is a workflow that will need to be rebuilt. Building for portability now means the next agent — whatever it is — can pick up where the last one left off.


The Supporting Tools

A few other things that round out the setup:

aieyes for screenshot-grounded UI work. When an agent is working on frontend changes, it can't see what it's building. aieyes is an MCP server that closes that loop—it gives Claude Code a screenshot tool that captures any URL and returns the image directly into context. Make a change, the agent screenshots the result, sees what happened, and iterates. No manual pasting required.

It's backed by shot-scraper, which drives a headless Chromium browser. There's also an open_browser tool for when I want to pull something up myself without leaving the terminal. Registered as a global MCP server, so it's available in every project automatically.

Git worktrees for parallel development. When I'm running multiple agents on different parts of a project simultaneously, worktrees let each agent have its own working directory without stomping on each other. Underutilized feature of git that becomes essential once you're doing serious agentic work.
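A minimal sketch of the worktree setup, using a throwaway repo (paths and branch names are illustrative):

```shell
set -e

# Throwaway repo standing in for a real project
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email agent@example.com
git config user.name "Agent"
echo "# demo" > README.md
git add README.md
git commit -qm "feat: initial commit"

# Give a second agent its own working directory on its own branch;
# both checkouts share one .git database but never stomp on each other
git worktree add "../$(basename "$repo")-agent2" -b agent2-task
git worktree list
```

Each worktree is a full checkout, so one agent can run tests or builds without invalidating the other's working state.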

Conventional commits + branch discipline. Sounds boring, but a clean git history is genuinely useful when an agent needs to orient itself. bd ready tells it what to do next; git log tells it what already happened.
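Conventional commits are a one-line discipline: type(scope): summary. A few illustrative messages (the scopes are made up):

```
feat(tracker): add ready-task snapshot refresh
fix(ui): handle missing screenshot directory
chore: tighten agent repo boundaries in instructions
```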


What I've Stopped Doing

What I've dropped is as telling as what I'm using now:

  • Markdown todo files in the repo. These seemed like a good idea but in practice they left me wanting. They get out of sync, agents have trouble updating them reliably, and they create merge conflicts constantly. Beads is strictly better.
  • Agent-native memory systems. Both Claude Code and Codex have automatic memory systems that write notes between sessions. The problem is they're machine-local and agent-specific — Claude Code can't see what Codex remembered, Codex can't see what Claude Code saved, and neither survives a machine switch. I disable Claude Code auto memory at the project level via .claude/settings.json and instruct both agents to use bd remember instead. One memory system, in the repo, visible to every agent.
  • Switching agents mid-task without a handoff. Early on I'd swap between Claude Code and Codex opportunistically. Now if I switch agents, I do a proper handoff—making sure the current state is committed, issues are updated, and the new agent has context. Otherwise things go sideways.

It's Still Evolving

This setup works well for me right now. I expect it to keep changing—the tooling in this space is moving fast, and Beads in particular is still growing. But the underlying principle is stable: agents need context, and that context needs to live somewhere outside the conversation window.

Getting that right is what makes the difference between AI development that's actually productive and AI development that feels like it should be productive but keeps dropping things.

