auto-engineer came out of a weekend experiment where I was trying to see how far Claude could get building an operating system on its own. The short answer is "surprisingly far", but the more interesting answer is that I got bored sitting at my computer hitting enter between tasks.
The fix turned out to be pretty simple.
Claude can use the command line just fine, so if I install gh (the GitHub CLI) inside a sandboxed container and pass --dangerously-skip-permissions, there's no reason I need to be in the loop at all.
File an issue, Claude picks it up, implements it, opens a PR, waits for CI, responds to review, merges.
Repeat.
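Written out as commands, one pass of that cycle might look roughly like the sketch below. It is a dry run that only prints a plan and never touches GitHub. The function name, the branch naming, and the prompt text are all made up for illustration, and it assumes Claude commits its own work before the push. The real skill drives these steps from inside Claude rather than from a fixed script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the commands for one pass of the loop for a given issue number.
# Dry run only: the commands are echoed, not executed.
# plan_iteration and the issue-N branch name are hypothetical.
plan_iteration() {
  local issue="$1"
  local branch="issue-${issue}"
  cat <<EOF
gh issue view ${issue}
git switch -c ${branch}
claude -p "Implement issue #${issue}" --dangerously-skip-permissions
git push -u origin ${branch}
gh pr create --fill
gh pr checks --watch
gh pr view --comments
gh pr merge --squash --delete-branch
EOF
}
```

Calling `plan_iteration 42` prints the eight steps for issue 42, in order, so you can read them before letting anything run for real.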
That loop is /auto-engineer. The repo here is the toolkit that seeds it into any project:
> /seed
Target project path? ../my-project
Detected stack: Rust + Cargo
Wrote skills: /auto-engineer, /auto-manager, /sdlc, /file-issue, /wait-for-pr
Wrote infra: scripts/sandbox.sh, Dockerfile, .env.example
After that, a single shell script starts the autonomous loop in a container:
scripts/auto-engineer.sh
The whole thing is sandboxed in Docker so the "dangerously skip permissions" part is only dangerous to the container.
It mounts your host ~/.claude for auth, reads a GITHUB_TOKEN from the environment, and otherwise has no access to anything on your machine.
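To make that boundary concrete, here is a minimal sketch of what such a container launch could look like. It is not the actual contents of the toolkit's scripts: the image name, the /root/.claude mount target (which assumes the container runs as root), and the function name are assumptions. The function only prints the docker command so the mounts can be inspected before anything runs.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical image name; the real toolkit may use something else.
IMAGE="${IMAGE:-auto-engineer-sandbox}"

# Print (not run) a docker invocation that exposes only two things from
# the host: the ~/.claude directory for auth, and GITHUB_TOKEN.
sandbox_cmd() {
  printf '%s ' docker run --rm \
    -v "${HOME}/.claude:/root/.claude" \
    -e GITHUB_TOKEN \
    "${IMAGE}" \
    claude --dangerously-skip-permissions
  printf '\n'
}
```

Passing `-e GITHUB_TOKEN` without a value tells Docker to copy the variable from the host environment, so the token never has to be written into the script or the image.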
I've been using this to build vibix — an operating system that Claude is writing, mostly unattended. It's not serious, but it is a remarkably good way to find out where agentic coding loops actually break down. Turns out the answer is less often "the model can't do it" and more often "nobody told it what done looks like".