9 Agent Loops You Can Steal Today
Nine ready-to-run agent loops from the Loops channel — Ship PR Until Green, Test Until Green, PR Babysitter and more, each with its one-line goal and exit condition.
The fastest way to understand loop engineering is to steal a working loop and run it. Reading about act → observe → decide → repeat is one thing; watching an agent open a pull request and refuse to stop until CI goes green is another. The Loops channel ships 152 ready recipes, each with a paste-ready kickoff prompt. This post pulls nine of the most useful — the ones I reach for weekly — and gives you the one-line goal and exit condition for each so you can pick the right one in ten seconds.
Every loop below follows the same anatomy: a verifiable goal, a deterministic check command, a max-iterations cap, and anti-gaming guardrails. If those terms are new, the post "what loop engineering is" is the five-minute primer. Otherwise, grab a loop and go.
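The four-part anatomy above can be sketched in a few lines of Python. This is a minimal illustration, not how the Loops channel recipes are implemented: `LoopSpec`, `run_loop`, and the `act` callback (a stand-in for one agent turn) are hypothetical names introduced here.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopSpec:
    """Hypothetical container for the four parts of a loop."""
    goal: str                 # verifiable goal, stated in one line
    check_command: list[str]  # deterministic check; exit code 0 means done
    max_iterations: int       # hard cap on agent turns
    guardrails: list[str]     # anti-gaming rules handed to the agent


def run_loop(spec: LoopSpec, act: Callable[[str, list[str]], None]) -> dict:
    """Run the check; while it fails and the cap allows, let the agent act."""
    turns = 0
    while True:
        result = subprocess.run(spec.check_command, capture_output=True, text=True)
        output = result.stdout + result.stderr
        if result.returncode == 0:
            return {"status": "done", "turns": turns, "last_output": output}
        if turns >= spec.max_iterations:
            # Cap reached: stop and hand back the last observation as a report.
            return {"status": "cap_reached", "turns": turns, "last_output": output}
        act(output, spec.guardrails)  # one agent turn: read failures, attempt a fix
        turns += 1
```

The check runs before the first agent turn and after every later one, so a loop whose goal is already met exits with zero turns, and the final fix is always verified before the loop reports success.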
This piece reflects public discussion across X and engineering blogs as of June 2026; verify primary sources before relying on specifics.
1. Ship PR Until Green
What it does: implements the task, pushes, opens a PR, and fixes CI failures until everything is green.
Exit condition: gh pr checks reports all checks success.
This is the flagship. It turns "open a PR and babysit the Actions tab" into one unattended run. The full field-by-field breakdown is in Ship PR Until Green: Anatomy of a Loop, and the recipe lives at /skills/ship-pr-until-green. Cap: 10 iterations.
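As a rough sketch of the exit check, the function below decides "green" from JSON check results. It assumes the JSON comes from something like `gh pr checks --json name,bucket` and that passing checks carry a `bucket` value of `"pass"`; treat both as assumptions to verify against your installed `gh` version. The function names are hypothetical.

```python
import json


def failing_checks(checks_json: str) -> list[str]:
    """Names of checks whose bucket is anything other than "pass"."""
    checks = json.loads(checks_json)
    return [c.get("name", "?") for c in checks if c.get("bucket") != "pass"]


def all_checks_green(checks_json: str) -> bool:
    """True only if at least one check exists and every check passed.

    An empty list is treated as not green: "no checks reported yet"
    should not be mistaken for success.
    """
    checks = json.loads(checks_json)
    return bool(checks) and not failing_checks(checks_json)
```

This version is deliberately strict: pending or skipped checks count as not green, which matches the "all checks success" exit condition. Loosen it only if your repository routinely skips checks.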
2. Test Until Green
What it does: runs the local suite, reads failures, fixes the root cause, repeats.
Exit condition: test runner exits 0 with zero failures.
The best first loop because it needs nothing but your test runner — no git remote, no CI. The step-by-step build is in Test Until Green: Self-Healing Test Loops, recipe at /skills/test-until-green. Cap: 8.
3. Build Until Green
What it does: runs the build, reads compiler/bundler errors, fixes them, rebuilds.
Exit condition: build command exits 0 with no errors.
The compile-time sibling of Test Until Green. Perfect for type errors, missing imports, and broken bundler configs that cascade. Recipe at /skills/build-until-green. Cap: 6.
4. Coverage Until Threshold
What it does: writes tests for uncovered code until coverage hits your target.
Exit condition: coverage report shows coverage ≥ your threshold (e.g., 90%).
A coverage number is a textbook gameable metric, so this loop's guardrails forbid trivial assertion-free tests. Recipe at /skills/coverage-until-threshold. Cap: 12. Watch out for metric-gaming here; it's the canonical example in the post "why your agent loops forever".
5. Kill Flaky Tests
What it does: runs the suspect test many times, diagnoses the source of non-determinism, and fixes it.
Exit condition: the test passes N consecutive runs (e.g., 20/20).
Flaky tests are the number-one cause of loops that never stop, so a loop that fixes flakiness pays for itself. The deeper guide is "kill flaky tests agent loop", recipe at /skills/kill-flaky-tests. Cap: no fixed count; it iterates until it sees 20 clean runs in a row or hits the max-iterations cap you set.
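The "N consecutive passes" exit condition can be sketched as a small verifier that stops at the first failure, so the agent gets control back as soon as flakiness shows up. Both function names are hypothetical, and the runner is injected so the same check works with any test command.

```python
import subprocess
from typing import Callable


def command_passes(command: list[str]) -> bool:
    """Run a command once; a zero exit code counts as a pass."""
    return subprocess.run(command, capture_output=True).returncode == 0


def verify_consecutive_passes(run_once: Callable[[], bool], required: int = 20) -> tuple[bool, int]:
    """Return (True, required) after `required` straight passes,
    or (False, n) where n is the run on which the first failure occurred."""
    for run in range(1, required + 1):
        if not run_once():
            return False, run
    return True, required
```

A call might look like `verify_consecutive_passes(lambda: command_passes(["pytest", "tests/test_orders.py::test_total"]))`, where the test path is a made-up example. Twenty passes in a row makes flakiness less likely but does not prove it is gone, so pick N to match how rare the failure was.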
6. PR Babysitter
What it does: watches an open PR, answers review comments, addresses requested changes, and re-runs checks.
Exit condition: the PR is merge-ready, meaning CI is green and all review threads are resolved (a human does the actual merge).
This one runs while you sleep — it watches a single PR and answers whatever CI and reviewers throw at it, a natural complement to overnight runners like continuous Claude Code. It never self-merges: it drives the PR to merge-ready and leaves the merge to you. Recipe at /skills/pr-babysitter. It self-paces rather than running on a fixed interval. Cap: bounded by review rounds.
7. Pre-Commit Guard
What it does: before each commit, runs lint/format/type checks and fixes violations until the commit is clean.
Exit condition: all pre-commit hooks pass without --no-verify.
The guardrail that matters most here is the obvious one: it must never commit or push with --no-verify, which would skip the very hooks the loop exists to satisfy. Recipe at /skills/pre-commit-guard. Cap: 5.
8. Dependency Update
What it does: bumps a dependency, runs the suite, and either keeps the bump (if green) or reverts and reports (if red).
Exit condition: dependency updated with all tests passing, or cleanly reverted with a report.
A safe, revert-on-red loop that lets you keep dependencies current with little risk of landing a breaking bump, since anything that turns the suite red is reverted. Pair it with a nightly /schedule; see /loop, /goal & /schedule. Cap: 1 per dependency.
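The keep-or-revert decision is simple enough to sketch directly. The three callables are hypothetical stand-ins for whatever your package manager and test runner need; nothing here is taken from the recipe itself.

```python
from typing import Callable


def try_bump(
    name: str,
    apply_bump: Callable[[], None],
    run_suite: Callable[[], bool],
    revert: Callable[[], None],
) -> dict:
    """Apply one bump, run the suite once, then keep on green or revert on red."""
    apply_bump()
    if run_suite():
        return {"dependency": name, "kept": True, "report": "suite green, bump kept"}
    revert()
    return {"dependency": name, "kept": False, "report": "suite red, bump reverted"}
```

There is no retry on purpose: with a cap of one attempt per dependency, a red suite ends in a revert and a report rather than an agent trying to patch around the new version.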
9. Deploy Health Watch
What it does: polls a deploy's health endpoint after release until it reports healthy or rolls back.
Exit condition: health check passes for a sustained window, or the deploy is rolled back.
This is a /loop-shaped monitoring recipe rather than a /goal-shaped fixing one — it watches over time. Cap: bounded by a timeout window.
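A minimal sketch of the "sustained window or rollback" logic, assuming a `probe` callable that returns True when the health endpoint looks healthy and a `rollback` callable that undoes the deploy. All names and default durations are hypothetical; the clock and sleep functions are injectable so the logic can be exercised without real waiting.

```python
import time
from typing import Callable


def watch_deploy(
    probe: Callable[[], bool],
    rollback: Callable[[], None],
    healthy_window_s: float = 60.0,
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the probe has been healthy for a full window, else roll back."""
    start = clock()
    healthy_since = None
    while clock() - start < timeout_s:
        now = clock()
        if probe():
            if healthy_since is None:
                healthy_since = now  # start of the current healthy streak
            if now - healthy_since >= healthy_window_s:
                return "healthy"
        else:
            healthy_since = None  # any failed probe resets the window
        sleep(interval_s)
    rollback()
    return "rolled_back"
```

Resetting the streak on every failed probe is what makes the window "sustained": a deploy that flaps between healthy and unhealthy never exits as healthy and ends in a rollback once the timeout expires.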
| # | Loop | Exit condition | Typical cap |
|---|---|---|---|
| 1 | Ship PR Until Green | All gh pr checks success | 10 |
| 2 | Test Until Green | Suite exits 0 | 8 |
| 3 | Build Until Green | Build exits 0 | 6 |
| 4 | Coverage Until Threshold | Coverage ≥ target | 12 |
| 5 | Kill Flaky Tests | N consecutive passes | until clean |
| 6 | PR Babysitter | Merge-ready (human merges) | review rounds |
| 7 | Pre-Commit Guard | Hooks pass, no --no-verify | 5 |
| 8 | Dependency Update | Updated+green or reverted | 1/dep |
| 9 | Deploy Health Watch | Healthy window or rollback | timeout |
The kickoff prompts are paste-ready, but spend ten seconds on three things before you run:
1. CHECK COMMAND — is it deterministic for your repo? (flaky => quarantine first)
2. MAX ITERATIONS — set a cap you're comfortable paying for
3. GUARDRAILS — keep the anti-gaming rules; they're load-bearing
That's the whole discipline. A loop you steal without checking those three is the loop you'll find spinning at iteration forty. For the structural reasons these recipes beat freeform prompting, read loop engineering vs prompt engineering. For the catalog itself and the community behind it, the open awesome-agent-loops collection (CC-BY) and Anthropic's agent loop docs are the primary sources.
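The first pre-flight item, checking that the check command is deterministic, can be approximated with a cheap smoke test: run the command a few times on an unchanged tree and compare exit codes. The function name and the default of three runs are assumptions for illustration.

```python
import subprocess


def check_looks_deterministic(command: list[str], runs: int = 3) -> bool:
    """Run the check several times without changing anything in between.

    Differing exit codes mean the check is flaky and should be
    quarantined or fixed before any loop depends on it.
    """
    exit_codes = {
        subprocess.run(command, capture_output=True).returncode
        for _ in range(runs)
    }
    return len(exit_codes) == 1
```

Identical exit codes across a handful of runs do not prove determinism; they only rule out the most obvious flakiness. For a check that fails rarely, raise `runs`.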
Start with Test Until Green. It needs nothing but your test runner (no git remote, no CI permissions, no PR), so it's the lowest-risk way to feel a loop work. Once you trust it, graduate to Ship PR Until Green, which is the same idea over the full pipeline.
The fixing loops (1–5, 7, 8) are safe to run unattended as long as their guardrails stay intact and a max-iterations cap is set. Don't strip the anti-gaming rules to "speed things up"; those rules are what stop the agent from faking green. PR Babysitter (6) is designed specifically for unattended overnight runs.
Yes, you can chain loops, and people do. A common stack is Pre-Commit Guard → Test Until Green → Ship PR Until Green, each handing off to the next. A nightly /schedule can drive the whole chain, which is how continuous Claude Code while you sleep works.
When a loop hits its max-iterations cap, it stops and reports its state: which tests still fail, which checks are red, and its best diagnosis. Hitting the cap is information, not failure: it usually means the task was underspecified or the environment is flaky. Take over with the agent's report as your starting context.
The recipes are listed in the Loops channel with paste-ready kickoff prompts, and the underlying patterns are documented openly in collections like awesome-agent-loops. You can hand-build any of these nine from the goal/exit/cap/guardrails described above.
Browse 150+ ready-to-run agent loops in the Loops channel, or explore the full skill catalog at aiskill.market.