The slop cannons in your engineering org

Getting this out of the way first: I’m writing this as someone who loves Cursor and Claude Code. I once spent a Sunday evening having Claude Code build scripts for automated social media video counting while I was simultaneously playing Diablo IV, and it was, for the record, a great Sunday night. You won’t find a bigger believer in agent-driven dev on Substack.

Which makes what I’m about to say more credible, not less. There is a specific, identifiable type of person inside modern SaaS organizations who has weaponized these tools against their own team. They run agents like a slot machine. They generate output the way a lawn sprinkler generates water. They confuse volume for value, velocity for progress, and tokens spent for problems solved.

The term I’ve heard that best describes the phenomenon: slop cannons.

Jack Raines (@Jack_Raines), May 5, 2026: “@yrechtman Slop Cannon is in the OpenAI vocab” (quoting Ryan Brewer, @ryanbrewer: “@signulll CEO + a few slop cannons”)

What’s a slop cannon?

A slop cannon is often an engineer or designer (or one of those “designer-engineer” hybrids with weird LinkedIn bios) who has converted their workflow into a high-throughput AI artifact firehose. They have a recognizable shape:

They run more than three AI agents in parallel as a default setting, not an exception, often launching a slew of them from their phones in the morning to check on a few hours later.
Their PRs are large, fast, and confident, and the median one needs a follow-up patch within two weeks.
They post terminal screenshots in Slack with rocket emojis.
They cannot explain their own diff.
They distrust other people’s code reviews more than the model’s.
They incessantly use phrases like “the agent figured it out” and “Claude can handle that.”

Slop cannons are not inherently bad developers. They are, in many cases, very good developers, and that’s what makes this pattern dangerous. They have enough taste to ship something desirable and enough velocity to ship a lot of it.

What the slop cannon produces

In March 2026, AI agents generated roughly 17 million pull requests per month on GitHub, up from 4 million in September 2025. That is a 325% increase in six months. Voiceflow’s head of cloud infrastructure, Xavier Portilla Edo, put the legitimacy rate at “1 out of 10,” meaning 90% of those agent-authored PRs are noise the maintainer has to sort.

Reported AI-agent PR volume on GitHub rose from roughly 4 million per month in September 2025 to 17 million this March. Separately, GitHub COO Kyle Daigle said platform activity had reached 275 million commits per week and 2.1 billion GitHub Actions minutes per week. Third-party analysis estimated Claude Code at about 4.5% of public GitHub commits in March 2026.

Claude Code alone now accounts for 4.5% of all public GitHub commits. Weekly commits across the platform hit 275 million in early 2026, a 14x year-over-year increase. GitHub Actions usage crossed 2.1 billion minutes per week. None of that scaled because humans got 14x more productive. It scaled because the cannons are firing.

The platform is feeling it. In early April, GitHub had five outages inside 48 hours: a 2.7-hour Copilot backend exhaustion, an 8.7-hour code search blackout, an audit-log incident, four hours of Copilot Cloud Agent degradation, and a coding-agent job-startup failure. Five outages. Two days.

At the artifact level the picture is uglier. CodeRabbit's December 2025 analysis of 470 open-source pull requests found AI-coauthored PRs contained 1.7x more issues than human-only ones, with 1.4 to 1.7x more critical and major findings, and logic and correctness errors 75% more common. Veracode's 2025 GenAI Code Security Report tested over 100 LLMs across four languages and found AI-generated code shipped with a 45% failure rate on secure coding benchmarks. A true slop cannon doesn't ship clean code at scale. He ships dirty code at scale, and then asks another agent to fix it.

The disease isn't engineering-only. Designers have their own version, and Figma's 2025 AI Report has the receipts. 33% of designers now use AI to generate design assets, 22% to draft first versions of interfaces or websites, 21% to explore layouts and visual themes. Only 54% of designers say AI improves the quality of their work, against 68% of developers who say the same, a 14-point chasm between the people shipping the code and the people shipping the look. 47% of designers feel AI makes them better at the job. They do feel faster. Faster is not inherently better.

The slop cannon designer ships seventeen Figma frames generated from one prompt at 11 PM, picks the one that looks the most like a Stripe page, and calls it “exploration.” The variations are not exploration, they’re the same idea rotated three degrees.

Real exploration is sitting with a problem long enough to have a point of view about it.

AI is great for moodboarding, asset generation, icon work, and stub copy. It is not a substitute for taste (and the people leaning hardest on it are often the ones who haven’t developed any).

The METR slap

METR ran a randomized controlled trial in early 2025 on 16 experienced open-source developers across 246 tasks, in mature codebases the developers had worked in for an average of five years. Before the study, the developers forecast that AI tools would make them 24% faster. After the study, the same developers reported feeling 20% faster. The actual measured outcome was that AI made them 19% slower.

That’s a 39-point gap between perception and reality, in favor of feeling productive while being measurably worse at the job.
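The arithmetic behind that gap, as a quick sketch. The figures are the ones quoted above; putting "faster" and "slower" on one signed percentage scale is a simplifying assumption of this sketch, not METR's methodology, and the variable and function names are made up for illustration.

```python
# Figures quoted above from the METR study, as signed percentage changes in speed.
forecast_speedup = 24    # what developers predicted before the study
perceived_speedup = 20   # what developers reported feeling afterward
measured_speedup = -19   # what was measured: 19% slower


def gap_in_points(believed: int, measured: int) -> int:
    """Percentage-point distance between a believed and a measured change."""
    return believed - measured


perception_gap = gap_in_points(perceived_speedup, measured_speedup)
forecast_gap = gap_in_points(forecast_speedup, measured_speedup)

print(f"perceived vs. measured: {perception_gap} points")
print(f"forecast vs. measured: {forecast_gap} points")
```

The first line is the 39 points above. The second shows the pre-study forecast missed by even more.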

Stack Overflow’s 2025 Developer Survey, the largest dataset on this question, confirms it from a different angle. 84% of developers use or plan to use AI tools, 51% of professionals daily. Trust is collapsing in the other direction: 46% actively distrust AI accuracy, only 3% "highly trust" it, 45% report debugging AI-generated code takes longer than writing it themselves, and 66% list "AI solutions that are almost right, but not quite" as their top frustration with the tooling. Positive sentiment toward AI fell from 70%+ in 2023 and 2024 to 60% in 2025. Adoption is going up while trust is going down.

The sycophancy mechanism

Slop cannons aren’t idiots. They’re responding to incentives, and the model is agreeing with them.

The SycEval benchmark measured sycophantic behavior across the major LLMs and clocked a 58.19% sycophancy rate, with regressive sycophancy (agreement that flips a previously correct answer to a wrong one) showing up in 14.66% of cases.
A more recent paper, “Sycophancy Is Not One Thing”, decomposes the behavior into sycophantic agreement and sycophantic praise, contrasts both with genuine agreement, and shows all three live on distinct linear directions in latent space that can be independently amplified or suppressed without affecting each other. (Translation: there isn't one sycophancy knob, there are several, and turning one down doesn't quiet the rest.)
A third paper, “The Silicon Mirror”, benchmarked Claude Sonnet 4 at 9.6% baseline sycophancy across 437 adversarial scenarios before mitigations were applied.

What this means when you’re coding with AI: the model defaults to agreement. When a slop cannon insists that they know what direction to push a refactor, the model says “you’re absolutely right!” and helps. When the cannon pushes back on a flagged bug, the model folds. When the cannon wants a rubber stamp on the PR, the rubber stamp arrives.

This is the most under-priced failure mode in 2026 engineering organizations. We talk about hallucinations daily, but nobody talks about agreement. Agreement is worse. Hallucination is wrong-and-confident. Agreement is wrong-because-you-asked-for-it.

A 2026 Anthropic study on cognitive offloading found developers who used AI scored 17% lower on conceptual quizzes about the same code than developers who did not, with no statistically significant speed advantage on the underlying task. The biggest gap showed up in debugging questions, which is the exact skill needed to validate AI output in production. Instead of building a model of the system, your engineers are reviewing the model the agent built, picking the option that looks right, and shipping it.

That works fine until the agent is wrong, the system is unfamiliar, or the bug is two layers deeper than the prompt. Then you find out who actually understands the codebase. (Spoiler: it’s increasingly nobody.)

How to spot a slop cannon (manager’s checklist)

You probably already have a few folks in mind, but if you need something quantitative:

Pull last quarter’s revert and hotfix rate per engineer. If one person’s number is 2x the team average, you have a candidate.
Look at PR size distribution. Median PR over 800 lines is a yellow flag. Median PR over 1,500 lines is a slop cannon.
Check time-to-revert on shipped PRs. Slop cannon code reverts inside two weeks at a rate well above the team baseline.
Audit AI tool spend per seat. Tokens burned divided by features shipped is a real metric in 2026. If one seat is 3x the team median on burn and at or below median on shipped value, you found him.
Read the prompts, not just the code. Ask to see the slop cannon’s last five Claude Code conversations. If the prompts are vague, the pushback nonexistent, and the agreement universal, you have your answer.
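The numeric checks above can be roughed out in a few lines. This is a minimal sketch, assuming you can export merged PRs with an author, a line count, and a flag for whether the PR later needed a revert or hotfix. The `PullRequest` fields, the function names, and the sample data are all hypothetical. The thresholds (2x, 800, 1,500, 3x) come from the checklist, and "team average" is read here as the mean of per-engineer rates, which is only one reasonable interpretation.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass
class PullRequest:
    author: str
    lines_changed: int
    reverted_or_hotfixed: bool  # True if the PR later needed a revert or hotfix


def checklist_flags(prs):
    """Return {author: [flags]} using the revert-rate and PR-size thresholds."""
    by_author = {}
    for pr in prs:
        by_author.setdefault(pr.author, []).append(pr)

    # Share of each engineer's PRs that needed a revert or hotfix.
    rates = {
        author: sum(pr.reverted_or_hotfixed for pr in author_prs) / len(author_prs)
        for author, author_prs in by_author.items()
    }
    team_average = mean(rates.values())

    flags = {}
    for author, author_prs in by_author.items():
        found = []
        if team_average > 0 and rates[author] >= 2 * team_average:
            found.append("revert/hotfix rate at least 2x team average")
        median_size = median(pr.lines_changed for pr in author_prs)
        if median_size > 1500:
            found.append("median PR over 1,500 lines")
        elif median_size > 800:
            found.append("yellow flag: median PR over 800 lines")
        flags[author] = found
    return flags


def burn_outliers(tokens_by_seat, shipped_by_seat):
    """Seats at 3x+ the median token burn with at-or-below-median shipped value."""
    burn_median = median(tokens_by_seat.values())
    shipped_median = median(shipped_by_seat.values())
    return [
        seat
        for seat, tokens in tokens_by_seat.items()
        if tokens >= 3 * burn_median and shipped_by_seat[seat] <= shipped_median
    ]


def make_prs(author, sizes, bad):
    return [PullRequest(author, size, flag) for size, flag in zip(sizes, bad)]


# Made-up sample quarter for three engineers.
prs = (
    make_prs("ana", [120, 300, 250, 90], [False, False, False, False])
    + make_prs("ben", [850, 900, 700, 1000], [False, False, False, True])
    + make_prs("cal", [1600, 2400, 1800, 900], [True, True, True, False])
)
tokens = {"ana": 10, "ben": 12, "cal": 40}  # millions of tokens, hypothetical
shipped = {"ana": 5, "ben": 6, "cal": 5}    # features shipped, hypothetical

for author, found in checklist_flags(prs).items():
    print(author, found)
print("burn outliers:", burn_outliers(tokens, shipped))
```

Treat the output as a list of people to talk to, not a verdict. The last item on the checklist, reading the prompts, is the one a script can't do for you.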

What to do once you find one

Don’t fire them. Slop cannons aren’t malicious, and often have incredible drive and vision for your product. This is something you can shape.

Cap parallel agents at two. Three for prototyping. A few focused agents beat a multitude of unconcentrated ones, every time.
Mandate a one-page spec before any agent runs. What’s changing, why, failure modes, out-of-scope. The doc helps agents stay on task and forces the cannons to understand what they are building and why.
Force adversarial review. Every agent-assisted PR gets a second prompt: “argue the strongest case against this change.” Bake it into an AGENTS.md file if you have to.
Pair on the prompt, not just the code. Have engineers review each other’s prompts and agent runs alongside the resulting PR. Prompts are the new code review surface and can help you identify where sycophancy is entering the process.
Track the slop ratio at the team level. If your revert rate, hotfix rate, or “quick follow-up PR” rate has crept up two quarters in a row, your AI tooling is likely masking a quality problem. The METR study should be required reading for every engineering manager in the building.
Protect the juniors. Give them small, AI-disallowed tasks on purpose. The 17% quiz gap from the cognitive offloading paper can compound across a career and, if we’re not careful, erase the existence of entry-level coding jobs entirely (and then what?).
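The "two quarters in a row" test for the slop ratio is easy to automate. A minimal sketch, assuming you already record each rate once per quarter in chronological order; the metric names and the numbers below are invented for illustration.

```python
def crept_up_two_quarters(rates):
    """True if the metric rose in each of the two most recent quarters."""
    if len(rates) < 3:
        return False  # need three data points to see two consecutive increases
    older, previous, latest = rates[-3:]
    return older < previous < latest


# Hypothetical team-level rates, oldest quarter first.
quarterly = {
    "revert_rate": [0.04, 0.05, 0.07],
    "hotfix_rate": [0.03, 0.03, 0.04],
    "quick_follow_up_rate": [0.10, 0.09, 0.12],
}

warnings = [
    name for name, series in quarterly.items() if crept_up_two_quarters(series)
]
print(warnings)  # ['revert_rate']
```

Only `revert_rate` trips the check here: the other two series rose in the latest quarter but not in the one before it. A strict "rose twice" rule is deliberately simple, and a noisy team may want a minimum step size before raising the alarm.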

What to do if you are one

I’m not going to pretend you don’t know. I often find myself slowly becoming a slop cannon and have to take a step back and close the MacBook for a bit.

Read the diff. Out loud if you have to. If you can’t explain it, you can’t ship it.
Push back on the model. Always. Ask it to argue the other side. The sycophancy research is unambiguous: the model will not volunteer disagreement. You have to extract it. Encourage your models to ask you questions and challenge your assumptions. Most harnesses have Q&A agent flows built in, but they are rarely used. Push the model to use them.
Write something without an agent every week. A function, a schema, a migration, a blog post, anything. Models have trouble with originality, and if you continue to let them do all your work, you soon will too.
Read the docs the agent read. Not to verify the agent, but to rebuild the model in your own head. Reading documentation has always been a shitty part of the job, but it’s a shitty part that gets your brain muscles moving. Don’t lose that completely.
Cap your agent count. Yes, even if you feel slower. You aren’t slower. You are returning to understanding your own work (and saving your company some dough).

One last thing

I’m still bullish on agents, and probably always will be. The tools are good and getting better. Claude Code, Cursor, Codex: some of the most powerful productivity instruments we’ve ever had. The discipline gap between the engineer getting 3x out of these tools and the slop cannon getting 0.81x is the entire story of 2026 engineering productivity, and the gap is widening every month.

I wrote about the CEO version of this loop last month. The CEO version makes headlines because the CEO has the title and the LinkedIn following. The slop cannon version is bigger, quieter, and shows up in your codebase first (and also costs more).

Don’t be a slop cannon. Stop firing. Start engineering.

Originally published on the Handy AI newsletter →