Do The Things

Stop Talking, Start Doing

The Skill Gap Has a Number Now

In February I wrote that “AI slop” is a skill issue. The gap isn’t AI’s; it’s the human’s. People who treat AI output as a first draft and run the iteration loop ship work that’s orders of magnitude better than the people who screenshot the rough first response and dunk on it.

Hack The Box just put a number on that argument. Their 2026 benchmark has AI-augmented security teams operating 3–4x faster than human-only counterparts, with junior and mid-tier practitioners showing the biggest gains.

Expected result, right? But there’s a catch.

What 3–4x Is Actually Measuring

The lazy read is that the model is the multiplier. Frontier capability. Better training run. Whatever.

But the human-only baseline in the study isn’t slow because those operators refused to use AI. They’re slow because they’re working at the natural ceiling of what one person produces without leverage. The AI-augmented group, meanwhile, isn’t sprinkling Claude on top of the same workflow. They figured out how to drive it: feed the right context, push back on bad output, iterate, ship. Those are skills shaped by domain mastery and experience. From the outside the workflow looks unremarkable, which is part of what makes it confusing — you watch someone do it and think “well of course, they just pointed Claude at the problem.” Then you compare what came out at the end of the day.

That’s what HTB is measuring. Not the model. The operator.

A Concrete Version

In the AI Slop post I described auditing my own tool, dram, against a class of recent Claude Code CVEs. Twenty minutes, four real findings, full write-ups with file paths, line numbers, vulnerable code, exploit examples.

I wasn’t the fast part of that loop; Claude was. The judgment about what to audit and which findings mattered was mine.

If I’d done that audit the 2022 way — read the codebase, take notes, manually trace the data flows, write up the findings, format the report — it would have been half a day’s work, maybe a full day to be thorough. Twenty minutes is the multiplier, and in that specific case it’s probably more than 3–4x because the task was bounded and the loop was a clean fit. Other tasks won’t be such a clean fit. Some tasks I’ve tried to run the loop on were closer to 1.5x. The aggregate number is what it is because the population is mixed.
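A back-of-envelope check on that claim, using the post’s own timings (“half a day” to “a full day” of manual work versus twenty assisted minutes):

```python
# Rough multiplier for the audit example; times are the post's own estimates.
manual_low = 4 * 60    # "half a day" of 2022-style manual work, in minutes
manual_high = 8 * 60   # "a full day to be thorough"
assisted = 20          # the actual AI-assisted audit time

print(manual_low / assisted, manual_high / assisted)  # 12.0 24.0
```

Even the conservative end lands well above the 3–4x aggregate, which is the point: a bounded task with a clean loop sits high on the curve.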

The point is that the people who built the workflow are operating somewhere on that curve. The people still treating AI as autocomplete are operating at 1x and writing posts about how it’s all hype.

Why Mid-Tier Hits the Inflection

The biggest lift for capability isn’t at the elite tier. Senior operators already had judgment and pattern recognition; AI smooths their work but doesn’t transform it. Mid-tier people were the ones limited by throughput — judgment without speed, or speed without breadth — and AI fills whichever side was missing.

I’m honestly not sure how to feel about that. It’s good for the people in the middle who can move up. It’s also a real problem for the people coming up behind them, because the entry-level rungs of the ladder are exactly the ones AI is best at automating. We’re going to spend the next few years figuring that out.

This Number Is Going to Land on Hiring Desks

The HTB benchmark isn’t going to stay on HTB’s blog. Numbers like “3–4x faster” travel. They show up in board decks, all-hands slides, hiring planning meetings, and CISO budget asks.

HTB didn’t leave the implication for the reader to figure out. From their report:

“For enterprises, the competitive advantage will not come from AI adoption alone. It will come from training cybersecurity professionals to effectively orchestrate, validate, and govern AI-driven workflows and agents.”

Same thesis as the 3–4x number, in business-deck language. The operator is the asset, and the asset needs training.

For people interviewing in 2026, that means the hiring manager evaluating you has either already seen this number or will see it before they sign your offer. If their org is below the inflection point, they want someone who can pull them across it. The question stops being “have you used AI?” and starts being “what have you shipped with it?”

You can’t fake your way through that. You either have the iteration muscle or you don’t. Decompose a problem, push back on bad output, steer the workflow into something that ships. Those are the same skills that made good engineers good before AI showed up. The difference now is that the gap between people who built those skills and people who didn’t is denominated in time saved per ticket, and the number is on the CFO’s desk.

The Real Number

Three months ago I wrote that “the ceiling is wherever your expertise and willingness to iterate take it.” HTB’s benchmark is the same statement with a number attached.

The naive read of 3–4x is “AI is fast, get on or get left behind.” That’s wrong, or at least it misses the point. What the number actually says is that this is what skilled iteration produces, and most people aren’t there yet because the loop is a skill that takes practice. The model isn’t getting better at making people skilled. It’s getting better at amplifying whoever shows up with skills.

The good news, same as before, is that the loop is learnable. Find the people operating in the multiplier, watch how they prompt and push back and review, and then practice. The number in the press release is the upside. The path to it is just reps.

Stop screenshotting the bad first draft. Start running the loop.


This post was drafted with Claude Code. The benchmark referenced is Hack The Box’s 2026 AI cybersecurity benchmark report.