The Patch Won't Save You: What Frontier Models Actually Mean for Security Practitioners

by Brad Hibbert, COO & CSO | 17 min read

Key takeaways

  1. Frontier AI models are accelerating vulnerability discovery faster than patches can keep up. Security teams need a different model.
  2. Continuous threat exposure management (CTEM) shifts the goal from patch counts to risk reduction across attack paths.
  3. Compensating controls are a first-class risk reduction strategy, not a fallback. For OT, legacy, and embedded systems, they are often the only option.
  4. Reachability is not the same as internet exposure. Assume initial access has already happened and prioritize accordingly.
  5. The CTEM validation stage is the most overlooked. Verify that mitigations actually closed the path, not just that controls were deployed.

There's a lot of noise right now about what frontier AI models mean for security. Most of it is either hype or panic, and neither helps a security team make a decision on Monday morning. In my role I talk to a lot of customers, and I hear a steady stream of feedback about what they're worried about and what they want from the products they buy. I've been working through what frontier models actually change for the people running security programs, and I want to share where I've landed, including the part that doesn't get talked about enough.

A sustained wave of disclosures, not a tidy spike

I expect a sustained run of elevated vulnerability disclosures. Software and hardware providers are going to race to find and patch what AI-assisted analysis surfaces. The cost of finding bugs is collapsing. The fuzzing, code analysis, and exploit-path discovery that used to take a skilled researcher weeks can now be done by a capable model in hours.

The number I hear most often for how long this lasts is twelve to eighteen months. I think it runs longer, closer to two or three years. I also don't think the period will be linear. We're not going to see a smooth convergence between discovered and patched vulnerabilities. Every time a new frontier model comes out, the bar for what can be discovered moves again. This isn't a single wave that rises and then settles back down. It's the new normal, where discovery keeps running at machine speed and keeps finding new and more sophisticated ways through your defenses. The practical takeaway is to build for a steady stream, not a cleanup project. The team and tooling you size for a one-time surge will be underwater in a year.

Discovery, hard as it is, turns out to be the more manageable side of this, and I say that with full respect for the people who do it. The real strain shows up in triage and fixing. Open-source maintainers are already being flooded with low-quality, AI-generated vulnerability reports that take real time to disprove. It's cheap now to produce a finding, but someone still has to sit down and check whether it's real, and that someone is a person. So at least early on, this makes more work, not less.

The good news: agentic AI is coming for the development lifecycle

I'm optimistic that software and hardware providers will start using agentic AI to find and fix their own flaws earlier, inside their development process, and a lot of them already are. AI-augmented discovery is the furthest along today, but many teams are also starting to lean on AI to recommend, implement, test, and validate the patches themselves. The catch is that shipping a patch which fixes a piece of software doesn't guarantee it won't break something else in a customer's production environment. So I expect a human will stay in the loop on those fixes for a while yet, at least for the changes that matter most. This is the right direction, though, and over time it should mean fewer flaws shipping in the first place.

The end state is more secure products, and I believe we get there. But customers have to live in the transition period today, and that period is the dangerous part. That's the piece I want to spend the most time on, because it's what I hear customers wrestling with most.

CTEM in practice: measure risk reduction, not patch counts

What I keep hearing from customers comes down to one thing. If finding vulnerabilities keeps speeding up while fixing them stays slow, the companies that do well will be the ones that act early to bring risk down before a full patch is even possible.

Start with a simple premise. You can't patch everything, and the gap is widening, so you need a different approach. The mistake is to keep measuring success by how many patches you've applied. What you actually care about is how much risk you've taken off the table. So change what you report on. Track risk burned down, not patches closed, because the second number can look great while your real exposure barely moves.
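To make the difference between the two numbers concrete, here is a minimal sketch. The `Finding` fields, the risk scores, and the `residual` fraction are all hypothetical values chosen for illustration, not a real scoring model. The point is only that three closed patches can coexist with most of the risk still on the table.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    name: str
    risk: float             # hypothetical risk score for the path(s) this finding enables
    status: str             # "open", "patched", or "mitigated"
    residual: float = 0.0   # fraction of risk left after a compensating control

def patches_closed(findings):
    """The traditional metric: how many findings were patched."""
    return sum(1 for f in findings if f.status == "patched")

def risk_burned_down(findings):
    """Fraction of the starting risk removed by patches and mitigations."""
    total = sum(f.risk for f in findings)
    if total == 0:
        return 0.0
    removed = 0.0
    for f in findings:
        if f.status == "patched":
            removed += f.risk
        elif f.status == "mitigated":
            removed += f.risk * (1.0 - f.residual)
    return removed / total

# Illustrative data only.
findings = [
    Finding("low-severity library bump A", 1.0, "patched"),
    Finding("low-severity library bump B", 1.0, "patched"),
    Finding("low-severity library bump C", 1.0, "patched"),
    Finding("flaw on the route to a domain controller", 40.0, "open"),
    Finding("legacy OT protocol flaw", 30.0, "mitigated", residual=0.2),
]

print("patches closed:", patches_closed(findings))             # 3
print(f"risk burned down: {risk_burned_down(findings):.0%}")   # 37%
```

In this made-up data set almost all of the burn-down comes from the one compensating control, while the three patches barely register.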

Frontier models are what make this shift urgent. Mythos, for example, has shown it can chain together several low-severity vulnerabilities, the kind that normally sit untouched at the bottom of a backlog because each one looks harmless on its own, and turn them into a single serious exploit. That breaks the assumption a lot of programs are built on, which is that you can rank work by individual CVE severity and safely ignore the low end. In a world where the low end can be chained into a full compromise, risk no longer lives in individual CVEs. It lives in attack paths, the routes through your environment that one vulnerability, or several strung together, open up for an attacker. So that's what you go after. You reduce the number of attack paths available to an attacker, and you do that by breaking the chain somewhere along it.

Sometimes breaking the chain means applying the patch. But with the volume of vulnerabilities and zero days we're now dealing with, more often it's going to be a compensating control instead. That could be virtual patching at the WAF or the IPS, network segmentation, disabling a vulnerable feature or protocol, tightening privileges, or removing a reachable path, each of which can take a vulnerability from critical down to acceptable while the real fix is staged and tested.
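One way to picture breaking the chain is as a small graph exercise. The sketch below is an assumption-laden toy: the asset names, the hops, and the weaknesses attached to them are invented, and a compensating control is modeled simply as removing one hop.

```python
# Directed hops: (from, to) -> the weakness that lets an attacker make that hop.
# All names here are hypothetical.
edges = {
    ("workstation", "file-server"): "low-severity information leak",
    ("file-server", "app-server"): "low-severity path traversal",
    ("app-server", "database"): "low-severity privilege misconfiguration",
    ("workstation", "jump-host"): "reused credential",
    ("jump-host", "database"): "unpatched agent",
}

def attack_paths(edges, source, target):
    """Return every simple path from source to target as a list of nodes."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    paths = []

    def walk(node, visited):
        if node == target:
            paths.append(list(visited))
            return
        for nxt in adjacency.get(node, []):
            if nxt not in visited:      # never revisit a node on the same path
                visited.append(nxt)
                walk(nxt, visited)
                visited.pop()

    walk(source, [source])
    return paths

def apply_control(edges, blocked_hop):
    """Model a compensating control as removing one hop from the graph."""
    return {hop: why for hop, why in edges.items() if hop != blocked_hop}

before = attack_paths(edges, "workstation", "database")
segmented = apply_control(edges, ("file-server", "app-server"))
after = attack_paths(segmented, "workstation", "database")

print(len(before), "paths before the control")   # 2
print(len(after), "path after the control")      # 1
```

Note that the chain of three low-severity hops disappears without any of them being patched, and that one path through the jump host is still open.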

I want customers to hear that this isn't a fallback you reach for when everything else fails. For a large class of assets it's the primary lever they have. You can't agentically negotiate a hospital's maintenance window, and you can't reboot a power substation for Patch Tuesday. For operational technology, legacy systems, and embedded hardware, compensating controls are frequently the only viable form of risk reduction, because patching isn't an option at all, sometimes for years and sometimes forever.

The hard part comes first, and it's knowing which attack paths you're actually exposed to. This is where I expect a lot of exposure management programs to step up their investment, because you can't disrupt a path you can't see. Once you can see the paths, the next question is which mitigation neutralizes a given path at the lowest cost, or at least reduces it enough to matter. And then comes the part nobody has fully solved, which is having real confidence that the mitigation actually worked.
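The "lowest cost" question can be sketched as a small covering problem. Everything in this example is hypothetical: the hop names, the mitigations, and the relative costs. The brute-force search is only practical for a handful of candidates, so a real environment would need a smarter solver.

```python
from itertools import combinations

# Each attack path is the set of hops it depends on (hypothetical names).
paths = [
    {"phish->workstation", "workstation->file-server", "file-server->database"},
    {"phish->workstation", "workstation->jump-host", "jump-host->database"},
]

# Hypothetical mitigations: (relative cost, hops the control neutralizes).
mitigations = {
    "segment database VLAN": (5, {"file-server->database", "jump-host->database"}),
    "patch file-server": (8, {"workstation->file-server"}),
    "disable legacy protocol on jump-host": (2, {"workstation->jump-host"}),
    "WAF virtual patch on file-server": (2, {"file-server->database"}),
}

def cheapest_cover(paths, mitigations):
    """Try every combination of mitigations and keep the cheapest one that
    breaks at least one hop on every path. Returns (cost, names) or None."""
    best = None
    names = sorted(mitigations)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            blocked = set().union(*(mitigations[n][1] for n in combo))
            if all(path & blocked for path in paths):
                cost = sum(mitigations[n][0] for n in combo)
                if best is None or cost < best[0]:
                    best = (cost, combo)
    return best

cost, chosen = cheapest_cover(paths, mitigations)
print(cost, chosen)
```

With these made-up numbers, two cheap controls together (cost 4) beat the single segmentation project (cost 5), and nothing in the list blocks the phishing hop itself.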

That last part is worth a bit more discussion, because the validation stage of the CTEM lifecycle is so often overlooked. You can't verify everything, so the realistic approach is a hybrid one. You verify the critical risks directly, and you sample the rest. Put your real effort where the danger is highest. That means the vulnerabilities known to be actively exploited, such as those in CISA's Known Exploited Vulnerabilities catalog, the ones sitting on your most important systems, and the ones an attacker could actually reach in your environment. Where you've applied the same control the same way across the board, you can test a sample and reasonably assume the rest hold without testing every instance. And notice I said an attacker could reach, not that the system is exposed to the internet. That difference matters enough that I want to come back to it.
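The hybrid approach could be sketched roughly as follows. The field names, the 20 percent sample rate, and the rule that any one of the three conditions triggers direct verification are assumptions made for the example, not a prescribed policy.

```python
import math
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class MitigatedFinding:
    finding_id: str
    on_kev: bool           # listed as known exploited
    critical_asset: bool   # sits on one of your most important systems
    reachable: bool        # an attacker could reach it in your environment

def verification_plan(findings, sample_rate=0.2, seed=0):
    """Split findings into those verified directly and a random sample of the rest."""
    direct = [f for f in findings if f.on_kev or f.critical_asset or f.reachable]
    direct_ids = {f.finding_id for f in direct}
    rest = [f for f in findings if f.finding_id not in direct_ids]
    k = min(len(rest), math.ceil(len(rest) * sample_rate))
    sampled = random.Random(seed).sample(rest, k)   # seeded so the plan is repeatable
    return direct, sampled

# Illustrative data: 12 mitigated findings, 3 of which meet a direct-verification rule.
findings = [
    MitigatedFinding(f"F-{i}", on_kev=(i == 0), critical_asset=(i == 1), reachable=(i in (1, 2)))
    for i in range(12)
]

direct, sampled = verification_plan(findings)
print([f.finding_id for f in direct])    # ['F-0', 'F-1', 'F-2']
print(len(sampled), "of", len(findings) - len(direct), "remaining findings sampled")
```

The seed is there only so the same plan can be reproduced for an audit. In practice you would likely rotate it so the same instances are not sampled every cycle.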

This is where automated penetration testing earns its keep, and honestly it suits this job better than it suits open-ended hunting for unknown bugs. The question you're asking is simple and has a clear answer. With this control in place, can the attack still get through, yes or no? That's much easier to act on than asking a tool to go find everything. Some tools already do a basic version of this, and the newer agentic ones can follow the more creative, chained-together attack paths the older tools miss.

There's one limit worth being upfront about. A control can pass the test against the attack path you know about and still leave another door open. So this kind of validation buys you time and lowers your risk, but it's not a guarantee that you're safe. In the stretch before a real patch is available, that's a fair trade, as long as everyone going in understands that's exactly the trade they're making.
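That limit is easy to show with a toy graph. In this sketch (asset names and hops are invented), the yes/no test against the known path passes, while a broader reachability check shows the target can still be reached another way.

```python
from collections import deque

def path_open(edges, path):
    """True if every hop of one specific, known attack path still exists."""
    return all(hop in edges for hop in zip(path, path[1:]))

def reachable(edges, source, target):
    """Breadth-first search: can an attacker get from source to target by any route?"""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical environment with two routes to the database.
edges = {
    ("workstation", "app-server"),
    ("app-server", "database"),
    ("workstation", "backup-server"),
    ("backup-server", "database"),
}
known_path = ["workstation", "app-server", "database"]

# The control removes the last hop of the path we know about.
controlled = edges - {("app-server", "database")}

known_path_closed = not path_open(controlled, known_path)
still_reachable = reachable(controlled, "workstation", "database")

print("known path closed:", known_path_closed)        # True
print("database still reachable:", still_reachable)   # True
```

Both answers are true at once, which is exactly the trade described above: the validated control bought time on one path, not safety overall.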

Reachability is not the same as internet-facing

I said I'd come back to reachability. I spent a lot of time at BeyondTrust, which builds industrial-strength privilege management solutions, and both there and at Brinqa I keep running into the same blind spot in customer conversations, one that quietly corrupts how organizations prioritize. A lot of teams think about reachability almost entirely in terms of external exposure, meaning whatever an attacker can touch from the internet. It's an understandable way to think, but it's not how most attackers actually get in.

Plenty of the most damaging entry points aren't externally accessible at all. The attacker doesn't port-scan their way in through the hardened perimeter. They get one of your curious users to click on something interesting. Phishing, malicious links, weaponized attachments, and the rest of the social engineering toolkit remain the workhorse techniques precisely because they sidestep the perimeter completely. A user clicks something in an email, it installs something locally, and now the vulnerability you'd filed away as internal-only and not exposed is one short hop from the crown jewels.

So when customers talk about prioritizing by reachability, they shouldn't mean prioritizing by external exposure alone. A vulnerability that's only reachable after initial access through a phished endpoint is still a reachable vulnerability.

So here's the practical shift, starting with the obvious question of how you actually get this visibility. It comes from connecting data you mostly already have: your identity and privilege relationships, your network reachability, and your vulnerability findings, stitched together into a single picture of how someone could move through your environment rather than three separate lists that never talk to each other. That combined view is what lets you map the paths an attacker would actually walk once they're inside, not just the doors facing the internet. Start from the assumption that initial access has already happened, through a phished user or a compromised credential, and ask where they can go from there. Which low-severity bugs sit on the route to a domain controller, a secrets store, or your crown-jewel data? Those are the ones to prioritize, even though each looks trivial in isolation, because chaining is exactly what turns trivial into critical. A vulnerability's severity score tells you how bad that one bug is. Its position on an attack path tells you how much it actually matters to you.
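A minimal sketch of that stitching, under heavy assumptions: the three inventories below are invented, the finding IDs and severity scores are placeholders, and "on a path" is simplified to mean the asset is both reachable from the assumed foothold and able to reach a crown jewel. A real model would also check that the finding actually enables the next hop.

```python
from collections import deque

# Three hypothetical data sources that usually live in separate tools.
network_reach = [("laptop-17", "build-server"), ("build-server", "secrets-store"),
                 ("laptop-17", "print-server")]
identity_reach = [("build-server", "domain-controller")]   # e.g. an over-privileged service account
vuln_findings = {
    "build-server": [("FINDING-A", 3.1)],
    "print-server": [("FINDING-B", 9.8)],
    "secrets-store": [("FINDING-C", 4.0)],
}

def closure(hops, starts):
    """All nodes reachable from `starts` by following `hops` (breadth-first)."""
    adjacency = {}
    for src, dst in hops:
        adjacency.setdefault(src, set()).add(dst)
    seen, queue = set(starts), deque(starts)
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

hops = network_reach + identity_reach            # one combined picture
foothold = ["laptop-17"]                         # assume initial access already happened
crown_jewels = ["domain-controller", "secrets-store"]

forward = closure(hops, foothold)                                     # where the attacker can go
backward = closure([(dst, src) for src, dst in hops], crown_jewels)   # what leads to the jewels
on_path = forward & backward

prioritized = sorted(
    (asset, finding, score)
    for asset in on_path
    for finding, score in vuln_findings.get(asset, [])
)
for asset, finding, score in prioritized:
    print(asset, finding, score)
```

In this toy data the 9.8 on the print server drops out because that asset is a dead end, while the 3.1 on the build server stays on the list because it sits on the route to both crown jewels.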

This is also why least privilege, segmentation, and blast-radius reduction aren't box-checking exercises. They're the controls that matter most in this model, because they govern what an attacker can do once they're already inside through a route nobody was ever going to firewall away. You can't WAF your way out of a user clicking a link. You can limit what that click is able to reach, escalate to, and move toward.

Why patching stays slow, and why you don't actually need to test everything

Customers often ask whether the slowness is really justified, and whether they genuinely need to test every patch for spillover. The honest answer is no, they don't need to test all of them equally. But the cost of skipping the wrong one is severe enough that blanket caution wins by default, and that's why the delay persists even when a patch is technically trivial to apply.

The real reasons patching lags are mostly not about testing. Regression risk is dangerously asymmetric. A patch that takes down a revenue-critical system can cost more than the breach it was meant to prevent, and one bad kernel patch or one CrowdStrike-grade incident is enough to make every team gun-shy, so change management gates everything uniformly. It's also rarely a clean single patch. In practice it becomes a dependency cascade, where one bump forces several others, breaks compatibility, and demands a scheduled maintenance window with downtime and approvals. The fix takes minutes. The coordination takes weeks. On top of that, you can't patch what you can't see, and asset inventory is chronically incomplete, especially across operational technology, embedded systems, and the cloud instances everyone has forgotten about. Then there are the plain operational constraints, where some systems can't take downtime, some are locked to a vendor, and some have fallen out of support entirely.

AI compresses exactly one of these problems. Automated regression testing and patch-impact analysis really can shrink the testing bottleneck. They do very little for coordination, inventory, or uptime constraints. That's the whole point. Those are the durable sources of delay, and they're precisely why proactive mitigation has to become a first-class discipline rather than an afterthought.

What this means for exposure management programs

If I pull all of this together from the vendor seat, a clear message comes through about what customers are starting to ask for. They want to stop treating compensating controls as the embarrassing thing they fall back on when they can't patch in time. In a world of machine-speed discovery and human-speed remediation, the ability to rapidly identify, deploy, and validate a mitigating control is becoming a core competency, and arguably a more strategic one than raw patch velocity. They also want to stop equating reachability with internet exposure. The more useful posture is to assume the attacker is already inside through a clicked link, and to invest accordingly in the least privilege and segmentation that decide how bad that click is allowed to get.

If I had to leave a security team with three things to act on, they would be these.

  1. See your attack paths, because you can't defend a route you can't trace.
  2. Prioritize and mitigate by where a vulnerability sits on a path, not by its severity score in isolation.
  3. Validate that the mitigation actually closed the path, rather than assuming it did.

The providers will get there in the end, and their products really will become more secure. But that's the destination, and it's going to take time. The job in front of customers right now is surviving the journey, and on this journey the patch isn't going to save them fast enough. The control they can stand up today, and verify well enough to trust, just might.

Brad Hibbert
Chief Operating Officer & Chief Strategy Officer
Brad Hibbert brings over 30 years of executive experience in the software industry, with a proven track record of aligning business and technical teams to drive growth and customer success.