Simone Orsi
ai · security · software-engineering

Code is Cheap, Correctness is Expensive

AI makes writing code trivial, but verifying it works correctly under adversarial conditions is becoming the real bottleneck.

Everyone's excited about AI making code generation faster and cheaper. I think they're missing the point.

Yes, I can spin up a working API endpoint in minutes now. Yes, Claude can write decent React components. But here's what I've learned from years of vulnerability research: the hard part was never writing code that works. The hard part is writing code that works correctly under conditions you didn't anticipate.

Last month, I witnessed a vulnerability report against a fintech application pay out a $25,000 bounty. The bug wasn't in some complex cryptographic implementation. It was in input validation logic that worked perfectly for 99.9% of cases but failed catastrophically when you sent a specific sequence of Unicode characters that triggered an edge case in their parsing library.
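To make the failure mode concrete (this is a hypothetical sketch of the general pattern, not the actual fintech bug): validating raw input and then normalizing Unicode afterwards lets "compatibility" characters smuggle blocked characters past the check.

```python
import unicodedata

BLOCKED = {"<", ">", "'", '"'}

def is_safe(value):
    # Naive check: reject raw dangerous ASCII characters.
    return not any(ch in BLOCKED for ch in value)

def store(value):
    # Validation runs on the raw input...
    if not is_safe(value):
        raise ValueError("rejected")
    # ...but normalization runs afterwards. NFKC folds compatibility
    # characters into their ASCII equivalents, undoing the check.
    return unicodedata.normalize("NFKC", value)

# U+FF1C FULLWIDTH LESS-THAN SIGN sails through validation,
# then normalizes to a plain '<' downstream.
smuggled = store("\uff1cscript\uff1e")  # → "<script>"
```

The code works exactly as written for every "normal" input; only an attacker probing the gap between validation and normalization ever sees the bug.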

The developers weren't incompetent. The code worked exactly as intended. It just didn't work correctly in an adversarial context.

This is the gap that's widening. AI can generate syntactically correct, functionally working code at an incredible pace. But it fundamentally doesn't understand adversarial behavior. It cannot reliably reason about the edge cases that attackers will find, the race conditions that emerge under load, or the subtle interactions between components that create vulnerabilities.
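Race conditions are a good example of code that "works" in every single-threaded test. A minimal check-then-act sketch (a barrier stands in for unlucky scheduling, to make the race deterministic):

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw_unsafe(self, amount, sync):
        # Check-then-act without a lock: the classic TOCTOU race.
        if self.balance >= amount:
            sync.wait()            # both threads pass the check first
            self.balance -= amount

    def withdraw_safe(self, amount):
        with self._lock:           # check and act become one atomic step
            if self.balance >= amount:
                self.balance -= amount

account = Account(100)
barrier = threading.Barrier(2)
threads = [threading.Thread(target=account.withdraw_unsafe, args=(60, barrier))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Both withdrawals passed the balance check, so the account is overdrawn.
```

Every serial test of `withdraw_unsafe` passes; the bug only exists in the interleaving, which is exactly the kind of condition generated code isn't tested against.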

I've been thinking about this shift in terms of constraints and verification. We're moving from "can I write this feature?" to "can I prove this feature won't break in unexpected ways?" The bottleneck is shifting from code generation to code verification.
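One cheap step toward "prove it won't break" is checking properties over many generated inputs instead of a handful of hand-picked examples. A hand-rolled sketch of the idea (libraries like Hypothesis do this far better, with shrinking and smarter generation):

```python
import random

def rle_encode(s):
    """Run-length encode a string into (char, count) pairs."""
    out = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out

def rle_decode(pairs):
    return "".join(ch * n for ch, n in pairs)

def check_roundtrip(trials=1000):
    # Property: decoding an encoding reproduces the input, for *any*
    # string -- not just the examples someone thought to write down.
    rng = random.Random(0)
    for _ in range(trials):
        s = "".join(rng.choice("ab \n\u00e9") for _ in range(rng.randrange(20)))
        assert rle_decode(rle_encode(s)) == s, repr(s)

check_roundtrip()
```

The shift in mindset matters more than the tooling: you state an invariant the code must hold, then let the machine hunt for inputs that falsify it.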

As AI systems become more capable and autonomous, this verification challenge compounds. We're not just dealing with code that needs to work correctly once. We're dealing with systems that continuously generate and modify code, often without human oversight.

I believe we should shift our focus to making verification a first-class, continuous part of how software evolves. This means developing self-verifying architectures, continuous formal verification pipelines, and code that carries its own correctness checks. The competitive advantage won't come from generating code faster, but from creating systems that can validate their own behavior as they adapt.
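"Code that carries its own correctness checks" can start very small. A toy sketch of a runtime postcondition contract (a long way from formal verification, and all names here are illustrative), so that every call re-verifies the invariant, including calls made by generated code:

```python
import functools

def ensures(predicate, message="postcondition violated"):
    """Attach a runtime-checked postcondition to a function."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            if not predicate(result):
                # The invariant travels with the code, not with the tests.
                raise AssertionError(fn.__name__ + ": " + message)
            return result
        return wrapper
    return decorator

@ensures(lambda r: r >= 0, "balance must never go negative")
def apply_fee(balance, fee):
    return balance - fee
```

If a later rewrite of `apply_fee` (human or AI) breaks the invariant, the contract fails loudly at the call site rather than silently corrupting state downstream.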

The future isn't about writing more code faster. It's about building systems that can prove their own correctness while they evolve.