> [citation needed]
If your test has a high false positive rate, you are spending extra time and resources investigating potential problems that are not actually problems. It also undermines trust in the system. The real-world consequences of this are not hard to spot: if the fire alarm in your office or apartment has a record of going off without there ever being an actual fire, how much more likely are you to delay acting every time it goes off, or ignore it completely? People die because of this effect.
Medicine also has a real problem with false positives. Imagine testing positive for cancer, spending the next few months worrying about it and possibly getting treatments or surgeries which have their own risks (and expenses), only to find out it was all for nothing? This is also a very real thing that happens.
Now maybe the cost of false positives in the case of finding software vulnerabilities isn't quite so dire, but the effect is still real. For every positive result, someone has to spend time looking into it... otherwise, what's the point? So the LLM tells you there's a vulnerability in some part of the software, and you spend weeks trying to figure out how it works and how to fix it, only to conclude that it was never actually broken. How many times does that have to happen before people just stop taking the LLM's suggestions seriously? How much time and money are you willing to throw at imaginary problems before you conclude it's not actually worth it?
> Now you're engaging in a regression chain
Not really, no. If an attacker is aware of a vulnerability and the LLM fails to find it for an extended period of time, that could give them clues as to what makes that vulnerability difficult for the LLM to identify, and therefore where they might look for new vulnerabilities or even how to craft new malware that exploits those same blind spots. Doesn't seem very far-fetched.
And this is true regardless of how high the false positive rate is, because this is a false negative. Finding problems that aren't real and NOT finding problems that ARE real are very different types of failure.
=Smidge=