Commercial LLMs refuse 65% of critical test cases. Polyphemus doesn't. Find jailbreaks, prompt injections, and goal hijacking before your users do.