When Fluency Masks Failure: Examining a Multimodal Reasoning Breakdown in GPT-5
AI in healthcare does not fail loudly. It does not throw an error message. It does not say “I don’t know.” It speaks calmly. It explains confidently. It sounds clinical. And sometimes, it is completely wrong.

As part of our Human-in-the-Loop red-teaming evaluations at iCliniq, we test models in