Evaluation note
Hallucination detection
94% accuracy • Mavaia v1.0 • 2025-11-24
Dataset / task: Adaptive Cognition Layer safety suite (210 scenarios, mix of hallucination triggers)
Protocol: Runtime verification across the ACL safety layer, scored against human-labeled ground truth; detection threshold tuned on a held-out set.
Sample size: n=210
System version: Mavaia v1.0
Measured: 2025-11-24
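Uncertainty sketch: as a rough illustration of how much sampling uncertainty n=210 carries, the snippet below computes a 95% Wilson score interval around the reported 94% accuracy. The helper name `wilson_interval` and the choice of the Wilson method are assumptions made for illustration, not part of the reported protocol; it also assumes each scenario is scored independently as correct or incorrect.

```python
import math


def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    p_hat: observed proportion (here, the reported accuracy)
    n:     number of scored scenarios
    z:     normal quantile (1.96 corresponds to ~95% coverage)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie in [0, 1]")

    z2 = z * z
    denom = 1.0 + z2 / n
    center = (p_hat + z2 / (2.0 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z2 / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


# Illustrative inputs taken from the note: 94% accuracy, n=210.
low, high = wilson_interval(0.94, 210)
print(f"95% Wilson interval: {low:.3f} to {high:.3f}")
```

With a sample of this size the interval spans a few percentage points on either side of the headline figure, which is worth keeping in mind when comparing against other versions or suites.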
Limitations
- Dataset focuses on text generation; multi-modal coverage pending
- Scenarios emphasize high-stakes refusals; benign hallucinations underrepresented
Sources
TR-2025-28