Evaluation note

Hallucination detection

94% accuracy • Mavaia v1.0 • 2025-11-24

Dataset / task: Adaptive Cognition Layer safety suite (210 scenarios, mix of hallucination triggers)

Protocol: Runtime verification across the ACL safety layer, scored against human-scored ground truth; threshold tuned on a held-out set.
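As an illustration of what tuning a threshold on a held-out set can look like, the sketch below picks the cutoff that maximizes accuracy on held-out scores. It is not the Mavaia procedure: the function name `tune_threshold`, the use of a per-scenario detector score, and accuracy as the tuning objective are all assumptions made for the example.

```python
import math
from typing import Sequence


def tune_threshold(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Return the cutoff with the highest accuracy on held-out data.

    Hypothetical helper: `scores` are detector scores (higher means more
    likely hallucinated) and `labels` are human ground-truth flags.
    A scenario is flagged when its score is >= the threshold.
    """
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equal length")

    # Candidate cutoffs: every observed score, plus +inf ("flag nothing").
    candidates = sorted(set(scores)) + [math.inf]

    best_threshold, best_accuracy = candidates[0], -1.0
    for threshold in candidates:
        correct = sum((s >= threshold) == y for s, y in zip(scores, labels))
        accuracy = correct / len(scores)
        # Strict '>' keeps the lowest cutoff when several tie.
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold
```

The chosen cutoff would then be frozen before scoring the evaluation scenarios, so that the reported accuracy is not inflated by tuning on the same data.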

Sample size: n=210
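With n=210, the reported accuracy carries sampling uncertainty that can be estimated with a Wilson score interval. The sketch below is not part of the original evaluation: it assumes the 210 scenarios are independent draws, uses the rounded 94% figure because the exact count of correct scenarios is not given here, and the function name `wilson_interval` is chosen only for illustration.

```python
import math


def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    p_hat: observed proportion (here, the reported accuracy)
    n:     number of trials (here, scenarios)
    z:     normal quantile; 1.96 corresponds to roughly 95% coverage
    """
    if n <= 0 or not 0.0 <= p_hat <= 1.0:
        raise ValueError("need n > 0 and 0 <= p_hat <= 1")
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Assumption: 0.94 is the rounded headline figure, not an exact count ratio.
low, high = wilson_interval(0.94, 210)
print(f"approx. 95% interval: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval spans roughly 90% to 96.5%, which is a useful reminder of how much a 210-scenario suite can resolve.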

System version: Mavaia v1.0

Measured: 2025-11-24

Limitations

Dataset focuses on text generation; multi-modal coverage is pending.
Scenarios emphasize high-stakes refusals; benign hallucinations are underrepresented.

TR-2025-28