Assay

The Proof

Benchmarks, audits, and verification results

Case Study

Assay vs OpenClaw — 5,000 files, 235 claims, 15.8% compliance

100% pass@5 on HumanEval (164/164)

Method         k=1     k=3     k=5
Baseline       86.6%   --      --
LLM-as-Judge   98.2%   99.4%   97.2%
Assay          98.8%   100%    100%
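
For readers unfamiliar with the metric: pass@k counts a problem as solved if at least one of k sampled solutions passes its tests. The sketch below shows the commonly used unbiased estimator, given n samples per problem of which c pass. The page does not say how its own figures were computed, so this is an illustration under that assumption, and the function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem as 1 - C(n - c, k) / C(n, k).

    n: samples generated, c: samples that passed, k: attempt budget.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed,
    # so the estimate is exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

Under this definition, a 100% pass@5 score over 164 problems means every problem had at least one passing solution among its five attempts.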

300 real software engineering tasks

Baseline k=1: 18.3%
LUCID k=1: 25% (+36.6%)
LUCID best: 30.3% (+65.6%)
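
The percentage gains quoted above are consistent with relative improvement over the 18.3% baseline rather than a difference in percentage points. A quick check, assuming that interpretation (the helper name `relative_gain` is illustrative):

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


baseline_rate = 18.3  # Baseline k=1, percent of tasks resolved
for label, rate in [("LUCID k=1", 25.0), ("LUCID best", 30.3)]:
    gain = relative_gain(baseline_rate, rate)
    print(f"{label}: {rate}% resolved, {gain:+.1f}% vs baseline")
# LUCID k=1: 25.0% resolved, +36.6% vs baseline
# LUCID best: 30.3% resolved, +65.6% vs baseline
```

In absolute terms the same results are gains of 6.7 and 12.0 percentage points.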

Won 7 of 10 head-to-head tasks

Baseline: 21.6/30
Forward Assay: 27.2/30 (won 7 of 10)