Compare Runs
Diff two eval runs to detect regressions and improvements.
Select Runs
Need at least 2 runs
Complete at least two eval runs to use the comparison tool.