/vault/backup/obsidian_vault/obsidian/Documents/AI/llm-evaluation/notes/meta-evaluation/

0 directories 36 files 63 KiB total
Name                                 Size
adding-error-bars.md                 1.9 KiB
benchmark-cheater.md                 1.7 KiB
benchmarks-as-targets.md             1.7 KiB
data-contamination-time.md           1.6 KiB
detecting-pretraining-data.md        1.7 KiB
diversity-stability-tradeoffs.md     2.0 KiB
elo-uncovered.md                     1.7 KiB
emergent-abilities-mirage.md         1.8 KiB
evaluating-open-qa.md                1.5 KiB
evaluating-qa-evaluation.md          1.5 KiB
evaluating-the-evaluations.md        1.7 KiB
evaluation-guidelines.md             1.6 KiB
evaluation-science.md                1.9 KiB
faithful-model-evaluation.md         1.8 KiB
fix-benchmarking-nlu.md              1.7 KiB
helm-holistic-evaluation.md          1.7 KiB
latent-factors-bias.md               1.8 KiB
leaderboard-illusion.md              1.8 KiB
lifelong-benchmarks.md               1.7 KiB
livetradebench.md                    1.8 KiB
measuring-what-matters.md            1.9 KiB
mixeval-wisdom-of-crowd.md           1.9 KiB
multi-prompt-evaluation.md           1.7 KiB
ppi-plus-plus.md                     1.5 KiB
prediction-powered-inference.md      1.8 KiB
rankers-judges-assistants.md         1.9 KiB
ranking-unraveled.md                 1.7 KiB
re-evaluating-llm-ranking.md         1.9 KiB
reproducible-evaluation-trenches.md  1.8 KiB
sabotage-evaluations-blog.md         1.5 KiB
sabotage-evaluations.md              1.8 KiB
same-loss-better-downstream.md       1.6 KiB
score-consistency-robustness.md      2.1 KiB
synthetic-data-survey.md             1.7 KiB
text-to-image-gecko.md               1.6 KiB
theory-dynamic-benchmarks.md         1.6 KiB