
Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks

BenchBench
BenchBench is a Python package that makes it easy for practitioners to evaluate the diversity and stability of multi-task benchmarks.

Members

Social Foundations of Computation
Moritz Hardt
Social Foundations of Computation
  • Director

Publications

Zhang, G., Hardt, M. Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks. In Proceedings of the 41st International Conference on Machine Learning (ICML 2024), PMLR, July 2024 (published conference paper).