Events & Talks
Social Foundations of Computation
Talk
Stratis Tsirtsis
11-02-2025
Counterfactual Token Generation in Large Language Models
Imagine the following story, generated by a large language model: “Captain Lyra stood at the helm of her trusty ship, the Maelstrom’s Fury, gazing out at the endless sea. [...] Lyra’s eyes welled up with tears as she realized the bitter truth—she had sacrificed everything for fleeting riches, and lost the love of her crew, her family, and herself.” Now, let’s conduct a thought experiment: how would the story have unfolded if the model had chosen “Captain Maeve” as the protagonist instead?
Moritz Hardt
Social Foundations of Computation
Talk
Stratis Tsirtsis
11-02-2025 - 11-03-2025
Counterfactual Token Generation in Large Language Models
Moritz Hardt
Social Foundations of Computation
Talk
Evimaria Terzi
01-07-2024
Beyond accuracy: understanding the performance of LLMs on exams designed for humans
Many recent studies of LLM performance have focused on the ability of LLMs to achieve outcomes comparable to humans on academic and professional exams. However, it is not clear whether such studies shed light on the extent to which models show reasoning ability, and there is controversy about the significance and implications of such results. We seek to look more deeply into the question of how and whether the performance of LLMs on exams designed for humans reflects true aptitude inherent in LLMs. We do so by making use of the tools of psychometrics, which are designed to perform meanin...
Ana-Andreea Stoica