Medical and AI experts build a benchmark for evaluating LLMs that is grounded in real-world healthcare needs.
Interventions on model-internal states are fundamental operations in many areas of AI, including model editing, steering, robustness, and interpretability. To facilitate such research, we introduce pyvene, an open-source Python library that supports customizable interventions on a range of different PyTorch modules. pyvene supports complex intervention schemes with an intuitive configuration format, and its interventions can be static or include trainable parameters. We show how pyvene provides a unified and extensible framework for performing interventions on neural models and sharing the intervened-upon models with others. We illustrate the power of the library via interpretability analyses using causal abstraction and knowledge localization. We publish our library through the Python Package Index (PyPI) and provide code, documentation, and tutorials at https://github.com/stanfordnlp/pyvene.
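To make the configuration-driven workflow concrete, below is a minimal sketch of an interchange intervention on GPT-2: it overwrites one residual-stream activation in a base run with the corresponding activation from a source run. This is a sketch assuming the dict-based IntervenableModel configuration and the VanillaIntervention type described in the pyvene documentation; exact argument names may differ from the released API.

```python
# Minimal sketch of an interchange intervention with pyvene.
# Assumes the dict-based IntervenableModel config and VanillaIntervention
# from the pyvene docs; argument names may differ slightly in practice.
import pyvene as pv
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Intervene on the block (residual stream) output of layer 8.
pv_model = pv.IntervenableModel(
    {
        "layer": 8,
        "component": "block_output",
        "intervention_type": pv.VanillaIntervention,
    },
    model=model,
)

base = tokenizer("The capital of Spain is", return_tensors="pt")
source = tokenizer("The capital of Italy is", return_tensors="pt")

# Run both prompts; copy the source activation into the base run
# at token position 3, then continue the base forward pass.
_, counterfactual_outputs = pv_model(
    base=base,
    sources=[source],
    unit_locations={"sources->base": 3},
)

# Inspect the next-token prediction after the intervention.
next_token_id = counterfactual_outputs.logits[0, -1].argmax()
print(tokenizer.decode(next_token_id))
```

Per the abstract, the same configuration format also admits interventions with trainable parameters, so a static swap like this can be replaced by a learned intervention without changing the surrounding code.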
This brief presents the findings of an experiment that measures how persuasive AI-generated propaganda is compared to foreign propaganda articles written by humans.
Stanford HAI researchers develop a new benchmark suite designed to test difference awareness in AI models.
In this paper, we evaluate the most effective error message types through a large-scale randomized controlled trial conducted in an open-access, online introductory computer science course with 8,762 students from 146 countries. We assess existing error message enhancement strategies, as well as two novel approaches of our own: (1) generating error messages using OpenAI's GPT in real time and (2) constructing error messages that incorporate the course discussion forum. By examining students' direct responses to error messages and their behavior throughout the course, we quantitatively evaluate the immediate and longer-term efficacy of different error message types. We find that students using GPT-generated error messages repeat an error 23.1% less often in the subsequent attempt, and resolve an error in 34.8% fewer additional attempts, compared to students using standard error messages. We also perform an analysis across various demographics to understand any disparities in the impact of different error message types. Our results show no significant difference in the effectiveness of GPT-generated error messages for students from varying socioeconomic and demographic backgrounds. Our findings underscore GPT-generated error messages as the most helpful error message type, especially as a universally effective intervention across demographics.
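For readers curious what "generating error messages using OpenAI's GPT in real time" might look like in practice, the sketch below rewrites a raw Python error into a beginner-friendly explanation with the OpenAI Chat Completions API. This is an illustrative sketch of the general approach, not the course's actual implementation; the model name and prompt wording are placeholders.

```python
# Illustrative sketch of real-time, LLM-enhanced error messages.
# Not the study's implementation: the model name and prompt wording
# below are placeholders chosen for the example.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment


def explain_error(code: str, error_text: str) -> str:
    """Ask the model to rephrase a raw error for a novice programmer."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {
                "role": "system",
                "content": (
                    "Explain Python error messages to novice programmers in "
                    "two or three plain-language sentences, without giving "
                    "away the full solution."
                ),
            },
            {
                "role": "user",
                "content": f"Code:\n{code}\n\nError:\n{error_text}",
            },
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    buggy_code = "print(undefined_name)"
    raw_error = "NameError: name 'undefined_name' is not defined"
    print(explain_error(buggy_code, raw_error))
```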
In this response to the National Telecommunications and Information Administration's (NTIA) request for comment on dual-use foundation AI models with widely available model weights, scholars from Stanford HAI, the Center for Research on Foundation Models (CRFM), the Regulation, Evaluation, and Governance Lab (RegLab), and other institutions urge policymakers to amplify the benefits of open foundation models while further assessing the extent of their marginal risks.