Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

Large foundation models are becoming ubiquitous, but training them from scratch is prohibitively expensive. Thus, efficiently adapting these powerful models to downstream tasks is increasingly important. In this paper, we study a principled finetuning paradigm -- Orthogonal Finetuning (OFT) -- for downstream task adaptation. Despite demonstrating good generalizability, OFT still uses a fairly large number of trainable parameters due to the high dimensionality of orthogonal matrices. To address this, we start by examining OFT from an information transmission perspective, and then identify a few key desiderata that enable better parameter-efficiency. Inspired by how the Cooley-Tukey fast Fourier transform algorithm enables efficient information transmission, we propose an efficient orthogonal parameterization using butterfly structures. We apply this parameterization to OFT, creating a novel parameter-efficient finetuning method, called Orthogonal Butterfly (BOFT). By subsuming OFT as a special case, BOFT introduces a generalized orthogonal finetuning framework. Finally, we conduct an extensive empirical study of adapting large vision transformers, large language models, and text-to-image diffusion models to various downstream tasks in vision and language.
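To make the abstract's construction concrete, here is a minimal NumPy sketch of a butterfly-style orthogonal parameterization (an illustrative reading of the idea, not the paper's implementation; the function names and the choice of 2x2 rotation blocks are assumptions made for this example). A dense d x d orthogonal matrix R is composed from log2(d) sparse butterfly factors, each applying disjoint 2x2 rotations across strided index pairs, so only (d/2) * log2(d) angles are trained rather than the d(d-1)/2 degrees of freedom of an unstructured orthogonal matrix; the finetuning update is then multiplicative, W_new = R W0, in the OFT style.

import numpy as np

def butterfly_factor(theta, stride, d):
    # One sparse orthogonal factor: disjoint 2x2 rotations on
    # index pairs (i, i + stride); theta holds d/2 angles.
    B = np.zeros((d, d))
    t = 0
    for start in range(0, d, 2 * stride):
        for i in range(start, start + stride):
            j = i + stride
            c, s = np.cos(theta[t]), np.sin(theta[t])
            B[i, i], B[i, j] = c, -s
            B[j, i], B[j, j] = s, c
            t += 1
    return B

def butterfly_orthogonal(thetas, d):
    # Compose log2(d) butterfly factors (strides 1, 2, 4, ...)
    # into a dense orthogonal matrix with O(d log d) parameters.
    R = np.eye(d)
    stride = 1
    for theta in thetas:
        R = butterfly_factor(theta, stride, d) @ R
        stride *= 2
    return R

d = 8
rng = np.random.default_rng(0)
thetas = [rng.standard_normal(d // 2) for _ in range(int(np.log2(d)))]
R = butterfly_orthogonal(thetas, d)
assert np.allclose(R @ R.T, np.eye(d))   # orthogonal by construction

W0 = rng.standard_normal((d, 16))        # toy "pretrained" weight
W_new = R @ W0                           # multiplicative, angle-preserving update

For d = 1024 this trains 512 * 10 = 5,120 angles per orthogonal matrix, versus 523,776 free parameters for an unstructured one; BOFT's actual factorization and block parameterization differ in detail, but this is the efficiency gap the butterfly structure exploits.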

Author(s): Weiyang Liu and Zeju Qiu and Yao Feng and Yuliang Xiu and Yuxuan Xue and Longhui Yu and Haiwen Feng and Zhen Liu and Juyeon Heo and Songyou Peng and Yandong Wen and Michael J. Black and Adrian Weller and Bernhard Schölkopf
Book Title: Proceedings of the Twelfth International Conference on Learning Representations (ICLR)
Year: 2024
Month: May
Bibtex Type: Conference Paper (inproceedings)
Event Name: The Twelfth International Conference on Learning Representations
Event Place: Vienna, Austria
State: Published
URL: https://openreview.net/forum?id=7NzgkEdGyr

BibTeX

@inproceedings{boft,
  title = {Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization},
  booktitle = {Proceedings of the Twelfth International Conference on Learning Representations (ICLR)},
  abstract = {Large foundation models are becoming ubiquitous, but training them from scratch is prohibitively expensive. Thus, efficiently adapting these powerful models to downstream tasks is increasingly important. In this paper, we study a principled finetuning paradigm -- Orthogonal Finetuning (OFT) -- for downstream task adaptation. Despite demonstrating good generalizability, OFT still uses a fairly large number of trainable parameters due to the high dimensionality of orthogonal matrices. To address this, we start by examining OFT from an information transmission perspective, and then identify a few key desiderata that enable better parameter-efficiency. Inspired by how the Cooley-Tukey fast Fourier transform algorithm enables efficient information transmission, we propose an efficient orthogonal parameterization using butterfly structures. We apply this parameterization to OFT, creating a novel parameter-efficient finetuning method, called Orthogonal Butterfly (BOFT). By subsuming OFT as a special case, BOFT introduces a generalized orthogonal finetuning framework. Finally, we conduct an extensive empirical study of adapting large vision transformers, large language models, and text-to-image diffusion models to various downstream tasks in vision and language.},
  month = may,
  year = {2024},
  author = {Liu, Weiyang and Qiu, Zeju and Feng, Yao and Xiu, Yuliang and Xue, Yuxuan and Yu, Longhui and Feng, Haiwen and Liu, Zhen and Heo, Juyeon and Peng, Songyou and Wen, Yandong and Black, Michael J. and Weller, Adrian and Sch{\"o}lkopf, Bernhard},
  url = {https://openreview.net/forum?id=7NzgkEdGyr},
}