#llm #ai
Created at 010323
# [Anonymous feedback](https://www.admonymous.co/louis030195)
# [[Epistemic status]]
#shower-thought
Last modified date: 010323
Commit: 0
# Related
- [[From software 1.0 to software 3.0]]
- [[Computing/Intelligence/LLMs are a lever rather than an engine]]
- [[Computing/Intelligence/LLM fine-tuning is obsolete]]
- [[Computing/Google Search vs Large language model]]
- [[Computing/Intelligence/Retrieval augmented generation]]
# TODO
> [!TODO] TODO
# Low-Rank Adaptation
- Full fine-tuning of large pre-trained language models becomes impractical as model size grows, because updating and storing all parameters demands prohibitive GPU memory and compute.
- LoRA (Low-Rank Adaptation) reduces the number of trainable parameters and the GPU memory requirement while matching or improving on full fine-tuning performance.
- LoRA injects trainable rank-decomposition matrices into each layer of the Transformer architecture; the pre-trained weights stay frozen, and only the small rank matrices are fine-tuned for a specific task or domain.
- LoRA outperforms or performs on par with full fine-tuning on various benchmarks across RoBERTa, DeBERTa, GPT-2, and GPT-3.
- The paper also provides an empirical investigation into rank-deficiency in language model adaptation, which sheds light on why LoRA works.
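The rank-decomposition idea above can be sketched in a few lines of NumPy. Shapes and values here are illustrative assumptions, not taken from the paper, except the frozen-`W`-plus-`B@A` structure, the zero-initialization of `B`, and the `alpha / r` scaling, which the paper does describe:

```python
import numpy as np

# Minimal LoRA sketch: instead of fine-tuning a frozen pre-trained weight
# W (d_out x d_in), learn a rank-r update delta_W = B @ A, where
# A is (r x d_in), B is (d_out x r), and r << min(d_out, d_in).
d_in, d_out, r = 768, 768, 8  # illustrative sizes
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init: update starts at 0

alpha = 16  # scaling hyperparameter; the update is scaled by alpha / r

def lora_forward(x):
    # Frozen path plus scaled low-rank path.
    return x @ W.T + (x @ A.T @ B.T) * (alpha / r)

x = rng.standard_normal((2, d_in))
y = lora_forward(x)

full_params = W.size           # trainable params under full fine-tuning
lora_params = A.size + B.size  # trainable params under LoRA
```

With these (assumed) sizes, LoRA trains 48x fewer parameters than full fine-tuning, and because `B` starts at zero the adapted model initially reproduces the frozen model exactly.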