#llm #ai
Created at 010323
# [Anonymous feedback](https://www.admonymous.co/louis030195)
# [[Epistemic status]]
#shower-thought
Last modified date: 010323
Commit: 0
# Related
- [[From software 1.0 to software 3.0]]
- [[Computing/Intelligence/LLMs are a lever rather than an engine]]
- [[Computing/Intelligence/LLM fine-tuning is obsolete]]
- [[Computing/Google Search vs Large language model]]
- [[Computing/Intelligence/Retrieval augmented generation]]
# TODO
> [!TODO] TODO
# Low-Rank Adaptation
- Full fine-tuning of large pre-trained language models becomes impractical as model size grows, because updating and storing all parameters demands prohibitive GPU memory and compute.
- LoRA (Low-Rank Adaptation) reduces the number of trainable parameters and the GPU memory requirement while matching or improving on full fine-tuning performance.
- LoRA injects trainable rank-decomposition matrices into each layer of the Transformer architecture; the pre-trained weights stay frozen, and only the small rank matrices are fine-tuned for a specific task or domain.
- LoRA outperforms or performs on par with full fine-tuning on various benchmarks across RoBERTa, DeBERTa, GPT-2, and GPT-3.
- The paper also provides an empirical investigation into rank-deficiency in language model adaptation, which sheds light on why LoRA works.
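The rank-decomposition idea above can be sketched in a few lines of NumPy. Shapes and values here are illustrative assumptions, not taken from the paper, except the frozen-`W`-plus-`B@A` structure, the zero-initialization of `B`, and the `alpha / r` scaling, which the paper does describe:

```python
import numpy as np

# Minimal LoRA sketch: instead of fine-tuning a frozen pre-trained weight
# W (d_out x d_in), learn a rank-r update delta_W = B @ A, where
# A is (r x d_in), B is (d_out x r), and r << min(d_out, d_in).
d_in, d_out, r = 768, 768, 8  # illustrative sizes
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init: update starts at 0

alpha = 16  # scaling hyperparameter; the update is scaled by alpha / r

def lora_forward(x):
    # Frozen path plus scaled low-rank path.
    return x @ W.T + (x @ A.T @ B.T) * (alpha / r)

x = rng.standard_normal((2, d_in))
y = lora_forward(x)

full_params = W.size           # trainable params under full fine-tuning
lora_params = A.size + B.size  # trainable params under LoRA
```

With these (assumed) sizes, LoRA trains 48x fewer parameters than full fine-tuning, and because `B` starts at zero the adapted model initially reproduces the frozen model exactly.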