Add Custom Heads
Transformer Heads extends the capabilities of LLMs by attaching additional heads that produce
their own outputs.
These additional heads can range from simple linear probes for understanding transformer
processing to complex configurations for multi-task learning and task-specific fine-tuning (e.g.,
sentiment classification or regression).
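To make the "linear probe" idea concrete, here is a toy numpy sketch (not the transformer-heads API): we treat a random matrix as stand-in hidden states from a frozen intermediate layer, attach a single linear head, and train only the head with gradient descent. All names and data here are synthetic for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for frozen hidden states from an intermediate transformer layer:
# 200 tokens/examples with hidden size 16. In practice you would collect
# these from a real model (e.g. via output_hidden_states=True in HF).
hidden = rng.normal(size=(200, 16))

# Synthetic binary labels that depend linearly on the hidden states,
# so a linear probe can recover the signal.
true_w = rng.normal(size=16)
labels = (hidden @ true_w > 0).astype(float)

# The "head": one linear layer + sigmoid, trained with gradient descent
# while the backbone (the hidden states) stays frozen.
w = np.zeros(16)
b = 0.0
lr = 0.5
for _ in range(300):
    logits = hidden @ w + b
    probs = 1.0 / (1.0 + np.exp(-logits))
    grad = probs - labels                      # dLoss/dlogits for BCE
    w -= lr * hidden.T @ grad / len(labels)
    b -= lr * grad.mean()

accuracy = ((hidden @ w + b > 0) == (labels > 0.5)).mean()
print(f"probe accuracy: {accuracy:.2f}")
```

High probe accuracy here just means the labels are linearly decodable from the hidden states, which is exactly the question a probe is meant to answer.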
Out of the box, Transformer Heads supports several models, including Mistral-7b, Llama 2 (all
sizes), and GPT-2.
I found it on Reddit and I quite like this crazy idea. I recommend checking their notebooks to get
a better idea of how it works.
💻 GitHub: https://lnkd.in/edB8-gv9
It originally started as support for adding flexible custom heads to all models in HF Transformers.
The idea was later extended to cover any PEFT adapters as well. Apart from regular models, there
will be adapters too. See below for Llama ..
Prithivi Da Oh cool, thanks, so you can insert additional heads at any layer in the network with
Adapters. Still, Transformer Heads looks a bit more straightforward and better documented in this area.
My point was that the idea itself is at least 3-4 years old. I have used Adapters many times and it
works like a charm.
It seems Adapters could do this all along, yet I have never once seen an example or even a simple
notebook demonstrating it! This "Joint Multitask Learning" notebook is just amazing! We need
more docs and examples on Adapters if they are all capable of doing this:
https://github.com/center-for-humans-and-machines/transformer-heads/blob/main/notebooks/gpt2/joint_multitask_learning.ipynb
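The core shape of joint multitask learning is simple: one shared backbone feeds several task heads, and each step combines the per-task losses into one update. Below is a minimal numpy sketch of that loop under toy assumptions (a frozen feature matrix stands in for the shared backbone; the two tasks, a binary classification and a regression, are synthetic). It is an illustration of the training structure, not the notebook's actual code.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for features from a shared, frozen transformer backbone.
# In a real multitask setup, both heads would read hidden states
# produced by the same trunk.
feats = rng.normal(size=(256, 8))
w_true_cls = rng.normal(size=8)
w_true_reg = rng.normal(size=8)
y_cls = (feats @ w_true_cls > 0).astype(float)  # task 1: binary labels
y_reg = feats @ w_true_reg                      # task 2: regression target

w1 = np.zeros(8)  # classification head
w2 = np.zeros(8)  # regression head
lr, n = 0.2, len(feats)

for _ in range(500):
    # One forward pass over the shared features, two task losses,
    # one joint update: the essential shape of multitask training.
    logits = feats @ w1
    probs = 1.0 / (1.0 + np.exp(-logits))
    pred = feats @ w2
    g1 = feats.T @ (probs - y_cls) / n        # BCE gradient wrt w1
    g2 = feats.T @ (2 * (pred - y_reg)) / n   # MSE gradient wrt w2
    w1 -= lr * g1
    w2 -= lr * g2

acc = ((feats @ w1 > 0) == (y_cls > 0.5)).mean()
mse = ((feats @ w2 - y_reg) ** 2).mean()
print(f"cls accuracy: {acc:.2f}, reg mse: {mse:.4f}")
```

With a trainable backbone the per-task gradients would also flow into the shared weights, which is where multitask interference and transfer actually happen; the linked notebook handles that part with real transformer layers.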