
Layer-wise compression to expansion in ICL representations. TDNV first decreases then increases from shallow to deep layers, splitting the model into compression and expansion stages. During the compression stage, task vector accuracy increases as task information is progressively extracted from demonstration pairs. During the expansion stage, early-exit accuracy increases as output information is progressively decoded based on the input query.
We uncover a universal Compression-to-Expansion 🚀 pattern in ICL representations, revealing how LLMs extract and utilize task information across layers.
🔥🔥 More content coming soon: code, demos.
We analyze model representations for in-context learning (ICL). For each task $t$, we collect the hidden representation $h_{i,t}^{(l)}$ of sample $i$ at layer $l$.
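As a concrete sketch of how these representations might be collected (the model name, the prompt format, and the choice of the last-token position are our assumptions, not necessarily the paper's exact setup), one can read them off a HuggingFace causal LM with `output_hidden_states=True`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in model for illustration; replace with the LLM under study.
MODEL_NAME = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

@torch.no_grad()
def layer_representations(prompt: str) -> torch.Tensor:
    """Return h^(l) for every layer l, taken at the last token of the ICL prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model(**inputs, output_hidden_states=True)
    # out.hidden_states is a tuple of (1, seq_len, hidden_dim) tensors,
    # one per layer (index 0 is the embedding layer).
    return torch.stack([h[0, -1, :] for h in out.hidden_states])  # (L+1, hidden_dim)
```

Stacking these vectors over the samples $i$ of each task $t$ gives the per-layer collections that the metric below operates on.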
To measure representation quality, we propose the Task-Distance Normalized Variance (TDNV): the ratio of within-task variance to between-task distance. Lower TDNV indicates more compressed task representations.
TDNV therefore has two main components: the within-task variance, which measures how tightly the representations of each task cluster around their task mean, and the between-task distance, which measures how far apart the means of different tasks are.
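Below is a minimal sketch of how TDNV could be computed at a single layer, assuming the representations are grouped per task into arrays of shape `(num_samples, hidden_dim)`; averaging the squared distances over task pairs is our simplification and may differ from the paper's exact normalization.

```python
from itertools import combinations
import numpy as np

def tdnv(reps: dict[str, np.ndarray]) -> float:
    """Task-Distance Normalized Variance at one layer.

    reps maps task name -> array of shape (num_samples, hidden_dim)
    holding that task's per-sample representations h_{i,t}^{(l)}.
    """
    means = {t: x.mean(axis=0) for t, x in reps.items()}

    # Within-task variance: mean squared distance of samples to their task mean.
    within = np.mean([
        np.mean(np.sum((x - means[t]) ** 2, axis=1)) for t, x in reps.items()
    ])

    # Between-task distance: mean squared distance between task means
    # (simple average over task pairs -- an assumption, see lead-in).
    between = np.mean([
        np.sum((means[a] - means[b]) ** 2) for a, b in combinations(means, 2)
    ])

    return float(within / between)
```

Evaluating this quantity layer by layer traces the compression-to-expansion curve: TDNV first decreases and then increases with depth.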
Take Away: The compression-expansion phenomenon is universal across model architectures and emerges naturally during training.
Take Away: As model size increases, the phenomenon becomes more pronounced, with larger models achieving better task representation compression.
Take Away: As the noise ratio increases, TDNV rises, and once the within-task variance exceeds the between-task distance ($\text{TDNV} > 1$), ICL performance drops sharply.
Take Away: As the number of demonstrations $K$ increases, we observe an intriguing phenomenon:
• Task vectors for different tasks point in distinct directions, while task vectors within the same task align along a consistent direction.
• The variance within each task decreases.
Thus, we decompose the task vector into bias and variance components, and find that as $K$ increases both the bias and the variance decrease.
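For intuition, here is the standard bias-variance split written in the notation above (an illustrative sketch; the reference task vector $h_t^{*(l)}$, e.g. a many-shot limit, is our assumption rather than the paper's exact choice):

$$
\mathbb{E}_i\!\left[\big\|h_{i,t}^{(l)} - h_t^{*(l)}\big\|^2\right]
= \underbrace{\big\|\bar{h}_t^{(l)} - h_t^{*(l)}\big\|^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_i\!\left[\big\|h_{i,t}^{(l)} - \bar{h}_t^{(l)}\big\|^2\right]}_{\text{variance}},
\qquad
\bar{h}_t^{(l)} = \mathbb{E}_i\!\left[h_{i,t}^{(l)}\right].
$$

Both terms shrink as $K$ grows, consistent with the trend described above.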
@article{jiang2025compression,
title={From Compression to Expansion: A Layerwise Analysis of In-Context Learning},
author={Jiang, Jiachen and Dong, Yuxin and Zhou, Jinxin and Zhu, Zhihui},
journal={arXiv preprint arXiv:2505.17322},
year={2025}
}