Personalized Daily ArXiv Papers 2025-05-08

[gpt-4o]	Prompt	Completion	Total
Token	28086	3939	32025
Cost	$0.07	$0.04	$0.11

Total arXiv papers: 417

Total scanned papers: 292

Total relevant papers: 12

Table of contents with paper titles:

Large Language Model Compression with Global Rank and Sparsity Optimization Authors: Changhai Zhou, Qian Qiao, Weizhong Zhang, Cheng Jin
Quiet Feature Learning in Algorithmic Tasks Authors: Prudhviraj Naidu, Zixian Wang, Leon Bergen, Ramamohan Paturi
Position: Foundation Models Need Digital Twin Representations Authors: Yiqing Shen, Hao Ding, Lalithkumar Seenivasan, Tianmin Shu, Mathias Unberath
Efficient Fine-Tuning of Quantized Models via Adaptive Rank and Bitwidth Authors: Changhai Zhou, Yuhua Zhou, Qian Qiao, Weizhong Zhang, Cheng Jin
LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection Authors: Xinyue Zeng, Haohui Wang, Junhong Lin, Jun Wu, Tyler Cody, Dawei Zhou
ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via $\alpha$-$\beta$-Divergence Authors: Guanghui Wang, Zhiyong Yang, Zitai Wang, Shi Wang, Qianqian Xu, Qingming Huang
APSQ: Additive Partial Sum Quantization with Algorithm-Hardware Co-Design Authors: Yonghao Tan, Pingcheng Dong, Yongkun Wu, Yu Liu, Xuejiao Liu, Peng Luo, Shih-Yang Liu, Xijie Huang, Dong Zhang, Luhong Liang, Kwang-Ting Cheng
AccLLM: Accelerating Long-Context LLM Inference Via Algorithm-Hardware Co-Design Authors: Yanbiao Liang, Huihong Shi, Haikuo Shao, Zhongfeng Wang
Grouped Sequency-arranged Rotation: Optimizing Rotation Transformation for Quantization for Free Authors: Euntae Choi, Sumin Song, Woosang Lim, Sungjoo Yoo
Is the end of Insight in Sight ? Authors: Jean-Michel Tucny, Mihir Durve, Sauro Succi
Sparsity is All You Need: Rethinking Biological Pathway-Informed Approaches in Deep Learning Authors: Isabella Caranzano, Corrado Pancotti, Cesare Rollo, Flavio Sartori, Pietro Liò, Piero Fariselli, Tiziana Sanavia
Information Filtering Networks: Theoretical Foundations, Generative Methodologies, and Real-World Applications Authors: Tomaso Aste

1. Large Language Model Compression with Global Rank and Sparsity Optimization

ArXiv ID: 2505.03801

Authors: Changhai Zhou, Qian Qiao, Weizhong Zhang, Cheng Jin

Abstract: Low-rank and sparse composite approximation is a natural idea to compress Large Language Models (LLMs). However, such an idea faces two primary challenges that adversely affect the performance of existing methods. The first challenge relates to the interaction and cooperation between low-rank and sparse matrices, while the second involves determining weight allocation across different layers, as redundancy varies considerably among them. To address these challenges, we propose a novel two-stage LLM compression method with the capability of global rank and sparsity optimization. It is noteworthy that the overall optimization space is vast, making comprehensive optimization computationally prohibitive. Therefore, to reduce the optimization space, our first stage utilizes robust principal component analysis to decompose the weight matrices of LLMs into low-rank and sparse components, which span the low dimensional and sparse spaces containing the resultant low-rank and sparse matrices, respectively. In the second stage, we propose a probabilistic global optimization technique to jointly identify the low-rank and sparse structures within the above two spaces. The appealing feature of our approach is its ability to automatically detect the redundancy across different layers and to manage the interaction between the sparse and low-rank components. Extensive experimental results indicate that our method significantly surpasses state-of-the-art techniques for sparsification and composite approximation.

Comment: Proposes a two-stage LLM compression method combining low-rank and sparse approximations with global optimization. This directly addresses foundational challenges in model compression and sparsity.

Relevance: 10 Novelty: 8
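
As a rough illustration of the low-rank plus sparse idea in the abstract above, the following sketch splits a small matrix W into a rank-1 part L and a k-sparse part S by alternating two projections. This is not the paper's method: it replaces robust principal component analysis and the probabilistic global optimization with a simple alternating heuristic, fixes the rank at 1, and runs on a made-up 4x4 matrix. All names (top_k_sparse, rank_one, low_rank_plus_sparse) and parameters (k, rounds, iters) are hypothetical.

```python
import math

def sub(A, B):
    """Elementwise A - B for matrices stored as lists of rows."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def fro(M):
    """Frobenius norm."""
    return math.sqrt(sum(x * x for row in M for x in row))

def top_k_sparse(M, k):
    """Keep the k largest-magnitude entries of M and zero the rest."""
    rows, cols = len(M), len(M[0])
    order = sorted(((i, j) for i in range(rows) for j in range(cols)),
                   key=lambda ij: abs(M[ij[0]][ij[1]]), reverse=True)
    S = [[0.0] * cols for _ in range(rows)]
    for i, j in order[:k]:
        S[i][j] = M[i][j]
    return S

def rank_one(M, iters=50):
    """Rank-1 approximation (M v) v^T, with unit v from power iteration on M^T M."""
    rows, cols = len(M), len(M[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        u = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]  # u = M v
        w = [sum(M[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # w = M^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    u = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return [[u[i] * v[j] for j in range(cols)] for i in range(rows)]

def low_rank_plus_sparse(W, k, rounds=10):
    """Alternate between fitting S to W - L and fitting L to W - S."""
    rows, cols = len(W), len(W[0])
    L = [[0.0] * cols for _ in range(rows)]
    for _ in range(rounds):
        S = top_k_sparse(sub(W, L), k)   # sparse part absorbs the largest residuals
        L = rank_one(sub(W, S))          # low-rank part fits what remains
    S = top_k_sparse(sub(W, L), k)
    return L, S

# Toy data (assumed): a rank-1 matrix with two large outliers added.
a, b = [1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 2.0, 0.5]
W = [[x * y for y in b] for x in a]
W[0][3] += 20.0
W[2][1] -= 15.0

L, S = low_rank_plus_sparse(W, k=2)
print("relative residual:", fro(sub(sub(W, L), S)) / fro(W))
```

Here the rank and the sparsity budget are fixed by hand for a single matrix. Choosing those budgets across layers is the allocation problem that the abstract says its second stage is meant to handle automatically.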

2. Quiet Feature Learning in Algorithmic Tasks

ArXiv ID: 2505.03997

Authors: Prudhviraj Naidu, Zixian Wang, Leon Bergen, Ramamohan Paturi

Abstract: We train Transformer-based language models on ten foundational algorithmic tasks and observe pronounced phase transitions in their loss curves that deviate from established power-law scaling trends. Over large ranges of compute, the validation loss barely improves, then abruptly decreases. Probing the models' internal representations reveals the learning of quiet features during the stagnant phase, followed by sudden acquisition of loud features that coincide with the sharp drop in loss. Our ablation experiments show that disrupting a single learned feature can dramatically degrade performance, providing evidence of their causal role in task performance. These findings challenge the prevailing assumption that next-token predictive loss reliably tracks incremental progress; instead, key internal features may be developing below the surface until they coalesce, triggering a rapid performance gain.

Comment: The paper provides insights into representation learning by analyzing how features are encoded and emerge in Transformer-based models during training. This aligns closely with the 'Representation Learning' criterion, particularly in understanding training dynamics and feature learning.

Relevance: 10 Novelty: 8
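
To make the "barely improves, then abruptly decreases" pattern from the abstract concrete, this sketch measures how much of a loss curve's total improvement falls in its single largest step. Both curves are synthetic and chosen only for illustration; they are not the paper's data, and the function name sharpest_drop is hypothetical. The paper's own analysis relies on probing internal representations, which this sketch does not attempt.

```python
def sharpest_drop(losses):
    """Return (step, share) for a loss curve with at least two points.

    step  : index i of the largest decrease, measured from losses[i] to losses[i + 1]
    share : fraction of the total improvement (first minus last) in that step
    """
    drops = [losses[i] - losses[i + 1] for i in range(len(losses) - 1)]
    step = max(range(len(drops)), key=lambda i: drops[i])
    total = losses[0] - losses[-1]
    share = drops[step] / total if total > 0 else 0.0
    return step, share

# Synthetic validation losses at successive compute checkpoints (assumed values).
plateau_then_drop = [2.00, 1.98, 1.97, 1.96, 0.40, 0.38]

# Smooth power-law decay, loss = 2 * C^(-0.3), with compute C doubling per checkpoint.
power_law = [2.0 * (2.0 ** i) ** -0.3 for i in range(6)]

print(sharpest_drop(plateau_then_drop))
print(sharpest_drop(power_law))
```

On the plateau curve nearly all of the improvement lands in one step, while on the power-law curve the largest step accounts for well under half of it.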

3. Position: Foundation Models Need Digital Twin Representations

ArXiv ID: 2505.03798

Authors: Yiqing Shen, Hao Ding, Lalithkumar Seenivasan, Tianmin Shu, Mathias Unberath

Abstract: Current foundation models (FMs) rely on token representations that directly fragment continuous real-world multimodal data into discrete tokens. They limit FMs to learning real-world knowledge and relationships purely through statistical correlation rather than leveraging explicit domain knowledge. Consequently, current FMs struggle with maintaining semantic coherence across modalities, capturing fine-grained spatial-temporal dynamics, and performing causal reasoning. These limitations cannot be overcome by simply scaling up model size or expanding datasets. This position paper argues that the machine learning community should consider digital twin (DT) representations, which are outcome-driven digital representations that serve as building blocks for creating virtual replicas of physical processes, as an alternative to the token representation for building FMs. Finally, we discuss how DT representations can address these challenges by providing physically grounded representations that explicitly encode domain knowledge and preserve the continuous nature of real-world processes.

Comment: This position paper argues for digital twin representations as an alternative to token-based representations in foundation models. It aligns with emerging trends and questions established assumptions in representation learning.