Personalized Daily ArXiv Papers 2025-07-04

[gpt-4o]	Prompt	Completion	Total
Token	25732	3071	28803
Cost	$0.06	$0.03	$0.1

Total arXiv papers: 443

Total scanned papers: 249

Total relevant papers: 18

Table of contents with paper titles:

1. Position: A Theory of Deep Learning Must Include Compositional Sparsity
   Authors: David A. Danhofer, Davide D'Ascenzo, Rafael Dubach, Tomaso Poggio
2. Fast and Simplex: 2-Simplicial Attention in Triton
   Authors: Aurko Roy, Timothy Chou, Sai Surya Duvvuri, Sijia Chen, Jiecao Yu, Xiaodong Wang, Manzil Zaheer, Rohan Anil
3. Knowledge Protocol Engineering: A New Paradigm for AI in Domain-Specific Knowledge Work
   Authors: Guangwei Zhang
4. Sparse Gaussian Processes: Structured Approximations and Power-EP Revisited
   Authors: Thang D. Bui, Michalis K. Titsias
5. Toward a Robust and Generalizable Metamaterial Foundation Model
   Authors: Namjung Kim, Dongseok Lee, Jongbin Yu, Sung Woong Cho, Dosung Lee, Yesol Park, Youngjoon Hong
6. A Scalable and Quantum-Accurate Foundation Model for Biomolecular Force Field via Linearly Tensorized Quadrangle Attention
   Authors: Qun Su, Kai Zhu, Qiaolin Gou, Jintu Zhang, Renling Hu, Yurong Li, Yongze Wang, Hui Zhang, Ziyi You, Linlong Jiang, Yu Kang, Jike Wang, Chang-Yu Hsieh, Tingjun Hou
7. Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks
   Authors: Shikai Qiu, Lechao Xiao, Andrew Gordon Wilson, Jeffrey Pennington, Atish Agarwala
8. Energy-Based Transformers are Scalable Learners and Thinkers
   Authors: Alexi Gladstone, Ganesh Nanduru, Md Mofijul Islam, Peixuan Han, Hyeonjeong Ha, Aman Chadha, Yilun Du, Heng Ji, Jundong Li, Tariq Iqbal
9. Latent Chain-of-Thought? Decoding the Depth-Recurrent Transformer
   Authors: Wenquan Lu, Yuechuan Yang, Kyle Lee, Yanshu Li, Enqi Liu
10. L-VAE: Variational Auto-Encoder with Learnable Beta for Disentangled Representation
   Authors: Hazal Mogultay Ozcan, Sinan Kalkan, Fatos T. Yarman-Vural
11. Solving the Hubbard model with Neural Quantum States
   Authors: Yuntian Gu, Wenrui Li, Heng Lin, Bo Zhan, Ruichen Li, Yifei Huang, Di He, Yantao Wu, Tao Xiang, Mingpu Qin, Liwei Wang, Dingshun Lv
12. Continual Gradient Low-Rank Projection Fine-Tuning for LLMs
   Authors: Chenxu Wang, Yilin Lyu, Zicheng Sun, Liping Jing
13. Learning few-step posterior samplers by unfolding and distillation of diffusion models
   Authors: Charlesquin Kemajou Mbakam, Jonathan Spence, Marcelo Pereyra
14. MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent
   Authors: Hongli Yu, Tinghong Chen, Jiangtao Feng, Jiangjie Chen, Weinan Dai, Qiying Yu, Ya-Qin Zhang, Wei-Ying Ma, Jingjing Liu, Mingxuan Wang, Hao Zhou
15. Clarifying Before Reasoning: A Coq Prover with Structural Context
   Authors: Yanzhen Lu, Hanbin Yang, Xiaodie Wang, Ge Zhang, Biao Li, Chenxu Fu, Chao Li, Yang Yuan, Andrew Chi-Chih Yao
16. What Neuroscience Can Teach AI About Learning in Continuously Changing Environments
   Authors: Daniel Durstewitz, Bruno Averbeck, Georgia Koppe
17. Linear Attention with Global Context: A Multipole Attention Mechanism for Vision and Physics
   Authors: Alex Colagrande, Paul Caillon, Eva Feillet, Alexandre Allauzen
18. Adaptive Iterative Soft-Thresholding Algorithm with the Median Absolute Deviation
   Authors: Yining Feng, Ivan Selesnick

1. Position: A Theory of Deep Learning Must Include Compositional Sparsity

ArXiv ID: 2507.02550

Authors: David A. Danhofer, Davide D'Ascenzo, Rafael Dubach, Tomaso Poggio

Abstract: Overparametrized Deep Neural Networks (DNNs) have demonstrated remarkable success in a wide variety of domains too high-dimensional for classical shallow networks subject to the curse of dimensionality. However, open questions about the fundamental principles that govern the learning dynamics of DNNs remain. In this position paper we argue that it is the ability of DNNs to exploit the compositionally sparse structure of the target function that drives their success. As such, DNNs can leverage the property that most practically relevant functions can be composed from a small set of constituent functions, each of which relies only on a low-dimensional subset of all inputs. We show that this property is shared by all efficiently Turing-computable functions and is therefore highly likely to be present in all current learning problems. While some promising theoretical insights on questions concerned with approximation and generalization exist in the setting of compositionally sparse functions, several important questions on the learnability and optimization of DNNs remain. Completing the picture of the role of compositional sparsity in deep learning is essential to a comprehensive theory of artificial, and even general, intelligence.

Comment: The paper argues that deep networks succeed because they exploit the compositional sparsity of target functions. This matches the representation learning criterion, which covers how deep networks encode information.
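
Illustration (not from the paper): the abstract describes a compositionally sparse function as one composed from a small set of constituent functions, each relying on only a low-dimensional subset of the inputs. A minimal sketch of that idea follows, assuming a binary-tree composition over 8 inputs and an arbitrary 2-input constituent `h`. Both the tree shape and the choice of `tanh` are hypothetical, picked only to make the structure concrete.

```python
import math


def h(a, b):
    """Hypothetical constituent function: depends on only two inputs."""
    return math.tanh(a + 2.0 * b)


def compositional_f(x):
    """Evaluate a function of len(x) inputs as a binary tree of 2-input constituents.

    Returns the function value and the number of constituent evaluations.
    Assumes len(x) is a power of two (an assumption of this sketch).
    """
    level = list(x)
    n_constituents = 0
    while len(level) > 1:
        # Each node at this layer looks at exactly two values from the layer below.
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        n_constituents += len(level)
    return level[0], n_constituents


value, n_nodes = compositional_f([0.0] * 8)
print(value, n_nodes)
```

In this sketch the overall function depends on all 8 inputs, yet it is assembled from 7 constituents that each see only 2 values, which is the kind of low-dimensional structure the abstract refers to.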