Personalized Daily ArXiv Papers 2025-04-09

[gpt-4o]	Prompt	Completion	Total
Token	34160	4448	38608
Cost	$0.09	$0.04	$0.13

Total arXiv papers: 475

Total scanned papers: 261

Total relevant papers: 24

Table of contents with paper titles:

Quantum Mechanics and Neural Networks Authors: Christian Ferko, James Halverson
Thanos: A Block-wise Pruning Algorithm for Efficient Large Language Model Compression Authors: Ivan Ilin, Peter Richtarik
Architecture independent generalization bounds for overparametrized deep ReLU networks Authors: Thomas Chen, Chun-Kai Kevin Chien, Patricia Muñoz Ewald, Andrew G. Moore
Finding Fantastic Experts in MoEs: A Unified Study for Expert Dropping Strategies and Observations Authors: Ajay Jaiswal, Jianyu Wang, Yixiao Li, Pingzhi Li, Tianlong Chen, Zhangyang Wang, Chong Wang, Ruoming Pang, Xianzhi Du
Reasoning Models Know When They're Right: Probing Hidden States for Self-Verification Authors: Anqi Zhang, Yulin Chen, Jane Pan, Chen Zhao, Aurojit Panda, Jinyang Li, He He
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization Authors: Qingyang Zhang, Haitao Wu, Changqing Zhang, Peilin Zhao, Yatao Bian
From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models Authors: Chejian Xu, Wei Ping, Peng Xu, Zihan Liu, Boxin Wang, Mohammad Shoeybi, Bo Li, Bryan Catanzaro
Lattice: Learning to Efficiently Compress the Memory Authors: Mahdi Karami, Vahab Mirrokni
MASS: MoErging through Adaptive Subspace Selection Authors: Donato Crisostomi, Alessandro Zirilli, Antonio Andrea Gargiulo, Maria Sofia Bucarelli, Simone Scardapane, Fabrizio Silvestri, Iacopo Masi, Emanuele Rodolà
Find A Winning Sign: Sign Is All We Need to Win the Lottery Authors: Junghun Oh, Sungyong Baik, Kyoung Mu Lee
Achieving binary weight and activation for LLMs using Post-Training Quantization Authors: Siqing Song, Chuang Wang, Ruiqi Wang, Yi Yang, Xuyao Zhang
The Work Capacity of Channels with Memory: Maximum Extractable Work in Percept-Action Loops Authors: Lukas J. Fiderer, Paul C. Barth, Isaac D. Smith, Hans J. Briegel
TAGC: Optimizing Gradient Communication in Distributed Transformer Training Authors: Igor Polyakov, Alexey Dukhanov, Egor Spirin
Fractal and Regular Geometry of Deep Neural Networks Authors: Simmaco Di Lillo, Domenico Marinucci, Michele Salvi, Stefano Vigogna
GOLLuM: Gaussian Process Optimized LLMs -- Reframing LLM Finetuning through Bayesian Optimization Authors: Bojana Ranković, Philippe Schwaller
Meta-Continual Learning of Neural Fields Authors: Seungyoon Woo, Junhyeog Yun, Gunhee Kim
Leanabell-Prover: Posttraining Scaling in Formal Reasoning Authors: Jingyuan Zhang, Qi Wang, Xingguang Ji, Yahui Liu, Yang Yue, Fuzheng Zhang, Di Zhang, Guorui Zhou, Kun Gai
DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding Authors: Hossein Entezari Zarch, Lei Gao, Chaoyi Jiang, Murali Annavaram
Curved representational Bregman divergences and their applications Authors: Frank Nielsen
DDT: Decoupled Diffusion Transformer Authors: Shuai Wang, Zhi Tian, Weilin Huang, Limin Wang
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation Authors: Biao Zhang, Fedor Moiseev, Joshua Ainslie, Paul Suganthan, Min Ma, Surya Bhupatiraju, Fede Lebron, Orhan Firat, Armand Joulin, Zhe Dong
Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling Authors: Benjamin Lipkin, Benjamin LeBrun, Jacob Hoover Vigly, João Loula, David R. MacIver, Li Du, Jason Eisner, Ryan Cotterell, Vikash Mansinghka, Timothy J. O'Donnell, Alexander K. Lew, Tim Vieira
Intermediate Layer Classifiers for OOD generalization Authors: Arnas Uselis, Seong Joon Oh
Measuring Déjà vu Memorization Efficiently Authors: Narine Kokhlikyan, Bargav Jayaraman, Florian Bordes, Chuan Guo, Kamalika Chaudhuri

1. Quantum Mechanics and Neural Networks

ArXiv ID: 2504.05462

Authors: Christian Ferko, James Halverson

Abstract: We demonstrate that any Euclidean-time quantum mechanical theory may be represented as a neural network, ensured by the Kosambi-Karhunen-Loève theorem, mean-square path continuity, and finite two-point functions. The additional constraint of reflection positivity, which is related to unitarity, may be achieved by a number of mechanisms, such as imposing neural network parameter space splitting or the Markov property. Non-differentiability of the networks is related to the appearance of non-trivial commutators. Neural networks acting on Markov processes are no longer Markov, but still reflection positive, which facilitates the definition of deep neural network quantum systems. We illustrate these principles in several examples using numerical implementations, recovering classic quantum mechanical results such as Heisenberg uncertainty, non-trivial commutators, and the spectrum.
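
Digest note (illustrative, not taken from the paper): one way to read the Kosambi-Karhunen-Loève statement is that a mean-square continuous process with a finite two-point function can be written as a sum of fixed basis functions with random coefficients, which resembles a single-hidden-layer network with random output weights. The sketch below uses the textbook expansion of Brownian motion on [0, 1] as an assumed stand-in example; the function names are hypothetical, and the paper's own constructions may differ.

```python
import math
import random


def kl_basis(k, t):
    """k-th Karhunen-Loeve mode of Brownian motion on [0, 1], k = 1, 2, ...

    Equals sqrt(eigenvalue) * eigenfunction, i.e.
    sqrt(2) * sin((k - 1/2) * pi * t) / ((k - 1/2) * pi).
    """
    freq = (k - 0.5) * math.pi
    return math.sqrt(2.0) * math.sin(freq * t) / freq


def truncated_covariance(s, t, num_modes=2000):
    """Two-point function E[W(s) W(t)] of the truncated expansion.

    With i.i.d. standard normal coefficients the cross terms vanish,
    so the covariance is the sum of products of the modes. As
    num_modes grows this approaches min(s, t).
    """
    return sum(kl_basis(k, s) * kl_basis(k, t) for k in range(1, num_modes + 1))


def sample_path(times, num_modes=2000, rng=None):
    """Draw one approximate Brownian path: a 'network' with sine features
    and Gaussian output weights (assumed framing for illustration)."""
    rng = rng or random.Random(0)
    coeffs = [rng.gauss(0.0, 1.0) for _ in range(num_modes)]
    return [
        sum(z * kl_basis(k, t) for k, z in enumerate(coeffs, start=1))
        for t in times
    ]


if __name__ == "__main__":
    print(truncated_covariance(0.3, 0.7))  # close to min(0.3, 0.7)
    print(sample_path([0.0, 0.25, 0.5, 0.75, 1.0]))
```

The truncation level of 2000 modes is an arbitrary choice; the error in the two-point function shrinks roughly in proportion to one over the number of modes.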

Comment: The paper explores the representation of quantum mechanical theories as neural networks, introducing theoretical insights into the intersection of quantum mechanics and neural networks. This aligns with emerging trends and foundational research.

Relevance: 9 Novelty: 9

2. Thanos: A Block-wise Pruning Algorithm for Efficient Large Language Model Compression

ArXiv ID: 2504.05346

Authors: Ivan Ilin, Peter Richtarik

Abstract: This paper presents Thanos, a novel weight-pruning algorithm designed to reduce the memory footprint and enhance the computational efficiency of large language models (LLMs) by removing redundant weights while maintaining accuracy. Thanos introduces a block-wise pruning strategy with adaptive masks that dynamically adjust to weight importance, enabling flexible sparsity patterns and structured formats, such as $n:m$ sparsity, optimized for hardware acceleration. Experimental evaluations demonstrate that Thanos achieves state-of-the-art performance in structured pruning and outperforms existing methods in unstructured pruning. By providing an efficient and adaptable approach to model compression, Thanos offers a practical solution for deploying large models in resource-constrained environments.
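
Digest note (illustrative, not the Thanos algorithm): the $n:m$ structured format mentioned in the abstract constrains every group of m consecutive weights to a fixed number of nonzeros. The sketch below applies plain magnitude pruning to show the pattern only; it assumes the convention that n is the number of weights kept per group (some papers instead count the zeros, and the two conventions coincide for 2:4). The function name is hypothetical.

```python
def prune_n_m(weights, n=2, m=4):
    """Magnitude-based n:m pruning of a flat list of weights.

    In every group of m consecutive weights, keep the n entries with the
    largest absolute value and set the rest to zero. This is a baseline
    illustration of the sparsity format, not the method from the paper.
    """
    if not 0 <= n <= m:
        raise ValueError("n must satisfy 0 <= n <= m")
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")

    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n largest-magnitude weights in this group.
        ranked = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(ranked[:n])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


if __name__ == "__main__":
    row = [0.1, -0.9, 0.5, 0.05, 1.0, 2.0, -3.0, 0.2]
    print(prune_n_m(row, n=2, m=4))
```

Because the number of nonzeros per group is fixed, the pruned weights can be stored compactly as values plus small per-group indices, which is what makes the format attractive for hardware acceleration.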

Comment: The paper introduces a novel block-wise pruning algorithm, Thanos, for efficient LLM compression, which aligns with the 'Model Compression' criterion, particularly in sparsity and pruning methods.