Personalized Daily ArXiv Papers 2025-09-04

[gpt-4o]	Prompt	Completion	Total
Token	30602	3232	33834
Cost	$0.08	$0.03	$0.11

Total arXiv papers: 423

Total scanned papers: 234

Total relevant papers: 15

Table of contents with paper titles:

Structure Transfer: an Inference-Based Calculus for the Transformation of Representations Authors: Daniel Raggi, Gem Stapleton, Mateja Jamnik, Aaron Stockdill, Grecia Garcia Garcia, Peter C-H. Cheng
Binary Quantization For LLMs Through Dynamic Grouping Authors: Xinzhe Zheng, Zhen-Qun Yang, Haoran Xie, S. Joe Qin, Arlene Chen, Fangzhen Lin
SurGBSA: Learning Representations From Molecular Dynamics Simulations Authors: Derek Jones, Yue Yang, Felice C. Lightstone, Niema Moshiri, Jonathan E. Allen, Tajana S. Rosing
LExI: Layer-Adaptive Active Experts for Efficient MoE Model Inference Authors: Krishna Teja Chitty-Venkata, Sandeep Madireddy, Murali Emani, Venkatram Vishwanath
Fast and Accurate SVD-Type Updating in Streaming Data Authors: Johannes J. Brust, Michael A. Saunders
DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling Authors: Yubo Gao, Renbo Tu, Gennady Pekhimenko, Nandita Vijaykumar
Learning Laplacian Eigenvectors: a Pre-training Method for Graph Neural Networks Authors: Howard Dai, Nyambura Njenga, Benjamin Whitsett, Catherine Ma, Darwin Deng, Sara de \'Angel, Alexandre Van Tassel, Siddharth Viswanath, Ryan Pellico, Ian Adelstein, Smita Krishnaswamy
Geometric Foundations of Tuning without Forgetting in Neural ODEs Authors: Erkan Bayram, Mohamed-Ali Belabbas, Tamer Ba\c{s}ar
FoMEMO: Towards Foundation Models for Expensive Multi-objective Optimization Authors: Yiming Yao, Fei Liu, Liang Zhao, Xi Lin, Qingfu Zhang
Structured Basis Function Networks: Loss-Centric Multi-Hypothesis Ensembles with Controllable Diversity Authors: Alejandro Rodriguez Dominguez, Muhammad Shahzad, Xia Hong
Deep Research is the New Analytics System: Towards Building the Runtime for AI-Driven Analytics Authors: Matthew Russo, Tim Kraska
LimiX: Unleashing Structured-Data Modeling Capability for Generalist Intelligence Authors: Xingxuan Zhang, Gang Ren, Han Yu, Hao Yuan, Hui Wang, Jiansheng Li, Jiayun Wu, Lang Mo, Li Mao, Mingchao Hao, Ningbo Dai, Renzhe Xu, Shuyang Li, Tianyang Zhang, Yue He, Yuanrui Wang, Yunjia Zhang, Zijing Xu, Dongzhe Li, Fang Gao, Hao Zou, Jiandong Liu, Jiashuo Liu, Jiawei Xu, Kaijie Cheng, Kehan Li, Linjun Zhou, Qing Li, Shaohua Fan, Xiaoyu Lin, Xinyan Han, Xuanyue Li, Yan Lu, Yuan Xue, Yuanyuan Jiang, Zimu Wang, Zhenlei Wang, Peng Cui
Unlearning That Lasts: Utility-Preserving, Robust, and Almost Irreversible Forgetting in LLMs Authors: Naman Deep Singh, Maximilian M\"uller, Francesco Croce, Matthias Hein
DrDiff: Dynamic Routing Diffusion with Hierarchical Attention for Breaking the Efficiency-Quality Trade-off Authors: Jusheng Zhang, Yijia Fan, Kaitong Cai, Zimeng Huang, Xiaofei Sun, Jian Wang, Chengpei Tang, Keze Wang
Contrastive clustering based on regular equivalence for influential node identification in complex networks Authors: Yanmei Hu, Yihang Wu, Bing Sun, Xue Yue, Biao Cai, Xiangtao Li, Yang Chen

1. Structure Transfer: an Inference-Based Calculus for the Transformation of Representations

ArXiv ID: 2509.03249

Authors: Daniel Raggi, Gem Stapleton, Mateja Jamnik, Aaron Stockdill, Grecia Garcia Garcia, Peter C-H. Cheng

Abstract: Representation choice is of fundamental importance to our ability to communicate and reason effectively. A major unsolved problem, addressed in this paper, is how to devise \textit{representational-system (RS) agnostic} techniques that drive representation transformation and choice. We present a novel calculus, called \textit{structure transfer}, that enables representation transformation across diverse RSs. Specifically, given a \textit{source} representation drawn from a source RS, the rules of structure transfer allow us to generate a \textit{target} representation for a target RS. The generality of structure transfer comes in part from its ability to ensure that the source representation and the generated target representation satisfy \textit{any} specified relation (such as semantic equivalence). This is done by exploiting \textit{schemas}, which encode knowledge about RSs. Specifically, schemas can express \textit{preservation of information} across relations between any pair of RSs, and this knowledge is used by structure transfer to derive a structure for the target representation which ensures that the desired relation holds. We formalise this using Representational Systems Theory~\cite{raggi2022rst}, building on the key concept of a \textit{construction space}. The abstract nature of construction spaces grants them the generality to model RSs of diverse kinds, including formal languages, geometric figures and diagrams, as well as informal notations. Consequently, structure transfer is a system-agnostic calculus that can be used to identify alternative representations in a wide range of practical settings.

Comment: The paper introduces a novel calculus for representation transformation across diverse representational systems, which aligns with the core topic of representation learning.

Relevance: 9 Novelty: 8

2. Binary Quantization For LLMs Through Dynamic Grouping

ArXiv ID: 2509.03054

Authors: Xinzhe Zheng, Zhen-Qun Yang, Haoran Xie, S. Joe Qin, Arlene Chen, Fangzhen Lin

Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of Natural Language Processing (NLP) tasks, but require substantial memory and computational resources. Binary quantization, which compresses model weights from 16-bit Brain Float to 1-bit representations in {-1, 1}, offers significant reductions in storage and inference costs. However, such aggressive quantization often leads to notable performance degradation compared to more conservative 4-bit quantization methods. In this research, we propose a novel optimization objective tailored for binary quantization, along with three algorithms designed to realize it effectively. Our method enhances blocked quantization by dynamically identifying optimal unstructured sub-matrices through adaptive grouping strategies. Experimental results demonstrate that our approach achieves an average bit length of just 1.007 bits, while maintaining high model quality. Specifically, our quantized LLaMA 3.2 3B model attains a perplexity of 8.23, remarkably close to the original 7.81, and surpasses previous SOTA BiLLM with a perplexity of only 123.90. Furthermore, our method is competitive with SOTA 4-bit approaches such as GPTQ in both performance and efficiency. The compression process is highly efficient, requiring only 14 seconds to quantize the full LLaMA 3.2 3B weights on a single CPU core, with the entire process completing in under 100 minutes and exhibiting embarrassingly parallel properties. Code - https://github.com/johnnyzheng0636/WGM_bi_quan

Comment: The paper presents a novel optimization objective for binary quantization in LLMs, which is relevant to model compression.

Relevance: 9 Novelty: 8

3. SurGBSA: Learning Representations From Molecular Dynamics Simulations

ArXiv ID: 2509.03084

Authors: Derek Jones, Yue Yang, Felice C. Lightstone, Niema Moshiri, Jonathan E. Allen, Tajana S. Rosing

Abstract: Self-supervised pretraining from static structures of drug-like compounds and proteins enable powerful learned feature representations. Learned features demonstrate state of the art performance on a range of predictive tasks including molecular properties, structure generation, and protein-ligand interactions. The majority of approaches are limited by their use of static structures and it remains an open question, how best to use atomistic molecular dynamics (MD) simulations to develop more generalized models to improve prediction accuracy for novel molecular structures. We present SURrogate mmGBSA (SurGBSA) as a new modeling approach for MD-based representation learning, which learns a surrogate function of the Molecular Mechanics Generalized Born Surface Area (MMGBSA). We show for the first time the benefits of physics-informed pre-training to train a surrogate MMGBSA model on a collection of over 1.4 million 3D trajectories collected from MD simulations of the CASF-2016 benchmark. SurGBSA demonstrates a dramatic 6,497x speedup versus a traditional physics-based single-point MMGBSA calculation while nearly matching single-point MMGBSA accuracy on the challenging pose ranking problem for identification of the correct top pose (-0.4% difference). Our work advances the development of molecular foundation models by showing model improvements when training on MD simulations. Models, code and training data are made publicly available.

Comment: The paper presents a new modeling approach for MD-based representation learning, which aligns with the core topic of representation learning.

Relevance: 9 Novelty: 8

4. LExI: Layer-Adaptive Active Experts for Efficient MoE Model Inference

ArXiv ID: 2509.02753

Authors: Krishna Teja Chitty-Venkata, Sandeep Madireddy, Murali Emani, Venkatram Vishwanath

Abstract: Mixture-of-Experts (MoE) models scale efficiently by activating only a subset of experts per token, offering a computationally sparse alternative to dense architectures. While prior post-training optimizations, such as inter- and intra-expert pruning, reduce memory usage they provide limited gains in inference-time compute efficiency. Moreover, existing MoE architectures typically activate a fixed number of experts uniformly across all layers, resulting in redundant computation and suboptimal performance. In this work, we first demonstrate that MoE pruning strategies improve only the memory footprint but do not significantly improve inference performance on GPU using optimized frameworks such as vLLM. To address this, we introduce LExI, a data-free optimization technique that determines the optimal number of active experts per layer in a pretrained MoE model. LExI leverages only the model weights to estimate the relative importance of each layer and adaptively assigns the number of active experts accordingly per layer. Experiments on state-of-the-art language and vision MoE benchmarks demonstrate that LExI significantly outperforms traditional MoE pruning approaches in terms of inference efficiency with negligible accuracy loss. For example, using LExI, Qwen1.5-MoE achieves the same throughput on Nvidia H100 GPU with 10% better accuracy than traditional expert pruning.

Comment: The paper introduces LExI, a novel optimization technique for MoE models, which aligns with the core topic of model architecture and efficiency improvements.

Relevance: 9 Novelty: 8

5. Fast and Accurate SVD-Type Updating in Streaming Data

ArXiv ID: 2509.02840

Authors: Johannes J. Brust, Michael A. Saunders

Abstract: For a datastream, the change over a short interval is often of low rank. For high throughput information arranged in matrix format, recomputing an optimal SVD approximation after each step is typically prohibitive. Instead, incremental and truncated updating strategies are used, which may not scale for large truncation ranks. Therefore, we propose a set of efficient new algorithms that update a bidiagonal factorization, and which are similarly accurate as the SVD methods. In particular, we develop a compact Householder-type algorithm that decouples a sparse part from a low-rank update and has about half the memory requirements of standard bidiagonalization methods. A second algorithm based on Givens rotations has only about 10 flops per rotation and scales quadratically with the problem size, compared to a typical cubic scaling. The algorithm is therefore effective for processing high-throughput updates, as we demonstrate in tracking large subspaces of recommendation systems and networks, and when compared to well known software such as LAPACK or the incremental SVD.

Comment: The paper presents new algorithms for efficient SVD-type updating, which is relevant to model compression through low-rank approaches.

Relevance: 9 Novelty: 8

6. DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling

ArXiv ID: 2509.03472

Authors: Yubo Gao, Renbo Tu, Gennady Pekhimenko, Nandita Vijaykumar

Abstract: Differentially-Private SGD (DP-SGD) is a powerful technique to protect user privacy when using sensitive data to train neural networks. During training, converting model weights and activations into low-precision formats, i.e., quantization, can drastically reduce training times, energy consumption, and cost, and is thus a widely used technique. In this work, we demonstrate that quantization causes significantly higher accuracy degradation in DP-SGD compared to regular SGD. We observe that this is caused by noise injection in DP-SGD, which amplifies quantization variance, leading to disproportionately large accuracy degradation. To address this challenge, we present QPQuant, a dynamic quantization framework that adaptively selects a changing subset of layers to quantize at each epoch. Our method combines two key ideas that effectively reduce quantization variance: (i) probabilistic sampling of the layers that rotates which layers are quantized every epoch, and (ii) loss-aware layer prioritization, which uses a differentially private loss sensitivity estimator to identify layers that can be quantized with minimal impact on model quality. This estimator consumes a negligible fraction of the overall privacy budget, preserving DP guarantees. Empirical evaluations on ResNet18, ResNet50, and DenseNet121 across a range of datasets demonstrate that DPQuant consistently outperforms static quantization baselines, achieving near Pareto-optimal accuracy-compute trade-offs and up to 2.21x theoretical throughput improvements on low-precision hardware, with less than 2% drop in validation accuracy.

Comment: The paper presents a dynamic quantization framework for differentially-private model training, which is relevant to model compression and efficiency.

Relevance: 9 Novelty: 7

7. Learning Laplacian Eigenvectors: a Pre-training Method for Graph Neural Networks

ArXiv ID: 2509.02803

Authors: Howard Dai, Nyambura Njenga, Benjamin Whitsett, Catherine Ma, Darwin Deng, Sara de \'Angel, Alexandre Van Tassel, Siddharth Viswanath, Ryan Pellico, Ian Adelstein, Smita Krishnaswamy

Abstract: We propose a novel framework for pre-training Graph Neural Networks (GNNs) by inductively learning Laplacian eigenvectors. Traditional Message Passing Neural Networks (MPNNs) often struggle to capture global and regional graph structure due to over-smoothing risk as network depth increases. Because the low-frequency eigenvectors of the graph Laplacian matrix encode global information, pre-training GNNs to predict these eigenvectors encourages the network to naturally learn large-scale structural patterns over each graph. Empirically, we show that models pre-trained via our framework outperform baseline models on a variety of graph structure-based tasks. While most existing pre-training methods focus on domain-specific tasks like node or edge feature reconstruction, our self-supervised pre-training framework is structure-based and highly flexible. Eigenvector-learning can be applied to all graph-based datasets, and can be used with synthetic features when task-specific data is sparse.

Comment: The paper proposes a novel pre-training method for GNNs by learning Laplacian eigenvectors, which is relevant to representation learning.

Relevance: 8 Novelty: 8

8. Geometric Foundations of Tuning without Forgetting in Neural ODEs

ArXiv ID: 2509.03474

Authors: Erkan Bayram, Mohamed-Ali Belabbas, Tamer Ba\c{s}ar

Abstract: In our earlier work, we introduced the principle of Tuning without Forgetting (TwF) for sequential training of neural ODEs, where training samples are added iteratively and parameters are updated within the subspace of control functions that preserves the end-point mapping at previously learned samples on the manifold of output labels in the first-order approximation sense. In this letter, we prove that this parameter subspace forms a Banach submanifold of finite codimension under nonsingular controls, and we characterize its tangent space. This reveals that TwF corresponds to a continuation/deformation of the control function along the tangent space of this Banach submanifold, providing a theoretical foundation for its mapping-preserving (not forgetting) during the sequential training exactly, beyond first-order approximation.

Comment: The paper provides a theoretical foundation for Tuning without Forgetting in neural ODEs, which is relevant to emerging trends in model architecture.

Relevance: 8 Novelty: 8

9. FoMEMO: Towards Foundation Models for Expensive Multi-objective Optimization

ArXiv ID: 2509.03244

Authors: Yiming Yao, Fei Liu, Liang Zhao, Xi Lin, Qingfu Zhang

Abstract: Expensive multi-objective optimization is a prevalent and crucial concern in many real-world scenarios, where sample-efficiency is vital due to the limited evaluations to recover the true Pareto front for decision making. Existing works either involve rebuilding Gaussian process surrogates from scratch for each objective in each new problem encountered, or rely on extensive past domain experiments for pre-training deep learning models, making them hard to generalize and impractical to cope with various emerging applications in the real world. To address this issue, we propose a new paradigm named FoMEMO (Foundation Models for Expensive Multi-objective Optimization), which enables the establishment of a foundation model conditioned on any domain trajectory and user preference, and facilitates fast in-context optimization based on the predicted preference-wise aggregation posteriors. Rather than accessing extensive domain experiments in the real world, we demonstrate that pre-training the foundation model with a diverse set of hundreds of millions of synthetic data can lead to superior adaptability to unknown problems, without necessitating any subsequent model training or updates in the optimization process. We evaluate our method across a variety of synthetic benchmarks and real-word applications, and demonstrate its superior generality and competitive performance compared to existing methods.

Comment: The paper introduces a new paradigm for foundation models in multi-objective optimization, which aligns with foundational research in AI for Science.