Personalized Daily ArXiv Papers 2025-08-21

[gpt-4o]	Prompt	Completion	Total
Token	24325	2865	27190
Cost	$0.06	$0.03	$0.09

Total arXiv papers: 402

Total scanned papers: 251

Total relevant papers: 17

Table of contents with paper titles:

Neuro-inspired Ensemble-to-Ensemble Communication Primitives for Sparse and Efficient ANNs Authors: Orestis Konstantaropoulos, Stelios Manolis Smirnakis, Maria Papadopouli
GLASS: Test-Time Acceleration for LLMs via Global-Local Neural Importance Aggregation Authors: Amirmohsen Sattarifard, Sepehr Lavasani, Ehsan Imani, Kunlin Zhang, Hanlin Xu, Fengyu Sun, Negar Hassanpour, Chao Gao
Source-Guided Flow Matching Authors: Zifan Wang, Alice Harting, Matthieu Barreau, Michael M. Zavlanos, Karl H. Johansson
Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs Authors: Haokun Lin, Haobo Xu, Yichen Wu, Ziyu Guo, Renrui Zhang, Zhichao Lu, Ying Wei, Qingfu Zhang, Zhenan Sun
Amortized Bayesian Meta-Learning for Low-Rank Adaptation of Large Language Models Authors: Liyi Zhang, Jake Snell, Thomas L. Griffiths
Disentangling concept semantics via multilingual averaging in Sparse Autoencoders Authors: Cliff O'Reilly, Ernesto Jimenez-Ruiz, Tillman Weyde
Surya: Foundation Model for Heliophysics Authors: Sujit Roy, Johannes Schmude, Rohit Lal, Vishal Gaur, Marcus Freitag, Julian Kuehnert, Theodore van Kessel, Dinesha V. Hegde, Andrés Muñoz-Jaramillo, Johannes Jakubik, Etienne Vos, Kshitiz Mandal, Ata Akbari Asanjan, Joao Lucas de Sousa Almeida, Amy Lin, Talwinder Singh, Kang Yang, Chetraj Pandey, Jinsu Hong, Berkay Aydin, Thorsten Kurth, Ryan McGranaghan, Spiridon Kasapis, Vishal Upendran, Shah Bahauddin, Daniel da Silva, Nikolai V. Pogorelov, Campbell Watson, Manil Maskey, Madhulika Guhathakurta, Juan Bernabe-Moreno, Rahul Ramachandran
Graph Structure Learning with Temporal Graph Information Bottleneck for Inductive Representation Learning Authors: Jiafeng Xiong, Rizos Sakellariou
STAS: Spatio-Temporal Adaptive Computation Time for Spiking Transformers Authors: Donghwa Kang, Doohyun Kim, Sang-Ki Ko, Jinkyu Lee, Brent ByungHoon Kang, Hyeongboo Baek
Understanding Data Influence with Differential Approximation Authors: Haoru Tan, Sitong Wu, Xiuzhe Wu, Wang Wang, Bo Zhao, Zeke Xie, Gui-Song Xia, Xiaojuan Qi
Logical Expressivity and Explanations for Monotonic GNNs with Scoring Functions Authors: Matthew Morris, David J. Tena Cucala, Bernardo Cuenca Grau
Graph Concept Bottleneck Models Authors: Haotian Xu, Tsui-Wei Weng, Lam M. Nguyen, Tengfei Ma
Parameter-Aware Ensemble SINDy for Interpretable Symbolic SGS Closure Authors: Hanseul Kang, Shervin Karimkashi, Ville Vuorinen
ECHO: Frequency-aware Hierarchical Encoding for Variable-length Signal Authors: Yucong Zhang, Juan Liu, Ming Li
From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery Authors: Jiaqi Wei, Yuejin Yang, Xiang Zhang, Yuhan Chen, Xiang Zhuang, Zhangyang Gao, Dongzhan Zhou, Guangshuai Wang, Zhiqiang Gao, Juntai Cao, Zijie Qiu, Xuming He, Qiang Zhang, Chenyu You, Shuangjia Zheng, Ning Ding, Wanli Ouyang, Nanqing Dong, Yu Cheng, Siqi Sun, Lei Bai, Bowen Zhou
Multi-view Graph Condensation via Tensor Decomposition Authors: Nícolas Roque dos Santos, Dawon Ahn, Diego Minatel, Alneu de Andrade Lopes, Evangelos E. Papalexakis
SBGD: Improving Graph Diffusion Generative Model via Stochastic Block Diffusion Authors: Junwei Su, Shan Wu

1. Neuro-inspired Ensemble-to-Ensemble Communication Primitives for Sparse and Efficient ANNs

ArXiv ID: 2508.14140

Authors: Orestis Konstantaropoulos, Stelios Manolis Smirnakis, Maria Papadopouli

Abstract: The structure of biological neural circuits (modular, hierarchical, and sparsely interconnected) reflects an efficient trade-off between wiring cost, functional specialization, and robustness. These principles offer valuable insights for artificial neural network (ANN) design, especially as networks grow in depth and scale. Sparsity, in particular, has been widely explored for reducing memory and computation, improving speed, and enhancing generalization. Motivated by systems neuroscience findings, we explore how patterns of functional connectivity in the mouse visual cortex, specifically ensemble-to-ensemble communication, can inform ANN design. We introduce G2GNet, a novel architecture that imposes sparse, modular connectivity across feedforward layers. Despite having significantly fewer parameters than fully connected models, G2GNet achieves superior accuracy on standard vision benchmarks. To our knowledge, this is the first architecture to incorporate biologically observed functional connectivity patterns as a structural bias in ANN design. We complement this static bias with a dynamic sparse training (DST) mechanism that prunes and regrows edges during training. We also propose a Hebbian-inspired rewiring rule based on activation correlations, drawing on principles of biological plasticity. G2GNet achieves up to 75% sparsity while improving accuracy by up to 4.3% on benchmarks, including Fashion-MNIST, CIFAR-10, and CIFAR-100, outperforming dense baselines with far fewer computations.

Comment: The paper presents a novel architecture inspired by biological neural circuits, focusing on sparsity and efficiency, relevant to model compression and architecture.

Relevance: 9 Novelty: 8
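
Illustrative sketch (not taken from the paper): the abstract describes sparse, modular connectivity between groups of units in adjacent layers. The snippet below builds a generic block-structured connectivity mask under the assumption that both layers are split into equally sized groups and each output group listens to a fixed number of randomly chosen input groups. The function names and the random group assignment are hypothetical and are not G2GNet's actual wiring rule.

```python
import random


def group_to_group_mask(n_in, n_out, n_groups, fan_in_groups, seed=0):
    """Return a binary mask[n_out][n_in] with block-structured connectivity.

    Assumption: each output group is wired to `fan_in_groups` input groups
    chosen at random; everything else is masked out.
    """
    assert n_in % n_groups == 0 and n_out % n_groups == 0
    rng = random.Random(seed)
    in_size, out_size = n_in // n_groups, n_out // n_groups
    mask = [[0] * n_in for _ in range(n_out)]
    for g_out in range(n_groups):
        # Pick which input groups this output group receives from.
        sources = rng.sample(range(n_groups), fan_in_groups)
        for g_in in sources:
            for o in range(g_out * out_size, (g_out + 1) * out_size):
                for i in range(g_in * in_size, (g_in + 1) * in_size):
                    mask[o][i] = 1
    return mask


def sparsity(mask):
    """Fraction of masked-out (zero) connections."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return 1.0 - kept / total


mask = group_to_group_mask(n_in=16, n_out=16, n_groups=4, fan_in_groups=1)
print(sparsity(mask))  # 0.75: each unit keeps 4 of 16 possible inputs
```

With four groups and one source group per output group, the mask keeps a quarter of the weights, which matches the 75% sparsity level quoted in the abstract. A dynamic sparse training step would then prune and regrow entries of such a mask during training.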

2. GLASS: Test-Time Acceleration for LLMs via Global-Local Neural Importance Aggregation

ArXiv ID: 2508.14302

Authors: Amirmohsen Sattarifard, Sepehr Lavasani, Ehsan Imani, Kunlin Zhang, Hanlin Xu, Fengyu Sun, Negar Hassanpour, Chao Gao

Abstract: Deploying Large Language Models (LLMs) on edge hardware demands aggressive, prompt-aware dynamic pruning to reduce computation without degrading quality. Static or predictor-based schemes either lock in a single sparsity pattern or incur extra runtime overhead, and recent zero-shot methods that rely on statistics from a single prompt fail on short-prompt and/or long-generation scenarios. We introduce A/I-GLASS: Activation- and Impact-based Global-Local neural importance Aggregation for feed-forward network SparSification, two training-free methods that dynamically select FFN units using a rank-aggregation of prompt-local and model-intrinsic global neuron statistics. Empirical results across multiple LLMs and benchmarks demonstrate that GLASS significantly outperforms prior training-free methods, particularly in challenging long-form generation scenarios, without relying on auxiliary predictors or adding any inference overhead.

Comment: The paper introduces a novel method for dynamic pruning in LLMs, focusing on model compression through sparsification without training, which aligns with the model compression criteria.

Relevance: 9 Novelty: 8
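
Illustrative sketch (not taken from the paper): the abstract says FFN units are selected by a rank aggregation of prompt-local and model-global importance statistics, but does not give the aggregation rule. The snippet below assumes the simplest option, a weighted sum of the two rank positions, and keeps the units with the best combined rank. The names `select_units` and `weight`, and the toy scores, are hypothetical.

```python
def ranks(scores):
    """Rank position of each unit; 0 means most important (highest score)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    position = [0] * len(scores)
    for pos, idx in enumerate(order):
        position[idx] = pos
    return position


def select_units(local_scores, global_scores, keep, weight=0.5):
    """Keep the `keep` units with the lowest weighted rank sum.

    Assumption: `weight` trades off prompt-local against global ranks;
    the actual GLASS aggregation may differ.
    """
    local_rank = ranks(local_scores)
    global_rank = ranks(global_scores)
    combined = [weight * l + (1 - weight) * g
                for l, g in zip(local_rank, global_rank)]
    order = sorted(range(len(combined)), key=lambda i: combined[i])
    return sorted(order[:keep])


# Toy importance scores for five FFN units (made-up numbers).
local_scores = [0.9, 0.1, 0.5, 0.05, 0.3]   # from the current prompt
global_scores = [0.2, 0.8, 0.7, 0.1, 0.6]   # precomputed, prompt-independent

print(select_units(local_scores, global_scores, keep=3))              # [0, 1, 2]
print(select_units(local_scores, global_scores, keep=3, weight=1.0))  # [0, 2, 4]
```

Using only the local ranks (`weight=1.0`) drops unit 1, which the global statistics consider the most important; blending the two keeps it. This is the kind of failure on short prompts that the abstract attributes to single-prompt statistics.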

3. Source-Guided Flow Matching

ArXiv ID: 2508.14807

Authors: Zifan Wang, Alice Harting, Matthieu Barreau, Michael M. Zavlanos, Karl H. Johansson

Abstract: Guidance of generative models is typically achieved by modifying the probability flow vector field through the addition of a guidance field. In this paper, we instead propose the Source-Guided Flow Matching (SGFM) framework, which modifies the source distribution directly while keeping the pre-trained vector field intact. This reduces the guidance problem to a well-defined problem of sampling from the source distribution. We theoretically show that SGFM recovers the desired target distribution exactly. Furthermore, we provide bounds on the Wasserstein error for the generated distribution when using an approximate sampler of the source distribution and an approximate vector field. The key benefit of our approach is that it allows the user to flexibly choose the sampling method depending on their specific problem. To illustrate this, we systematically compare different sampling methods and discuss conditions for asymptotically exact guidance. Moreover, our framework integrates well with optimal flow matching models since the straight transport map generated by the vector field is preserved. Experimental results on synthetic 2D benchmarks, image datasets, and physics-informed generative tasks demonstrate the effectiveness and flexibility of the proposed framework.

Comment: The paper introduces a novel framework for generative models by modifying the source distribution, which aligns with emerging trends in foundational research by challenging established assumptions in generative modeling.

Relevance: 9 Novelty: 8
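
Illustrative derivation (not the paper's notation): one way to see why changing only the source can reproduce a guided target exactly is the change-of-variables formula. The block assumes the frozen vector field induces an invertible, differentiable flow map T that pushes the original source p_0 to the model distribution p_1, and that p_1 is positive wherever the desired target q_1 is. The symbols T, p_0, p_1, q_0, and q_1 are introduced here for illustration.

```latex
% Assumption: T is the flow map of the pre-trained vector field, T_# p_0 = p_1.
% q_1 is the desired (guided) target; q_0 is the modified source with T_# q_0 = q_1.
\begin{align}
  p_0(x_0) &= p_1\bigl(T(x_0)\bigr)\,\bigl|\det \nabla T(x_0)\bigr| \\
  q_0(x_0) &= q_1\bigl(T(x_0)\bigr)\,\bigl|\det \nabla T(x_0)\bigr|
            = p_0(x_0)\,\frac{q_1\bigl(T(x_0)\bigr)}{p_1\bigl(T(x_0)\bigr)}
\end{align}
```

Under these assumptions the guidance problem becomes sampling from q_0, a reweighting of the original source, which is the reduction the abstract describes.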

4. Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs

ArXiv ID: 2508.14896

Authors: Haokun Lin, Haobo Xu, Yichen Wu, Ziyu Guo, Renrui Zhang, Zhichao Lu, Ying Wei, Qingfu Zhang, Zhenan Sun

Abstract: Recent advances in diffusion large language models (dLLMs) have introduced a promising alternative to autoregressive (AR) LLMs for natural language generation tasks, leveraging full attention and denoising-based decoding strategies. However, the deployment of these models on edge devices remains challenging due to their massive parameter scale and high resource demands. While post-training quantization (PTQ) has emerged as a widely adopted technique for compressing AR LLMs, its applicability to dLLMs remains largely unexplored. In this work, we present the first systematic study on quantizing diffusion-based language models. We begin by identifying the presence of activation outliers, characterized by abnormally large activation values that dominate the dynamic range. These outliers pose a key challenge to low-bit quantization, as they make it difficult to preserve precision for the majority of values. More importantly, we implement state-of-the-art PTQ methods and conduct a comprehensive evaluation across multiple task types and model variants. Our analysis is structured along four key dimensions: bit-width, quantization method, task category, and model type. Through this multi-perspective evaluation, we offer practical insights into the quantization behavior of dLLMs under different configurations. We hope our findings provide a foundation for future research in efficient dLLM deployment. All codes and experimental setups will be released to support the community.

Comment: The paper conducts a systematic study on quantizing diffusion-based language models, relevant to model compression and efficiency.

Relevance: 9 Novelty: 7
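
Illustrative sketch (not taken from the paper): the abstract notes that activation outliers dominate the dynamic range and make low-bit quantization hard. The snippet below shows the effect with plain symmetric, per-tensor, round-to-nearest quantization, which is an assumed baseline scheme and not one of the PTQ methods evaluated in the paper. The activation values are made up.

```python
def quantize_dequantize(values, bits):
    """Symmetric per-tensor quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1                    # e.g. 7 for 4-bit
    scale = max(abs(v) for v in values) / qmax    # step size set by the largest value
    return [round(v / scale) * scale for v in values], scale


def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Seventeen "typical" activations in [-0.8, 0.8], then the same plus one outlier.
typical = [0.1 * k for k in range(-8, 9)]
with_outlier = typical + [60.0]

deq_clean, scale_clean = quantize_dequantize(typical, bits=4)
deq_outlier, scale_outlier = quantize_dequantize(with_outlier, bits=4)

# Compare the error on the typical values only.
err_clean = mean_abs_error(typical, deq_clean)
err_outlier = mean_abs_error(typical, deq_outlier[:len(typical)])
print(scale_clean, scale_outlier)
print(err_clean, err_outlier)
```

With the outlier present, the step size grows from about 0.11 to about 8.6, so every typical value rounds to zero and all of its information is lost.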

5. Amortized Bayesian Meta-Learning for Low-Rank Adaptation of Large Language Models

ArXiv ID: 2508.14285

Authors: Liyi Zhang, Jake Snell, Thomas L. Griffiths

Abstract: Fine-tuning large language models (LLMs) with low-rank adaptation (LoRA) is a cost-effective way to incorporate information from a specific dataset. However, it is often unclear how well the fine-tuned LLM will generalize, i.e., how well it will perform on unseen datasets. Methods have been proposed to improve generalization by optimizing with in-context prompts, or by using meta-learning to fine-tune LLMs. However, these methods are expensive in memory and computation, requiring either long-context prompts or saving copies of parameters and using second-order gradient updates. To address these challenges, we propose Amortized Bayesian Meta-Learning for LoRA (ABMLL). This method builds on amortized Bayesian meta-learning for smaller models, adapting this approach to LLMs while maintaining its computational efficiency. We reframe task-specific and global parameters in the context of LoRA and use a set of new hyperparameters to balance reconstruction accuracy and the fidelity of task-specific parameters to the global ones. ABMLL provides effective generalization and scales to large models such as Llama3-8B. Furthermore, as a result of using a Bayesian framework, ABMLL provides improved uncertainty quantification. We test ABMLL on Unified-QA and CrossFit datasets and find that it outperforms existing methods on these benchmarks in terms of both accuracy and expected calibration error.

Comment: The paper discusses low-rank adaptation of large language models, which is relevant to model compression and efficiency improvements.

Relevance: 9 Novelty: 7
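
Illustrative sketch (not taken from the paper): the snippet below shows only the standard LoRA parameterization that ABMLL builds on, where a frozen weight W is augmented by a low-rank update B·A scaled by alpha/r. The Bayesian meta-learning part is not shown. The layer sizes, the helper names, and the zero initialization of B are assumptions for illustration.

```python
def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha):
    """y = W x + (alpha / r) * B (A x), with W frozen and A, B trainable."""
    r = len(A)                       # A has shape [r][d_in], B has shape [d_out][r]
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + (alpha / r) * u for b, u in zip(base, update)]


def trainable_params(d_in, d_out, r):
    """Number of parameters in the low-rank factors A and B."""
    return r * (d_in + d_out)


# Tiny example: d_in = 3, d_out = 2, rank r = 1.
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
A = [[0.5, 0.5, 0.0]]
B = [[0.0], [0.0]]                   # zero-initialized B leaves the output unchanged
x = [1.0, 2.0, 3.0]
print(lora_forward(W, A, B, x, alpha=2.0))  # [7.0, 2.0], same as W x

# Parameter count for one hypothetical 4096 x 4096 projection at rank 8.
print(trainable_params(4096, 4096, 8), 4096 * 4096)  # 65536 vs 16777216
```

At rank 8, the adapter for such a layer trains 1/256 of the weights of the full matrix, which is the source of the memory savings the abstract relies on.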

6. Disentangling concept semantics via multilingual averaging in Sparse Autoencoders

ArXiv ID: 2508.14275

Authors: Cliff O'Reilly, Ernesto Jimenez-Ruiz, Tillman Weyde

Abstract: Connecting LLMs with formal knowledge representation and reasoning is a promising approach to address their shortcomings. Embeddings and sparse autoencoders are widely used to represent textual content, but the semantics are entangled with syntactic and language-specific information. We propose a method that isolates concept semantics in Large Language Models by averaging concept activations derived via Sparse Autoencoders. We create English text representations from OWL ontology classes, translate the English into French and Chinese and then pass these texts as prompts to the Gemma 2B LLM. Using the open source Gemma Scope suite of Sparse Autoencoders, we obtain concept activations for each class and language version. We average the different language activations to derive a conceptual average. We then correlate the conceptual averages with a ground truth mapping between ontology classes. Our results give a strong indication that the conceptual average aligns to the true relationship between classes when compared with a single language by itself. The result hints at a new technique which enables mechanistic interpretation of internal network states with higher accuracy.

Comment: The paper explores disentangling concept semantics using sparse autoencoders, which aligns with representation learning and sparse methods.

Relevance: 9 Novelty: 7
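
Illustrative sketch (not taken from the paper): the abstract's core step is averaging per-language sparse-autoencoder activations for the same ontology class. The snippet below uses made-up three-dimensional vectors in which each language adds its own offset to a shared concept vector, and compares a single-language vector with the cross-language mean. The vectors, the offsets, and the use of cosine similarity are assumptions for illustration.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical "true" concept direction and per-language activations.
concept = [1.0, 0.0, 2.0]
activations = {
    "en": [1.5, 0.0, 2.0],     # concept + English-specific offset
    "fr": [0.75, 0.25, 2.0],   # concept + French-specific offset
    "zh": [0.75, -0.25, 2.0],  # concept + Chinese-specific offset
}

conceptual_average = mean_vector(list(activations.values()))
print(conceptual_average)                   # [1.0, 0.0, 2.0]
print(cosine(activations["en"], concept))   # below 1: offset still present
print(cosine(conceptual_average, concept))  # approximately 1
```

The offsets here are constructed to cancel exactly, which real activations will not do; the point is only that language-specific components shrink under averaging while the shared component survives.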

7. Surya: Foundation Model for Heliophysics

ArXiv ID: 2508.14112

Authors: Sujit Roy, Johannes Schmude, Rohit Lal, Vishal Gaur, Marcus Freitag, Julian Kuehnert, Theodore van Kessel, Dinesha V. Hegde, Andrés Muñoz-Jaramillo, Johannes Jakubik, Etienne Vos, Kshitiz Mandal, Ata Akbari Asanjan, Joao Lucas de Sousa Almeida, Amy Lin, Talwinder Singh, Kang Yang, Chetraj Pandey, Jinsu Hong, Berkay Aydin, Thorsten Kurth, Ryan McGranaghan, Spiridon Kasapis, Vishal Upendran, Shah Bahauddin, Daniel da Silva, Nikolai V. Pogorelov, Campbell Watson, Manil Maskey, Madhulika Guhathakurta, Juan Bernabe-Moreno, Rahul Ramachandran

Abstract: Heliophysics is central to understanding and forecasting space weather events and solar activity. Despite decades of high-resolution observations from the Solar Dynamics Observatory (SDO), most models remain task-specific and constrained by scarce labeled data, limiting their capacity to generalize across solar phenomena. We introduce Surya, a 366M parameter foundation model for heliophysics designed to learn general-purpose solar representations from multi-instrument SDO observations, including eight Atmospheric Imaging Assembly (AIA) channels and five Helioseismic and Magnetic Imager (HMI) products. Surya employs a spatiotemporal transformer architecture with spectral gating and long-short range attention, pretrained on high-resolution solar image forecasting tasks and further optimized through autoregressive rollout tuning. Zero-shot evaluations demonstrate its ability to forecast solar dynamics and flare events, while downstream fine-tuning with parameter-efficient Low-Rank Adaptation (LoRA) shows strong performance on solar wind forecasting, active region segmentation, solar flare forecasting, and EUV spectra. Surya is the first foundation model in heliophysics that uses time advancement as a pretext task on full-resolution SDO data. Its novel architecture and performance suggest that the model is able to learn the underlying physics behind solar evolution.

Comment: The paper presents a foundation model for heliophysics with a novel spatiotemporal transformer architecture, relevant to model architecture innovations.

Relevance: 8 Novelty: 8

8. Graph Structure Learning with Temporal Graph Information Bottleneck for Inductive Representation Learning

ArXiv ID: 2508.14859

Authors: Jiafeng Xiong, Rizos Sakellariou

Abstract: Temporal graph learning is crucial for dynamic networks where nodes and edges evolve over time and new nodes continuously join the system. Inductive representation learning in such settings faces two major challenges: effectively representing unseen nodes and mitigating noisy or redundant graph information. We propose GTGIB, a versatile framework that integrates Graph Structure Learning (GSL) with Temporal Graph Information Bottleneck (TGIB). We design a novel two-step GSL-based structural enhancer to enrich and optimize node neighborhoods and demonstrate its effectiveness and efficiency through theoretical proofs and experiments. The TGIB refines the optimized graph by extending the information bottleneck principle to temporal graphs, regularizing both edges and features based on our derived tractable TGIB objective function via variational approximation, enabling stable and efficient optimization. GTGIB-based models are evaluated to predict links on four real-world datasets; they outperform existing methods in all datasets under the inductive setting, with significant and consistent improvement in the transductive setting.

Comment: The paper proposes a framework integrating Graph Structure Learning with Temporal Graph Information Bottleneck, relevant to representation learning and model architecture.

Relevance: 8 Novelty: 8

9. STAS: Spatio-Temporal Adaptive Computation Time for Spiking Transformers

ArXiv ID: 2508.14138

Authors: Donghwa Kang, Doohyun Kim, Sang-Ki Ko, Jinkyu Lee, Brent ByungHoon Kang, Hyeongboo Baek

Abstract: Spiking neural networks (SNNs) offer energy efficiency over artificial neural networks (ANNs) but suffer from high latency and computational overhead due to their multi-timestep operational nature. While various dynamic computation methods have been developed to mitigate this by targeting spatial, temporal, or architecture-specific redundancies, they remain fragmented. While the principles of adaptive computation time (ACT) offer a robust foundation for a unified approach, its application to SNN-based vision Transformers (ViTs) is hindered by two core issues: the violation of its temporal similarity prerequisite and a static architecture fundamentally unsuited for its principles. To address these challenges, we propose STAS (Spatio-Temporal Adaptive computation time for Spiking transformers), a framework that co-designs the static architecture and dynamic computation policy. STAS introduces an integrated spike patch splitting (I-SPS) module to establish temporal stability by creating a unified input representation, thereby solving the architectural problem of temporal dissimilarity. This stability, in turn, allows our adaptive spiking self-attention (A-SSA) module to perform two-dimensional token pruning across both spatial and temporal axes. Implemented on spiking Transformer architectures and validated on CIFAR-10, CIFAR-100, and ImageNet, STAS reduces energy consumption by up to 45.9%, 43.8%, and 30.1%, respectively, while simultaneously improving accuracy over SOTA models.

Comment: The paper introduces a framework for spiking transformers with adaptive computation time, relevant to model architecture innovations.

Relevance: 8 Novelty: 8
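
Illustrative sketch (not taken from the paper): the abstract builds on adaptive computation time (ACT), where each token accumulates a halting score and stops being processed once the running total is close to 1. The snippet below shows only that generic halting rule for a few tokens; it is not the A-SSA module, and the scores, the `epsilon` threshold, and the function name are assumptions.

```python
def halting_steps(halting_scores, epsilon=0.01):
    """For each token, return how many steps it is processed.

    A token halts at the first step where its cumulative halting score
    reaches 1 - epsilon; otherwise it uses every available step.
    """
    steps = []
    for scores in halting_scores:
        total = 0.0
        used = len(scores)
        for t, h in enumerate(scores, start=1):
            total += h
            if total >= 1.0 - epsilon:
                used = t
                break
        steps.append(used)
    return steps


# Made-up per-step halting scores for three tokens over four steps.
scores = [
    [0.6, 0.5, 0.1, 0.1],   # halts after step 2
    [0.1, 0.2, 0.3, 0.1],   # never reaches the threshold, uses all 4 steps
    [1.0, 0.0, 0.0, 0.0],   # halts immediately
]
used = halting_steps(scores)
print(used)                        # [2, 4, 1]
print(sum(used) / (3 * 4))         # fraction of token-steps actually computed
```

In a spiking Transformer the "steps" could run over both layers and timesteps, which is where the two-dimensional (spatial and temporal) token pruning in the abstract comes from.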

10. Understanding Data Influence with Differential Approximation

ArXiv ID: 2508.14648

Authors: Haoru Tan, Sitong Wu, Xiuzhe Wu, Wang Wang, Bo Zhao, Zeke Xie, Gui-Song Xia, Xiaojuan Qi

Abstract: Data plays a pivotal role in the groundbreaking advancements in artificial intelligence. The quantitative analysis of data significantly contributes to model training, enhancing both the efficiency and quality of data utilization. However, existing data analysis tools often lag in accuracy. For instance, many of these tools even assume that the loss function of neural networks is convex. These limitations make it challenging to implement current methods effectively. In this paper, we introduce a new formulation to approximate a sample's influence by accumulating the differences in influence between consecutive learning steps, which we term Diff-In. Specifically, we formulate the sample-wise influence as the cumulative sum of its changes/differences across successive training iterations. By employing second-order approximations, we approximate these difference terms with high accuracy while eliminating the need for model convexity required by existing methods. Despite being a second-order method, Diff-In maintains computational complexity comparable to that of first-order methods and remains scalable. This efficiency is achieved by computing the product of the Hessian and gradient, which can be efficiently approximated using finite differences of first-order gradients. We assess the approximation accuracy of Diff-In both theoretically and empirically. Our theoretical analysis demonstrates that Diff-In achieves significantly lower approximation error compared to existing influence estimators. Extensive experiments further confirm its superior performance across multiple benchmark datasets in three data-centric tasks: data cleaning, data deletion, and coreset selection. Notably, our experiments on data pruning for large-scale vision-language pre-training show that Diff-In can scale to millions of data points and outperforms strong baselines.

Comment: The paper introduces a new formulation for approximating data influence, which is relevant to representation learning and training dynamics in neural networks.

Relevance: 8 Novelty: 8
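
Illustrative sketch (not taken from the paper): the abstract states that the Hessian-gradient product can be approximated with finite differences of first-order gradients. The snippet below shows that trick on a small analytic loss using a central difference, H v ≈ (∇L(θ + εv) − ∇L(θ − εv)) / (2ε). The toy loss, the step size, and the function names are assumptions; the full Diff-In estimator is not reproduced.

```python
def grad(theta):
    """Gradient of the toy loss L(t0, t1) = t0^2 * t1 + t1^3."""
    t0, t1 = theta
    return [2.0 * t0 * t1, t0 * t0 + 3.0 * t1 * t1]


def hvp_finite_difference(grad_fn, theta, v, eps=1e-4):
    """Approximate H(theta) @ v using two gradient evaluations."""
    plus = grad_fn([t + eps * d for t, d in zip(theta, v)])
    minus = grad_fn([t - eps * d for t, d in zip(theta, v)])
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]


theta = [1.0, 2.0]
v = [1.0, -1.0]

# Exact Hessian at theta is [[2*t1, 2*t0], [2*t0, 6*t1]] = [[4, 2], [2, 12]],
# so the exact product H v is [2, -10].
approx = hvp_finite_difference(grad, theta, v)
print(approx)
```

The Hessian is never formed: the cost is two gradient evaluations regardless of the number of parameters, which is why the abstract can describe a second-order method with roughly first-order cost.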

11. Logical Expressivity and Explanations for Monotonic GNNs with Scoring Functions

ArXiv ID: 2508.14091

Authors: Matthew Morris, David J. Tena Cucala, Bernardo Cuenca Grau

Abstract: Graph neural networks (GNNs) are often used for the task of link prediction: predicting missing binary facts in knowledge graphs (KGs). To address the lack of explainability of GNNs on KGs, recent works extract Datalog rules from GNNs with provable correspondence guarantees. The extracted rules can be used to explain the GNN's predictions; furthermore, they can help characterise the expressive power of various GNN models. However, these works address only a form of link prediction based on a restricted, low-expressivity graph encoding/decoding method. In this paper, we consider a more general and popular approach for link prediction where a scoring function is used to decode the GNN output into fact predictions. We show how GNNs and scoring functions can be adapted to be monotonic, use the monotonicity to extract sound rules for explaining predictions, and leverage existing results about the kind of rules that scoring functions can capture. We also define procedures for obtaining equivalent Datalog programs for certain classes of monotonic GNNs with scoring functions. Our experiments show that, on link prediction benchmarks, monotonic GNNs and scoring functions perform well in practice and yield many sound rules.

Comment: The paper focuses on enhancing the explainability and expressivity of GNNs using scoring functions, which aligns with representation learning by providing insights into how GNNs encode information.