Personalized Monthly Topic Summary 2025/08

Metric | Value
Total Papers | 429
Model Architecture | 129
Model Compression and Efficiency | 117
High Performance Computing | 28
Representation Learning | 134
Other Foundational Research | 21

Model Architecture (129)

  1. Understanding Transformers through the Lens of Pavlovian Conditioning - Score: 19 (R=10, N=9) - Date: 2025-08-13 - Comment: The paper provides a novel theoretical framework for understanding transformers, which is relevant to model architecture.

  2. Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks - Score: 18 (R=10, N=8) - Date: 2025-08-27 - Comment: The paper investigates the optimal sparsity of Mixture-of-Experts models for reasoning tasks, which aligns with foundational research in model architecture and sparsity.

  3. X-MoE: Enabling Scalable Training for Emerging Mixture-of-Experts Architectures on HPC Platforms - Score: 18 (R=10, N=8) - Date: 2025-08-20 - Comment: The paper focuses on scalable training for Mixture-of-Experts (MoE) architectures, which is relevant to model architecture and efficiency breakthroughs. It introduces novel techniques for improving MoE training on HPC platforms.

  4. Maximum Score Routing For Mixture-of-Experts - Score: 18 (R=10, N=8) - Date: 2025-08-19 - Comment: The paper proposes a novel MoE routing paradigm, which is directly relevant to model architecture innovations.

  5. $\mu$-Parametrization for Mixture of Experts - Score: 18 (R=10, N=8) - Date: 2025-08-14 - Comment: The paper provides a theoretical framework for MoE parameterization, aligning with model architecture insights and foundational research.

  6. Discovering Expert-Level Nash Equilibrium Algorithms with Large Language Models - Score: 18 (R=9, N=9) - Date: 2025-08-19 - Comment: The paper presents a framework using LLMs for discovering Nash equilibrium algorithms, which is a significant contribution to foundational research in AI for Science.

  7. Invisible Architectures of Thought: Toward a New Science of AI as Cognitive Infrastructure - Score: 18 (R=9, N=9) - Date: 2025-08-01 - Comment: The paper introduces 'Cognitive Infrastructure Studies' as a new interdisciplinary domain, which could be considered an emerging trend challenging established assumptions.

  8. What can we learn from signals and systems in a transformer? Insights for probabilistic modeling and inference architecture - Score: 17 (R=9, N=8) - Date: 2025-08-29 - Comment: The paper provides insights into transformers by connecting them with classical nonlinear filtering theory, which aligns with the model architecture criterion by offering a theoretical perspective on transformer operations.

  9. Safety Alignment Should Be Made More Than Just A Few Attention Heads - Score: 17 (R=9, N=8) - Date: 2025-08-28 - Comment: The paper addresses safety alignment in LLMs by distributing safety-related behaviors across attention heads, providing insights into LLM behavior and interpretability.

  10. MultiPL-MoE: Multi-Programming-Lingual Extension of Large Language Models through Hybrid Mixture-of-Experts - Score: 17 (R=9, N=8) - Date: 2025-08-28 - Comment: The paper introduces a hybrid Mixture-of-Experts (MoE) model for multilingual code generation, aligning with the Model Architecture criterion.

  11. Enabling MoE on the Edge via Importance-Driven Expert Scheduling - Score: 17 (R=9, N=8) - Date: 2025-08-27 - Comment: The paper focuses on deploying MoE on edge devices with a novel importance-driven expert scheduling, relevant to model architecture and efficiency.

  12. UltraMemV2: Memory Networks Scaling to 120B Parameters with Superior Long-Context Learning - Score: 17 (R=9, N=8) - Date: 2025-08-27 - Comment: The paper presents UltraMemV2, a memory-layer architecture that competes with MoE models, relevant to model architecture and efficiency.

  13. FFT-MoE: Efficient Federated Fine-Tuning for Foundation Models via Large-scale Sparse MoE under Heterogeneous Edge - Score: 17 (R=9, N=8) - Date: 2025-08-27 - Comment: The paper introduces FFT-MoE, a novel framework using sparse Mixture of Experts (MoE) for federated fine-tuning, which aligns with the core topic of model architecture and representation learning.

  14. What do language models model? Transformers, automata, and the format of thought - Score: 17 (R=9, N=8) - Date: 2025-08-27 - Comment: The paper discusses the theoretical understanding of transformers and LLMs, which aligns with insights into LLM behavior and architecture.

  15. SALMAN: Stability Analysis of Language Models Through the Maps Between Graph-based Manifolds - Score: 17 (R=9, N=8) - Date: 2025-08-27 - Comment: The paper presents a novel robustness framework for transformer-based language models, focusing on model stability and interpretability, which aligns with foundational research in LLMs.

  16. GateTS: Versatile and Efficient Forecasting via Attention-Inspired routed Mixture-of-Experts - Score: 17 (R=9, N=8) - Date: 2025-08-26 - Comment: The paper introduces a Mixture-of-Experts model with a novel attention-inspired gating mechanism, aligning with model architecture and efficiency topics.

  17. Closer to Reality: Practical Semi-Supervised Federated Learning for Foundation Model Adaptation - Score: 17 (R=9, N=8) - Date: 2025-08-25 - Comment: The paper introduces a novel framework, FedMox, which uses a sparse Mixture-of-Experts architecture for federated learning, aligning with the Model Architecture and Model Compression criteria.

  18. Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search - Score: 17 (R=9, N=8) - Date: 2025-08-25 - Comment: Jet-Nemotron presents a new hybrid-architecture language model developed using a novel neural architecture exploration pipeline, which aligns with the model architecture criterion.

  19. Intern-S1: A Scientific Multimodal Foundation Model - Score: 17 (R=9, N=8) - Date: 2025-08-22 - Comment: The paper introduces Intern-S1, a multimodal Mixture-of-Experts model, which aligns with model architecture innovations, particularly in MoE models.

  20. BLIPs: Bayesian Learned Interatomic Potentials - Score: 17 (R=9, N=8) - Date: 2025-08-20 - Comment: The paper introduces a Bayesian framework for training interatomic potentials, which is relevant to foundational research in AI for Science, particularly in molecular modeling. It provides a novel approach to uncertainty estimation in MLIPs, which is a significant theoretical contribution.

  21. Wavy Transformer - Score: 17 (R=9, N=8) - Date: 2025-08-19 - Comment: The paper introduces the Wavy Transformer, addressing over-smoothing in transformers, which is relevant to model architecture innovations.

  22. Reduced-order modeling of Hamiltonian dynamics based on symplectic neural networks - Score: 17 (R=9, N=8) - Date: 2025-08-19 - Comment: The paper introduces a symplectic neural network framework for Hamiltonian dynamics, relevant to model architecture and AI for Science.

  23. BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining - Score: 17 (R=9, N=8) - Date: 2025-08-18 - Comment: The paper discusses synthetic data generation for LLM pretraining, providing insights into pretraining paradigms and data quality, which is relevant to foundational research in LLMs.

  24. Why Cannot Large Language Models Ever Make True Correct Reasoning? - Score: 17 (R=9, N=8) - Date: 2025-08-15 - Comment: The paper critiques the reasoning abilities of LLMs, providing theoretical insights into their limitations, aligning with the large language models criterion.

  25. Beyond Hard Sharing: Efficient Multi-Task Speech-to-Text Modeling with Supervised Mixture of Experts - Score: 17 (R=9, N=8) - Date: 2025-08-15 - Comment: The paper proposes a Supervised Mixture of Experts (S-MoE) model, which aligns with the model architecture criterion by introducing a novel approach to MoE.

  26. CoMoE: Collaborative Optimization of Expert Aggregation and Offloading for MoE-based LLMs at Edge - Score: 17 (R=9, N=8) - Date: 2025-08-14 - Comment: The paper proposes a collaborative optimization framework for MoE-based LLMs at the edge, which is relevant to model architecture and efficiency.

  27. Topos Theory for Generative AI and LLMs - Score: 17 (R=9, N=8) - Date: 2025-08-13 - Comment: The paper introduces novel LLM architectures using topos theory, which is a significant theoretical insight into LLM behavior and architecture.

  28. Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent - Score: 17 (R=9, N=8) - Date: 2025-08-12 - Comment: The paper provides theoretical insights into how transformers learn symbolic multi-step reasoning, which is relevant to model architecture and LLMs.

  29. What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains - Score: 17 (R=9, N=8) - Date: 2025-08-12 - Comment: The paper provides theoretical insights into the representation capabilities of two-layer transformers, which aligns with the interest in understanding model architectures and training dynamics.

  30. Low-Bit Data Processing Using Multiple-Output Spiking Neurons with Non-linear Reset Feedback - Score: 17 (R=9, N=8) - Date: 2025-08-11 - Comment: The paper introduces a novel multiple-output spiking neuron model, which is relevant to representation learning and model architecture. It provides insights into how deep networks encode information and proposes a new architectural innovation in spiking neural networks.

  31. MoBE: Mixture-of-Basis-Experts for Compressing MoE-based LLMs - Score: 17 (R=9, N=8) - Date: 2025-08-08 - Comment: The paper presents a novel method for compressing MoE-based LLMs, which is highly relevant to model compression and MoE architecture.

  32. Making Prompts First-Class Citizens for Adaptive LLM Pipelines - Score: 17 (R=9, N=8) - Date: 2025-08-08 - Comment: The paper introduces SPEAR, a novel approach to managing prompts in LLM pipelines, focusing on making prompts structured and adaptive. This aligns with the interest in foundational research on LLM behavior and architecture.

  33. Compressed Decentralized Momentum Stochastic Gradient Methods for Nonconvex Optimization - Score: 17 (R=9, N=8) - Date: 2025-08-08 - Comment: The paper introduces Gaussian mixture layers for neural networks, which is a novel architectural innovation relevant to model architecture.

  34. Learning from B Cell Evolution: Adaptive Multi-Expert Diffusion for Antibody Design via Online Optimization - Score: 17 (R=9, N=8) - Date: 2025-08-06 - Comment: The paper introduces a biologically-motivated framework for antibody design using a multi-expert system, which aligns with the interest in Mixture-of-Experts (MoE) and representation learning.

  35. Parameter-Efficient Routed Fine-Tuning: Mixture-of-Experts Demands Mixture of Adaptation Modules - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: The paper explores parameter-efficient fine-tuning for Mixture-of-Experts models, which is relevant to both model architecture and efficiency.

  36. CAMERA: Multi-Matrix Joint Compression for MoE Models via Micro-Expert Redundancy Analysis - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: The paper introduces CAMERA, a framework for MoE model compression, which is relevant to model compression and MoE architectures.

  37. Trainable Dynamic Mask Sparse Attention - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: The paper introduces a trainable dynamic mask sparse attention mechanism, relevant to model architecture and efficiency improvements.

  38. EAC-MoE: Expert-Selection Aware Compressor for Mixture-of-Experts Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: The paper proposes EAC-MoE, a method for compressing Mixture-of-Experts models using quantization and pruning, which aligns with the model compression and MoE criteria.

  39. Drift-aware Collaborative Assistance Mixture of Experts for Heterogeneous Multistream Learning - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: The paper introduces a dynamic mixture of experts framework for multistream learning, which aligns with the interest in mixture-of-experts architectures.

  40. MMBERT: Scaled Mixture-of-Experts Multimodal BERT for Robust Chinese Hate Speech Detection under Cloaking Perturbations - Score: 17 (R=9, N=8) - Date: 2025-08-04 - Comment: The paper introduces MMBERT, a novel multimodal framework using a Mixture-of-Experts architecture, which aligns with model architecture criteria.

  41. Vectorized Attention with Learnable Encoding for Quantum Transformer - Score: 17 (R=8, N=9) - Date: 2025-08-27 - Comment: The paper introduces a Vectorized Quantum Transformer, which is a novel architecture combining quantum computing with transformer models.

  42. Quantum-Boosted High-Fidelity Deep Learning - Score: 17 (R=8, N=9) - Date: 2025-08-18 - Comment: The paper introduces a hybrid quantum-classical architecture for deep learning, which is a significant architectural innovation. It leverages quantum computing to improve the expressiveness of deep generative models, aligning with the model architecture criterion.

  43. ExpertSim: Fast Particle Detector Simulation Using Mixture-of-Generative-Experts - Score: 16 (R=9, N=7) - Date: 2025-08-29 - Comment: The paper introduces a Mixture-of-Generative-Experts architecture for particle detector simulation, aligning with the model architecture criterion, specifically MoE.

  44. Limitations of Normalization in Attention Mechanism - Score: 16 (R=9, N=7) - Date: 2025-08-26 - Comment: The paper investigates the limitations of normalization in attention mechanisms, providing insights into model architecture.

  45. MMQ: Multimodal Mixture-of-Quantization Tokenization for Semantic ID Generation and User Behavioral Adaptation - Score: 16 (R=9, N=7) - Date: 2025-08-22 - Comment: The paper discusses a multimodal mixture-of-quantization framework, which involves a multi-expert architecture, relevant to mixture-of-experts and model architecture.

  46. Word Meanings in Transformer Language Models - Score: 16 (R=9, N=7) - Date: 2025-08-19 - Comment: The paper investigates how word meanings are represented in transformer language models, which aligns with the interest in understanding how deep networks encode information.

  47. Natively Trainable Sparse Attention for Hierarchical Point Cloud Datasets - Score: 16 (R=9, N=7) - Date: 2025-08-15 - Comment: The paper explores sparse attention mechanisms in transformers, which is relevant to model architecture and efficiency improvements.

  48. HierMoE: Accelerating MoE Training with Hierarchical Token Deduplication and Expert Swap - Score: 16 (R=9, N=7) - Date: 2025-08-14 - Comment: The paper introduces HierMoE, which accelerates MoE training with hierarchical token deduplication and expert swap, aligning with the core topic of Model Architecture and MoE.

  49. Fast weight programming and linear transformers: from machine learning to neurobiology - Score: 16 (R=9, N=7) - Date: 2025-08-13 - Comment: The paper reviews Fast Weight Programmers and their connections to transformers, relevant to model architecture innovations.

  50. PiKV: KV Cache Management System for Mixture of Experts - Score: 16 (R=9, N=7) - Date: 2025-08-12 - Comment: The paper introduces a KV cache management system for MoE architectures, focusing on compression and efficiency, which aligns with interests in model compression and MoE.

  51. Mixture of Experts Guided by Gaussian Splatters Matters: A new Approach to Weakly-Supervised Video Anomaly Detection - Score: 16 (R=9, N=7) - Date: 2025-08-11 - Comment: The paper introduces a novel framework using Mixture of Experts (MoE) for video anomaly detection, which aligns with the interest in model architecture innovations.

  52. Frontier: Simulating the Next Generation of LLM Inference Systems - Score: 16 (R=9, N=7) - Date: 2025-08-06 - Comment: The paper discusses a simulator for LLM inference systems, focusing on Mixture-of-Experts (MoE) models and disaggregated architectures, which aligns with the Model Architecture criterion.

  53. Mixture of Contexts for Long Video Generation - Score: 16 (R=8, N=8) - Date: 2025-08-29 - Comment: The paper introduces a novel sparse attention routing module for long video generation, which aligns with foundational research in model architecture.

  54. Turning Tabular Foundation Models into Graph Foundation Models - Score: 16 (R=8, N=8) - Date: 2025-08-29 - Comment: The paper proposes a novel approach to turn tabular foundation models into graph foundation models, which aligns with foundational research in model architecture.

  55. CoFormer: Collaborating with Heterogeneous Edge Devices for Scalable Transformer Inference - Score: 16 (R=8, N=8) - Date: 2025-08-29 - Comment: CoFormer introduces a novel collaborative inference system for transformer models, focusing on architectural innovation and efficiency improvements.

  56. Symplectic convolutional neural networks - Score: 16 (R=8, N=8) - Date: 2025-08-28 - Comment: The paper presents a new symplectic CNN architecture, which is relevant to model architecture innovations, particularly in the context of autoencoders.

  57. HypER: Hyperbolic Echo State Networks for Capturing Stretch-and-Fold Dynamics in Chaotic Flows - Score: 16 (R=8, N=8) - Date: 2025-08-26 - Comment: The paper introduces a novel ESN architecture with hyperbolic geometry, which is relevant to model architecture innovations.

  58. Quantum Graph Attention Network: A Novel Quantum Multi-Head Attention Mechanism for Graph Learning - Score: 16 (R=8, N=8) - Date: 2025-08-26 - Comment: The Quantum Graph Attention Network introduces a novel quantum multi-head attention mechanism, aligning with model architecture innovations.

  59. Tessellation Groups, Harmonic Analysis on Non-compact Symmetric Spaces and the Heat Kernel in view of Cartan Convolutional Neural Networks - Score: 16 (R=8, N=8) - Date: 2025-08-25 - Comment: The paper discusses mathematical foundations for Cartan Convolutional Neural Networks, which is a novel architectural concept, aligning with model architecture innovations.

  60. Discovering Hidden Algebraic Structures via Transformers with Rank-Aware Beam GRPO - Score: 16 (R=8, N=8) - Date: 2025-08-22 - Comment: The paper explores transformers for algebraic structure discovery, which aligns with foundational research in model architecture and representation learning.

  61. TOAST: Fast and scalable auto-partitioning based on principled static analysis - Score: 16 (R=8, N=8) - Date: 2025-08-22 - Comment: The paper presents a novel system for auto-partitioning large models, which involves architectural innovations and efficiency improvements, relevant to model architecture and compression.

  62. STAS: Spatio-Temporal Adaptive Computation Time for Spiking Transformers - Score: 16 (R=8, N=8) - Date: 2025-08-21 - Comment: The paper introduces a framework for spiking transformers with adaptive computation time, relevant to model architecture innovations.

  63. Surya: Foundation Model for Heliophysics - Score: 16 (R=8, N=8) - Date: 2025-08-21 - Comment: The paper presents a foundation model for heliophysics with a novel spatiotemporal transformer architecture, relevant to model architecture innovations.

  64. Causally-Guided Pairwise Transformer -- Towards Foundational Digital Twins in Process Industry - Score: 16 (R=8, N=8) - Date: 2025-08-19 - Comment: The paper introduces a novel architecture, the Causally-Guided Pairwise Transformer, which integrates a causal graph as an inductive bias, aligning with interests in architectural innovations.

  65. Inverse-LLaVA: Eliminating Alignment Pre-training Through Text-to-Vision Mapping - Score: 16 (R=8, N=8) - Date: 2025-08-19 - Comment: The paper proposes a novel approach to multimodal learning without alignment pre-training, which challenges conventional paradigms in model architecture.

  66. Elucidating Rectified Flow with Deterministic Sampler: Polynomial Discretization Complexity for Multi and One-step Models - Score: 16 (R=8, N=8) - Date: 2025-08-13 - Comment: The paper provides theoretical insights into rectified flow models, which is relevant to emerging trends in model architecture and theoretical understanding.

  67. Language Models Can Understand Spectra: A Multimodal Model for Molecular Structure Elucidation - Score: 16 (R=8, N=8) - Date: 2025-08-13 - Comment: The paper introduces a multimodal model for molecular structure elucidation, which is relevant to AI for Science with a focus on foundational research in molecular modeling.

  68. Training-Free ANN-to-SNN Conversion for High-Performance Spiking Transformer - Score: 16 (R=8, N=8) - Date: 2025-08-12 - Comment: The paper proposes a training-free ANN-to-SNN conversion for Transformers, which is relevant to model architecture innovations and efficiency improvements.

  69. CellForge: Agentic Design of Virtual Cell Models - Score: 16 (R=8, N=8) - Date: 2025-08-05 - Comment: The paper introduces CellForge, a multi-agent framework for virtual cell modeling, aligning with AI for Science and emerging trends in foundational research.

  70. Eigen Neural Network: Unlocking Generalizable Vision with Eigenbasis - Score: 16 (R=8, N=8) - Date: 2025-08-05 - Comment: The paper introduces the Eigen Neural Network, a novel architecture that reparameterizes weights in a learned orthonormal eigenbasis, which aligns with the interest in model architecture innovations.

  71. Graph Lineages and Skeletal Graph Products - Score: 16 (R=8, N=8) - Date: 2025-08-04 - Comment: The paper introduces a new algebraic type theory for graded graphs and hierarchical graph lineages, which is relevant to model architecture innovations.

  72. Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving - Score: 16 (R=8, N=8) - Date: 2025-08-01 - Comment: The paper presents Seed-Prover, a model for automated theorem proving with architectural innovations, aligning with the AI for Science criterion.

  73. AI paradigm for solving differential equations: first-principles data generation and scale-dilation operator AI solver - Score: 16 (R=8, N=8) - Date: 2025-08-01 - Comment: The paper introduces a novel AI paradigm for solving differential equations using a Transformer-based AI solver, which aligns with the Model Architecture criterion. It also provides theoretical insights into the training dynamics, relevant to Representation Learning.

  74. PSO-Merging: Merging Models Based on Particle Swarm Optimization - Score: 15 (R=8, N=7) - Date: 2025-08-28 - Comment: The paper introduces a novel data-driven merging method using Particle Swarm Optimization, which is relevant to model architecture and efficiency improvements.

  75. Just Because You Can, Doesn't Mean You Should: LLMs for Data Fitting - Score: 15 (R=8, N=7) - Date: 2025-08-28 - Comment: The paper discusses the robustness issues of LLMs in data fitting, providing theoretical insights into LLM behavior, which aligns with the large language models criterion.

  76. Quantum-Classical Hybrid Molecular Autoencoder for Advancing Classical Decoding - Score: 15 (R=8, N=7) - Date: 2025-08-28 - Comment: The paper presents a hybrid quantum-classical architecture for molecular autoencoding, which is relevant to AI for Science with a focus on foundational research in molecular modeling.

  77. Distance-informed Neural Processes - Score: 15 (R=8, N=7) - Date: 2025-08-27 - Comment: The paper introduces Distance-informed Neural Processes, a novel variant of Neural Processes, which aligns with Model Architecture by improving uncertainty estimation.

  78. Biologically Disentangled Multi-Omic Modeling Reveals Mechanistic Insights into Pan-Cancer Immunotherapy Resistance - Score: 15 (R=8, N=7) - Date: 2025-08-27 - Comment: The paper introduces a Biologically Disentangled Variational Autoencoder, which is a novel architecture for integrating multi-omic data, aligning with Model Architecture.

  79. Frozen in Time: Parameter-Efficient Time Series Transformers via Reservoir-Induced Feature Expansion and Fixed Random Dynamics - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper introduces a hybrid transformer architecture for time series, which is relevant to model architecture innovations.

  80. Training Transformers for Mesh-Based Simulations - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper proposes a novel Graph Transformer architecture for mesh-based simulations, focusing on architectural innovations and efficiency, which aligns with model architecture topics.

  81. Module-Aware Parameter-Efficient Machine Unlearning on Transformers - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper introduces a module-aware parameter-efficient machine unlearning approach for Transformers, which is relevant to model architecture and efficiency.

  82. WISCA: A Lightweight Model Transition Method to Improve LLM Training via Weight Scaling - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper proposes a weight scaling method to enhance training efficiency and model quality in LLMs, relevant to model architecture and efficiency.

  83. Learn to Memorize: Optimizing LLM-based Agents with Adaptive Memory Framework - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper proposes an adaptive memory framework for LLM-based agents using MoE, which is relevant to model architecture and representation learning.

  84. CrystalDiT: A Diffusion Transformer for Crystal Generation - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper presents CrystalDiT, a diffusion transformer for crystal generation, focusing on architectural simplicity over complexity, which aligns with model architecture innovations.

  85. Bridging Foundation Models and Efficient Architectures: A Modular Brain Imaging Framework with Local Masking and Pretrained Representation Learning - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper presents a modular framework integrating foundation models with efficient architectures, including a Random Walk Mixture of Experts module, which aligns with model architecture and efficiency topics.

  86. Generative Foundation Model for Structured and Unstructured Electronic Health Records - Score: 15 (R=8, N=7) - Date: 2025-08-25 - Comment: The paper presents a multimodal foundation model, GDP, with a CNN-Transformer encoder and cross-modal attention, aligning with the Model Architecture and Large Language Models criteria.

  87. Representation Learning with Adaptive Superpixel Coding - Score: 15 (R=8, N=7) - Date: 2025-08-25 - Comment: The paper proposes a self-supervised model based on Transformers with adaptive superpixel coding, which aligns with the core topic of model architecture innovation.

  88. Transforming Causality: Transformer-Based Temporal Causal Discovery with Prior Knowledge Integration - Score: 15 (R=8, N=7) - Date: 2025-08-25 - Comment: The paper presents a Transformer-based framework for temporal causal discovery, focusing on architecture-level innovations.

  89. SDEC: Semantic Deep Embedded Clustering - Score: 15 (R=8, N=7) - Date: 2025-08-25 - Comment: The paper presents a novel method combining autoencoders and transformer-based embeddings for text clustering, which aligns with representation learning and model architecture insights.

  90. Tree-like Pairwise Interaction Networks - Score: 15 (R=8, N=7) - Date: 2025-08-22 - Comment: The paper introduces a novel neural network architecture, Tree-like Pairwise Interaction Network, which is relevant to model architecture innovations.

  91. GRASPED: Graph Anomaly Detection using Autoencoder with Spectral Encoder and Decoder (Full Version) - Score: 15 (R=8, N=7) - Date: 2025-08-22 - Comment: The paper proposes a novel autoencoder architecture for graph anomaly detection, focusing on spectral encoding and decoding, which aligns with the interest in model architecture innovations.

  92. MoEcho: Exploiting Side-Channel Attacks to Compromise User Privacy in Mixture-of-Experts LLMs - Score: 15 (R=8, N=7) - Date: 2025-08-22 - Comment: The paper discusses security vulnerabilities in Mixture-of-Experts (MoE) architectures, which is relevant to model architecture analysis.

  93. HHNAS-AM: Hierarchical Hybrid Neural Architecture Search using Adaptive Mutation Policies - Score: 15 (R=8, N=7) - Date: 2025-08-22 - Comment: The paper introduces a novel approach to Neural Architecture Search (NAS) with a hierarchical hybrid structure and adaptive mutation policies, which aligns with the model architecture criterion.

  94. Kourkoutas-Beta: A Sunspike-Driven Adam Optimizer with Desert Flair - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper introduces Kourkoutas-Beta, an Adam-style optimizer with dynamic adjustments, which is relevant to model architecture and training dynamics.

  95. Predicting the Performance of Graph Convolutional Networks with Spectral Properties of the Graph Laplacian - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper explores the use of spectral properties of the graph Laplacian to predict GCN performance, which provides theoretical insights into model architecture.

  96. SEDEG: Sequential Enhancement of Decoder and Encoder's Generality for Class Incremental Learning with Small Memory - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper discusses enhancing encoder-decoder architectures for incremental learning, which aligns with model architecture analysis.

  97. Constructing Invariant and Equivariant Operations by Symmetric Tensor Network - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper introduces a method for constructing invariant and equivariant operations, which is relevant to model architecture.

  98. Cost-Aware Contrastive Routing for LLMs - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper introduces a cost-aware routing framework for LLMs, which is relevant to model architecture and efficiency.

  99. L-SR1: Learned Symmetric-Rank-One Preconditioning - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper introduces a learned second-order optimizer, which is relevant to model architecture and optimization methods.

  100. DynamixSFT: Dynamic Mixture Optimization of Instruction Tuning Collections - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper proposes a dynamic mixture optimization method for instruction-tuning datasets, relevant to model architecture and efficiency.

  101. NeMo: A Neuron-Level Modularizing-While-Training Approach for Decomposing DNN Models - Score: 15 (R=8, N=7) - Date: 2025-08-18 - Comment: The paper introduces a neuron-level modularizing-while-training approach for DNNs, which is relevant to model architecture and representation learning by proposing a new method for modularization applicable to Transformers and CNNs.

  102. Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-08-14 - Comment: The paper introduces a novel memory architecture for LLMs, focusing on domain adaptation without changing original model parameters, aligning with foundational research in LLM architecture.

  103. Hierarchical Adaptive networks with Task vectors for Test-Time Adaptation - Score: 15 (R=8, N=7) - Date: 2025-08-14 - Comment: The paper proposes a novel hierarchical adaptive network architecture for test-time adaptation, which aligns with the model architecture criterion.

  104. Wavelet Mixture of Experts for Time Series Forecasting - Score: 15 (R=8, N=7) - Date: 2025-08-13 - Comment: The paper introduces a Wavelet Mixture of Experts model for time series forecasting, which involves MoE and is relevant to model architecture innovations.

  105. DETACH: Cross-domain Learning for Long-Horizon Tasks via Mixture of Disentangled Experts - Score: 15 (R=8, N=7) - Date: 2025-08-12 - Comment: The paper presents a cross-domain learning framework using a mixture of disentangled experts, relevant to model architecture.

  106. Parity Requires Unified Input Dependence and Negative Eigenvalues in SSMs - Score: 15 (R=8, N=7) - Date: 2025-08-12 - Comment: The paper investigates the theoretical limitations of structured state-space models (SSMs) and proposes conditions for their effectiveness, contributing to model architecture insights.

  107. Mode-Aware Non-Linear Tucker Autoencoder for Tensor-based Unsupervised Learning - Score: 15 (R=8, N=7) - Date: 2025-08-12 - Comment: The paper introduces a non-linear Tucker autoencoder for tensor-based learning, which aligns with interests in representation learning and model architecture.

  108. Recurrent Deep Differentiable Logic Gate Networks - Score: 15 (R=8, N=7) - Date: 2025-08-11 - Comment: The paper presents a new architecture combining differentiable logic gates with recurrent networks, which is relevant to model architecture innovations.

  109. Architecture-Aware Generalization Bounds for Temporal Networks: Theory and Fair Comparison Methodology - Score: 15 (R=8, N=7) - Date: 2025-08-11 - Comment: The paper provides theoretical insights into the generalization bounds of temporal networks, which aligns with foundational research in model architecture.

  110. MolSnap: Snap-Fast Molecular Generation with Latent Variational Mean Flow - Score: 15 (R=8, N=7) - Date: 2025-08-08 - Comment: The paper proposes a novel framework for molecular generation using a Causality-Aware Transformer and Variational Mean Flow, which is relevant to foundational research in AI for Science.

  111. Cross-LoRA: A Data-Free LoRA Transfer Framework across Heterogeneous LLMs - Score: 15 (R=8, N=7) - Date: 2025-08-08 - Comment: The paper introduces Cross-LoRA, a framework for transferring LoRA modules across LLMs, aligning with model architecture and efficiency.

  112. Attention Basin: Why Contextual Position Matters in Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-08-08 - Comment: The paper investigates the positional bias in LLMs and introduces a method to enhance model performance by reordering input sequences, which provides insights into LLM behavior.

  113. Taxonomy of Faults in Attention-Based Neural Networks - Score: 15 (R=8, N=7) - Date: 2025-08-08 - Comment: The paper presents a taxonomy of faults in attention-based neural networks, which is relevant to understanding and analyzing existing architectures.

  114. Gaussian mixture layers for neural networks - Score: 15 (R=8, N=7) - Date: 2025-08-08 - Comment: The paper introduces Gaussian mixture layers for neural networks, an architectural innovation that aligns with the interest in model architecture and representation learning.

  115. Zero-Variance Gradients for Variational Autoencoders - Score: 15 (R=8, N=7) - Date: 2025-08-06 - Comment: The paper proposes a method for zero-variance gradients in VAEs, which aligns with the interest in representation learning and autoencoders.

  116. VCNet: Recreating High-Level Visual Cortex Principles for Robust Artificial Vision - Score: 15 (R=8, N=7) - Date: 2025-08-06 - Comment: VCNet introduces a novel architecture inspired by the primate visual cortex, aligning with model architecture innovation.

  117. BoostTransformer: Enhancing Transformer Models with Subgrid Selection and Importance Sampling - Score: 15 (R=8, N=7) - Date: 2025-08-06 - Comment: The paper proposes a novel framework, BoostTransformer, which enhances transformer models with boosting principles, focusing on architectural innovation and efficiency improvements.

  118. Adaptive Riemannian Graph Neural Networks - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper introduces Adaptive Riemannian Graph Neural Networks, which is relevant to model architecture innovations.

  119. What are you sinking? A geometric approach on attention sink - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper provides a geometric analysis of attention mechanisms in transformers, offering insights into architectural components and their effects, aligning with the model architecture criterion.

  120. LetheViT: Selective Machine Unlearning for Vision Transformers via Attention-Guided Contrastive Learning - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper proposes LetheViT, a method for machine unlearning in vision transformers, which aligns with model architecture and representation learning criteria.

  121. Training Dynamics of the Cooldown Stage in Warmup-Stable-Decay Learning Rate Scheduler - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper analyzes the cooldown stage of the Warmup-Stable-Decay learning rate scheduler in transformers, which is relevant to model architecture and training dynamics.

  122. HT-Transformer: Event Sequences Classification by Accumulating Prefix Information with History Tokens - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper proposes a novel transformer-based model with history tokens, which is relevant to model architecture innovations.

  123. Expressive Power of Graph Transformers via Logic - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper provides theoretical insights into the expressive power of graph transformers, which aligns with the interest in understanding model architectures, particularly transformers.

  124. Sheaf Graph Neural Networks via PAC-Bayes Spectral Optimization - Score: 15 (R=8, N=7) - Date: 2025-08-04 - Comment: The paper introduces a novel architecture for Graph Neural Networks using PAC-Bayes spectral optimization, which aligns with the model architecture criterion.

  125. Invariant Graph Transformer for Out-of-Distribution Generalization - Score: 15 (R=8, N=7) - Date: 2025-08-04 - Comment: The paper introduces GOODFormer, a Graph Transformer for out-of-distribution generalization, focusing on invariant graph learning, which is relevant to model architecture innovations.

  126. Sinusoidal Approximation Theorem for Kolmogorov-Arnold Networks - Score: 15 (R=8, N=7) - Date: 2025-08-04 - Comment: The paper introduces a novel variant of Kolmogorov-Arnold Networks using sinusoidal functions, which aligns with the model architecture criterion.

  127. Stress-Aware Resilient Neural Training - Score: 15 (R=8, N=7) - Date: 2025-08-04 - Comment: The paper proposes a stress-aware learning paradigm with a novel optimizer, relevant to training dynamics in neural networks.

  128. SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model - Score: 15 (R=8, N=7) - Date: 2025-08-01 - Comment: The paper proposes a new architecture, SimuRA, for generalized agentic reasoning using LLMs, which aligns with interests in architectural innovations and theoretical insights into LLM behavior.

  129. Solution-aware vs global ReLU selection: partial MILP strikes back for DNN verification - Score: 15 (R=8, N=7) - Date: 2025-08-01 - Comment: The paper introduces a novel solution-aware ReLU selection method for DNN verification, which aligns with the Model Architecture criterion by proposing a new approach to handle ReLU variables in neural networks.

Model Compression and Efficiency (117)

  1. Spatio-Temporal Pruning for Compressed Spiking Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-29 - Comment: The paper explores spatio-temporal pruning for Spiking LLMs, which is relevant to model compression and efficiency in large language models.

  2. CORE: Lossless Compression for Retrieval-Augmented LLMs via Reinforcement Learning - Score: 17 (R=9, N=8) - Date: 2025-08-28 - Comment: The paper proposes a novel method for lossless compression in retrieval-augmented LLMs, aligning with the Model Compression criterion.

  3. APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration - Score: 17 (R=9, N=8) - Date: 2025-08-27 - Comment: The paper focuses on quantization and efficiency improvements for LLMs, which aligns with model compression and efficiency breakthroughs.

  4. TiKMiX: Take Data Influence into Dynamic Mixture for Language Model Pre-training - Score: 17 (R=9, N=8) - Date: 2025-08-26 - Comment: The paper introduces TiKMiX, a method for dynamic data mixture adjustment in language model pre-training, which aligns with foundational research in LLM pretraining and efficiency.

  5. CoViPAL: Layer-wise Contextualized Visual Token Pruning for Large Vision-Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-26 - Comment: The paper introduces a novel method for pruning visual tokens in large vision-language models, which aligns with model compression through pruning and efficiency improvements.

  6. CommonKV: Compressing KV Cache with Cross-layer Parameter Sharing - Score: 17 (R=9, N=8) - Date: 2025-08-25 - Comment: The paper introduces CommonKV, a novel method for compressing KV cache in LLMs using cross-layer parameter sharing and SVD, which aligns with the model compression criterion.

  7. TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill & Decode Inference - Score: 17 (R=9, N=8) - Date: 2025-08-25 - Comment: The paper introduces Tensor-Parallel Latent Attention, which targets efficient inference and is relevant to model compression and efficiency.

  8. Z-Pruner: Post-Training Pruning of Large Language Models for Efficiency without Retraining - Score: 17 (R=9, N=8) - Date: 2025-08-25 - Comment: Z-Pruner introduces a novel post-training pruning method for LLMs, focusing on inducing sparsity without retraining, which aligns with the model compression criterion.

  9. SCOPE: A Generative Approach for LLM Prompt Compression - Score: 17 (R=9, N=8) - Date: 2025-08-25 - Comment: The paper introduces a generative approach for prompt compression in LLMs, focusing on efficiency and coherence, which aligns with model compression and LLM efficiency.

  10. SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning - Score: 17 (R=9, N=8) - Date: 2025-08-22 - Comment: The paper introduces SparK, a method for unstructured sparsity and KV cache channel pruning, which is relevant to model compression and efficiency breakthroughs.

  11. GLASS: Test-Time Acceleration for LLMs via Global-Local Neural Importance Aggregation - Score: 17 (R=9, N=8) - Date: 2025-08-21 - Comment: The paper introduces a novel method for dynamic pruning in LLMs, focusing on model compression through sparsification without training, which aligns with the model compression criterion.

  12. Neuro-inspired Ensemble-to-Ensemble Communication Primitives for Sparse and Efficient ANNs - Score: 17 (R=9, N=8) - Date: 2025-08-21 - Comment: The paper presents a novel architecture inspired by biological neural circuits, focusing on sparsity and efficiency, relevant to model compression and architecture.

  13. GRAFT: Gradient-Aware Fast MaxVol Technique for Dynamic Data Sampling - Score: 17 (R=9, N=8) - Date: 2025-08-20 - Comment: The paper presents a method for dynamic data sampling using low-rank feature representations, which is relevant to model compression and efficiency. It introduces a novel technique for subset selection that reduces computational costs, aligning with foundational research in model efficiency.

  14. A Perfectly Truthful Calibration Measure - Score: 17 (R=9, N=8) - Date: 2025-08-19 - Comment: The paper introduces a perfectly truthful calibration measure, which is a theoretical advancement in the field of prediction calibration.

  15. FLARE: Fast Low-rank Attention Routing Engine - Score: 17 (R=9, N=8) - Date: 2025-08-19 - Comment: The paper presents a low-rank attention mechanism, which is relevant to model compression and efficiency.

  16. Quantization through Piecewise-Affine Regularization: Optimization and Statistical Guarantees - Score: 17 (R=9, N=8) - Date: 2025-08-18 - Comment: The paper discusses quantization through piecewise-affine regularization, which is relevant to model compression techniques.

  17. XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization - Score: 17 (R=9, N=8) - Date: 2025-08-15 - Comment: The paper introduces a novel approach to reduce memory consumption in LLM inference using quantization and rematerialization, relevant to model compression.

  18. Global Convergence Analysis of Vanilla Gradient Descent for Asymmetric Matrix Completion - Score: 17 (R=9, N=8) - Date: 2025-08-14 - Comment: The paper provides a theoretical analysis of gradient descent for asymmetric low-rank matrix completion, which is relevant to model compression and efficiency.

  19. EGGS-PTP: An Expander-Graph Guided Structured Post-training Pruning Method for Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-14 - Comment: The paper introduces a novel structured pruning method for LLMs using expander graphs, aligning with model compression and efficiency breakthroughs.

  20. Synaptic Pruning: A Biological Inspiration for Deep Learning Regularization - Score: 17 (R=9, N=8) - Date: 2025-08-14 - Comment: The paper proposes a synaptic pruning method inspired by biological processes, which is relevant to model compression through sparsity and pruning.

  21. DQT: Dynamic Quantization Training via Dequantization-Free Nested Integer Arithmetic - Score: 17 (R=9, N=8) - Date: 2025-08-14 - Comment: The paper introduces a novel framework for dynamic quantization training, which is relevant to model compression and efficiency.

  22. Towards Scalable Lottery Ticket Networks using Genetic Algorithms - Score: 17 (R=9, N=8) - Date: 2025-08-13 - Comment: The paper explores using genetic algorithms to identify strong lottery ticket subnetworks, which is relevant to model compression and efficiency.

  23. Dynamic Rank Adjustment for Accurate and Efficient Neural Network Training - Score: 17 (R=9, N=8) - Date: 2025-08-13 - Comment: The paper proposes a dynamic-rank training framework, relevant to model compression and efficiency.

  24. DySK-Attn: A Framework for Efficient, Real-Time Knowledge Updating in Large Language Models via Dynamic Sparse Knowledge Attention - Score: 17 (R=9, N=8) - Date: 2025-08-12 - Comment: The paper proposes a framework for real-time knowledge updating in LLMs using dynamic sparse knowledge attention, which is relevant to model architecture and efficiency.

  25. Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning - Score: 17 (R=9, N=8) - Date: 2025-08-12 - Comment: The paper introduces a training-free sparse attention mechanism, relevant to model compression and efficiency.

  26. BoRA: Towards More Expressive Low-Rank Adaptation with Block Diversity - Score: 17 (R=9, N=8) - Date: 2025-08-12 - Comment: The paper proposes BoRA, a novel method for low-rank adaptation in LLMs, enhancing parameter efficiency and model performance, relevant to model compression.

  27. Generalizing Scaling Laws for Dense and Sparse Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-12 - Comment: The paper proposes a generalized scaling law for both dense and sparse LLMs, which is relevant to foundational research in LLMs and model efficiency.

  28. One Size Does Not Fit All: A Distribution-Aware Sparsification for More Precise Model Merging - Score: 17 (R=9, N=8) - Date: 2025-08-11 - Comment: The paper presents an adaptive sparsification strategy for model merging, which is relevant to model compression and sparsity, offering a novel approach to parameter pruning.

  29. DP-LLM: Runtime Model Adaptation with Dynamic Layer-wise Precision Assignment - Score: 17 (R=9, N=8) - Date: 2025-08-11 - Comment: The paper introduces a novel mechanism for runtime model adaptation in LLMs using dynamic layer-wise precision assignment, aligning with model compression and efficiency breakthroughs.

  30. Fourier-VLM: Compressing Vision Tokens in the Frequency Domain for Large Vision-Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-11 - Comment: The paper introduces a novel method for compressing visual representations in vision-language models using frequency domain techniques, which aligns with the model compression criterion.

  31. The Fourth State: Signed-Zero Ternary for Stable LLM Quantization (and More) - Score: 17 (R=9, N=8) - Date: 2025-08-11 - Comment: The paper introduces a novel quantization method, Signed-Zero Ternary (SZT), which is relevant to model compression and efficiency breakthroughs.

  32. Fairy$\pm i$: the First 2-bit Complex LLM with All Parameters in $\{\pm1, \pm i\}$ - Score: 17 (R=9, N=8) - Date: 2025-08-08 - Comment: The paper presents a novel 2-bit quantization framework for complex-valued LLMs, which is relevant to model compression and efficiency breakthroughs.

  33. Optimal Growth Schedules for Batch Size and Learning Rate in SGD that Reduce SFO Complexity - Score: 17 (R=9, N=8) - Date: 2025-08-08 - Comment: The paper provides theoretical insights into optimizing batch size and learning rate schedules for SGD, which is relevant to training dynamics in neural networks.

  34. InfoQ: Mixed-Precision Quantization via Global Information Flow - Score: 17 (R=9, N=8) - Date: 2025-08-08 - Comment: The paper presents InfoQ, a novel framework for mixed-precision quantization focusing on global information flow, relevant to model compression.

  35. MoKA: Mixture of Kronecker Adapters - Score: 17 (R=9, N=8) - Date: 2025-08-06 - Comment: The paper proposes a new generation of Kronecker adapters for parameter-efficient fine-tuning, which is relevant to model compression and architecture innovation.

  36. Compressing Chain-of-Thought in LLMs via Step Entropy - Score: 17 (R=9, N=8) - Date: 2025-08-06 - Comment: The paper introduces a novel CoT compression framework for LLMs, focusing on efficiency improvements, which aligns with model compression.

  37. LOST: Low-rank and Sparse Pre-training for Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: The paper proposes LOST, a method integrating low-rank and sparse structures for efficient pre-training of large language models, which is relevant to model compression.

  38. CompressKV: Semantic Retrieval Heads Know What Tokens are Not Important Before Generation - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: The paper presents CompressKV, a method for KV cache compression in LLMs, aligning with the model compression criterion focusing on efficiency breakthroughs.

  39. LeanK: Learnable K Cache Channel Pruning for Efficient Decoding - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: LeanK proposes a learning-based method for pruning key cache channels in LLMs, relevant to model compression and efficiency.

  40. Amber Pruner: Leveraging N:M Activation Sparsity for Efficient Prefill in Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: The paper introduces Amber Pruner, a training-free N:M activation sparsity method for LLMs, which aligns with the model compression criterion focusing on sparsity and efficiency breakthroughs.

  41. Kronecker-LoRA: hybrid Kronecker-LoRA adapters for scalable, sustainable fine-tuning - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: The paper introduces Kron-LoRA, a novel approach combining Kronecker product and low-rank decomposition for efficient fine-tuning of large language models, which is relevant to model compression.

  42. Adacc: Adaptive Compression and Activation Checkpointing for LLM Memory Management - Score: 17 (R=9, N=8) - Date: 2025-08-04 - Comment: The paper introduces a novel memory management framework combining adaptive compression and activation checkpointing, relevant to model compression.

  43. Measuring Reasoning Utility in LLMs via Conditional Entropy Reduction - Score: 16 (R=9, N=7) - Date: 2025-08-29 - Comment: The paper examines reasoning utility in LLMs using conditional entropy, providing theoretical insights into LLM behavior.

  44. Beacon: Post-Training Quantization with Integrated Grid Selection - Score: 16 (R=9, N=7) - Date: 2025-08-29 - Comment: The paper presents Beacon, a novel algorithm for post-training quantization, which is relevant to model compression techniques.

  45. Scaling Laws for Task-Stratified Knowledge in Post-Training Quantized Large Language Models - Score: 16 (R=9, N=7) - Date: 2025-08-27 - Comment: The paper provides insights into post-training quantization effects on LLMs, which is relevant to model compression and understanding LLM behavior.

  46. Interpreting the Effects of Quantization on LLMs - Score: 16 (R=9, N=7) - Date: 2025-08-26 - Comment: The paper investigates the effects of quantization on LLMs, providing insights into model compression and interpretability.

  47. Systematic Characterization of LLM Quantization: A Performance, Energy, and Quality Perspective - Score: 16 (R=9, N=7) - Date: 2025-08-26 - Comment: The paper provides a comprehensive characterization of LLM quantization methods, focusing on performance, energy, and quality tradeoffs, which aligns with the model compression criterion.

  48. GEM: A Scale-Aware and Distribution-Sensitive Sparse Fine-Tuning Framework for Effective Downstream Adaptation - Score: 16 (R=9, N=7) - Date: 2025-08-25 - Comment: The paper proposes a sparse fine-tuning framework that is scale-aware and distribution-sensitive, aligning with model compression and efficiency breakthroughs.

  49. Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs - Score: 16 (R=9, N=7) - Date: 2025-08-21 - Comment: The paper conducts a systematic study on quantizing diffusion-based language models, relevant to model compression and efficiency.

  50. Amortized Bayesian Meta-Learning for Low-Rank Adaptation of Large Language Models - Score: 16 (R=9, N=7) - Date: 2025-08-21 - Comment: The paper discusses low-rank adaptation of large language models, which is relevant to model compression and efficiency improvements.

  51. One Shot vs. Iterative: Rethinking Pruning Strategies for Model Compression - Score: 16 (R=9, N=7) - Date: 2025-08-20 - Comment: The paper provides a comprehensive comparison of pruning strategies, which is relevant to model compression. It introduces a hybrid pruning approach, contributing to the understanding of pruning strategies.

  52. Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System - Score: 16 (R=9, N=7) - Date: 2025-08-20 - Comment: The paper addresses LLM inference efficiency through dynamic KV cache placement in heterogeneous memory systems, which is relevant to efficiency breakthroughs in LLM inference.

  53. NEURAL: Attention-Guided Pruning for Unified Multimodal Resource-Constrained Clinical Evaluation - Score: 16 (R=9, N=7) - Date: 2025-08-14 - Comment: The paper introduces a novel framework for data compression using attention-guided pruning, which aligns with model compression and efficiency breakthroughs.

  54. Selective KV-Cache Sharing to Mitigate Timing Side-Channels in LLM Inference - Score: 16 (R=9, N=7) - Date: 2025-08-13 - Comment: The paper discusses KV-cache sharing, which is relevant to model compression and efficiency in LLMs.

  55. SmallKV: Small Model Assisted Compensation of KV Cache Compression for Efficient LLM Inference - Score: 16 (R=9, N=7) - Date: 2025-08-06 - Comment: The paper introduces SmallKV, a method for KV cache compression in LLMs, which aligns with model compression and efficiency breakthroughs.

  56. Flexible Automatic Identification and Removal (FAIR)-Pruner: An Efficient Neural Network Pruning Method - Score: 16 (R=9, N=7) - Date: 2025-08-05 - Comment: FAIR-Pruner introduces a novel method for neural network pruning, focusing on structured pruning and automatic determination of layer-wise pruning rates, relevant to model compression.

  57. Toward Efficient Spiking Transformers: Synapse Pruning Meets Synergistic Learning-Based Compensation - Score: 16 (R=9, N=7) - Date: 2025-08-05 - Comment: The paper proposes synapse pruning and synergistic learning for efficient spiking Transformers, contributing to model compression and efficiency.

  58. Systematic Evaluation of Optimization Techniques for Long-Context Language Models - Score: 16 (R=9, N=7) - Date: 2025-08-04 - Comment: The paper systematically evaluates optimization techniques like pruning and quantization for LLMs, aligning with the model compression criterion.

  59. Towards Source-Free Machine Unlearning - Score: 16 (R=8, N=8) - Date: 2025-08-22 - Comment: The paper addresses source-free machine unlearning, which is relevant to model compression and efficiency.

  60. Approximate Bayesian Inference via Bitstring Representations - Score: 16 (R=8, N=8) - Date: 2025-08-20 - Comment: The paper introduces a novel approach to approximate Bayesian inference using quantized representations, relevant to model compression and efficiency.

  61. AdaRing: Towards Ultra-Light Vision-Language Adaptation via Cross-Layer Tensor Ring Decomposition - Score: 16 (R=8, N=8) - Date: 2025-08-19 - Comment: The paper proposes AdaRing, a novel vision-language fine-tuning framework using cross-layer tensor ring decomposition, which involves model compression techniques.

  62. Towards High-Order Mean Flow Generative Models: Feasibility, Expressivity, and Provably Efficient Criteria - Score: 16 (R=8, N=8) - Date: 2025-08-12 - Comment: The paper introduces a theoretical study on Second-Order MeanFlow, which is relevant to emerging trends in generative modeling and foundational research.

  63. Learning Logical Rules using Minimum Message Length - Score: 16 (R=8, N=8) - Date: 2025-08-11 - Comment: The paper introduces a Bayesian inductive logic programming approach, which is relevant to foundational research in AI, focusing on unifying probabilistic and logical learning.

  64. Superior resilience to poisoning and amenability to unlearning in quantum machine learning - Score: 16 (R=8, N=8) - Date: 2025-08-05 - Comment: The paper explores quantum machine learning's resilience and unlearning capabilities, which is an emerging trend in AI with potential foundational implications.

  65. Towards Higher Effective Rank in Parameter-efficient Fine-tuning using Khatri-Rao Product - Score: 16 (R=8, N=8) - Date: 2025-08-04 - Comment: The paper introduces KRAdapter, a novel PEFT algorithm using the Khatri-Rao product for higher effective rank, relevant to model compression and efficiency.

  66. Self-Composing Neural Operators with Depth and Accuracy Scaling via Adaptive Train-and-Unroll Approach - Score: 15 (R=8, N=7) - Date: 2025-08-29 - Comment: The paper proposes a novel framework for neural operators with a focus on efficiency and accuracy, relevant to model architecture innovations.

  67. MERIT: Maximum-normalized Element-wise Ratio for Language Model Large-batch Training - Score: 15 (R=8, N=7) - Date: 2025-08-29 - Comment: The paper introduces a novel optimizer, MERIT, for large-batch training in language models, focusing on improving training stability and efficiency, which aligns with the model compression and efficiency criteria.

  68. Uncertainty Under the Curve: A Sequence-Level Entropy Area Metric for Reasoning LLM - Score: 15 (R=8, N=7) - Date: 2025-08-29 - Comment: The paper introduces Entropy Area Score (EAS), a novel metric for quantifying uncertainty in LLMs, which contributes to foundational research in LLM behavior.

  69. Diffusion Language Models Know the Answer Before Decoding - Score: 15 (R=8, N=7) - Date: 2025-08-28 - Comment: The paper presents a novel fast decoding paradigm for diffusion language models, which is relevant to model compression and efficiency improvements.

  70. Efficiently Generating Multidimensional Calorimeter Data with Tensor Decomposition Parameterization - Score: 15 (R=8, N=7) - Date: 2025-08-28 - Comment: The paper discusses tensor decomposition for efficient data generation, aligning with model compression through low-rank approaches.

  71. Get Global Guarantees: On the Probabilistic Nature of Perturbation Robustness - Score: 15 (R=8, N=7) - Date: 2025-08-27 - Comment: The paper proposes a new metric for evaluating robustness in neural models, which aligns with emerging trends in theoretical work.

  72. Sparse minimum Redundancy Maximum Relevance for feature selection - Score: 15 (R=8, N=7) - Date: 2025-08-27 - Comment: The paper proposes a sparse feature selection method using a penalized mRMR procedure, which aligns with Model Compression through sparsity and feature selection.

  73. How Quantization Shapes Bias in Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper evaluates how quantization affects bias in large language models, which is relevant to model compression and efficiency.

  74. Riemannian Optimization for LoRA on the Stiefel Manifold - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper discusses a novel optimization method for LoRA, which is relevant to model compression and efficiency improvements.

  75. Steering When Necessary: Flexible Steering Large Language Models with Backtracking - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper proposes a framework for flexible activation steering in LLMs, focusing on aligning model behavior, which is relevant to model architecture and efficiency improvements.

  76. TokenLake: A Unified Segment-level Prefix Cache Pool for Fine-grained Elastic Long-Context LLM Serving - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: TokenLake introduces a novel segment-level prefix cache pool for LLMs, which is relevant to model compression and efficiency improvements.

  77. Optimizing Neural Networks with Learnable Non-Linear Activation Functions via Lookup-Based FPGA Acceleration - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper discusses optimizing neural networks with learnable activation functions via FPGA acceleration, which is relevant to model architecture and efficiency.

  78. Dynamic Sparse Attention on Mobile SoCs - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper presents a system-algorithm co-designed sparse attention module, relevant to model compression and efficiency.

  79. Confidence-Modulated Speculative Decoding for Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper presents a novel speculative decoding method for LLMs, focusing on efficiency improvements, which aligns with model compression and efficiency breakthroughs.

  80. SpecVLM: Enhancing Speculative Decoding of Video LLMs via Verifier-Guided Token Pruning - Score: 15 (R=8, N=7) - Date: 2025-08-25 - Comment: The paper introduces a speculative decoding framework for video LLMs, focusing on efficiency improvements through token pruning, relevant to model compression.

  81. SemToken: Semantic-Aware Tokenization for Efficient Long-Context Language Modeling - Score: 15 (R=8, N=7) - Date: 2025-08-22 - Comment: The paper proposes SemToken, a semantic-aware tokenization framework, which offers a new approach to optimize tokenization and computation in large language models, relevant to representation learning and efficiency.

  82. Multi-view Graph Condensation via Tensor Decomposition - Score: 15 (R=8, N=7) - Date: 2025-08-21 - Comment: The paper proposes a novel method for graph condensation using tensor decomposition, which aligns with the model compression criterion by reducing graph size while preserving performance.

  83. QuickMerge++: Fast Token Merging with Autoregressive Prior - Score: 15 (R=8, N=7) - Date: 2025-08-20 - Comment: The paper proposes a token merging framework for efficient next-token prediction, which relates to model compression and efficiency. It introduces a novel approach to reduce token counts while maintaining performance.

  84. SparseMap: A Sparse Tensor Accelerator Framework Based on Evolution Strategy - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper presents a framework for optimizing sparse tensor accelerators, which is relevant to model compression and efficiency.

  85. A Self-Ensemble Inspired Approach for Effective Training of Binary-Weight Spiking Neural Networks - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper presents a novel approach for training binary-weight spiking neural networks, which is relevant to model compression and efficiency.

  86. SSPO: Self-traced Step-wise Preference Optimization for Process Supervision and Reasoning Compression - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper proposes SSPO, a framework for reasoning compression in LLMs, which aligns with foundational research in LLM behavior and interpretability.

  87. ProtTeX-CC: Activating In-Context Learning in Protein LLM via Two-Stage Instruction Compression - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper introduces a two-stage compression framework for protein LLMs, relevant to model compression and efficiency.

  88. PTQAT: A Hybrid Parameter-Efficient Quantization Algorithm for 3D Perception Tasks - Score: 15 (R=8, N=7) - Date: 2025-08-15 - Comment: The paper presents a novel hybrid quantization algorithm, which is relevant to model compression through quantization.

  89. When Language Overrules: Revealing Text Dominance in Multimodal Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-08-15 - Comment: The paper investigates text dominance in multimodal LLMs and proposes a token compression method, which is relevant to foundational research in LLM behavior.

  90. Pruning and Malicious Injection: A Retraining-Free Backdoor Attack on Transformer Models - Score: 15 (R=8, N=7) - Date: 2025-08-15 - Comment: The paper introduces a novel backdoor attack on transformers using pruning, which is relevant to model compression and sparsity.

  91. Pruning Long Chain-of-Thought of Large Reasoning Models via Small-Scale Preference Optimization - Score: 15 (R=8, N=7) - Date: 2025-08-15 - Comment: The paper proposes a method to reduce the output length of large reasoning models, which relates to model efficiency and compression.

  92. SABER: Switchable and Balanced Training for Efficient LLM Reasoning - Score: 15 (R=8, N=7) - Date: 2025-08-15 - Comment: The paper proposes a reinforcement learning framework for efficient reasoning in LLMs, focusing on token-budgeted reasoning, which is relevant to foundational research in LLM efficiency.

  93. RTTC: Reward-Guided Collaborative Test-Time Compute - Score: 15 (R=8, N=7) - Date: 2025-08-15 - Comment: The paper introduces a framework for adaptive test-time compute in LLMs, which is relevant to foundational research in LLM efficiency and adaptation.

  94. Beyond Scaling Law: A Data-Efficient Distillation Framework for Reasoning - Score: 15 (R=8, N=7) - Date: 2025-08-14 - Comment: The paper proposes a data-efficient distillation framework for reasoning in LLMs, aligning with foundational research in LLM behavior and efficiency.

  95. Structured Kernel Regression VAE: A Computationally Efficient Surrogate for GP-VAEs in ICA - Score: 15 (R=8, N=7) - Date: 2025-08-14 - Comment: The paper introduces SKR-VAE, a computationally efficient surrogate for GP-VAEs in ICA, which is relevant to model architecture and efficiency improvements.

  96. Train Long, Think Short: Curriculum Learning for Efficient Reasoning - Score: 15 (R=8, N=7) - Date: 2025-08-13 - Comment: The paper proposes a curriculum learning strategy for efficient reasoning in LLMs, which is relevant to model architecture and training dynamics.

  97. ASPD: Unlocking Adaptive Serial-Parallel Decoding by Exploring Intrinsic Parallelism in LLMs - Score: 15 (R=8, N=7) - Date: 2025-08-13 - Comment: The paper proposes a method for adaptive serial-parallel decoding in LLMs, focusing on inference efficiency, relevant to model compression and efficiency.

  98. EditMF: Drawing an Invisible Fingerprint for Your Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-08-13 - Comment: The paper introduces a novel fingerprinting method for LLMs, which is relevant to foundational research in LLMs.

  99. OverFill: Two-Stage Models for Efficient Language Model Decoding - Score: 15 (R=8, N=7) - Date: 2025-08-13 - Comment: The paper proposes OverFill, a method to optimize LLM inference stages, which is relevant to model compression and efficiency.

  100. Semantic Caching for Low-Cost LLM Serving: From Offline Learning to Online Adaptation - Score: 15 (R=8, N=7) - Date: 2025-08-12 - Comment: The paper presents a framework for semantic caching in LLMs, focusing on efficiency and theoretical foundations, which aligns with interests in model compression and efficiency.

  101. Efficient Edge LLMs Deployment via HessianAware Quantization and CPU GPU Collaborative - Score: 15 (R=8, N=7) - Date: 2025-08-12 - Comment: The paper addresses efficient deployment of MoE architectures using Hessian-Aware Quantization, which is relevant to model compression and efficiency.

  102. Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal - Score: 15 (R=8, N=7) - Date: 2025-08-11 - Comment: The paper proposes a novel framework for CoT compression in code reasoning, which is relevant to model compression. It introduces a new method for efficient reasoning, contributing to algorithmic efficiency breakthroughs.

  103. Pruning Large Language Models by Identifying and Preserving Functional Networks - Score: 15 (R=8, N=7) - Date: 2025-08-08 - Comment: The paper proposes a new method for pruning LLMs by identifying functional networks, which is relevant to model compression and efficiency.

  104. Provable Post-Training Quantization: Theoretical Analysis of OPTQ and Qronos - Score: 15 (R=8, N=7) - Date: 2025-08-08 - Comment: The paper provides theoretical analysis and error bounds for post-training quantization methods, which is relevant to model compression and efficiency.

  105. VLMQ: Efficient Post-Training Quantization for Large Vision-Language Models via Hessian Augmentation - Score: 15 (R=8, N=7) - Date: 2025-08-06 - Comment: The paper presents a novel post-training quantization framework for vision-language models, focusing on efficiency and compression, which is relevant to model compression.

  106. Exploring Layer-wise Information Effectiveness for Post-Training Quantization in Small Language Models - Score: 15 (R=8, N=7) - Date: 2025-08-06 - Comment: The paper presents a metric-driven post-training quantization framework for small language models, focusing on compression and efficiency, which is relevant to model compression.

  107. Where and How to Enhance: Discovering Bit-Width Contribution for Mixed Precision Quantization - Score: 15 (R=8, N=7) - Date: 2025-08-06 - Comment: The paper proposes a Shapley-based method for mixed precision quantization, which is relevant to model compression and efficiency.

  108. ProCut: LLM Prompt Compression via Attribution Estimation - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper introduces ProCut for LLM prompt compression, focusing on efficiency improvements in LLMs, which aligns with model compression.

  109. Accelerating LLM Reasoning via Early Rejection with Partial Reward Modeling - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper introduces a method for accelerating LLM reasoning via early rejection, relevant to model efficiency and theoretical insights into LLM behavior.

  110. Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper explores continual pre-training of LLMs with techniques like experience replay and gradient alignment, relevant to understanding LLM behavior and efficiency.

  111. FlashSVD: Memory-Efficient Inference with Streaming for Low-Rank Models - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper introduces FlashSVD, a framework for memory-efficient inference with low-rank models, which is relevant to model compression.

  112. DisTaC: Conditioning Task Vectors via Distillation for Robust Model Merging - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper introduces a novel method for robust model merging using distillation, relevant to model architecture and efficiency.

  113. Compression-Induced Communication-Efficient Large Model Training and Inferencing - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper introduces phantom parallelism for energy-efficient training, aligning with the model compression criterion focusing on efficiency breakthroughs.

  114. Improved Algorithms for Kernel Matrix-Vector Multiplication Under Sparsity Assumptions - Score: 15 (R=8, N=7) - Date: 2025-08-01 - Comment: The paper presents improved algorithms for kernel matrix-vector multiplication under sparsity assumptions, which aligns with interests in model compression and efficiency breakthroughs.

  115. Differentially Private Clipped-SGD: High-Probability Convergence with Arbitrary Clipping Level - Score: 15 (R=8, N=7) - Date: 2025-08-01 - Comment: The paper provides a high-probability convergence analysis for DP-Clipped-SGD with a fixed clipping level, relevant to model compression and efficiency.

  116. Coflex: Enhancing HW-NAS with Sparse Gaussian Processes for Efficient and Scalable DNN Accelerator Design - Score: 15 (R=8, N=7) - Date: 2025-08-01 - Comment: The paper introduces a novel HW-NAS framework integrating Sparse Gaussian Processes for efficient DNN accelerator design, which aligns with model architecture and compression topics.

  117. Efficient Machine Unlearning via Influence Approximation - Score: 15 (R=8, N=7) - Date: 2025-08-01 - Comment: The paper introduces Influence Approximation Unlearning, linking incremental learning and unlearning, relevant to model compression and efficiency.

High Performance Computing (28)

  1. Model Science: getting serious about verification, explanation and control of AI systems - Score: 18 (R=9, N=9) - Date: 2025-08-28 - Comment: The paper proposes a new discipline called Model Science, which focuses on verification, explanation, and control of AI systems, aligning with the emerging trends criterion by introducing a broad new paradigm.

  2. Virtuous Machines: Towards Artificial General Science - Score: 18 (R=9, N=9) - Date: 2025-08-20 - Comment: The paper discusses a domain-agnostic AI system capable of autonomously conducting scientific research, which is a significant step towards AI for Science. It introduces a new paradigm for AI systems in scientific discovery.

  3. Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction - Score: 18 (R=9, N=9) - Date: 2025-08-06 - Comment: The paper introduces Goedel-Prover-V2, a new state-of-the-art system for automated theorem proving, with innovations in data synthesis and self-correction, aligning with foundational research in AI for Science.

  4. Amortized Sampling with Transferable Normalizing Flows - Score: 17 (R=9, N=8) - Date: 2025-08-26 - Comment: The paper introduces a transferable normalizing flow for molecular conformations, which is relevant to AI for Science and foundational research in molecular modeling.

  5. Programmable k-local Ising Machines and all-optical Kolmogorov-Arnold Networks on Photonic Platforms - Score: 17 (R=9, N=8) - Date: 2025-08-26 - Comment: The paper introduces a novel photonic platform for k-local Ising optimization and optical KAN function learning, which could be considered foundational research in AI for Science due to its innovative approach to optical computing.

  6. Mutual Information Surprise: Rethinking Unexpectedness in Autonomous Systems - Score: 17 (R=9, N=8) - Date: 2025-08-26 - Comment: The introduction of Mutual Information Surprise as a new framework for autonomous systems is a novel theoretical contribution, aligning with emerging trends in AI.

  7. Source-Guided Flow Matching - Score: 17 (R=9, N=8) - Date: 2025-08-21 - Comment: The paper introduces a novel framework for generative models by modifying the source distribution, which aligns with emerging trends in foundational research by challenging established assumptions in generative modeling.

  8. The Rise of Generative AI for Metal-Organic Framework Design and Synthesis - Score: 17 (R=9, N=8) - Date: 2025-08-20 - Comment: The paper discusses generative AI for designing metal-organic frameworks, focusing on foundational research in molecular modeling and new generative paradigms.

  9. Time-Scale Coupling Between States and Parameters in Recurrent Neural Networks - Score: 17 (R=9, N=8) - Date: 2025-08-19 - Comment: The paper studies how gating mechanisms in RNNs induce adaptive learning-rate behavior, providing insights into training dynamics in neural networks.

  10. Memorisation and forgetting in a learning Hopfield neural network: bifurcation mechanisms, attractors and basins - Score: 17 (R=9, N=8) - Date: 2025-08-15 - Comment: The paper provides a comprehensive analysis of memory formation in Hopfield networks, relevant to foundational research in neural network behavior.

  11. Whither symbols in the era of advanced neural networks? - Score: 17 (R=9, N=8) - Date: 2025-08-11 - Comment: The paper discusses the symbolic basis of human thought in the context of neural networks, which aligns with representation learning and emerging trends in AI. It challenges established assumptions about symbolic systems.

  12. Learning to optimize with guarantees: a complete characterization of linearly convergent algorithms - Score: 17 (R=9, N=8) - Date: 2025-08-04 - Comment: The paper provides a theoretical characterization of linearly convergent algorithms, which aligns with the emerging trends criterion.

  13. Principled Detection of Hallucinations in Large Language Models via Multiple Testing - Score: 16 (R=9, N=7) - Date: 2025-08-27 - Comment: The paper addresses hallucination detection in LLMs using a hypothesis testing approach, which provides theoretical insights into LLM behavior, aligning with the criteria for foundational research in LLMs.

  14. Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs - Score: 16 (R=9, N=7) - Date: 2025-08-12 - Comment: The paper explores pretraining data filtering as a method to build tamper-resistant safeguards into open-weight LLMs, contributing to foundational research in LLMs.

  15. Spacer: Towards Engineered Scientific Inspiration - Score: 16 (R=8, N=8) - Date: 2025-08-26 - Comment: Spacer proposes a novel system for scientific discovery using LLMs, which aligns with emerging trends in AI for Science.

  16. Data-driven particle dynamics: Structure-preserving coarse-graining for emergent behavior in non-equilibrium systems - Score: 16 (R=8, N=8) - Date: 2025-08-19 - Comment: The paper presents a framework for machine learning coarse-grained dynamics, which is relevant to AI for Science with foundational research in modeling.

  17. Universal Learning of Nonlinear Dynamics - Score: 16 (R=8, N=8) - Date: 2025-08-19 - Comment: The paper presents a novel algorithm for learning nonlinear dynamical systems, which is relevant to representation learning.

  18. Intrinsic Memory Agents: Heterogeneous Multi-Agent LLM Systems through Structured Contextual Memory - Score: 16 (R=8, N=8) - Date: 2025-08-13 - Comment: The paper proposes a novel framework for addressing memory limitations in multi-agent LLM systems, which aligns with foundational research in LLM architecture.

  19. Stochastic dynamics learning with state-space systems - Score: 16 (R=8, N=8) - Date: 2025-08-12 - Comment: The paper advances theoretical foundations of reservoir computing, which is relevant to emerging trends in model architecture and representation learning.

  20. On the Design of Expressive and Trainable Pulse-based Quantum Machine Learning Models - Score: 16 (R=8, N=8) - Date: 2025-08-08 - Comment: The paper discusses the design of pulse-based quantum machine learning models, focusing on expressivity and trainability, which is a novel paradigm in AI for Science.

  21. LAG: Logic-Augmented Generation from a Cartesian Perspective - Score: 16 (R=8, N=8) - Date: 2025-08-08 - Comment: The paper proposes a novel logic-augmented generation paradigm for LLMs, focusing on systematic question decomposition and dependency-aware reasoning, relevant to large language models.

  22. Trustworthy scientific inference for inverse problems with generative models - Score: 16 (R=8, N=8) - Date: 2025-08-05 - Comment: The paper introduces FreB, a protocol for trustworthy scientific inference with generative models, aligning with AI for Science and emerging trends in foundational research.

  23. Rethinking Caching for LLM Serving Systems: Beyond Traditional Heuristics - Score: 15 (R=8, N=7) - Date: 2025-08-27 - Comment: The paper proposes a new semantic caching system for LLM serving, which is relevant to model compression and efficiency improvements.

  24. HyperFlexis: Joint Design of Algorithms and Systems for Multi-SLO Serving and Fast Scaling - Score: 15 (R=8, N=7) - Date: 2025-08-25 - Comment: The paper discusses a unified LLM serving system with innovations in scheduling and scaling, relevant to model compression and efficiency in LLMs.

  25. Lean Meets Theoretical Computer Science: Scalable Synthesis of Theorem Proving Challenges in Formal-Informal Pairs - Score: 15 (R=8, N=7) - Date: 2025-08-25 - Comment: The paper leverages theoretical computer science to generate theorem proving challenges, which is relevant to foundational research in AI for Science and automated reasoning.

  26. From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery - Score: 15 (R=8, N=7) - Date: 2025-08-21 - Comment: The paper provides a survey on autonomous scientific discovery, which aligns with the AI for Science criteria by discussing foundational capabilities and core processes.

  27. Uncovering Systematic Failures of LLMs in Verifying Code Against Natural Language Specifications - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper uncovers systematic failures of LLMs in code verification, which provides insights into LLM behavior and interpretability.

  28. Counterfactual Probing for Hallucination Detection and Mitigation in Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper proposes a novel method for detecting and mitigating hallucinations in large language models, which is relevant to theoretical insights into LLM behavior.

Representation Learning (134)

  1. Consciousness as a Functor - Score: 18 (R=9, N=9) - Date: 2025-08-26 - Comment: The paper proposes a novel theory of consciousness as a functor, which is an emerging trend with potential foundational impact.

  2. On the Theoretical Limitations of Embedding-Based Retrieval - Score: 17 (R=9, N=8) - Date: 2025-08-29 - Comment: The paper discusses the theoretical limitations of embedding-based retrieval, which aligns with the representation learning criterion by addressing fundamental challenges in vector embeddings.

  3. Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection - Score: 17 (R=9, N=8) - Date: 2025-08-29 - Comment: The paper introduces Rank-One Safety Injection (ROSI), a novel method for enhancing safety alignment in LLMs, which aligns with foundational research in LLM behavior and interpretability.

  4. Parameter-Free Structural-Diversity Message Passing for Graph Neural Networks - Score: 17 (R=9, N=8) - Date: 2025-08-28 - Comment: The paper introduces a parameter-free graph neural network framework based on structural diversity, which provides a new theoretical perspective for graph representation learning. This aligns with the representation learning criterion.

  5. UNIFORM: Unifying Knowledge from Large-scale and Diverse Pre-trained Models - Score: 17 (R=9, N=8) - Date: 2025-08-28 - Comment: The paper introduces a framework for knowledge transfer from diverse pre-trained models, which aligns with the representation learning criterion by addressing foundational challenges in knowledge integration.

  6. Data-Efficient Symbolic Regression via Foundation Model Distillation - Score: 17 (R=9, N=8) - Date: 2025-08-28 - Comment: The paper introduces EQUATE, a novel framework for symbolic regression using foundation model distillation, aligning with foundational research in representation learning and model compression.

  7. DeepAtlas: a tool for effective manifold learning - Score: 17 (R=9, N=8) - Date: 2025-08-28 - Comment: The paper introduces DeepAtlas, an algorithm for manifold learning, which aligns with representation learning by providing insights into how data can be represented in lower-dimensional spaces. It also offers a novel approach to assess the manifold hypothesis.

  8. On Surjectivity of Neural Networks: Can you elicit any behavior from your model? - Score: 17 (R=9, N=8) - Date: 2025-08-28 - Comment: The paper provides theoretical insights into the surjectivity of neural networks, which is relevant to understanding the behavior and interpretability of large language models.

  9. Memorization in Graph Neural Networks - Score: 17 (R=9, N=8) - Date: 2025-08-28 - Comment: The paper provides insights into memorization in GNNs, which is relevant to representation learning and training dynamics in neural networks.

  10. Echoes of the past: A unified perspective on fading memory and echo states - Score: 17 (R=9, N=8) - Date: 2025-08-27 - Comment: The paper unifies various notions of memory in RNNs, contributing to a deeper understanding of their temporal information processing capabilities, which is relevant to representation learning.

  11. Adversarial Examples Are Not Bugs, They Are Superposition - Score: 17 (R=9, N=8) - Date: 2025-08-26 - Comment: The paper explores the hypothesis that adversarial examples are due to superposition, providing insights into representation learning.

  12. Curvature Learning for Generalization of Hyperbolic Neural Networks - Score: 17 (R=9, N=8) - Date: 2025-08-26 - Comment: The paper introduces a method for curvature learning in hyperbolic neural networks, providing theoretical insights into the role of curvatures, which aligns with representation learning and emerging trends.

  13. Native Logical and Hierarchical Representations with Subspace Embeddings - Score: 17 (R=9, N=8) - Date: 2025-08-26 - Comment: The paper introduces a novel paradigm of embedding concepts as linear subspaces, which is a significant contribution to representation learning and model architecture.

  14. On Task Vectors and Gradients - Score: 17 (R=9, N=8) - Date: 2025-08-25 - Comment: The paper provides a theoretical foundation for task arithmetic, which is relevant to representation learning and offers substantial insights into training dynamics.

  15. Scalable Equilibrium Propagation via Intermediate Error Signals for Deep Convolutional CRNNs - Score: 17 (R=9, N=8) - Date: 2025-08-25 - Comment: The paper presents a novel framework for Equilibrium Propagation in deep networks, addressing the vanishing gradient problem and enhancing scalability, which aligns with representation learning and training dynamics in neural networks.

  16. Reliable Unlearning Harmful Information in LLMs with Metamorphosis Representation Projection - Score: 17 (R=9, N=8) - Date: 2025-08-22 - Comment: The paper proposes a novel approach for machine unlearning in LLMs using Metamorphosis Representation Projection, which is relevant to foundational research in large language models.

  17. The DeepLog Neurosymbolic Machine - Score: 17 (R=9, N=8) - Date: 2025-08-20 - Comment: The paper introduces a theoretical and operational framework for neurosymbolic AI, which aligns with foundational research in representation learning and model architecture. It provides insights into how deep networks encode information through a neurosymbolic approach.

  18. Contrastive Representations for Temporal Reasoning - Score: 17 (R=9, N=8) - Date: 2025-08-19 - Comment: The paper introduces a method for temporal reasoning using contrastive representations, which is relevant to representation learning.

  19. Uncovering Emergent Physics Representations Learned In-Context by Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-19 - Comment: The paper investigates the in-context learning ability of LLMs using physics-based tasks, providing insights into LLM behavior and interpretability.

  20. Informative Post-Hoc Explanations Only Exist for Simple Functions - Score: 17 (R=9, N=8) - Date: 2025-08-18 - Comment: The paper provides a theoretical framework for understanding the informativeness of post-hoc explanations, which aligns with foundational research in representation learning by challenging assumptions about model interpretability.

  21. Towards the Next-generation Bayesian Network Classifiers - Score: 17 (R=9, N=8) - Date: 2025-08-18 - Comment: The paper proposes a novel paradigm for Bayesian network classifiers by learning distributional representations, which aligns with representation learning. It introduces a new neural network architecture to capture high-order feature dependencies.

  22. Constrained Decoding of Diffusion LLMs with Context-Free Grammars - Score: 17 (R=9, N=8) - Date: 2025-08-15 - Comment: The paper presents a constrained decoding method for diffusion LLMs using context-free grammars, which is relevant to foundational research in LLM behavior and interpretability.

  23. DINOv3 - Score: 17 (R=9, N=8) - Date: 2025-08-15 - Comment: The paper introduces DINOv3, a self-supervised learning model that enhances visual representation learning, which aligns with the representation learning criterion.

  24. Decoupling Understanding from Reasoning via Problem Space Mapping for Small-scale Model Reasoning - Score: 17 (R=9, N=8) - Date: 2025-08-15 - Comment: The paper introduces a framework for improving reasoning in small language models by decoupling understanding from reasoning, which aligns with foundational research in representation learning.

  25. Provable In-Context Vector Arithmetic via Retrieving Task Concepts - Score: 17 (R=9, N=8) - Date: 2025-08-14 - Comment: The paper provides a theoretical framework for in-context learning in LLMs, focusing on vector arithmetic and task concept retrieval, which is relevant to foundational research in LLMs.

  26. HKT: A Biologically Inspired Framework for Modular Hereditary Knowledge Transfer in Neural Networks - Score: 17 (R=9, N=8) - Date: 2025-08-14 - Comment: The paper introduces a biologically inspired framework for knowledge transfer in neural networks, which aligns with representation learning and model architecture.

  27. Entangled in Representations: Mechanistic Investigation of Cultural Biases in Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-13 - Comment: The paper investigates cultural biases in LLMs using a mechanistic interpretability-based method, providing insights into LLM behavior and interpretability.

  28. $\text{M}^{2}$LLM: Multi-view Molecular Representation Learning with Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-13 - Comment: The paper proposes a multi-view framework for molecular representation learning using LLMs, which is relevant to foundational research in representation learning and AI for Science.

  29. Can LLMs Detect Their Confabulations? Estimating Reliability in Uncertainty-Aware Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-12 - Comment: The paper investigates reliability estimation in LLMs using uncertainty, which provides insights into LLM behavior and interpretability.

  30. From Product Hilbert Spaces to the Generalized Koopman Operator and the Nonlinear Fundamental Lemma - Score: 17 (R=9, N=8) - Date: 2025-08-12 - Comment: The paper presents a novel solution to generalize the Koopman operator for nonlinear systems, which is foundational research in representation learning.

  31. A Spin Glass Characterization of Neural Networks - Score: 17 (R=9, N=8) - Date: 2025-08-12 - Comment: The paper provides a novel statistical mechanics characterization of neural networks, offering insights into their structure and properties beyond conventional metrics.

  32. Intrinsic training dynamics of deep neural networks - Score: 17 (R=9, N=8) - Date: 2025-08-12 - Comment: The paper provides insights into the intrinsic training dynamics of deep neural networks, focusing on dimensionality reduction and implicit bias, which aligns with representation learning.

  33. Tractable Sharpness-Aware Learning of Probabilistic Circuits - Score: 17 (R=9, N=8) - Date: 2025-08-08 - Comment: The paper presents a Hessian-based regularizer for probabilistic circuits, offering insights into training dynamics and representation learning.

  34. Task complexity shapes internal representations and robustness in neural networks - Score: 17 (R=9, N=8) - Date: 2025-08-08 - Comment: The paper investigates how task complexity influences internal representations in neural networks, which is relevant to representation learning and model compression.

  35. Integrated Influence: Data Attribution with Baseline - Score: 17 (R=9, N=8) - Date: 2025-08-08 - Comment: The paper introduces a novel data attribution method, Integrated Influence, which provides a theoretical framework and insights into data attribution, aligning with representation learning.

  36. Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws - Score: 17 (R=9, N=8) - Date: 2025-08-06 - Comment: The paper provides a theoretical analysis of SGD dynamics in high-dimensional neural networks, which aligns with the Representation Learning criterion.

  37. LLMs Have a Heart of Stone: Demystifying the Soft Thinking Ability of Large Reasoning Models - Score: 17 (R=9, N=8) - Date: 2025-08-06 - Comment: The paper explores the 'Soft Thinking' capabilities of LLMs, providing theoretical insights into their behavior and interpretability.

  38. Revisiting Deep Information Propagation: Fractal Frontier and Finite-size Effects - Score: 17 (R=9, N=8) - Date: 2025-08-06 - Comment: The paper provides insights into how deep networks encode information by studying information propagation in finite-width neural networks, which aligns with representation learning.

  39. Noosemia: toward a Cognitive and Phenomenological Account of Intentionality Attribution in Human-Generative AI Interaction - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: The paper describes a novel cognitive-phenomenological phenomenon related to LLMs, which could provide theoretical insights into LLM behavior and interpretability.

  40. From Generator to Embedder: Harnessing Innate Abilities of Multimodal LLMs via Building Zero-Shot Discriminative Embedding Model - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: The paper proposes a framework for adapting multimodal LLMs for universal embedding tasks, which aligns with foundational research in representation learning and LLMs.

  41. Embryology of a Language Model - Score: 17 (R=9, N=8) - Date: 2025-08-04 - Comment: The paper provides insights into the internal structure development of language models, which is relevant to representation learning and understanding LLM behavior.

  42. Watch the Weights: Unsupervised monitoring and control of fine-tuned LLMs - Score: 17 (R=9, N=8) - Date: 2025-08-04 - Comment: The paper presents a novel method for monitoring and controlling fine-tuned LLMs by interpreting weights, which is relevant to understanding LLM behavior and interpretability.

  43. Model Directions, Not Words: Mechanistic Topic Models Using Sparse Autoencoders - Score: 17 (R=9, N=8) - Date: 2025-08-01 - Comment: The paper introduces Mechanistic Topic Models using sparse autoencoders, aligning with representation learning and offering novel insights into topic modeling.

  44. How does Chain of Thought Think? Mechanistic Interpretability of Chain-of-Thought Reasoning with Sparse Autoencoding - Score: 17 (R=9, N=8) - Date: 2025-08-01 - Comment: The paper investigates the mechanistic interpretability of Chain-of-Thought reasoning using sparse autoencoding, which aligns with representation learning and LLM interpretability.

  45. Semantic Convergence: Investigating Shared Representations Across Scaled LLMs - Score: 17 (R=9, N=8) - Date: 2025-08-01 - Comment: The paper investigates feature universality in LLMs using Sparse Autoencoders, relevant to representation learning and LLM behavior analysis.

  46. Disentangling concept semantics via multilingual averaging in Sparse Autoencoders - Score: 16 (R=9, N=7) - Date: 2025-08-21 - Comment: The paper explores disentangling concept semantics using sparse autoencoders, which aligns with representation learning and sparse methods.

  47. A Generalized Learning Framework for Self-Supervised Contrastive Learning - Score: 16 (R=9, N=7) - Date: 2025-08-20 - Comment: The paper proposes a generalized framework for self-supervised contrastive learning, providing theoretical insights into representation learning.

  48. RepreGuard: Detecting LLM-Generated Text by Revealing Hidden Representation Patterns - Score: 16 (R=9, N=7) - Date: 2025-08-19 - Comment: The paper focuses on detecting LLM-generated text by analyzing internal representations, which aligns with representation learning and insights into LLM behavior.

  49. Approximating the universal thermal climate index using sparse regression with orthogonal polynomials - Score: 16 (R=9, N=7) - Date: 2025-08-18 - Comment: The paper explores sparse regression techniques with orthogonal polynomials, which aligns with representation learning through sparse methods.

  50. Pattern-based Knowledge Component Extraction from Student Code Using Representation Learning - Score: 16 (R=9, N=7) - Date: 2025-08-14 - Comment: The paper proposes a novel framework for automated knowledge component extraction from student code, which aligns with the core topic of representation learning.

  51. Evaluating Contrast Localizer for Identifying Causal Units in Social & Mathematical Tasks in Language Models - Score: 16 (R=9, N=7) - Date: 2025-08-13 - Comment: The paper explores the causal relevance of units in LLMs using a neuroscientific approach, which aligns with the interest in understanding LLM behavior and interpretability.

  52. Representation Understanding via Activation Maximization - Score: 16 (R=9, N=7) - Date: 2025-08-12 - Comment: The paper explores activation maximization for understanding feature representations in DNNs, which is relevant to representation learning and model interpretability.

  53. How Do LLMs Persuade? Linear Probes Can Uncover Persuasion Dynamics in Multi-Turn Conversations - Score: 16 (R=9, N=7) - Date: 2025-08-08 - Comment: The paper uses linear probes to study persuasion dynamics in LLMs, providing insights into LLM behavior and interpretability.

  54. CAMA: Enhancing Mathematical Reasoning in Large Language Models with Causal Knowledge - Score: 16 (R=9, N=7) - Date: 2025-08-05 - Comment: The paper focuses on enhancing mathematical reasoning in LLMs using a causal framework, which aligns with foundational research in LLM behavior and interpretability.

  55. Universal Neurons in GPT-2: Emergence, Persistence, and Functional Impact - Score: 16 (R=9, N=7) - Date: 2025-08-05 - Comment: The paper investigates universal neurons in GPT-2 models, providing insights into representation learning and neural network training dynamics.

  56. Interpretable by AI Mother Tongue: Native Symbolic Reasoning in Neural Models - Score: 16 (R=8, N=8) - Date: 2025-08-27 - Comment: The paper presents a framework for developing an AI Mother Tongue, focusing on symbolic reasoning and interpretability, which aligns with representation learning and emerging trends.

  57. Information Templates: A New Paradigm for Intelligent Active Feature Acquisition - Score: 16 (R=8, N=8) - Date: 2025-08-27 - Comment: The paper proposes a new paradigm for active feature acquisition using information templates, relevant to representation learning.

  58. Limits of message passing for node classification: How class-bottlenecks restrict signal-to-noise ratio - Score: 16 (R=8, N=8) - Date: 2025-08-26 - Comment: The paper provides a theoretical framework for understanding limitations in message passing neural networks, relevant to representation learning.

  59. Graph Structure Learning with Temporal Graph Information Bottleneck for Inductive Representation Learning - Score: 16 (R=8, N=8) - Date: 2025-08-21 - Comment: The paper proposes a framework integrating Graph Structure Learning with Temporal Graph Information Bottleneck, relevant to representation learning and model architecture.

  60. Understanding Data Influence with Differential Approximation - Score: 16 (R=8, N=8) - Date: 2025-08-21 - Comment: The paper introduces a new formulation for approximating data influence, which is relevant to representation learning and training dynamics in neural networks.

  61. DyMixOp: Guiding Neural Operator Design for PDEs from a Complex Dynamics Perspective with Local-Global-Mixing - Score: 16 (R=8, N=8) - Date: 2025-08-20 - Comment: The paper introduces a novel neural operator framework for PDEs, focusing on transforming infinite-dimensional nonlinear PDE dynamics into a finite-dimensional latent space. This aligns with the criteria of model architecture innovations and representation learning.

  62. EXOTIC: An Exact, Optimistic, Tree-Based Algorithm for Min-Max Optimization - Score: 16 (R=8, N=8) - Date: 2025-08-19 - Comment: The paper presents a novel algorithm for min-max optimization, which is relevant to emerging trends in optimization theory.

  63. IBEX: Information-Bottleneck-EXplored Coarse-to-Fine Molecular Generation under Limited Data - Score: 16 (R=8, N=8) - Date: 2025-08-15 - Comment: The paper presents IBEX, a method for molecular generation using information bottleneck theory, aligning with AI for Science and representation learning in molecular modeling.

  64. On the Complexity-Faithfulness Trade-off of Gradient-Based Explanations - Score: 16 (R=8, N=8) - Date: 2025-08-15 - Comment: The paper introduces a spectral framework to analyze gradient-based explanations in ReLU networks, which is relevant to foundational research in model interpretability.

  65. Towards Universal Neural Inference - Score: 16 (R=8, N=8) - Date: 2025-08-13 - Comment: The paper introduces a universal neural inference model, which is relevant to representation learning and emerging trends in model architecture.

  66. Interpretable Reward Model via Sparse Autoencoder - Score: 16 (R=8, N=8) - Date: 2025-08-13 - Comment: The paper introduces a Sparse Autoencoder-enhanced Reward Model, which is relevant to representation learning and model architecture due to its novel integration of sparse autoencoders for interpretability.

  67. Attribution Explanations for Deep Neural Networks: A Theoretical Perspective - Score: 16 (R=8, N=8) - Date: 2025-08-12 - Comment: The paper provides a theoretical perspective on attribution explanations for DNNs, which is relevant to understanding model interpretability and representation learning.

  68. SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models - Score: 16 (R=8, N=8) - Date: 2025-08-06 - Comment: The paper proposes SCFlow, a novel approach to disentangling style and content using flow models, which aligns with representation learning and architectural innovation.

  69. Separated-Variable Spectral Neural Networks: A Physics-Informed Learning Approach for High-Frequency PDEs - Score: 16 (R=8, N=8) - Date: 2025-08-04 - Comment: The paper introduces a novel framework for solving high-frequency PDEs using neural networks, addressing spectral bias with a theoretical framework based on singular value decomposition. This aligns with representation learning and model architecture innovations.

  70. Efficient Neuro-Symbolic Learning of Constraints and Objective - Score: 15 (R=8, N=7) - Date: 2025-08-29 - Comment: The paper introduces a differentiable neuro-symbolic architecture for solving NP-hard reasoning problems, which aligns with foundational research in representation learning and model architecture.

  71. Towards Mitigating Excessive Forgetting in LLM Unlearning via Entanglement-Aware Unlearning with Proxy Constraint - Score: 15 (R=8, N=7) - Date: 2025-08-29 - Comment: The paper proposes a novel unlearning framework for LLMs, which aligns with foundational research in LLM behavior and interpretability.

  72. Uncovering the Spectral Bias in Diagonal State Space Models - Score: 15 (R=8, N=7) - Date: 2025-08-29 - Comment: The paper investigates the spectral bias in diagonal state space models, contributing to foundational research in model architecture and representation learning.

  73. Tracking World States with Language Models: State-Based Evaluation Using Chess - Score: 15 (R=8, N=7) - Date: 2025-08-28 - Comment: The paper presents a model-agnostic evaluation framework for LLMs using chess, offering insights into LLM behavior and structured reasoning.

  74. LFD: Layer Fused Decoding to Exploit External Knowledge in Retrieval-Augmented Generation - Score: 15 (R=8, N=7) - Date: 2025-08-28 - Comment: The paper proposes Layer Fused Decoding (LFD) for retrieval-augmented generation, offering insights into how LLMs integrate external knowledge, which aligns with foundational research in LLM behavior and interpretability.

  75. Kolmogorov-Arnold Representation for Symplectic Learning: Advancing Hamiltonian Neural Networks - Score: 15 (R=8, N=7) - Date: 2025-08-28 - Comment: The paper introduces a novel approach to Hamiltonian Neural Networks by using Kolmogorov-Arnold Representation, which aligns with foundational research in model architecture and representation learning.

  76. Can Structured Templates Facilitate LLMs in Tackling Harder Tasks? : An Exploration of Scaling Laws by Difficulty - Score: 15 (R=8, N=7) - Date: 2025-08-27 - Comment: The paper proposes a framework for improving LLMs' procedural reasoning, which provides insights into training dynamics and aligns with foundational research in representation learning.

  77. Investigating Advanced Reasoning of Large Language Models via Black-Box Interaction - Score: 15 (R=8, N=7) - Date: 2025-08-27 - Comment: The paper introduces a novel evaluation paradigm for LLM reasoning, which could provide theoretical insights into LLM behavior and interpretability.

  78. On the Generalisation of Koopman Representations for Chaotic System Control - Score: 15 (R=8, N=7) - Date: 2025-08-27 - Comment: The paper explores Koopman-based representations and their generalization for chaotic systems, involving autoencoding and transformers, which aligns with representation learning and model architecture analysis.

  79. Generalization Bound for a General Class of Neural Ordinary Differential Equations - Score: 15 (R=8, N=7) - Date: 2025-08-27 - Comment: The paper provides generalization bounds for neural ODEs, which is relevant to understanding model architecture and emerging trends in theoretical work.

  80. Beyond Tokens: Enhancing RTL Quality Estimation via Structural Graph Learning - Score: 15 (R=8, N=7) - Date: 2025-08-27 - Comment: The paper introduces a novel graph-based learning framework for RTL quality estimation, focusing on structural representation learning, which aligns with foundational research in representation learning.

  81. Disentangling the Factors of Convergence between Brains and Computer Vision Models - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper explores the convergence between brain and computer vision models, providing insights into representation learning and model architecture.

  82. Randomly Removing 50% of Dimensions in Text Embeddings has Minimal Impact on Retrieval and Classification Tasks - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper studies the impact of truncating text embeddings, providing insights into representation learning and model efficiency.

  83. Multimodal Representation Learning Conditioned on Semantic Relations - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper introduces a new framework for multimodal representation learning, focusing on semantic relations and cross-attention mechanisms, which aligns with representation learning insights.

  84. Proximal Vision Transformer: Enhancing Feature Representation through Two-Stage Manifold Geometry - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper proposes a novel framework integrating ViT with proximal tools for enhanced feature representation, relevant to model architecture.

  85. Deep Learning for Markov Chains: Lyapunov Functions, Poisson's Equation, and Stationary Distributions - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper explores the use of deep learning to automate the construction of Lyapunov functions for Markov chains, which is a foundational contribution to representation learning and theoretical insights into neural networks.

  86. Interpretable Kernels - Score: 15 (R=8, N=7) - Date: 2025-08-25 - Comment: The paper discusses interpretable kernels, contributing to interpretable AI by re-expressing kernel solutions in terms of original features, which aligns with representation learning.

  87. Low-dimensional embeddings of high-dimensional data - Score: 15 (R=8, N=7) - Date: 2025-08-25 - Comment: The paper provides a comprehensive review of low-dimensional embeddings, which is relevant to representation learning as it discusses foundational methods for data visualization and analysis.

  88. Tutorial on the Probabilistic Unification of Estimation Theory, Machine Learning, and Generative AI - Score: 15 (R=8, N=7) - Date: 2025-08-22 - Comment: The paper provides a theoretical synthesis connecting classical estimation theory, statistical inference, and modern machine learning, including deep learning and large language models, which aligns with foundational research in representation learning.

  89. Continual Neural Topic Model - Score: 15 (R=8, N=7) - Date: 2025-08-22 - Comment: The paper introduces a new model for continual learning in topic models, which is relevant to representation learning.

  90. Learning Protein-Ligand Binding in Hyperbolic Space - Score: 15 (R=8, N=7) - Date: 2025-08-22 - Comment: The paper proposes a hyperbolic representation learning framework for protein-ligand binding, which aligns with foundational research in representation learning.

  91. EvoFormer: Learning Dynamic Graph-Level Representations with Structural and Temporal Bias Correction - Score: 15 (R=8, N=7) - Date: 2025-08-22 - Comment: The paper proposes EvoFormer, a Transformer framework for dynamic graph-level representation learning, relevant to model architecture innovations.

  92. Saving for the future: Enhancing generalization via partial logic regularization - Score: 15 (R=8, N=7) - Date: 2025-08-22 - Comment: The paper introduces partial logic regularization to enhance generalization, which is relevant to representation learning and training dynamics.

  93. ECHO: Frequency-aware Hierarchical Encoding for Variable-length Signal - Score: 15 (R=8, N=7) - Date: 2025-08-21 - Comment: The paper proposes a novel foundation model for machine signal modeling, which is relevant to model architecture and representation learning.

  94. SBGD: Improving Graph Diffusion Generative Model via Stochastic Block Diffusion - Score: 15 (R=8, N=7) - Date: 2025-08-21 - Comment: The paper proposes a new graph diffusion generative model that targets scalability and size generalization, which aligns with foundational research in model architecture and efficiency.

  95. Graph Concept Bottleneck Models - Score: 15 (R=8, N=7) - Date: 2025-08-21 - Comment: The paper introduces Graph Concept Bottleneck Models, which enhance interpretability and performance by leveraging concept relationships, relevant to representation learning.

  96. Logical Expressivity and Explanations for Monotonic GNNs with Scoring Functions - Score: 15 (R=8, N=7) - Date: 2025-08-21 - Comment: The paper focuses on enhancing the explainability and expressivity of GNNs using scoring functions, which aligns with representation learning by providing insights into how GNNs encode information.

  97. Parameter-Aware Ensemble SINDy for Interpretable Symbolic SGS Closure - Score: 15 (R=8, N=7) - Date: 2025-08-21 - Comment: The paper presents a sparse regression framework for discovering interpretable equations, which aligns with representation learning and sparse methods.

  98. A Unified Cortical Circuit Model with Divisive Normalization and Self-Excitation for Robust Representation and Memory Maintenance - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper presents a unified cortical circuit model for robust representation and memory maintenance, which aligns with foundational research in representation learning.

  99. Distribution Matching via Generalized Consistency Models - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper proposes a novel approach for distribution matching inspired by consistency models, which is relevant to representation learning.

  100. DE-VAE: Revealing Uncertainty in Parametric and Inverse Projections with Variational Autoencoders using Differential Entropy - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper proposes DE-VAE, an uncertainty-aware variational autoencoder, which aligns with interests in autoencoders and representation learning.

  101. Rigorous Feature Importance Scores based on Shapley Value and Banzhaf Index - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper introduces novel feature importance scores based on Shapley value and Banzhaf index, which is relevant to representation learning and interpretability.

  102. Separating Knowledge and Perception with Procedural Data - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper explores representation learning using procedural data, which is relevant to understanding how models encode information.

  103. Assessing Representation Stability for Transformer Models - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper introduces a model-agnostic framework for detecting adversarial examples in transformer models, which relates to representation learning.

  104. Toward Practical Equilibrium Propagation: Brain-inspired Recurrent Neural Network with Feedback Regulation and Residual Connections - Score: 15 (R=8, N=7) - Date: 2025-08-19 - Comment: The paper proposes a biologically plausible Feedback-regulated Residual recurrent neural network (FRE-RNN) to enhance Equilibrium Propagation, which aligns with foundational research in model architecture and representation learning.

  105. Towards Faithful Class-level Self-explainability in Graph Neural Networks by Subgraph Dependencies - Score: 15 (R=8, N=7) - Date: 2025-08-18 - Comment: The paper presents a novel self-explainable GNN framework for class-level explanations, which aligns with the core topic of model architecture analysis and interpretability in graph neural networks.

  106. How Causal Abstraction Underpins Computational Explanation - Score: 15 (R=8, N=7) - Date: 2025-08-18 - Comment: The paper discusses causal abstraction in computational explanations, which relates to foundational research in representation learning and theoretical insights into model behavior.

  107. Beyond the Rosetta Stone: Unification Forces in Generalization Dynamics - Score: 15 (R=8, N=7) - Date: 2025-08-18 - Comment: The paper provides insights into the pre-training dynamics of LLMs, focusing on cross-lingual knowledge transfer, which is relevant to understanding LLM behavior.

  108. Dissecting Generalized Category Discovery: Multiplex Consensus under Self-Deconstruction - Score: 15 (R=8, N=7) - Date: 2025-08-15 - Comment: The paper proposes ConGCD, a novel approach to category discovery using representation learning, which aligns with the representation learning criterion.

  109. Graph Learning via Logic-Based Weisfeiler-Leman Variants and Tabularization - Score: 15 (R=8, N=7) - Date: 2025-08-15 - Comment: The paper presents a novel approach for graph classification using variants of the Weisfeiler-Leman algorithm, providing theoretical insights into their expressive power. This aligns with representation learning and model architecture analysis.

  110. X-Node: Self-Explanation is All We Need - Score: 15 (R=8, N=7) - Date: 2025-08-15 - Comment: The paper introduces a self-explaining GNN framework, which aligns with the representation learning criterion by providing insights into how GNNs encode information.

  111. Unpacking the Implicit Norm Dynamics of Sharpness-Aware Minimization in Tensorized Models - Score: 15 (R=8, N=7) - Date: 2025-08-15 - Comment: The paper explores the implicit norm dynamics of Sharpness-Aware Minimization, relevant to representation learning and training dynamics.

  112. SynBrain: Enhancing Visual-to-fMRI Synthesis via Probabilistic Representation Learning - Score: 15 (R=8, N=7) - Date: 2025-08-15 - Comment: The paper introduces SynBrain, a probabilistic representation learning framework for visual-to-fMRI synthesis, aligning with representation learning.

  113. xRFM: Accurate, scalable, and interpretable feature learning models for tabular data - Score: 15 (R=8, N=7) - Date: 2025-08-15 - Comment: The paper presents xRFM, a novel feature learning model for tabular data, which is relevant to representation learning.

  114. Fine-Grained Safety Neurons with Training-Free Continual Projection to Reduce LLM Fine Tuning Risks - Score: 15 (R=8, N=7) - Date: 2025-08-14 - Comment: The paper proposes Fine-Grained Safety Neurons for reducing fine-tuning risks in LLMs, which aligns with foundational research in LLM safety and interpretability.

  115. Deep Neural Network Calibration by Reducing Classifier Shift with Stochastic Masking - Score: 15 (R=8, N=7) - Date: 2025-08-13 - Comment: The paper introduces a novel mask-based classifier calibration method leveraging stochastic sparsity, which aligns with representation learning through insights into training dynamics and sparsity.

  116. Bio-Inspired Artificial Neural Networks based on Predictive Coding - Score: 15 (R=8, N=7) - Date: 2025-08-13 - Comment: The paper discusses bio-inspired neural networks based on predictive coding, offering insights into alternative training methods, relevant to representation learning.

  117. Superclass-Guided Representation Disentanglement for Spurious Correlation Mitigation - Score: 15 (R=8, N=7) - Date: 2025-08-13 - Comment: The paper proposes a method for representation disentanglement to mitigate spurious correlations, which is relevant to representation learning.

  118. Barron Space Representations for Elliptic PDEs with Homogeneous Boundary Conditions - Score: 15 (R=8, N=7) - Date: 2025-08-12 - Comment: The paper studies the approximation of high-dimensional PDEs using Barron spaces and shallow networks, which is relevant to foundational research in AI for Science.

  119. Fractal Language Modelling by Universal Sequence Maps (USM) - Score: 15 (R=8, N=7) - Date: 2025-08-12 - Comment: The paper discusses fractal language modeling using Universal Sequence Maps, which is relevant to representation learning and foundational research in encoding procedures.

  120. Graph is a Natural Regularization: Revisiting Vector Quantization for Graph Representation Learning - Score: 15 (R=8, N=7) - Date: 2025-08-12 - Comment: The paper addresses the challenge of codebook collapse in graph representation learning, proposing a novel framework that enhances codebook utilization and token diversity.

  121. Multivariate Fields of Experts - Score: 15 (R=8, N=7) - Date: 2025-08-11 - Comment: The paper introduces a new framework for learning image priors, which involves representation learning and model architecture. It offers a structured design that retains interpretability, aligning with foundational research.

  122. Structural Equation-VAE: Disentangled Latent Representations for Tabular Data - Score: 15 (R=8, N=7) - Date: 2025-08-11 - Comment: The paper presents a novel architecture, SE-VAE, for disentangled latent representations in tabular data, aligning with the model architecture criterion.

  123. Negative Binomial Variational Autoencoders for Overdispersed Latent Modeling - Score: 15 (R=8, N=7) - Date: 2025-08-08 - Comment: The paper introduces a new VAE framework using the negative binomial distribution, which is relevant to representation learning and offers insights into feature learning.

  124. CF3: Compact and Fast 3D Feature Fields - Score: 15 (R=8, N=7) - Date: 2025-08-08 - Comment: The paper introduces a novel approach to 3D feature fields with an adaptive sparsification method, aligning with model compression and representation learning.

  125. MENDR: Manifold Explainable Neural Data Representations - Score: 15 (R=8, N=7) - Date: 2025-08-08 - Comment: The paper introduces a novel Riemannian Manifold Transformer architecture for EEG signal representation, which is relevant to model architecture and representation learning.

  126. RegMean++: Enhancing Effectiveness and Generalization of Regression Mean for Model Merging - Score: 15 (R=8, N=7) - Date: 2025-08-06 - Comment: The paper introduces RegMean++, which enhances model merging by considering intra- and cross-layer dependencies, relevant to model architecture and representation learning.

  127. HiTeC: Hierarchical Contrastive Learning on Text-Attributed Hypergraph with Semantic-Aware Augmentation - Score: 15 (R=8, N=7) - Date: 2025-08-06 - Comment: The paper focuses on contrastive learning, a key aspect of representation learning, and introduces a novel hierarchical framework for hypergraph learning.

  128. Graph Embedding in the Graph Fractional Fourier Transform Domain - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper introduces a novel graph embedding method using the graph fractional Fourier transform, contributing to representation learning with a focus on spectral methods.

  129. How Does Controllability Emerge In Language Models During Pretraining? - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper investigates how controllability emerges in LLMs during pretraining, relevant to understanding LLM behavior and interpretability.

  130. Granular Concept Circuits: Toward a Fine-Grained Circuit Discovery for Concept Representations - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper introduces a method for discovering concept circuits in deep vision models, which relates to representation learning by offering insights into how models encode information.

  131. Effects of Feature Correlations on Associative Memory Capacity - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper investigates the effects of feature correlations on associative memory capacity, contributing to representation learning by analyzing capacity dynamics.

  132. Uncertainty Quantification for Large-Scale Deep Networks via Post-StoNet Modeling - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper introduces a novel post-processing approach for uncertainty quantification in DNNs, aligning with representation learning and model architecture analysis.

  133. Reinitializing weights vs units for maintaining plasticity in neural networks - Score: 15 (R=8, N=7) - Date: 2025-08-04 - Comment: The paper explores reinitialization techniques to maintain plasticity in neural networks, which is relevant to representation learning and training dynamics.

  134. Improved Robustness and Functional Localization in Topographic CNNs Through Weight Similarity - Score: 15 (R=8, N=7) - Date: 2025-08-04 - Comment: The paper examines topographic constraints in CNNs, providing insights into representation learning and model architecture.

Other Foundational Research (21)

  1. A Rose by Any Other Name Would Smell as Sweet: Categorical Homotopy Theory for Large Language Models - Score: 18 (R=9, N=9) - Date: 2025-08-15 - Comment: The paper introduces a categorical homotopy framework for LLMs, offering a novel theoretical perspective on language model behavior.

  2. Topos Causal Models - Score: 18 (R=9, N=9) - Date: 2025-08-13 - Comment: The paper introduces topos causal models, a novel theoretical framework for causal inference, aligning with emerging trends in foundational research.

  3. Understanding Tool-Integrated Reasoning - Score: 17 (R=9, N=8) - Date: 2025-08-27 - Comment: The paper provides a formal proof for the effectiveness of Tool-Integrated Reasoning in LLMs, offering theoretical insights into model capabilities, which aligns with foundational research in LLMs.

  4. Energy-Based Flow Matching for Generating 3D Molecular Structure - Score: 17 (R=9, N=8) - Date: 2025-08-27 - Comment: The paper focuses on foundational research in molecular modeling using an energy-based perspective, which aligns with AI for Science. It introduces a novel flow matching setup with theoretical justifications.

  5. Dynamic Collaboration of Multi-Language Models based on Minimal Complete Semantic Units - Score: 17 (R=9, N=8) - Date: 2025-08-27 - Comment: The paper introduces a novel method for multi-model collaboration in language models, focusing on token-level reasoning and vocabulary alignment, which aligns with foundational research in LLMs.

  6. From Confidence to Collapse in LLM Factual Robustness - Score: 17 (R=9, N=8) - Date: 2025-08-25 - Comment: The paper introduces a novel metric for evaluating factual robustness in LLMs, which aligns with theoretical insights into LLM behavior.

  7. The Self-Execution Benchmark: Measuring LLMs' Attempts to Overcome Their Lack of Self-Execution - Score: 17 (R=9, N=8) - Date: 2025-08-19 - Comment: The paper introduces a new benchmark to evaluate LLMs' ability to predict aspects of their own responses, which provides theoretical insights into LLM behavior.

  8. Data-Driven Discovery of Interpretable Kalman Filter Variants through Large Language Models and Genetic Programming - Score: 17 (R=9, N=8) - Date: 2025-08-19 - Comment: The paper explores a novel approach combining genetic programming and large language models for algorithmic discovery, which aligns with foundational research in AI for Science.

  9. Can SGD Handle Heavy-Tailed Noise? - Score: 17 (R=9, N=8) - Date: 2025-08-08 - Comment: The paper provides theoretical insights into the behavior of SGD under heavy-tailed noise, contributing to the understanding of training dynamics in neural networks.

  10. Self-Questioning Language Models - Score: 17 (R=9, N=8) - Date: 2025-08-06 - Comment: The paper proposes Self-Questioning Language Models, an innovative framework for improving LLMs without external data, aligning with foundational research in LLM behavior and training dynamics.

  11. The Geometry of Machine Learning Models - Score: 17 (R=9, N=8) - Date: 2025-08-05 - Comment: The paper presents a mathematical framework for analyzing machine learning models through geometry, offering foundational insights into model interpretation and regularization.

  12. BAR Conjecture: the Feasibility of Inference Budget-Constrained LLM Services with Authenticity and Reasoning - Score: 17 (R=9, N=8) - Date: 2025-08-01 - Comment: The paper introduces the BAR Theorem, which provides a theoretical framework for understanding trade-offs in LLM services, aligning with the interest in theoretical insights into LLM behavior.

  13. Thinking Machines: Mathematical Reasoning in the Age of LLMs - Score: 16 (R=9, N=7) - Date: 2025-08-04 - Comment: The paper explores theoretical insights into LLM behavior, particularly in mathematical reasoning, aligning with foundational research in LLMs.

  14. Cognitive Loop via In-Situ Optimization: Self-Adaptive Reasoning for Science - Score: 16 (R=8, N=8) - Date: 2025-08-06 - Comment: The paper introduces a cognitive loop for self-adaptive reasoning in LLMs, which is relevant to foundational research in AI for science and LLM behavior.

  15. FMPlug: Plug-In Foundation Flow-Matching Priors for Inverse Problems - Score: 16 (R=8, N=8) - Date: 2025-08-04 - Comment: The paper introduces FMPlug, a novel framework leveraging foundation flow-matching priors for inverse problems, which aligns with foundational research in AI for Science.

  16. A Verifier Hierarchy - Score: 16 (R=8, N=8) - Date: 2025-08-01 - Comment: The paper presents a Verifier Trade-off Theorem, contributing to theoretical insights in complexity theory, which is relevant to emerging trends.

  17. Fast Convergence Rates for Subsampled Natural Gradient Algorithms on Quadratic Model Problems - Score: 15 (R=8, N=7) - Date: 2025-08-29 - Comment: The paper provides theoretical insights into subsampled natural gradient algorithms, which is relevant to foundational research in optimization and training dynamics.

  18. CP4SBI: Local Conformal Calibration of Credible Sets in Simulation-Based Inference - Score: 15 (R=8, N=7) - Date: 2025-08-26 - Comment: The paper introduces a model-agnostic conformal calibration framework for simulation-based inference, which is a foundational contribution to AI for Science.

  19. Chunks as Arms: Multi-Armed Bandit-Guided Sampling for Long-Context LLM Preference Optimization - Score: 15 (R=8, N=7) - Date: 2025-08-20 - Comment: The paper proposes a novel framework for optimizing LLMs using a Multi-Armed Bandit strategy, which is relevant to foundational research in LLMs. It introduces a new method for improving long-context reasoning, contributing to theoretical advancements in LLM behavior.

  20. Improving Diversity in Language Models: When Temperature Fails, Change the Loss - Score: 15 (R=8, N=7) - Date: 2025-08-14 - Comment: The paper proposes rethinking loss functions in language models to improve diversity, which relates to foundational research in LLMs.

  21. MArgE: Meshing Argumentative Evidence from Multiple Large Language Models for Justifiable Claim Verification - Score: 15 (R=8, N=7) - Date: 2025-08-05 - Comment: The paper introduces a framework for combining outputs from multiple LLMs using argumentative reasoning, relevant to theoretical insights into LLM behavior.