Personalized Monthly Topic Summary 2025/03

Metric | Value
Total Papers | 680
Model Architecture | 160
Model Compression and Efficiency | 173
High Performance Computing | 51
Representation Learning | 273
Other Foundational Research | 23
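
The figures above and the score notation used in each entry can be sanity-checked with a short script. The sketch below is illustrative only: it assumes the five category counts are meant to sum to the total, and that each entry's Score is the sum of its R and N values, which is what the listed entries suggest but the page does not state. The names `category_counts`, `SCORE_PATTERN`, `parse_score`, and `score_is_consistent` are hypothetical helpers, not part of any tool behind this summary.

```python
import re

# Category counts copied from the metric table above.
category_counts = {
    "Model Architecture": 160,
    "Model Compression and Efficiency": 173,
    "High Performance Computing": 51,
    "Representation Learning": 273,
    "Other Foundational Research": 23,
}
total_papers = 680

# Assumption: the per-category counts add up to the reported total.
counts_match_total = sum(category_counts.values()) == total_papers

# Hypothetical pattern for the "Score: S (R=r, N=n)" fragment of an entry.
SCORE_PATTERN = re.compile(r"Score: (\d+) \(R=(\d+), N=(\d+)\)")


def parse_score(entry: str) -> tuple[int, int, int]:
    """Return (score, r, n) parsed from one entry line."""
    match = SCORE_PATTERN.search(entry)
    if match is None:
        raise ValueError("entry has no score fragment")
    score, r, n = (int(group) for group in match.groups())
    return score, r, n


def score_is_consistent(entry: str) -> bool:
    """Check the assumed relation Score = R + N for one entry."""
    score, r, n = parse_score(entry)
    return score == r + n
```

Running `score_is_consistent` over every entry line is a quick way to spot transcription errors in the list, under the stated assumption about how the score is formed.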

Model Architecture (160)

  1. Mixture of Lookup Experts - Score: 19 (R=10, N=9) - Date: 2025-03-21 - Comment: The paper introduces Mixture of Lookup Experts (MoLE), which aligns closely with foundational research in Mixture-of-Experts architectures and efficiency.

  2. A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers - Score: 19 (R=10, N=9) - Date: 2025-03-07 - Comment: The paper provides theoretical insights into the expressive power of log-depth transformers, directly addressing foundational questions about model architecture and depth scaling.

  3. Convergence Rates for Softmax Gating Mixture of Experts - Score: 19 (R=10, N=9) - Date: 2025-03-06 - Comment: The paper provides a theoretical analysis of softmax gating in Mixture of Experts (MoE), directly addressing architectural insights and efficiency. The convergence analysis and sample efficiency insights are highly relevant.

  4. A Review of DeepSeek Models' Key Innovative Techniques - Score: 18 (R=10, N=8) - Date: 2025-03-17 - Comment: The paper reviews techniques behind DeepSeek models, including innovations in transformers and Mixture of Experts, aligning closely with model architecture research.

  5. When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective - Score: 18 (R=10, N=8) - Date: 2025-03-17 - Comment: The paper provides theoretical insights into when Transformers outperform other architectures, which is highly relevant to foundational research in model architecture.

  6. Mixture of Experts Made Intrinsically Interpretable - Score: 18 (R=10, N=8) - Date: 2025-03-12 - Comment: The paper introduces MoE-X, a Mixture-of-Experts model designed for intrinsic interpretability, which aligns closely with the MoE and interpretability criteria.

  7. MoFE: Mixture of Frozen Experts Architecture - Score: 18 (R=10, N=8) - Date: 2025-03-11 - Comment: The paper introduces the Mixture of Frozen Experts (MoFE) architecture, which is directly relevant to foundational research on Mixture-of-Experts and efficiency in model architectures.

  8. Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning - Score: 18 (R=10, N=8) - Date: 2025-03-10 - Comment: The paper introduces a symbolic Mixture-of-Experts framework, which directly aligns with the MoE topic under model architecture. The instance-level expert selection and efficiency improvements are notable contributions.

  9. Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts - Score: 18 (R=10, N=8) - Date: 2025-03-10 - Comment: The paper introduces Linear-MoE, combining linear sequence modeling with Mixture-of-Experts, which is highly relevant to architectural innovations and foundational research in MoE.

  10. Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts - Score: 18 (R=10, N=8) - Date: 2025-03-10 - Comment: The paper addresses the Straggler Effect in Mixture-of-Experts, which is directly relevant to model architecture and efficiency improvements. The proposed techniques are innovative.

  11. Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer - Score: 18 (R=10, N=8) - Date: 2025-03-05 - Comment: The paper proposes Union-of-Experts (UoE), which advances the Mixture-of-Experts paradigm with architectural innovations, aligning closely with model architecture research.

  12. Efficiently Editing Mixture-of-Experts Models with Compressed Experts - Score: 18 (R=10, N=8) - Date: 2025-03-04 - Comment: The paper introduces compressed experts for Mixture-of-Experts (MoE) models, reducing inference costs while maintaining performance. This directly aligns with the 'Model Architecture' and 'Model Compression' criteria.

  13. Collab: Controlled Decoding using Mixture of Agents for LLM Alignment - Score: 18 (R=9, N=9) - Date: 2025-03-28 - Comment: The paper introduces Collab, which uses a mixture of agents for controlled decoding in LLM alignment, aligning with mixture-based approaches under the model architecture criterion.

  14. Why LLMs Cannot Think and How to Fix It - Score: 18 (R=9, N=9) - Date: 2025-03-13 - Comment: The paper critiques the architectural limitations of LLMs and proposes solutions to enable 'thought processes,' aligning with foundational research on LLM architecture.

  15. A Theory of Learning with Autoregressive Chain of Thought - Score: 18 (R=9, N=9) - Date: 2025-03-12 - Comment: The paper formalizes learning with autoregressive chain-of-thought, which aligns with foundational research in LLMs and introduces theoretical insights.

  16. L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling - Score: 18 (R=9, N=9) - Date: 2025-03-07 - Comment: The paper establishes a mutual information scaling law for long-context language modeling, which provides theoretical insights into LLM behavior and aligns with the LLM criterion.

  17. Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining - Score: 18 (R=9, N=9) - Date: 2025-03-07 - Comment: The paper establishes scaling laws for hyperparameters in LLM pretraining, providing theoretical insights into model optimization and aligning with foundational research in LLM behavior.

  18. Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought - Score: 18 (R=9, N=9) - Date: 2025-03-03 - Comment: The paper provides theoretical insights into how transformers implement multi-step gradient descent with Chain of Thought prompting, aligning with 'Large Language Models' and 'Representation Learning'.

  19. A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications - Score: 17 (R=10, N=7) - Date: 2025-03-11 - Comment: This is a comprehensive survey on Mixture-of-Experts (MoE), directly aligning with the model architecture criterion. It provides a broad overview and insights into MoE, making it highly relevant.

  20. Bridging the Dimensional Chasm: Uncover Layer-wise Dimensional Reduction in Transformers through Token Correlation - Score: 17 (R=9, N=8) - Date: 2025-03-31 - Comment: This paper provides a geometric framework for understanding token dynamics in Transformers, aligning with foundational research in representation learning and model architecture. The insights into dimensional reduction and token behavior are highly relevant.

  21. Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities - Score: 17 (R=9, N=8) - Date: 2025-03-31 - Comment: The paper focuses on leveraging Mixture-of-Experts (MoE) redundancy for multi-modal generative capabilities, which aligns with the 'Model Architecture' and 'Representation Learning' criteria. The use of low-rank adaptation and insights into modality-specific pathways adds theoretical depth.

  22. Concise One-Layer Transformers Can Do Function Evaluation (Sometimes) - Score: 17 (R=9, N=8) - Date: 2025-03-31 - Comment: This paper provides theoretical insights into the computational capabilities of concise one-layer transformers, directly contributing to understanding transformer architecture. It aligns well with the 'Model Architecture' criterion.

  23. Local Normalization Distortion and the Thermodynamic Formalism of Decoding Strategies for Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-31 - Comment: This paper develops a theoretical framework for decoding strategies in LLMs, analyzing local normalization distortion and its effects. It provides foundational insights into LLM behavior and decoding, aligning well with the criteria for LLM theoretical research.

  24. Model Assembly Learning with Heterogeneous Layer Weight Merging - Score: 17 (R=9, N=8) - Date: 2025-03-29 - Comment: The paper introduces Model Assembly Learning (MAL), a novel paradigm for merging heterogeneous model architectures and parameters. This aligns with model architecture innovations and provides foundational insights into parameter integration.

  25. Neuroplasticity in Artificial Intelligence -- An Overview and Inspirations on Drop In & Out Learning - Score: 17 (R=9, N=8) - Date: 2025-03-28 - Comment: The paper explores neuroplasticity-inspired mechanisms like 'dropin' and 'dropout' for neural networks, which aligns with emerging trends and foundational research in model architecture and lifelong learning.

  26. FFN Fusion: Rethinking Sequential Computation in Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper proposes FFN Fusion, an architectural optimization technique for LLMs, which aligns with foundational research in model architecture and efficiency.

  27. Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences - Score: 17 (R=9, N=8) - Date: 2025-03-21 - Comment: The paper introduces Lyra, a subquadratic architecture for biological sequence modeling, which is relevant to foundational research in model architecture and efficiency.

  28. Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts - Score: 17 (R=9, N=8) - Date: 2025-03-21 - Comment: The paper introduces Race-DiT, a Mixture of Experts (MoE) model for diffusion transformers with a flexible routing strategy and regularization techniques. It aligns closely with the MoE criterion under model architecture.

  29. ATTENTION2D: Communication Efficient Distributed Self-Attention Mechanism - Score: 17 (R=9, N=8) - Date: 2025-03-21 - Comment: The paper introduces ATTENTION2D for distributed self-attention, which aligns with foundational research in model architecture and efficiency.

  30. Natural Quantization of Neural Networks - Score: 17 (R=9, N=8) - Date: 2025-03-20 - Comment: The paper introduces a quantum neural network architecture, which aligns with the 'Model Architecture' criterion due to its novel approach to integrating quantum mechanics into neural networks.

  31. Unique Hard Attention: A Tale of Two Sides - Score: 17 (R=9, N=8) - Date: 2025-03-20 - Comment: The paper provides theoretical insights into the expressivity of transformers by analyzing the implications of leftmost- and rightmost-hard attention. This aligns closely with foundational research on model architecture and contributes to understanding transformer behavior.

  32. DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers - Score: 17 (R=9, N=8) - Date: 2025-03-19 - Comment: DiffMoE introduces a novel MoE-based approach for diffusion models, which aligns with foundational research in model architecture and efficiency.

  33. RWKV-7 "Goose" with Expressive Dynamic State Evolution - Score: 17 (R=9, N=8) - Date: 2025-03-19 - Comment: RWKV-7 introduces a novel sequence modeling architecture with theoretical insights into its capabilities, making it relevant to model architecture innovations.

  34. Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels - Score: 17 (R=9, N=8) - Date: 2025-03-19 - Comment: Tiled Flash Linear Attention introduces a novel kernel algorithm for efficient sequence modeling, which aligns with foundational research in model architecture and efficiency.

  35. Frac-Connections: Fractional Extension of Hyper-Connections - Score: 17 (R=9, N=8) - Date: 2025-03-19 - Comment: Frac-Connections propose a novel architectural improvement to residual connections, which aligns with foundational research in model architecture.

  36. SuperBPE: Space Travel for Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: The paper introduces SuperBPE, a novel tokenization method for LLMs, which aligns with foundational research in LLM architecture and pretraining improvements.

  37. Towards Learning High-Precision Least Squares Algorithms with Sequence Models - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: The paper explores the limitations of Transformers in high-precision numerical tasks and introduces polynomial architectures for learning numerical algorithms, which aligns with foundational research in model architecture and training dynamics.

  38. Test-Time Training Provably Improves Transformers as In-context Learners - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: The paper provides theoretical insights into test-time training for transformers as in-context learners, which aligns with foundational research in training dynamics and large language models.

  39. MoECollab: Democratizing LLM Development Through Collaborative Mixture of Experts - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper proposes the MoECollab framework, which leverages a Mixture of Experts (MoE) architecture, aligning with architectural innovation and emerging trends.

  40. LLM-Driven Multi-step Translation from C to Rust using Static Analysis - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper proposes a multi-step translation methodology for C-to-Rust using LLMs, aligning with foundational research in LLM-driven architecture innovations.

  41. Atlas: Multi-Scale Attention Improves Long Context Image Modeling - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper introduces a novel multi-scale attention mechanism and architecture (Atlas), which aligns with foundational research in model architecture.

  42. BriLLM: Brain-inspired Large Language Model - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper introduces a brain-inspired large language model, which aligns with architectural innovations and foundational research.

  43. MoLEx: Mixture of Layer Experts for Finetuning with Sparse Upcycling - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper introduces a sparse mixture of layer experts for fine-tuning, which is highly relevant to foundational research in model architecture.

  44. Taming Knowledge Conflicts in Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper introduces a method to address knowledge conflicts in LLMs, which aligns with foundational research in LLM behavior.

  45. Transformers without Normalization - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper introduces Dynamic Tanh as a replacement for normalization layers in Transformers, aligning with 'Model Architecture' due to its challenge to conventional practices.

  46. ASIDE: Architectural Separation of Instructions and Data in Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper proposes an architectural change (ASIDE) for LLMs to separate instructions and data, contributing to foundational insights into LLM architecture.

  47. Architecture-Aware Minimization (A$^2$M): How to Find Flat Minima in Neural Architecture Search - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper proposes a framework for flat minima in neural architecture search, contributing to foundational research in model architecture and optimization.

  48. From Equations to Insights: Unraveling Symbolic Structures in PDEs with LLMs - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper explores symbolic structures in PDEs using LLMs, contributing to foundational research in AI for science.

  49. Cost-Optimal Grouped-Query Attention for Long-Context LLMs - Score: 17 (R=9, N=8) - Date: 2025-03-13 - Comment: The paper explores cost-optimal grouped-query attention for long-context LLMs, which aligns with foundational research in model architecture and efficiency. It provides insights into attention head configurations and scaling laws.

  50. Astrea: A MOE-based Visual Understanding Model with Progressive Alignment - Score: 17 (R=9, N=8) - Date: 2025-03-13 - Comment: The paper introduces a MoE-based visual understanding model, which aligns with the model architecture criterion, particularly focusing on MoE innovations.

  51. Priority-Aware Preemptive Scheduling for Mixed-Priority Workloads in MoE Inference - Score: 17 (R=9, N=8) - Date: 2025-03-13 - Comment: The paper introduces a priority-aware preemptive scheduling system for MoE inference, which aligns with architectural innovations in MoE models.

  52. Discovering Influential Neuron Path in Vision Transformers - Score: 17 (R=9, N=8) - Date: 2025-03-13 - Comment: The paper investigates influential neuron paths in Vision Transformers, which aligns with understanding model architecture and interpretability. It provides insights into the inner workings of Transformers.

  53. Interpreting the Repeated Token Phenomenon in Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-13 - Comment: The paper provides a mechanistic explanation for a specific failure mode in LLMs and proposes a targeted patch, aligning with the criterion of theoretical insights into LLM behavior.

  54. Accelerating MoE Model Inference with Expert Sharding - Score: 17 (R=9, N=8) - Date: 2025-03-12 - Comment: The paper addresses efficiency in Mixture-of-Experts (MoE) inference through expert sharding, which directly aligns with the model architecture and compression criteria. The tensor sharding approach is a novel contribution to MoE inference.

  55. ProTeX: Structure-In-Context Reasoning and Editing of Proteins with Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-12 - Comment: The paper introduces a framework for protein structure reasoning and editing using LLMs, which aligns with foundational AI for science and multimodal generative paradigms.

  56. ResMoE: Space-efficient Compression of Mixture of Experts LLMs via Residual Restoration - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper introduces a compression method for Mixture-of-Experts models, which aligns with model compression and efficiency improvements.

  57. eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper proposes a memory-efficient MoE inference system, directly aligning with the model architecture and efficiency criteria.

  58. InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: This paper introduces a novel paradigm for long-context reasoning in LLMs, addressing computational scaling and reasoning depth. It aligns with foundational research in LLMs by proposing a new iterative reasoning framework, which could have broader implications for model efficiency and architecture.

  59. MoEMoE: Question Guided Dense and Scalable Sparse Mixture-of-Expert for Multi-source Multi-modal Answering - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper proposes a sparse Mixture-of-Experts framework for multi-source, multi-modal question answering, which aligns with foundational research on MoE and scalability.

  60. Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs - Score: 17 (R=9, N=8) - Date: 2025-03-10 - Comment: The paper discusses scaling Mixture-of-Experts (MoE) models efficiently, which directly aligns with foundational research in model architecture and efficiency.

  61. Continual Pre-training of MoEs: How robust is your router? - Score: 17 (R=9, N=8) - Date: 2025-03-10 - Comment: The paper investigates continual pre-training of MoE models, providing insights into routing algorithms and robustness, which is highly relevant to foundational research in MoE architectures.

  62. HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization - Score: 17 (R=9, N=8) - Date: 2025-03-07 - Comment: The paper proposes HybridNorm, a novel normalization strategy for transformers, which directly aligns with the model architecture criterion. It provides insights into training stability and performance improvements.

  63. SOLAR: Scalable Optimization of Large-scale Architecture for Reasoning - Score: 17 (R=9, N=8) - Date: 2025-03-07 - Comment: The paper introduces SOLAR, a framework for reasoning in LLMs with novel topological approaches, aligning with foundational research in model architecture and reasoning.

  64. Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling - Score: 17 (R=9, N=8) - Date: 2025-03-07 - Comment: The paper focuses on improving MoE inference efficiency with speculative parallelization, which directly aligns with foundational research in MoE architectures and efficiency.

  65. Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions - Score: 17 (R=9, N=8) - Date: 2025-03-07 - Comment: The paper meta-analyzes design decisions in language models, providing insights into architectural choices and their downstream impact, which aligns with foundational research in model architecture.

  66. Conformal Transformations for Symmetric Power Transformers - Score: 17 (R=9, N=8) - Date: 2025-03-06 - Comment: The paper introduces a novel architectural improvement to linear transformers by addressing capacity limitations in symmetric power transformers using conformal transformations. This aligns with the 'Model Architecture' criterion, focusing on architectural innovations.

  67. Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs - Score: 17 (R=9, N=8) - Date: 2025-03-06 - Comment: The paper investigates cognitive behaviors in language models that enable self-improvement, providing theoretical insights into reasoning behaviors and their impact on model performance. This aligns with the 'Large Language Models' criterion, focusing on theoretical insights into LLM behavior.

  68. Forgetting Transformer: Softmax Attention with a Forget Gate - Score: 17 (R=9, N=8) - Date: 2025-03-05 - Comment: The paper introduces a Forgetting Transformer with a novel attention mechanism, which aligns with foundational research in model architecture and transformer innovations.

  69. Depth-Width tradeoffs in Algorithmic Reasoning of Graph Tasks with Transformers - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper provides theoretical insights into the depth-width tradeoffs in transformers for graph tasks, which is highly relevant to understanding transformer architectures and their efficiency.

  70. Compositional Reasoning with Transformers, RNNs, and Chain of Thought - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper compares the expressive power of transformers, RNNs, and chain-of-thought methods for compositional reasoning, providing theoretical insights into model capabilities. This aligns with the interest in analyzing architectures.

  71. Liger: Linearizing Large Language Models to Gated Recurrent Structures - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper introduces Liger, a method for linearizing LLMs into gated recurrent structures, which aligns with foundational research in model architecture and efficiency. The use of LoRA for lightweight fine-tuning and the introduction of Liger Attention are novel contributions.

  72. DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper proposes a method for enhancing parameter efficiency in Mixture-of-Experts models, which aligns with foundational research in model architecture and efficiency.

  73. Neural ODE Transformers: Analyzing Internal Dynamics and Adaptive Fine-tuning - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper introduces Neural ODE Transformers, offering insights into internal dynamics and adaptive fine-tuning. This aligns with foundational research in model architecture and interpretability.

  74. Transformer Meets Twicing: Harnessing Unattended Residual Information - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper proposes Twicing Attention, a novel attention mechanism addressing representational capacity decay in transformers. This aligns with foundational research in model architecture and offers theoretical guarantees.

  75. CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper introduces a dual momentum Mixture-of-Experts framework for continual learning in multimodal tasks, which is highly relevant to MoE and architectural innovations.

  76. CoSMoEs: Compact Sparse Mixture of Experts - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: This paper introduces Compact Sparse Mixture of Experts (CoSMoEs) for on-device inference, addressing quality, memory, and latency. It is highly relevant to the Mixture-of-Experts (MoE) criterion and provides insights into architectural innovations.

  77. FANformer: Improving Large Language Models Through Effective Periodicity Modeling - Score: 17 (R=9, N=8) - Date: 2025-03-03 - Comment: FANformer integrates Fourier Analysis Network into the attention mechanism, providing a novel architectural improvement for LLMs with potential foundational impact on periodicity modeling in transformers.

  78. Oscillation-Reduced MXFP4 Training for Vision Transformers - Score: 17 (R=9, N=8) - Date: 2025-03-03 - Comment: The paper addresses FP4 training for Vision Transformers with novel methods to reduce weight oscillation, aligning with 'Model Compression' and efficiency breakthroughs.

  79. Triple Phase Transitions: Understanding the Learning Dynamics of Large Language Models from a Neuroscience Perspective - Score: 17 (R=9, N=8) - Date: 2025-03-03 - Comment: The paper explores phase transitions in LLMs from a neuroscience perspective, providing theoretical insights into emergent behaviors in LLM training.

  80. Disentangling Feature Structure: A Mathematically Provable Two-Stage Training Dynamics in Transformers - Score: 17 (R=9, N=8) - Date: 2025-03-03 - Comment: The paper provides a theoretical analysis of two-stage training dynamics in transformers, contributing to understanding of feature disentanglement and optimization processes.

  81. Revisiting Kernel Attention with Correlated Gaussian Process Representation - Score: 17 (R=9, N=8) - Date: 2025-03-03 - Comment: The paper introduces a novel transformer architecture using Correlated Gaussian Processes (CGPs) to enhance representation capacity, aligning with the 'Model Architecture' criterion. It also includes a sparse approximation, which touches on 'Model Compression'.

  82. Enhancing Multi-modal Models with Heterogeneous MoE Adapters for Fine-tuning - Score: 16 (R=9, N=7) - Date: 2025-03-27 - Comment: The paper proposes heterogeneous MoE adapters for multi-modal fine-tuning, which aligns with the topic of Mixture-of-Experts and architectural innovations. The focus on low-rank space for efficient modal fusion adds to its relevance.

  83. A Survey on Transformer Context Extension: Approaches and Evaluation - Score: 16 (R=9, N=7) - Date: 2025-03-18 - Comment: The survey focuses on extending Transformer context for long sequences, which aligns with foundational research in Transformer architecture and efficiency.

  84. Positivity sets of hinge functions - Score: 16 (R=9, N=7) - Date: 2025-03-17 - Comment: The paper provides theoretical insights into the expressivity of one-layer ReLU neural networks related to their activation regions, targeting foundational architectural understanding.

  85. This Is Your Doge, If It Please You: Exploring Deception and Robustness in Mixture of LLMs - Score: 16 (R=9, N=7) - Date: 2025-03-11 - Comment: The paper explores deception and robustness in mixtures of LLMs, which relates to the mixture-based systems covered under the model architecture criterion.

  86. Rethinking Graph Structure Learning in the Era of LLMs - Score: 16 (R=8, N=8) - Date: 2025-03-29 - Comment: The paper proposes a new paradigm for graph structure learning (GSL) in the context of LLMs, which involves architectural innovations and efficient integration methods, aligning with 'Model Architecture' and 'Emerging Trends' criteria.

  87. Generative AI for Validating Physics Laws - Score: 16 (R=8, N=8) - Date: 2025-03-25 - Comment: The paper introduces a novel generative AI approach to validate physics laws, which aligns with foundational research in AI for Science. The framing of physics laws as causal problems is innovative.

  88. Ordered Topological Deep Learning: a Network Modeling Case Study - Score: 16 (R=8, N=8) - Date: 2025-03-24 - Comment: The paper introduces a novel topological deep learning framework, which aligns with architectural innovations and emerging trends.

  89. QCPINN: Quantum Classical Physics-Informed Neural Networks for Solving PDEs - Score: 16 (R=8, N=8) - Date: 2025-03-21 - Comment: The paper explores quantum-classical hybrid architectures for physics-informed neural networks, which introduces architectural innovations relevant to AI for Science.

  90. Gene42: Long-Range Genomic Foundation Model With Dense Attention - Score: 16 (R=8, N=8) - Date: 2025-03-21 - Comment: The paper introduces Gene42, a genomic foundation model with dense attention for long-range context. It aligns with foundational research in architecture innovations for science applications.

  91. PENCIL: Long Thoughts with Short Memory - Score: 16 (R=8, N=8) - Date: 2025-03-19 - Comment: PENCIL introduces a novel reduction mechanism for autoregressive generation, which aligns with foundational research in model architecture and efficiency improvements.

  92. Analytic Subspace Routing: How Recursive Least Squares Works in Continual Learning of Large Language Model - Score: 16 (R=8, N=8) - Date: 2025-03-19 - Comment: The paper proposes a novel subspace routing mechanism for continual learning in LLMs, which aligns with representation learning and model architecture innovations.

  93. COSMOS: Continuous Simplicial Neural Networks - Score: 16 (R=8, N=8) - Date: 2025-03-18 - Comment: The paper introduces a novel architecture for simplicial neural networks derived from PDEs, which aligns with architectural innovations and addresses over-smoothing in geometric deep learning.

  94. Hybrid Learners Do Not Forget: A Brain-Inspired Neuro-Symbolic Approach to Continual Learning - Score: 16 (R=8, N=8) - Date: 2025-03-17 - Comment: The paper introduces a neuro-symbolic approach to continual learning, aligning with architectural innovation and emerging trends.

  95. Permutation Equivariant Neural Networks for Symmetric Tensors - Score: 16 (R=8, N=8) - Date: 2025-03-17 - Comment: The paper introduces permutation equivariant neural networks for symmetric tensors, which aligns with architectural innovations and foundational research.

  96. Neuromorphic Quantum Neural Networks with Tunnel-Diode Activation Functions - Score: 16 (R=8, N=8) - Date: 2025-03-10 - Comment: The use of tunnel-diode activation functions introduces a novel physics-based activation mechanism, which is relevant to architectural innovations in neural networks.

  97. Seeing is Understanding: Unlocking Causal Attention into Modality-Mutual Attention for Multimodal LLMs - Score: 16 (R=8, N=8) - Date: 2025-03-05 - Comment: The paper proposes a novel attention mechanism (MMA) for multimodal LLMs, which aligns with architectural innovations in large models.

  98. CrystalFramer: Rethinking the Role of Frames for SE(3)-Invariant Crystal Structure Modeling - Score: 16 (R=8, N=8) - Date: 2025-03-05 - Comment: The paper introduces dynamic frames for crystal structure modeling, which is a novel architectural concept in the context of SE(3)-invariant modeling.

  99. Relating Piecewise Linear Kolmogorov Arnold Networks to ReLU Networks - Score: 16 (R=8, N=8) - Date: 2025-03-04 - Comment: The paper explores the connection between Kolmogorov-Arnold Networks and ReLU networks, providing explicit constructions. This aligns with the interest in theoretical insights into architectures.

  100. Depth-Adaptive Graph Neural Networks via Learnable Bakry-Émery Curvature - Score: 16 (R=8, N=8) - Date: 2025-03-04 - Comment: The paper proposes a depth-adaptive GNN leveraging Bakry-Émery curvature, which aligns with architectural innovations in graph neural networks.

  101. A Proposal for Networks Capable of Continual Learning - Score: 15 (R=8, N=7) - Date: 2025-03-31 - Comment: The paper proposes a novel architecture for continual learning, which aligns with the model architecture criterion. The approach introduces a new paradigm for response preservation, making it relevant and moderately novel.

  102. Uncertainty propagation in feed-forward neural network models - Score: 15 (R=8, N=7) - Date: 2025-03-29 - Comment: The paper develops uncertainty propagation methods for feed-forward neural networks, offering theoretical insights into how information propagates, which is relevant to foundational research in neural network behavior.

  103. On the Optimality of Single-label and Multi-label Neural Network Decoders - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: The paper analytically proves the optimality of certain neural network decoders, which aligns with foundational research in model architecture and theoretical insights.

  104. Distil-xLSTM: Learning Attention Mechanisms through Recurrent Structures - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: The paper introduces a recurrent architecture (Distil-xLSTM) as an alternative to attention-based models, which aligns with foundational research in model architecture.

  105. Token-Level Uncertainty-Aware Objective for Language Model Post-Training - Score: 15 (R=8, N=7) - Date: 2025-03-24 - Comment: The paper proposes a token-level uncertainty-aware objective for language model post-training, which aligns with foundational research in LLM training dynamics and uncertainty modeling.

  106. InhibiDistilbert: Knowledge Distillation for a ReLU and Addition-based Transformer - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper explores inhibitor attention and knowledge distillation, which aligns with foundational research in model compression and efficiency.

  107. Blend the Separated: Mixture of Synergistic Experts for Data-Scarcity Drug-Target Interaction Prediction - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper proposes a Mixture of Synergistic Experts for drug-target interaction prediction, which aligns with foundational research in representation learning under data scarcity.

  108. Foundation models may exhibit staged progression in novel CBRN threat disclosure - Score: 15 (R=8, N=7) - Date: 2025-03-20 - Comment: The paper discusses staged progression in foundation models' reasoning capabilities, which aligns with the 'Large Language Models' criterion, particularly in understanding theoretical insights into model behavior.

  109. Long Context Modeling with Ranked Memory-Augmented Retrieval - Score: 15 (R=8, N=7) - Date: 2025-03-20 - Comment: The paper introduces a novel memory-augmented retrieval framework for long-context modeling, which aligns with model architecture innovations and efficiency improvements.

  110. Dynamic Accumulated Attention Map for Interpreting Evolution of Decision-Making in Vision Transformer - Score: 15 (R=8, N=7) - Date: 2025-03-20 - Comment: The paper introduces a novel method for visualizing attention flow in Vision Transformers, which provides insights into the inner workings of an existing architecture, aligning with the model architecture criterion.

  111. MetaScale: Test-Time Scaling with Evolving Meta-Thoughts - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper proposes MetaScale, a novel test-time scaling framework for LLMs, which aligns with foundational research in LLM behavior and adaptability.

  112. Deep Belief Markov Models for POMDP Inference - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper introduces a novel architecture, Deep Belief Markov Models, which aligns with model architecture innovations, particularly in dynamic and conditional networks.

  113. Do you understand epistemic uncertainty? Think again! Rigorous frequentist epistemic uncertainty estimation in regression - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper provides a theoretical framework for epistemic uncertainty estimation in regression, which aligns with foundational research in understanding model behavior and uncertainty quantification.

  114. On Local Posterior Structure in Deep Ensembles - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper investigates deep ensembles and Bayesian neural networks, which aligns with foundational research in model architecture and uncertainty quantification.

  115. A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper proposes a multi-power law for predicting loss curves across learning rate schedules, offering insights into training dynamics, which is relevant to foundational research.

  116. Fast filtering of non-Gaussian models using Amortized Optimal Transport Maps - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper introduces a mixture-of-experts-like approach using amortized optimal transport maps, which is relevant to model architecture innovations.

  117. The Architecture and Evaluation of Bayesian Neural Networks - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper explores Bayesian Neural Networks and their posterior approximations, focusing on architectural choices and uncertainty quantification. This aligns with foundational research in model architecture and uncertainty.

  118. Advanced Deep Learning Methods for Protein Structure Prediction and Design - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: Comprehensive review of deep learning methods for protein structure prediction, aligning with AI for Science foundational research.

  119. Designing Neural Synthesizers for Low Latency Interaction - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper investigates latency optimization in neural audio synthesis, aligning with foundational research in model efficiency and architecture design.

  120. Context-Aware Rule Mining Using a Dynamic Transformer-Based Framework - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: Proposes an improved Transformer architecture with dynamic weight adjustment and temporal dependency modules, aligning with architectural innovation.

  121. Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers? - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper proposes Kolmogorov-Arnold Attention for ViTs, contributing to architectural innovations and representation learning.

  122. Poly-MgNet: Polynomial Building Blocks in Multigrid-Inspired ResNets - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper introduces polynomial building blocks inspired by multigrid methods, contributing to architectural innovations in neural networks.

  123. Radar: Fast Long-Context Decoding for Any Transformer - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: Radar proposes a training-free method to accelerate Transformer inference for long-context data, aligning with the 'Model Compression' criterion due to its focus on efficiency improvements.

  124. Robustness Tokens: Towards Adversarial Robustness of Transformers - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper proposes Robustness Tokens for adversarial robustness in Transformers, aligning with 'Model Architecture' due to its structural innovation.

  125. Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: Gumiho introduces a hybrid speculative decoding architecture for LLMs, aligning with 'Model Architecture' due to its structural innovation.

  126. DTA: Dual Temporal-channel-wise Attention for Spiking Neural Networks - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper introduces Dual Temporal-channel-wise Attention for Spiking Neural Networks, contributing to architectural innovations.

  127. SO(3)-Equivariant Neural Networks for Learning Vector Fields on Spheres - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper introduces SO(3)-equivariant neural networks for learning vector fields on spheres, which involves architectural innovation and symmetry-aware modeling.

  128. Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper proposes a method to mitigate contextual hallucination in LLMs by dynamically adjusting attention maps, aligning with the 'Large Language Models' criterion due to its focus on improving model behavior.

  129. Neurosymbolic Decision Trees - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper introduces neurosymbolic decision trees, which is a novel approach combining symbolic reasoning and neural networks. It aligns with emerging trends in model architecture.

  130. MinGRU-Based Encoder for Turbo Autoencoder Frameworks - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: The paper revisits RNNs for Turbo autoencoders and integrates efficient RNN architectures, which aligns with foundational research in model architecture and sequence modeling.

  131. Whoever Started the Interference Should End It: Guiding Data-Free Model Merging via Task Vectors - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: The paper proposes a model merging method guided by task vectors, which aligns with model architecture and efficiency innovations.

  132. HOFAR: High-Order Augmentation of Flow Autoregressive Transformers - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: The paper introduces high-order supervision for flow autoregressive transformers, which aligns with model architecture innovations.

  133. Can Memory-Augmented Language Models Generalize on Reasoning-in-a-Haystack Tasks? - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: The paper introduces a memory-augmented LLM architecture for reasoning tasks, which provides insights into LLM behavior and architecture, making it relevant to foundational research.

  134. Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: The paper proposes a novel defense mechanism against model merging by modifying model parameters, which aligns with foundational research in model architecture and parameter manipulation.

  135. Deep ARTMAP: Generalized Hierarchical Learning with Adaptive Resonance Theory - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: The paper proposes Deep ARTMAP, a novel hierarchical learning framework extending ARTMAP. It introduces architectural innovations relevant to model architecture research.

  136. TIDE: Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper introduces sparse autoencoders for diffusion transformers, which aligns with representation learning and architectural insights, particularly in sparse methods and interpretability.

  137. Using Subgraph GNNs for Node Classification: an Overlooked Potential Approach - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper reformulates node classification as a subgraph classification problem, which aligns with architectural innovations in GNNs.

  138. BlackGoose Rimer: Harnessing RWKV-7 as a Simple yet Superior Replacement for Transformers in Large-Scale Time Series Modeling - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper introduces RWKV-7 as a replacement for Transformers in time series modeling, which aligns with foundational research in model architecture.

  139. AF-KAN: Activation Function-Based Kolmogorov-Arnold Networks for Efficient Representation Learning - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper proposes AF-KAN, a novel architecture inspired by Kolmogorov-Arnold Networks, with innovations in activation functions and parameter reduction methods. This aligns with the 'Model Architecture' criterion, as it explores architectural innovations and efficiency improvements.

  140. KunlunBaize: LLM with Multi-Scale Convolution and Multi-Token Prediction Under TransformerX Framework - Score: 15 (R=8, N=7) - Date: 2025-03-10 - Comment: The paper proposes a novel TransformerX framework with multi-scale convolution and multi-token prediction, which aligns with architectural innovations in LLMs.

  141. Simple Self Organizing Map with Visual Transformer - Score: 15 (R=8, N=7) - Date: 2025-03-07 - Comment: The paper explores combining Vision Transformers (ViTs) with Self-Organizing Maps (SOMs), which aligns with foundational research in representation learning and architectural innovations.

  142. See What You Are Told: Visual Attention Sink in Large Multimodal Models - Score: 15 (R=8, N=7) - Date: 2025-03-06 - Comment: The paper investigates the visual attention sink phenomenon in large multimodal models and proposes a method to redistribute attention for better performance. This aligns with foundational research in model architecture and interpretability.

  143. MindBridge: Scalable and Cross-Model Knowledge Editing via Memory-Augmented Modality - Score: 15 (R=8, N=7) - Date: 2025-03-05 - Comment: The paper proposes a scalable knowledge editing framework for LLMs, which aligns with foundational research in LLM behavior and architecture-level innovations.

  144. Enhancing Transformer with GNN Structural Knowledge via Distillation: A Novel Approach - Score: 15 (R=8, N=7) - Date: 2025-03-05 - Comment: The paper proposes a knowledge distillation framework to transfer structural knowledge from GNNs to Transformers, aligning with architectural innovations and cross-architectural analysis.

  145. How simple can you go? An off-the-shelf transformer approach to molecular dynamics - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper evaluates a minimally modified transformer for molecular dynamics, questioning the necessity of specialized architectural features. This aligns with the interest in analyzing existing architectures.

  146. Architectural and Inferential Inductive Biases For Exchangeable Sequence Modeling - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper studies architectural and inferential biases in exchangeable sequence modeling, which aligns with the model architecture criterion by analyzing and proposing improvements to Transformer-based architectures.

  147. Amortized Conditional Independence Testing - Score: 15 (R=8, N=7) - Date: 2025-03-03 - Comment: The paper introduces a novel transformer-based architecture (ACID) for conditional independence testing, which aligns with foundational research in representation learning and model architecture.

  148. Information-Theoretic Perspectives on Optimizers - Score: 15 (R=8, N=7) - Date: 2025-03-03 - Comment: The paper introduces an information-theoretic perspective on optimizers, which provides foundational insights into the interplay between optimizers and architectures, aligning with the core topics.

  149. SEKI: Self-Evolution and Knowledge Inspiration based Neural Architecture Search via Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-03-03 - Comment: The paper proposes a novel LLM-based neural architecture search method, which aligns with model architecture innovations and demonstrates strong generalization capabilities.

  150. Model-free front-to-end training of a large high performance laser neural network - Score: 15 (R=7, N=8) - Date: 2025-03-24 - Comment: The paper demonstrates a photonic ONN with in-situ learning capabilities, which aligns with emerging trends in unconventional neural network architectures.

  151. Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding - Score: 15 (R=7, N=8) - Date: 2025-03-11 - Comment: The paper explores attention heads in large vision-language models for visual grounding, which provides insights into model architecture and representation learning. However, it is slightly application-driven.

  152. Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM - Score: 14 (R=8, N=6) - Date: 2025-03-25 - Comment: The paper leverages Mixture-of-Experts (MoE) architecture for code LLMs, which directly aligns with the model architecture criterion. However, it focuses on application-specific performance improvements rather than foundational insights.

  153. Experiments with Optimal Model Trees - Score: 14 (R=7, N=7) - Date: 2025-03-18 - Comment: The paper explores globally optimal model trees, providing insights into interpretable machine learning and optimization, which is relevant to foundational research in model architecture.

  154. Enhanced Soups for Graph Neural Networks - Score: 14 (R=7, N=7) - Date: 2025-03-17 - Comment: The work improves GNN performance using a learned 'souping' mechanism, which offers novel insights into improving model behavior, though only within the bounded scope of GNNs.

  155. Towards Robust Multimodal Representation: A Unified Approach with Adaptive Experts and Alignment - Score: 14 (R=7, N=7) - Date: 2025-03-13 - Comment: The paper introduces a Mixture of Experts (MoE) framework for handling incomplete multimodal data, which aligns with the 'Model Architecture' criterion. However, the focus on healthcare applications makes it partially relevant.

  156. Minion Gated Recurrent Unit for Continual Learning - Score: 14 (R=7, N=7) - Date: 2025-03-11 - Comment: The paper proposes a simplified recurrent unit for continual learning, which aligns with model architecture innovations. However, the focus on edge-device applications makes it partially relevant.

  157. Systems and Algorithms for Convolutional Multi-Hybrid Language Models at Scale - Score: 14 (R=7, N=7) - Date: 2025-03-05 - Comment: The paper introduces convolutional multi-hybrid architectures, which is relevant to architectural innovations. However, the focus is on efficiency and hardware optimization rather than foundational insights.

  158. Input Specific Neural Networks - Score: 14 (R=7, N=7) - Date: 2025-03-04 - Comment: The paper introduces Input Specific Neural Networks (ISNNs) with novel architectural constraints for encoding structural relationships. While the focus is on computational mechanics applications, the architectural innovation aligns with the 'Model Architecture' criterion.

  159. Towards Experience Replay for Class-Incremental Learning in Fully-Binary Networks - Score: 13 (R=7, N=6) - Date: 2025-03-11 - Comment: The paper explores class-incremental learning in fully binary networks, which aligns with model architecture and efficiency but is more application-driven.

  160. Partial Convolution Meets Visual Attention - Score: 13 (R=7, N=6) - Date: 2025-03-06 - Comment: The paper proposes PATNet, a hybrid network combining partial convolution and visual attention mechanisms, focusing on efficiency improvements in CNNs and ViTs. This aligns with 'Model Architecture' but leans towards application-driven improvements.

Model Compression and Efficiency (173)

  1. Extendable Long-Horizon Planning via Hierarchical Multiscale Diffusion - Score: 20.0 (R=0, N=0) - Date: 2025-03-27 - Comment: Author match

  2. A scalable gene network model of regulatory dynamics in single cells - Score: 20.0 (R=0, N=0) - Date: 2025-03-27 - Comment: Author match

  3. Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training - Score: 20.0 (R=0, N=0) - Date: 2025-03-25 - Comment: Author match

  4. Exploring the Limits of KV Cache Compression in Visual Autoregressive Transformers - Score: 19 (R=10, N=9) - Date: 2025-03-20 - Comment: The paper formalizes KV-cache compression in visual autoregressive transformers, directly addressing the 'Model Compression' criterion with theoretical insights into memory efficiency.

  5. MoQa: Rethinking MoE Quantization with Multi-stage Data-model Distribution Awareness - Score: 18 (R=10, N=8) - Date: 2025-03-28 - Comment: This paper introduces a novel quantization framework for MoE models, addressing compression and efficiency challenges specific to sparse data activation and expert combinations. It aligns with the model compression and MoE criteria.

  6. LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation - Score: 18 (R=10, N=8) - Date: 2025-03-27 - Comment: LogQuant introduces a novel 2-bit quantization technique for KV Cache in LLM inference, addressing memory efficiency and accuracy preservation. This aligns closely with model compression and efficiency breakthroughs, particularly in LLMs.

  7. Large Language Model Compression via the Nested Activation-Aware Decomposition - Score: 18 (R=10, N=8) - Date: 2025-03-24 - Comment: The paper focuses on a novel low-rank decomposition method for compressing large language models (LLMs), which aligns closely with the 'Model Compression' criterion. The proposed nested activation-aware framework (NSVD) introduces a new approach to handle activation variability and outliers, making it a significant contribution to compression techniques.

  8. Malliavin-Bismut Score-based Diffusion Models - Score: 18 (R=9, N=9) - Date: 2025-03-24 - Comment: The paper introduces a novel theoretical framework using Malliavin calculus for score-based diffusion models, which aligns with foundational research in generative modeling.

  9. Changing Base Without Losing Pace: A GPU-Efficient Alternative to MatMul in DNNs - Score: 18 (R=9, N=9) - Date: 2025-03-18 - Comment: The paper introduces a GPU-efficient alternative to matrix multiplication in DNNs, which aligns with model compression and efficiency breakthroughs. The Strassen-Tile operator is a novel contribution.

  10. SAUCE: Selective Concept Unlearning in Vision-Language Models with Sparse Autoencoders - Score: 18 (R=9, N=9) - Date: 2025-03-17 - Comment: SAUCE utilizes sparse autoencoders for selective concept unlearning, demonstrating theoretical innovations in sparse methods and aligning with foundational model compression research.

  11. Quantum-PEFT: Ultra parameter-efficient fine-tuning - Score: 18 (R=9, N=9) - Date: 2025-03-10 - Comment: The paper proposes Quantum-PEFT, a novel parameter-efficient fine-tuning method leveraging quantum computations, which aligns with model compression and efficiency breakthroughs.

  12. STADE: Standard Deviation as a Pruning Metric - Score: 17 (R=9, N=8) - Date: 2025-03-31 - Comment: The paper proposes STADE, a pruning method for LLMs, and provides theoretical insights into pruning strategies, which aligns with the model compression criterion. It extends the understanding of pruning beyond existing methods like Wanda.

  13. An Efficient Training Algorithm for Models with Block-wise Sparsity - Score: 17 (R=9, N=8) - Date: 2025-03-31 - Comment: The paper introduces an efficient training algorithm for block-wise sparse models, which aligns with the 'Model Compression' criterion. The focus on block-wise sparsity and efficient training adds theoretical and practical value.

  14. Consistent Multigroup Low-Rank Approximation - Score: 17 (R=9, N=8) - Date: 2025-03-29 - Comment: The paper proposes a consistent low-rank approximation method for multigroup data, which aligns with foundational research in model compression and low-rank approaches.

  15. HOT: Hadamard-based Optimized Training - Score: 17 (R=9, N=8) - Date: 2025-03-29 - Comment: The paper introduces Hadamard-based optimizations for backpropagation, which aligns with the 'Model Compression' criterion due to its focus on memory and computational efficiency.

  16. ASGO: Adaptive Structured Gradient Optimization - Score: 17 (R=9, N=8) - Date: 2025-03-27 - Comment: The paper introduces ASGO, a novel optimization algorithm leveraging structured gradients and low-rank properties, which aligns with model compression and efficiency breakthroughs. The theoretical analysis and practical modifications add to its novelty.

  17. TeleLoRA: Teleporting Model-Specific Alignment Across LLMs - Score: 17 (R=9, N=8) - Date: 2025-03-27 - Comment: TeleLoRA introduces a novel framework for low-rank adaptation across LLMs, which aligns with model compression and efficiency topics. The permutation-symmetric generator and memory-efficient design are innovative contributions.

  18. xKV: Cross-Layer SVD for KV-Cache Compression - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: Proposes a novel KV-cache compression method using cross-layer SVD, which directly aligns with model compression and efficiency breakthroughs.

  19. BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: Introduces a GPU-optimized framework for low-bit KV-cache decoding, which aligns with model compression and efficiency improvements.

  20. Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper proposes a hybrid KV cache quantization method for LLMs, which aligns with model compression and efficiency breakthroughs. The co-design of algorithm and hardware adds novelty.

  21. Maximum Redundancy Pruning: A Principle-Driven Layerwise Sparsity Allocation for LLMs - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper proposes a principle-driven pruning method for LLMs, which aligns with foundational research in model compression and efficiency.

  22. TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper introduces a token pruning method for vision-language models, focusing on inference optimization and compatibility with FlashAttention, aligning with model compression and efficiency.

  23. Decoupling Angles and Strength in Low-rank Adaptation - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper introduces DeLoRA, a novel parameter-efficient fine-tuning method that aligns with model compression and efficiency breakthroughs.

  24. Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper introduces RaNA adapters for improving inference efficiency in Transformers, focusing on low-rank matrix decompositions and adaptive masking. This aligns with model compression and architectural efficiency.

  25. Optimal Neural Compressors for the Rate-Distortion-Perception Tradeoff - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: Addresses neural compression with a focus on the rate-distortion-perception tradeoff, providing theoretical insights and novel methods like lattice coding and shared randomness, aligning well with model compression criteria.

  26. Improving Quantization with Post-Training Model Expansion - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: Explores post-training model expansion to improve quantization, which is a novel approach within the model compression domain and aligns with foundational research in efficiency improvements.

  27. Variance Control via Weight Rescaling in LLM Pre-training - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper introduces new variance control strategies (LIR and TVR) for LLM pretraining, which aligns with foundational research in LLM behavior and efficiency improvements.

  28. Efficient Knowledge Distillation via Curriculum Extraction - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper introduces a curriculum extraction method for knowledge distillation, which aligns with model compression and efficiency. It provides theoretical guarantees and demonstrates practical benefits.

  29. Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs - Score: 17 (R=9, N=8) - Date: 2025-03-24 - Comment: The paper proposes a sparse logit sampling method for knowledge distillation in LLMs, which aligns with model compression and efficiency. The use of importance sampling for unbiased estimates is a novel contribution.

  30. Accelerating Transformer Inference and Training with 2:4 Activation Sparsity - Score: 17 (R=9, N=8) - Date: 2025-03-24 - Comment: The paper explores activation sparsity in Transformers for efficiency, which aligns with foundational research in model compression and efficiency.

  31. Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-21 - Comment: The paper proposes a hybrid-level token compression strategy for MLLMs, which aligns with foundational research in model compression and efficiency.

  32. xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: The paper introduces xLSTM 7B, a recurrent LLM architecture optimized for efficient inference. This aligns with the core topic of model architecture innovations.

  33. ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: The paper proposes a novel compression paradigm, ClusComp, which aligns with model compression criteria by addressing quantization and efficient finetuning with theoretical contributions.

  34. ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: The paper introduces ZO2, a zeroth-order fine-tuning framework for LLMs, which aligns with model compression and efficiency breakthroughs by enabling fine-tuning of extremely large models with limited GPU memory.

  35. Discovering uncertainty: Gaussian constitutive neural networks with correlated weights - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: Gaussian constitutive neural networks enhance interpretability and tackle parameter uncertainty, showing foundational advancements in sparse/low-rank methods for AI.

  36. FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: FW-Merging innovates in model merging via constrained optimization techniques, aligning significantly with foundational research in model efficiency and architecture-level improvements.

  37. Localized Concept Erasure for Text-to-Image Diffusion Models Using Training-Free Gated Low-Rank Adaptation - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper introduces a training-free low-rank adaptation method for concept erasure in diffusion models, aligning with model compression and efficiency research.

  38. PREAMBLE: Private and Efficient Aggregation of Block Sparse Vectors and Applications - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: Introduces PREAMBLE for efficient aggregation of block-sparse vectors, aligning with model compression and sparsity criteria.

  39. Multi-View Node Pruning for Accurate Graph Representation - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: Proposes a multi-view pruning method for graph representation learning, aligning with representation learning and sparsity criteria.

  40. Limits of KV Cache Compression for Tensor Attention based Autoregressive Transformers - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper analyzes KV cache compression limits in tensor attention, aligning with foundational research in model compression and efficiency.

  41. Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper presents Samoyeds, focusing on structured sparsity in MoE models, which aligns closely with model compression and efficiency breakthroughs.

  42. ZeroMerge: Parameter-Free KV Cache Compression for Memory-Efficient Long-Context LLMs - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper proposes a parameter-free KV cache compression method for LLMs, contributing to foundational research in model compression and efficiency.

  43. Explainable Bayesian deep learning through input-skip Latent Binary Bayesian Neural Networks - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper introduces input-skip Latent Binary Bayesian Neural Networks, contributing to sparsity and model compression with theoretical guarantees.

  44. KV-Distill: Nearly Lossless Learnable Context Compression for LLMs - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: KV-Distill introduces a compression framework for LLMs, aligning with 'Model Compression' due to its focus on efficient context representation.

  45. GRU: Mitigating the Trade-off between Unlearning and Retention for Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-13 - Comment: The paper introduces Gradient Rectified Unlearning (GRU) for LLMs, focusing on unlearning while retaining general functionality. This aligns with foundational advancements in LLM behavior and optimization.

  46. LLMs Know What to Drop: Self-Attention Guided KV Cache Eviction for Efficient Long-Context Inference - Score: 17 (R=9, N=8) - Date: 2025-03-13 - Comment: The paper introduces a KV cache eviction method for efficient long-context inference in LLMs, aligning with the 'Model Compression' criterion due to its focus on memory efficiency and sparsity.

  47. ELECTRA: A Symmetry-breaking Cartesian Network for Charge Density Prediction with Floating Orbitals - Score: 17 (R=9, N=8) - Date: 2025-03-12 - Comment: The paper introduces a symmetry-breaking equivariant model for predicting electronic charge densities, which is foundational in AI for science and introduces a novel generative paradigm.

  48. Accurate INT8 Training Through Dynamic Block-Level Fallback - Score: 17 (R=9, N=8) - Date: 2025-03-12 - Comment: The paper proposes a dynamic fallback quantization method for INT8 training, which aligns with the model compression criterion by addressing efficiency and robustness in low-bit training.

  49. EFPC: Towards Efficient and Flexible Prompt Compression - Score: 17 (R=9, N=8) - Date: 2025-03-12 - Comment: The paper proposes a novel prompt compression method for LLMs, which aligns with foundational research in model compression and efficiency.

  50. SplitQuantV2: Enhancing Low-Bit Quantization of LLMs Without GPUs - Score: 17 (R=9, N=8) - Date: 2025-03-12 - Comment: The paper introduces SplitQuantV2, a novel low-bit quantization method for LLMs, which aligns with the model compression criterion and demonstrates practical efficiency improvements.

  51. MergeQuant: Accurate 4-bit Static Quantization of Large Language Models by Channel-wise Calibration - Score: 17 (R=9, N=8) - Date: 2025-03-12 - Comment: The paper focuses on a novel static quantization framework for LLMs, which aligns with the model compression criterion, particularly in sparsity and quantization.

  52. Understanding the Learning Dynamics of LoRA: A Gradient Flow Perspective on Low-Rank Adaptation in Matrix Factorization - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper provides theoretical insights into the learning dynamics of LoRA, which aligns with representation learning and low-rank adaptation in model compression.

  53. Task Vector Quantization for Memory-Efficient Model Merging - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper introduces a memory-efficient model merging method using task vector quantization, which aligns with model compression and efficiency breakthroughs.

  54. Seeing Delta Parameters as JPEG Images: Data-Free Delta Compression with Discrete Cosine Transform - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper introduces a novel data-free delta compression method inspired by JPEG compression, which aligns with model compression and efficiency breakthroughs.

  55. Towards Superior Quantization Accuracy: A Layer-sensitive Approach - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: This paper proposes a layer-sensitive approach to quantization, which directly aligns with the model compression criterion. The methods SensiBoost and KurtBoost provide novel insights into layer-specific quantization strategies, improving accuracy with minimal memory overhead.

  56. Seesaw: High-throughput LLM Inference via Model Re-sharding - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper introduces a dynamic re-sharding technique for LLM inference, which aligns with model compression and efficiency breakthroughs.

  57. Sample-aware Adaptive Structured Pruning for Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper proposes a structured pruning framework for LLMs, which aligns with the model compression criterion. The use of adaptive methods adds novelty to the pruning process.

  58. IDEA Prune: An Integrated Enlarge-and-Prune Pipeline in Generative Language Model Pretraining - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper proposes an integrated enlarge-and-prune pipeline for generative language model pretraining, which aligns with foundational research in model compression.

  59. Balcony: A Lightweight Approach to Dynamic Inference of Generative Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-10 - Comment: The paper introduces Balcony, a framework for dynamic inference in LLMs, which aligns with the model compression and efficiency criterion through its innovative depth-based dynamic inference approach.

  60. Wanda++: Pruning Large Language Models via Regional Gradients - Score: 17 (R=9, N=8) - Date: 2025-03-10 - Comment: The paper introduces Wanda++, a pruning framework for LLMs, which aligns with model compression and sparsity. The use of regional gradients is a novel approach.

  61. Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning - Score: 17 (R=9, N=8) - Date: 2025-03-10 - Comment: The paper proposes task-aware KV cache compression, which aligns with model compression and efficiency improvements in LLMs. The task-aware approach is a novel contribution.

  62. TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation - Score: 17 (R=9, N=8) - Date: 2025-03-10 - Comment: The paper introduces a novel Branch-Merge distillation approach for model compression, which aligns with the model compression criterion, particularly in the context of LLMs.

  63. Universality of Layer-Level Entropy-Weighted Quantization Beyond Model Architecture and Size - Score: 17 (R=9, N=8) - Date: 2025-03-07 - Comment: The paper proposes a novel entropy-weighted quantization method for LLMs, which aligns with the model compression criterion. The findings on entropy and precision requirements are insightful and relevant.

  64. How can representation dimension dominate structurally pruned LLMs? - Score: 17 (R=9, N=8) - Date: 2025-03-07 - Comment: The paper investigates the role of representation dimension in pruned LLMs, providing foundational insights into structured pruning and its impact on model performance.

  65. PowerAttention: Exponentially Scaling of Receptive Fields for Effective Sparse Attention - Score: 17 (R=9, N=8) - Date: 2025-03-06 - Comment: The paper introduces PowerAttention, a sparse attention mechanism for LLMs that improves efficiency and scalability. This aligns with the 'Model Compression' criterion, focusing on efficiency breakthroughs in attention mechanisms.

  66. Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression - Score: 17 (R=9, N=8) - Date: 2025-03-05 - Comment: The paper introduces Q-Filters, a novel KV Cache compression method leveraging QK geometry, which aligns with the model compression criterion. It provides theoretical insights and demonstrates compatibility with FlashAttention, making it highly relevant.

  67. An Accelerated Alternating Partial Bregman Algorithm for ReLU-based Matrix Decomposition - Score: 17 (R=9, N=8) - Date: 2025-03-05 - Comment: The paper introduces a novel matrix decomposition framework with theoretical contributions to sparsity and low-rank methods, which aligns with model compression and representation learning.

  68. Pruning Deep Neural Networks via a Combination of the Marchenko-Pastur Distribution and Regularization - Score: 17 (R=9, N=8) - Date: 2025-03-05 - Comment: The paper uses Random Matrix Theory for pruning DNNs, aligning with the model compression criterion and providing both theoretical and empirical contributions.

  69. Identifying Sensitive Weights via Post-quantization Integral - Score: 17 (R=9, N=8) - Date: 2025-03-05 - Comment: The paper proposes a novel sensitivity metric (PQI) for post-training quantization, which is highly relevant to model compression and efficiency.

  70. CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging - Score: 17 (R=9, N=8) - Date: 2025-03-05 - Comment: The paper introduces a sparsification framework (CABS) for model merging, which aligns with model compression and sparsity-related research.

  71. When Can You Get Away with Low Memory Adam? - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper introduces SlimAdam, a memory-efficient variant of the Adam optimizer, which aligns with the model compression criterion by addressing memory efficiency through a novel SNR-based approach.

  72. RSQ: Learning from Important Tokens Leads to Better Quantized LLMs - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper proposes a novel quantization method (RSQ) for LLMs, focusing on token importance and efficiency, which aligns with model compression and efficiency breakthroughs.

  73. EliteKV: Scalable KV Cache Compression via RoPE Frequency Selection and Joint Low-Rank Projection - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: EliteKV proposes a novel KV cache compression method for RoPE-based models, which aligns with foundational research in model compression and efficiency.

  74. Revisiting Large Language Model Pruning using Neuron Semantic Attribution - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper revisits pruning in LLMs using neuron semantic attribution, which aligns with model compression and provides insights into pruning behavior.

  75. KurTail : Kurtosis-based LLM Quantization - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: This paper introduces a novel quantization method for LLMs, addressing outliers and optimizing memory efficiency. It aligns with the model compression criterion, particularly in quantization and efficiency breakthroughs.

  76. Parameter-Efficient Fine-Tuning of Large Language Models via Deconvolution in Subspace - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper proposes a parameter-efficient fine-tuning method (DCFT) for LLMs, which aligns with foundational research in model efficiency.

  77. Dialogue Without Limits: Constant-Sized KV Caches for Extended Responses in LLMs - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper introduces MorphKV, a novel inference-time technique for maintaining constant-sized KV caches in LLMs, addressing memory efficiency and accuracy trade-offs. This aligns with the 'Model Compression' criterion, particularly in the context of KV cache optimization.

  78. LoR2C : Low-Rank Residual Connection Adaptation for Parameter-Efficient Fine-Tuning - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper introduces a novel low-rank residual connection adaptation for parameter-efficient fine-tuning, which aligns with model compression and efficiency breakthroughs.

  79. Progressive Sparse Attention: Algorithm and System Co-design for Efficient Attention in LLM Serving - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper introduces Progressive Sparse Attention (PSA) for efficient attention in LLMs, focusing on reducing KV cache usage and improving inference efficiency. This aligns with model compression and efficiency breakthroughs.

  80. KVCrush: Key value cache size-reduction using similarity in head-behaviour - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper proposes a KV cache compression method for LLMs, addressing memory efficiency with minimal accuracy loss. This aligns with the model compression criterion, particularly in KV cache optimization.

  81. Training LLMs with MXFP4 - Score: 17 (R=9, N=8) - Date: 2025-03-03 - Comment: The paper focuses on low-precision training with MXFP4, which aligns with the model compression criterion, specifically addressing efficiency breakthroughs through stochastic rounding and variance reduction techniques.

  82. Stochastic Rounding for LLM Training: Theory and Practice - Score: 17 (R=9, N=8) - Date: 2025-03-03 - Comment: The paper explores stochastic rounding for LLM training, providing theoretical insights into implicit regularization and convergence. This aligns with the 'Large Language Models' criterion, focusing on foundational efficiency improvements.

  83. Attention Pruning: Automated Fairness Repair of Language Models via Surrogate Simulated Annealing - Score: 16 (R=9, N=7) - Date: 2025-03-21 - Comment: The paper explores attention pruning for bias mitigation in LLMs, which aligns with foundational research in model compression and efficiency.

  84. Lightweight Software Kernels and Hardware Extensions for Efficient Sparse Deep Neural Networks on Microcontrollers - Score: 16 (R=9, N=7) - Date: 2025-03-11 - Comment: The paper focuses on sparsity and pruning techniques for efficient DNNs on microcontrollers, aligning with the model compression criterion.

  85. AdaRank: Adaptive Rank Pruning for Enhanced Model Merging - Score: 16 (R=8, N=8) - Date: 2025-03-31 - Comment: The paper proposes AdaRank, a framework for adaptive rank pruning in model merging, which aligns with the model compression criterion by addressing low-rank approaches and pruning strategies. It provides insights into mitigating task interference during model merging.

  86. Stochastic Engrams for Efficient Continual Learning with Binarized Neural Networks - Score: 16 (R=8, N=8) - Date: 2025-03-28 - Comment: The paper proposes a neuroscience-inspired approach to continual learning using binarized neural networks, which aligns with model compression (binarization) and sparsity. The integration of stochastic engrams adds a novel perspective.

  87. Gumbel-Softmax Flow Matching with Straight-Through Guidance for Controllable Biological Sequence Generation - Score: 16 (R=8, N=8) - Date: 2025-03-24 - Comment: The paper introduces a novel generative framework for biological sequence generation, which aligns with foundational research in AI for Science.

  88. Token Dynamics: Towards Efficient and Dynamic Video Token Representation for Video Large Language Models - Score: 16 (R=8, N=8) - Date: 2025-03-24 - Comment: The paper introduces a novel token reduction framework for video representation in large language models, which aligns with architectural innovations and efficiency improvements. The focus on extreme token reduction is a promising direction.

  89. Universal approximation property of neural stochastic differential equations - Score: 16 (R=8, N=8) - Date: 2025-03-21 - Comment: The paper identifies neural networks capable of approximating continuous functions under linear growth constraints, aligning with foundational research in neural network theory.

  90. Efficient Personalization of Quantized Diffusion Model without Backpropagation - Score: 16 (R=8, N=8) - Date: 2025-03-20 - Comment: The paper addresses memory-efficient fine-tuning of quantized diffusion models using zeroth-order optimization, which aligns with the model compression criterion, particularly in terms of efficiency and low-resource adaptation.

  91. Quantum-Enhanced LLM Efficient Fine Tuning - Score: 16 (R=8, N=8) - Date: 2025-03-18 - Comment: The paper proposes a quantum-enhanced fine-tuning method, which aligns with model compression and efficiency breakthroughs, particularly in low-rank approaches.

  92. Proof-Driven Clause Learning in Neural Network Verification - Score: 16 (R=8, N=8) - Date: 2025-03-18 - Comment: The paper proposes a novel conflict-driven clause learning approach for DNN verification, which aligns with foundational research in model efficiency and scalability.

  93. FlowKac: An Efficient Neural Fokker-Planck solver using Temporal Normalizing flows and the Feynman Kac-Formula - Score: 16 (R=8, N=8) - Date: 2025-03-17 - Comment: The paper introduces a novel approach to solving the Fokker-Planck equation using temporal normalizing flows, which aligns with foundational research in representation learning and efficiency improvements.

  94. Riemannian Geometric-based Meta Learning - Score: 16 (R=8, N=8) - Date: 2025-03-17 - Comment: The Stiefel-MAML approach provides novel insights using Riemannian geometry for meta-learning, advancing foundational algorithmic methodologies for learning paradigms.

  95. Thermodynamic Bound on Energy and Negentropy Costs of Inference in Deep Neural Networks - Score: 16 (R=8, N=8) - Date: 2025-03-14 - Comment: The paper derives thermodynamic bounds for inference in DNNs, contributing to foundational insights into efficiency and theoretical limits.

  96. Robust Multi-Objective Controlled Decoding of Large Language Models - Score: 16 (R=8, N=8) - Date: 2025-03-13 - Comment: The paper proposes a novel inference-time algorithm for multi-objective decoding in LLMs, which aligns with the 'Large Language Models' criterion due to its focus on theoretical improvements in decoding strategies.

  97. Training Plug-n-Play Knowledge Modules with Deep Context Distillation - Score: 16 (R=8, N=8) - Date: 2025-03-13 - Comment: The paper proposes a novel method for modularizing knowledge in LLMs using parameter-efficient LoRA modules, which aligns with the 'Large Language Models' criterion due to its focus on foundational improvements in knowledge integration.

  98. Ideas in Inference-time Scaling can Benefit Generative Pre-training Algorithms - Score: 16 (R=8, N=8) - Date: 2025-03-11 - Comment: The paper discusses inference-time scaling for generative pre-training algorithms, which aligns with foundational research on efficiency and generative paradigms.

  99. Nearly Optimal Differentially Private ReLU Regression - Score: 16 (R=8, N=8) - Date: 2025-03-11 - Comment: The paper investigates differentially private ReLU regression, which is a foundational topic in model efficiency and privacy. It provides theoretical insights into optimal utility bounds and relaxes prior assumptions, making it relevant to model compression and efficiency.

  100. State-offset Tuning: State-based Parameter-Efficient Fine-Tuning for State Space Models - Score: 16 (R=8, N=8) - Date: 2025-03-06 - Comment: The paper proposes a state-based parameter-efficient fine-tuning method for State Space Models, which aligns with foundational research in model compression and architectural innovations. The method is novel and leverages the unique characteristics of SSMs.

  101. Early-Stopped Mirror Descent for Linear Regression over Convex Bodies - Score: 16 (R=8, N=8) - Date: 2025-03-06 - Comment: The paper provides a theoretical analysis of early-stopped mirror descent for linear regression over convex bodies, offering insights into optimization and regularization. This aligns with 'Emerging Trends' as it challenges assumptions about regularization methods.

  102. AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model - Score: 16 (R=8, N=8) - Date: 2025-03-06 - Comment: The paper addresses post-training quantization for the Segment Anything Model, which aligns with model compression techniques. The proposed hybrid quantization and channel-aware grouping are novel contributions.

  103. The Distributionally Robust Optimization Model of Sparse Principal Component Analysis - Score: 16 (R=8, N=8) - Date: 2025-03-05 - Comment: The paper addresses sparse PCA using a distributionally robust optimization model, which aligns with foundational research in sparse methods and optimization.

  104. PaCA: Partial Connection Adaptation for Efficient Fine-Tuning - Score: 16 (R=8, N=8) - Date: 2025-03-05 - Comment: The paper proposes a parameter-efficient fine-tuning method (PaCA) that improves training speed and memory usage, aligning with the model compression criterion.

  105. Unsupervised Parameter Efficient Source-free Post-pretraining - Score: 16 (R=8, N=8) - Date: 2025-03-03 - Comment: The paper introduces a parameter-efficient method for adapting large models in a source-free setting, aligning with 'Model Compression' and efficiency breakthroughs.

  106. MSPLoRA: A Multi-Scale Pyramid Low-Rank Adaptation for Efficient Model Fine-Tuning - Score: 15 (R=8, N=7) - Date: 2025-03-31 - Comment: The paper proposes a hierarchical low-rank adaptation method (MSPLoRA) for efficient fine-tuning, which aligns with the 'Model Compression' criterion. The multi-scale approach and validation through SVD analysis add novelty.

  107. Fwd2Bot: LVLM Visual Token Compression with Double Forward Bottleneck - Score: 15 (R=8, N=7) - Date: 2025-03-29 - Comment: The paper introduces a novel compression method for visual tokens in LVLMs, focusing on efficiency and representation strength. This aligns with model compression and efficiency breakthroughs.

  108. Faster Parameter-Efficient Tuning with Token Redundancy Reduction - Score: 15 (R=8, N=7) - Date: 2025-03-27 - Comment: This paper introduces a token redundancy reduction module for parameter-efficient tuning, which aligns with model compression and efficiency improvements. The focus on reducing inference latency and memory usage is a notable contribution.

  109. Neuromorphic Principles for Efficient Large Language Models on Intel Loihi 2 - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: The paper explores neuromorphic principles for efficient LLMs, which aligns with model compression and efficiency breakthroughs, particularly through hardware-aware innovations.

  110. Dynamic Gradient Sparse Update for Edge Training - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: The paper proposes a dynamic gradient sparse update method for edge training, which aligns with model compression and efficiency breakthroughs.

  111. ConSol: Sequential Probability Ratio Testing to Find Consistent LLM Reasoning Paths Efficiently - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: The paper introduces a method to improve efficiency in reasoning paths for LLMs, which is relevant to foundational research in LLM efficiency and reasoning.

  112. PRIOT: Pruning-Based Integer-Only Transfer Learning for Embedded Systems - Score: 15 (R=8, N=7) - Date: 2025-03-24 - Comment: The paper introduces a pruning-based integer-only training method for embedded systems, which aligns with model compression topics like pruning and quantization, making it relevant to foundational research.

  113. An Accelerated Bregman Algorithm for ReLU-based Symmetric Matrix Decomposition - Score: 15 (R=8, N=7) - Date: 2025-03-24 - Comment: The paper focuses on low-rank structure and sparsity in symmetric matrix decomposition, which aligns with the model compression and representation learning criteria.

  114. Efficient ANN-Guided Distillation: Aligning Rate-based Features of Spiking Neural Networks through Hybrid Block-wise Replacement - Score: 15 (R=8, N=7) - Date: 2025-03-24 - Comment: The paper proposes a novel ANN-SNN distillation framework with a block-wise replacement strategy, which aligns with foundational research in model compression and efficiency.

  115. KVShare: Semantic-Aware Key-Value Cache Sharing for Efficient Large Language Model Inference - Score: 15 (R=8, N=7) - Date: 2025-03-24 - Comment: The paper introduces KVShare for efficient LLM inference, which aligns with foundational research in model compression and efficiency.

  116. QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper proposes post-training quantization for depth estimation, which aligns with foundational research in model compression and efficiency.

  117. Efficient Training of Neural Fractional-Order Differential Equation via Adjoint Backpropagation - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper introduces an adjoint backpropagation method for training neural fractional-order differential equations, which offers efficiency improvements and theoretical insights into training dynamics.

  118. Bezier Distillation - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper discusses Bezier distillation in flow models, which aligns with foundational research in model compression and efficiency.

  119. Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper proposes a framework for optimizing long chain-of-thought reasoning in LLMs, which aligns with foundational research in LLM behavior and reasoning capabilities.

  120. VP-NTK: Exploring the Benefits of Visual Prompting in Differentially Private Data Synthesis - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper explores visual prompting in differentially private data synthesis, which aligns with foundational research in model compression and efficiency.

  121. Squeeze Out Tokens from Sample for Finer-Grained Data Governance - Score: 15 (R=8, N=7) - Date: 2025-03-20 - Comment: The paper discusses a novel approach to data governance by introducing finer-grained intra-sample compression and purification, which aligns with the model compression criterion, particularly in terms of efficiency and sparsity.

  122. Layer-wise Adaptive Gradient Norm Penalizing Method for Efficient and Accurate Deep Learning - Score: 15 (R=8, N=7) - Date: 2025-03-19 - Comment: The paper proposes a layer-wise gradient norm penalizing method to improve computational efficiency, which aligns with model compression and training dynamics.

  123. Robust Machine Unlearning for Quantized Neural Networks via Adaptive Gradient Reweighting with Similar Labels - Score: 15 (R=8, N=7) - Date: 2025-03-19 - Comment: Q-MUL addresses machine unlearning for quantized models, which is relevant to model compression and efficiency improvements.

  124. Learning local neighborhoods of non-Gaussian graphical models: A measure transport approach - Score: 15 (R=8, N=7) - Date: 2025-03-19 - Comment: The paper introduces a novel method (L-SING) for identifying conditional independence relationships in non-Gaussian graphical models, leveraging sparsity and transport maps. This aligns with the representation learning criterion, particularly in sparse methods and training dynamics.

  125. ML-SpecQD: Multi-Level Speculative Decoding with Quantized Drafts - Score: 15 (R=8, N=7) - Date: 2025-03-19 - Comment: The paper introduces multi-level speculative decoding with quantized drafts, which aligns with model compression and efficiency improvements.

  126. Scale Efficient Training for Large Datasets - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper proposes a dynamic sample pruning approach for efficient training on large datasets, which aligns with model compression and efficiency breakthroughs.

  127. Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper provides insights into how multimodal large language models process visual information and introduces a pruning method for efficient inference. The analysis of visual information flow aligns with foundational research in model efficiency and sparsity, making it relevant.

  128. ROMA: a Read-Only-Memory-based Accelerator for QLoRA-based On-Device LLM - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper introduces a hardware accelerator for QLoRA-based on-device LLMs, which aligns with model compression and efficiency breakthroughs.

  129. SparseLUT: Sparse Connectivity Optimization for Lookup Table-based Deep Neural Networks - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: SparseLUT introduces a novel connectivity optimization technique for LUT-based DNNs, focusing on sparsity and pruning. This aligns with the model compression criterion.

  130. An Optimization Framework for Differentially Private Sparse Fine-Tuning - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper focuses on sparse fine-tuning under differential privacy, which aligns with the model compression criterion, particularly in sparsity and efficiency breakthroughs.

  131. MEADOW: Memory-efficient Dataflow and Data Packing for Low Power Edge LLMs - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: MEADOW introduces a memory-efficient dataflow and weight packing strategy for LLMs on edge devices. This aligns with model compression and efficiency breakthroughs.

  132. Adaptive Stochastic Gradient Descents on Manifolds with an Application on Weighted Low-Rank Approximation - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper applies adaptive stochastic gradient descent on manifolds to weighted low-rank approximation, which aligns with model compression and efficiency research.

  133. Make Optimization Once and for All with Fine-grained Guidance - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper discusses a general framework for learning optimization, which aligns with foundational research in optimization methods.

  134. Asynchronous Sharpness-Aware Minimization For Fast and Accurate Deep Learning - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper proposes asynchronous sharpness-aware minimization, aligning with the training dynamics and efficiency criteria.

  135. Statistical Impossibility and Possibility of Aligning LLMs with Human Preferences: From Condorcet Paradox to Nash Equilibrium - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper discusses statistical limits of aligning LLMs with human preferences, which provides theoretical insights into LLM behavior.

  136. FedOSAA: Improving Federated Learning with One-Step Anderson Acceleration - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper proposes a federated learning method with Anderson acceleration, aligning with foundational research in model efficiency.

  137. OuroMamba: A Data-Free Quantization Framework for Vision Mamba Models - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper presents a data-free quantization framework for vision models, contributing to foundational research in model compression and efficiency.

  138. Structured Preconditioners in Adaptive Optimization: A Unified Analysis - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper provides a unified analysis of structured preconditioners in adaptive optimization, contributing to foundational insights into model efficiency and optimization.

  139. Sample Compression for Continual Learning - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper introduces a sample compression method for continual learning, which aligns with foundational research in model compression and efficiency.

  140. Numerical Error Analysis of Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper provides a theoretical analysis of numerical errors in LLMs, which aligns with foundational research in model efficiency and robustness.

  141. Exploiting Unstructured Sparsity in Fully Homomorphic Encrypted DNNs - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper explores unstructured sparsity in FHE DNNs, which aligns with the model compression criterion. The focus on sparsity and performance gains in encrypted environments is relevant.

  142. Adaptive Temperature Based on Logits Correlation in Knowledge Distillation - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper proposes an adaptive temperature method for knowledge distillation, which aligns with model compression and efficiency improvements.

  143. Quantitative Analysis of Deeply Quantized Tiny Neural Networks Robust to Adversarial Attacks - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper explores quantization-aware training and adversarial robustness in tiny neural networks, which aligns with model compression and efficiency breakthroughs.

  144. A Triple-Inertial Accelerated Alternating Optimization Method for Deep Learning Training - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: The paper introduces a novel optimization framework (TIAM) for neural network training, which could provide insights into training dynamics and efficiency improvements.

  145. Scaling Probabilistic Circuits via Data Partitioning - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: The paper introduces Federated Circuits (FCs) for scaling probabilistic circuits, which aligns with foundational research in model efficiency and scalability.

  146. TokenButler: Token Importance is Predictable - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper introduces a token importance predictor for KV-cache optimization, which aligns with model compression and efficiency improvements in LLMs.

  147. Enhancing Layer Attention Efficiency through Pruning Redundant Retrievals - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper proposes a method to enhance layer attention efficiency through pruning, which aligns with foundational research in model compression and efficiency.

  148. TPU-Gen: LLM-Driven Custom Tensor Processing Unit Generator - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper introduces an LLM-driven framework for TPU generation, which aligns with foundational research in model compression and efficiency.

  149. Extrapolation Merging: Keep Improving With Extrapolation and Merging - Score: 15 (R=8, N=7) - Date: 2025-03-10 - Comment: The extrapolation merging paradigm for improving model performance without additional resources is relevant to foundational research in model optimization and efficiency.

  150. IDInit: A Universal and Stable Initialization Method for Neural Network Training - Score: 15 (R=8, N=7) - Date: 2025-03-07 - Comment: The paper proposes a novel initialization method for neural networks, which aligns with foundational research in training dynamics and stability.

  151. LEWIS (LayEr WIse Sparsity) -- A Training Free Guided Model Merging Approach - Score: 15 (R=8, N=7) - Date: 2025-03-07 - Comment: The paper introduces a sparsity-based model merging approach, which aligns with the model compression criterion. The method is novel in its use of layer-wise sparsity for task-specific performance improvements.

  152. DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models - Score: 15 (R=8, N=7) - Date: 2025-03-05 - Comment: The paper proposes a diversity-based token pruning method for multimodal models, which aligns with the model compression criterion, particularly in reducing redundancy and improving efficiency.

  153. Online Pseudo-average Shifting Attention (PASA) for Robust Low-precision LLM Inference: Algorithms and Numerical Analysis - Score: 15 (R=8, N=7) - Date: 2025-03-05 - Comment: The paper proposes a low-precision algorithm (PASA) for efficient attention computation in LLMs, which aligns with model compression and efficiency breakthroughs.

  154. Cauchy-Schwarz Regularizers - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper introduces Cauchy-Schwarz regularizers, which align with foundational research in model compression and efficiency.

  155. Regularization-based Framework for Quantization-, Fault- and Variability-Aware Training - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper proposes a regularization-based framework for quantization-aware training, which aligns with model compression and efficiency topics.

  156. ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper proposes a scalable data valuation method for LLMs, which is relevant to foundational research in LLM efficiency and data optimization.

  157. Personalize Your LLM: Fake it then Align it - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper introduces SlimAdam, a memory-efficient variant of Adam optimizer, aligning with foundational research in model compression and efficiency.

  158. Parameter Expanded Stochastic Gradient Markov Chain Monte Carlo - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper proposes a reparameterization method to enhance sample diversity in SGMCMC for Bayesian Neural Networks, which aligns with foundational research in training dynamics and efficiency.

  159. Constraining Sequential Model Editing with Editing Anchor Compression - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper introduces Editing Anchor Compression (EAC) to address sequential model editing in LLMs, which aligns with foundational research in model compression and efficiency. The focus on preserving general abilities while editing is a novel contribution.

  160. FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference - Score: 15 (R=8, N=7) - Date: 2025-03-03 - Comment: FlexPrefill introduces a context-aware sparse attention mechanism for efficient long-sequence inference, which is relevant to model compression and efficiency improvements in LLMs.

  161. LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation - Score: 15 (R=8, N=7) - Date: 2025-03-03 - Comment: The paper introduces LiteASR, a low-rank compression scheme for ASR encoders, which aligns with the 'Model Compression' criterion by leveraging low-rank approximations for efficiency improvements.

  162. AMUN: Adversarial Machine UNlearning - Score: 15 (R=7, N=8) - Date: 2025-03-04 - Comment: The paper introduces a novel adversarial machine unlearning method, which is relevant to model compression and efficiency, particularly in terms of fine-tuning and decision boundary adjustments.

  163. MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning - Score: 14 (R=7, N=7) - Date: 2025-03-25 - Comment: The paper introduces a sparse tuning method (MoST) for 3D representation learning, which aligns with model compression and sparsity topics. However, its focus on 3D-specific applications reduces its broader foundational relevance.

  164. Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models - Score: 14 (R=7, N=7) - Date: 2025-03-19 - Comment: The paper proposes a hierarchical reward model for reasoning in LLMs, which aligns with foundational insights into improving reasoning capabilities in LLMs.

  165. Permutation Learning with Only N Parameters: From SoftSort to Self-Organizing Gaussians - Score: 14 (R=7, N=7) - Date: 2025-03-18 - Comment: The paper introduces a novel method for permutation learning with reduced memory requirements, which could have implications for efficiency in foundational models.

  166. AttentionRAG: Attention-Guided Context Pruning in Retrieval-Augmented Generation - Score: 14 (R=7, N=7) - Date: 2025-03-14 - Comment: The paper proposes AttentionRAG, focusing on context pruning in retrieval-augmented generation, which is relevant to model compression and efficiency.

  167. Adaptive Moment Estimation Optimization Algorithm Using Projection Gradient for Deep Learning - Score: 14 (R=7, N=7) - Date: 2025-03-14 - Comment: The paper proposes a novel optimization algorithm (PadamP) for training deep networks, which contributes to foundational research in training dynamics.

  168. Robust Conformal Prediction with a Single Binary Certificate - Score: 14 (R=7, N=7) - Date: 2025-03-10 - Comment: The paper introduces a novel approach to robust conformal prediction, which aligns with foundational research in efficiency and robustness in machine learning.

  169. Continual Optimization with Symmetry Teleportation for Multi-Task Learning - Score: 14 (R=7, N=7) - Date: 2025-03-07 - Comment: The paper introduces a novel optimization method for multi-task learning using low-rank adapters, which aligns with foundational research in model architecture and training dynamics.

  170. Frankenstein Optimizer: Harnessing the Potential by Revisiting Optimization Tricks - Score: 14 (R=7, N=7) - Date: 2025-03-05 - Comment: The paper proposes a novel optimizer (Frankenstein) and provides insights into optimization dynamics, which could be relevant to training dynamics in neural networks.

  171. Accelerated Distributed Optimization with Compression and Error Feedback - Score: 13 (R=7, N=6) - Date: 2025-03-12 - Comment: The paper proposes a distributed optimization algorithm with compression and error feedback, which aligns with model compression and efficiency but is not groundbreaking.

  172. ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs - Score: 13 (R=7, N=6) - Date: 2025-03-03 - Comment: ByteScale introduces a novel parallelism strategy for efficient LLM training, which is relevant to model compression and efficiency but focuses more on engineering optimizations than foundational breakthroughs.

  173. Fuzzy Speculative Decoding for a Tunable Accuracy-Runtime Tradeoff - Score: 13 (R=7, N=6) - Date: 2025-03-03 - Comment: Fuzzy Speculative Decoding provides a novel tradeoff mechanism for accuracy and runtime in LLM inference, which is relevant to model efficiency but lacks broader foundational insights.

High Performance Computing (51)

  1. Offline Model-Based Optimization: Comprehensive Review - Score: 20.0 (R=0, N=0) - Date: 2025-03-24 - Comment: Author match

  2. Denoising Hamiltonian Network for Physical Reasoning - Score: 20.0 (R=0, N=0) - Date: 2025-03-11 - Comment: Author match

  3. Learning Decision Trees as Amortized Structure Inference - Score: 20.0 (R=0, N=0) - Date: 2025-03-11 - Comment: Author match

  4. Cognitive Activation and Chaotic Dynamics in Large Language Models: A Quasi-Lyapunov Analysis of Reasoning Mechanisms - Score: 19 (R=10, N=9) - Date: 2025-03-17 - Comment: The paper proposes a chaos theory framework for understanding LLM reasoning mechanisms, aligning closely with foundational research in LLM behavior.

  5. Squared families: Searching beyond regular probability models - Score: 18 (R=9, N=9) - Date: 2025-03-29 - Comment: The paper introduces squared families, a novel statistical framework with foundational insights into probability models and their properties. It aligns with the 'Emerging Trends' criterion due to its theoretical contributions challenging established assumptions in statistical modeling.

  6. Self-Organizing Graph Reasoning Evolves into a Critical State for Continuous Discovery Through Structural-Semantic Dynamics - Score: 18 (R=9, N=9) - Date: 2025-03-25 - Comment: This paper provides theoretical insights into self-organizing graph reasoning systems and their evolution into a critical state, which aligns with emerging trends and foundational research. The entropy-based principle governing adaptability and innovation is novel and interdisciplinary.

  7. Intelligence Sequencing and the Path-Dependence of Intelligence Evolution: AGI-First vs. DCI-First as Irreversible Attractors - Score: 18 (R=9, N=9) - Date: 2025-03-25 - Comment: The paper explores intelligence sequencing and path-dependence in intelligence evolution, introducing a novel theoretical framework. It aligns with emerging trends and challenges established assumptions.

  8. Glivenko-Cantelli for $f$-divergence - Score: 18 (R=9, N=9) - Date: 2025-03-24 - Comment: The paper extends the Glivenko-Cantelli theorem to f-divergences, which is a cutting-edge theoretical contribution and aligns with emerging trends in foundational research.

  9. SuperARC: A Test for General and Super Intelligence Based on First Principles of Recursion Theory and Algorithmic Probability - Score: 18 (R=9, N=9) - Date: 2025-03-24 - Comment: The paper introduces a test for AGI and ASI based on algorithmic probability, which challenges established assumptions and aligns with emerging trends in foundational AI research.

  10. Neural Manifolds and Cognitive Consistency: A New Approach to Memory Consolidation in Artificial Systems - Score: 18 (R=9, N=9) - Date: 2025-03-05 - Comment: The paper introduces a novel framework for memory consolidation inspired by neuroscience, which aligns with foundational research in representation learning and emerging trends.

  11. A Theoretical Framework for Prompt Engineering: Approximating Smooth Functions with Transformer Prompts - Score: 17 (R=9, N=8) - Date: 2025-03-27 - Comment: This paper provides a theoretical framework for prompt engineering, demonstrating how transformer prompts can approximate smooth functions and act as configurable computational systems. It aligns closely with foundational research in LLMs and offers theoretical insights into their behavior and adaptability.

  12. Theory-to-Practice Gap for Neural Networks and Neural Operators - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper analyzes the theory-to-practice gap in neural networks and neural operators, providing theoretical insights into sampling complexity, which aligns with foundational research.

  13. The global convergence time of stochastic gradient descent in non-convex landscapes: Sharp estimates via large deviations - Score: 17 (R=9, N=8) - Date: 2025-03-21 - Comment: The paper provides theoretical insights into the global convergence time of SGD in non-convex landscapes, which aligns with foundational research in training dynamics of neural networks.

  14. Tuning LLMs by RAG Principles: Towards LLM-native Memory - Score: 17 (R=9, N=8) - Date: 2025-03-21 - Comment: The paper proposes a novel method combining RAG principles with LLM fine-tuning, which aligns with foundational research in LLM architecture and memory integration.

  15. Landscape Complexity for the Empirical Risk of Generalized Linear Models: Discrimination between Structured Data - Score: 17 (R=9, N=8) - Date: 2025-03-19 - Comment: The paper uses random matrix theory to analyze the landscape complexity of empirical risk functions, which is foundational and relevant to understanding training dynamics in neural networks.

  16. Probabilistic Neural Networks (PNNs) with t-Distributed Outputs: Adaptive Prediction Intervals Beyond Gaussian Assumptions - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper introduces t-distributed outputs for PNNs, aligning with foundational research in representation learning and uncertainty quantification.

  17. Neural Tangent Kernel of Neural Networks with Loss Informed by Differential Operators - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper develops NTK theory for physics-informed loss, providing foundational insights into training dynamics and spectral bias.

  18. The Relativity of Causal Knowledge - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper introduces a novel perspective on causal knowledge using category theory, which aligns with emerging trends in foundational research.

  19. Compute Optimal Scaling of Skills: Knowledge vs Reasoning - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper studies skill-dependent scaling laws in LLMs, aligning with 'Large Language Models' due to its theoretical insights into scaling behavior.

  20. Spherical Tree-Sliced Wasserstein Distance - Score: 17 (R=8, N=9) - Date: 2025-03-17 - Comment: The paper introduces the Spherical Tree-Sliced Wasserstein Distance, a method extending sliced optimal transport in high-dimensional spaces, aligning well with foundational mathematical innovations.

  21. Physics-Informed Deep B-Spline Networks for Dynamical Systems - Score: 16 (R=8, N=8) - Date: 2025-03-24 - Comment: The paper proposes a hybrid framework using B-spline networks for solving PDEs, which is relevant to AI for science and introduces theoretical guarantees, making it foundational.

  22. Verification Learning: Make Unsupervised Neuro-Symbolic System Feasible - Score: 16 (R=8, N=8) - Date: 2025-03-18 - Comment: The paper introduces a novel verification learning paradigm for neuro-symbolic systems, which aligns with emerging trends in foundational AI research.

  23. From Denoising Score Matching to Langevin Sampling: A Fine-Grained Error Analysis in the Gaussian Setting - Score: 16 (R=8, N=8) - Date: 2025-03-17 - Comment: The paper offers a fine-grained theoretical analysis of Langevin sampling methods, contributing to foundational understanding in generative sampling algorithms.

  24. Deep Learning based discovery of Integrable Systems - Score: 16 (R=8, N=8) - Date: 2025-03-14 - Comment: The paper introduces a novel framework for discovering integrable systems using neural networks, which aligns with foundational AI for science research.

  25. Symbolic Neural Ordinary Differential Equations - Score: 16 (R=8, N=8) - Date: 2025-03-12 - Comment: The paper proposes Symbolic Neural ODEs, integrating symbolic and neural approaches for learning dynamical systems, which aligns with foundational research in representation learning.

  26. Understanding the role of autoencoders for stiff dynamical systems using information theory - Score: 16 (R=8, N=8) - Date: 2025-03-11 - Comment: The paper provides insights into how autoencoders encode information in stiff dynamical systems, aligning with representation learning.

  27. Riemann Tensor Neural Networks: Learning Conservative Systems with Physics-Constrained Networks - Score: 16 (R=8, N=8) - Date: 2025-03-04 - Comment: The introduction of Riemann Tensor Neural Networks (RTNNs) aligns with foundational research in model architecture by enforcing physics-constrained inductive biases.

  28. Modeling Arbitrarily Applicable Relational Responding with the Non-Axiomatic Reasoning System: A Machine Psychology Approach - Score: 16 (R=8, N=8) - Date: 2025-03-04 - Comment: This paper introduces a novel theoretical approach to model Arbitrarily Applicable Relational Responding (AARR) using the Non-Axiomatic Reasoning System (NARS). It aligns with emerging trends in integrating behavioral science insights into AI, making it relevant to foundational research.

  29. Effective Skill Unlearning through Intervention and Abstention - Score: 15 (R=8, N=7) - Date: 2025-03-28 - Comment: The paper proposes lightweight, training-free methods for skill unlearning in LLMs, which aligns with foundational research in understanding and controlling LLM behavior.

  30. SafeMERGE: Preserving Safety Alignment in Fine-Tuned Large Language Models via Selective Layer-Wise Model Merging - Score: 15 (R=8, N=7) - Date: 2025-03-24 - Comment: The paper introduces a selective layer-wise merging method for fine-tuned LLMs, which aligns with foundational research in large language models and safety alignment.

  31. Subgradient Method for System Identification with Non-Smooth Objectives - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper analyzes subgradient methods for system identification with non-smooth objectives, providing theoretical convergence guarantees. It aligns with foundational research in optimization and training dynamics.

  32. Machine learning identifies nullclines in oscillatory dynamical systems - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper introduces a neural network-based method for identifying nullclines in oscillatory systems, which aligns with foundational research in representation learning and interpretability.

  33. Distributed Learning over Arbitrary Topology: Linear Speed-Up with Polynomial Transient Time - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper introduces Spanning Tree Push-Pull (STPP) for distributed learning, which aligns with foundational research in distributed model efficiency and scalability.

  34. Unified Analysis of Decentralized Gradient Descent: a Contraction Mapping Framework - Score: 15 (R=8, N=7) - Date: 2025-03-19 - Comment: The paper provides a novel contraction mapping framework for decentralized gradient descent, which is relevant to emerging trends in foundational optimization research.

  35. Bayes and Biased Estimators Without Hyper-parameter Estimation: Comparable Performance to the Empirical-Bayes-Based Regularized Estimator - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper develops estimators for regularized system identification without hyper-parameter estimation, which aligns with foundational research in model efficiency.

  36. Understanding Flatness in Generative Models: Its Role and Benefits - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper investigates flatness in generative models, which is relevant to foundational research in model behavior and robustness.

  37. PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper studies the stability of language model pre-training, which aligns with foundational research on LLM behavior and training dynamics.

  38. Automatic Operator-level Parallelism Planning for Distributed Deep Learning -- A Mixed-Integer Programming Approach - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper focuses on distributed deep learning and operator-level parallelism planning, which is relevant to model efficiency and scalability. It introduces a novel mixed-integer programming approach, aligning with foundational research in model compression and efficiency.

  39. Delusions of Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper investigates LLM delusions, linking them to training dynamics and dataset noise, which aligns with foundational research on training dynamics in neural networks.

  40. DILEMMA: Joint LLM Quantization and Distributed LLM Inference Over Edge Computing Systems - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper addresses LLM quantization and distributed inference, which is relevant to model compression and efficiency, particularly in resource-constrained environments.

  41. PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper introduces a memory optimization strategy for pipeline parallelism in LLM training, which is relevant to model compression and efficiency.

  42. Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper extends scaling law phenomena to multiple and kernel regression, contributing to theoretical insights into scaling laws, which are relevant to understanding LLM behavior.

  43. SPD: Sync-Point Drop for efficient tensor parallelism of Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-03-03 - Comment: The paper introduces Sync-Point Drop (SPD), a novel optimization technique for reducing communication overhead in tensor parallelism for LLMs. This aligns with the 'Model Compression' criterion, focusing on efficiency improvements in distributed inference.

  44. Palatable Conceptions of Disembodied Being: Terra Incognita in the Space of Possible Minds - Score: 15 (R=7, N=8) - Date: 2025-03-21 - Comment: The paper explores philosophical questions about consciousness in AI, which could be considered an emerging trend challenging established assumptions.

  45. Decision-Dependent Stochastic Optimization: The Role of Distribution Dynamics - Score: 15 (R=7, N=8) - Date: 2025-03-11 - Comment: The paper introduces a theoretical framework for decision-dependent stochastic optimization, which aligns with emerging trends in foundational research.

  46. System 0/1/2/3: Quad-process theory for multi-timescale embodied collective cognitive systems - Score: 15 (R=7, N=8) - Date: 2025-03-11 - Comment: The paper introduces a quad-process theory for multi-timescale cognitive systems, which aligns with emerging trends in foundational research on cognition and AI.

  47. Distributed LLMs and Multimodal Large Language Models: A Survey on Advances, Challenges, and Future Directions - Score: 14 (R=8, N=6) - Date: 2025-03-21 - Comment: The paper surveys distributed and multimodal LLMs, which is relevant to foundational research in LLM scalability and architecture.

  48. On the clustering behavior of sliding windows - Score: 14 (R=7, N=7) - Date: 2025-03-19 - Comment: The paper provides theoretical insights into clustering behavior with sliding windows, which could be relevant for emerging trends in foundational research.

  49. Two-Dimensional Deep ReLU CNN Approximation for Korobov Functions: A Constructive Approach - Score: 14 (R=7, N=7) - Date: 2025-03-12 - Comment: The paper provides a theoretical analysis of 2D CNNs for approximating Korobov functions, contributing to foundational understanding of CNN approximation capabilities.

  50. A new local time-decoupled squared Wasserstein-2 method for training stochastic neural networks to reconstruct uncertain parameters in dynamical systems - Score: 14 (R=7, N=7) - Date: 2025-03-10 - Comment: The paper introduces a new method for training stochastic neural networks using a Wasserstein-2 loss, which is relevant to representation learning and training dynamics.

  51. Learning Stochastic Dynamical Systems with Structured Noise - Score: 14 (R=7, N=7) - Date: 2025-03-04 - Comment: The paper introduces a framework for learning stochastic dynamical systems with structured noise, which has potential relevance to foundational research in representation learning.

Representation Learning (273)

  1. Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry - Score: 19 (R=10, N=9) - Date: 2025-03-25 - Comment: The paper introduces a geometric framework for analyzing feature learning, providing novel insights into representational geometry and task-relevant manifold evolution, which is highly relevant to representation learning.

  2. Borsuk-Ulam and Replicable Learning of Large-Margin Halfspaces - Score: 19 (R=10, N=9) - Date: 2025-03-20 - Comment: The paper provides theoretical insights into learning large-margin halfspaces and addresses open problems in learning theory, making it highly relevant to foundational research in representation learning.

  3. Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $\mu$P Parametrization - Score: 19 (R=10, N=9) - Date: 2025-03-13 - Comment: The paper provides theoretical insights into training dynamics and feature learning in infinite-width neural networks, aligning strongly with representation learning and training dynamics.

  4. Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry - Score: 19 (R=10, N=9) - Date: 2025-03-04 - Comment: The paper provides a theoretical framework for sparse autoencoders, directly addressing representation learning and the biases in concept detection.

  5. Exploring the Energy Landscape of RBMs: Reciprocal Space Insights into Bosons, Hierarchical Learning and Symmetry Breaking - Score: 18 (R=9, N=9) - Date: 2025-03-28 - Comment: The paper explores the energy landscape of RBMs and connects them to broader theoretical frameworks like symmetry breaking and hierarchical learning, which aligns with representation learning and theoretical insights into generative models.

  6. Exploring a Principled Framework for Deep Subspace Clustering - Score: 18 (R=9, N=9) - Date: 2025-03-24 - Comment: The paper presents a principled framework for deep subspace clustering, addressing feature collapse and providing theoretical guarantees. This aligns with representation learning and foundational clustering methods.

  7. I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data? - Score: 18 (R=9, N=9) - Date: 2025-03-13 - Comment: The paper provides theoretical insights into how LLMs learn human-interpretable concepts, aligning with foundational research in representation learning and LLM behavior.

  8. Disentangling Task Interference within Neurons: Model Merging in Alignment with Neuronal Mechanisms - Score: 18 (R=9, N=9) - Date: 2025-03-10 - Comment: The paper introduces NeuroMerging, a novel framework for model merging that addresses task interference at the neuronal level, aligning with the representation learning and model architecture criteria.

  9. Deep Learning is Not So Mysterious or Different - Score: 18 (R=9, N=9) - Date: 2025-03-05 - Comment: The paper provides a theoretical perspective on generalization phenomena in deep learning, which aligns with foundational research in representation learning and training dynamics.

  10. Synergy Between Sufficient Changes and Sparse Mixing Procedure for Disentangled Representation Learning - Score: 18 (R=9, N=9) - Date: 2025-03-04 - Comment: The paper proposes a novel framework combining sparse mixing and distributional changes for disentangled representation learning, which directly aligns with foundational research in representation learning.

  11. Dataset Distillation with Neural Characteristic Function: A Minmax Perspective - Score: 18 (R=9, N=9) - Date: 2025-03-03 - Comment: The paper introduces Neural Characteristic Function Matching for dataset distillation, which is a novel approach to representation learning with significant theoretical contributions.

  12. Meta-Representational Predictive Coding: Biomimetic Self-Supervised Learning - Score: 17 (R=9, N=8) - Date: 2025-03-31 - Comment: The paper introduces a novel self-supervised learning framework, Meta-Representational Predictive Coding (MPC), which aligns with representation learning by focusing on biologically plausible mechanisms and encoder-only learning. It provides theoretical insights into predictive coding and active inference.

  13. How do language models learn facts? Dynamics, curricula and hallucinations - Score: 17 (R=9, N=8) - Date: 2025-03-29 - Comment: This paper investigates the learning dynamics of language models, focusing on how they acquire and store factual knowledge, which aligns with foundational insights into LLM behavior and interpretability.

  14. F-INR: Functional Tensor Decomposition for Implicit Neural Representations - Score: 17 (R=9, N=8) - Date: 2025-03-29 - Comment: The paper introduces F-INR, a framework for implicit neural representations using functional tensor decomposition. This aligns with representation learning and efficiency breakthroughs, offering modular and scalable solutions.

  15. Shared Global and Local Geometry of Language Model Embeddings - Score: 17 (R=9, N=8) - Date: 2025-03-29 - Comment: The paper explores the geometric structure of token embeddings in language models, aligning with representation learning and interpretability in LLMs. It provides insights into intrinsic dimensions and transferability of steering vectors, which are foundational contributions.

  16. Nonlinear Multiple Response Regression and Learning of Latent Spaces - Score: 17 (R=9, N=8) - Date: 2025-03-28 - Comment: This paper proposes a novel method for learning latent spaces, which aligns with representation learning. The approach offers interpretability and theoretical guarantees, making it relevant to foundational research.

  17. Fundamental Limits of Perfect Concept Erasure - Score: 17 (R=9, N=8) - Date: 2025-03-27 - Comment: This paper provides an information-theoretic perspective on concept erasure, which is highly relevant to representation learning. The focus on fundamental limits and theoretical bounds adds significant novelty.

  18. Reasoning to Learn from Latent Thoughts - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper explores latent thought modeling for data-efficient pretraining, which is relevant to foundational research in large language models and representation learning.

  19. OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper proposes a novel framework (OCRT) for improving generalization in foundation models, which aligns with representation learning and architectural innovations.

  20. Feature Qualification by Deep Nets: A Constructive Approach - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper provides a theoretical approach to feature qualification using deep nets, which aligns with representation learning by focusing on how features are encoded and qualified. The constructive approach adds theoretical depth.

  21. Learning Multi-Level Features with Matryoshka Sparse Autoencoders - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The introduction of Matryoshka Sparse Autoencoders directly contributes to representation learning by addressing hierarchical feature learning and disentanglement, which is highly relevant.

  22. Bayesian Teaching Enables Probabilistic Reasoning in Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-25 - Comment: The paper explores Bayesian reasoning in LLMs and proposes a method to improve their probabilistic reasoning capabilities, which aligns with foundational research in LLM behavior and interpretability.

  23. NdLinear Is All You Need for Representation Learning - Score: 17 (R=9, N=8) - Date: 2025-03-24 - Comment: The paper introduces NdLinear, a novel linear transformation for preserving multi-dimensional data structures, which aligns with foundational research in representation learning and architectural innovations.

  24. Nonparametric Factor Analysis and Beyond - Score: 17 (R=9, N=8) - Date: 2025-03-24 - Comment: The paper provides a theoretical framework for identifying latent variables in noisy settings, which aligns with foundational research in representation learning.

  25. Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens - Score: 17 (R=9, N=8) - Date: 2025-03-21 - Comment: The paper introduces Uni-3DAR, a unified framework for 3D generation and understanding via autoregression. It aligns with foundational research in representation learning and architecture innovations.

  26. Parameters vs. Context: Fine-Grained Control of Knowledge Reliance in Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-21 - Comment: The paper introduces CK-PLUG for controlling knowledge reliance in LLMs, which aligns with foundational research in LLM behavior and interpretability.

  27. LIFT: Latent Implicit Functions for Task- and Data-Agnostic Encoding - Score: 17 (R=9, N=8) - Date: 2025-03-20 - Comment: The paper introduces LIFT, a novel framework for task- and data-agnostic encoding using implicit neural representations (INRs). It aligns with representation learning by addressing multiscale information and meta-learning, which are foundational aspects. The hierarchical latent generator and residual connections in ReLIFT also touch on architectural innovations.

  28. Robustness of Nonlinear Representation Learning - Score: 17 (R=9, N=8) - Date: 2025-03-20 - Comment: This paper focuses on robustness in nonlinear representation learning, particularly in the context of Independent Component Analysis (ICA). It provides theoretical insights into identifiability under misspecified conditions, which is highly relevant to foundational representation learning.

  29. Machine Unlearning in Hyperbolic vs. Euclidean Multimodal Contrastive Learning: Adapting Alignment Calibration to MERU - Score: 17 (R=9, N=8) - Date: 2025-03-20 - Comment: The paper explores machine unlearning in hyperbolic contrastive learning, which aligns with representation learning and provides insights into geometric properties influencing concept representation. The focus on hyperbolic-specific components and unlearning dynamics adds theoretical depth.

  30. Reasoning Effort and Problem Complexity: A Scaling Analysis in LLMs - Score: 17 (R=9, N=8) - Date: 2025-03-20 - Comment: The paper investigates reasoning scalability in LLMs, which aligns with foundational research into LLM behavior and interpretability, particularly in understanding limitations in reasoning.

  31. Conjuring Positive Pairs for Efficient Unification of Representation Learning and Image Synthesis - Score: 17 (R=9, N=8) - Date: 2025-03-20 - Comment: The paper proposes a unified SSL framework combining representation learning and generative modeling, introducing a novel contrastive-reconstruction objective. This aligns with representation learning and architectural efficiency.

  32. Robust Weight Imprinting: Insights from Neural Collapse and Proxy-Based Aggregation - Score: 17 (R=9, N=8) - Date: 2025-03-20 - Comment: The paper systematically studies weight imprinting and connects it to the neural collapse phenomenon, providing insights into representation learning and model adaptation.

  33. Higher-Order Graphon Neural Networks: Approximation and Cut Distance - Score: 17 (R=9, N=8) - Date: 2025-03-19 - Comment: The paper extends higher-order GNNs to graphon models, providing theoretical insights into their approximation and transferability, aligning with representation learning and emerging trends.

  34. Improved Scalable Lipschitz Bounds for Deep Neural Networks - Score: 17 (R=9, N=8) - Date: 2025-03-19 - Comment: The paper introduces improved scalable Lipschitz bounds for deep neural networks, which directly contributes to understanding training dynamics and robustness, aligning with representation learning.

  35. Learning on LLM Output Signatures for gray-box LLM Behavior Analysis - Score: 17 (R=9, N=8) - Date: 2025-03-19 - Comment: The paper proposes a gray-box analysis method for LLM behavior, which aligns with foundational research in understanding LLM behavior and interpretability.

  36. ROCK: A variational formulation for occupation kernel methods in Reproducing Kernel Hilbert Spaces - Score: 17 (R=9, N=8) - Date: 2025-03-19 - Comment: The paper presents a variational formulation for kernel methods, which is a foundational contribution to representation learning and computational efficiency.

  37. Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: The paper proposes using cognitive science methods to understand LLMs, which aligns with theoretical insights into LLM behavior and interpretability.

  38. Computation Mechanism Behind LLM Position Generalization - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: The paper provides computational insights into LLM position generalization, which aligns with foundational research in LLM behavior and interpretability.

  39. Gradient Extrapolation for Debiased Representation Learning - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: The paper proposes a novel gradient extrapolation method for debiased representation learning, which aligns with foundational research in representation learning and optimization.

  40. Beyond Propagation of Chaos: A Stochastic Algorithm for Mean Field Optimization - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: The paper explores a stochastic algorithm for mean field optimization, which aligns with foundational research in representation learning and training dynamics. It provides theoretical insights into optimization in Wasserstein space.

  41. Edgeworth Expansion for Semi-hard Triplet Loss - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: This paper offers a higher-order asymptotic analysis of the semi-hard triplet loss, providing theoretical insights into its behavior. It aligns with foundational research in representation learning by analyzing the training dynamics and sensitivity of a loss function.

  42. Finite Samples for Shallow Neural Networks - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: The paper investigates the identifiability of shallow neural networks with finite samples, providing theoretical insights into network irreducibility and activation functions. This aligns with foundational research in representation learning and training dynamics.

  43. Training Diagonal Linear Networks with Stochastic Sharpness-Aware Minimization - Score: 17 (R=9, N=8) - Date: 2025-03-18 - Comment: The paper provides theoretical insights into training dynamics and sharpness-aware minimization, which aligns with representation learning and training dynamics in neural networks.

  44. Fuzzy Rule-based Differentiable Representation Learning - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper introduces a novel representation learning method grounded in interpretable fuzzy rule-based models, aligning with foundational research in representation learning.

  45. HyperKAN: Hypergraph Representation Learning with Kolmogorov-Arnold Networks - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper introduces HyperKAN for hypergraph representation learning, aligning with foundational research in representation learning.

  46. PrivacyScalpel: Enhancing LLM Privacy via Interpretable Feature Intervention with Sparse Autoencoders - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper introduces PrivacyScalpel, which uses sparse autoencoders for interpretable feature intervention to enhance LLM privacy, aligning with foundational research in sparsity and representation learning.

  47. From Dionysius Emerges Apollo -- Learning Patterns and Abstractions from Perceptual Sequences - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper explores chunking and abstraction in sequence learning, which is relevant to representation learning and foundational insights into how models encode information.

  48. Auditing language models for hidden objectives - Score: 17 (R=9, N=8) - Date: 2025-03-17 - Comment: The paper studies alignment audits for LLMs, which provides theoretical insights into model behavior and interpretability, aligning with foundational research.

  49. On the Identifiability of Causal Abstractions - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper explores causal representation learning with a focus on identifiability, aligning with foundational research in representation learning.

  50. Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper proposes CS-ReFT, focusing on representation-based fine-tuning for LLMs, contributing to foundational insights into representation learning and LLM behavior.

  51. The Spectral Bias of Shallow Neural Network Learning is Shaped by the Choice of Non-linearity - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper explores the spectral bias of shallow neural networks shaped by activation functions, providing theoretical insights into representation learning and training dynamics.

  52. Understanding the Logical Capabilities of Large Language Models via Out-of-Context Representation Learning - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper introduces out-of-context representation learning for logical tasks, contributing to foundational insights into representation learning and LLM behavior.

  53. Spherical dimension - Score: 17 (R=9, N=8) - Date: 2025-03-14 - Comment: The paper introduces spherical dimension as a topological relaxation of VC dimension, contributing to foundational research in representation learning and theoretical insights.

  54. Online multidimensional dictionary learning - Score: 17 (R=9, N=8) - Date: 2025-03-13 - Comment: The paper focuses on online multidimensional dictionary learning, which is directly relevant to representation learning and sparse methods. It introduces a novel acceleration technique.

  55. Probing Latent Subspaces in LLM for AI Security: Identifying and Manipulating Adversarial States - Score: 17 (R=9, N=8) - Date: 2025-03-13 - Comment: The paper explores latent subspaces in LLMs for adversarial state manipulation, aligning with the 'Large Language Models' criterion due to its focus on interpretability and theoretical insights.

  56. Implicit Contrastive Representation Learning with Guided Stop-gradient - Score: 17 (R=9, N=8) - Date: 2025-03-13 - Comment: The paper introduces a novel method for implicit contrastive representation learning, which aligns with representation learning and training dynamics. It provides methodological advancements.

  57. Towards Interpretable Protein Structure Prediction with Sparse Autoencoders - Score: 17 (R=9, N=8) - Date: 2025-03-13 - Comment: The paper scales sparse autoencoders to large protein language models, enabling interpretability in protein structure prediction. This aligns with foundational research in representation learning and AI for science.

  58. How good is PAC-Bayes at explaining generalisation? - Score: 17 (R=9, N=8) - Date: 2025-03-12 - Comment: The paper provides a theoretical analysis of PAC-Bayes bounds and their ability to explain generalization, which is highly relevant to foundational research in representation learning and generalization theory.

  59. A Theoretical Framework for Preventing Class Collapse in Supervised Contrastive Learning - Score: 17 (R=9, N=8) - Date: 2025-03-12 - Comment: The paper provides a theoretical framework to prevent class collapse in supervised contrastive learning, which is highly relevant to foundational research in representation learning.

  60. Route Sparse Autoencoder to Interpret Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-12 - Comment: The paper proposes a sparse autoencoder framework for LLM interpretability, which aligns with representation learning and interpretability of LLMs.

  61. CAD-VAE: Leveraging Correlation-Aware Latents for Comprehensive Fair Disentanglement - Score: 17 (R=9, N=8) - Date: 2025-03-12 - Comment: The paper introduces CAD-VAE, a novel disentangled VAE framework addressing fairness in representation learning, which aligns with foundational research in representation learning.

  62. Learning Energy-Based Models by Self-normalising the Likelihood - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper proposes a novel self-normalized log-likelihood objective for energy-based models, which aligns with foundational research in representation learning.

  63. How LLMs Learn: Tracing Internal Representations with Sparse Autoencoders - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper uses sparse autoencoders to trace internal representations in LLMs, directly addressing representation learning and interpretability in LLMs.

  64. Breaking Free from MMI: A New Frontier in Rationalization by Probing Input Utilization - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper critiques the MMI criterion and proposes a novel alternative for rationale extraction, which aligns with representation learning and interpretability.

  65. Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper provides theoretical insights into feature learning in ReLU networks, which aligns with foundational research in representation learning.

  66. Analyzing the Role of Permutation Invariance in Linear Mode Connectivity - Score: 17 (R=9, N=8) - Date: 2025-03-11 - Comment: The paper provides a theoretical analysis of linear mode connectivity and sparsity in neural networks, which aligns with representation learning and training dynamics.

  67. Strategy Coopetition Explains the Emergence and Transience of In-Context Learning - Score: 17 (R=9, N=8) - Date: 2025-03-10 - Comment: The paper provides a mechanistic understanding of in-context learning dynamics, which aligns with foundational research in representation learning and training dynamics.

  68. Distilling Dataset into Neural Field - Score: 17 (R=9, N=8) - Date: 2025-03-10 - Comment: The paper introduces a novel parameterization framework for dataset distillation using neural fields, which is highly relevant to foundational research in representation learning and efficiency.

  69. Enough Coin Flips Can Make LLMs Act Bayesian - Score: 17 (R=9, N=8) - Date: 2025-03-07 - Comment: The paper investigates whether LLMs perform Bayesian reasoning during in-context learning, providing theoretical insights into LLM behavior and interpretability. This aligns closely with the foundational research on LLMs and their emergent capabilities.

  70. Transferable Foundation Models for Geometric Tasks on Point Cloud Representations: Geometric Neural Operators - Score: 17 (R=9, N=8) - Date: 2025-03-07 - Comment: The paper introduces Geometric Neural Operators (GNPs) for point cloud representations, which aligns with foundational research in representation learning and architecture-level innovations.

  71. Activation Space Interventions Can Be Transferred Between Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-07 - Comment: The paper explores activation space interventions and their transferability between LLMs, which aligns with representation learning and foundational insights into LLM behavior.

  72. Causally Reliable Concept Bottleneck Models - Score: 17 (R=9, N=8) - Date: 2025-03-07 - Comment: The paper introduces a concept bottleneck model with causal reasoning capabilities, aligning with representation learning and emerging trends in explainable AI. It also provides a pipeline for learning causal structures.

  73. Learning Causal Response Representations through Direct Effect Analysis - Score: 17 (R=9, N=8) - Date: 2025-03-07 - Comment: The paper focuses on causal representation learning, which aligns with the representation learning criterion. It introduces a novel optimization framework and provides theoretical guarantees, making it relevant to foundational research.

  74. Generalizability of Neural Networks Minimizing Empirical Risk Based on Expressive Ability - Score: 17 (R=9, N=8) - Date: 2025-03-07 - Comment: The paper provides theoretical insights into generalizability based on expressiveness, directly addressing foundational questions in representation learning and over-parameterization.

  75. Process-based Self-Rewarding Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-06 - Comment: The paper explores a self-rewarding paradigm for LLMs with a focus on mathematical reasoning, which aligns with foundational research in LLM behavior and interpretability. The proposed process-based self-rewarding pipeline introduces novel theoretical insights.

  76. Towards Understanding Distilled Reasoning Models: A Representational Approach - Score: 17 (R=9, N=8) - Date: 2025-03-06 - Comment: The paper explores how model distillation impacts reasoning features in LLMs, aligning with representation learning and theoretical insights into LLM behavior. The focus on feature geometry and structured representations is highly relevant.

  77. Effective LLM Knowledge Learning via Model Generalization - Score: 17 (R=9, N=8) - Date: 2025-03-06 - Comment: The paper explores knowledge learning in LLMs and proposes methods to improve generalization during pretraining. This aligns with the 'Large Language Models' criterion, particularly in understanding and enhancing foundational knowledge acquisition.

  78. Towards Understanding Multi-Round Large Language Model Reasoning: Approximability, Learnability and Generalizability - Score: 17 (R=9, N=8) - Date: 2025-03-06 - Comment: The paper provides theoretical insights into multi-round reasoning in LLMs, focusing on approximation, learnability, and generalization. This aligns with the 'Large Language Models' criterion, particularly in understanding foundational behavior and theoretical properties.

  79. Weak-to-Strong Generalization Even in Random Feature Networks, Provably - Score: 17 (R=9, N=8) - Date: 2025-03-05 - Comment: The paper explores weak-to-strong generalization in random feature networks, providing theoretical insights into training dynamics and generalization, which aligns well with foundational research in representation learning.

  80. Unsupervised Attributed Dynamic Network Embedding with Stability Guarantees - Score: 17 (R=9, N=8) - Date: 2025-03-05 - Comment: The paper focuses on unsupervised representation learning for dynamic networks, with a novel stability guarantee and theoretical contributions. This aligns with the representation learning criterion.

  81. (How) Do Language Models Track State? - Score: 17 (R=9, N=8) - Date: 2025-03-05 - Comment: The paper investigates how language models track state and identifies two distinct mechanisms, providing theoretical insights into LLM behavior and interpretability.

  82. A Theory of Initialisation's Impact on Specialisation - Score: 17 (R=9, N=8) - Date: 2025-03-05 - Comment: The paper provides theoretical insights into the impact of initialization on neuron specialization, which is relevant to representation learning and training dynamics in neural networks.

  83. A Near Complete Nonasymptotic Generalization Theory For Multilayer Neural Networks: Beyond the Bias-Variance Tradeoff - Score: 17 (R=9, N=8) - Date: 2025-03-05 - Comment: The paper introduces a nonasymptotic generalization theory for multilayer neural networks, addressing foundational aspects of generalization and double descent, which is highly relevant to understanding training dynamics.

  84. From superposition to sparse codes: interpretable representations in neural networks - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper provides a theoretical framework for understanding neural representations using sparse coding, which aligns with foundational research in representation learning.

  85. On the Power of Context-Enhanced Learning in LLMs - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper formalizes context-enhanced learning for LLMs, providing theoretical insights into gradient-based learning with enhanced context. This aligns with foundational research in LLM behavior and interpretability.

  86. Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper proposes a sparse coding method for adaptive representation learning, which aligns with foundational research in representation learning and efficiency.

  87. Asymptotic Theory of Eigenvectors for Latent Embeddings with Generalized Laplacian Matrices - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper develops an asymptotic theory for eigenvectors in generalized Laplacian matrices, contributing to foundational research in representation learning and theoretical insights into latent embeddings.

  88. Projection Head is Secretly an Information Bottleneck - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper provides a theoretical understanding of the projection head in contrastive learning, aligning with foundational research in representation learning and offering novel insights into its role as an information bottleneck.

  89. Towards Understanding the Benefit of Multitask Representation Learning in Decision Process - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper provides theoretical insights into multitask representation learning, directly addressing foundational aspects of representation learning.

  90. Steering Large Language Model Activations in Sparse Spaces - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper introduces Sparse Activation Steering (SAS) for guiding LLM behavior using sparse autoencoders. This aligns with foundational research in representation learning and interpretability, offering a novel approach to behavior modulation.

  91. BAnG: Bidirectional Anchored Generation for Conditional RNA Design - Score: 17 (R=9, N=8) - Date: 2025-03-03 - Comment: The paper explores identifiability in mechanistic interpretability, which aligns with emerging trends and foundational research in understanding neural networks.

  92. Dynamical Decoupling of Generalization and Overfitting in Large Two-Layer Networks - Score: 17 (R=9, N=8) - Date: 2025-03-03 - Comment: The paper provides a theoretical analysis of training dynamics in large two-layer networks, uncovering phenomena like time-scale separation and feature unlearning. This aligns with the 'Representation Learning' criterion, focusing on training dynamics and generalization.

  93. Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking) - Score: 17 (R=9, N=8) - Date: 2025-03-03 - Comment: This position paper advocates for using layerwise linear models to understand neural dynamical phenomena like neural collapse and grokking, which directly aligns with foundational research in representation learning and training dynamics.

  94. Learning Dynamics of Deep Linear Networks Beyond the Edge of Stability - Score: 17 (R=9, N=8) - Date: 2025-03-03 - Comment: The paper provides a theoretical analysis of learning dynamics in deep linear networks, contributing to foundational understanding of training dynamics in neural networks.

  95. Brain-Inspired Exploration of Functional Networks and Key Neurons in Large Language Models - Score: 17 (R=9, N=8) - Date: 2025-03-03 - Comment: The paper explores functional networks in LLMs inspired by cognitive neuroscience, providing insights into LLM behavior and interpretability, which aligns with the LLM criterion.

  96. Fundamental Limits of Matrix Sensing: Exact Asymptotics, Universality, and Applications - Score: 17 (R=8, N=9) - Date: 2025-03-19 - Comment: The paper provides theoretical insights into matrix sensing and Bayes-optimal learning, which aligns with foundational research in representation learning and efficiency.

  97. SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability - Score: 16 (R=9, N=7) - Date: 2025-03-13 - Comment: The paper introduces a benchmark for sparse autoencoders, which aligns with the representation learning criterion. The focus on interpretability and feature disentanglement is relevant.

  98. MixFunn: A Neural Network for Differential Equations with Improved Generalization and Interpretability - Score: 16 (R=8, N=8) - Date: 2025-03-31 - Comment: The paper introduces MixFunn, a novel neural network architecture for solving differential equations with enhanced generalization and interpretability. It provides architectural innovations and insights into representation learning, making it relevant to foundational research.

  99. Arch-LLM: Taming LLMs for Neural Architecture Generation via Unsupervised Discrete Representation Learning - Score: 16 (R=8, N=8) - Date: 2025-03-31 - Comment: The paper introduces a discrete representation learning approach for neural architecture generation, which aligns with 'Model Architecture' and 'Representation Learning'. The use of VQ-VAE and LLMs for architecture generation is novel.

  100. Scalable Expectation Estimation with Subtractive Mixture Models - Score: 16 (R=8, N=8) - Date: 2025-03-28 - Comment: The paper introduces subtractive mixture models (SMMs) for scalable expectation estimation, which is a novel contribution to generative modeling and aligns with representation learning through advanced mixture models.

  101. AutoBayes: A Compositional Framework for Generalized Variational Inference - Score: 16 (R=8, N=8) - Date: 2025-03-25 - Comment: The paper proposes a compositional framework for generalized variational inference, which aligns with foundational research in representation learning and model optimization. It introduces new tools and theoretical insights.

  102. Does GCL Need a Large Number of Negative Samples? Enhancing Graph Contrastive Learning with Effective and Efficient Negative Sampling - Score: 16 (R=8, N=8) - Date: 2025-03-25 - Comment: The paper challenges the consensus on negative sampling in Graph Contrastive Learning (GCL) and proposes a new method (E2Neg) for efficient and effective sampling. It aligns with representation learning and introduces novel insights.

  103. CODA: Repurposing Continuous VAEs for Discrete Tokenization - Score: 16 (R=8, N=8) - Date: 2025-03-25 - Comment: The paper proposes a novel framework for adapting continuous VAEs into discrete tokenizers, which aligns with foundational research in representation learning and autoencoders.

  104. Structure Is Not Enough: Leveraging Behavior for Neural Network Weight Reconstruction - Score: 16 (R=8, N=8) - Date: 2025-03-24 - Comment: The paper introduces a behavioral loss for neural network weight reconstruction, which aligns with representation learning and autoencoders. The focus on combining structural and behavioral signals is a novel approach.

  105. HiQ-Lip: The First Quantum-Classical Hierarchical Method for Global Lipschitz Constant Estimation of ReLU Networks - Score: 16 (R=8, N=8) - Date: 2025-03-21 - Comment: The paper proposes HiQ-Lip, a hybrid quantum-classical method for estimating the global Lipschitz constant of ReLU networks. It aligns with foundational research in neural network robustness and generalization.

  106. Revealing higher-order neural representations with generative artificial intelligence - Score: 16 (R=8, N=8) - Date: 2025-03-19 - Comment: The paper uses generative AI to explore higher-order neural representations, which aligns with emerging trends in representation learning and theoretical insights.

  107. Ensemble Knowledge Distillation for Machine Learning Interatomic Potentials - Score: 16 (R=8, N=8) - Date: 2025-03-19 - Comment: The paper introduces an ensemble knowledge distillation method for improving machine learning interatomic potentials, which is relevant to foundational efficiency and representation learning methods.

  108. TNCSE: Tensor's Norm Constraints for Unsupervised Contrastive Learning of Sentence Embeddings - Score: 16 (R=8, N=8) - Date: 2025-03-18 - Comment: The paper proposes a novel unsupervised contrastive learning framework for sentence embeddings, which aligns with representation learning and introduces tensor norm constraints.

  109. Combining Causal Models for More Accurate Abstractions of Neural Networks - Score: 16 (R=8, N=8) - Date: 2025-03-17 - Comment: The paper combines causal models to build more accurate abstractions of neural networks, a foundational contribution to mechanistic interpretability.

  110. Classifying Long-tailed and Label-noise Data via Disentangling and Unlearning - Score: 16 (R=8, N=8) - Date: 2025-03-17 - Comment: The paper proposes disentangling and unlearning methods for noisy long-tailed data, aligning with foundational research in representation learning.

  111. Unifying Perplexing Behaviors in Modified BP Attributions through Alignment Perspective - Score: 16 (R=8, N=8) - Date: 2025-03-17 - Comment: The paper provides a unified theoretical framework for backpropagation attribution methods, aligning with foundational research in representation learning.

  112. Langevin Monte-Carlo Provably Learns Depth Two Neural Nets at Any Size and Data - Score: 16 (R=8, N=8) - Date: 2025-03-14 - Comment: The paper provides theoretical insights into learning depth-2 neural networks using Langevin Monte-Carlo, which contributes to foundational research in representation learning and training dynamics.

  113. Multiplicative Learning - Score: 16 (R=8, N=8) - Date: 2025-03-14 - Comment: The paper introduces Expectation Reflection, a novel multiplicative learning approach, aligning with 'Representation Learning' due to its innovative training dynamics.

  114. Do We Always Need the Simplicity Bias? Looking for Optimal Inductive Biases in the Wild - Score: 16 (R=8, N=8) - Date: 2025-03-14 - Comment: The paper explores meta-learning activation functions to optimize inductive biases, contributing to architectural innovations and representation learning.

  115. Inter-environmental world modeling for continuous and compositional dynamics - Score: 16 (R=8, N=8) - Date: 2025-03-14 - Comment: The paper introduces WLA for inter-environmental world modeling, contributing to foundational research in representation learning and emerging trends.

  116. Is CLIP ideal? No. Can we fix it? Yes! - Score: 16 (R=8, N=8) - Date: 2025-03-13 - Comment: The paper critiques the geometric limitations of CLIP's latent space and proposes a novel scoring method, aligning with representation learning and foundational model analysis.

  117. The Space Between: On Folding, Symmetries and Sampling - Score: 16 (R=8, N=8) - Date: 2025-03-12 - Comment: The paper explores space folding in neural networks, which aligns with representation learning and provides insights into training dynamics.

  118. CIMAGE: Exploiting the Conditional Independence in Masked Graph Auto-encoders - Score: 16 (R=8, N=8) - Date: 2025-03-12 - Comment: The paper introduces CIMAGE, a novel CI-aware masking strategy for graph autoencoders, contributing to foundational research in representation learning for graphs.

  119. Global graph features unveiled by unsupervised geometric deep learning - Score: 16 (R=8, N=8) - Date: 2025-03-10 - Comment: The paper introduces GAUDI, a novel unsupervised geometric deep learning framework for graph analysis, which aligns with representation learning through its innovative autoencoder architecture.

  120. Riemann$^2$: Learning Riemannian Submanifolds from Riemannian Data - Score: 16 (R=8, N=8) - Date: 2025-03-10 - Comment: The paper introduces a novel approach to learning Riemannian latent representations, which aligns with representation learning and provides theoretical insights into constrained data geometry.

  121. Post-Hoc Concept Disentanglement: From Correlated to Isolated Concept Representations - Score: 16 (R=8, N=8) - Date: 2025-03-10 - Comment: The paper introduces a method for disentangling concept representations in neural networks, which aligns with representation learning and provides insights into latent space interpretability.

  122. Towards Locally Explaining Prediction Behavior via Gradual Interventions and Measuring Property Gradients - Score: 16 (R=8, N=8) - Date: 2025-03-10 - Comment: The paper proposes a novel framework for local interventional explanations, which aligns with representation learning and provides insights into model interpretability.

  123. VQEL: Enabling Self-Developed Symbolic Language in Agents through Vector Quantization in Emergent Language Games - Score: 16 (R=8, N=8) - Date: 2025-03-10 - Comment: The paper introduces a novel method for emergent language learning using vector quantization, which aligns with foundational research in representation learning and symbolic representation.

  124. Sample-Optimal Agnostic Boosting with Unlabeled Data - Score: 16 (R=8, N=8) - Date: 2025-03-07 - Comment: The paper proposes a novel agnostic boosting algorithm leveraging unlabeled data, which aligns with foundational research in representation learning and efficiency improvements.

  125. Generalized Interpolating Discrete Diffusion - Score: 16 (R=8, N=8) - Date: 2025-03-07 - Comment: The paper introduces a generalized interpolating discrete diffusion (GIDD) framework, which provides theoretical insights into diffusion processes and aligns with foundational research in representation learning.

  126. Provable Robust Overfitting Mitigation in Wasserstein Distributionally Robust Optimization - Score: 16 (R=8, N=8) - Date: 2025-03-07 - Comment: The paper proposes a novel robust optimization framework under Wasserstein DRO, which aligns with foundational research in optimization and robustness.

  127. An optimal Petrov-Galerkin framework for operator networks - Score: 16 (R=8, N=8) - Date: 2025-03-07 - Comment: The paper proposes a novel operator network framework (PG-VarMiON) for solving PDEs, embedding Petrov-Galerkin structure into the architecture. This aligns with foundational research in model architecture.

  128. All-atom Diffusion Transformers: Unified generative modelling of molecules and materials - Score: 16 (R=8, N=8) - Date: 2025-03-07 - Comment: The paper proposes a unified generative model for molecules and materials using Transformers, which aligns with foundational research in generative modeling and representation learning.

  129. Generative Learning of Densities on Manifolds - Score: 16 (R=8, N=8) - Date: 2025-03-07 - Comment: The paper combines diffusion models and manifold learning for generative modeling, which aligns with foundational research in representation learning and generative paradigms.

  130. Feature Matching Intervention: Leveraging Observational Data for Causal Representation Learning - Score: 16 (R=8, N=8) - Date: 2025-03-06 - Comment: The paper introduces a novel approach for causal representation learning using feature matching interventions. This aligns with the 'Representation Learning' criterion, focusing on foundational methods for identifying causal features.

  131. A Minimalist Example of Edge-of-Stability and Progressive Sharpening - Score: 16 (R=8, N=8) - Date: 2025-03-05 - Comment: The paper explores Edge-of-Stability and Progressive Sharpening phenomena in optimization, providing theoretical insights into training dynamics, which is relevant to representation learning.

  132. Spike-and-Slab Posterior Sampling in High Dimensions - Score: 16 (R=8, N=8) - Date: 2025-03-05 - Comment: The paper introduces provable algorithms for spike-and-slab posterior sampling in high dimensions, which aligns with foundational research in sparse methods and representation learning.

  133. Weight transport through spike timing for robust local gradients - Score: 16 (R=8, N=8) - Date: 2025-03-05 - Comment: The paper introduces a novel spike-based learning rule for spiking neural networks, which aligns with foundational research in representation learning and training dynamics.

  134. Multi-Level Collaboration in Model Merging - Score: 16 (R=8, N=8) - Date: 2025-03-04 - Comment: The paper explores model merging and its theoretical connection to model ensembling, which aligns with representation learning and architectural innovations by addressing multi-task learning and parameter-level merging.

  135. Understanding Dataset Distillation via Spectral Filtering - Score: 16 (R=8, N=8) - Date: 2025-03-04 - Comment: The paper introduces a spectral filtering framework for dataset distillation, which provides theoretical insights into representation learning.

  136. Learning-Augmented Frequent Directions - Score: 16 (R=8, N=8) - Date: 2025-03-04 - Comment: The paper introduces a learning-augmented variant of the Frequent Directions algorithm, which aligns with representation learning and efficiency breakthroughs in streaming algorithms.

  137. Homomorphism Expressivity of Spectral Invariant Graph Neural Networks - Score: 16 (R=8, N=8) - Date: 2025-03-04 - Comment: The paper provides a theoretical analysis of spectral invariant GNNs, which aligns with foundational research in model architecture and representation learning.

  138. Interpreting CLIP with Hierarchical Sparse Autoencoders - Score: 16 (R=8, N=8) - Date: 2025-03-03 - Comment: The paper introduces the Matryoshka Sparse Autoencoder (MSAE), a hierarchical sparse autoencoder for interpreting and controlling CLIP, aligning with 'Representation Learning' and sparse methods.

  139. Backpropagation-free Spiking Neural Networks with the Forward-Forward Algorithm - Score: 16 (R=8, N=8) - Date: 2025-03-03 - Comment: The paper explores the Forward-Forward algorithm for training spiking neural networks, which is a novel training methodology with potential foundational impact on neuromorphic computing and representation learning.

  140. ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models - Score: 15 (R=8, N=7) - Date: 2025-03-31 - Comment: The paper provides insights into reasoning length control in LLMs and introduces a weight-editing approach, which aligns with 'Representation Learning' and 'Large Language Models' criteria. The mechanistic understanding of reasoning length is a novel contribution.

  141. Efficient Joint Prediction of Multiple Future Tokens - Score: 15 (R=8, N=7) - Date: 2025-03-31 - Comment: The paper proposes joint multi-token prediction (JTP) to enrich hidden state representations in language models. It aligns with representation learning and introduces a lightweight architectural modification, making it relevant to foundational research.

  142. Outlier dimensions favor frequent tokens in language model - Score: 15 (R=8, N=7) - Date: 2025-03-28 - Comment: The study of outlier dimensions in language models provides insights into token prediction heuristics and their training dynamics, aligning with representation learning and interpretability.

  143. Offline Action-Free Learning of Ex-BMDPs by Comparing Diverse Datasets - Score: 15 (R=8, N=7) - Date: 2025-03-28 - Comment: The paper introduces a novel representation learning algorithm (CRAFT) for action-free environments, aligning with the representation learning criterion. It provides theoretical guarantees, which adds to its foundational relevance.

  144. Including local feature interactions in deep non-negative matrix factorization networks improves performance - Score: 15 (R=8, N=7) - Date: 2025-03-27 - Comment: The paper explores the integration of local feature interactions in deep non-negative matrix factorization (NMF) networks, which aligns with representation learning and architectural insights. The focus on improving performance through biologically plausible mechanisms adds a novel perspective.

  145. TraNCE: Transformative Non-linear Concept Explainer for CNNs - Score: 15 (R=8, N=7) - Date: 2025-03-27 - Comment: This paper introduces a novel concept explainer for CNNs using variational autoencoders and a new evaluation metric. It aligns with representation learning and explainability, particularly in understanding how CNNs encode information, making it relevant to foundational research.

  146. Network Inversion for Generating Confidently Classified Counterfeits - Score: 15 (R=8, N=7) - Date: 2025-03-27 - Comment: The paper explores network inversion techniques to generate confidently classified counterfeits, which provides insights into model behavior and decision boundaries. This aligns with representation learning, particularly in understanding how models encode information and their limitations.

  147. TARDIS: Mitigate Temporal Misalignment via Representation Steering - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: The paper introduces TARDIS, a representation editing method to mitigate temporal misalignment, aligning with representation learning and efficiency improvements.

  148. Towards Human-Understandable Multi-Dimensional Concept Discovery - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: The paper proposes HU-MCD, which enhances concept-based explainability in CNNs, aligning with representation learning and interpretability research.

  149. Exploring Energy Landscapes for Minimal Counterfactual Explanations: Applications in Cybersecurity and Beyond - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: The paper introduces a novel framework for counterfactual explanations using energy minimization, which aligns with representation learning and interpretability research.

  150. Interpretable Feature Interaction via Statistical Self-supervised Learning on Tabular Data - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: The paper proposes a novel self-supervised learning pipeline combining kernel PCA and sparse polynomial representations, which aligns with representation learning and interpretability.

  151. On the Minimax Regret of Sequential Probability Assignment via Square-Root Entropy - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: The paper provides theoretical insights into sequential probability assignment using square-root entropy, which aligns with foundational research in representation learning.

  152. Improving Preference Extraction In LLMs By Identifying Latent Knowledge Through Classifying Probes - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: Proposes linear probing for extracting latent knowledge in LLMs, which provides insights into model interpretability and aligns with foundational research in representation learning.

  153. Bayesian generative models can flag performance loss, bias, and out-of-distribution image content - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: The paper proposes a novel uncertainty quantification method for VAEs, which aligns with foundational research in generative models and representation learning.

  154. Generative Modeling of Class Probability for Multi-Modal Representation Learning - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: Proposes a novel generative modeling approach for multi-modal representation learning, which aligns with foundational research in representation learning.

  155. Neural-Guided Equation Discovery - Score: 15 (R=8, N=7) - Date: 2025-03-24 - Comment: The paper explores neural-guided equation discovery, which aligns with emerging trends in foundational AI research. It introduces a modular system and compares architectures, making it relevant to representation learning.

  156. A Learnability Analysis on Neuro-Symbolic Learning - Score: 15 (R=8, N=7) - Date: 2025-03-24 - Comment: The paper provides a learnability analysis for neuro-symbolic tasks, which aligns with foundational research in representation learning and theoretical insights.

  157. A preliminary data fusion study to assess the feasibility of Foundation Process-Property Models in Laser Powder Bed Fusion - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper explores data fusion for process-property modeling in additive manufacturing, which aligns with foundational research in AI for Science and representation learning.

  158. Advances in Protein Representation Learning: Methods, Applications, and Future Directions - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper provides a comprehensive review of Protein Representation Learning (PRL), which aligns with foundational research in representation learning for molecular modeling.

  159. Procrustes Wasserstein Metric: A Modified Benamou-Brenier Approach with Applications to Latent Gaussian Distributions - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper introduces a modified Benamou-Brenier approach for Wasserstein distance with applications to latent Gaussian distributions. It aligns with foundational research in representation learning and metric spaces.

  160. On the Cone Effect in the Learning Dynamics - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper investigates the learning dynamics of neural networks, specifically the evolution of the empirical Neural Tangent Kernel (eNTK) and introduces the cone effect. This aligns with representation learning and training dynamics.

  161. Rethinking Robustness in Machine Learning: A Posterior Agreement Approach - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper proposes a novel robustness metric based on Posterior Agreement theory, which aligns with foundational research in model evaluation and robustness.

  162. Manifold learning in metric spaces - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper generalizes manifold learning to metric spaces, providing theoretical insights into Laplacian-based methods. It aligns with foundational research in representation learning.

  163. Disentangling Uncertainties by Learning Compressed Data Representation - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper introduces CDRM, a framework for disentangling uncertainties in regressive system dynamics models. It aligns with foundational research in representation learning and uncertainty estimation.

  164. Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-03-19 - Comment: The paper presents a framework for unifying text semantics and graph structures using LLMs, which aligns with representation learning and explores the interplay between semantics and structures.

  165. FeNeC: Enhancing Continual Learning via Feature Clustering with Neighbor- or Logit-Based Classification - Score: 15 (R=8, N=7) - Date: 2025-03-19 - Comment: The paper introduces a novel clustering-based approach for continual learning, which aligns with representation learning and foundational methods for adapting to evolving data.

  166. GFSNetwork: Differentiable Feature Selection via Gumbel-Sigmoid Relaxation - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper presents a novel neural architecture for feature selection using Gumbel-Sigmoid relaxation, which aligns with foundational research in model architecture and efficiency.

  167. Can LLMs Formally Reason as Abstract Interpreters for Program Analysis? - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper explores whether LLMs can perform formal reasoning using abstract interpretation, which aligns with foundational research into LLM behavior and interpretability.

  168. S2IL: Structurally Stable Incremental Learning - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper proposes a method for incremental learning that mitigates catastrophic forgetting, which aligns with representation learning and training dynamics in neural networks.

  169. Revisiting Gradient Descent: A Dual-Weight Method for Improved Learning - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper introduces a novel dual-weight gradient descent framework, which aligns with representation learning by providing insights into training dynamics and robustness.

  170. Entropy-regularized Gradient Estimators for Approximate Bayesian Inference - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper proposes entropy-regularized gradient estimators for approximate Bayesian inference, focusing on uncertainty quantification and posterior sampling. This aligns with foundational research in representation learning and Bayesian methods.

  171. GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: GraphEval introduces graph-based methods for idea evaluation, which aligns with foundational research in representation learning and graph neural networks.

  172. Support Collapse of Deep Gaussian Processes with Polynomial Kernels for a Wide Regime of Hyperparameters - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper analyzes the behavior of Deep Gaussian Processes with polynomial kernels, providing theoretical insights into their training dynamics, which aligns with representation learning.

  173. Probabilistic Graph Circuits: Deep Generative Models for Tractable Probabilistic Inference over Graphs - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper introduces a new framework for deep generative models, Probabilistic Graph Circuits, focusing on tractable probabilistic inference rather than application-specific generation. This aligns with insights into representation learning and model architecture.

  174. Weighted Graph Structure Learning with Attention Denoising for Node Classification - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: Proposes a graph structure learning method with attention denoising, aligning with representation learning and sparsity criteria.

  175. Efficient and Privacy-Preserved Link Prediction via Condensed Graphs - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: Proposes a graph condensation method for privacy-preserved link prediction, aligning with representation learning and sparsity criteria.

  176. Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper proposes a universal speech token learning framework, which aligns with foundational research in representation learning and efficiency.

  177. RTD-Lite: Scalable Topological Analysis for Comparing Weighted Graphs in Learning Tasks - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: Proposes RTD-Lite for scalable topological analysis of weighted graphs, aligning with representation learning and sparsity criteria.

  178. Revisiting FastMap: New Applications - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper revisits FastMap for graph embeddings, aligning with foundational research in representation learning.

  179. Semantic-Clipping: Efficient Vision-Language Modeling with Semantic-Guided Visual Selection - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: Proposes semantic-guided visual selection for VLMs, aligning with representation learning and architectural innovation.

  180. Class-Level Feature Selection Method Using Feature Weighted Growing Self-Organising Maps - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: Proposes a class-level feature selection method, aligning with representation learning and sparsity criteria.

  181. Rethinking Few-Shot Adaptation of Vision-Language Models in Two Stages - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: Proposes a two-stage few-shot adaptation method for vision-language models, aligning with representation learning and architectural innovation.

  182. Implicit Bias-Like Patterns in Reasoning Models - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: Explores implicit bias-like patterns in reasoning models, providing insights into LLM behavior and interpretability.

  183. PARIC: Probabilistic Attention Regularization for Language Guided Image Classification from Pre-trained Vision Language Models - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper proposes probabilistic attention regularization for vision-language models, which aligns with foundational research in representation learning.

  184. Reasoning-Grounded Natural Language Explanations for Language Models - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper proposes reasoning-grounded natural language explanations for LLMs, aligning with foundational research in LLM behavior and interpretability.

  185. Don't Forget It! Conditional Sparse Autoencoder Clamping Works for Unlearning - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper explores sparse autoencoders for unlearning harmful knowledge in LLMs, aligning with foundational research in representation learning and interpretability.

  186. Quantifying Interpretability in CLIP Models with Concept Consistency - Score: 15 (R=8, N=7) - Date: 2025-03-17 - Comment: The paper proposes a metric for interpretability in CLIP models, which is relevant to understanding foundational aspects of representation learning.

  187. HyperDAS: Towards Automating Mechanistic Interpretability with Hypernetworks - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper proposes HyperDAS for automating mechanistic interpretability, contributing to foundational research in representation learning and interpretability.

  188. Fixed-Point RNNs: From Diagonal to Dense in a Few Iterations - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper explores a novel approach to sequence mixing in RNNs, which could provide insights into representation learning and architectural innovations.

  189. Numerical and statistical analysis of NeuralODE with Runge-Kutta time integration - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper provides a detailed analysis of NeuralODE with Runge-Kutta integration, contributing to foundational research in representation learning and generative modeling.

  190. Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper provides a classifier-centric perspective on classifier-free guidance, which aligns with foundational research in representation learning and model behavior.

  191. Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper explores training pathologies in language models, which aligns with foundational research into LLM behavior and interpretability.

  192. Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper introduces a new Transformer-based architecture for molecular representation learning, which aligns with foundational research in representation learning and model architecture.

  193. PIMRL: Physics-Informed Multi-Scale Recurrent Learning for Spatiotemporal Prediction - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: The paper introduces a physics-informed multi-scale learning framework, which aligns with foundational research in representation learning.

  194. SOLA-GCL: Subgraph-Oriented Learnable Augmentation Method for Graph Contrastive Learning - Score: 15 (R=8, N=7) - Date: 2025-03-14 - Comment: SOLA-GCL proposes a subgraph-oriented augmentation method for graph contrastive learning, contributing to representation learning with novel augmentation strategies.

  195. Learning Spatially Adaptive $\ell_1$-Norms Weights for Convolutional Synthesis Regularization - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper proposes an unrolled algorithm for learning spatially adaptive parameters in convolutional synthesis regularization, which aligns with representation learning and sparse methods.

  196. Neural Normalized Cut: A Differential and Generalizable Approach for Spectral Clustering - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper proposes a neural approach to spectral clustering, which aligns with representation learning and introduces a novel method for clustering membership.

  197. A Deep Bayesian Nonparametric Framework for Robust Mutual Information Estimation - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper introduces a Bayesian nonparametric framework for mutual information estimation, which aligns with representation learning and theoretical insights into training dynamics.

  198. Learning Pareto manifolds in high dimensions: How can regularization help? - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper discusses leveraging low-dimensional structure in multi-objective learning, which aligns with representation learning and training dynamics. It introduces a novel framework for regularization in high-dimensional settings.

  199. How Does Overparameterization Affect Machine Unlearning of Deep Neural Networks? - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: This paper investigates the effect of overparameterization on machine unlearning, which aligns with foundational research on training dynamics in neural networks. It provides theoretical insights into how unlearning interacts with model parameterization.

  200. Aligning Text to Image in Diffusion Models is Easier Than You Think - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: The paper proposes a lightweight contrastive fine-tuning strategy for text-to-image diffusion models, which aligns with representation learning and introduces methodological improvements.

  201. Benign Overfitting and the Geometry of the Ridge Regression Solution in Binary Classification - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: The paper investigates ridge regression in overparameterized settings and provides theoretical insights into benign overfitting, which aligns with foundational research in representation learning.

  202. Learning and Evaluating Hierarchical Feature Representations - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: The paper introduces a novel framework for hierarchical feature representation learning, which aligns with the representation learning criterion. The proposed Hier-COS method and new evaluation metric (HOPS) add methodological contributions.

  203. Personalized Convolutional Dictionary Learning of Physiological Time Series - Score: 15 (R=8, N=7) - Date: 2025-03-12 - Comment: The paper extends Convolutional Dictionary Learning (CDL) with a personalized approach, which aligns with representation learning. The focus on local-global structures and formal guarantees adds methodological novelty.

  204. Language Models Fail to Introspect About Their Knowledge of Language - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper investigates introspection in LLMs, which aligns with theoretical insights into LLM behavior and interpretability.

  205. Is My Text in Your AI Model? Gradient-based Membership Inference Test applied to LLMs - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper adapts a membership inference test for LLMs, which aligns with theoretical insights into LLM behavior and interpretability.

  206. DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper introduces a contrastive approach for LLM distillation, which aligns with foundational improvements in LLM training and efficiency.

  207. SemHiTok: A Unified Image Tokenizer via Semantic-Guided Hierarchical Codebook for Multimodal Understanding and Generation - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper introduces a unified image tokenizer for multimodal tasks, which is relevant to foundational research on representation learning and multimodal understanding.

  208. Enhancing CBMs Through Binary Distillation with Applications to Test-Time Intervention - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper proposes a method to enhance concept bottleneck models, which aligns with representation learning and interpretability.

  209. Deep Cut-informed Graph Embedding and Clustering - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper introduces a novel graph clustering framework with a focus on representation learning through graph embeddings. It aligns with foundational research in representation learning.

  210. GIN-Graph: A Generative Interpretation Network for Model-Level Explanation of Graph Neural Networks - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper proposes a generative interpretation network for GNNs, which aligns with foundational research on interpretability and model-level explanations.

  211. Text-Speech Language Models with Improved Cross-Modal Transfer by Aligning Abstraction Levels - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper focuses on text-speech language models and proposes methods to improve cross-modal transfer, which aligns with foundational research on representation learning and model architecture.

  212. SINdex: Semantic INconsistency Index for Hallucination Detection in LLMs - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper proposes a semantic inconsistency index for hallucination detection in LLMs, which is relevant to foundational research on LLM behavior and interpretability.

  213. Uncertainty Quantification From Scaling Laws in Deep Neural Networks - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper investigates uncertainty quantification in neural networks using scaling laws, which aligns with representation learning and theoretical insights.

  214. From Style to Facts: Mapping the Boundaries of Knowledge Injection with Finetuning - Score: 15 (R=8, N=7) - Date: 2025-03-11 - Comment: The paper studies finetuning for knowledge injection in LLMs, providing insights into the limitations and challenges of finetuning, which aligns with foundational research on LLM behavior.

  215. A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-03-10 - Comment: The paper surveys sparse autoencoders for interpreting LLMs, which aligns with foundational research in representation learning and interpretability.

  216. Grammar-Based Code Representation: Is It a Worthy Pursuit for LLMs? - Score: 15 (R=8, N=7) - Date: 2025-03-10 - Comment: The paper explores grammar-based code representations in LLMs, which aligns with representation learning and insights into LLM behavior. It provides a novel perspective on incorporating grammar rules into billion-scale models.

  217. Statistical Deficiency for Task Inclusion Estimation - Score: 15 (R=8, N=7) - Date: 2025-03-10 - Comment: The paper introduces a theoretical framework for task inclusion estimation, which aligns with foundational research in representation learning and task structure analysis.

  218. An Analytical Model for Overparameterized Learning Under Class Imbalance - Score: 15 (R=8, N=7) - Date: 2025-03-10 - Comment: The paper provides a theoretical analysis of class-imbalanced learning, which aligns with foundational research in representation learning and training dynamics.

  219. Extracting Symbolic Sequences from Visual Representations via Self-Supervised Learning - Score: 15 (R=8, N=7) - Date: 2025-03-10 - Comment: The paper explores symbolic sequence generation from visual data using self-supervised learning, which aligns with foundational research in representation learning and interpretability.

  220. LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning - Score: 15 (R=8, N=7) - Date: 2025-03-10 - Comment: The paper proposes a new contrastive learning framework for multimodal embedding models, which aligns with representation learning and introduces methodological improvements.

  221. Optimizing Multi-Hop Document Retrieval Through Intermediate Representations - Score: 15 (R=8, N=7) - Date: 2025-03-10 - Comment: The paper proposes a novel method for multi-hop document retrieval using intermediate representations, which aligns with representation learning and introduces architectural insights.

  222. An Information-theoretic Multi-task Representation Learning Framework for Natural Language Understanding - Score: 15 (R=8, N=7) - Date: 2025-03-07 - Comment: The paper introduces a multi-task representation learning framework with theoretical principles for sufficiency and redundancy minimization, aligning with the representation learning criterion.

  223. Quantitative Flow Approximation Properties of Narrow Neural ODEs - Score: 15 (R=8, N=7) - Date: 2025-03-07 - Comment: The paper explores the approximation properties of narrow Neural ODEs, which aligns with foundational research in representation learning and training dynamics of neural networks.

  224. Analogical Reasoning Inside Large Language Models: Concept Vectors and the Limits of Abstraction - Score: 15 (R=8, N=7) - Date: 2025-03-06 - Comment: The paper investigates analogical reasoning and concept vectors in LLMs, providing insights into representation learning and interpretability. The focus on invariant concept vectors and their causal role is relevant.

  225. Conceptualizing Uncertainty - Score: 15 (R=8, N=7) - Date: 2025-03-06 - Comment: The paper proposes concept activation vectors to explain uncertainty in high-dimensional data, aligning with representation learning and interpretability. The focus on global explanations of uncertainty is relevant.

  226. On the Relationship Between Double Descent of CNNs and Shape/Texture Bias Under Learning Process - Score: 15 (R=8, N=7) - Date: 2025-03-05 - Comment: The paper investigates the double descent phenomenon in CNNs, which is relevant to understanding training dynamics and representation learning. It provides experimental insights into shape/texture bias.

  227. Sharpness-Aware Minimization: General Analysis and Improved Rates - Score: 15 (R=8, N=7) - Date: 2025-03-05 - Comment: The paper provides a unified analysis of Sharpness-Aware Minimization (SAM) and introduces a new algorithm with theoretical guarantees, which aligns with representation learning and training dynamics in neural networks.

  228. Elliptic Loss Regularization - Score: 15 (R=8, N=7) - Date: 2025-03-05 - Comment: The paper proposes a novel regularization technique based on elliptic operators, which aligns with foundational research in model regularization and generalization.

  229. Quantifying Overfitting along the Regularization Path for Two-Part-Code MDL in Supervised Classification - Score: 15 (R=8, N=7) - Date: 2025-03-05 - Comment: This paper provides a theoretical analysis of regularization paths in MDL learning rules, which aligns with foundational research in representation learning and training dynamics.

  230. Linear Representations of Political Perspective Emerge in Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-03-05 - Comment: The paper explores linear representations of political perspectives in LLMs, which aligns with interpretability and representation learning in LLMs. It provides novel insights into high-level representations.

  231. Mathematical Foundation of Interpretable Equivariant Surrogate Models - Score: 15 (R=8, N=7) - Date: 2025-03-05 - Comment: The paper proposes a mathematical framework for explainability in equivariant operators, which aligns with foundational research in model interpretability and architecture analysis.

  232. SAKE: Steering Activations for Knowledge Editing - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: SAKE introduces a method for knowledge editing in LLMs, focusing on robustness and generalization. This aligns with foundational research in LLM behavior and interpretability.

  233. Hypergraph Foundation Model - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper proposes a hypergraph foundation model, which is relevant to architectural innovations and representation learning, particularly in the context of hypergraphs.

  234. Improve Representation for Imbalanced Regression through Geometric Constraints - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper introduces geometric constraints for representation learning in imbalanced regression, which aligns with foundational research in representation learning.

  235. Re-Imagining Multimodal Instruction Tuning: A Representation View - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper introduces a novel approach to multimodal instruction tuning, which is relevant to representation learning and parameter-efficient methods.

  236. Convergence of energy-based learning in linear resistive networks - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper provides theoretical insights into energy-based learning algorithms, which is foundational and relevant to representation learning.

  237. Unlocking Efficient, Scalable, and Continual Knowledge Editing with Basis-Level Representation Fine-Tuning - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper proposes BaFT, a method for knowledge editing in LLMs using basis-level representation fine-tuning. This aligns with foundational research in representation learning and model editing, offering a novel approach to the editing-locality trade-off.

  238. Generalization Bounds for Equivariant Networks on Markov Data - Score: 15 (R=8, N=7) - Date: 2025-03-04 - Comment: The paper provides generalization bounds for equivariant networks on Markov data, which aligns with foundational research in model behavior and theoretical insights.

  239. An Algebraic Framework for Hierarchical Probabilistic Abstraction - Score: 15 (R=8, N=7) - Date: 2025-03-03 - Comment: The paper introduces a hierarchical probabilistic abstraction framework, which aligns with foundational research in representation learning and abstraction methodologies.

  240. Tuning-Free Structured Sparse PCA via Deep Unfolding Networks - Score: 15 (R=8, N=7) - Date: 2025-03-03 - Comment: The paper introduces a deep unfolding network for structured sparse PCA, aligning with the 'Representation Learning' criterion by addressing unsupervised feature selection and dimensionality reduction.

  241. Discovering Global False Negatives On the Fly for Self-supervised Contrastive Learning - Score: 15 (R=8, N=7) - Date: 2025-03-03 - Comment: The paper addresses false negatives in self-supervised contrastive learning, which is a relevant topic in representation learning, particularly in improving training dynamics and embedding quality.

  242. Data Distributional Properties As Inductive Bias for Systematic Generalization - Score: 15 (R=8, N=7) - Date: 2025-03-03 - Comment: The paper investigates data distributional properties as inductive biases for systematic generalization, which is relevant to representation learning and provides insights into training dynamics.

  243. Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries - Score: 15 (R=8, N=7) - Date: 2025-03-03 - Comment: The paper provides insights into how LLMs answer one-to-many factual queries, which aligns with foundational research into LLM behavior and interpretability.

  244. Transfer Learning through Enhanced Sufficient Representation: Enriching Source Domain Knowledge with Target Data - Score: 15 (R=8, N=7) - Date: 2025-03-03 - Comment: The paper proposes a novel transfer learning method (TESR) with theoretical contributions to representation learning by enhancing sufficient representations, aligning with the 'Representation Learning' criterion.

  245. Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures - Score: 15 (R=7, N=8) - Date: 2025-03-14 - Comment: The paper introduces a novel generative modeling approach using conjugate moment measures, which could be relevant to representation learning and emerging trends.

  246. NFIG: Autoregressive Image Generation with Next-Frequency Prediction - Score: 15 (R=7, N=8) - Date: 2025-03-11 - Comment: The paper proposes a novel autoregressive framework for image generation using frequency-guided stages, which aligns with representation learning and architectural insights. However, the focus on image generation makes it partially relevant.

  247. Integrating Predictive and Generative Capabilities by Latent Space Design via the DKL-VAE Model - Score: 15 (R=7, N=8) - Date: 2025-03-06 - Comment: The paper introduces a novel framework combining VAE and DKL, which aligns with representation learning by enhancing latent space organization for generative and predictive tasks. However, the focus on application to material discovery makes it slightly less foundational.

  248. Cauchy Random Features for Operator Learning in Sobolev Space - Score: 15 (R=7, N=8) - Date: 2025-03-04 - Comment: The paper proposes a random feature method for operator learning with theoretical guarantees, which could be relevant to representation learning due to its focus on kernel-based methods and error bounds.

  249. Neuro-Symbolic Learning for Galois Groups: Unveiling Probabilistic Trends in Polynomials - Score: 15 (R=7, N=8) - Date: 2025-03-03 - Comment: The paper combines neural networks with symbolic reasoning to classify Galois groups, which is an emerging trend in AI for science with potential foundational implications.

  250. Survey on Evaluation of LLM-based Agents - Score: 14 (R=8, N=6) - Date: 2025-03-21 - Comment: The paper surveys evaluation methodologies for LLM-based agents, which aligns with foundational research in LLM behavior and interpretability.

  251. Emergent Abilities in Large Language Models: A Survey - Score: 14 (R=8, N=6) - Date: 2025-03-11 - Comment: The paper surveys emergent abilities in LLMs, which aligns with theoretical insights into LLM behavior and interpretability.

  252. VAEs and GANs: Implicitly Approximating Complex Distributions with Simple Base Distributions and Deep Neural Networks -- Principles, Necessity, and Limitations - Score: 14 (R=8, N=6) - Date: 2025-03-05 - Comment: The paper discusses fundamental principles and limitations of VAEs and GANs, which are relevant to representation learning and foundational generative modeling.

  253. Do Your Best and Get Enough Rest for Continual Learning - Score: 14 (R=7, N=7) - Date: 2025-03-25 - Comment: The paper introduces a novel view-batch model inspired by human memory theories for continual learning, which aligns with representation learning and training dynamics.

  254. Neural Network Approach to Stochastic Dynamics for Smooth Multimodal Density Estimation - Score: 14 (R=7, N=7) - Date: 2025-03-25 - Comment: The paper introduces a stochastic dynamics-based sampling method for density estimation, which aligns with representation learning by addressing sampling in high-dimensional spaces. However, its focus on specific sampling methods limits its broader impact.

  255. Do regularization methods for shortcut mitigation work as intended? - Score: 14 (R=7, N=7) - Date: 2025-03-24 - Comment: The paper analyzes regularization methods for mitigating shortcuts, providing theoretical insights into their mechanisms. This aligns with representation learning and training dynamics.

  256. Rethinking the Role of Spatial Mixing - Score: 14 (R=7, N=7) - Date: 2025-03-24 - Comment: The paper investigates the role of spatial mixing in deep learning architectures, providing insights into the training dynamics and robustness of models. While it does not directly address representation learning or model architecture innovation, its analysis of spatial and channel mixing offers foundational insights into existing architectures.

  257. End-to-End Optimal Detector Design with Mutual Information Surrogates - Score: 14 (R=7, N=7) - Date: 2025-03-19 - Comment: The paper focuses on mutual information as a task-agnostic optimization metric, which aligns with representation learning and foundational insights into optimization frameworks.

  258. Quantification of Uncertainties in Probabilistic Deep Neural Network by Implementing Boosting of Variational Inference - Score: 14 (R=7, N=7) - Date: 2025-03-19 - Comment: The paper introduces Boosted Bayesian Neural Networks for better uncertainty quantification, which aligns with representation learning and training dynamics in neural networks.

  259. OODD: Test-time Out-of-Distribution Detection with Dynamic Dictionary - Score: 14 (R=7, N=7) - Date: 2025-03-14 - Comment: The paper introduces a novel OOD detection method, which could provide insights into representation learning and model robustness.

  260. Towards Graph Foundation Models: A Transferability Perspective - Score: 14 (R=7, N=7) - Date: 2025-03-13 - Comment: The paper provides a taxonomy of Graph Foundation Models (GFMs) with a focus on transferability, which aligns with the 'Emerging Trends' criterion due to its foundational perspective on graph models.

  261. Birds look like cars: Adversarial analysis of intrinsically interpretable deep learning - Score: 14 (R=7, N=7) - Date: 2025-03-12 - Comment: The paper critiques the interpretability of prototype-based networks and highlights their vulnerabilities, which aligns with emerging trends in interpretability but lacks a strong foundational contribution.

  262. Median Consensus Embedding for Dimensionality Reduction - Score: 14 (R=7, N=7) - Date: 2025-03-12 - Comment: The paper introduces a novel method for reducing variability in dimensionality reduction techniques, which is relevant to representation learning but focuses on a specific embedding stability issue.

  263. Lifelong Learning with Task-Specific Adaptation: Addressing the Stability-Plasticity Dilemma - Score: 14 (R=7, N=7) - Date: 2025-03-11 - Comment: The paper proposes a novel adapter-based framework for lifelong learning, which touches on representation learning and model architecture but is more application-driven.

  264. What I cannot execute, I do not understand: Training and Evaluating LLMs on Program Execution Traces - Score: 14 (R=7, N=7) - Date: 2025-03-11 - Comment: The paper explores training LLMs using program execution traces, which provides insights into representation learning and training dynamics. However, it focuses more on code understanding and generation than on foundational representation learning.

  265. A kinetic-based regularization method for data science applications - Score: 14 (R=7, N=7) - Date: 2025-03-10 - Comment: The paper introduces a physics-inspired regularization method for function learning, which aligns with foundational research in representation learning, particularly in improving interpolation and regression tasks.

  266. Bi-Lipschitz Ansatz for Anti-Symmetric Functions - Score: 14 (R=7, N=7) - Date: 2025-03-07 - Comment: The paper introduces a new ansatz for approximating anti-symmetric functions, which is relevant to representation learning due to its focus on function approximation and theoretical insights.

  267. On the Saturation Effects of Spectral Algorithms in Large Dimensions - Score: 14 (R=7, N=7) - Date: 2025-03-04 - Comment: This paper explores saturation effects in spectral algorithms, providing theoretical insights into their behavior in large dimensions. While not directly tied to representation learning or model architecture, it contributes to foundational understanding of algorithmic behavior.

  268. Line Space Clustering (LSC): Feature-Based Clustering using K-medians and Dynamic Time Warping for Versatility - Score: 13 (R=7, N=6) - Date: 2025-03-21 - Comment: The paper introduces a novel clustering method combining K-medians and Dynamic Time Warping, which is relevant to representation learning but lacks broader foundational insights.

  269. The Shape of Attraction in UMAP: Exploring the Embedding Forces in Dimensionality Reduction - Score: 13 (R=7, N=6) - Date: 2025-03-13 - Comment: The paper provides an analysis of UMAP's embedding forces, offering insights into dimensionality reduction methods, which aligns with representation learning.

  270. Learning to Match Unpaired Data with Minimum Entropy Coupling - Score: 13 (R=7, N=6) - Date: 2025-03-12 - Comment: The paper addresses unpaired data matching using Minimum Entropy Coupling and diffusion models, which is relevant to representation learning but not highly novel.

  271. Strengthening the Internal Adversarial Robustness in Lifted Neural Networks - Score: 13 (R=7, N=6) - Date: 2025-03-12 - Comment: The paper explores adversarial robustness in lifted neural networks, which aligns with representation learning and training dynamics but is not a major breakthrough.

  272. Gender Encoding Patterns in Pretrained Language Model Representations - Score: 13 (R=7, N=6) - Date: 2025-03-11 - Comment: The paper analyzes gender encoding in pretrained language models, which aligns with representation learning by exploring how biases are encoded in model representations.

  273. Unnatural Languages Are Not Bugs but Features for LLMs - Score: 13 (R=7, N=6) - Date: 2025-03-05 - Comment: The paper explores the concept of unnatural languages in LLMs, which aligns with foundational research on LLM behavior and interpretability. However, it focuses more on empirical findings than on theoretical breakthroughs.

Other Foundational Research (23)

  1. Assessing SAM for Tree Crown Instance Segmentation from Drone Imagery - Score: 20.0 (R=0, N=0) - Date: 2025-03-27 - Comment: Author match

  2. Counterfactual Realizability - Score: 18 (R=9, N=9) - Date: 2025-03-18 - Comment: The paper explores counterfactual realizability, which is a cutting-edge theoretical contribution with potential implications for foundational research.

  3. Empirical Computation - Score: 18 (R=9, N=9) - Date: 2025-03-14 - Comment: The paper introduces 'empirical computation,' a novel paradigm that challenges classical computational frameworks. This aligns with the 'Emerging Trends' criterion as it proposes a cutting-edge theoretical direction.

  4. CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners - Score: 17 (R=9, N=8) - Date: 2025-03-21 - Comment: The paper proposes CaKE, a circuit-aware knowledge editing method for LLMs, which aligns with foundational research in LLM behavior and reasoning circuits.

  5. Non-convergence to the optimal risk for Adam and stochastic gradient descent optimization in the training of deep neural networks - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: The paper provides theoretical insights into the limitations of Adam and SGD optimization in deep learning, which aligns with foundational research on training dynamics.

  6. CE-U: Cross Entropy Unlearning - Score: 17 (R=9, N=8) - Date: 2025-03-04 - Comment: CE-U proposes a novel loss function for unlearning in LLMs, which aligns with foundational research in LLM behavior and theoretical insights. The focus on gradient stability and theoretical analysis is a strong match.

  7. Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment - Score: 16 (R=8, N=8) - Date: 2025-03-31 - Comment: The paper analyzes inference-time alignment and introduces a new algorithm, InferenceTimePessimism, which aligns with 'Large Language Models' and 'Emerging Trends'. The theoretical insights into reward hacking and scaling are significant.

  8. What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models - Score: 16 (R=8, N=8) - Date: 2025-03-25 - Comment: The paper introduces a framework for evaluating the steerability of generative models, which is a novel perspective on foundational aspects of generative model evaluation.

  9. From Demonstrations to Rewards: Alignment Without Explicit Human Preferences - Score: 16 (R=8, N=8) - Date: 2025-03-19 - Comment: The paper introduces a novel approach to alignment using inverse reinforcement learning principles, which is relevant to foundational LLM research and offers a fresh perspective on reward modeling.

  10. An Algebraic Approach to Moralisation and Triangulation of Probabilistic Graphical Models - Score: 16 (R=8, N=8) - Date: 2025-03-17 - Comment: Proposes an algebraic approach to probabilistic graphical models, aligning with emerging trends and foundational research.

  11. Characterizing Learning in Spiking Neural Networks with Astrocyte-Like Units - Score: 16 (R=8, N=8) - Date: 2025-03-11 - Comment: The paper explores the impact of astrocyte-like units in spiking neural networks, which is an emerging trend in foundational research.

  12. Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models - Score: 15 (R=8, N=7) - Date: 2025-03-29 - Comment: The paper provides a theoretical framework for understanding the safety-capability trade-offs in fine-tuning LLMs, which aligns with the 'theoretical insights into LLM behavior' criterion.

  13. Language Models May Verbatim Complete Text They Were Not Explicitly Trained On - Score: 15 (R=8, N=7) - Date: 2025-03-25 - Comment: Analyzes membership definitions in LLM training datasets, which provides theoretical insights into LLM behavior and aligns with foundational research.

  14. The Lighthouse of Language: Enhancing LLM Agents via Critique-Guided Improvement - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper introduces a critique-guided improvement framework for LLM agents, which aligns with foundational research in LLM behavior and iterative improvement.

  15. Entropy-based Exploration Conduction for Multi-step Reasoning - Score: 15 (R=8, N=7) - Date: 2025-03-21 - Comment: The paper introduces Entro-duction, a method for dynamically adjusting exploration depth in LLM reasoning, which aligns with foundational research in LLM behavior and reasoning capabilities.

  16. Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence - Score: 15 (R=8, N=7) - Date: 2025-03-20 - Comment: The paper proposes a method for teaching LLMs to express calibrated semantic confidence, which aligns with the 'Large Language Models' criterion by providing theoretical insights into uncertainty quantification.

  17. Improving Complex Reasoning with Dynamic Prompt Corruption: A soft prompt Optimization Approach - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper proposes a novel method for improving prompt-tuning in LLMs for complex reasoning tasks, which aligns with foundational research in LLM behavior and optimization.

  18. Understanding Gradient Orthogonalization for Deep Learning via Non-Euclidean Trust-Region Optimization - Score: 15 (R=8, N=7) - Date: 2025-03-18 - Comment: The paper provides a theoretical analysis of gradient orthogonalization and introduces a novel perspective on trust-region optimization. This aligns with foundational research in optimization and training dynamics of neural networks.

  19. Benefits of Learning Rate Annealing for Tuning-Robustness in Stochastic Optimization - Score: 15 (R=8, N=7) - Date: 2025-03-13 - Comment: The paper analyzes learning rate annealing for tuning robustness, which aligns with training dynamics in neural networks. It provides theoretical insights into optimization.

  20. Boosting Offline Optimizers with Surrogate Sensitivity - Score: 15 (R=8, N=7) - Date: 2025-03-07 - Comment: The paper develops a sensitivity-informed regularizer for offline optimization, which aligns with foundational research in optimization and robustness.

  21. Integrating Protein Dynamics into Structure-Based Drug Design via Full-Atom Stochastic Flows - Score: 15 (R=8, N=7) - Date: 2025-03-07 - Comment: The paper proposes a generative modeling approach for protein dynamics, which aligns with foundational research in AI for Science, particularly in molecular modeling.

  22. Simulation-based Bayesian inference under model misspecification - Score: 15 (R=7, N=8) - Date: 2025-03-17 - Comment: Focuses on simulation-based Bayesian inference under misspecification, introducing theoretical strategies to mitigate flawed models, aligning moderately with foundational AI for science.

  23. Empirical Privacy Variance - Score: 14 (R=7, N=7) - Date: 2025-03-17 - Comment: The paper introduces the concept of empirical privacy variance in DP-SGD, which provides theoretical insights into privacy and optimization.