Personalized Daily ArXiv Papers 2026-05-18

Model	Metric	Prompt usage	Completion usage	Total usage	Total arXiv papers	Scanned papers	Relevant papers
`gpt-5.4`	Tokens	257771	29041	286812	712	396	19
`gpt-5.4`	Cost	$0.64	$0.44	$1.08	712	396	19

Topic Coverage:

Topic	Papers
Architecture and Training Dynamics	5
Efficiency, Compression, and Large-Scale Training	3
Representation Learning Theory and Structure	5
Memory Structures and Agent Memory Systems	2
World Models, Exploration, and Open-Ended Reinforcement Learning	4

Table of contents by topic:

Architecture and Training Dynamics (5)

On the Stability of Growth in Structural Plasticity Authors: Lute Lillo, Nick Cheney
GQA-μP: The maximal parameterization update for grouped query attention Authors: Kyle R. Chickering, Huijuan Wang, Mengxi Wu, Alexander Moreno, Muhao Chen, Xuezhe Ma, Daria Soboleva, Joel Hestness, Zhengzhong Liu, Eric Xing
$\phi$-Balancing for Mixture-of-Experts Training Authors: Lizhang Chen, Jonathan Li, Qi Wang, Runlong Liao, Shuozhe Li, Chen Liang, Ni Lao, Qiang Liu
Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations Authors: Arthur Fyon, Julien Brandoit, Loris Mendolia, Damien Ernst, Jean-Michel Redouté, Guillaume Drion
Agentic Discovery of Neural Architectures: AIRA-Compose and AIRA-Design Authors: Alberto Pepe, Chien-Yu Lin, Despoina Magka, Bilge Acun, Yannan Nellie Wu, Anton Protopopov, Carole-Jean Wu, Yoram Bachrach

Efficiency, Compression, and Large-Scale Training (3)

GQLA: Group-Query Latent Attention for Hardware-Adaptive Large Language Model Decoding Authors: Fanxu Meng
A Few GPUs, A Whole Lotta Scale: Faithful LLM Training Emulation with PrismLLM Authors: Shaoke Xi, ChonLam Lao, Boyi Jia, Jiaqi Gao, Zhipeng Zhang, Jiamin Cao, Brian Sutioso, Erci Xu, Minlan Yu, Kui Ren, Yong Li, Zhengping Qian, Ennan Zhai, Jingren Zhou
Going Beyond the Edge: Distributed Inference of Transformer Models on Ultra-Low-Power Wireless Devices Authors: Alexander Gräfe, Ding Huo, Johannes Berger, Marco Zimmerling, Sebastian Trimpe

Representation Learning Theory and Structure (5)

Characterizing Learning in Deep Neural Networks using Tractable Algorithmic Complexity Analysis Authors: Pedram Bakhtiarifard, Sophia N. Wilson, Mahmoud Afifi, Jonathan Wenshøj, Raghavendra Selvan
Entropic Auto-Encoding via Implicit Free-Energy Minimization Authors: Hazhir Aliahmadi, Irina Babayan, Greg van Anders
Syntax Without Semantics: Teaching Large Language Models to Code in an Unseen Language Authors: Vinayshekhar Bannihatti Kumar, Disha Makhija, Manoj Ghuhan Arivazhagan, Rashmi Gangadharaiah
Judge Circuits Authors: Nils Feldhus, Tanja Baeumel, Elena Golimblevskaia, Qianli Wang, Van Bach Nguyen, Aaron Louis Eidt, Christopher Ebert, Wojciech Samek, Jing Yang, Vera Schmitt, Sebastian Möller, Simon Ostermann
Bounded-Rationality, Hedging, and Generalization Authors: Pedro A. Ortega

Memory Structures and Agent Memory Systems (2)

Bridging Silicon and the Hippocampus: Algebro-Deterministic Memory "VaCoAl" as a Substrate for Vector-HaSH and TEM Authors: Hiroyuki Chuma, Kanji Otsuka, Yoichi Sato
Hidden in Memory: Sleeper Memory Poisoning in LLM Agents Authors: Sidharth Pulipaka, Stanislau Hlebik, Leonidas Raghav, Sahar Abdelnabi, Vyas Raina, Ivaxi Sheth, Mario Fritz

World Models, Exploration, and Open-Ended Reinforcement Learning (4)

BAPR: Bayesian amnesic piecewise-robust reinforcement learning for non-stationary continuous control Authors: Yifan Zhang, Liang Zheng
Feedback World Model Enables Precise Guidance of Diffusion Policy Authors: Tuo An, Jindou Jia, Gen Li, Jingliang Li, Chuhao Zhou, Pengfei Liu, Bofan Lyu, Jiaqi Bai, Xinying Guo, Geng Li, Jianfei Yang
Imperfect World Models are Exploitable Authors: Logan Mondal Bhamidipaty (University of Edinburgh), Esmeralda S. Whitammer (University of Edinburgh), David Abel (University of Edinburgh), Mykel J. Kochenderfer (Stanford University), Subramanian Ramamoorthy (University of Edinburgh)
Deterministic Event-Graph Substrates as World Models for Counterfactual Reasoning Authors: Fabio Rovai

Architecture and Training Dynamics (5)

1. On the Stability of Growth in Structural Plasticity

ArXiv ID: 2605.15435

Primary Topic: Architecture and Training Dynamics

Authors: Lute Lillo, Nick Cheney

Abstract: Standard deep-learning pipelines usually choose the network architecture before training and keep it fixed throughout optimization. In contrast, a model can also be adapted by editing its structure during training, for example by pruning existing hidden-neuron units or growing new ones. Although growth is appealing for adaptive and continual systems, we show that it is not simply the inverse of pruning. Pruning selects among units that have participated in training from the start, whereas growth inserts new units into an already specialized optimization trajectory. We isolate this insertion problem and show that newborn units are often forward-active but backward-starved: they participate in the forward computation, yet receive much weaker gradient signal than incumbent units. This disadvantage is minor in small MLP benchmarks, but becomes clear in harder image-classification settings with a convolutional trunk. In these settings, Grow can achieve high final accuracy during the structural-editing procedure, while Prune is stronger when performance is averaged over the training trajectory or when the final sparse network is retrained from scratch. Interventions targeting optimizer state, insertion, selection, and trainability show that improving the integration of newborn units can improve adaptive performance, but does not automatically produce better final subnetworks. In continual-learning benchmarks stressing plasticity loss, Grow becomes competitive mainly when new units have enough time to integrate. Together, these results suggest that Grow should be evaluated not only as an architecture-search operator, but as a time-sensitive optimization process whose success depends on insertion stability.

Comment: Shows structurally grown units are backward-starved after insertion, reframing network growth as a time-sensitive optimization-stability problem.

Topic Match: This is directly about dynamic architecture editing and training stability mechanisms for growth during optimization.

Relevance: 9 Novelty: 8

2. GQA-μP: The maximal parameterization update for grouped query attention

ArXiv ID: 2605.15290

Primary Topic: Architecture and Training Dynamics

Authors: Kyle R. Chickering, Huijuan Wang, Mengxi Wu, Alexander Moreno, Muhao Chen, Xuezhe Ma, Daria Soboleva, Joel Hestness, Zhengzhong Liu, Eric Xing

Abstract: Hyperparameter transfer across model architectures dramatically reduces the amount of compute necessary for tuning large language models (LLMs). The maximal update parameterization (μP) ensures transfer through principled mathematical analysis but can be challenging to derive for new model architectures. Building on the spectral feature-learning view of Yang et al. (2023a), we make two advances. First, we promote spectral norm conditions on the weights from a heuristic to the definition of feature learning, and as a consequence arrive at the Complete-P depth and weight-decay scalings without recourse to lazy-learning. Second, we consider a modified spectral norm that preserves the valid scaling law of network weights when weight matrices are not full rank. This enables (to our knowledge, the first) derivation of μP scalings for grouped-query attention (GQA). We demonstrate the efficacy of our theoretical derivations by showing learning rate transfer across the GQA repetition hyperparameter as well as experiments regarding transfer over weight decay.

Comment: Derives maximal-update scaling laws for grouped-query attention, extending principled hyperparameter transfer to GQA architectures.

Topic Match: It directly targets training dynamics and scaling-law parameterization for a core transformer architectural variant.

Relevance: 9 Novelty: 8
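
For readers unfamiliar with the "GQA repetition hyperparameter" mentioned in the abstract, here is a minimal sketch of how grouped-query attention shares key/value heads among query heads. It uses the common convention that consecutive query heads share one KV head. The function name is hypothetical, and nothing here reflects the paper's μP derivation itself.

```python
def kv_head_for_query_head(q_head, num_q_heads, num_kv_heads):
    """Return the index of the KV head that a given query head attends with.

    Assumes the usual GQA layout: each KV head is repeated
    num_q_heads // num_kv_heads times over consecutive query heads.
    """
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be a multiple of num_kv_heads")
    if not 0 <= q_head < num_q_heads:
        raise ValueError("q_head out of range")
    repetition = num_q_heads // num_kv_heads  # the GQA repetition factor
    return q_head // repetition


# Example shape: 32 query heads sharing 8 KV heads gives a repetition factor of 4.
mapping = [kv_head_for_query_head(h, 32, 8) for h in range(32)]
```

With a repetition factor of 1 this reduces to ordinary multi-head attention, and with a single KV head it reduces to multi-query attention.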

3. $\phi$-Balancing for Mixture-of-Experts Training

ArXiv ID: 2605.15403

Primary Topic: Architecture and Training Dynamics

Also Matches: Efficiency, Compression, and Large-Scale Training

Authors: Lizhang Chen, Jonathan Li, Qi Wang, Runlong Liao, Shuozhe Li, Chen Liang, Ni Lao, Qiang Liu

Abstract: Mixture-of-Experts (MoE) models rely on balanced expert utilization to fully realize their scalability. However, existing load-balancing methods are largely heuristic and operate on noisy mini-batch assignment statistics, introducing bias relative to population-level objectives. We propose $\phi$-balancing, a principled framework that directly targets population-level expert balance by minimizing a strictly convex, symmetric, and differentiable potential of the expected routing distribution. Using convex duality, we derive an equivalent min-max formulation and obtain a simple online algorithm via mirror descent, yielding an efficient EMA-based routing adjustment with negligible overhead. Across large-scale pretraining and downstream fine-tuning, $\phi$-balancing consistently outperforms prior Switch-style and loss-free baselines, demonstrating more stable and effective expert utilization.

Comment: Replaces heuristic MoE load balancing with a population-level convex potential objective and mirror-descent routing adjustment.

Topic Match: Best fit is architecture and training dynamics because the paper introduces a principled new routing and stability mechanism for MoE training, not just a systems optimization.

Relevance: 9 Novelty: 8
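
As a rough illustration of the kind of EMA-based routing adjustment the abstract describes, the sketch below keeps an exponential moving average of expert load and moves per-expert routing biases against the gradient of a convex potential. The quadratic potential, the plain gradient step (used here in place of mirror descent), the step sizes, and every name are assumptions made for illustration. The paper's actual potential and update rule may differ.

```python
import random


def route_top1(logits, biases):
    """Pick the expert with the largest biased logit."""
    scores = [l + b for l, b in zip(logits, biases)]
    return max(range(len(scores)), key=scores.__getitem__)


def update_biases(biases, ema_load, batch_counts, decay=0.9, step=0.5):
    """One illustrative balancing step (assumed form, not the paper's)."""
    total = sum(batch_counts)
    n = len(biases)
    # EMA of the routing distribution smooths noisy mini-batch statistics.
    ema_load = [decay * p + (1 - decay) * c / total
                for p, c in zip(ema_load, batch_counts)]
    # Gradient of the assumed potential phi(p) = 0.5 * sum(p_i^2) is p itself.
    grad = ema_load
    mean_grad = sum(grad) / n
    # Lower the bias of over-used experts, raise it for under-used ones.
    biases = [b - step * (g - mean_grad) for b, g in zip(biases, grad)]
    return biases, ema_load


def simulate(num_experts=4, steps=200, batch=256, seed=0):
    """Route noisy tokens through a skewed toy router while balancing."""
    rng = random.Random(seed)
    skew = [1.0 - 0.5 * i for i in range(num_experts)]  # hypothetical router preference
    biases = [0.0] * num_experts
    ema_load = [1.0 / num_experts] * num_experts
    for _ in range(steps):
        counts = [0] * num_experts
        for _ in range(batch):
            logits = [s + rng.gauss(0.0, 1.0) for s in skew]
            counts[route_top1(logits, biases)] += 1
        biases, ema_load = update_biases(biases, ema_load, counts)
    return biases, ema_load
```

Calling `simulate()` returns the final biases and the smoothed load. Under these assumed settings the load should drift toward uniform, with the initially favored expert ending up with the lowest bias.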

4. Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations

ArXiv ID: 2605.15216

Primary Topic: Architecture and Training Dynamics

Also Matches: Efficiency, Compression, and Large-Scale Training

Authors: Arthur Fyon, Julien Brandoit, Loris Mendolia, Damien Ernst, Jean-Michel Redouté, Guillaume Drion

Abstract: Always-on AI applications, from environmental sensors to biomedical implants, require ultra-low power consumption. Analog circuits offer a path to sub-microwatt inference, yet existing analog implementations are limited to feedforward architectures: extending them to recurrent dynamics has been considered impractical due to noise accumulation through temporal feedback. We demonstrate that this barrier can be overcome through hardware-software co-design. Specifically, we identify that Bistable Memory Recurrent Units (BMRUs), a class of Recurrent Neural Networks (RNNs) with discrete-valued outputs and hysteretic dynamics, admit an ultra-low power current-mode analog implementation which we design from first principles. The resulting circuit establishes a one-to-one correspondence between each learned parameter and a circuit element. The discrete outputs suppress analog noise by at least 20-fold at each cell boundary, breaking the noise accumulation that prevents analog recurrence. We reformulate BMRUs for first-quadrant operation with fixed thresholds, enabling the direct correspondence while preserving expressivity and trainability. Transistor-level simulations in 180 nm Complementary Metal-Oxide-Semiconductor (CMOS) show near-perfect agreement between software predictions and circuit-level behavior, with the software model thereby serving as a high-fidelity simulator of the physical hardware at low computational cost. We leverage this fidelity to conduct large-scale noise immunity and power scaling analyses: the power cost of adding recurrence scales linearly with state dimension, while the feedforward layers dominating total power scale quadratically, meaning recurrence is added at linear marginal cost relative to the feedforward backbone. End-to-end keyword spotting achieves sub-microwatt inference at the RNN core.

Comment: Introduces a recurrent analog-computing design where bistable discrete outputs suppress temporal noise accumulation, enabling practical analog recurrence.

Topic Match: The central idea is a new recurrent computational mechanism co-designed with hardware, rather than a generic efficiency tweak.

Relevance: 8 Novelty: 8
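
The noise-suppression argument in the abstract rests on discrete outputs with hysteretic dynamics. The sketch below is a generic two-threshold bistable cell, not the BMRU equations from the paper. The thresholds and function names are hypothetical, and it only shows qualitatively why input noise that stays inside the hysteresis band never reaches the output.

```python
def hysteretic_step(state, x, low=0.3, high=0.7):
    """Two-threshold bistable update with fixed, positive thresholds.

    The stored bit flips only when the input crosses the far threshold.
    """
    if x >= high:
        return 1
    if x <= low:
        return 0
    return state  # inside the hysteresis band: hold the stored bit


def run_cell(inputs, state=0, low=0.3, high=0.7):
    """Feed a sequence through the cell and collect its discrete outputs."""
    outputs = []
    for x in inputs:
        state = hysteretic_step(state, x, low, high)
        outputs.append(state)
    return outputs


clean = [0.1, 0.9, 0.9, 0.9, 0.1, 0.1]
noisy = [0.25, 0.75, 1.05, 0.78, 0.2, -0.05]  # same signal with bounded noise
```

Both `run_cell(clean)` and `run_cell(noisy)` produce the same bit sequence here, because each noisy sample stays on the same side of the relevant threshold as its clean counterpart.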

5. Agentic Discovery of Neural Architectures: AIRA-Compose and AIRA-Design

ArXiv ID: 2605.15871

Primary Topic: Architecture and Training Dynamics

Also Matches: Efficiency, Compression, and Large-Scale Training

Authors: Alberto Pepe, Chien-Yu Lin, Despoina Magka, Bilge Acun, Yannan Nellie Wu, Anton Protopopov, Carole-Jean Wu, Yoram Bachrach

Abstract: Toward recursive self-improvement, we investigate LLM agents autonomously designing foundation models beyond standard Transformers. We introduce a dual-framework approach: AIRA-Compose for high-level architecture search, and AIRA-Design for low-level mechanistic implementation. AIRA-Compose uses 11 agents to explore fundamental computational primitives under a 24-hour budget. Agents evaluate million-parameter candidates, extrapolating top designs to 350M, 1B, and 3B scales. This yields 14 architectures across two families: AIRAformers (Transformer-based) and AIRAhybrids (Transformer-Mamba). Pre-trained at 1B scale, these consistently outperform Llama 3.2 and Composer-found baselines. On downstream tasks, AIRAformer-D and AIRAhybrid-D improve accuracy by 2.4% and 3.8% over Llama 3.2. Furthermore, AIRA-Compose finds models with highly efficient scaling frontiers: AIRAformer-C scales 54% and 71% faster than Llama 3.2 and Composer's best Transformer, while AIRAhybrid-C outscales Nemotron-2 by 23% and Composer's best hybrid by 37%. AIRA-Design tasks 20 agents with writing novel attention mechanisms for long-range dependencies and high-performing training scripts. On the Long Range Arena benchmark, agent-designed architectures reach within 2.3% and 2.6% of human state-of-the-art on document matching and text classification. On the Autoresearch benchmark, Greedy Opus 4.5 achieves 0.968 validation bits-per-byte under a fixed time budget, surpassing the published minimum. Together, these frameworks show AI agents can autonomously discover architectures and algorithmic optimizations matching or surpassing hand-designed baselines. This establishes a powerful paradigm for discovering next-generation foundation models, marking a clear step toward recursive self-improvement.

Comment: Agent-driven discovery of new Transformer and Transformer-Mamba architectures, including attention mechanisms and scaling-efficient designs.

Topic Match: The strongest fit is architecture and training dynamics because the paper's central output is new architecture families and mechanistic design choices beyond standard Transformers.

Relevance: 8 Novelty: 8

Efficiency, Compression, and Large-Scale Training (3)

1. GQLA: Group-Query Latent Attention for Hardware-Adaptive Large Language Model Decoding

ArXiv ID: 2605.15250

Primary Topic: Efficiency, Compression, and Large-Scale Training

Also Matches: Architecture and Training Dynamics

Authors: Fanxu Meng

Abstract: Multi-head Latent Attention (MLA), the attention used in DeepSeek-V2/V3, jointly compresses keys and values into a low-rank latent and matches the H100 roofline almost perfectly. Its trained weights, however, expose only one decoding path - an absorbed MQA form - which ties efficient inference to H100-class compute-bandwidth ratios, forfeits tensor parallelism along the head axis, and yields no Multi-Token Prediction (MTP) gain on commodity inference GPUs such as the export-restricted H20. We propose Group-Query Latent Attention (GQLA), a minimal modification of MLA whose trained weights expose two algebraically equivalent decoding paths over the same parameters: an MQA-absorb path identical to MLA's, and a GQA path with a per-group expanded cache. The runtime picks the path that matches the target hardware - no retraining, no custom kernels - so a single set of GQLA weights pins the rooflines of both H100 (MQA-absorb, s_q=1) and H20 (GQA + MTP, s_q=2), while supporting up to 8-way zero-redundancy tensor parallelism on the GQA path. To avoid pretraining from scratch we extend TransMLA into TransGQLA, which converts a pretrained GQA checkpoint into a GQLA model; on LLaMA-3-8B it compresses the per-token KV cache to 28.125% of the GQA baseline on the MQA-absorb path while structurally preserving GQA-level traffic on the per-group path.

Comment: A dual-path latent attention design that exposes hardware-adaptive MQA and GQA decoding from one parameterization while shrinking KV cache.

Topic Match: Its main contribution is a new attention/KV-cache design for materially better inference efficiency across hardware targets, making efficiency and scaling the best fit.

Relevance: 9 Novelty: 8
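
The 28.125% figure in the abstract can be sanity-checked with a little arithmetic. The sketch below uses LLaMA-3-8B's public attention shape (8 KV heads, head dimension 128). The 576-element latent width (a 512-dimensional compressed latent plus a 64-dimensional rotary key, as in DeepSeek-style MLA) is an assumption chosen because it reproduces the ratio. The abstract does not state it.

```python
from fractions import Fraction

# LLaMA-3-8B attention shape (public model config).
NUM_KV_HEADS = 8
HEAD_DIM = 128

# Assumed latent layout (hypothetical, DeepSeek-style MLA); not given in the abstract.
LATENT_DIM = 512
ROPE_DIM = 64


def gqa_cache_elems(num_kv_heads=NUM_KV_HEADS, head_dim=HEAD_DIM):
    """Keys plus values cached per token per layer under GQA."""
    return 2 * num_kv_heads * head_dim


def latent_cache_elems(latent_dim=LATENT_DIM, rope_dim=ROPE_DIM):
    """Shared latent plus rotary key cached per token per layer."""
    return latent_dim + rope_dim


gqa = gqa_cache_elems()
latent = latent_cache_elems()
ratio = Fraction(latent, gqa)
print(gqa, latent, float(ratio))  # 2048 576 0.28125
```

Because both paths are counted per token and per layer, the ratio does not depend on sequence length or layer count.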

2. A Few GPUs, A Whole Lotta Scale: Faithful LLM Training Emulation with PrismLLM

ArXiv ID: 2605.15617

Primary Topic: Efficiency, Compression, and Large-Scale Training

Authors: Shaoke Xi, ChonLam Lao, Boyi Jia, Jiaqi Gao, Zhipeng Zhang, Jiamin Cao, Brian Sutioso, Erci Xu, Minlan Yu, Kui Ren, Yong Li, Zhengping Qian, Ennan Zhai, Jingren Zhou

Abstract: Large language model (LLM) training today runs on clusters spanning thousands of GPUs. While this scale enables rapid model advances, developing, debugging, and performance-tuning the training framework inevitably becomes complex and costly. This is because engineers often need to reproduce production behaviors to diagnose failures or evaluate optimizations, thereby demanding frequent and even exclusive access to production-scale clusters, which becomes increasingly hard given that the majority of GPUs are already committed to production workloads. Simulation relies on complex performance models that are difficult to maintain, and downscaled experiments often fail to capture scale-dependent behaviors. We present PrismLLM to decouple large-scale execution from the need to access large clusters, enabling engineers to run and observe ranks of interest under faithful large-scale behavior using only a few GPUs. PrismLLM constructs a high-fidelity execution graph via a slicing-based approach that captures computation, communication, and dependencies of the target scale. Then, PrismLLM performs hybrid emulation where selected ranks execute the original program while the remaining ranks are replayed as virtual participants. Experiments on large-scale LLM training workloads show that PrismLLM accurately reproduces performance and memory behavior, achieving only 0.58% average error in iteration time and less than 0.01% error in peak GPU memory usage. PrismLLM can emulate clusters of up to 8192 GPUs using fewer than 1% of the physical GPUs required by the original deployment.

Comment: Faithful hybrid emulation of thousand-GPU LLM training behavior using only a few physical GPUs.

Topic Match: This is directly about large-scale training systems that change how engineers can study and optimize training cost and behavior.

Relevance: 9 Novelty: 8

3. Going Beyond the Edge: Distributed Inference of Transformer Models on Ultra-Low-Power Wireless Devices

ArXiv ID: 2605.15694

Primary Topic: Efficiency, Compression, and Large-Scale Training

Authors: Alexander Gräfe, Ding Huo, Johannes Berger, Marco Zimmerling, Sebastian Trimpe

Abstract: Transformer models are rapidly becoming a cornerstone of modern Internet of Things (IoT) applications, yet their computational and memory demands far exceed the capabilities of a single typical ultra-low-power IoT device. We present CATS, a framework for distributed transformer inference on ultra-low-power wireless devices, enabling multiple devices to collaboratively execute models far larger than what a single device can sustain. At its core, CATS is a communication-aware distributed transformer inference scheme co-designed across transformer partitioning, wireless communication and training. It employs SomeGather, a new pruned communication primitive that selectively broadcasts activation columns to reduce communication bandwidth and RAM usage without sacrificing model accuracy. Building on SomeGather, we design a partitioning method that exploits this primitive for efficient model parallelism. To cope with unreliable wireless communication, CATS employs message-dropout during training, which mimics packet losses and yields models that are robust to message loss during inference. In real-world experiments, we show that CATS brings distributed transformer inference to ultra-low-power wireless devices for the first time, with deployments on up to 16 devices that collaboratively execute transformer models up to 14 times larger than what a single device can run.

Comment: Introduces communication-aware distributed transformer inference with a new pruned communication primitive for ultra-low-power devices.

Topic Match: Best fit is Efficiency, Compression, and Large-Scale Training because the main contribution is a new systems method that changes inference cost and feasibility through partitioning and communication design.

Relevance: 8 Novelty: 8
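
The message-dropout idea in the CATS abstract (mimicking packet loss during training) can be sketched as follows. This is a simplified illustration under stated assumptions: each device's activation slice travels as one message, a lost message is replaced by zeros, and a device's own slice can be dropped like any other. The function name and the zero-fill choice are hypothetical, not taken from the paper.

```python
import random


def gather_with_message_dropout(slices, drop_prob, rng, training=True):
    """Assemble the activation vector a receiving device would see.

    slices[d] is the activation slice sent by device d. In training mode each
    slice is dropped independently with probability drop_prob and replaced by
    zeros (an assumed stand-in for a lost packet). In inference mode every
    slice is passed through unchanged.
    """
    gathered = []
    for slice_ in slices:
        lost = training and rng.random() < drop_prob
        gathered.extend([0.0] * len(slice_) if lost else slice_)
    return gathered


rng = random.Random(0)
slices = [[0.5, -1.0], [2.0, 0.25], [1.5, 3.0]]
train_view = gather_with_message_dropout(slices, drop_prob=0.3, rng=rng)
```

The output length never changes, so downstream layers see a fixed shape whether or not a message was lost.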

Representation Learning Theory and Structure (5)

1. Characterizing Learning in Deep Neural Networks using Tractable Algorithmic Complexity Analysis

ArXiv ID: 2605.15551

Primary Topic: Representation Learning Theory and Structure

Also Matches: Efficiency, Compression, and Large-Scale Training

Authors: Pedram Bakhtiarifard, Sophia N. Wilson, Mahmoud Afifi, Jonathan Wenshøj, Raghavendra Selvan

Abstract: Training large-scale deep neural networks (DNNs) is resource-intensive, making model compression a practical necessity. The widely accepted "learning as compression" hypothesis posits that training induces structure in network weights, which enables compression. Measuring this structure through Kolmogorov-Chaitin-Solomonoff (KCS) complexity is appealing, but existing estimators based on the Coding Theorem Method (CTM) and the Block Decomposition Method (BDM) are limited to small binary objects and do not scale to modern DNNs. We introduce the Quantized Block Decomposition method (QuBD), which extends algorithmic complexity estimation to any $k$-ary object. QuBD first quantizes the network weights to a finite alphabet, then estimates the KCS complexity by aggregating per bit-plane CTM estimates. We show theoretically that QuBD yields a strictly tighter estimation gap with respect to true KCS complexity than binarization-based methods. Using QuBD, we study how the algorithmic complexity of neural network weights evolves during training, showing that it decreases as models learn, scales with data budget, increases during overfitting, follows the delayed generalization observed during grokking, and correlates with generalization performance. We further show that algorithmic information resides predominantly in the most significant bit-planes, which can serve as a practical diagnostic for determining appropriate post-training quantization levels. This work offers novel insights into learning mechanisms in DNNs by providing the first scalable, tractable estimates of KCS complexity for large, non-binary objects such as DNN weights.

Comment: Provides a scalable algorithmic-complexity estimator to analyze how weight structure forms during training and relates to generalization.

Topic Match: Its central contribution is mechanistic analysis of learned structure in network weights and how representation complexity evolves during learning.

Relevance: 9 Novelty: 8
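
The quantize-then-bit-plane pipeline described in the QuBD abstract can be sketched roughly as below. One substitution matters: the per-plane CTM estimate is replaced with zlib-compressed size, a crude compressibility proxy, not an algorithmic-complexity estimator. Uniform quantization, the 4-bit default, and all function names are likewise assumptions for illustration.

```python
import random
import zlib


def quantize(weights, bits):
    """Uniformly quantize floats to integer codes in [0, 2**bits - 1]."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    span = (hi - lo) or 1.0
    return [round((w - lo) / span * levels) for w in weights]


def bit_planes(codes, bits):
    """Split integer codes into bit-planes, most significant plane first."""
    return [[(c >> b) & 1 for c in codes] for b in range(bits - 1, -1, -1)]


def plane_cost(plane):
    """Stand-in proxy: zlib-compressed size in bytes (NOT a CTM estimate)."""
    packed = bytes(
        sum(bit << i for i, bit in enumerate(plane[j:j + 8]))
        for j in range(0, len(plane), 8)
    )
    return len(zlib.compress(packed, 9))


def complexity_profile(weights, bits=4):
    """Per-plane proxy costs and their sum for a flat list of weights."""
    planes = bit_planes(quantize(weights, bits), bits)
    costs = [plane_cost(p) for p in planes]
    return costs, sum(costs)


rng = random.Random(0)
structured = [(i % 16) / 15 for i in range(4096)]   # highly regular toy "weights"
unstructured = [rng.random() for _ in range(4096)]  # no exploitable pattern
```

Comparing `complexity_profile(structured)` with `complexity_profile(unstructured)` shows the regular weights costing far less under this proxy, which is the qualitative behavior the abstract attributes to learned structure.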

2. Entropic Auto-Encoding via Implicit Free-Energy Minimization

ArXiv ID: 2605.16164

Primary Topic: Representation Learning Theory and Structure

Authors: Hazhir Aliahmadi, Irina Babayan, Greg van Anders

Abstract: Despite their ubiquity, variational autoencoders (VAEs) inherently suffer from posterior collapse, a failure mode in which latent variables are effectively ignored. This failure arises because explicit prior imposition drives optimization toward loss landscape regions corresponding to uninformative latent representations. Here, we introduce Entropic Autoencoders (EAEs), a framework in which reconstruction loss is the only explicit objective, and entropy generates the latent variables' prior implicitly through a free energy-minimizing ensemble of encoders. This ensemble biases learning toward high-volume regions of near-optimal solutions, while decoder updates direct the search trajectories toward informative latent representations. We demonstrate that EAEs mitigate posterior collapse by learning non-Gaussian, multimodal latent distributions that yield diverse, data-consistent generations and preserve different forms of underlying structure in the data. As a proof-of-concept, we show that an EAE captures a superposition of the known low-dimensional dynamics of a reaction-diffusion process. Then, we show that an EAE identifies implicit categorical distinctions in MNIST latent representations, and displays a hierarchical understanding of facial structure on the CelebA dataset, from an "all-human" face to individual-dependent features.

Comment: Avoids explicit priors by inducing latent structure through free-energy minimization over an encoder ensemble, mitigating posterior collapse.

Topic Match: The paper is centrally about how informative latent representations arise and how to avoid collapse in autoencoding.

Relevance: 8 Novelty: 8

3. Syntax Without Semantics: Teaching Large Language Models to Code in an Unseen Language

ArXiv ID: 2605.15607

Primary Topic: Representation Learning Theory and Structure

Authors: Vinayshekhar Bannihatti Kumar, Disha Makhija, Manoj Ghuhan Arivazhagan, Rashmi Gangadharaiah

Abstract: Large language models (LLMs) achieve high pass rates on code generation benchmarks, yet whether they can transfer this ability to languages absent from pretraining remains poorly understood. We introduce PyLang, a minimal imperative language absent from all pretraining corpora, and evaluate frontier models zero-shot and fine-tuned Qwen3 (4B, 8B, 32B) on 352 problems. We find that fine-tuning quickly teaches syntax but fails to transfer semantic competence: Python outperforms PyLang by up to 19% across all configurations, and no intervention (multi-task learning, preference tuning, code infilling, or latent-space objectives) closes the gap. An LLM judge reveals that frontier models select an identical algorithm to Python 80% of the time, yet cannot translate it into a working PyLang implementation, and CKA analysis confirms that fine-tuned models converge to nearly identical internal representations across languages (CKA > 0.97) while diverging at the output stage. We term this the implementation fidelity gap: models possess language-agnostic algorithmic understanding but cannot express it in an unfamiliar language. Our findings highlight the need for training methods that decouple reasoning from language-specific realization.

Comment: The paper identifies an implementation fidelity gap: models retain language-agnostic algorithmic representations but fail to realize them in an unseen programming language.

Topic Match: Its main contribution is mechanistic understanding of what internal representations do and do not transfer across languages, fitting representation structure well.

Relevance: 8 Novelty: 8
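
The abstract reports CKA > 0.97 between representations across languages. For reference, here is a minimal dependency-free sketch of linear CKA between two activation matrices (rows are examples, columns are features). The abstract does not say which CKA variant the authors used, so the linear form is an assumption, and the function names are hypothetical.

```python
def _center_columns(m):
    """Subtract each column's mean so features are zero-centered."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def _cross_frobenius_sq(a, b):
    """Squared Frobenius norm of a^T b for row-major matrices a and b."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(x * y for x, y in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered data."""
    x, y = _center_columns(x), _center_columns(y)
    num = _cross_frobenius_sq(x, y)
    den = (_cross_frobenius_sq(x, x) * _cross_frobenius_sq(y, y)) ** 0.5
    return num / den


acts = [[1.0, 2.0], [3.0, 1.0], [0.0, 5.0], [4.0, 4.0]]
rotated = [[-b, a] for a, b in acts]  # same geometry, rotated by 90 degrees
```

Linear CKA is invariant to orthogonal transforms and isotropic scaling, so `linear_cka(acts, rotated)` equals 1 up to floating-point error. This invariance is why the metric suits comparing layers whose neurons are not aligned.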

4. Judge Circuits

ArXiv ID: 2605.16023

Primary Topic: Representation Learning Theory and Structure

Authors: Nils Feldhus, Tanja Baeumel, Elena Golimblevskaia, Qianli Wang, Van Bach Nguyen, Aaron Louis Eidt, Christopher Ebert, Wojciech Samek, Jing Yang, Vera Schmitt, Sebastian Möller, Simon Ostermann

Abstract: LLM-as-a-judge has become the dominant paradigm for grading model outputs at scale, yet the same model assigns systematically different scores when its output format changes (e.g., a 1-5 rating vs. a True/False label). Existing diagnoses of these format-induced inconsistencies stop at the input-output level. Using Position-aware Edge Attribution Patching (PEAP), we causally investigate the internal mechanism in Gemma-3, Qwen2.5, and Llama-3. We find that judgments across structured understanding and open-ended preference tasks share a sparse, generalized Latent Evaluator sub-graph in the mid-to-late multi-layer perceptrons (MLPs); zero-ablating it collapses judgment while preserving world knowledge in architecturally modular models. By structurally decoupling abstract judging from output formatting, we provide a mechanistic account of format-induced inconsistency on the open-weight models we study: a continuous judgment signal computed in the shared trunk is mapped through fragile, format-specific terminal branches, enabling format-independent preference to be isolated downstream of the requested output format. Our findings imply that benchmark-level reliability comparisons across formats are partially measuring formatter geometry rather than evaluation quality.

Comment: Identifies sparse latent evaluator circuits that separate abstract judgment computation from format-specific output branches.

Topic Match: Best fit is representation structure because the paper gives mechanistic insight into how an internal evaluation representation is formed and routed within model circuits.

Relevance: 8 Novelty: 8

5. Bounded-Rationality, Hedging, and Generalization

ArXiv ID: 2605.15340

Primary Topic: Representation Learning Theory and Structure

Authors: Pedro A. Ortega

Abstract: A learner does not only fit data; it also determines how strongly the training sample may shape its output and how much distortion it can hedge. We study this relation as a bounded-rational decision problem whose primitive object is the induced channel from samples to outputs. The learner's response law determines which changes in this channel are cheap or costly, and therefore induces both a lower tradeoff curve between training loss and sample dependence and a matched upper certificate curve. When the response law is represented by an $f$-divergence regularizer, these curves live in the regularizer's native information geometry, with KL as the special case corresponding to Shannon mutual information. We show how the hedge and the two curves can be recovered from black-box behavior by observing responses to scaled losses and local loss perturbations. In learning, population loss is empirical loss plus the distortion induced by the particular training sample. The recovered hedge gives a practical certificate when it covers that distortion. Thus generalization is treated as a testable hedging property of the learner's own response law.

Comment: Recasts generalization as a bounded-rational hedging property of the learner's response law with recoverable certificate curves.

Topic Match: Best fit is Representation Learning Theory and Structure because it contributes theoretical understanding of learning behavior and generalization through an information-geometric lens.

Relevance: 8 Novelty: 8

Memory Structures and Agent Memory Systems (2)

1. Bridging Silicon and the Hippocampus: Algebro-Deterministic Memory "VaCoAl" as a Substrate for Vector-HaSH and TEM

ArXiv ID: 2605.15652

Primary Topic: Memory Structures and Agent Memory Systems

Also Matches: Representation Learning Theory and Structure

Authors: Hiroyuki Chuma, Kanji Otsuka, Yoichi Sato

Abstract: Vector-HaSH and the Tolman-Eichenbaum Machine (TEM) propose that the hippocampal-entorhinal circuit factorizes content from a prestructured grid-cell scaffold and supports compositional memory via ripple-mediated replay. Human iEEG shows that hippocampal sharp-wave ripples (SWRs) gate episodic recall, ripple-locked cortical reactivation recapitulates encoding-time patterns, and multi-hop replay fidelity decays multiplicatively along sequence length. These literatures have advanced in parallel without a shared algebraic object. We show that VaCoAl, an algebro-deterministic hyperdimensional memory architecture built from Galois-field LFSRs, supplies that object. Specifically, deterministic Galois-field diffusion provides a substrate-level alternative to Vector-HaSH's random scaffold-to-hippocampus projection that satisfies the same quasi-orthogonality requirement, with matched second-moment statistics, stronger avalanche behavior, and bit-exact reproducibility. The path-integral Confidence Ratio $CR_2$, the product of per-step $CR_1$ values along an $n$-hop chain, is the natural functional form for multi-hop replay-fidelity decay under conditional independence of per-step reactivation, providing the first algebraically tractable model of reported multiplicative decay. STDP-like path selection in VaCoAl follows from architectural demands (similarity preservation, compositional reversibility, and bounded-frontier search) that also constrain hippocampal computation. We further argue that VaCoAl operating regimes share architectural commitments with the EC-CA3 and EC-DG-CA3 pathways, motivating an energy-capacity-plasticity reading of why both are conserved across >520 Myr of evolution and primate dentate-gyrus elaboration. We prove formal correspondences, derive testable iEEG predictions, and bridge computational neuroscience and hyperdimensional engineering.

Comment: Proposes an algebraically tractable memory substrate linking hyperdimensional memory with hippocampal replay and multi-hop recall decay.

Topic Match: Its core contribution is a new memory mechanism and replay formalism for storage, recall, and compositional episodic memory, making memory systems the clearest fit.

Relevance: 9 Novelty: 8

2. Hidden in Memory: Sleeper Memory Poisoning in LLM Agents

ArXiv ID: 2605.15338

Primary Topic: Memory Structures and Agent Memory Systems

Authors: Sidharth Pulipaka, Stanislau Hlebik, Leonidas Raghav, Sahar Abdelnabi, Vyas Raina, Ivaxi Sheth, Mario Fritz

Abstract: Large language models are increasingly augmented with persistent memory, allowing assistants to store user-specific information across sessions for personalization and continuity. This statefulness introduces a new security risk: adversarial content can corrupt what an assistant remembers and thereby influence future interactions. We propose and study sleeper memory poisoning, a delayed attack in which an adversary manipulates external context, such as a document, webpage, or repository, to cause the assistant to store a fabricated memory about the user. Unlike conventional prompt injection, the attack can remain dormant and re-emerge across multiple later conversations. We evaluate the full attack pipeline: whether poisoned memories are written, later retrieved, and ultimately used to steer the following conversations. Across stateful LLM assistants, poisoned memories were added up to 99.8% on GPT-5.5 and 95% on Kimi-K2.6. Crucially, among successful retrievals, poisoned memories cause attacker-intended agentic actions in 60-89% of evaluations across models. These results show that persistent memory can act as a long-term attack surface across multiple future conversations.

Comment: Shows persistent memory in LLM agents as a distinct long-horizon attack surface by poisoning what gets stored, retrieved, and later acted on.

Topic Match: Best fit is memory_systems because the paper's core contribution is analysis of write-retrieve-use failures in persistent agent memory mechanisms.

Relevance: 8 Novelty: 8

World Models, Exploration, and Open-Ended Reinforcement Learning (4)

1. BAPR: Bayesian amnesic piecewise-robust reinforcement learning for non-stationary continuous control

ArXiv ID: 2605.16170

Primary Topic: World Models, Exploration, and Open-Ended Reinforcement Learning

Authors: Yifan Zhang, Liang Zheng

Abstract: Real-world control systems frequently operate under piecewise stationary conditions, where dynamics remain stable for extended periods before undergoing abrupt regime changes. Standard robust RL methods face a fundamental dilemma: a globally conservative policy wastes performance during stable periods, while a locally adaptive policy risks catastrophic failure when the regime changes undetected. We propose BAPR (Bayesian Amnesic Piecewise-Robust SAC), which unifies Bayesian Online Change Detection (BOCD) with robust ensemble RL. The BAPR operator, a convex combination of mode-conditional Bellman operators weighted by a frozen belief distribution, is a $\gamma$-contraction. A complementary counterexample, machine-verified in Lean 4, establishes a sharp boundary: when beliefs depend on the Q-function, the contraction factor becomes $\gamma + \lambda\Delta$ (where $\Delta$ is the mode reward gap), and contraction fails exactly when $\gamma + \lambda\Delta \geq 1$. We derive a component-wise formal error budget for the abstract operator, with every component machine-verified, bounding post-switch recovery; the budget applies to the abstract mode-mixture operator and carries over to the implemented shared-critic algorithm only through the frozen-parameter design intuition. All results are formally verified with no `sorry` (1,145 lines across 3 Lean 4 files, 22 machine-verified theorems). BOCD drives an adaptive conservatism mechanism: the policy becomes maximally conservative after detected change-points and smoothly relaxes as confidence grows, with detection delay $O(\log(1/\delta))$. A context-conditioning module trained via RMDM loss provides mode-aware representations from simulator-provided mode IDs at training time and requires no mode labels at deployment.
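
The $\gamma$-contraction claim for a frozen belief follows a standard argument, sketched here under stated assumptions rather than as the paper's proof. Assume each mode-conditional Bellman operator $\mathcal{T}_m$ is a $\gamma$-contraction in the sup norm, and that the belief weights $b_m$ are nonnegative, sum to one, and do not depend on $Q$. The symbols $\mathcal{T}_m$, $\mathcal{T}_b$, and $b_m$ are notation introduced for this sketch.

```latex
\begin{aligned}
\|\mathcal{T}_b Q - \mathcal{T}_b Q'\|_\infty
  &= \Big\| \sum_m b_m \big(\mathcal{T}_m Q - \mathcal{T}_m Q'\big) \Big\|_\infty \\
  &\le \sum_m b_m \,\|\mathcal{T}_m Q - \mathcal{T}_m Q'\|_\infty \\
  &\le \gamma \sum_m b_m \,\|Q - Q'\|_\infty
   = \gamma \,\|Q - Q'\|_\infty .
\end{aligned}
```

The first equality relies on the same weights $b_m$ appearing on both sides. If the belief depended on $Q$, the weights would differ between $Q$ and $Q'$ and an extra term would appear, which is the case the abstract's $\gamma + \lambda\Delta$ boundary addresses.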

Comment: Bayesian online change detection is integrated with robust SAC and backed by machine-verified contraction and recovery guarantees for non-stationary control.

Topic Match: Its core contribution is a new RL principle for handling piecewise-stationary environments and post-switch adaptation, which fits foundational continual/non-stationary RL rather than LLM post-training.

Relevance: 9 Novelty: 8

2. Feedback World Model Enables Precise Guidance of Diffusion Policy

ArXiv ID: 2605.15705

Primary Topic: World Models, Exploration, and Open-Ended Reinforcement Learning

Also Matches: Memory Structures and Agent Memory Systems

Authors: Tuo An, Jindou Jia, Gen Li, Jingliang Li, Chuhao Zhou, Pengfei Liu, Bofan Lyu, Jiaqi Bai, Xinying Guo, Geng Li, Jianfei Yang

Abstract: World models aim to improve robotic decision making by predicting the consequences of actions. However, in practice, their predictions often become unreliable once the robot encounters states outside the training distribution, limiting their effectiveness at deployment. We observe that execution itself provides a natural but underutilized signal: after each action, the robot directly observes the true next state, revealing the mismatch between predicted and actual outcomes. Building on this insight, we propose feedback world model, a new paradigm that closes the loop between prediction and observation at inference time. Instead of treating the world model as a static open-loop predictor, our method maintains a lightweight feedback state that is updated online to iteratively correct future predictions, compensating for model errors using real-time observations without additional training data or parameter updates. We show that this process can be interpreted as a latent-space observer and admits convergence guarantees under mild conditions. We further introduce action-aware guidance to better translate corrected predictions into control by emphasizing action-controllable components while suppressing irrelevant variations. Experiments on LIBERO-Plus, Robomimic, and real-world manipulation tasks demonstrate that our method substantially improves both prediction accuracy and policy performance under distribution shift. In particular, it reduces world model prediction error by up to 76.4% and improves out-of-distribution (OOD) success rate by 30%. These results show that incorporating real-time feedback at inference time provides a simple yet powerful alternative to static world modeling.
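
The closed-loop idea can be illustrated with a toy scalar system. This is a minimal sketch of an observer-style correction and not the paper's method: the linear dynamics, the constant unmodeled offset, the additive correction state, and the name `run` are all assumptions made for illustration.

```python
def run(gain, steps=20, a=0.9, offset=0.5, x0=1.0):
    """Return absolute one-step prediction errors of a corrected toy model.

    True dynamics: x' = a * x + offset. The nominal model only knows
    x' = a * x, so `offset` plays the role of an unmodeled shift.
    With gain=0 the model stays open-loop.
    """
    x, correction, errors = x0, 0.0, []
    for _ in range(steps):
        predicted = a * x + correction   # nominal model plus feedback state
        x_next = a * x + offset          # state observed after acting
        residual = x_next - predicted    # prediction/observation mismatch
        errors.append(abs(residual))
        correction += gain * residual    # online update, no retraining
        x = x_next
    return errors


open_loop = run(gain=0.0)
closed_loop = run(gain=0.5)
print(open_loop[-1], closed_loop[-1])
```

In this toy setting the residual shrinks by a factor of `1 - gain` per step, so the correction converges for gains between 0 and 2, while the open-loop error stays at the size of the offset.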

Comment: Introduces inference-time feedback state updates that correct world-model predictions from observed next-state errors under distribution shift.

Topic Match: Best fit is world models and open-ended RL because the core idea is a new closed-loop world-model principle that updates predictive state online from execution feedback.

Relevance: 9 Novelty: 8

3. Imperfect World Models are Exploitable

ArXiv ID: 2605.15960

Primary Topic: World Models, Exploration, and Open-Ended Reinforcement Learning

Authors: Logan Mondal Bhamidipaty (University of Edinburgh), Esmeralda S. Whitammer (University of Edinburgh), David Abel (University of Edinburgh), Mykel J. Kochenderfer (Stanford University), Subramanian Ramamoorthy (University of Edinburgh)

Abstract: We propose a novel definition of model exploitation in reinforcement learning. Informally, a world model is exploitable if it implies that one policy should be strictly preferred over another while the environment's true transition model implies the reverse. We analogize our definition with a prior characterization of reward hacking but show that the associated proof of inevitability does not transfer to exploitation. To overcome this obstruction, we develop a general theory of reward hacking and model exploitation that proves that exploitation is essentially unavoidable on large policy sets and yields the corresponding claim for hacking as a special case. Unfortunately, we also find that the conditions that guarantee unhackability in finite policy sets have no counterpart that precludes exploitation. Consequently, we introduce a relaxed notion of exploitation and derive a safe horizon within which it can be avoided. Taken together, our results establish a formal bridge between reward hacking and model exploitation and elucidate the limits of safe planning in world models.
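
The informal definition can be made concrete with a small check over a finite policy set. This sketch assumes each policy is summarized by one scalar value under the world model and one under the true environment; the paper's formal definition may differ in detail. The function name and the example values are hypothetical.

```python
from itertools import combinations


def exploitable_pairs(model_value, true_value):
    """Find policy pairs whose strict preference order is reversed.

    Returns (p, q) tuples where the model strictly prefers p to q
    while the true environment strictly prefers q to p.
    """
    pairs = []
    for p, q in combinations(model_value, 2):
        model_gap = model_value[p] - model_value[q]
        true_gap = true_value[p] - true_value[q]
        if model_gap * true_gap < 0:  # strict preferences with opposite signs
            pairs.append((p, q) if model_gap > 0 else (q, p))
    return pairs


# Hypothetical values: the model overrates the "risky" policy.
model_value = {"safe": 1.0, "risky": 2.0, "idle": 0.0}
true_value = {"safe": 1.0, "risky": 0.5, "idle": 0.0}
print(exploitable_pairs(model_value, true_value))  # [('risky', 'safe')]
```

A planner that trusts the model here would pick "risky" over "safe", although the true values rank them the other way.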

Comment: A formal theory showing that planning with imperfect world models is intrinsically vulnerable to exploitation.

Topic Match: This is directly about the limits and failure modes of world models for planning, which is central to the target RL/world-model topic.

Relevance: 9 Novelty: 8

4. Deterministic Event-Graph Substrates as World Models for Counterfactual Reasoning

ArXiv ID: 2605.15967

Primary Topic: World Models, Exploration, and Open-Ended Reinforcement Learning

Also Matches: Memory Structures and Agent Memory Systems

Authors: Fabio Rovai

Abstract: We study event-graph substrates: a class of world models that represent agent state as an append-only log of typed RDF triples and answer counterfactual queries by forking the log under a structured intervention vocabulary. Substrates are inspectable at the triple level, support exact counterfactuals, and transfer across domains without learned components. We formalize the class, prove a duality between explanatory and counterfactual queries that reduces both to the same causal-ancestor traversal, and evaluate a 1,400-line CLEVRER-DSL interpreter atop a domain-agnostic substrate runtime at full CLEVRER validation scale (n=75,618). The substrate exceeds the NS-DR symbolic oracle on all four per-question categories (by 9.89, 20.26, 17.65, and 0.80 percentage points), and exceeds the parametric ALOE baseline on descriptive and explanatory while lagging on predictive and counterfactual. We also introduce twin-EventLog, a 500-specification Park-canonical Smallville counterfactual benchmark on which the substrate exceeds Llama-3.1-8B with full context by 18.80 points joint accuracy.
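
An append-only log with causal links and forking can be sketched in a few lines. This is a simplified illustration, not the paper's runtime: the `Event` and `EventLog` classes, the explicit `causes` field, and the rule that a fork drops every event with a removed cause are assumptions made here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    subject: str
    predicate: str
    obj: str
    causes: tuple = ()  # indices of earlier events this one depends on


@dataclass
class EventLog:
    events: list = field(default_factory=list)

    def append(self, event):
        """Append an event and return its index; entries are never edited."""
        self.events.append(event)
        return len(self.events) - 1

    def ancestors(self, index):
        """Indices of all causal ancestors of the event at `index`."""
        seen, stack = set(), list(self.events[index].causes)
        while stack:
            i = stack.pop()
            if i not in seen:
                seen.add(i)
                stack.extend(self.events[i].causes)
        return seen

    def fork_without(self, removed):
        """Counterfactual copy: drop `removed` and everything that depends on it."""
        fork, remap = EventLog(), {}
        for i, ev in enumerate(self.events):
            if i == removed or any(c not in remap for c in ev.causes):
                continue
            new_causes = tuple(remap[c] for c in ev.causes)
            remap[i] = fork.append(Event(ev.subject, ev.predicate, ev.obj, new_causes))
        return fork


log = EventLog()
a = log.append(Event("ball", "moves", "right"))
b = log.append(Event("cube", "rests", "table"))
c = log.append(Event("ball", "hits", "cube", causes=(a, b)))
d = log.append(Event("cube", "moves", "right", causes=(c,)))

print(log.ancestors(d))                  # why did the cube move?
print(len(log.fork_without(a).events))   # what if the ball had not moved?
```

In this sketch the explanatory query and the counterfactual fork both walk the same `causes` links, one upward from an effect and one downward from a removed cause, and the original log is left unchanged by the fork.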

Comment: An append-only event-graph substrate answers exact counterfactual queries by log forking, framing a symbolic world model for explanatory and counterfactual reasoning.

Topic Match: Its strongest match is world models because it proposes an explicit, inspectable substrate for modeling state and counterfactual dynamics across domains.

Relevance: 8 Novelty: 8

Paper Selection Prompt

System Prompt

You are a helpful paper reading assistant whose job is to read daily posts from ArXiv and identify a few papers that your friend will enjoy reading. Your job is to carefully read the paper titles and abstracts below and find the ones that match the criteria below.

User Prompt

Relevant Topics

Focus on specialized foundational research that remains worth reading even when it is not a daily hotspot.

Do not keep papers only because they are broadly frontier-relevant, widely discussed, or part of a major launch cycle. Broad daily frontier movement belongs in the hotspot digest unless the core contribution strongly matches the specialized topics below.

Architecture and Training Dynamics

- Keep: work that introduces or analyzes core architectural or computational mechanisms such as MoE routing, attention variants, normalization or residual design, recurrent or state-space sequence modeling, dynamic or modular computation, or training-stability mechanisms.
- Filter: papers that mainly apply an existing architecture to a new task or benchmark without new mechanistic insight.

Efficiency, Compression, and Large-Scale Training

- Keep: quantization, sparsity, pruning, low-rank adaptation, KV-cache or cache design, memory-efficient inference or training, distributed training algorithms, communication or optimizer improvements, and training-system designs that materially change large-model training cost or behavior.
- Filter: routine infrastructure optimization, deployment work, or straightforward tuning of standard efficiency methods without a clear new algorithmic or systems idea.

Representation Learning Theory and Structure

- Keep: work on feature formation, sparse or dictionary learning, contrastive or self-supervised representation structure, training dynamics, identifiability, or other mechanistic understanding of learned representations.
- Filter: papers that use representation-learning methods as standard components in downstream applications without new theoretical or methodological content.

Memory Structures and Agent Memory Systems

- Keep: internal or external memory mechanisms, differentiable memory, recurrent or latent memory, long-context memory organization, memory compression or eviction, retrieval as a learned memory mechanism, episodic or semantic memory for agents, memory consolidation, forgetting, and agent memory systems whose core contribution is a new principle for storing, updating, recalling, or reasoning over memory.
- Filter: standard RAG pipelines, vector-database plumbing, context stuffing, chat-history management, or agent products that add memory without a new memory mechanism, learning principle, or analysis.

World Models, Exploration, and Open-Ended Reinforcement Learning

- Keep: model-based RL, action-conditioned world models, imagination or planning-based agents, open-ended exploration, automatic curriculum or environment generation, continual RL, reward-free skill discovery, and RL methods aimed at learning new behaviors or transferable knowledge through interaction. Also keep foundational work on pre-training agents or world models, foundation world models, generative interactive environments, or theoretical arguments about why world models or exploration are necessary for general-purpose agents.
- Filter: RLHF, DPO, GRPO, RFT, instruction-following or alignment fine-tuning for LLMs; papers where RL is mainly a post-training optimizer for language models, reasoning traces, or tool-use agents without a new world-model, exploration, or generalization contribution; routine benchmark gains on a fixed environment without a new learning principle.

Usually leave these to the hotspot digest unless the core contribution is clearly foundational:

- major model or product releases
- broadly trendy agent or tooling launches
- benchmark, leaderboard, or evaluation-only papers
- downstream applications in medical imaging, segmentation, 3D vision, video understanding, information retrieval, summarization, recommendation, machine translation, speech recognition, time series, knowledge graphs, and similar domains

Scoring Criteria

Relevance and Novelty are independent axes. Score both from 1 to 10.

Relevance Scoring

9-10: directly centered on the target foundational topics; highest when the core contribution is clearly within them.

7-8: substantially related, but partly peripheral or focused on a narrower aspect.

5-6: touches the target topics, but the main contribution is elsewhere.

3-4: largely outside the target topics, often application-focused or domain-specific.

1-2: unrelated.

Important: Broad frontier relevance, major launch status, or daily buzz is not enough for a high Relevance score here. Those cases belong in the hotspot digest unless the paper strongly matches the specialized paper topics.

Novelty Scoring

9-10: new paradigm, theory, or major methodological breakthrough.

7-8: substantial methodological advance or strong new insight.

5-6: meaningful but incremental extension or refinement.

3-4: minor, narrow, or mostly engineering or domain-specific improvement.

1-2: little originality; mainly standard application of existing methods.

Topic Registry

Use exactly one PRIMARY_TOPIC_ID chosen from the stable topic IDs below.

- architecture_training: Architecture and Training Dynamics - Core architectural or computational mechanisms, dynamic computation, and training-stability dynamics.
- efficiency_scaling: Efficiency, Compression, and Large-Scale Training - Compression, sparsity, memory or cache efficiency, and large-scale training systems that materially change cost or behavior.
- representation_structure: Representation Learning Theory and Structure - How learned representations form, organize, and support generalization or mechanistic understanding.
- memory_systems: Memory Structures and Agent Memory Systems - Internal or external memory mechanisms, learned retrieval memory, consolidation, forgetting, and agent memory systems.
- world_models_open_ended_rl: World Models, Exploration, and Open-Ended Reinforcement Learning - World models, model-based RL, exploration, continual learning, and RL for transferable knowledge acquisition rather than LLM post-training.

Papers

[PAPER LIST HERE]

Instructions

Respond in JSONL. Output exactly one JSON object per paper, one per line:

{"ARXIVID":"...","COMMENT":"...","RELEVANCE":0,"NOVELTY":0,"PRIMARY_TOPIC_ID":"...","MATCHED_TOPIC_IDS":[],"TOPIC_MATCH_COMMENT":"...","HOTSPOT_PAPER_TAGS":[],"HOTSPOT_PAPER_COMMENT":"..."}

Rules:

- ARXIVID: the arXiv ID.
- COMMENT: identify the single strongest matching criterion. Be brief and specific. Do not rely on generic phrases like "language modeling" or "advancement". Do not mention non-matching criteria.
- RELEVANCE: integer from 1 to 10.
- NOVELTY: integer from 1 to 10.
- PRIMARY_TOPIC_ID: exactly one stable topic ID from the allowed topic registry.
- MATCHED_TOPIC_IDS: zero or more stable topic IDs from the same allowed set. Include PRIMARY_TOPIC_ID when there are multiple matches.
- TOPIC_MATCH_COMMENT: briefly explain why the primary topic is the best fit.
- HOTSPOT_PAPER_TAGS: zero or more tags from this exact set only: daily_hot, new_frontier.
- HOTSPOT_PAPER_COMMENT: briefly explain why the paper belongs in the daily hotspot paper feed when HOTSPOT_PAPER_TAGS is non-empty; otherwise use an empty string.
- Use HOTSPOT_PAPER_TAGS sparingly. Most papers should return [].
- daily_hot means the paper feels broadly important to the day and belongs in the daily hotspot paper section even if it is not part of the personalized foundational reading list.
- new_frontier means the paper appears to open a genuinely new direction, paradigm, or field, even if the work is still early.
- Do not output markdown, code fences, or any extra text.
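
A consumer of these records might check each output line against the rules before using it. The following is a minimal sketch of such a check, not part of the prompt or of any existing pipeline. The name `validate_record` and the error messages are hypothetical; the allowed topic IDs and tags are copied from the registry and rules in the prompt.

```python
import json

# Allowed values copied from the topic registry and rules in the prompt.
TOPIC_IDS = {
    "architecture_training", "efficiency_scaling", "representation_structure",
    "memory_systems", "world_models_open_ended_rl",
}
HOTSPOT_TAGS = {"daily_hot", "new_frontier"}
REQUIRED_KEYS = {
    "ARXIVID", "COMMENT", "RELEVANCE", "NOVELTY", "PRIMARY_TOPIC_ID",
    "MATCHED_TOPIC_IDS", "TOPIC_MATCH_COMMENT", "HOTSPOT_PAPER_TAGS",
    "HOTSPOT_PAPER_COMMENT",
}


def _all_in(values, allowed):
    """True if `values` is a list of strings drawn from `allowed`."""
    return isinstance(values, list) and all(
        isinstance(v, str) and v in allowed for v in values
    )


def validate_record(line):
    """Return a list of rule violations for one JSONL line (empty if none)."""
    try:
        rec = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(rec, dict):
        return ["record is not a JSON object"]

    problems = [f"missing key {k}" for k in sorted(REQUIRED_KEYS - set(rec))]
    for key in ("RELEVANCE", "NOVELTY"):
        value = rec.get(key)
        if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 10:
            problems.append(f"{key} must be an integer from 1 to 10")

    primary = rec.get("PRIMARY_TOPIC_ID")
    if not isinstance(primary, str) or primary not in TOPIC_IDS:
        problems.append("PRIMARY_TOPIC_ID is not a registered topic ID")

    matched = rec.get("MATCHED_TOPIC_IDS")
    if not _all_in(matched, TOPIC_IDS):
        problems.append("MATCHED_TOPIC_IDS must be a list of registered topic IDs")
    elif len(matched) > 1 and primary not in matched:
        problems.append("MATCHED_TOPIC_IDS must include PRIMARY_TOPIC_ID")

    tags = rec.get("HOTSPOT_PAPER_TAGS")
    if not _all_in(tags, HOTSPOT_TAGS):
        problems.append("HOTSPOT_PAPER_TAGS must use only daily_hot or new_frontier")
    elif not tags and rec.get("HOTSPOT_PAPER_COMMENT") != "":
        problems.append("HOTSPOT_PAPER_COMMENT must be empty when no tags are set")
    return problems
```

Running `validate_record` over each line of a response and discarding or retrying lines with a non-empty result is one way to keep malformed records out of the digest.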