Updated on 2026.05.06

Usage instructions: here

world model

Publish Date Title Authors PDF Code
2026-05-05 Implementing True MPI Sessions and Evaluating MPI Initialization Scalability Hui Zhou et.al. 2605.03983 null
2026-05-05 A Benchmark for Interactive World Models with a Unified Action Generation Framework Jianjie Fang et.al. 2605.03941 null
2026-05-05 RoboAlign-R1: Distilled Multimodal Reward Alignment for Robot Video World Models Hao Wu et.al. 2605.03821 null
2026-05-05 What You Think is What You See: Driving Exploration in VLM Agents via Visual-Linguistic Curiosity Haoxi Li et.al. 2605.03782 null
2026-05-05 AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics Tencent HY Team et.al. 2605.03652 null
2026-05-05 Learning to Theorize the World from Observation Doojin Baek et.al. 2605.03413 null
2026-05-04 Existence, Asymptotic Behavior, and Numerical Analysis of a Generalized Abel Differential Equation with Applications in Financial Modeling Dragos-Patru Covei et.al. 2605.02831 null
2026-05-04 DynoSLAM: Dynamic SLAM with Generative Graph Neural Networks for Real-World Social Navigation Danil Tokhchukov et.al. 2605.02759 null
2026-05-04 Shadow-Loom: Causal Reasoning over Graphical World Model of Narratives David Wilmot et.al. 2605.02475 null
2026-05-04 Video Generation with Predictive Latents Yian Zhao et.al. 2605.02134 null
2026-05-03 TRAP: Tail-aware Ranking Attack for World-Model Planning Siyuan Duan et.al. 2605.01950 null
2026-05-03 Divide and Conquer: Decoupled Representation Alignment for Multimodal World Models Junyuan Xiao et.al. 2605.01896 null
2026-05-03 Embody4D: A Generalist 4D World Model for Embodied AI Peiyan Tu et.al. 2605.01799 null
2026-05-03 SignVerse-2M: A Two-Million-Clip Pose-Native Universe of 25+ Sign Languages Sen Fang et.al. 2605.01720 null
2026-05-03 Latent State Design for World Models under Sufficiency Constraints Keon Woo Kim et.al. 2605.01694 null
2026-05-03 Video Active Perception: Effective Inference-Time Long-Form Video Understanding with Vision-Language Models Martin Q. Ma et.al. 2605.01662 null
2026-05-01 Physically Native World Models: A Hamiltonian Perspective on Generative World Modeling Sen Cui et.al. 2605.00412 null
2026-04-30 World Model for Robot Learning: A Comprehensive Survey Bohan Hou et.al. 2605.00080 null
2026-04-30 Being-H0.7: A Latent World-Action Model from Egocentric Videos Hao Luo et.al. 2605.00078 null
2026-04-30 HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation Xin Zhou et.al. 2604.28196 null
2026-04-30 LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models Hao Chen et.al. 2604.28192 null
2026-04-30 Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling Keming Wu et.al. 2604.28185 null
2026-04-30 Beyond Gaussian Bottlenecks: Topologically Aligned Encoding of Vision-Transformer Feature Spaces Andrew Bond et.al. 2604.28122 null
2026-04-30 Dreaming Across Towns: Semantic Rollout and Town-Adversarial Regularization for Zero-Shot Held-Out-Town Fixed-Route Driving in CARLA Feeza Khan Khanzada et.al. 2604.27994 null
2026-04-30 GUI Agents with Reinforcement Learning: Toward Digital Inhabitants Junan Hu et.al. 2604.27955 null
2026-04-30 Flying by Inference: Active Inference World Models for Adaptive UAV Swarms Kaleem Arshid et.al. 2604.27935 null
2026-04-30 Simulating clinical interventions with a generative multimodal model of human physiology Guy Lutsker et.al. 2604.27899 null
2026-04-30 Graph World Models: Concepts, Taxonomy, and Future Directions Jiawei Liu et.al. 2604.27895 null
2026-04-30 MotuBrain: An Advanced World Action Model for Robot Control MotuBrain Team et.al. 2604.27792 null
2026-04-29 World2VLM: Distilling World Model Imagination into VLMs for Dynamic Spatial Reasoning Wanyue Zhang et.al. 2604.26934 null
2026-04-29 STARRY: Spatial-Temporal Action-Centric World Modeling for Robotic Manipulation Yuxuan Tian et.al. 2604.26848 null
2026-04-29 Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising Jun Guo et.al. 2604.26694 null
2026-04-29 AGEL-Comp: A Neuro-Symbolic Framework for Compositional Generalization in Interactive Agents Mahnoor Shahid et.al. 2604.26522 null
2026-04-29 DepthPilot: From Controllability to Interpretability in Colonoscopy Video Generation Junhu Fu et.al. 2604.26232 null
2026-04-28 Lifting Embodied World Models for Planning and Control Alex N. Wang et.al. 2604.26182 null
2026-04-28 HuM-Eval: A Coarse-to-Fine Framework for Human-Centric Video Evaluation Bingzi Zhang et.al. 2604.25361 null
2026-04-28 ProDrive: Proactive Planning for Autonomous Driving via Ego-Environment Co-Evolution Chuyao Fu et.al. 2604.25329 null
2026-04-27 Unfolding an Atomistic World: Atomistic Simulation of Reactor Pressure Vessel Steel Across Year-and-Meter Scales Haozhi Han et.al. 2604.24091 null
2026-04-26 From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation Jiafeng Wu et.al. 2604.23629 null
2026-04-26 Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling Zhen Ye et.al. 2604.23586 null
2026-04-26 Emotion-Conditioned Short-Horizon Human Pose Forecasting with a Lightweight Predictive World Model Jingni Huang et.al. 2604.23532 null
2026-04-25 Active Inference: A method for Phenotyping Agency in AI systems? Philip Wilson et.al. 2604.23278 null
2026-04-24 Beyond Single-Agent Alignment: Preventing Context-Fragmented Violations in Multi-Agent Systems Jie Wu et.al. 2604.22879 null
2026-04-24 Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Meng Chu et.al. 2604.22748 null
2026-04-24 Beyond Patient Invariance: Learning Cardiac Dynamics via Action-Conditioned JEPAs Jose Geraldo Fernandes et.al. 2604.22618 null
2026-04-24 Video Analysis and Generation via a Semantic Progress Function Gal Metzer et.al. 2604.22554 null
2026-04-24 OccDirector: Language-Guided Behavior and Interaction Generation in 4D Occupancy Space Zhuding Liang et.al. 2604.22240 null
2026-04-24 A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism, Governance, and Dynamics in Complex Societies Somyajit Chakraborty et.al. 2604.22227 null
2026-04-24 dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model Yaxuan Li et.al. 2604.22152 null
2026-04-23 Causality and Semantic Separation Anna Zhang et.al. 2604.22041 null
2026-04-23 Seeing Fast and Slow: Learning the Flow of Time in Videos Yen-Siang Wu et.al. 2604.21931 null
2026-04-23 Machine Behavior in Relational Moral Dilemmas: Moral Rightness, Predicted Human Behavior, and Model Decisions Jiseon Kim et.al. 2604.21871 null
2026-04-23 Hi-WM: Human-in-the-World-Model for Scalable Robot Post-Training Yaxuan Li et.al. 2604.21741 null
2026-04-22 Building a Precise Video Language with Human-AI Oversight Zhiqiu Lin et.al. 2604.21718 null
2026-04-23 WorldMark: A Unified Benchmark Suite for Interactive Video World Models Xiaojie Xu et.al. 2604.21686 null
2026-04-22 Agentic AI for Personalized Physiotherapy: A Multi-Agent Framework for Generative Video Training and Real-Time Pose Correction Abhishek Dharmaratnakar et.al. 2604.21154 null
2026-04-22 Open-H-Embodiment: A Large-Scale Dataset for Enabling Foundation Models in Medical Robotics Open-H-Embodiment Consortium et.al. 2604.21017 null
2026-04-22 DeVI: Physics-based Dexterous Human-Object Interaction via Synthetic Video Imitation Hyeonwoo Kim et.al. 2604.20841 null
2026-04-22 Occupancy Reward Shaping: Improving Credit Assignment for Offline Goal-Conditioned Reinforcement Learning Aravind Venugopal et.al. 2604.20627 null
2026-04-22 CCTVBench: Contrastive Consistency Traffic VideoQA Benchmark for Multimodal LLMs Xingcheng Zhou et.al. 2604.20460 null
2026-04-22 X-Cache: Cross-Chunk Block Caching for Few-Step Autoregressive World Models Inference Yixiao Zeng et.al. 2604.20289 null
2026-04-22 Cortex 2.0: Grounding World Models in Real-World Industrial Deployment Adriana Aida et.al. 2604.20246 null
2026-04-22 Toward Safe Autonomous Robotic Endovascular Interventions using World Models Harry Robertshaw et.al. 2604.20151 null
2026-04-21 ChipCraftBrain: Validation-First RTL Generation via Multi-Agent Orchestration Cagri Eryilmaz et.al. 2604.19856 null
2026-04-21 CityRAG: Stepping Into a City via Spatially-Grounded Video Generation Gene Chou et.al. 2604.19741 null
2026-04-21 UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling Boyu Chen et.al. 2604.19734 null
2026-04-22 Mask World Model: Predicting What Matters for Robust Robot Policy Learning Yunfan Lou et.al. 2604.19683 null
2026-04-21 Safety-Critical Contextual Control via Online Riemannian Optimization with World Models Tongxin Li et.al. 2604.19639 null
2026-04-21 LASER: Learning Active Sensing for Continuum Field Reconstruction Huayu Deng et.al. 2604.19355 null
2026-04-21 RoboWM-Bench: A Benchmark for Evaluating World Models in Robotic Manipulation Feng Jiang et.al. 2604.19092 null
2026-04-20 Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training Vin Bhaskara et.al. 2604.18701 null
2026-04-21 MultiWorld: Scalable Multi-Agent Multi-View Video World Models Haoyu Wu et.al. 2604.18564 null
2026-04-20 OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation Jinghui Lu et.al. 2604.18486 null
2026-04-20 Sonata: A Hybrid World Model for Inertial Kinematics under Clinical Data Scarcity Blaise Delaney et.al. 2604.18058 null
2026-04-20 The Umwelt Representation Hypothesis: Rethinking Universality Victoria Bosch et.al. 2604.17960 null
2026-04-20 Scaling Human-AI Coding Collaboration Requires a Governable Consensus Layer Tianfu Wang et.al. 2604.17883 null
2026-04-19 Infrastructure-Centric World Models: Bridging Temporal Depth and Spatial Breadth for Roadside Perception Siyuan Meng et.al. 2604.17651 null
2026-04-19 Dual-Anchoring: Addressing State Drift in Vision-Language Navigation Kangyi Wu et.al. 2604.17473 null
2026-04-19 Long-CODE: Isolating Pure Long-Context as an Orthogonal Dimension in Video Evaluation Zhijiang Tang et.al. 2604.17428 null
2026-04-19 DreamShot: Personalized Storyboard Synthesis with Video Diffusion Prior Junjia Huang et.al. 2604.17195 null
2026-04-18 TensorHub: Rethinking AI Model Hub with Tensor-Centric Compression Tingfeng Lan et.al. 2604.17104 null
2026-04-18 LIVE: Leveraging Image Manipulation Priors for Instruction-based Video Editing Weicheng Wang et.al. 2604.17021 null
2026-04-18 SafeDream: Safety World Model for Proactive Early Jailbreak Detection Bo Yan et.al. 2604.16824 null
2026-04-16 POMDP-based Object Search with Growing State Space and Hybrid Action Domain Yongbo Chen et.al. 2604.14965 null
2026-04-16 Learning Ad Hoc Network Dynamics via Graph-Structured World Models Can Karacelebi et.al. 2604.14811 null
2026-04-16 World-Value-Action Model: Implicit Planning for Vision-Language-Action Systems Runze Li et.al. 2604.14732 null
2026-04-15 HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Team HY-World et.al. 2604.14268 null
2026-04-15 Seedance 2.0: Advancing Video Generation for World Complexity Team Seedance et.al. 2604.14148 null
2026-04-15 Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective Weijie Wang et.al. 2604.14025 null
2026-04-15 Beyond State Consistency: Behavior Consistency in Text-Based World Models Youling Huang et.al. 2604.13824 null
2026-04-15 Vision-and-Language Navigation for UAVs: Progress, Challenges, and a Research Roadmap Hanxuan Chen et.al. 2604.13654 null
2026-04-15 DiT as Real-Time Rerenderer: Streaming Video Stylization with Autoregressive Diffusion Transformer Hengye Lyu et.al. 2604.13509 null
2026-04-15 VibeFlow: Versatile Video Chroma-Lux Editing through Self-Supervised Learning Yifan Li et.al. 2604.13425 null
2026-04-14 Robotic Manipulation is Vision-to-Geometry Mapping ($f(v) \rightarrow G$): Vision-Geometry Backbones over Language and Video Models Zijian Song et.al. 2604.12908 null
2026-04-14 ArtifactWorld: Scaling 3D Gaussian Splatting Artifact Restoration via Video Generation Models Xinliang Wang et.al. 2604.12251 null
2026-04-13 Grounded World Model for Semantically Generalizable Planning Quanyi Li et.al. 2604.11751 null
2026-04-13 Dyadic Partnership(DP): A Missing Link Towards Full Autonomy in Medical Robotics Nassir Navab et.al. 2604.11423 null
2026-04-13 ComSim: Building Scalable Real-World Robot Data Generation via Compositional Simulation Yiran Qin et.al. 2604.11386 null
2026-04-13 WM-DAgger: Enabling Efficient Data Aggregation for Imitation Learning with World Models Anlan Yu et.al. 2604.11351 null
2026-04-13 3D-Anchored Lookahead Planning for Persistent Robotic Scene Memory via World-Model-Based MCTS Bronislav Sidik et.al. 2604.11302 null
2026-04-13 AIM: Intent-Aware Unified world action Modeling with Spatial Value Maps Liaoyuan Fan et.al. 2604.11135 null
2026-04-13 From Topology to Trajectory: LLM-Driven World Models For Supply Chain Resilience Jia Luo et.al. 2604.11041 null
2026-04-13 OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models Xiaomeng Hu et.al. 2604.10866 null
2026-04-12 Do LLMs Build Spatial World Models? Evidence from Grid-World Maze Tasks Weijiang Li et.al. 2604.10690 null
2026-04-11 Zero-shot World Models Are Developmentally Efficient Learners Khai Loong Aw et.al. 2604.10333 null
2026-04-11 VGA-Bench: A Unified Benchmark and Multi-Model Framework for Video Aesthetics and Generation Quality Evaluation Longteng Jiang et.al. 2604.10127 null
2026-04-10 EgoTL: Egocentric Think-Aloud Chains for Long-Horizon Tasks Lulin Liu et.al. 2604.09535 null
2026-04-10 Toward World Models for Epidemiology Zeeshan Memon et.al. 2604.09519 null
2026-04-10 PhysInOne: Visual Physics Learning and Reasoning in One Suite Siyuan Zhou et.al. 2604.09415 null
2026-04-10 VAG: Dual-Stream Video-Action Generation for Embodied Data Synthesis Xiaolei Lang et.al. 2604.09330 null
2026-04-10 Learning Vision-Language-Action World Models for Autonomous Driving Guoqing Wang et.al. 2604.09059 null
2026-04-10 Advantage-Guided Diffusion for Model-Based Reinforcement Learning Daniele Foffano et.al. 2604.09035 null
2026-04-10 Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory Zile Wang et.al. 2604.08995 null
2026-04-10 WOMBET: World Model-based Experience Transfer for Robust and Sample-efficient Reinforcement Learning Mintae Kim et.al. 2604.08958 null
2026-04-10 Multi-Agent Decision-Focused Learning via Value-Aware Sequential Communication Benjamin Amoh et.al. 2604.08944 null
2026-04-09 Toward Hardware-Agnostic Quadrupedal World Models via Morphology Conditioning Mohamad H. Danesh et.al. 2604.08780 null
2026-04-09 Phantom: Physics-Infused Video Generation via Joint Modeling of Visual and Latent Physical Dynamics Ying Shen et.al. 2604.08503 null
2026-04-09 Grounding Clinical AI Competency in Human Cognition Through the Clinical World Model and Skill-Mix Framework Seyed Amir Ahmad Safavi-Naini et.al. 2604.08226 null
2026-04-09 Beyond Static Forecasting: Unleashing the Power of World Models for Mobile Traffic Extrapolation Xiaoqian Qi et.al. 2604.08199 null
2026-04-09 ViVa: A Video-Generative Value Model for Robot Reinforcement Learning Jindi Lv et.al. 2604.08168 null
2026-04-09 MotionScape: A Large-Scale Real-World Highly Dynamic UAV Video Dataset for World Models Zile Guo et.al. 2604.07991 null
2026-04-09 WorldMAP: Bootstrapping Vision-Language Navigation Trajectory Prediction with Generative World Models Hongjin Chen et.al. 2604.07957 null
2026-04-09 DailyArt: Discovering Articulation from Single Static Images via Latent Dynamics Hang Zhang et.al. 2604.07758 null
2026-04-09 CausalVAE as a Plug-in for World Models: Towards Reliable Counterfactual Dynamics Ziyi Ding et.al. 2604.07712 null
2026-04-08 Grasp as You Dream: Imitating Functional Grasping from Generated Human Demonstrations Chao Tang et.al. 2604.07517 null
2026-04-08 GIRL: Generative Imagination Reinforcement Learning via Information-Theoretic Hallucination Control Prakul Sunil Hiremath et.al. 2604.07426 null
2026-04-08 How Much LLM Does a Self-Revising Agent Actually Need? Seongwoo Jeong et.al. 2604.07236 null
2026-04-08 PhyEdit: Towards Real-World Object Manipulation via Physically-Grounded Image Editing Ruihang Xu et.al. 2604.07230 null
2026-04-08 INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling InSpatio Team et.al. 2604.07209 null
2026-04-08 Radio-Frequency Inverse Rendering for Wireless Environment Modeling Fuhai Wang et.al. 2604.07086 null
2026-04-08 Telecom World Models: Unifying Digital Twins, Foundation Models, and Predictive Planning for 6G Hang Zou et.al. 2604.06882 null
2026-04-08 The Rhetoric of Machine Learning Robert C. Williamson et.al. 2604.06754 null
2026-04-08 Controllable Generative Video Compression Ding Ding et.al. 2604.06655 null
2026-04-07 Neural Computers Mingchen Zhuge et.al. 2604.06425 null
2026-04-07 Evolution of Video Generative Foundations Teng Hu et.al. 2604.06339 null
2026-04-07 Action Images: End-to-End Policy Learning via Multiview Video Generation Haoyu Zhen et.al. 2604.06168 null
2026-04-07 Toward Consistent World Models with Multi-Token Prediction and Latent Semantic Enhancement Qimin Zhong et.al. 2604.06155 null
2026-04-07 SEM-ROVER: Semantic Voxel-Guided Diffusion for Large-Scale Driving Scene Generation Hiba Dahmani et.al. 2604.06113 null
2026-04-06 Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Chaoyou Fu et.al. 2604.05015 null
2026-04-06 StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing StarVLA Community et.al. 2604.05014 null
2026-04-06 A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens Tommie Kerssies et.al. 2604.04913 null
2026-04-06 Individual and Combined Effects of English as a Second Language and Typos on LLM Performance Serena Liu et.al. 2604.04723 null
2026-04-06 OpenWorldLib: A Unified Codebase and Definition of Advanced World Models DataFlow Team et.al. 2604.04707 null
2026-04-06 Preserving Forgery Artifacts: AI-Generated Video Detection at Native Scale Zhengcen Li et.al. 2604.04634 null
2026-04-06 Veo-Act: How Far Can Frontier Video Models Advance Generalizable Robot Manipulation? Zhongru Zhang et.al. 2604.04502 null
2026-04-06 UENR-600K: A Large-Scale Physically Grounded Dataset for Nighttime Video Deraining Pei Yang et.al. 2604.04402 null
2026-04-05 DriveVA: Video Action Models are Zero-Shot Drivers Mengmeng Liu et.al. 2604.04198 null
2026-04-05 ATSS: Detecting AI-Generated Videos via Anomalous Temporal Self-Similarity Hang Wang et.al. 2604.04029 null
2026-04-04 Rethinking Position Embedding as a Context Controller for Multi-Reference and Multi-Shot Video Generation Binyuan Huang et.al. 2604.03738 null
2026-04-04 VidNum-1.4K: A Comprehensive Benchmark for Video-based Numerical Reasoning Shaoyang Cui et.al. 2604.03701 null

embodied AI

Publish Date Title Authors PDF Code
2026-05-04 Channel-Level Relation to Attentive Aggregation with Neighborhood-Homogeneity Constraint for Point Cloud Analysis Jiaqi Shi et.al. 2605.02357 null
2026-05-03 Embody4D: A Generalist 4D World Model for Embodied AI Peiyan Tu et.al. 2605.01799 null
2026-05-02 ESARBench: A Benchmark for Agentic UAV Embodied Search and Rescue Daoxuan Zhang et.al. 2605.01371 null
2026-05-02 VUDA: Breaking CUDA-Vulkan Isolation for Spatial Sharing of Compute and Graphics on the Same GPU Bin Xu et.al. 2605.01352 null
2026-05-01 Split and Aggregation Learning for Foundation Models Over Mobile Embodied AI Network (MEAN): A Comprehensive Survey Qianzhou Chen et.al. 2605.00970 null
2026-05-01 Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning Chengshuai Shi et.al. 2605.00347 null
2026-04-30 World Model for Robot Learning: A Comprehensive Survey Bohan Hou et.al. 2605.00080 null
2026-04-30 Bridging Values and Behavior: A Hierarchical Framework for Proactive Embodied Agents Chunhui Zhang et.al. 2604.27699 null
2026-04-30 Robot Learning from Human Videos: A Survey Junyi Ma et.al. 2604.27621 null
2026-04-30 SpaAct: Spatially-Activated Transition Learning with Curriculum Adaptation for Vision-Language Navigation Pengna Li et.al. 2604.27620 null
2026-04-30 World2Minecraft: Occupancy-Driven Simulated Scenes Construction Lechao Zhang et.al. 2604.27578 null
2026-04-30 SpatialGrammar: A Domain-Specific Language for LLM-Based 3D Indoor Scene Generation Song Tang et.al. 2604.27555 null
2026-04-30 Context as Prior: Bayesian-Inspired Intent Inference for Non-Speaking Agents with a Household Cat Testbed Wenqian Zhang et.al. 2604.27445 null
2026-04-29 3D Generation for Embodied AI and Robotic Simulation: A Survey Tianwei Ye et.al. 2604.26509 null
2026-04-29 Multiple Consistent 2D-3D Mappings for Robust Zero-Shot 3D Visual Grounding Yufei Yin et.al. 2604.26261 null
2026-04-28 Lifting Embodied World Models for Planning and Control Alex N. Wang et.al. 2604.26182 null
2026-04-28 GS-Playground: A High-Throughput Photorealistic Simulator for Vision-Informed Robot Learning Yufei Jia et.al. 2604.25459 null
2026-04-28 Where Did It Go Wrong? Capability-Oriented Failure Attribution for Vision-and-Language Navigation Agents Jianming Chen et.al. 2604.25161 null
2026-04-27 Interoceptive machine framework: Toward interoception-inspired regulatory architectures in artificial intelligence Diego Candia-Rivera et.al. 2604.24527 null
2026-04-27 AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents Hojoon Kim et.al. 2604.24039 null
2026-04-26 From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation Jiafeng Wu et.al. 2604.23629 null
2026-04-26 PhysCodeBench: Benchmarking Physics-Aware Symbolic Simulation of 3D Scenes via Self-Corrective Multi-Agent Refinement Tianyidan Xie et.al. 2604.23580 null
2026-04-24 AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI Mohammad Sadegh Salehi et.al. 2604.23018 null
2026-04-22 EgoDyn-Bench: Evaluating Ego-Motion Understanding in Vision-Centric Foundation Models for Autonomous Driving Finn Rasmus Schäfer et.al. 2604.22851 null
2026-04-27 A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism, Governance, and Dynamics in Complex Societies Somyajit Chakraborty et.al. 2604.22227 null
2026-04-23 A Replicable Robotics Awareness Method Using LLM-Enabled Robotics Interaction: Evidence from a Corporate Challenge S. A. Prieto et.al. 2604.21377 null
2026-04-23 ReCAPA: Hierarchical Predictive Correction to Mitigate Cascading Failures Xiyin Zeng et.al. 2604.21232 null
2026-04-23 Reinforcing 3D Understanding in Point-VLMs via Geometric Reward Credit Assignment Jingkun Chen et.al. 2604.21160 null
2026-04-22 Planetary Exploration 3.0: A Roadmap for Software-Defined, Radically Adaptive Space Systems Masahiro Ono et.al. 2604.20910 null
2026-04-22 LLM-Guided Safety Agent for Edge Robotics with an ISO-Compliant Perception-Compute-Control Architecture Xu Huang et.al. 2604.20193 null
2026-04-21 Environmental Understanding Vision-Language Model for Embodied Agent Jinsik Bang et.al. 2604.19839 null
2026-04-21 InHabit: Leveraging Image Foundation Models for Scalable 3D Human Placement Nikita Kister et.al. 2604.19673 null
2026-04-21 SafetyALFRED: Evaluating Safety-Conscious Planning of Multimodal Large Language Models Josue Torres-Fonseca et.al. 2604.19638 null
2026-04-21 RoboWM-Bench: A Benchmark for Evaluating World Models in Robotic Manipulation Feng Jiang et.al. 2604.19092 null
2026-04-21 Explore Like Humans: Autonomous Exploration with Online SG-Memo Construction for Embodied Agents Xu Chen et.al. 2604.19034 null
2026-04-20 Will People Enjoy a Robot Trainer? A Case Study with Snoopie the Pacerbot Maximilian Du et.al. 2604.18331 null
2026-04-20 EmbodiedLGR: Integrating Lightweight Graph Representation and Retrieval for Semantic-Spatial Memory in Robotic Agents Paolo Riva et.al. 2604.18271 null
2026-04-20 E3VS-Bench: A Benchmark for Viewpoint-Dependent Active Perception in 3D Gaussian Splatting Scenes Koya Sakamoto et.al. 2604.17969 null
2026-04-20 StableIDM: Stabilizing Inverse Dynamics Model against Manipulator Truncation via Spatio-Temporal Refinement Kerui Li et.al. 2604.17887 null
2026-04-20 OmniVLA-RL: A Vision-Language-Action Model with Spatial Understanding and Online RL Haoxiang Jie et.al. 2604.17706 null
2026-04-19 Seeing Isn’t Believing: Mitigating Belief Inertia via Active Intervention in Embodied Agents Hanlin Wang et.al. 2604.17252 null
2026-04-19 GaLa: Hypergraph-Guided Visual Language Models for Procedural Planning Kun Wang et.al. 2604.17241 null
2026-04-18 Mini-BEHAVIOR-Gran: Revealing U-Shaped Effects of Instruction Granularity on Language-Guided Embodied Agents Sukai Huang et.al. 2604.17019 null
2026-04-18 Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification Jiawen Wen et.al. 2604.16993 null
2026-04-18 Chain Of Interaction Benchmark (COIN): When Reasoning meets Embodied Interaction Xianhao Wang et.al. 2604.16886 null
2026-04-16 GIST: Multimodal Knowledge Extraction and Spatial Grounding via Intelligent Semantic Topology Shivendra Agrawal et.al. 2604.15495 null
2026-04-20 ADAPT: Benchmarking Commonsense Planning under Unspecified Affordance Constraints Pei-An Chen et.al. 2604.14902 null
2026-04-16 World-Value-Action Model: Implicit Planning for Vision-Language-Action Systems Runze Li et.al. 2604.14732 null
2026-04-16 Model-Based Reinforcement Learning Exploits Passive Body Dynamics for High-Performance Biped Robot Locomotion Tomoya Kamimura et.al. 2604.14565 null
2026-04-15 SpaceMind: A Modular and Self-Evolving Embodied Vision-Language Agent Framework for Autonomous On-orbit Servicing Aodi Wu et.al. 2604.14399 null
2026-04-15 [Emerging Ideas] Artificial Tripartite Intelligence: A Bio-Inspired, Sensor-First Architecture for Physical AI You Rim Choi et.al. 2604.13959 null
2026-04-15 EmbodiedClaw: Conversational Workflow Execution for Embodied AI Development Xueyang Zhou et.al. 2604.13800 null
2026-04-15 ESCAPE: Episodic Spatial Memory and Adaptive Execution Policy for Long-Horizon Mobile Manipulation Jingjing Qian et.al. 2604.13633 null
2026-04-16 VGGT-Segmentor: Geometry-Enhanced Cross-View Segmentation Yulu Gao et.al. 2604.13596 null
2026-04-15 AgentComm: Semantic Communication for Embodied Agents Peiwen Jiang et.al. 2604.13558 null
2026-04-15 Evolvable Embodied Agent for Robotic Manipulation via Long Short-Term Reflection and Optimization Jianzong Wang et.al. 2604.13533 null
2026-04-14 Exploration and Exploitation Errors Are Measurable for Language Model Agents Jaden Park et.al. 2604.13151 null
2026-04-14 Habitat-GS: A High-Fidelity Navigation Simulator with Dynamic Gaussian Splatting Ziyuan Xia et.al. 2604.12626 null
2026-04-15 Reading Between the Pixels: Linking Text-Image Embedding Alignment to Typographic Attack Success on Vision-Language Models Ravikumar Balakrishnan et.al. 2604.12371 null
2026-04-13 Human-Inspired Context-Selective Multimodal Memory for Social Robots Hangyeol Kang et.al. 2604.12081 null
2026-04-13 GeomPrompt: Geometric Prompt Learning for RGB-D Semantic Segmentation Under Missing and Degraded Depth Krishna Jaganathan et.al. 2604.11585 null
2026-04-13 DA-PTQ: Drift-Aware Post-Training Quantization for Efficient Vision-Language-Action Models Siyuan Xu et.al. 2604.11572 null
2026-04-13 Efficient Emotion-Aware Iconic Gesture Prediction for Robot Co-Speech Edwin C. Montiel-Vazquez et.al. 2604.11417 null
2026-04-13 EmbodiedGovBench: A Benchmark for Governance, Recovery, and Upgrade Safety in Embodied Agent Systems Xue Qin et.al. 2604.11174 null
2026-04-13 EgoFun3D: Modeling Interactive Objects from Egocentric Videos using Function Templates Weikun Peng et.al. 2604.11038 null
2026-04-13 Federated Single-Agent Robotics: Multi-Robot Coordination Without Intra-Robot Multi-Agent Fragmentation Xue Qin et.al. 2604.11028 null
2026-04-14 ArtiCAD: Articulated CAD Assembly Design via Multi-Agent Code Generation Yuan Shui et.al. 2604.10992 null
2026-04-13 ScoRe-Flow: Complete Distributional Control via Score-Based Reinforcement Learning for Flow Matching Xiaotian Qiu et.al. 2604.10962 null
2026-04-12 ReplicateAnyScene: Zero-Shot Video-to-3D Composition via Textual-Visual-Spatial Alignment Mingyu Dong et.al. 2604.10789 null
2026-04-12 HOG-Layout: Hierarchical 3D Scene Generation, Optimization and Editing via Vision-Language Models Haiyan Jiang et.al. 2604.10772 null
2026-04-10 PhysInOne: Visual Physics Learning and Reasoning in One Suite Siyuan Zhou et.al. 2604.09415 null
2026-04-10 V-CAGE: Vision-Closed-Loop Agentic Generation Engine for Robotic Manipulation Yaru Liu et.al. 2604.09036 null
2026-04-10 PilotBench: A Benchmark for General Aviation Agents with Safety Constraints Yalun Wu et.al. 2604.08987 null
2026-04-10 AssemLM: Spatial Reasoning Multimodal Large Language Models for Robotic Assembly Zhi Jing et.al. 2604.08983 null
2026-04-09 AniGen: Unified $S^3$ Fields for Animatable 3D Asset Generation Yi-Hua Huang et.al. 2604.08746 null
2026-04-09 3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding Makanjuola Ogunleye et.al. 2604.08645 null
2026-04-09 Visually-grounded Humanoid Agents Hang Ye et.al. 2604.08509 null
2026-04-10 PolySLGen: Online Multimodal Speaking-Listening Reaction Generation in Polyadic Interaction Zhi-Yi Lin et.al. 2604.08125 null
2026-04-10 Governed Capability Evolution for Embodied Agents: Safe Upgrade, Compatibility Checking, and Runtime Rollback for Embodied Capability Modules Xue Qin et.al. 2604.08059 null
2026-04-09 DP-DeGauss: Dynamic Probabilistic Gaussian Decomposition for Egocentric 4D Scene Reconstruction Tingxi Chen et.al. 2604.07986 null
2026-04-09 PanoSAM2: Lightweight Distortion- and Memory-aware Adaptions of SAM2 for 360 Video Object Segmentation Dingwen Xiao et.al. 2604.07901 null
2026-04-09 Object-Attribute-Relation Model Driven Adaptive Hierarchical Transmission for Multimodal Semantic Communication Chenxing Li et.al. 2604.07859 null
2026-04-09 Harnessing Embodied Agents: Runtime Governance for Policy-Constrained Execution Xue Qin et.al. 2604.07833 null
2026-04-09 Learning Without Losing Identity: Capability Evolution for Embodied Agents Xue Qin et.al. 2604.07799 null
2026-04-09 DailyArt: Discovering Articulation from Single Static Images via Latent Dynamics Hang Zhang et.al. 2604.07758 null
2026-04-08 Spatio-Temporal Grounding of Large Language Models from Perception Streams Jacob Anderson et.al. 2604.07592 null
2026-04-08 Infrastructure First: Enabling Embodied AI for Science in the Global South Shaoshan Liu et.al. 2604.06722 null
2026-04-07 Hazard Management in Robot-Assisted Mammography Support Ioannis Stefanakos et.al. 2604.05749 null
2026-04-07 Rectified Schrödinger Bridge Matching for Few-Step Visual Navigation Wuyang Luan et.al. 2604.05673 null
2026-04-07 Uncovering Linguistic Fragility in Vision-Language-Action Models via Diversity-Aware Red Teaming Baoshun Tong et.al. 2604.05595 null
2026-04-07 CoEnv: Driving Embodied Multi-Agent Collaboration via Compositional Environment Li Kang et.al. 2604.05484 null
2026-04-06 StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing StarVLA Community et.al. 2604.05014 null
2026-04-06 InfBaGel: Human-Object-Scene Interaction Generation with Dynamic Perception and Iterative Refinement Yude Zou et.al. 2604.04843 null
2026-04-06 Toward Self-Organizing Production Logistics in Circular Factories: A Multi-Agent Approach Jan-Felix Klein et.al. 2604.04753 null
2026-04-06 ROSClaw: A Hierarchical Semantic-Physical Framework for Heterogeneous Multi-Agent Collaboration Rongfeng Zhao et.al. 2604.04664 null
2026-04-05 Hypothesis Graph Refinement: Hypothesis-Driven Exploration with Cascade Error Correction for Embodied Navigation Peixin Chen et.al. 2604.04108 null
2026-04-04 From Prompt to Physical Action: Structured Backdoor Attacks on LLM-Mediated Robotic Control Systems Mingyang Xie et.al. 2604.03890 null
2026-04-03 Learning Additively Compositional Latent Actions for Embodied AI Hangxing Wei et.al. 2604.03340 null
2026-04-03 OMNI-PoseX: A Fast Vision Model for 6D Object Pose Estimation in Embodied Tasks Michael Zhang et.al. 2604.02759 null
2026-04-02 Reliability-Aware Geometric Fusion for Robust Audio-Visual Navigation Teng Liu et.al. 2604.02391 null
2026-04-02 Hi-LOAM: Hierarchical Implicit Neural Fields for LiDAR Odometry and Mapping Zhiliu Yang et.al. 2604.01720 null
2026-03-31 Benchmarking Interaction, Beyond Policy: a Reproducible Benchmark for Collaborative Instance Object Navigation Edoardo Zorzi et.al. 2604.00265 null

image generation

Publish Date Title Authors PDF Code
2026-05-05 Large Language Models are Universal Reasoners for Visual Generation Sucheng Ren et.al. 2605.04040 null
2026-05-05 Flow Sampling: Learning to Sample from Unnormalized Densities via Denoising Conditional Processes Aaron Havens et.al. 2605.03984 null
2026-05-05 DMGD: Train-Free Dataset Distillation with Semantic-Distribution Matching in Diffusion Models Qichao Wang et.al. 2605.03877 null
2026-05-05 Phase-Corrected Near-Field Microwave Imaging via Inverse Source Reconstruction with Modulated Signals Quanfeng Wang et.al. 2605.03875 null
2026-05-05 Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation Bin Wu et.al. 2605.03849 null
2026-05-05 Towards accurate extreme event likelihoods from diffusion model climate emulators Peter Manshausen et.al. 2605.03802 null
2026-05-05 GeoTopoDiff: Learning Geometry–Topology Graph Priors through Boundary-Constrained Mixed Diffusion for Sparse-Slice 3D Porous Reconstruction Yue Shi et.al. 2605.03764 null
2026-05-05 Agent-Based Modeling of Low-Emission Fertilizer Adoption for Dairy Farm Decarbonisation using Empirical Farm Data Surya Jayakumar et.al. 2605.03648 null
2026-05-05 Diffusion Masked Pretraining for Dynamic Point Cloud Zhuoyue Zhang et.al. 2605.03639 null
2026-05-05 Bridging the Embodiment Gap: Disentangled Cross-Embodiment Video Editing Zhiyuan Li et.al. 2605.03637 null
2026-05-04 Active Sampling for Ultra-Low-Bit-Rate Video Compression via Conditional Controlled Diffusion Amirhosein Javadi et.al. 2605.02849 null
2026-05-04 TOC-SR: Task-Optimal Compact diffusion for Image Super Resolution Sowmya Vajrala et.al. 2605.02767 null
2026-05-04 SIAM: Head and Brain MRI Segmentation from Few High-Quality Templates via Synthetic Training Romain Valabregue et.al. 2605.02737 null
2026-05-04 Stylistic Attribute Control in Latent Diffusion Models Max Reimann et.al. 2605.02583 null
2026-05-04 MooD: An Efficient VA-Driven Affective Image Editing Framework via Fine-Grained Semantic Control Xinyi Yin et.al. 2605.02521 null
2026-05-04 Anomaly-Preference Image Generation Fuyun Wang et.al. 2605.02439 null
2026-05-04 DirectEdit: Step-Level Accurate Inversion for Flow-Based Image Editing Desong Yang et.al. 2605.02417 null
2026-05-04 DriftDecode: One-Step Wireless Image Decoding via Drifting-Inspired Detail Recovery Jingwen Fu et.al. 2605.02325 null
2026-05-04 Anon: Extrapolating Optimizer Adaptivity Across the Real Spectrum Yiheng Zhang et.al. 2605.02317 null
2026-05-04 A Hybrid Approach for Closing the Sim2real Appearance Gap in Game Engine Synthetic Datasets Stefanos Pasios et.al. 2605.02291 null
2026-05-01 Repurposing Image Diffusion Models for Adversarial Synthetic Structured Data: A Case Study of Ground Truth Drift Adam Arthur et.al. 2605.00788 null
2026-05-01 Reconstruction of glymphatic transport fields from subject-specific imaging data, with particular emphasis on cerebrospinal fluid flow and tracer conservation A. Derya Bakiler et.al. 2605.00730 null
2026-05-01 PhysEdit: Physically-Consistent Region-Aware Image Editing via Adaptive Spatio-Temporal Reasoning Guandong Li et.al. 2605.00707 null
2026-05-01 STARE: Step-wise Temporal Alignment and Red-teaming Engine for Multi-modal Toxicity Attack Xutao Mao et.al. 2605.00699 null
2026-05-01 UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors Houyuan Chen et.al. 2605.00658 null
2026-05-01 Faithful Extreme Image Rescaling with Learnable Reversible Transformation and Semantic Priors Hao Wei et.al. 2605.00605 null
2026-05-01 Colorful-Noise: Training-Free Low-Frequency Noise Manipulation for Color-Based Conditional Image Generation Nadav Z. Cohen et.al. 2605.00548 null
2026-05-01 End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer Wenda Chu et.al. 2605.00503 null
2026-05-01 Trees to Flows and Back: Unifying Decision Trees and Diffusion Models Sai Niranjan Ramachandran et.al. 2605.00414 null
2026-05-01 Binomial flows: Denoising and flow matching for discrete ordinal data Yair Shenfeld et.al. 2605.00360 null
2026-04-30 PhyCo: Learning Controllable Physical Priors for Generative Motion Sriram Narayanan et.al. 2604.28169 null
2026-04-29 AdvDMD: Adversarial Reward Meets DMD For High-Quality Few-Step Generation Xu Wang et.al. 2604.28126 null
2026-04-30 From LLM-Driven Trading Card Generation to Procedural Relatedness: A Pokémon Case Study Johannes Pfau et.al. 2604.27972 null
2026-04-30 Diffusion-OAMP for Joint Image Compression and Wireless Transmission Wentao Hou et.al. 2604.27952 null
2026-04-30 Noise2Map: End-to-End Diffusion Model for Semantic Segmentation and Change Detection Ali Shibli et.al. 2604.27889 null
2026-04-30 Machine Unlearning for Class Removal through SISA-based Deep Neural Network Architectures Ishrak Hamim Mahi et.al. 2604.27804 null
2026-04-30 Leveraging Verifier-Based Reinforcement Learning in Image Editing Hanzhong Guo et.al. 2604.27505 null
2026-04-30 Electrothermal Dynamics of Cold Front in Impure Tokamak Plasmas S. Oshiro et.al. 2604.27444 null
2026-04-30 ABC: Any-Subset Autoregression via Non-Markovian Diffusion Bridges in Continuous Time and Space Gabe Guo et.al. 2604.27443 null
2026-04-30 Sparse-View 3D Gaussian Splatting in the Wild Wongi Park et.al. 2604.27422 null
2026-04-29 SEAL: Semantic-aware Single-image Sticker Personalization with a Large-scale Sticker-tag Dataset Changhyun Roh et.al. 2604.26883 null
2026-04-29 Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data Bao Pham et.al. 2604.26841 null
2026-04-29 Conditional diffusion denoising probabilistic model for super-resolution of atmospheric boundary layer large eddy simulation Omar Sallam et.al. 2604.26776 null
2026-04-29 Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising Jun Guo et.al. 2604.26694 null
2026-04-29 Delta Score Matters! Spatial Adaptive Multi Guidance in Diffusion Models Haosen Li et.al. 2604.26503 null
2026-04-29 Probabilistic data quality assessment for structural monitoring data via outlier-resistant conditional diffusion model Qi Li et.al. 2604.26366 null
2026-04-29 Beyond Fixed Formulas: Data-Driven Linear Predictor for Efficient Diffusion Models Zhirong Shen et.al. 2604.26365 null
2026-04-29 ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance Yang Yang et.al. 2604.26348 null
2026-04-29 SpatialFusion: Endowing Unified Image Generation with Intrinsic 3D Geometric Awareness Haiyi Qiu et.al. 2604.26341 null
2026-04-28 Charge diffusion and modulation transfer function in a Nancy Grace Roman Space Telescope detector Emily Macbeth et.al. 2604.26114 null
2026-04-28 DDA-Thinker: Decoupled Dual-Atomic Reinforcement Learning for Reasoning-Driven Image Editing Hanqing Yang et.al. 2604.25477 null
2026-04-28 A Systematic Post-Train Framework for Video Generation Zeyue Xue et.al. 2604.25427 null
2026-04-28 Benchmarking Layout-Guided Diffusion Models through Unified Semantic-Spatial Evaluation in Closed and Open Settings Luca Parolari et.al. 2604.25358 null
2026-04-28 Edge-Cloud Collaborative Reconstruction via Structure-Aware Latent Diffusion for Downstream Remote Sensing Perception Yun Li et.al. 2604.25319 null
2026-04-28 Golden RPG: Confidence-Adaptive Region-Aware Noise for Compositional Text-to-Image Generation Hao Li et.al. 2604.25314 null
2026-04-28 The Thinking Pixel: Recursive Sparse Reasoning in Multimodal Diffusion Latents Yuwei Sun et.al. 2604.25299 null
2026-04-28 Exploring Time Conditioning in Diffusion Generative Models from Disjoint Noisy Data Manifolds Liuzhuozheng Li et.al. 2604.25289 null
2026-04-28 ResetEdit: Precise Text-guided Editing of Generated Image via Resettable Starting Latent Hanyi Wang et.al. 2604.25128 null
2026-04-27 Generative diffusion models for spatiotemporal influenza forecasting Joseph Lemaitre et.al. 2604.24913 null
2026-04-27 VibeToken: Scaling 1D Image Tokenizers and Autoregressive Models for Dynamic Resolution Generations Maitreya Patel et.al. 2604.24885 null
2026-04-27 DiffQEC: A versatile diffusion model for quantum error correction Tianyi Xu et.al. 2604.24640 null
2026-04-27 Meta-CoT: Enhancing Granularity and Generalization in Image Editing Shiyi Zhang et.al. 2604.24625 null
2026-04-27 Diffusion Model as a Generalist Segmentation Learner Haoxiao Wang et.al. 2604.24575 null
2026-04-27 CA-IDD: Cross-Attention Guided Identity-Conditional Diffusion for Identity-Consistent Face Swapping Md Shohel Rana et.al. 2604.24493 null
2026-04-27 Guiding Vector Field Generation via Score-based Diffusion Model Zirui Chen et.al. 2604.24487 null
2026-04-27 TextGround4M: A Prompt-Aligned Dataset for Layout-Aware Text Rendering Dongxing Mao et.al. 2604.24459 null
2026-04-27 Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion Zhongjie Duan et.al. 2604.24351 null
2026-04-27 GeoEdit: Local Frames for Fast, Training-Free On-Manifold Editing in Diffusion Models Yiming Zhang et.al. 2604.24238 null
2026-04-27 Seeing Is No Longer Believing: Frontier Image Generation Models, Synthetic Visual Evidence, and Real-World Risk Shuai Wu et.al. 2604.24197 null
2026-04-27 Bridging Restoration and Generation Manifolds in One-Step Diffusion for Real-World Super-Resolution Shyang-En Weng et.al. 2604.24136 null
2026-04-24 Statistical Analysis of Markovian Generative Modeling Eddie Aamari et.al. 2604.22712 null
2026-04-24 Generative Modeling of Neurodegenerative Brain Anatomy with 4D Longitudinal Diffusion Model Nivetha Jayakumar et.al. 2604.22700 null
2026-04-24 Structure-Guided Diffusion Model for EEG-Based Visual Cognition Reconstruction Yongxiang Lian et.al. 2604.22649 null
2026-04-24 Efficient Diffusion Distillation via Embedding Loss Jincheng Ying et.al. 2604.22379 null
2026-04-24 TabSCM: A practical Framework for Generating Realistic Tabular Data Sven Jacob et.al. 2604.22337 null
2026-04-24 Knowledge Visualization: A Benchmark and Method for Knowledge-Intensive Text-to-Image Generation Ran Zhao et.al. 2604.22302 null
2026-04-24 Evaluation of image simulation open source solutions for simulation of synthetic images in lunar environment Jai G Singla et.al. 2604.22296 null
2026-04-24 AI-Driven Performance-to-Design Generation and Optimization of Marine Propellers Leah Chen et.al. 2604.22224 null
2026-04-24 Breaking Watermarks in the Frequency Domain: A Modulated Diffusion Attack Framework Chunpeng Wang et.al. 2604.22220 null
2026-04-24 Multimodal Diffusion to Mutually Enhance Polarized Light and Low Resolution EBSD Data Harry Dong et.al. 2604.22212 null
2026-04-23 VistaBot: View-Robust Robot Manipulation via Spatiotemporal-Aware View Synthesis Songen Gu et.al. 2604.21914 null
2026-04-23 UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection Yanran Zhang et.al. 2604.21904 null
2026-04-23 A Scale-Adaptive Framework for Joint Spatiotemporal Super-Resolution with Diffusion Models Max Defez et.al. 2604.21903 null
2026-04-23 Causality-Encoded Diffusion Models for Interventional Sampling and Edge Inference Li Chen et.al. 2604.21843 null
2026-04-23 Quotient-Space Diffusion Models Yixian Xu et.al. 2604.21809 null
2026-04-23 DCMorph: Face Morphing via Dual-Stream Cross-Attention Diffusion Tahar Chettaoui et.al. 2604.21627 null
2026-04-23 Generative Learning Enhanced Intelligent Resource Management for Cell-Free Delay Deterministic Communications Shuangbo Xiong et.al. 2604.21587 null
2026-04-23 DiffNR: Diffusion-Enhanced Neural Representation Optimization for Sparse-View 3D Tomographic Reconstruction Shiyan Su et.al. 2604.21518 null
2026-04-23 VARestorer: One-Step VAR Distillation for Real-World Image Super-Resolution Yixuan Zhu et.al. 2604.21450 null
2026-04-23 TopoStyle: Supporting Iterative Design with Generative AI for 2.5D Topology Optimization Shuyue Feng et.al. 2604.21315 null
2026-04-22 ParetoSlider: Diffusion Models Post-Training for Continuous Reward Control Shelly Golan et.al. 2604.20816 null
2026-04-22 LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Inclusion AI et.al. 2604.20796 null
2026-04-22 Geometric Renyi Differential Privacy: Ricci Curvature Characterized by Heat Diffusion Mechanisms Xiaotian Chang et.al. 2604.20761 null
2026-04-22 GeoRelight: Learning Joint Geometrical Relighting and Reconstruction with Flexible Multi-Modal Diffusion Transformers Yuxuan Xue et.al. 2604.20715 null
2026-04-22 Physics-Informed Conditional Diffusion for Motion-Robust Retinal Temporal Laser Speckle Contrast Imaging Qian Chen et.al. 2604.20594 null
2026-04-22 Exploring Spatial Intelligence from a Generative Perspective Muzhi Zhu et.al. 2604.20570 null
2026-04-22 Near-Field Wideband Channel Estimation for XL-MIMO Systems via Denoising Diffusion Model Qingxia Feng et.al. 2604.20494 null
2026-04-22 Conditional Monte Carlo Tree Diffusion for Designing Cell-Type-Specific and Biologically Faithful Regulatory DNA Animesh Awasthi et.al. 2604.20488 null
2026-04-22 Discrete Preference Learning for Personalized Multimodal Generation Yuting Zhang et.al. 2604.20434 null
2026-04-22 Cold-Start Forecasting of New Product Life-Cycles via Conditional Diffusion Models Ruihan Zhou et.al. 2604.20370 null
2026-04-21 Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items Mengting Chen et.al. 2604.19748 null
2026-04-21 AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model Yutian Chen et.al. 2604.19747 null
2026-04-21 Generative Drifting for Conditional Medical Image Generation Zirong Li et.al. 2604.19736 null
2026-04-21 ReImagine: Rethinking Controllable High-Quality Human Video Generation via Image-First Synthesis Zhengwentai Sun et.al. 2604.19720 null
2026-04-21 MedFlowSeg: Flow Matching for Medical Image Segmentation with Frequency-Aware Attention Zhi Chen et.al. 2604.19675 null
2026-04-21 InHabit: Leveraging Image Foundation Models for Scalable 3D Human Placement Nikita Kister et.al. 2604.19673 null
2026-04-21 Budgeted Online Influence Maximization Pierre Perrault et.al. 2604.19672 null
2026-04-21 Multi-Cycle Spatio-Temporal Adaptation in Human-Robot Teaming Alex Cuellar et.al. 2604.19670 null
2026-04-21 CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation Xiangyang Luo et.al. 2604.19636 null
2026-04-21 SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing Ying Zeng et.al. 2604.19587 null
2026-04-20 PlankFormer: Robust Plankton Instance Segmentation via MAE-Pretrained Vision Transformers and Pseudo Community Image Generation Masaharu Miyazaki et.al. 2604.17856 null
2026-04-20 UniCSG: Unified High-Fidelity Content-Constrained Style-Driven Generation via Staged Semantic and Frequency Disentanglement Jingwei Yang et.al. 2604.17850 null
2026-04-20 Efficient Diffusion Models under Nonconvex Equality and Inequality constraints via Landing Kijung Jeon et.al. 2604.17838 null
2026-04-20 AnyLift: Scaling Motion Reconstruction from Internet Videos via 2D Diffusion Hongjie Li et.al. 2604.17818 null
2026-04-20 Optimally Bridging Semantics and Data: Generative Semantic Communication via Schrödinger Bridge Dahua Gao et.al. 2604.17802 null
2026-04-20 Structure-Adaptive Sparse Diffusion in Voxel Space for 3D Medical Image Enhancement Hongxu Jiang et.al. 2604.17773 null
2026-04-20 Grokking of Diffusion Models: Case Study on Modular Addition Joon Hyeok Kim et.al. 2604.17673 null
2026-04-19 ViPS: Video-informed Pose Spaces for Auto-Rigged Meshes Honglin Chen et.al. 2604.17623 null
2026-04-19 DGSSM: Diffusion guided state-space models for multimodal salient object detection Suklav Ghosh et.al. 2604.17585 null
2026-04-19 Target Parameterization in Diffusion Models for Nonlinear Spatiotemporal System Identification Achraf El Messaoudi et.al. 2604.17566 null
2026-04-17 Repurposing 3D Generative Model for Autoregressive Layout Generation Haoran Feng et.al. 2604.16299 null
2026-04-17 Enhancing Hazy Wildlife Imagery: AnimalHaze3k and IncepDehazeGan Shivarth Rai et.al. 2604.16284 null
2026-04-17 Motion-Adapter: A Diffusion Model Adapter for Text-to-Motion Generation of Compound Actions Yue Jiang et.al. 2604.16135 null
2026-04-17 Elucidating the SNR-t Bias of Diffusion Probabilistic Models Meng Yu et.al. 2604.16044 null
2026-04-17 From Competition to Coopetition: Coopetitive Training-Free Image Editing Based on Text Guidance Jinhao Shen et.al. 2604.15948 null
2026-04-17 Making Image Editing Easier via Adaptive Task Reformulation with Agentic Executions Bo Zhao et.al. 2604.15917 null
2026-04-17 Efficient Video Diffusion Models: Advancements and Challenges Shitong Shao et.al. 2604.15911 null
2026-04-17 Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration Jun Li et.al. 2604.15829 null
2026-04-17 Neural Continuous-Time Markov Chain: Discrete Diffusion via Decoupled Jump Timing and Direction Jingyuan Li et.al. 2604.15694 null
2026-04-17 CLIMB: Controllable Longitudinal Brain Image Generation using Mamba-based Latent Diffusion Model and Gaussian-aligned Autoencoder Duy-Phuong Dao et.al. 2604.15611 null
2026-04-16 TokenLight: Precise Lighting Control in Images using Attribute Tokens Sumit Chaturvedi et.al. 2604.15310 null
2026-04-16 An Analysis of Regularization and Fokker-Planck Residuals in Diffusion Models for Image Generation Onno Niemann et.al. 2604.15171 null
2026-04-16 Towards Faster Language Model Inference Using Mixture-of-Experts Flow Matching Aihua Li et.al. 2604.15009 null
2026-04-16 Diffusion Crossover: Defining Evolutionary Recombination in Diffusion Models via Noise Sequence Interpolation Chisatao Kumada et.al. 2604.14790 null
2026-04-16 Constraint-based Pre-training: From Structured Constraints to Scalable Model Initialization Fu Feng et.al. 2604.14769 null
2026-04-16 SynHAT: A Two-stage Coarse-to-Fine Diffusion Framework for Synthesizing Human Activity Traces Rongchao Xu et.al. 2604.14705 null
2026-04-16 Mean Flow Policy Optimization Xiaoyi Dong et.al. 2604.14698 null
2026-04-16 Seen-to-Scene: Keep the Seen, Generate the Unseen for Video Outpainting Inseok Jeon et.al. 2604.14648 null
2026-04-16 Uncertainty-aware Generative Learning Path Recommendation with Cognition-Adaptive Diffusion Xiangrui Xiong et.al. 2604.14613 null
2026-04-16 Prompt-Guided Image Editing with Masked Logit Nudging in Visual Autoregressive Models Amir El-Ghoussani et.al. 2604.14591 null
2026-04-15 Diffusion Language Models for Speech Recognition Davyd Naveriani et.al. 2604.14001 null
2026-04-15 Creo: From One-Shot Image Generation to Progressive, Co-Creative Ideation Zoe De Simone et.al. 2604.13956 null
2026-04-15 ASTRA: Enhancing Multi-Subject Generation with Retrieval-Augmented Pose Guidance and Disentangled Position Embedding Tianze Xia et.al. 2604.13938 null
2026-04-15 Three-dimensional photon transport in spinodal photocatalytic aerogels: how bicontinuous morphology controls kinetic rate constants Renaud A. L. Vallée et.al. 2604.13929 null
2026-04-15 Blind Bitstream-corrupted Video Recovery via Metadata-guided Diffusion Model Shuyun Wang et.al. 2604.13906 null
2026-04-15 PostureObjectstitch: Anomaly Image Generation Considering Assembly Relationships in Industrial Scenarios Zebei Tong et.al. 2604.13863 null
2026-04-15 DiffMagicFace: Identity Consistent Facial Editing of Real Videos Huanghao Yin et.al. 2604.13841 null
2026-04-15 EMGFlow: Robust and Efficient Surface Electromyography Synthesis via Flow Matching Boxuan Jiang et.al. 2604.13685 null
2026-04-15 Reconstruction of a 3D wireframe from a single line drawing via generative depth estimation Elton Cao et.al. 2604.13549 null
2026-04-15 LEGO-MOF: Equivariant Latent Manipulation for Editable, Generative, and Optimizable MOF Design Chaoran Zhang et.al. 2604.13520 null
2026-04-14 Generative Refinement Networks for Visual Synthesis Jian Han et.al. 2604.13030 null
2026-04-14 Causal Diffusion Models for Counterfactual Outcome Distributions in Longitudinal Data Farbod Alinezhad et.al. 2604.12992 null
2026-04-14 Turbulent pair dispersion with Stochastic Generative Diffusion Models Andrei Pantea et.al. 2604.12932 null
2026-04-14 Transformer Based Machine Fault Detection From Audio Input Kiran Voderhobli Holla et.al. 2604.12733 null
2026-04-14 OFA-Diffusion Compression: Compressing Diffusion Model in One-Shot Manner Haoyang Jiang et.al. 2604.12668 null
2026-04-14 SOAR: Self-Correction for Optimal Alignment and Refinement in Diffusion Models You Qin et.al. 2604.12617 null
2026-04-14 StructDiff: A Structure-Preserving and Spatially Controllable Diffusion Model for Single-Image Generation Yinxi He et.al. 2604.12575 null
2026-04-14 T2I-BiasBench: A Multi-Metric Framework for Auditing Demographic and Cultural Bias in Text-to-Image Models Nihal Jaiswal et.al. 2604.12481 null
2026-04-14 Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling Zida Li et.al. 2604.12446 null
2026-04-14 Bridging the Micro–Macro Gap: Frequency-Aware Semantic Alignment for Image Manipulation Localization Xiaojie Liang et.al. 2604.12341 null
2026-04-13 Diffusing diffusivity model with dichotomous noise Dongho Lee et.al. 2604.11800 null
2026-04-13 LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling Yuxin Chen et.al. 2604.11748 null
2026-04-13 On the Robustness of Watermarking for Autoregressive Image Generation Andreas Müller et.al. 2604.11720 null
2026-04-13 Representations Before Pixels: Semantics-Guided Hierarchical Video Prediction Efstathios Karypidis et.al. 2604.11707 null
2026-04-13 Dual-Control Frequency-Aware Diffusion Model for Depth-Dependent Optical Microrobot Microscopy Image Generation Lan Wei et.al. 2604.11680 null
2026-04-13 RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Haozhe Wang et.al. 2604.11626 null
2026-04-13 Progressively Texture-Aware Diffusion for Contrast-Enhanced Sparse-View CT Tianqi Wang et.al. 2604.11559 null
2026-04-13 Continuous Adversarial Flow Models Shanchuan Lin et.al. 2604.11521 null
2026-04-13 Anthropogenic Regional Adaptation in Multimodal Vision-Language Model Samuel Cahyawijaya et.al. 2604.11490 null
2026-04-13 Degradation-Aware and Structure-Preserving Diffusion for Real-World Image Super-Resolution Yang Ji et.al. 2604.11470 null
2026-04-13 One Scale at a Time: Scale-Autoregressive Modeling for Fluid Flow Distributions Mario Lino et.al. 2604.11403 null
2026-04-13 DiLO: Decoupling Generative Priors and Neural Operators via Diffusion Latent Optimization for Inverse Problems Haibo Liu et.al. 2604.11375 null
2026-04-13 Any 3D Scene is Worth 1K Tokens: 3D-Grounded Representation for Scene Generation at Scale Dongxu Wei et.al. 2604.11331 null
2026-04-13 Learning Discrete Diffusion of Graphs via Free-Energy Gradient Flows Dario Rancati et.al. 2604.11311 null
2026-04-13 Structured State-Space Regularization for Compact and Generation-Friendly Image Tokenization Jinsung Lee et.al. 2604.11089 null
2026-04-13 LaDA-Band: Language Diffusion Models for Vocal-to-Accompaniment Generation Qi Wang et.al. 2604.11052 null
2026-04-10 Envisioning the Future, One Step at a Time Stefan Andreas Baumann et.al. 2604.09527 null
2026-04-10 Gardening on the Moon: An Advection-Diffusion Model to Guide the Search for Supernova Debris in the Lunar Regolith Emily S. Costello et.al. 2604.09524 null
2026-04-10 SCoRe: Clean Image Generation from Diffusion Models Trained on Noisy Images Yuta Matsuzaki et.al. 2604.09436 null
2026-04-10 Rays as Pixels: Learning A Joint Distribution of Videos and Camera Trajectories Wonbong Jang et.al. 2604.09429 null
2026-04-10 EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure Junyeong Ahn et.al. 2604.09405 null
2026-04-10 Region-Constrained Group Relative Policy Optimization for Flow-Based Image Editing Zhuohan Ouyang et.al. 2604.09386 null
2026-04-10 Hitem3D 2.0: Multi-View Guided Native 3D Texture Generation Huiang He et.al. 2604.09231 null
2026-04-10 Training-free, Perceptually Consistent Low-Resolution Previews with High-Resolution Image for Efficient Workflows of Diffusion Models Wongi Jeong et.al. 2604.09227 null
2026-04-10 SHIFT: Steering Hidden Intermediates in Flow Transformers Nina Konovalova et.al. 2604.09213 null
2026-04-10 CT-1: Vision-Language-Camera Models Transfer Spatial Reasoning Knowledge to Camera-Controllable Video Generation Haoyu Zhao et.al. 2604.09201 null
2026-04-09 When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models Zhengyang Sun et.al. 2604.08546 null
2026-04-09 RewardFlow: Generate Images by Optimizing What You Reward Onkar Susladkar et.al. 2604.08536 null
2026-04-09 Novel View Synthesis as Video Completion Qi Wu et.al. 2604.08500 null
2026-04-09 LAMP: Lift Image-Editing as General 3D Priors for Open-world Manipulation Jingjing Wang et.al. 2604.08475 null
2026-04-09 Bias-Constrained Diffusion Schedules for PDE Emulations: Reconstruction Error Minimization and Efficient Unrolled Training Constantin Le Cleï et.al. 2604.08357 null
2026-04-09 Controlling the rain fall statistics using Mean-Reverting Jump Diffusion model Joya GhoshDastider et.al. 2604.08338 null
2026-04-09 DiV-INR: Extreme Low-Bitrate Diffusion Video Compression with INR Conditioning Eren Çetin et.al. 2604.08329 null
2026-04-09 HistDiT: A Structure-Aware Latent Conditional Diffusion Model for High-Fidelity Virtual Staining in Histopathology Aasim Bin Saleem et.al. 2604.08305 null
2026-04-09 GroundingAnomaly: Spatially-Grounded Diffusion for Few-Shot Anomaly Synthesis Yishen Liu et.al. 2604.08301 null
2026-04-09 EditCaption: Human-Aligned Instruction Synthesis for Image Editing via Supervised Fine-Tuning and Direct Preference Optimization Xiangyuan Wang et.al. 2604.08213 null
2026-04-08 Distilling Photon-Counting CT into Routine Chest CT through Clinically Validated Degradation Modeling Junqi Liu et.al. 2604.07329 null
2026-04-08 GenLCA: 3D Diffusion for Full-Body Avatars from In-the-Wild Videos Yiqian Wu et.al. 2604.07273 null
2026-04-08 PhyEdit: Towards Real-World Object Manipulation via Physically-Grounded Image Editing Ruihang Xu et.al. 2604.07230 null
2026-04-08 VersaVogue: Visual Expert Orchestration and Preference Alignment for Unified Fashion Synthesis Jian Yu et.al. 2604.07210 null
2026-04-08 SurFITR: A Dataset for Surveillance Image Forgery Detection and Localisation Qizhou Wang et.al. 2604.07101 null
2026-04-08 Granular mixing and flow dynamics in horizontal stirred bed reactors Sahar Pourandi et.al. 2604.07082 null
2026-04-08 Not all tokens contribute equally to diffusion learning Guoqing Zhang et.al. 2604.07026 null
2026-04-08 MAR-GRPO: Stabilized GRPO for AR-diffusion Hybrid Image Generation Xiaoxiao Ma et.al. 2604.06966 null
2026-04-08 FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling Yitong Li et.al. 2604.06916 null
2026-04-08 RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details Dewei Zhou et.al. 2604.06870 null
2026-04-08 FVD: Inference-Time Alignment of Diffusion Models via Fleming-Viot Resampling Shivanshu Shekhar et.al. 2604.06779 null
2026-04-08 FlowInOne: Unifying Multimodal Generation as Image-in, Image-out Flow Matching Junchao Yi et.al. 2604.06757 null
2026-04-07 DiffHDR: Re-Exposing LDR Videos with Video Diffusion Models Zhengming Yu et.al. 2604.06161 null
2026-04-07 Learning-Guided Force-Feedback Model Predictive Control with Obstacle Avoidance for Robotic Deburring Krzysztof Wojciechowski et.al. 2604.06133 null
2026-04-07 PoM: A Linear-Time Replacement for Attention with the Polynomial Mixer David Picard et.al. 2604.06129 null
2026-04-07 SEM-ROVER: Semantic Voxel-Guided Diffusion for Large-Scale Driving Scene Generation Hiba Dahmani et.al. 2604.06113 null
2026-04-07 Graph-PiT: Enhancing Structural Coherence in Part-Based Image Synthesis via Graph Priors Junbin Zhang et.al. 2604.06074 null
2026-04-07 Beyond Black-Scholes: A Computational Framework for Option Pricing Using Heston, GARCH, and Jump Diffusion Models Karmanpartap Singh Sidhu et.al. 2604.06068 null
2026-04-07 Lipschitz regularity in Flow Matching and Diffusion Models: sharp sampling rates and functional inequalities Arthur Stéphanovitch et.al. 2604.06065 null
2026-04-07 HumANDiff: Articulated Noise Diffusion for Motion-Consistent Human Video Generation Tao Hu et.al. 2604.05961 null
2026-04-07 Leveraging Image Editing Foundation Models for Data-Efficient CT Metal Artifact Reduction Ahmet Rasim Emirdagi et.al. 2604.05934 null
2026-04-07 Improving Controllable Generation: Faster Training and Better Performance via $x_0$-Supervision Amadou S. Sangare et.al. 2604.05761 null
2026-04-06 Your Pre-trained Diffusion Model Secretly Knows Restoration Sudarshan Rajagopalan et.al. 2604.04924 null
2026-04-06 Diffusion of PeV Cosmic Rays in the Turbulent and Multiphase Interstellar Medium Yue Hu et.al. 2604.04814 null
2026-04-06 Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning Lei Zhang et.al. 2604.04746 null
2026-04-06 ZeD-MAP: Bundle Adjustment Guided Zero-Shot Depth Maps for Real-Time Aerial Imaging Selim Ahmet Iz et.al. 2604.04667 null
2026-04-06 Training-Free Refinement of Flow Matching with Divergence-based Sampling Yeonwoo Cha et.al. 2604.04646 null
2026-04-06 Beyond Semantics: Uncovering the Physics of Fakes via Universal Physical Descriptors for Cross-Modal Synthetic Detection Mei Qiu et.al. 2604.04608 null
2026-04-06 PR-IQA: Partial-Reference Image Quality Assessment for Diffusion-Based Novel View Synthesis Inseong Choi et.al. 2604.04576 null
2026-04-06 Erasure or Erosion? Evaluating Compositional Degradation in Unlearned Text-To-Image Diffusion Models Arian Komaei Koma et.al. 2604.04575 null
2026-04-06 Training-Free Image Editing with Visual Context Integration and Concept Alignment Rui Song et.al. 2604.04487 null
2026-04-06 Beyond Few-Step Inference: Accelerating Video Diffusion Transformer Model Serving with Inter-Request Caching Reuse Hao Liu et.al. 2604.04451 null

LLM training

Publish Date Title Authors PDF Code
2026-05-05 Audio-Visual Intelligence in Large Foundation Models You Qin et.al. 2605.04045 null
2026-05-05 Stayin’ Aligned Over Time: Towards Longitudinal Human-LLM Alignment via Contextual Reflection and Privacy-Preserving Behavioral Data Simret Araya Gebreegziabher et.al. 2605.04029 null
2026-05-05 On Adaptivity in Zeroth-Order Optimization Hassan Dbouk et.al. 2605.03869 null
2026-05-05 Natural Language Processing: A Comprehensive Practical Guide from Tokenisation to RLHF Mullosharaf K. Arabov et.al. 2605.03799 null
2026-05-05 AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics Tencent HY Team et.al. 2605.03652 null
2026-05-05 Revisiting Graph-Tokenizing Large Language Models: A Systematic Evaluation of Graph Token Understanding Zhongjian Zhang et.al. 2605.03514 null
2026-05-04 Moral Sensitivity in LLMs: A Tiered Evaluation of Contextual Bias via Behavioral Profiling and Mechanistic Interpretability Yash Aggarwal et.al. 2605.03217 null
2026-05-04 Enwar 3.0: An Agentic Multi-Modal LLM Orchestrator for Situation-Aware Beamforming, Blockage Prediction, and Handover Management Ahmad M. Nazar et.al. 2605.03215 null
2026-05-04 Geometric Deviation as an Unsupervised Pre-Generation Reliability Signal: Probing LLM Representations for Answerability Yucheng Du et.al. 2605.03196 null
2026-05-04 Bolek: A Multimodal Language Model for Molecular Reasoning Frederic Grabowski et.al. 2605.02745 null
2026-05-04 Gradient-Gated DPO: Stabilizing Preference Optimization in Language Models Inoussa Mouiche et.al. 2605.02626 null
2026-05-04 Efficient Preference Poisoning Attack on Offline RLHF Chenye Yang et.al. 2605.02495 null
2026-05-04 Anomaly-Preference Image Generation Fuyun Wang et.al. 2605.02439 null
2026-05-04 Reliability-Oriented Multilingual Orthopedic Diagnosis: A Domain-Adaptive Modeling and a Conceptual Validation Framework Danish Ali et.al. 2605.02266 null
2026-05-03 Maistros: A Greek Large Language Model Adapted Through Knowledge Distillation From Large Reasoning Models Nikolaos Giarelis et.al. 2605.01870 null
2026-05-03 RMGAP: Benchmarking the Generalization of Reward Models across Diverse Preferences Yangyang Zhou et.al. 2605.01831 null
2026-05-02 LLM Output Detectability and Task Performance Can be Jointly Optimized Koshiro Saito et.al. 2605.01350 null
2026-05-02 Addressing Data Scarcity in Bangla Fake News Detection: An LLM-Based Dataset Augmentation Approach Ahmed Alfey Sani et.al. 2605.01292 null
2026-05-02 GIFT: Guided Fine-Tuning and Transfer for Enhancing Instruction-Tuned Language Models Zhiwen Ruan et.al. 2605.01256 null
2026-05-01 Let ViT Speak: Generative Language-Image Pre-training Yan Fang et.al. 2605.00809 null
2026-05-01 AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments Zhijie Cai et.al. 2605.00650 null
2026-05-01 H-RAG at SemEval-2026 Task 8: Hierarchical Parent-Child Retrieval for Multi-Turn RAG Conversations Passant Elchafei et.al. 2605.00631 null
2026-05-01 DynamicPO: Dynamic Preference Optimization for Recommendation Xingyu Hu et.al. 2605.00327 null
2026-05-01 Online Self-Calibration Against Hallucination in Vision-Language Models Minghui Chen et.al. 2605.00323 null
2026-04-30 Attention Is Where You Attack Aviral Srivastava et.al. 2605.00236 null
2026-04-30 TUR-DPO: Topology- and Uncertainty-Aware Direct Preference Optimization Abdulhady Abas Abdullah et.al. 2605.00224 null
2026-04-30 Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback Yikai Wang et.al. 2605.00155 null
2026-04-30 ViLegalNLI: Natural Language Inference for Vietnamese Legal Texts Nhung Thi-Hong Duong et.al. 2605.00116 null
2026-04-30 FiLMMeD: Feature-wise Linear Modulation for Cross-Problem Multi-Depot Vehicle Routing Arthur Corrêa et.al. 2604.28102 null
2026-04-30 Learning from Disagreement: Clinician Overrides as Implicit Preference Signals for Clinical AI in Value-Based Care Prabhjot Singh et.al. 2604.28010 null
2026-04-30 ZipCCL: Efficient Lossless Data Compression of Communication Collectives for Accelerating LLM Training Wenxiang Lin et.al. 2604.27844 null
2026-04-30 Mind the Gap: Structure-Aware Consistency in Preference Learning Mehryar Mohri et.al. 2604.27733 null
2026-04-30 Language Ideologies in a Multilingual Society: An LLM-based Analysis of Luxembourgish News Comments Emilia Milano et.al. 2604.27661 null
2026-04-30 HAVEN: Hybrid Automated Verification ENgine for UVM Testbench Synthesis with LLMs Chang-Chih Meng et.al. 2604.27643 null
2026-04-30 SecGoal: A Benchmark for Security Goal Extraction and Formalization from Protocol Documents Dawei Huang et.al. 2604.27601 null
2026-04-30 Leveraging Verifier-Based Reinforcement Learning in Image Editing Hanzhong Guo et.al. 2604.27505 null
2026-04-30 Secret Stealing Attacks on Local LLM Fine-Tuning through Supply-Chain Model Code Backdoors Zi Li et.al. 2604.27426 null
2026-04-29 Instruction Complexity Induces Positional Collapse in Adversarial LLM Evaluation Jon-Paul Cacioli et.al. 2604.27249 null
2026-04-29 Zero-Shot to Full-Resource: Cross-lingual Transfer Strategies for Aspect-Based Sentiment Analysis Jakob Fehle et.al. 2604.26619 null
2026-04-29 Translating Under Pressure: Domain-Aware LLMs for Crisis Communication Antonio Castaldo et.al. 2604.26597 null
2026-04-29 SplitFT: An Adaptive Federated Split Learning System For LLMs Fine-Tuning Yimeng Shan et.al. 2604.26388 null
2026-04-28 Hierarchical Multi-Persona Induction from User Behavioral Logs: Learning Evidence-Grounded and Truthful Personas Nayoung Choi et.al. 2604.26120 null
2026-04-28 When Errors Can Be Beneficial: A Categorization of Imperfect Rewards for Policy Gradient Shuning Shang et.al. 2604.25872 null
2026-04-28 From Soliloquy to Agora: Memory-Enhanced LLM Agents with Decentralized Debate for Optimization Modeling Jianghao Lin et.al. 2604.25847 null
2026-04-28 Step-Audio-R1.5 Technical Report Yuxin Zhang et.al. 2604.25719 null
2026-04-28 Backtranslation Augmented Direct Preference Optimization for Neural Machine Translation Mehrdad Ghassabi et.al. 2604.25702 null
2026-04-28 Health System Scale Semantic Search Across Unstructured Clinical Notes Faith Wavinya Mutinda et.al. 2604.25605 null
2026-04-28 A Systematic Post-Train Framework for Video Generation Zeyue Xue et.al. 2604.25427 null
2026-04-28 FED-FSTQ: Fisher-Guided Token Quantization for Communication-Efficient Federated Fine-Tuning of LLMs on Edge Devices Changyu Li et.al. 2604.25421 null
2026-04-28 Below-Chance Blindness: Prompted Underperformance in Small LLMs Produces Positional Bias Rather than Answer Avoidance Jon-Paul Cacioli et.al. 2604.25249 null
2026-04-28 Frictive Policy Optimization for LLMs: Epistemic Intervention, Risk-Sensitive Control, and Reflective Alignment James Pustejovsky et.al. 2604.25136 null
2026-04-28 What Makes Good Instruction-Tuning Data? An In-Context Learning Perspective Guangzeng Han et.al. 2604.25132 null
2026-04-27 A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations Zihan Liu et.al. 2604.24468 null
2026-04-27 A Multi-Dimensional Audit of Politically Aligned Large Language Models Lisa Korver et.al. 2604.24429 null
2026-04-27 Meta-Aligner: Bidirectional Preference-Policy Optimization for Multi-Objective LLMs Alignment Wenzhe Xu et.al. 2604.24178 null
2026-04-27 TACO: Efficient Communication Compression of Intermediate Tensors for Scalable Tensor-Parallel LLM Training Man Liu et.al. 2604.24088 null
2026-04-27 Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B Jon-Paul Cacioli et.al. 2604.24070 null
2026-04-27 Disagreement as Signals: Dual-view Calibration for Sequential Recommendation Denoising Sijia Li et.al. 2604.24048 null
2026-04-27 FlashOverlap: Minimizing Tail Latency in Communication Overlap for Distributed LLM Training Rezaul Karim et.al. 2604.24013 null
2026-04-27 Hindsight Preference Optimization for Financial Time Series Advisory Yanwei Cui et.al. 2604.23988 null
2026-04-27 Continual Calibration: Coverage Can Collapse Before Accuracy in Lifelong LLM Fine-Tuning Ibne Farabi Shihab et.al. 2604.23987 null
2026-04-27 MatchRDMA: A Segmented and Rate-Matched Long-Haul RDMA Scheme for Geo-distributed LLM Training over OTN Jun Dai et.al. 2604.23932 null
2026-04-24 CAGE-SGG: Counterfactual Active Graph Evidence for Open-Vocabulary Scene Graph Generation Suiyang Guang et.al. 2604.22274 null
2026-04-24 TTS-PRISM: A Perceptual Reasoning and Interpretable Speech Model for Fine-Grained Diagnosis Xi Wang et.al. 2604.22225 null
2026-04-24 Verbal Confidence Saturation in 3-9B Open-Weight Instruction-Tuned LLMs: A Pre-Registered Psychometric Validity Screen Jon-Paul Cacioli et.al. 2604.22215 null
2026-04-23 PermaFrost-Attack: Stealth Pretraining Seeding (SPS) for planting Logic Landmines During LLM Training Harsh Kumar et.al. 2604.22117 null
2026-04-23 When Cow Urine Cures Constipation on YouTube: Limits of LLMs in Detecting Culture-specific Health Misinformation Anamta Khan et.al. 2604.22002 null
2026-04-23 When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs Pegah Khayatan et.al. 2604.21911 null
2026-04-23 Why are all LLMs Obsessed with Japanese Culture? On the Hidden Cultural and Regional Biases of LLMs Joseba Fernandez de Landa et.al. 2604.21751 null
2026-04-23 Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation Nikita Severin et.al. 2604.21536 null
2026-04-23 Generalizing Numerical Reasoning in Table Data through Operation Sketches and Self-Supervised Learning Hanjun Cho et.al. 2604.21495 null
2026-04-23 Reasoning Primitives in Hybrid and Non-Hybrid LLMs Shivam Rawat et.al. 2604.21454 null
2026-04-23 CAP: Controllable Alignment Prompting for Unlearning in LLMs Zhaokun Wang et.al. 2604.21251 null
2026-04-23 Reasoning About Traversability: Language-Guided Off-Road 3D Trajectory Planning Byounggun Park et.al. 2604.21249 null
2026-04-23 Zero-Shot Detection of LLM-Generated Text via Implicit Reward Model Runheng Liu et.al. 2604.21223 null
2026-04-23 On Reasoning Behind Next Occupation Recommendation Shan Dong et.al. 2604.21204 null
2026-04-22 TabSHAP Aryan Chaudhary et.al. 2604.21120 null
2026-04-22 MGDA-Decoupled: Geometry-Aware Multi-Objective Optimisation for DPO-based LLM Alignment Andor Vári-Kakas et.al. 2604.20685 null
2026-04-22 The Effect of Idea Elaboration on the Automatic Assessment of Idea Originality Umberto Domanti et.al. 2604.20569 null
2026-04-22 Where Reasoning Breaks: Logic-Aware Path Selection by Controlling Logical Connectives in LLMs Reasoning Chains Seunghyun Park et.al. 2604.20564 null
2026-04-22 Evian: Towards Explainable Visual Instruction-tuning Data Auditing Zimu Jia et.al. 2604.20544 null
2026-04-22 Surrogate modeling for interpreting black-box LLMs in medical predictions Changho Han et.al. 2604.20331 null
2026-04-22 Image Generators are Generalist Vision Learners Valentin Gabeur et.al. 2604.20329 null
2026-04-22 LLM-guided phase diagram construction through high-throughput experimentation Ryo Tamura et.al. 2604.20304 null
2026-04-22 HiPO: Hierarchical Preference Optimization for Adaptive Reasoning in LLMs Darsh Kachroo et.al. 2604.20140 null
2026-04-21 Bootstrapping Post-training Signals for Open-ended Tasks via Rubric-based Self-play on Pre-training Text Chengyu Huang et.al. 2604.20051 null
2026-04-21 Super Apriel: One Checkpoint, Many Speeds SLAM Labs et.al. 2604.19877 null
2026-04-21 Exploring Language-Agnosticity in Function Vectors: A Case Study in Machine Translation Nurkhan Laiyk et.al. 2604.19678 null
2026-04-21 HP-Edit: A Human-Preference Post-Training Framework for Image Editing Fan Li et.al. 2604.19406 null
2026-04-21 Location Not Found: Exposing Implicit Local and Global Biases in Multilingual LLMs Guy Mor-Lan et.al. 2604.19292 null
2026-04-21 HarDBench: A Benchmark for Draft-Based Co-Authoring Jailbreak Attacks for Safe Human-LLM Collaborative Writing Euntae Kim et.al. 2604.19274 null
2026-04-21 UniEP: Unified Expert-Parallel MoE MegaKernel for LLM Training Size Zheng et.al. 2604.19241 null
2026-04-21 The Rise of Verbal Tics in Large Language Models: A Systematic Analysis Across Frontier Models Shuai Wu et.al. 2604.19139 null
2026-04-21 SAHM: A Benchmark for Arabic Financial and Shari’ah-Compliant Reasoning Rania Elbadry et.al. 2604.19098 null
2026-04-21 STK-Adapter: Incorporating Evolving Graph and Event Chain for Temporal Knowledge Graph Extrapolation Shuyuan Zhao et.al. 2604.19042 null
2026-04-21 Policy Gradient Primal-Dual Method for Safe Reinforcement Learning from Human Feedback Qiang Liu et.al. 2604.19024 null
2026-04-21 Local Linearity of LLMs Enables Activation Steering via Model-Based Linear Optimal Control Julian Skifstad et.al. 2604.19018 null
2026-04-20 JudgeMeNot: Personalizing Large Language Models to Emulate Judicial Reasoning in Hebrew Itay Razumenko et.al. 2604.18041 null
2026-04-20 Architecture Matters More Than Scale: A Comparative Study of Retrieval and Memory Augmentation for Financial QA Under SME Compute Constraints Jianan Liu et.al. 2604.17979 null
2026-04-20 Efficient Federated RLHF via Zeroth-Order Policy Optimization Deyi Wang et.al. 2604.17747 null
2026-04-19 PBSBench: A Multi-Level Vision-Language Framework and Benchmark for Hematopathology Whole Slide Image Interpretation Yuanlong Wang et.al. 2604.17570 null
2026-04-19 PoliLegalLM: A Technical Report on a Large Language Model for Political and Legal Affairs Yuting Huang et.al. 2604.17543 null
2026-04-19 E2E-GMNER: End-to-End Generative Grounded Multimodal Named Entity Recognition Meng Zhang et.al. 2604.17319 null
2026-04-19 Cat-DPO: Category-Adaptive Safety Alignment Tiankai Yang et.al. 2604.17299 null
2026-04-19 HeadRank: Decoding-Free Passage Reranking via Preference-Aligned Attention Heads Juyuan Wang et.al. 2604.17237 null
2026-04-19 Guardrails in Logit Space: Safety Token Regularization for LLM Alignment Thong Bach et.al. 2604.17210 null
2026-04-18 Complementing Self-Consistency with Cross-Model Disagreement for Uncertainty Quantification Kimia Hamidieh et.al. 2604.17112 null
2026-04-17 Sketching the Readout of Large Language Models for Scalable Data Attribution and Valuation Yide Ran et.al. 2604.16197 null
2026-04-17 CiPO: Counterfactual Unlearning for Large Reasoning Models through Iterative Preference Optimization Junyi Li et.al. 2604.15847 null
2026-04-17 Into the Gray Zone: Domain Contexts Can Blur LLM Safety Boundaries Ki Sen Hung et.al. 2604.15717 null
2026-04-17 Towards Robust Endogenous Reasoning: Unifying Drift Adaptation in Non-Stationary Tuning Xiaoyu Yang et.al. 2604.15705 null
2026-04-17 C-Mining: Unsupervised Discovery of Seeds for Cultural Data Synthesis via Geometric Misalignment Pufan Zeng et.al. 2604.15675 null
2026-04-17 GroupDPO: Memory efficient Group-wise Direct Preference Optimization Jixuan Leng et.al. 2604.15602 null
2026-04-16 StoSignSGD: Unbiased Structural Stochasticity Fixes SignSGD for Training Large Language Models Dingzhi Yu et.al. 2604.15416 null
2026-04-16 MADE: A Living Benchmark for Multi-Label Text Classification with Uncertainty Quantification of Medical Device Adverse Events Raunak Agarwal et.al. 2604.15203 null
2026-04-16 RaTA-Tool: Retrieval-based Tool Selection with Multimodal Large Language Models Gabriele Mattioli et.al. 2604.14951 null
2026-04-16 WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training Yifu Chen et.al. 2604.14932 null
2026-04-16 Reasoning Dynamics and the Limits of Monitoring Modality Reliance in Vision-Language Models Danae Sánchez Villegas et.al. 2604.14888 null
2026-04-16 CoTEvol: Self-Evolving Chain-of-Thoughts for Data Synthesis in Mathematical Reasoning Zhuo Wang et.al. 2604.14768 null
2026-04-16 Switching Efficiency: A Novel Framework for Dissecting AI Data Center Network Efficiency Niangen Ye et.al. 2604.14690 null
2026-04-16 SPAGBias: Uncovering and Tracing Structured Spatial Gender Bias in Large Language Models Binxian Su et.al. 2604.14672 null
2026-04-15 FoodSense: A Multisensory Food Dataset and Benchmark for Predicting Taste, Smell, Texture, and Sound from Images Sabab Ishraq et.al. 2604.14388 null
2026-04-15 The Cost of Language: Centroid Erasure Exposes and Exploits Modal Competition in Multimodal Language Models Akshay Paruchuri et.al. 2604.14363 null
2026-04-15 DharmaOCR: Specialized Small Language Models for Structured OCR that outperform Open-Source and Commercial Baselines Gabriel Pimenta de Freitas Cardoso et.al. 2604.14314 null
2026-04-15 Don’t Let the Video Speak: Audio-Contrastive Preference Optimization for Audio-Visual Language Models Ami Baid et.al. 2604.14129 null
2026-04-15 TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration Zerun Ma et.al. 2604.14116 null
2026-04-15 MAny: Merge Anything for Multimodal Continual Instruction Tuning Zijian Gao et.al. 2604.14016 null
2026-04-15 Do We Still Need Humans in the Loop? Comparing Human and LLM Annotation in Active Learning for Hostility Detection Ahmad Dawar Hakimi et.al. 2604.13899 null
2026-04-15 SparseBalance: Load-Balanced Long Context Training with Dynamic Sparse Attention Hongtao Xu et.al. 2604.13847 null
2026-04-15 Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges Xiaohua Wang et.al. 2604.13602 null
2026-04-15 SAKURAONE: An Open Ethernet-Based AI HPC System and Its Observed Workload Dynamics in a Single-Tenant LLM Development Environment Fumikazu Konishi et.al. 2604.13600 null
2026-04-15 Debate to Align: Reliable Entity Alignment through Two-Stage Multi-Agent Debate Cunda Wang et.al. 2604.13551 null
2026-04-15 Synthesizing Instruction-Tuning Datasets with Contrastive Decoding Tatsuya Ichinose et.al. 2604.13538 null
2026-04-14 Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization Aadyot Bhatnagar et.al. 2604.13175 null
2026-04-14 Visual Preference Optimization with Rubric Rewards Ya-Qi Yu et.al. 2604.13029 null
2026-04-14 One Token Away from Collapse: The Fragility of Instruction-Tuned Helpfulness Erfan Baghaei Potraghloo et.al. 2604.13006 null
2026-04-14 Boosting Visual Instruction Tuning with Self-Supervised Guidance Sophia Sirko-Galouchenko et.al. 2604.12966 null
2026-04-14 From Imitation to Discrimination: Progressive Curriculum Learning for Robust Web Navigation Chuang Peng et.al. 2604.12666 null
2026-04-14 Safety Training Modulates Harmful Misalignment Under On-Policy RL, But Direction Depends on Environment Design Leon Eshuijs et.al. 2604.12500 null
2026-04-14 Analyzing the Effect of Noise in LLM Fine-tuning Lingfang Li et.al. 2604.12469 null
2026-04-14 Three Birds, One Stone: Solving the Communication-Memory-Privacy Trilemma in LLM Fine-tuning Over Wireless Networks with Zeroth-Order Optimization Zhijie Cai et.al. 2604.12401 null
2026-04-14 AgenticAI-DialogGen: Topic-Guided Conversation Generation for Fine-Tuning and Evaluating Short- and Long-Term Memories of LLMs Manoj Madushanka Perera et.al. 2604.12179 null
2026-04-14 Nucleus-Image: Sparse MoE for Image Generation Chandan Akiti et.al. 2604.12163 null
2026-04-13 Narrative over Numbers: The Identifiable Victim Effect and its Amplification Under Alignment and Reasoning in Large Language Models Syed Rifat Raiyan et.al. 2604.12076 null
2026-04-13 CLSGen: A Dual-Head Fine-Tuning Framework for Joint Probabilistic Classification and Verbalized Explanation WonJin Yoon et.al. 2604.11801 null
2026-04-13 RPA-Check: A Multi-Stage Automated Framework for Evaluating Dynamic LLM-based Role-Playing Agents Riccardo Rosati et.al. 2604.11655 null
2026-04-13 MLLM-as-a-Judge Exhibits Model Preference Bias Shuitsu Koyama et.al. 2604.11589 null
2026-04-13 OOM-RL: Out-of-Money Reinforcement Learning Market-Driven Alignment for LLM-Based Multi-Agent Systems Kun Liu et.al. 2604.11477 null
2026-04-13 Mobile GUI Agent Privacy Personalization with Trajectory Induced Preference Optimization Zhixin Lin et.al. 2604.11259 null
2026-04-13 BITS Pilani at SemEval-2026 Task 9: Structured Supervised Fine-Tuning with DPO Refinement for Polarization Detection Atharva Gupta et.al. 2604.11121 null
2026-04-13 DDO-RM for LLM Preference Optimization: A Minimal Held-Out Benchmark against DPO Tiantian Zhang et.al. 2604.11119 null
2026-04-12 Advancing Polish Language Modeling through Tokenizer Optimization in the Bielik v3 7B and 11B Series Krzysztof Ociepa et.al. 2604.10799 null
2026-04-12 Teaching Language Models How to Code Like Learners: Conversational Serialization for Student Simulation Charles Koutcheme et.al. 2604.10720 null
2026-04-12 ProUIE: A Macro-to-Micro Progressive Learning Method for LLM-based Universal Information Extraction Wenda Liu et.al. 2604.10633 null
2026-04-12 CogInstrument: Modeling Cognitive Processes for Bidirectional Human-LLM Alignment in Planning Tasks Anqi Wang et.al. 2604.10587 null
2026-04-12 Calibration Collapse Under Sycophancy Fine-Tuning: How Reward Hacking Breaks Uncertainty Quantification in LLMs Subramanyam Sahoo et.al. 2604.10585 null
2026-04-10 Think Less, Know More: State-Aware Reasoning Compression with Knowledge Guidance for Efficient Reasoning Yi Sui et.al. 2604.09150 null
2026-04-10 NyayaMind- A Framework for Transparent Legal Reasoning and Judgment Prediction in the Indian Legal System Parjanya Aditya Shukla et.al. 2604.09069 null
2026-04-10 TaxPraBen: A Scalable Benchmark for Structured Evaluation of LLMs in Chinese Real-World Tax Practice Gang Hu et.al. 2604.08948 null
2026-04-09 Cards Against LLMs: Benchmarking Humor Alignment in Large Language Models Yousra Fettach et.al. 2604.08757 null
2026-04-09 Decomposing the Delta: What Do Models Actually Learn from Preference Pairs? Chia-Hsuan Lee et.al. 2604.08723 null
2026-04-09 SUPERNOVA: Eliciting General Reasoning in LLMs with Reinforcement Learning on Natural Instructions Ashima Suvarna et.al. 2604.08477 null
2026-04-09 ProMedical: Hierarchical Fine-Grained Criteria Modeling for Medical LLM Alignment via Explicit Injection He Geng et.al. 2604.08326 null
2026-04-09 Towards Identification and Intervention of Safety-Critical Parameters in Large Language Models Weiwei Qi et.al. 2604.08297 null
2026-04-09 Self-Debias: Self-correcting for Debiasing Large Language Models Xuan Feng et.al. 2604.08243 null
2026-04-09 EditCaption: Human-Aligned Instruction Synthesis for Image Editing via Supervised Fine-Tuning and Direct Preference Optimization Xiangyuan Wang et.al. 2604.08213 null
2026-04-09 Vision-Language Foundation Models for Comprehensive Automated Pavement Condition Assessment Blessing Agyei Kyem et.al. 2604.08212 null
2026-04-09 Aligning Agents via Planning: A Benchmark for Trajectory-Level Reward Modeling Jiaxuan Wang et.al. 2604.08178 null
2026-04-09 DSCA: Dynamic Subspace Concept Alignment for Lifelong VLM Editing Gyanendra Das et.al. 2604.07965 null
2026-04-09 Rethinking Data Mixing from the Perspective of Large Language Models Yuanjian Xu et.al. 2604.07963 null
2026-04-09 Large Language Model Post-Training: A Unified View of Off-Policy and On-Policy Learning Shiwan Zhao et.al. 2604.07941 null
2026-04-08 VersaVogue: Visual Expert Orchestration and Preference Alignment for Unified Fashion Synthesis Jian Yu et.al. 2604.07210 null
2026-04-08 Gemma 4, Phi-4, and Qwen3: Accuracy-Efficiency Tradeoffs in Dense and MoE Reasoning Language Models Md Motaleb Hossen Manik et.al. 2604.07035 null
2026-04-08 MARS: Enabling Autoregressive Models Multi-Token Generation Ziqi Jin et.al. 2604.07023 null
2026-04-08 Beyond Accuracy: Diagnosing Algebraic Reasoning Failures in LLMs Across Nine Complexity Dimensions Parth Patil et.al. 2604.06799 null
2026-04-08 Multi-Faceted Self-Consistent Preference Alignment for Query Rewriting in Conversational Search Zhiyu Cao et.al. 2604.06771 null
2026-04-08 The Theorems of Dr. David Blackwell and Their Contributions to Artificial Intelligence Napoleon Paxton et.al. 2604.06621 null
2026-04-07 Limits of Difficulty Scaling: Hard Samples Yield Diminishing Returns in GRPO-Tuned SLMs Suraj Yadav et.al. 2604.06298 null
2026-04-07 Stories of Your Life as Others: A Round-Trip Evaluation of LLM-Generated Life Stories Conditioned on Rich Psychometric Profiles Ben Wigler et.al. 2604.06071 null
2026-04-07 How LLMs Follow Instructions: Skillful Coordination, Not a Universal Mechanism Elisabetta Rocchetti et.al. 2604.06015 null
2026-04-07 Beyond Compromise: Pareto-Lenient Consensus for Efficient Multi-Preference LLM Alignment Renxuan Tan et.al. 2604.05965 null
2026-04-07 BOSCH: Black-Box Binary Optimization for Short-Context Attention-Head Selection in LLMs Abbas Ghaddar et.al. 2604.05942 null
2026-04-07 JD-BP: A Joint-Decision Generative Framework for Auto-Bidding and Pricing Linghui Meng et.al. 2604.05845 null
2026-04-07 Vision-Guided Iterative Refinement for Frontend Code Generation Hannah Sansford et.al. 2604.05839 null
2026-04-07 Controlling Distributional Bias in Multi-Round LLM Generation via KL-Optimized Fine-Tuning Yanbei Jiang et.al. 2604.05756 null
2026-04-06 Instruction-Tuned LLMs for Parsing and Mining Unstructured Logs on Leadership HPC Systems Ahmad Maroof Karimi et.al. 2604.05168 null
2026-04-06 SenseAI: A Human-in-the-Loop Dataset for RLHF-Aligned Financial Sentiment Reasoning Berny Kabalisa et.al. 2604.05135 null
2026-04-06 Offline RL for Adaptive Policy Retrieval in Prior Authorization Ruslan Sharifullin et.al. 2604.05125 null
2026-04-06 One Model for All: Multi-Objective Controllable Language Models Qiang He et.al. 2604.04497 null
2026-04-06 MolDA: Molecular Understanding and Generation via Large Language Diffusion Model Seohyeon Shin et.al. 2604.04403 null
2026-04-06 Developing Authentic Simulated Learners for Mathematics Teacher Learning: Insights from Three Approaches with Large Language Models Jie Cao et.al. 2604.04361 null
2026-04-05 APPA: Adaptive Preference Pluralistic Alignment for Fair Federated RLHF of LLMs Mahmoud Srewa et.al. 2604.04261 null
2026-04-05 DARE: Diffusion Large Language Models Alignment and Reinforcement Executor Jingyi Yang et.al. 2604.04215 null
2026-04-05 A Semi-Automated Annotation Workflow for Paediatric Histopathology Reports Using Small Language Models Avish Vijayaraghavan et.al. 2604.04168 null
2026-04-05 Extracting and Steering Emotion Representations in Small Language Models: A Methodological Comparison Jihoon Jeong et.al. 2604.04064 null
2026-04-05 COBOL-Coder: Domain-Adapted Large Language Models for COBOL Code Generation and Translation Anh T. V. Dau et.al. 2604.03986 null
2026-04-05 SafeCtrl: Region-Aware Safety Control for Text-to-Image Diffusion via Detect-Then-Suppress Lingyun Zhang et.al. 2604.03941 null
2026-04-04 Where to Steer: Input-Dependent Layer Selection for Steering Improves LLM Alignment Soham Gadgil et.al. 2604.03867 null