CV Arxiv Daily

Updated on 2026.05.06

world model

Publish Date	Title	Authors	PDF	Code
2026-05-05	Implementing True MPI Sessions and Evaluating MPI Initialization Scalability	Hui Zhou et.al.	2605.03983	null
2026-05-05	A Benchmark for Interactive World Models with a Unified Action Generation Framework	Jianjie Fang et.al.	2605.03941	null
2026-05-05	RoboAlign-R1: Distilled Multimodal Reward Alignment for Robot Video World Models	Hao Wu et.al.	2605.03821	null
2026-05-05	What You Think is What You See: Driving Exploration in VLM Agents via Visual-Linguistic Curiosity	Haoxi Li et.al.	2605.03782	null
2026-05-05	AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics	Tencent HY Team et.al.	2605.03652	null
2026-05-05	Learning to Theorize the World from Observation	Doojin Baek et.al.	2605.03413	null
2026-05-04	Existence, Asymptotic Behavior, and Numerical Analysis of a Generalized Abel Differential Equation with Applications in Financial Modeling	Dragos-Patru Covei et.al.	2605.02831	null
2026-05-04	DynoSLAM: Dynamic SLAM with Generative Graph Neural Networks for Real-World Social Navigation	Danil Tokhchukov et.al.	2605.02759	null
2026-05-04	Shadow-Loom: Causal Reasoning over Graphical World Model of Narratives	David Wilmot et.al.	2605.02475	null
2026-05-04	Video Generation with Predictive Latents	Yian Zhao et.al.	2605.02134	null
2026-05-03	TRAP: Tail-aware Ranking Attack for World-Model Planning	Siyuan Duan et.al.	2605.01950	null
2026-05-03	Divide and Conquer: Decoupled Representation Alignment for Multimodal World Models	Junyuan Xiao et.al.	2605.01896	null
2026-05-03	Embody4D: A Generalist 4D World Model for Embodied AI	Peiyan Tu et.al.	2605.01799	null
2026-05-03	SignVerse-2M: A Two-Million-Clip Pose-Native Universe of 25+ Sign Languages	Sen Fang et.al.	2605.01720	null
2026-05-03	Latent State Design for World Models under Sufficiency Constraints	Keon Woo Kim et.al.	2605.01694	null
2026-05-03	Video Active Perception: Effective Inference-Time Long-Form Video Understanding with Vision-Language Models	Martin Q. Ma et.al.	2605.01662	null
2026-05-01	Physically Native World Models: A Hamiltonian Perspective on Generative World Modeling	Sen Cui et.al.	2605.00412	null
2026-04-30	World Model for Robot Learning: A Comprehensive Survey	Bohan Hou et.al.	2605.00080	null
2026-04-30	Being-H0.7: A Latent World-Action Model from Egocentric Videos	Hao Luo et.al.	2605.00078	null
2026-04-30	HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation	Xin Zhou et.al.	2604.28196	null
2026-04-30	LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models	Hao Chen et.al.	2604.28192	null
2026-04-30	Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling	Keming Wu et.al.	2604.28185	null
2026-04-30	Beyond Gaussian Bottlenecks: Topologically Aligned Encoding of Vision-Transformer Feature Spaces	Andrew Bond et.al.	2604.28122	null
2026-04-30	Dreaming Across Towns: Semantic Rollout and Town-Adversarial Regularization for Zero-Shot Held-Out-Town Fixed-Route Driving in CARLA	Feeza Khan Khanzada et.al.	2604.27994	null
2026-04-30	GUI Agents with Reinforcement Learning: Toward Digital Inhabitants	Junan Hu et.al.	2604.27955	null
2026-04-30	Flying by Inference: Active Inference World Models for Adaptive UAV Swarms	Kaleem Arshid et.al.	2604.27935	null
2026-04-30	Simulating clinical interventions with a generative multimodal model of human physiology	Guy Lutsker et.al.	2604.27899	null
2026-04-30	Graph World Models: Concepts, Taxonomy, and Future Directions	Jiawei Liu et.al.	2604.27895	null
2026-04-30	MotuBrain: An Advanced World Action Model for Robot Control	MotuBrain Team et.al.	2604.27792	null
2026-04-29	World2VLM: Distilling World Model Imagination into VLMs for Dynamic Spatial Reasoning	Wanyue Zhang et.al.	2604.26934	null
2026-04-29	STARRY: Spatial-Temporal Action-Centric World Modeling for Robotic Manipulation	Yuxuan Tian et.al.	2604.26848	null
2026-04-29	Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising	Jun Guo et.al.	2604.26694	null
2026-04-29	AGEL-Comp: A Neuro-Symbolic Framework for Compositional Generalization in Interactive Agents	Mahnoor Shahid et.al.	2604.26522	null
2026-04-29	DepthPilot: From Controllability to Interpretability in Colonoscopy Video Generation	Junhu Fu et.al.	2604.26232	null
2026-04-28	Lifting Embodied World Models for Planning and Control	Alex N. Wang et.al.	2604.26182	null
2026-04-28	HuM-Eval: A Coarse-to-Fine Framework for Human-Centric Video Evaluation	Bingzi Zhang et.al.	2604.25361	null
2026-04-28	ProDrive: Proactive Planning for Autonomous Driving via Ego-Environment Co-Evolution	Chuyao Fu et.al.	2604.25329	null
2026-04-27	Unfolding an Atomistic World: Atomistic Simulation of Reactor Pressure Vessel Steel Across Year-and-Meter Scales	Haozhi Han et.al.	2604.24091	null
2026-04-26	From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation	Jiafeng Wu et.al.	2604.23629	null
2026-04-26	Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling	Zhen Ye et.al.	2604.23586	null
2026-04-26	Emotion-Conditioned Short-Horizon Human Pose Forecasting with a Lightweight Predictive World Model	Jingni Huang et.al.	2604.23532	null
2026-04-25	Active Inference: A method for Phenotyping Agency in AI systems?	Philip Wilson et.al.	2604.23278	null
2026-04-24	Beyond Single-Agent Alignment: Preventing Context-Fragmented Violations in Multi-Agent Systems	Jie Wu et.al.	2604.22879	null
2026-04-24	Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond	Meng Chu et.al.	2604.22748	null
2026-04-24	Beyond Patient Invariance: Learning Cardiac Dynamics via Action-Conditioned JEPAs	Jose Geraldo Fernandes et.al.	2604.22618	null
2026-04-24	Video Analysis and Generation via a Semantic Progress Function	Gal Metzer et.al.	2604.22554	null
2026-04-24	OccDirector: Language-Guided Behavior and Interaction Generation in 4D Occupancy Space	Zhuding Liang et.al.	2604.22240	null
2026-04-24	A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism, Governance, and Dynamics in Complex Societies	Somyajit Chakraborty et.al.	2604.22227	null
2026-04-24	dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model	Yaxuan Li et.al.	2604.22152	null
2026-04-23	Causality and Semantic Separation	Anna Zhang et.al.	2604.22041	null
2026-04-23	Seeing Fast and Slow: Learning the Flow of Time in Videos	Yen-Siang Wu et.al.	2604.21931	null
2026-04-23	Machine Behavior in Relational Moral Dilemmas: Moral Rightness, Predicted Human Behavior, and Model Decisions	Jiseon Kim et.al.	2604.21871	null
2026-04-23	Hi-WM: Human-in-the-World-Model for Scalable Robot Post-Training	Yaxuan Li et.al.	2604.21741	null
2026-04-22	Building a Precise Video Language with Human-AI Oversight	Zhiqiu Lin et.al.	2604.21718	null
2026-04-23	WorldMark: A Unified Benchmark Suite for Interactive Video World Models	Xiaojie Xu et.al.	2604.21686	null
2026-04-22	Agentic AI for Personalized Physiotherapy: A Multi-Agent Framework for Generative Video Training and Real-Time Pose Correction	Abhishek Dharmaratnakar et.al.	2604.21154	null
2026-04-22	Open-H-Embodiment: A Large-Scale Dataset for Enabling Foundation Models in Medical Robotics	Open-H-Embodiment Consortium et.al.	2604.21017	null
2026-04-22	DeVI: Physics-based Dexterous Human-Object Interaction via Synthetic Video Imitation	Hyeonwoo Kim et.al.	2604.20841	null
2026-04-22	Occupancy Reward Shaping: Improving Credit Assignment for Offline Goal-Conditioned Reinforcement Learning	Aravind Venugopal et.al.	2604.20627	null
2026-04-22	CCTVBench: Contrastive Consistency Traffic VideoQA Benchmark for Multimodal LLMs	Xingcheng Zhou et.al.	2604.20460	null
2026-04-22	X-Cache: Cross-Chunk Block Caching for Few-Step Autoregressive World Models Inference	Yixiao Zeng et.al.	2604.20289	null
2026-04-22	Cortex 2.0: Grounding World Models in Real-World Industrial Deployment	Adriana Aida et.al.	2604.20246	null
2026-04-22	Toward Safe Autonomous Robotic Endovascular Interventions using World Models	Harry Robertshaw et.al.	2604.20151	null
2026-04-21	ChipCraftBrain: Validation-First RTL Generation via Multi-Agent Orchestration	Cagri Eryilmaz et.al.	2604.19856	null
2026-04-21	CityRAG: Stepping Into a City via Spatially-Grounded Video Generation	Gene Chou et.al.	2604.19741	null
2026-04-21	UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling	Boyu Chen et.al.	2604.19734	null
2026-04-22	Mask World Model: Predicting What Matters for Robust Robot Policy Learning	Yunfan Lou et.al.	2604.19683	null
2026-04-21	Safety-Critical Contextual Control via Online Riemannian Optimization with World Models	Tongxin Li et.al.	2604.19639	null
2026-04-21	LASER: Learning Active Sensing for Continuum Field Reconstruction	Huayu Deng et.al.	2604.19355	null
2026-04-21	RoboWM-Bench: A Benchmark for Evaluating World Models in Robotic Manipulation	Feng Jiang et.al.	2604.19092	null
2026-04-20	Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training	Vin Bhaskara et.al.	2604.18701	null
2026-04-21	MultiWorld: Scalable Multi-Agent Multi-View Video World Models	Haoyu Wu et.al.	2604.18564	null
2026-04-20	OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation	Jinghui Lu et.al.	2604.18486	null
2026-04-20	Sonata: A Hybrid World Model for Inertial Kinematics under Clinical Data Scarcity	Blaise Delaney et.al.	2604.18058	null
2026-04-20	The Umwelt Representation Hypothesis: Rethinking Universality	Victoria Bosch et.al.	2604.17960	null
2026-04-20	Scaling Human-AI Coding Collaboration Requires a Governable Consensus Layer	Tianfu Wang et.al.	2604.17883	null
2026-04-19	Infrastructure-Centric World Models: Bridging Temporal Depth and Spatial Breadth for Roadside Perception	Siyuan Meng et.al.	2604.17651	null
2026-04-19	Dual-Anchoring: Addressing State Drift in Vision-Language Navigation	Kangyi Wu et.al.	2604.17473	null
2026-04-19	Long-CODE: Isolating Pure Long-Context as an Orthogonal Dimension in Video Evaluation	Zhijiang Tang et.al.	2604.17428	null
2026-04-19	DreamShot: Personalized Storyboard Synthesis with Video Diffusion Prior	Junjia Huang et.al.	2604.17195	null
2026-04-18	TensorHub: Rethinking AI Model Hub with Tensor-Centric Compression	Tingfeng Lan et.al.	2604.17104	null
2026-04-18	LIVE: Leveraging Image Manipulation Priors for Instruction-based Video Editing	Weicheng Wang et.al.	2604.17021	null
2026-04-18	SafeDream: Safety World Model for Proactive Early Jailbreak Detection	Bo Yan et.al.	2604.16824	null
2026-04-16	POMDP-based Object Search with Growing State Space and Hybrid Action Domain	Yongbo Chen et.al.	2604.14965	null
2026-04-16	Learning Ad Hoc Network Dynamics via Graph-Structured World Models	Can Karacelebi et.al.	2604.14811	null
2026-04-16	World-Value-Action Model: Implicit Planning for Vision-Language-Action Systems	Runze Li et.al.	2604.14732	null
2026-04-15	HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds	Team HY-World et.al.	2604.14268	null
2026-04-15	Seedance 2.0: Advancing Video Generation for World Complexity	Team Seedance et.al.	2604.14148	null
2026-04-15	Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective	Weijie Wang et.al.	2604.14025	null
2026-04-15	Beyond State Consistency: Behavior Consistency in Text-Based World Models	Youling Huang et.al.	2604.13824	null
2026-04-15	Vision-and-Language Navigation for UAVs: Progress, Challenges, and a Research Roadmap	Hanxuan Chen et.al.	2604.13654	null
2026-04-15	DiT as Real-Time Rerenderer: Streaming Video Stylization with Autoregressive Diffusion Transformer	Hengye Lyu et.al.	2604.13509	null
2026-04-15	VibeFlow: Versatile Video Chroma-Lux Editing through Self-Supervised Learning	Yifan Li et.al.	2604.13425	null
2026-04-14	Robotic Manipulation is Vision-to-Geometry Mapping ( $f(v) \rightarrow G$ ): Vision-Geometry Backbones over Language and Video Models	Zijian Song et.al.	2604.12908	null
2026-04-14	ArtifactWorld: Scaling 3D Gaussian Splatting Artifact Restoration via Video Generation Models	Xinliang Wang et.al.	2604.12251	null
2026-04-13	Grounded World Model for Semantically Generalizable Planning	Quanyi Li et.al.	2604.11751	null
2026-04-13	Dyadic Partnership(DP): A Missing Link Towards Full Autonomy in Medical Robotics	Nassir Navab et.al.	2604.11423	null
2026-04-13	ComSim: Building Scalable Real-World Robot Data Generation via Compositional Simulation	Yiran Qin et.al.	2604.11386	null
2026-04-13	WM-DAgger: Enabling Efficient Data Aggregation for Imitation Learning with World Models	Anlan Yu et.al.	2604.11351	null
2026-04-13	3D-Anchored Lookahead Planning for Persistent Robotic Scene Memory via World-Model-Based MCTS	Bronislav Sidik et.al.	2604.11302	null
2026-04-13	AIM: Intent-Aware Unified world action Modeling with Spatial Value Maps	Liaoyuan Fan et.al.	2604.11135	null
2026-04-13	From Topology to Trajectory: LLM-Driven World Models For Supply Chain Resilience	Jia Luo et.al.	2604.11041	null
2026-04-13	OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models	Xiaomeng Hu et.al.	2604.10866	null
2026-04-12	Do LLMs Build Spatial World Models? Evidence from Grid-World Maze Tasks	Weijiang Li et.al.	2604.10690	null
2026-04-11	Zero-shot World Models Are Developmentally Efficient Learners	Khai Loong Aw et.al.	2604.10333	null
2026-04-11	VGA-Bench: A Unified Benchmark and Multi-Model Framework for Video Aesthetics and Generation Quality Evaluation	Longteng Jiang et.al.	2604.10127	null
2026-04-10	EgoTL: Egocentric Think-Aloud Chains for Long-Horizon Tasks	Lulin Liu et.al.	2604.09535	null
2026-04-10	Toward World Models for Epidemiology	Zeeshan Memon et.al.	2604.09519	null
2026-04-10	PhysInOne: Visual Physics Learning and Reasoning in One Suite	Siyuan Zhou et.al.	2604.09415	null
2026-04-10	VAG: Dual-Stream Video-Action Generation for Embodied Data Synthesis	Xiaolei Lang et.al.	2604.09330	null
2026-04-10	Learning Vision-Language-Action World Models for Autonomous Driving	Guoqing Wang et.al.	2604.09059	null
2026-04-10	Advantage-Guided Diffusion for Model-Based Reinforcement Learning	Daniele Foffano et.al.	2604.09035	null
2026-04-10	Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory	Zile Wang et.al.	2604.08995	null
2026-04-10	WOMBET: World Model-based Experience Transfer for Robust and Sample-efficient Reinforcement Learning	Mintae Kim et.al.	2604.08958	null
2026-04-10	Multi-Agent Decision-Focused Learning via Value-Aware Sequential Communication	Benjamin Amoh et.al.	2604.08944	null
2026-04-09	Toward Hardware-Agnostic Quadrupedal World Models via Morphology Conditioning	Mohamad H. Danesh et.al.	2604.08780	null
2026-04-09	Phantom: Physics-Infused Video Generation via Joint Modeling of Visual and Latent Physical Dynamics	Ying Shen et.al.	2604.08503	null
2026-04-09	Grounding Clinical AI Competency in Human Cognition Through the Clinical World Model and Skill-Mix Framework	Seyed Amir Ahmad Safavi-Naini et.al.	2604.08226	null
2026-04-09	Beyond Static Forecasting: Unleashing the Power of World Models for Mobile Traffic Extrapolation	Xiaoqian Qi et.al.	2604.08199	null
2026-04-09	ViVa: A Video-Generative Value Model for Robot Reinforcement Learning	Jindi Lv et.al.	2604.08168	null
2026-04-09	MotionScape: A Large-Scale Real-World Highly Dynamic UAV Video Dataset for World Models	Zile Guo et.al.	2604.07991	null
2026-04-09	WorldMAP: Bootstrapping Vision-Language Navigation Trajectory Prediction with Generative World Models	Hongjin Chen et.al.	2604.07957	null
2026-04-09	DailyArt: Discovering Articulation from Single Static Images via Latent Dynamics	Hang Zhang et.al.	2604.07758	null
2026-04-09	CausalVAE as a Plug-in for World Models: Towards Reliable Counterfactual Dynamics	Ziyi Ding et.al.	2604.07712	null
2026-04-08	Grasp as You Dream: Imitating Functional Grasping from Generated Human Demonstrations	Chao Tang et.al.	2604.07517	null
2026-04-08	GIRL: Generative Imagination Reinforcement Learning via Information-Theoretic Hallucination Control	Prakul Sunil Hiremath et.al.	2604.07426	null
2026-04-08	How Much LLM Does a Self-Revising Agent Actually Need?	Seongwoo Jeong et.al.	2604.07236	null
2026-04-08	PhyEdit: Towards Real-World Object Manipulation via Physically-Grounded Image Editing	Ruihang Xu et.al.	2604.07230	null
2026-04-08	INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling	InSpatio Team et.al.	2604.07209	null
2026-04-08	Radio-Frequency Inverse Rendering for Wireless Environment Modeling	Fuhai Wang et.al.	2604.07086	null
2026-04-08	Telecom World Models: Unifying Digital Twins, Foundation Models, and Predictive Planning for 6G	Hang Zou et.al.	2604.06882	null
2026-04-08	The Rhetoric of Machine Learning	Robert C. Williamson et.al.	2604.06754	null
2026-04-08	Controllable Generative Video Compression	Ding Ding et.al.	2604.06655	null
2026-04-07	Neural Computers	Mingchen Zhuge et.al.	2604.06425	null
2026-04-07	Evolution of Video Generative Foundations	Teng Hu et.al.	2604.06339	null
2026-04-07	Action Images: End-to-End Policy Learning via Multiview Video Generation	Haoyu Zhen et.al.	2604.06168	null
2026-04-07	Toward Consistent World Models with Multi-Token Prediction and Latent Semantic Enhancement	Qimin Zhong et.al.	2604.06155	null
2026-04-07	SEM-ROVER: Semantic Voxel-Guided Diffusion for Large-Scale Driving Scene Generation	Hiba Dahmani et.al.	2604.06113	null
2026-04-06	Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding	Chaoyou Fu et.al.	2604.05015	null
2026-04-06	StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing	StarVLA Community et.al.	2604.05014	null
2026-04-06	A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens	Tommie Kerssies et.al.	2604.04913	null
2026-04-06	Individual and Combined Effects of English as a Second Language and Typos on LLM Performance	Serena Liu et.al.	2604.04723	null
2026-04-06	OpenWorldLib: A Unified Codebase and Definition of Advanced World Models	DataFlow Team et.al.	2604.04707	null
2026-04-06	Preserving Forgery Artifacts: AI-Generated Video Detection at Native Scale	Zhengcen Li et.al.	2604.04634	null
2026-04-06	Veo-Act: How Far Can Frontier Video Models Advance Generalizable Robot Manipulation?	Zhongru Zhang et.al.	2604.04502	null
2026-04-06	UENR-600K: A Large-Scale Physically Grounded Dataset for Nighttime Video Deraining	Pei Yang et.al.	2604.04402	null
2026-04-05	DriveVA: Video Action Models are Zero-Shot Drivers	Mengmeng Liu et.al.	2604.04198	null
2026-04-05	ATSS: Detecting AI-Generated Videos via Anomalous Temporal Self-Similarity	Hang Wang et.al.	2604.04029	null
2026-04-04	Rethinking Position Embedding as a Context Controller for Multi-Reference and Multi-Shot Video Generation	Binyuan Huang et.al.	2604.03738	null
2026-04-04	VidNum-1.4K: A Comprehensive Benchmark for Video-based Numerical Reasoning	Shaoyang Cui et.al.	2604.03701	null

embodied AI

Publish Date	Title	Authors	PDF	Code
2026-05-04	Channel-Level Relation to Attentive Aggregation with Neighborhood-Homogeneity Constraint for Point Cloud Analysis	Jiaqi Shi et.al.	2605.02357	null
2026-05-03	Embody4D: A Generalist 4D World Model for Embodied AI	Peiyan Tu et.al.	2605.01799	null
2026-05-02	ESARBench: A Benchmark for Agentic UAV Embodied Search and Rescue	Daoxuan Zhang et.al.	2605.01371	null
2026-05-02	VUDA: Breaking CUDA-Vulkan Isolation for Spatial Sharing of Compute and Graphics on the Same GPU	Bin Xu et.al.	2605.01352	null
2026-05-01	Split and Aggregation Learning for Foundation Models Over Mobile Embodied AI Network (MEAN): A Comprehensive Survey	Qianzhou Chen et.al.	2605.00970	null
2026-05-01	Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning	Chengshuai Shi et.al.	2605.00347	null
2026-04-30	World Model for Robot Learning: A Comprehensive Survey	Bohan Hou et.al.	2605.00080	null
2026-04-30	Bridging Values and Behavior: A Hierarchical Framework for Proactive Embodied Agents	Chunhui Zhang et.al.	2604.27699	null
2026-04-30	Robot Learning from Human Videos: A Survey	Junyi Ma et.al.	2604.27621	null
2026-04-30	SpaAct: Spatially-Activated Transition Learning with Curriculum Adaptation for Vision-Language Navigation	Pengna Li et.al.	2604.27620	null
2026-04-30	World2Minecraft: Occupancy-Driven Simulated Scenes Construction	Lechao Zhang et.al.	2604.27578	null
2026-04-30	SpatialGrammar: A Domain-Specific Language for LLM-Based 3D Indoor Scene Generation	Song Tang et.al.	2604.27555	null
2026-04-30	Context as Prior: Bayesian-Inspired Intent Inference for Non-Speaking Agents with a Household Cat Testbed	Wenqian Zhang et.al.	2604.27445	null
2026-04-29	3D Generation for Embodied AI and Robotic Simulation: A Survey	Tianwei Ye et.al.	2604.26509	null
2026-04-29	Multiple Consistent 2D-3D Mappings for Robust Zero-Shot 3D Visual Grounding	Yufei Yin et.al.	2604.26261	null
2026-04-28	Lifting Embodied World Models for Planning and Control	Alex N. Wang et.al.	2604.26182	null
2026-04-28	GS-Playground: A High-Throughput Photorealistic Simulator for Vision-Informed Robot Learning	Yufei Jia et.al.	2604.25459	null
2026-04-28	Where Did It Go Wrong? Capability-Oriented Failure Attribution for Vision-and-Language Navigation Agents	Jianming Chen et.al.	2604.25161	null
2026-04-27	Interoceptive machine framework: Toward interoception-inspired regulatory architectures in artificial intelligence	Diego Candia-Rivera et.al.	2604.24527	null
2026-04-27	AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents	Hojoon Kim et.al.	2604.24039	null
2026-04-26	From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation	Jiafeng Wu et.al.	2604.23629	null
2026-04-26	PhysCodeBench: Benchmarking Physics-Aware Symbolic Simulation of 3D Scenes via Self-Corrective Multi-Agent Refinement	Tianyidan Xie et.al.	2604.23580	null
2026-04-24	AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI	Mohammad Sadegh Salehi et.al.	2604.23018	null
2026-04-22	EgoDyn-Bench: Evaluating Ego-Motion Understanding in Vision-Centric Foundation Models for Autonomous Driving	Finn Rasmus Schäfer et.al.	2604.22851	null
2026-04-27	A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism, Governance, and Dynamics in Complex Societies	Somyajit Chakraborty et.al.	2604.22227	null
2026-04-23	A Replicable Robotics Awareness Method Using LLM-Enabled Robotics Interaction: Evidence from a Corporate Challenge	S. A. Prieto et.al.	2604.21377	null
2026-04-23	ReCAPA: Hierarchical Predictive Correction to Mitigate Cascading Failures	Xiyin Zeng et.al.	2604.21232	null
2026-04-23	Reinforcing 3D Understanding in Point-VLMs via Geometric Reward Credit Assignment	Jingkun Chen et.al.	2604.21160	null
2026-04-22	Planetary Exploration 3.0: A Roadmap for Software-Defined, Radically Adaptive Space Systems	Masahiro Ono et.al.	2604.20910	null
2026-04-22	LLM-Guided Safety Agent for Edge Robotics with an ISO-Compliant Perception-Compute-Control Architecture	Xu Huang et.al.	2604.20193	null
2026-04-21	Environmental Understanding Vision-Language Model for Embodied Agent	Jinsik Bang et.al.	2604.19839	null
2026-04-21	InHabit: Leveraging Image Foundation Models for Scalable 3D Human Placement	Nikita Kister et.al.	2604.19673	null
2026-04-21	SafetyALFRED: Evaluating Safety-Conscious Planning of Multimodal Large Language Models	Josue Torres-Fonseca et.al.	2604.19638	null
2026-04-21	RoboWM-Bench: A Benchmark for Evaluating World Models in Robotic Manipulation	Feng Jiang et.al.	2604.19092	null
2026-04-21	Explore Like Humans: Autonomous Exploration with Online SG-Memo Construction for Embodied Agents	Xu Chen et.al.	2604.19034	null
2026-04-20	Will People Enjoy a Robot Trainer? A Case Study with Snoopie the Pacerbot	Maximilian Du et.al.	2604.18331	null
2026-04-20	EmbodiedLGR: Integrating Lightweight Graph Representation and Retrieval for Semantic-Spatial Memory in Robotic Agents	Paolo Riva et.al.	2604.18271	null
2026-04-20	E3VS-Bench: A Benchmark for Viewpoint-Dependent Active Perception in 3D Gaussian Splatting Scenes	Koya Sakamoto et.al.	2604.17969	null
2026-04-20	StableIDM: Stabilizing Inverse Dynamics Model against Manipulator Truncation via Spatio-Temporal Refinement	Kerui Li et.al.	2604.17887	null
2026-04-20	OmniVLA-RL: A Vision-Language-Action Model with Spatial Understanding and Online RL	Haoxiang Jie et.al.	2604.17706	null
2026-04-19	Seeing Isn’t Believing: Mitigating Belief Inertia via Active Intervention in Embodied Agents	Hanlin Wang et.al.	2604.17252	null
2026-04-19	GaLa: Hypergraph-Guided Visual Language Models for Procedural Planning	Kun Wang et.al.	2604.17241	null
2026-04-18	Mini-BEHAVIOR-Gran: Revealing U-Shaped Effects of Instruction Granularity on Language-Guided Embodied Agents	Sukai Huang et.al.	2604.17019	null
2026-04-18	Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification	Jiawen Wen et.al.	2604.16993	null
2026-04-18	Chain Of Interaction Benchmark (COIN): When Reasoning meets Embodied Interaction	Xianhao Wang et.al.	2604.16886	null
2026-04-16	GIST: Multimodal Knowledge Extraction and Spatial Grounding via Intelligent Semantic Topology	Shivendra Agrawal et.al.	2604.15495	null
2026-04-20	ADAPT: Benchmarking Commonsense Planning under Unspecified Affordance Constraints	Pei-An Chen et.al.	2604.14902	null
2026-04-16	World-Value-Action Model: Implicit Planning for Vision-Language-Action Systems	Runze Li et.al.	2604.14732	null
2026-04-16	Model-Based Reinforcement Learning Exploits Passive Body Dynamics for High-Performance Biped Robot Locomotion	Tomoya Kamimura et.al.	2604.14565	null
2026-04-15	SpaceMind: A Modular and Self-Evolving Embodied Vision-Language Agent Framework for Autonomous On-orbit Servicing	Aodi Wu et.al.	2604.14399	null
2026-04-15	[Emerging Ideas] Artificial Tripartite Intelligence: A Bio-Inspired, Sensor-First Architecture for Physical AI	You Rim Choi et.al.	2604.13959	null
2026-04-15	EmbodiedClaw: Conversational Workflow Execution for Embodied AI Development	Xueyang Zhou et.al.	2604.13800	null
2026-04-15	ESCAPE: Episodic Spatial Memory and Adaptive Execution Policy for Long-Horizon Mobile Manipulation	Jingjing Qian et.al.	2604.13633	null
2026-04-16	VGGT-Segmentor: Geometry-Enhanced Cross-View Segmentation	Yulu Gao et.al.	2604.13596	null
2026-04-15	AgentComm: Semantic Communication for Embodied Agents	Peiwen Jiang et.al.	2604.13558	null
2026-04-15	Evolvable Embodied Agent for Robotic Manipulation via Long Short-Term Reflection and Optimization	Jianzong Wang et.al.	2604.13533	null
2026-04-14	Exploration and Exploitation Errors Are Measurable for Language Model Agents	Jaden Park et.al.	2604.13151	null
2026-04-14	Habitat-GS: A High-Fidelity Navigation Simulator with Dynamic Gaussian Splatting	Ziyuan Xia et.al.	2604.12626	null
2026-04-15	Reading Between the Pixels: Linking Text-Image Embedding Alignment to Typographic Attack Success on Vision-Language Models	Ravikumar Balakrishnan et.al.	2604.12371	null
2026-04-13	Human-Inspired Context-Selective Multimodal Memory for Social Robots	Hangyeol Kang et.al.	2604.12081	null
2026-04-13	GeomPrompt: Geometric Prompt Learning for RGB-D Semantic Segmentation Under Missing and Degraded Depth	Krishna Jaganathan et.al.	2604.11585	null
2026-04-13	DA-PTQ: Drift-Aware Post-Training Quantization for Efficient Vision-Language-Action Models	Siyuan Xu et.al.	2604.11572	null
2026-04-13	Efficient Emotion-Aware Iconic Gesture Prediction for Robot Co-Speech	Edwin C. Montiel-Vazquez et.al.	2604.11417	null
2026-04-13	EmbodiedGovBench: A Benchmark for Governance, Recovery, and Upgrade Safety in Embodied Agent Systems	Xue Qin et.al.	2604.11174	null
2026-04-13	EgoFun3D: Modeling Interactive Objects from Egocentric Videos using Function Templates	Weikun Peng et.al.	2604.11038	null
2026-04-13	Federated Single-Agent Robotics: Multi-Robot Coordination Without Intra-Robot Multi-Agent Fragmentation	Xue Qin et.al.	2604.11028	null
2026-04-14	ArtiCAD: Articulated CAD Assembly Design via Multi-Agent Code Generation	Yuan Shui et.al.	2604.10992	null
2026-04-13	ScoRe-Flow: Complete Distributional Control via Score-Based Reinforcement Learning for Flow Matching	Xiaotian Qiu et.al.	2604.10962	null
2026-04-12	ReplicateAnyScene: Zero-Shot Video-to-3D Composition via Textual-Visual-Spatial Alignment	Mingyu Dong et.al.	2604.10789	null
2026-04-12	HOG-Layout: Hierarchical 3D Scene Generation, Optimization and Editing via Vision-Language Models	Haiyan Jiang et.al.	2604.10772	null
2026-04-10	PhysInOne: Visual Physics Learning and Reasoning in One Suite	Siyuan Zhou et.al.	2604.09415	null
2026-04-10	V-CAGE: Vision-Closed-Loop Agentic Generation Engine for Robotic Manipulation	Yaru Liu et.al.	2604.09036	null
2026-04-10	PilotBench: A Benchmark for General Aviation Agents with Safety Constraints	Yalun Wu et.al.	2604.08987	null
2026-04-10	AssemLM: Spatial Reasoning Multimodal Large Language Models for Robotic Assembly	Zhi Jing et.al.	2604.08983	null
2026-04-09	AniGen: Unified $S^3$ Fields for Animatable 3D Asset Generation	Yi-Hua Huang et.al.	2604.08746	null
2026-04-09	3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding	Makanjuola Ogunleye et.al.	2604.08645	null
2026-04-09	Visually-grounded Humanoid Agents	Hang Ye et.al.	2604.08509	null
2026-04-10	PolySLGen: Online Multimodal Speaking-Listening Reaction Generation in Polyadic Interaction	Zhi-Yi Lin et.al.	2604.08125	null
2026-04-10	Governed Capability Evolution for Embodied Agents: Safe Upgrade, Compatibility Checking, and Runtime Rollback for Embodied Capability Modules	Xue Qin et.al.	2604.08059	null
2026-04-09	DP-DeGauss: Dynamic Probabilistic Gaussian Decomposition for Egocentric 4D Scene Reconstruction	Tingxi Chen et.al.	2604.07986	null
2026-04-09	PanoSAM2: Lightweight Distortion- and Memory-aware Adaptions of SAM2 for 360 Video Object Segmentation	Dingwen Xiao et.al.	2604.07901	null
2026-04-09	Object-Attribute-Relation Model Driven Adaptive Hierarchical Transmission for Multimodal Semantic Communication	Chenxing Li et.al.	2604.07859	null
2026-04-09	Harnessing Embodied Agents: Runtime Governance for Policy-Constrained Execution	Xue Qin et.al.	2604.07833	null
2026-04-09	Learning Without Losing Identity: Capability Evolution for Embodied Agents	Xue Qin et.al.	2604.07799	null
2026-04-09	DailyArt: Discovering Articulation from Single Static Images via Latent Dynamics	Hang Zhang et.al.	2604.07758	null
2026-04-08	Spatio-Temporal Grounding of Large Language Models from Perception Streams	Jacob Anderson et.al.	2604.07592	null
2026-04-08	Infrastructure First: Enabling Embodied AI for Science in the Global South	Shaoshan Liu et.al.	2604.06722	null
2026-04-07	Hazard Management in Robot-Assisted Mammography Support	Ioannis Stefanakos et.al.	2604.05749	null
2026-04-07	Rectified Schrödinger Bridge Matching for Few-Step Visual Navigation	Wuyang Luan et.al.	2604.05673	null
2026-04-07	Uncovering Linguistic Fragility in Vision-Language-Action Models via Diversity-Aware Red Teaming	Baoshun Tong et.al.	2604.05595	null
2026-04-07	CoEnv: Driving Embodied Multi-Agent Collaboration via Compositional Environment	Li Kang et.al.	2604.05484	null
2026-04-06	StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing	StarVLA Community et.al.	2604.05014	null
2026-04-06	InfBaGel: Human-Object-Scene Interaction Generation with Dynamic Perception and Iterative Refinement	Yude Zou et.al.	2604.04843	null
2026-04-06	Toward Self-Organizing Production Logistics in Circular Factories: A Multi-Agent Approach	Jan-Felix Klein et.al.	2604.04753	null
2026-04-06	ROSClaw: A Hierarchical Semantic-Physical Framework for Heterogeneous Multi-Agent Collaboration	Rongfeng Zhao et.al.	2604.04664	null
2026-04-05	Hypothesis Graph Refinement: Hypothesis-Driven Exploration with Cascade Error Correction for Embodied Navigation	Peixin Chen et.al.	2604.04108	null
2026-04-04	From Prompt to Physical Action: Structured Backdoor Attacks on LLM-Mediated Robotic Control Systems	Mingyang Xie et.al.	2604.03890	null
2026-04-03	Learning Additively Compositional Latent Actions for Embodied AI	Hangxing Wei et.al.	2604.03340	null
2026-04-03	OMNI-PoseX: A Fast Vision Model for 6D Object Pose Estimation in Embodied Tasks	Michael Zhang et.al.	2604.02759	null
2026-04-02	Reliability-Aware Geometric Fusion for Robust Audio-Visual Navigation	Teng Liu et.al.	2604.02391	null
2026-04-02	Hi-LOAM: Hierarchical Implicit Neural Fields for LiDAR Odometry and Mapping	Zhiliu Yang et.al.	2604.01720	null
2026-03-31	Benchmarking Interaction, Beyond Policy: a Reproducible Benchmark for Collaborative Instance Object Navigation	Edoardo Zorzi et.al.	2604.00265	null

image generation

Publish Date	Title	Authors	PDF	Code
2026-05-05	Large Language Models are Universal Reasoners for Visual Generation	Sucheng Ren et.al.	2605.04040	null
2026-05-05	Flow Sampling: Learning to Sample from Unnormalized Densities via Denoising Conditional Processes	Aaron Havens et.al.	2605.03984	null
2026-05-05	DMGD: Train-Free Dataset Distillation with Semantic-Distribution Matching in Diffusion Models	Qichao Wang et.al.	2605.03877	null
2026-05-05	Phase-Corrected Near-Field Microwave Imaging via Inverse Source Reconstruction with Modulated Signals	Quanfeng Wang et.al.	2605.03875	null
2026-05-05	Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation	Bin Wu et.al.	2605.03849	null
2026-05-05	Towards accurate extreme event likelihoods from diffusion model climate emulators	Peter Manshausen et.al.	2605.03802	null
2026-05-05	GeoTopoDiff: Learning Geometry–Topology Graph Priors through Boundary-Constrained Mixed Diffusion for Sparse-Slice 3D Porous Reconstruction	Yue Shi et.al.	2605.03764	null
2026-05-05	Agent-Based Modeling of Low-Emission Fertilizer Adoption for Dairy Farm Decarbonisation using Empirical Farm Data	Surya Jayakumar et.al.	2605.03648	null
2026-05-05	Diffusion Masked Pretraining for Dynamic Point Cloud	Zhuoyue Zhang et.al.	2605.03639	null
2026-05-05	Bridging the Embodiment Gap: Disentangled Cross-Embodiment Video Editing	Zhiyuan Li et.al.	2605.03637	null
2026-05-04	Active Sampling for Ultra-Low-Bit-Rate Video Compression via Conditional Controlled Diffusion	Amirhosein Javadi et.al.	2605.02849	null
2026-05-04	TOC-SR: Task-Optimal Compact diffusion for Image Super Resolution	Sowmya Vajrala et.al.	2605.02767	null
2026-05-04	SIAM: Head and Brain MRI Segmentation from Few High-Quality Templates via Synthetic Training	Romain Valabregue et.al.	2605.02737	null
2026-05-04	Stylistic Attribute Control in Latent Diffusion Models	Max Reimann et.al.	2605.02583	null
2026-05-04	MooD: An Efficient VA-Driven Affective Image Editing Framework via Fine-Grained Semantic Control	Xinyi Yin et.al.	2605.02521	null
2026-05-04	Anomaly-Preference Image Generation	Fuyun Wang et.al.	2605.02439	null
2026-05-04	DirectEdit: Step-Level Accurate Inversion for Flow-Based Image Editing	Desong Yang et.al.	2605.02417	null
2026-05-04	DriftDecode: One-Step Wireless Image Decoding via Drifting-Inspired Detail Recovery	Jingwen Fu et.al.	2605.02325	null
2026-05-04	Anon: Extrapolating Optimizer Adaptivity Across the Real Spectrum	Yiheng Zhang et.al.	2605.02317	null
2026-05-04	A Hybrid Approach for Closing the Sim2real Appearance Gap in Game Engine Synthetic Datasets	Stefanos Pasios et.al.	2605.02291	null
2026-05-01	Repurposing Image Diffusion Models for Adversarial Synthetic Structured Data: A Case Study of Ground Truth Drift	Adam Arthur et.al.	2605.00788	null
2026-05-01	Reconstruction of glymphatic transport fields from subject-specific imaging data, with particular emphasis on cerebrospinal fluid flow and tracer conservation	A. Derya Bakiler et.al.	2605.00730	null
2026-05-01	PhysEdit: Physically-Consistent Region-Aware Image Editing via Adaptive Spatio-Temporal Reasoning	Guandong Li et.al.	2605.00707	null
2026-05-01	STARE: Step-wise Temporal Alignment and Red-teaming Engine for Multi-modal Toxicity Attack	Xutao Mao et.al.	2605.00699	null
2026-05-01	UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors	Houyuan Chen et.al.	2605.00658	null
2026-05-01	Faithful Extreme Image Rescaling with Learnable Reversible Transformation and Semantic Priors	Hao Wei et.al.	2605.00605	null
2026-05-01	Colorful-Noise: Training-Free Low-Frequency Noise Manipulation for Color-Based Conditional Image Generation	Nadav Z. Cohen et.al.	2605.00548	null
2026-05-01	End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer	Wenda Chu et.al.	2605.00503	null
2026-05-01	Trees to Flows and Back: Unifying Decision Trees and Diffusion Models	Sai Niranjan Ramachandran et.al.	2605.00414	null
2026-05-01	Binomial flows: Denoising and flow matching for discrete ordinal data	Yair Shenfeld et.al.	2605.00360	null
2026-04-30	PhyCo: Learning Controllable Physical Priors for Generative Motion	Sriram Narayanan et.al.	2604.28169	null
2026-04-29	AdvDMD: Adversarial Reward Meets DMD For High-Quality Few-Step Generation	Xu Wang et.al.	2604.28126	null
2026-04-30	From LLM-Driven Trading Card Generation to Procedural Relatedness: A Pokémon Case Study	Johannes Pfau et.al.	2604.27972	null
2026-04-30	Diffusion-OAMP for Joint Image Compression and Wireless Transmission	Wentao Hou et.al.	2604.27952	null
2026-04-30	Noise2Map: End-to-End Diffusion Model for Semantic Segmentation and Change Detection	Ali Shibli et.al.	2604.27889	null
2026-04-30	Machine Unlearning for Class Removal through SISA-based Deep Neural Network Architectures	Ishrak Hamim Mahi et.al.	2604.27804	null
2026-04-30	Leveraging Verifier-Based Reinforcement Learning in Image Editing	Hanzhong Guo et.al.	2604.27505	null
2026-04-30	Electrothermal Dynamics of Cold Front in Impure Tokamak Plasmas	S. Oshiro et.al.	2604.27444	null
2026-04-30	ABC: Any-Subset Autoregression via Non-Markovian Diffusion Bridges in Continuous Time and Space	Gabe Guo et.al.	2604.27443	null
2026-04-30	Sparse-View 3D Gaussian Splatting in the Wild	Wongi Park et.al.	2604.27422	null
2026-04-29	SEAL: Semantic-aware Single-image Sticker Personalization with a Large-scale Sticker-tag Dataset	Changhyun Roh et.al.	2604.26883	null
2026-04-29	Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data	Bao Pham et.al.	2604.26841	null
2026-04-29	Conditional diffusion denoising probabilistic model for super-resolution of atmospheric boundary layer large eddy simulation	Omar Sallam et.al.	2604.26776	null
2026-04-29	Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising	Jun Guo et.al.	2604.26694	null
2026-04-29	Delta Score Matters! Spatial Adaptive Multi Guidance in Diffusion Models	Haosen Li et.al.	2604.26503	null
2026-04-29	Probabilistic data quality assessment for structural monitoring data via outlier-resistant conditional diffusion model	Qi Li et.al.	2604.26366	null
2026-04-29	Beyond Fixed Formulas: Data-Driven Linear Predictor for Efficient Diffusion Models	Zhirong Shen et.al.	2604.26365	null
2026-04-29	ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance	Yang Yang et.al.	2604.26348	null
2026-04-29	SpatialFusion: Endowing Unified Image Generation with Intrinsic 3D Geometric Awareness	Haiyi Qiu et.al.	2604.26341	null
2026-04-28	Charge diffusion and modulation transfer function in a Nancy Grace Roman Space Telescope detector	Emily Macbeth et.al.	2604.26114	null
2026-04-28	DDA-Thinker: Decoupled Dual-Atomic Reinforcement Learning for Reasoning-Driven Image Editing	Hanqing Yang et.al.	2604.25477	null
2026-04-28	A Systematic Post-Train Framework for Video Generation	Zeyue Xue et.al.	2604.25427	null
2026-04-28	Benchmarking Layout-Guided Diffusion Models through Unified Semantic-Spatial Evaluation in Closed and Open Settings	Luca Parolari et.al.	2604.25358	null
2026-04-28	Edge-Cloud Collaborative Reconstruction via Structure-Aware Latent Diffusion for Downstream Remote Sensing Perception	Yun Li et.al.	2604.25319	null
2026-04-28	Golden RPG: Confidence-Adaptive Region-Aware Noise for Compositional Text-to-Image Generation	Hao Li et.al.	2604.25314	null
2026-04-28	The Thinking Pixel: Recursive Sparse Reasoning in Multimodal Diffusion Latents	Yuwei Sun et.al.	2604.25299	null
2026-04-28	Exploring Time Conditioning in Diffusion Generative Models from Disjoint Noisy Data Manifolds	Liuzhuozheng Li et.al.	2604.25289	null
2026-04-28	ResetEdit: Precise Text-guided Editing of Generated Image via Resettable Starting Latent	Hanyi Wang et.al.	2604.25128	null
2026-04-27	Generative diffusion models for spatiotemporal influenza forecasting	Joseph Lemaitre et.al.	2604.24913	null
2026-04-27	VibeToken: Scaling 1D Image Tokenizers and Autoregressive Models for Dynamic Resolution Generations	Maitreya Patel et.al.	2604.24885	null
2026-04-27	DiffQEC: A versatile diffusion model for quantum error correction	Tianyi Xu et.al.	2604.24640	null
2026-04-27	Meta-CoT: Enhancing Granularity and Generalization in Image Editing	Shiyi Zhang et.al.	2604.24625	null
2026-04-27	Diffusion Model as a Generalist Segmentation Learner	Haoxiao Wang et.al.	2604.24575	null
2026-04-27	CA-IDD: Cross-Attention Guided Identity-Conditional Diffusion for Identity-Consistent Face Swapping	Md Shohel Rana et.al.	2604.24493	null
2026-04-27	Guiding Vector Field Generation via Score-based Diffusion Model	Zirui Chen et.al.	2604.24487	null
2026-04-27	TextGround4M: A Prompt-Aligned Dataset for Layout-Aware Text Rendering	Dongxing Mao et.al.	2604.24459	null
2026-04-27	Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion	Zhongjie Duan et.al.	2604.24351	null
2026-04-27	GeoEdit: Local Frames for Fast, Training-Free On-Manifold Editing in Diffusion Models	Yiming Zhang et.al.	2604.24238	null
2026-04-27	Seeing Is No Longer Believing: Frontier Image Generation Models, Synthetic Visual Evidence, and Real-World Risk	Shuai Wu et.al.	2604.24197	null
2026-04-27	Bridging Restoration and Generation Manifolds in One-Step Diffusion for Real-World Super-Resolution	Shyang-En Weng et.al.	2604.24136	null
2026-04-24	Statistical Analysis of Markovian Generative Modeling	Eddie Aamari et.al.	2604.22712	null
2026-04-24	Generative Modeling of Neurodegenerative Brain Anatomy with 4D Longitudinal Diffusion Model	Nivetha Jayakumar et.al.	2604.22700	null
2026-04-24	Structure-Guided Diffusion Model for EEG-Based Visual Cognition Reconstruction	Yongxiang Lian et.al.	2604.22649	null
2026-04-24	Efficient Diffusion Distillation via Embedding Loss	Jincheng Ying et.al.	2604.22379	null
2026-04-24	TabSCM: A practical Framework for Generating Realistic Tabular Data	Sven Jacob et.al.	2604.22337	null
2026-04-24	Knowledge Visualization: A Benchmark and Method for Knowledge-Intensive Text-to-Image Generation	Ran Zhao et.al.	2604.22302	null
2026-04-24	Evaluation of image simulation open source solutions for simulation of synthetic images in lunar environment	Jai G Singla et.al.	2604.22296	null
2026-04-24	AI-Driven Performance-to-Design Generation and Optimization of Marine Propellers	Leah Chen et.al.	2604.22224	null
2026-04-24	Breaking Watermarks in the Frequency Domain: A Modulated Diffusion Attack Framework	Chunpeng Wang et.al.	2604.22220	null
2026-04-24	Multimodal Diffusion to Mutually Enhance Polarized Light and Low Resolution EBSD Data	Harry Dong et.al.	2604.22212	null
2026-04-23	VistaBot: View-Robust Robot Manipulation via Spatiotemporal-Aware View Synthesis	Songen Gu et.al.	2604.21914	null
2026-04-23	UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection	Yanran Zhang et.al.	2604.21904	null
2026-04-23	A Scale-Adaptive Framework for Joint Spatiotemporal Super-Resolution with Diffusion Models	Max Defez et.al.	2604.21903	null
2026-04-23	Causality-Encoded Diffusion Models for Interventional Sampling and Edge Inference	Li Chen et.al.	2604.21843	null
2026-04-23	Quotient-Space Diffusion Models	Yixian Xu et.al.	2604.21809	null
2026-04-23	DCMorph: Face Morphing via Dual-Stream Cross-Attention Diffusion	Tahar Chettaoui et.al.	2604.21627	null
2026-04-23	Generative Learning Enhanced Intelligent Resource Management for Cell-Free Delay Deterministic Communications	Shuangbo Xiong et.al.	2604.21587	null
2026-04-23	DiffNR: Diffusion-Enhanced Neural Representation Optimization for Sparse-View 3D Tomographic Reconstruction	Shiyan Su et.al.	2604.21518	null
2026-04-23	VARestorer: One-Step VAR Distillation for Real-World Image Super-Resolution	Yixuan Zhu et.al.	2604.21450	null
2026-04-23	TopoStyle: Supporting Iterative Design with Generative AI for 2.5D Topology Optimization	Shuyue Feng et.al.	2604.21315	null
2026-04-22	ParetoSlider: Diffusion Models Post-Training for Continuous Reward Control	Shelly Golan et.al.	2604.20816	null
2026-04-22	LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model	Inclusion AI et.al.	2604.20796	null
2026-04-22	Geometric Renyi Differential Privacy: Ricci Curvature Characterized by Heat Diffusion Mechanisms	Xiaotian Chang et.al.	2604.20761	null
2026-04-22	GeoRelight: Learning Joint Geometrical Relighting and Reconstruction with Flexible Multi-Modal Diffusion Transformers	Yuxuan Xue et.al.	2604.20715	null
2026-04-22	Physics-Informed Conditional Diffusion for Motion-Robust Retinal Temporal Laser Speckle Contrast Imaging	Qian Chen et.al.	2604.20594	null
2026-04-22	Exploring Spatial Intelligence from a Generative Perspective	Muzhi Zhu et.al.	2604.20570	null
2026-04-22	Near-Field Wideband Channel Estimation for XL-MIMO Systems via Denoising Diffusion Model	Qingxia Feng et.al.	2604.20494	null
2026-04-22	Conditional Monte Carlo Tree Diffusion for Designing Cell-Type-Specific and Biologically Faithful Regulatory DNA	Animesh Awasthi et.al.	2604.20488	null
2026-04-22	Discrete Preference Learning for Personalized Multimodal Generation	Yuting Zhang et.al.	2604.20434	null
2026-04-22	Cold-Start Forecasting of New Product Life-Cycles via Conditional Diffusion Models	Ruihan Zhou et.al.	2604.20370	null
2026-04-21	Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items	Mengting Chen et.al.	2604.19748	null
2026-04-21	AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model	Yutian Chen et.al.	2604.19747	null
2026-04-21	Generative Drifting for Conditional Medical Image Generation	Zirong Li et.al.	2604.19736	null
2026-04-21	ReImagine: Rethinking Controllable High-Quality Human Video Generation via Image-First Synthesis	Zhengwentai Sun et.al.	2604.19720	null
2026-04-21	MedFlowSeg: Flow Matching for Medical Image Segmentation with Frequency-Aware Attention	Zhi Chen et.al.	2604.19675	null
2026-04-21	InHabit: Leveraging Image Foundation Models for Scalable 3D Human Placement	Nikita Kister et.al.	2604.19673	null
2026-04-21	Budgeted Online Influence Maximization	Pierre Perrault et.al.	2604.19672	null
2026-04-21	Multi-Cycle Spatio-Temporal Adaptation in Human-Robot Teaming	Alex Cuellar et.al.	2604.19670	null
2026-04-21	CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation	Xiangyang Luo et.al.	2604.19636	null
2026-04-21	SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing	Ying Zeng et.al.	2604.19587	null
2026-04-20	PlankFormer: Robust Plankton Instance Segmentation via MAE-Pretrained Vision Transformers and Pseudo Community Image Generation	Masaharu Miyazaki et.al.	2604.17856	null
2026-04-20	UniCSG: Unified High-Fidelity Content-Constrained Style-Driven Generation via Staged Semantic and Frequency Disentanglement	Jingwei Yang et.al.	2604.17850	null
2026-04-20	Efficient Diffusion Models under Nonconvex Equality and Inequality constraints via Landing	Kijung Jeon et.al.	2604.17838	null
2026-04-20	AnyLift: Scaling Motion Reconstruction from Internet Videos via 2D Diffusion	Hongjie Li et.al.	2604.17818	null
2026-04-20	Optimally Bridging Semantics and Data: Generative Semantic Communication via Schrödinger Bridge	Dahua Gao et.al.	2604.17802	null
2026-04-20	Structure-Adaptive Sparse Diffusion in Voxel Space for 3D Medical Image Enhancement	Hongxu Jiang et.al.	2604.17773	null
2026-04-20	Grokking of Diffusion Models: Case Study on Modular Addition	Joon Hyeok Kim et.al.	2604.17673	null
2026-04-19	ViPS: Video-informed Pose Spaces for Auto-Rigged Meshes	Honglin Chen et.al.	2604.17623	null
2026-04-19	DGSSM: Diffusion guided state-space models for multimodal salient object detection	Suklav Ghosh et.al.	2604.17585	null
2026-04-19	Target Parameterization in Diffusion Models for Nonlinear Spatiotemporal System Identification	Achraf El Messaoudi et.al.	2604.17566	null
2026-04-17	Repurposing 3D Generative Model for Autoregressive Layout Generation	Haoran Feng et.al.	2604.16299	null
2026-04-17	Enhancing Hazy Wildlife Imagery: AnimalHaze3k and IncepDehazeGan	Shivarth Rai et.al.	2604.16284	null
2026-04-17	Motion-Adapter: A Diffusion Model Adapter for Text-to-Motion Generation of Compound Actions	Yue Jiang et.al.	2604.16135	null
2026-04-17	Elucidating the SNR-t Bias of Diffusion Probabilistic Models	Meng Yu et.al.	2604.16044	null
2026-04-17	From Competition to Coopetition: Coopetitive Training-Free Image Editing Based on Text Guidance	Jinhao Shen et.al.	2604.15948	null
2026-04-17	Making Image Editing Easier via Adaptive Task Reformulation with Agentic Executions	Bo Zhao et.al.	2604.15917	null
2026-04-17	Efficient Video Diffusion Models: Advancements and Challenges	Shitong Shao et.al.	2604.15911	null
2026-04-17	Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration	Jun Li et.al.	2604.15829	null
2026-04-17	Neural Continuous-Time Markov Chain: Discrete Diffusion via Decoupled Jump Timing and Direction	Jingyuan Li et.al.	2604.15694	null
2026-04-17	CLIMB: Controllable Longitudinal Brain Image Generation using Mamba-based Latent Diffusion Model and Gaussian-aligned Autoencoder	Duy-Phuong Dao et.al.	2604.15611	null
2026-04-16	TokenLight: Precise Lighting Control in Images using Attribute Tokens	Sumit Chaturvedi et.al.	2604.15310	null
2026-04-16	An Analysis of Regularization and Fokker-Planck Residuals in Diffusion Models for Image Generation	Onno Niemann et.al.	2604.15171	null
2026-04-16	Towards Faster Language Model Inference Using Mixture-of-Experts Flow Matching	Aihua Li et.al.	2604.15009	null
2026-04-16	Diffusion Crossover: Defining Evolutionary Recombination in Diffusion Models via Noise Sequence Interpolation	Chisatao Kumada et.al.	2604.14790	null
2026-04-16	Constraint-based Pre-training: From Structured Constraints to Scalable Model Initialization	Fu Feng et.al.	2604.14769	null
2026-04-16	SynHAT: A Two-stage Coarse-to-Fine Diffusion Framework for Synthesizing Human Activity Traces	Rongchao Xu et.al.	2604.14705	null
2026-04-16	Mean Flow Policy Optimization	Xiaoyi Dong et.al.	2604.14698	null
2026-04-16	Seen-to-Scene: Keep the Seen, Generate the Unseen for Video Outpainting	Inseok Jeon et.al.	2604.14648	null
2026-04-16	Uncertainty-aware Generative Learning Path Recommendation with Cognition-Adaptive Diffusion	Xiangrui Xiong et.al.	2604.14613	null
2026-04-16	Prompt-Guided Image Editing with Masked Logit Nudging in Visual Autoregressive Models	Amir El-Ghoussani et.al.	2604.14591	null
2026-04-15	Diffusion Language Models for Speech Recognition	Davyd Naveriani et.al.	2604.14001	null
2026-04-15	Creo: From One-Shot Image Generation to Progressive, Co-Creative Ideation	Zoe De Simone et.al.	2604.13956	null
2026-04-15	ASTRA: Enhancing Multi-Subject Generation with Retrieval-Augmented Pose Guidance and Disentangled Position Embedding	Tianze Xia et.al.	2604.13938	null
2026-04-15	Three-dimensional photon transport in spinodal photocatalytic aerogels: how bicontinuous morphology controls kinetic rate constants	Renaud A. L. Vallée et.al.	2604.13929	null
2026-04-15	Blind Bitstream-corrupted Video Recovery via Metadata-guided Diffusion Model	Shuyun Wang et.al.	2604.13906	null
2026-04-15	PostureObjectstitch: Anomaly Image Generation Considering Assembly Relationships in Industrial Scenarios	Zebei Tong et.al.	2604.13863	null
2026-04-15	DiffMagicFace: Identity Consistent Facial Editing of Real Videos	Huanghao Yin et.al.	2604.13841	null
2026-04-15	EMGFlow: Robust and Efficient Surface Electromyography Synthesis via Flow Matching	Boxuan Jiang et.al.	2604.13685	null
2026-04-15	Reconstruction of a 3D wireframe from a single line drawing via generative depth estimation	Elton Cao et.al.	2604.13549	null
2026-04-15	LEGO-MOF: Equivariant Latent Manipulation for Editable, Generative, and Optimizable MOF Design	Chaoran Zhang et.al.	2604.13520	null
2026-04-14	Generative Refinement Networks for Visual Synthesis	Jian Han et.al.	2604.13030	null
2026-04-14	Causal Diffusion Models for Counterfactual Outcome Distributions in Longitudinal Data	Farbod Alinezhad et.al.	2604.12992	null
2026-04-14	Turbulent pair dispersion with Stochastic Generative Diffusion Models	Andrei Pantea et.al.	2604.12932	null
2026-04-14	Transformer Based Machine Fault Detection From Audio Input	Kiran Voderhobli Holla et.al.	2604.12733	null
2026-04-14	OFA-Diffusion Compression: Compressing Diffusion Model in One-Shot Manner	Haoyang Jiang et.al.	2604.12668	null
2026-04-14	SOAR: Self-Correction for Optimal Alignment and Refinement in Diffusion Models	You Qin et.al.	2604.12617	null
2026-04-14	StructDiff: A Structure-Preserving and Spatially Controllable Diffusion Model for Single-Image Generation	Yinxi He et.al.	2604.12575	null
2026-04-14	T2I-BiasBench: A Multi-Metric Framework for Auditing Demographic and Cultural Bias in Text-to-Image Models	Nihal Jaiswal et.al.	2604.12481	null
2026-04-14	Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling	Zida Li et.al.	2604.12446	null
2026-04-14	Bridging the Micro–Macro Gap: Frequency-Aware Semantic Alignment for Image Manipulation Localization	Xiaojie Liang et.al.	2604.12341	null
2026-04-13	Diffusing diffusivity model with dichotomous noise	Dongho Lee et.al.	2604.11800	null
2026-04-13	LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling	Yuxin Chen et.al.	2604.11748	null
2026-04-13	On the Robustness of Watermarking for Autoregressive Image Generation	Andreas Müller et.al.	2604.11720	null
2026-04-13	Representations Before Pixels: Semantics-Guided Hierarchical Video Prediction	Efstathios Karypidis et.al.	2604.11707	null
2026-04-13	Dual-Control Frequency-Aware Diffusion Model for Depth-Dependent Optical Microrobot Microscopy Image Generation	Lan Wei et.al.	2604.11680	null
2026-04-13	RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time	Haozhe Wang et.al.	2604.11626	null
2026-04-13	Progressively Texture-Aware Diffusion for Contrast-Enhanced Sparse-View CT	Tianqi Wang et.al.	2604.11559	null
2026-04-13	Continuous Adversarial Flow Models	Shanchuan Lin et.al.	2604.11521	null
2026-04-13	Anthropogenic Regional Adaptation in Multimodal Vision-Language Model	Samuel Cahyawijaya et.al.	2604.11490	null
2026-04-13	Degradation-Aware and Structure-Preserving Diffusion for Real-World Image Super-Resolution	Yang Ji et.al.	2604.11470	null
2026-04-13	One Scale at a Time: Scale-Autoregressive Modeling for Fluid Flow Distributions	Mario Lino et.al.	2604.11403	null
2026-04-13	DiLO: Decoupling Generative Priors and Neural Operators via Diffusion Latent Optimization for Inverse Problems	Haibo Liu et.al.	2604.11375	null
2026-04-13	Any 3D Scene is Worth 1K Tokens: 3D-Grounded Representation for Scene Generation at Scale	Dongxu Wei et.al.	2604.11331	null
2026-04-13	Learning Discrete Diffusion of Graphs via Free-Energy Gradient Flows	Dario Rancati et.al.	2604.11311	null
2026-04-13	Structured State-Space Regularization for Compact and Generation-Friendly Image Tokenization	Jinsung Lee et.al.	2604.11089	null
2026-04-13	LaDA-Band: Language Diffusion Models for Vocal-to-Accompaniment Generation	Qi Wang et.al.	2604.11052	null
2026-04-10	Envisioning the Future, One Step at a Time	Stefan Andreas Baumann et.al.	2604.09527	null
2026-04-10	Gardening on the Moon: An Advection-Diffusion Model to Guide the Search for Supernova Debris in the Lunar Regolith	Emily S. Costello et.al.	2604.09524	null
2026-04-10	SCoRe: Clean Image Generation from Diffusion Models Trained on Noisy Images	Yuta Matsuzaki et.al.	2604.09436	null
2026-04-10	Rays as Pixels: Learning A Joint Distribution of Videos and Camera Trajectories	Wonbong Jang et.al.	2604.09429	null
2026-04-10	EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure	Junyeong Ahn et.al.	2604.09405	null
2026-04-10	Region-Constrained Group Relative Policy Optimization for Flow-Based Image Editing	Zhuohan Ouyang et.al.	2604.09386	null
2026-04-10	Hitem3D 2.0: Multi-View Guided Native 3D Texture Generation	Huiang He et.al.	2604.09231	null
2026-04-10	Training-free, Perceptually Consistent Low-Resolution Previews with High-Resolution Image for Efficient Workflows of Diffusion Models	Wongi Jeong et.al.	2604.09227	null
2026-04-10	SHIFT: Steering Hidden Intermediates in Flow Transformers	Nina Konovalova et.al.	2604.09213	null
2026-04-10	CT-1: Vision-Language-Camera Models Transfer Spatial Reasoning Knowledge to Camera-Controllable Video Generation	Haoyu Zhao et.al.	2604.09201	null
2026-04-09	When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models	Zhengyang Sun et.al.	2604.08546	null
2026-04-09	RewardFlow: Generate Images by Optimizing What You Reward	Onkar Susladkar et.al.	2604.08536	null
2026-04-09	Novel View Synthesis as Video Completion	Qi Wu et.al.	2604.08500	null
2026-04-09	LAMP: Lift Image-Editing as General 3D Priors for Open-world Manipulation	Jingjing Wang et.al.	2604.08475	null
2026-04-09	Bias-Constrained Diffusion Schedules for PDE Emulations: Reconstruction Error Minimization and Efficient Unrolled Training	Constantin Le Cleï et.al.	2604.08357	null
2026-04-09	Controlling the rain fall statistics using Mean-Reverting Jump Diffusion model	Joya GhoshDastider et.al.	2604.08338	null
2026-04-09	DiV-INR: Extreme Low-Bitrate Diffusion Video Compression with INR Conditioning	Eren Çetin et.al.	2604.08329	null
2026-04-09	HistDiT: A Structure-Aware Latent Conditional Diffusion Model for High-Fidelity Virtual Staining in Histopathology	Aasim Bin Saleem et.al.	2604.08305	null
2026-04-09	GroundingAnomaly: Spatially-Grounded Diffusion for Few-Shot Anomaly Synthesis	Yishen Liu et.al.	2604.08301	null
2026-04-09	EditCaption: Human-Aligned Instruction Synthesis for Image Editing via Supervised Fine-Tuning and Direct Preference Optimization	Xiangyuan Wang et.al.	2604.08213	null
2026-04-08	Distilling Photon-Counting CT into Routine Chest CT through Clinically Validated Degradation Modeling	Junqi Liu et.al.	2604.07329	null
2026-04-08	GenLCA: 3D Diffusion for Full-Body Avatars from In-the-Wild Videos	Yiqian Wu et.al.	2604.07273	null
2026-04-08	PhyEdit: Towards Real-World Object Manipulation via Physically-Grounded Image Editing	Ruihang Xu et.al.	2604.07230	null
2026-04-08	VersaVogue: Visual Expert Orchestration and Preference Alignment for Unified Fashion Synthesis	Jian Yu et.al.	2604.07210	null
2026-04-08	SurFITR: A Dataset for Surveillance Image Forgery Detection and Localisation	Qizhou Wang et.al.	2604.07101	null
2026-04-08	Granular mixing and flow dynamics in horizontal stirred bed reactors	Sahar Pourandi et.al.	2604.07082	null
2026-04-08	Not all tokens contribute equally to diffusion learning	Guoqing Zhang et.al.	2604.07026	null
2026-04-08	MAR-GRPO: Stabilized GRPO for AR-diffusion Hybrid Image Generation	Xiaoxiao Ma et.al.	2604.06966	null
2026-04-08	FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling	Yitong Li et.al.	2604.06916	null
2026-04-08	RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details	Dewei Zhou et.al.	2604.06870	null
2026-04-08	FVD: Inference-Time Alignment of Diffusion Models via Fleming-Viot Resampling	Shivanshu Shekhar et.al.	2604.06779	null
2026-04-08	FlowInOne: Unifying Multimodal Generation as Image-in, Image-out Flow Matching	Junchao Yi et.al.	2604.06757	null
2026-04-07	DiffHDR: Re-Exposing LDR Videos with Video Diffusion Models	Zhengming Yu et.al.	2604.06161	null
2026-04-07	Learning-Guided Force-Feedback Model Predictive Control with Obstacle Avoidance for Robotic Deburring	Krzysztof Wojciechowski et.al.	2604.06133	null
2026-04-07	PoM: A Linear-Time Replacement for Attention with the Polynomial Mixer	David Picard et.al.	2604.06129	null
2026-04-07	SEM-ROVER: Semantic Voxel-Guided Diffusion for Large-Scale Driving Scene Generation	Hiba Dahmani et.al.	2604.06113	null
2026-04-07	Graph-PiT: Enhancing Structural Coherence in Part-Based Image Synthesis via Graph Priors	Junbin Zhang et.al.	2604.06074	null
2026-04-07	Beyond Black-Scholes: A Computational Framework for Option Pricing Using Heston, GARCH, and Jump Diffusion Models	Karmanpartap Singh Sidhu et.al.	2604.06068	null
2026-04-07	Lipschitz regularity in Flow Matching and Diffusion Models: sharp sampling rates and functional inequalities	Arthur Stéphanovitch et.al.	2604.06065	null
2026-04-07	HumANDiff: Articulated Noise Diffusion for Motion-Consistent Human Video Generation	Tao Hu et.al.	2604.05961	null
2026-04-07	Leveraging Image Editing Foundation Models for Data-Efficient CT Metal Artifact Reduction	Ahmet Rasim Emirdagi et.al.	2604.05934	null
2026-04-07	Improving Controllable Generation: Faster Training and Better Performance via $x_0$-Supervision	Amadou S. Sangare et.al.	2604.05761	null
2026-04-06	Your Pre-trained Diffusion Model Secretly Knows Restoration	Sudarshan Rajagopalan et.al.	2604.04924	null
2026-04-06	Diffusion of PeV Cosmic Rays in the Turbulent and Multiphase Interstellar Medium	Yue Hu et.al.	2604.04814	null
2026-04-06	Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning	Lei Zhang et.al.	2604.04746	null
2026-04-06	ZeD-MAP: Bundle Adjustment Guided Zero-Shot Depth Maps for Real-Time Aerial Imaging	Selim Ahmet Iz et.al.	2604.04667	null
2026-04-06	Training-Free Refinement of Flow Matching with Divergence-based Sampling	Yeonwoo Cha et.al.	2604.04646	null
2026-04-06	Beyond Semantics: Uncovering the Physics of Fakes via Universal Physical Descriptors for Cross-Modal Synthetic Detection	Mei Qiu et.al.	2604.04608	null
2026-04-06	PR-IQA: Partial-Reference Image Quality Assessment for Diffusion-Based Novel View Synthesis	Inseong Choi et.al.	2604.04576	null
2026-04-06	Erasure or Erosion? Evaluating Compositional Degradation in Unlearned Text-To-Image Diffusion Models	Arian Komaei Koma et.al.	2604.04575	null
2026-04-06	Training-Free Image Editing with Visual Context Integration and Concept Alignment	Rui Song et.al.	2604.04487	null
2026-04-06	Beyond Few-Step Inference: Accelerating Video Diffusion Transformer Model Serving with Inter-Request Caching Reuse	Hao Liu et.al.	2604.04451	null

LLM training

Publish Date	Title	Authors	PDF	Code
2026-05-05	Audio-Visual Intelligence in Large Foundation Models	You Qin et.al.	2605.04045	null
2026-05-05	Stayin’ Aligned Over Time: Towards Longitudinal Human-LLM Alignment via Contextual Reflection and Privacy-Preserving Behavioral Data	Simret Araya Gebreegziabher et.al.	2605.04029	null
2026-05-05	On Adaptivity in Zeroth-Order Optimization	Hassan Dbouk et.al.	2605.03869	null
2026-05-05	Natural Language Processing: A Comprehensive Practical Guide from Tokenisation to RLHF	Mullosharaf K. Arabov et.al.	2605.03799	null
2026-05-05	AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics	Tencent HY Team et.al.	2605.03652	null
2026-05-05	Revisiting Graph-Tokenizing Large Language Models: A Systematic Evaluation of Graph Token Understanding	Zhongjian Zhang et.al.	2605.03514	null
2026-05-04	Moral Sensitivity in LLMs: A Tiered Evaluation of Contextual Bias via Behavioral Profiling and Mechanistic Interpretability	Yash Aggarwal et.al.	2605.03217	null
2026-05-04	Enwar 3.0: An Agentic Multi-Modal LLM Orchestrator for Situation-Aware Beamforming, Blockage Prediction, and Handover Management	Ahmad M. Nazar et.al.	2605.03215	null
2026-05-04	Geometric Deviation as an Unsupervised Pre-Generation Reliability Signal: Probing LLM Representations for Answerability	Yucheng Du et.al.	2605.03196	null
2026-05-04	Bolek: A Multimodal Language Model for Molecular Reasoning	Frederic Grabowski et.al.	2605.02745	null
2026-05-04	Gradient-Gated DPO: Stabilizing Preference Optimization in Language Models	Inoussa Mouiche et.al.	2605.02626	null
2026-05-04	Efficient Preference Poisoning Attack on Offline RLHF	Chenye Yang et.al.	2605.02495	null
2026-05-04	Anomaly-Preference Image Generation	Fuyun Wang et.al.	2605.02439	null
2026-05-04	Reliability-Oriented Multilingual Orthopedic Diagnosis: A Domain-Adaptive Modeling and a Conceptual Validation Framework	Danish Ali et.al.	2605.02266	null
2026-05-03	Maistros: A Greek Large Language Model Adapted Through Knowledge Distillation From Large Reasoning Models	Nikolaos Giarelis et.al.	2605.01870	null
2026-05-03	RMGAP: Benchmarking the Generalization of Reward Models across Diverse Preferences	Yangyang Zhou et.al.	2605.01831	null
2026-05-02	LLM Output Detectability and Task Performance Can be Jointly Optimized	Koshiro Saito et.al.	2605.01350	null
2026-05-02	Addressing Data Scarcity in Bangla Fake News Detection: An LLM-Based Dataset Augmentation Approach	Ahmed Alfey Sani et.al.	2605.01292	null
2026-05-02	GIFT: Guided Fine-Tuning and Transfer for Enhancing Instruction-Tuned Language Models	Zhiwen Ruan et.al.	2605.01256	null
2026-05-01	Let ViT Speak: Generative Language-Image Pre-training	Yan Fang et.al.	2605.00809	null
2026-05-01	AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments	Zhijie Cai et.al.	2605.00650	null
2026-05-01	H-RAG at SemEval-2026 Task 8: Hierarchical Parent-Child Retrieval for Multi-Turn RAG Conversations	Passant Elchafei et.al.	2605.00631	null
2026-05-01	DynamicPO: Dynamic Preference Optimization for Recommendation	Xingyu Hu et.al.	2605.00327	null
2026-05-01	Online Self-Calibration Against Hallucination in Vision-Language Models	Minghui Chen et.al.	2605.00323	null
2026-04-30	Attention Is Where You Attack	Aviral Srivastava et.al.	2605.00236	null
2026-04-30	TUR-DPO: Topology- and Uncertainty-Aware Direct Preference Optimization	Abdulhady Abas Abdullah et.al.	2605.00224	null
2026-04-30	Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback	Yikai Wang et.al.	2605.00155	null
2026-04-30	ViLegalNLI: Natural Language Inference for Vietnamese Legal Texts	Nhung Thi-Hong Duong et.al.	2605.00116	null
2026-04-30	FiLMMeD: Feature-wise Linear Modulation for Cross-Problem Multi-Depot Vehicle Routing	Arthur Corrêa et.al.	2604.28102	null
2026-04-30	Learning from Disagreement: Clinician Overrides as Implicit Preference Signals for Clinical AI in Value-Based Care	Prabhjot Singh et.al.	2604.28010	null
2026-04-30	ZipCCL: Efficient Lossless Data Compression of Communication Collectives for Accelerating LLM Training	Wenxiang Lin et.al.	2604.27844	null
2026-04-30	Mind the Gap: Structure-Aware Consistency in Preference Learning	Mehryar Mohri et.al.	2604.27733	null
2026-04-30	Language Ideologies in a Multilingual Society: An LLM-based Analysis of Luxembourgish News Comments	Emilia Milano et.al.	2604.27661	null
2026-04-30	HAVEN: Hybrid Automated Verification ENgine for UVM Testbench Synthesis with LLMs	Chang-Chih Meng et.al.	2604.27643	null
2026-04-30	SecGoal: A Benchmark for Security Goal Extraction and Formalization from Protocol Documents	Dawei Huang et.al.	2604.27601	null
2026-04-30	Leveraging Verifier-Based Reinforcement Learning in Image Editing	Hanzhong Guo et.al.	2604.27505	null
2026-04-30	Secret Stealing Attacks on Local LLM Fine-Tuning through Supply-Chain Model Code Backdoors	Zi Li et.al.	2604.27426	null
2026-04-29	Instruction Complexity Induces Positional Collapse in Adversarial LLM Evaluation	Jon-Paul Cacioli et.al.	2604.27249	null
2026-04-29	Zero-Shot to Full-Resource: Cross-lingual Transfer Strategies for Aspect-Based Sentiment Analysis	Jakob Fehle et.al.	2604.26619	null
2026-04-29	Translating Under Pressure: Domain-Aware LLMs for Crisis Communication	Antonio Castaldo et.al.	2604.26597	null
2026-04-29	SplitFT: An Adaptive Federated Split Learning System For LLMs Fine-Tuning	Yimeng Shan et.al.	2604.26388	null
2026-04-28	Hierarchical Multi-Persona Induction from User Behavioral Logs: Learning Evidence-Grounded and Truthful Personas	Nayoung Choi et.al.	2604.26120	null
2026-04-28	When Errors Can Be Beneficial: A Categorization of Imperfect Rewards for Policy Gradient	Shuning Shang et.al.	2604.25872	null
2026-04-28	From Soliloquy to Agora: Memory-Enhanced LLM Agents with Decentralized Debate for Optimization Modeling	Jianghao Lin et.al.	2604.25847	null
2026-04-28	Step-Audio-R1.5 Technical Report	Yuxin Zhang et.al.	2604.25719	null
2026-04-28	Backtranslation Augmented Direct Preference Optimization for Neural Machine Translation	Mehrdad Ghassabi et.al.	2604.25702	null
2026-04-28	Health System Scale Semantic Search Across Unstructured Clinical Notes	Faith Wavinya Mutinda et.al.	2604.25605	null
2026-04-28	A Systematic Post-Train Framework for Video Generation	Zeyue Xue et.al.	2604.25427	null
2026-04-28	FED-FSTQ: Fisher-Guided Token Quantization for Communication-Efficient Federated Fine-Tuning of LLMs on Edge Devices	Changyu Li et.al.	2604.25421	null
2026-04-28	Below-Chance Blindness: Prompted Underperformance in Small LLMs Produces Positional Bias Rather than Answer Avoidance	Jon-Paul Cacioli et.al.	2604.25249	null
2026-04-28	Frictive Policy Optimization for LLMs: Epistemic Intervention, Risk-Sensitive Control, and Reflective Alignment	James Pustejovsky et.al.	2604.25136	null
2026-04-28	What Makes Good Instruction-Tuning Data? An In-Context Learning Perspective	Guangzeng Han et.al.	2604.25132	null
2026-04-27	A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations	Zihan Liu et.al.	2604.24468	null
2026-04-27	A Multi-Dimensional Audit of Politically Aligned Large Language Models	Lisa Korver et.al.	2604.24429	null
2026-04-27	Meta-Aligner: Bidirectional Preference-Policy Optimization for Multi-Objective LLMs Alignment	Wenzhe Xu et.al.	2604.24178	null
2026-04-27	TACO: Efficient Communication Compression of Intermediate Tensors for Scalable Tensor-Parallel LLM Training	Man Liu et.al.	2604.24088	null
2026-04-27	Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B	Jon-Paul Cacioli et.al.	2604.24070	null
2026-04-27	Disagreement as Signals: Dual-view Calibration for Sequential Recommendation Denoising	Sijia Li et.al.	2604.24048	null
2026-04-27	FlashOverlap: Minimizing Tail Latency in Communication Overlap for Distributed LLM Training	Rezaul Karim et.al.	2604.24013	null
2026-04-27	Hindsight Preference Optimization for Financial Time Series Advisory	Yanwei Cui et.al.	2604.23988	null
2026-04-27	Continual Calibration: Coverage Can Collapse Before Accuracy in Lifelong LLM Fine-Tuning	Ibne Farabi Shihab et.al.	2604.23987	null
2026-04-27	MatchRDMA: A Segmented and Rate-Matched Long-Haul RDMA Scheme for Geo-distributed LLM Training over OTN	Jun Dai et.al.	2604.23932	null
2026-04-24	CAGE-SGG: Counterfactual Active Graph Evidence for Open-Vocabulary Scene Graph Generation	Suiyang Guang et.al.	2604.22274	null
2026-04-24	TTS-PRISM: A Perceptual Reasoning and Interpretable Speech Model for Fine-Grained Diagnosis	Xi Wang et.al.	2604.22225	null
2026-04-24	Verbal Confidence Saturation in 3-9B Open-Weight Instruction-Tuned LLMs: A Pre-Registered Psychometric Validity Screen	Jon-Paul Cacioli et.al.	2604.22215	null
2026-04-23	PermaFrost-Attack: Stealth Pretraining Seeding(SPS) for planting Logic Landmines During LLM Training	Harsh Kumar et.al.	2604.22117	null
2026-04-23	When Cow Urine Cures Constipation on YouTube: Limits of LLMs in Detecting Culture-specific Health Misinformation	Anamta Khan et.al.	2604.22002	null
2026-04-23	When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs	Pegah Khayatan et.al.	2604.21911	null
2026-04-23	Why are all LLMs Obsessed with Japanese Culture? On the Hidden Cultural and Regional Biases of LLMs	Joseba Fernandez de Landa et.al.	2604.21751	null
2026-04-23	Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation	Nikita Severin et.al.	2604.21536	null
2026-04-23	Generalizing Numerical Reasoning in Table Data through Operation Sketches and Self-Supervised Learning	Hanjun Cho et.al.	2604.21495	null
2026-04-23	Reasoning Primitives in Hybrid and Non-Hybrid LLMs	Shivam Rawat et.al.	2604.21454	null
2026-04-23	CAP: Controllable Alignment Prompting for Unlearning in LLMs	Zhaokun Wang et.al.	2604.21251	null
2026-04-23	Reasoning About Traversability: Language-Guided Off-Road 3D Trajectory Planning	Byounggun Park et.al.	2604.21249	null
2026-04-23	Zero-Shot Detection of LLM-Generated Text via Implicit Reward Model	Runheng Liu et.al.	2604.21223	null
2026-04-23	On Reasoning Behind Next Occupation Recommendation	Shan Dong et.al.	2604.21204	null
2026-04-22	TabSHAP	Aryan Chaudhary et.al.	2604.21120	null
2026-04-22	MGDA-Decoupled: Geometry-Aware Multi-Objective Optimisation for DPO-based LLM Alignment	Andor Vári-Kakas et.al.	2604.20685	null
2026-04-22	The Effect of Idea Elaboration on the Automatic Assessment of Idea Originality	Umberto Domanti et.al.	2604.20569	null
2026-04-22	Where Reasoning Breaks: Logic-Aware Path Selection by Controlling Logical Connectives in LLMs Reasoning Chains	Seunghyun Park et.al.	2604.20564	null
2026-04-22	Evian: Towards Explainable Visual Instruction-tuning Data Auditing	Zimu Jia et.al.	2604.20544	null
2026-04-22	Surrogate modeling for interpreting black-box LLMs in medical predictions	Changho Han et.al.	2604.20331	null
2026-04-22	Image Generators are Generalist Vision Learners	Valentin Gabeur et.al.	2604.20329	null
2026-04-22	LLM-guided phase diagram construction through high-throughput experimentation	Ryo Tamura et.al.	2604.20304	null
2026-04-22	HiPO: Hierarchical Preference Optimization for Adaptive Reasoning in LLMs	Darsh Kachroo et.al.	2604.20140	null
2026-04-21	Bootstrapping Post-training Signals for Open-ended Tasks via Rubric-based Self-play on Pre-training Text	Chengyu Huang et.al.	2604.20051	null
2026-04-21	Super Apriel: One Checkpoint, Many Speeds	SLAM Labs et.al.	2604.19877	null
2026-04-21	Exploring Language-Agnosticity in Function Vectors: A Case Study in Machine Translation	Nurkhan Laiyk et.al.	2604.19678	null
2026-04-21	HP-Edit: A Human-Preference Post-Training Framework for Image Editing	Fan Li et.al.	2604.19406	null
2026-04-21	Location Not Found: Exposing Implicit Local and Global Biases in Multilingual LLMs	Guy Mor-Lan et.al.	2604.19292	null
2026-04-21	HarDBench: A Benchmark for Draft-Based Co-Authoring Jailbreak Attacks for Safe Human-LLM Collaborative Writing	Euntae Kim et.al.	2604.19274	null
2026-04-21	UniEP: Unified Expert-Parallel MoE MegaKernel for LLM Training	Size Zheng et.al.	2604.19241	null
2026-04-21	The Rise of Verbal Tics in Large Language Models: A Systematic Analysis Across Frontier Models	Shuai Wu et.al.	2604.19139	null
2026-04-21	SAHM: A Benchmark for Arabic Financial and Shari’ah-Compliant Reasoning	Rania Elbadry et.al.	2604.19098	null
2026-04-21	STK-Adapter: Incorporating Evolving Graph and Event Chain for Temporal Knowledge Graph Extrapolation	Shuyuan Zhao et.al.	2604.19042	null
2026-04-21	Policy Gradient Primal-Dual Method for Safe Reinforcement Learning from Human Feedback	Qiang Liu et.al.	2604.19024	null
2026-04-21	Local Linearity of LLMs Enables Activation Steering via Model-Based Linear Optimal Control	Julian Skifstad et.al.	2604.19018	null
2026-04-20	JudgeMeNot: Personalizing Large Language Models to Emulate Judicial Reasoning in Hebrew	Itay Razumenko et.al.	2604.18041	null
2026-04-20	Architecture Matters More Than Scale: A Comparative Study of Retrieval and Memory Augmentation for Financial QA Under SME Compute Constraints	Jianan Liu et.al.	2604.17979	null
2026-04-20	Efficient Federated RLHF via Zeroth-Order Policy Optimization	Deyi Wang et.al.	2604.17747	null
2026-04-19	PBSBench: A Multi-Level Vision-Language Framework and Benchmark for Hematopathology Whole Slide Image Interpretation	Yuanlong Wang et.al.	2604.17570	null
2026-04-19	PoliLegalLM: A Technical Report on a Large Language Model for Political and Legal Affairs	Yuting Huang et.al.	2604.17543	null
2026-04-19	E2E-GMNER: End-to-End Generative Grounded Multimodal Named Entity Recognition	Meng Zhang et.al.	2604.17319	null
2026-04-19	Cat-DPO: Category-Adaptive Safety Alignment	Tiankai Yang et.al.	2604.17299	null
2026-04-19	HeadRank: Decoding-Free Passage Reranking via Preference-Aligned Attention Heads	Juyuan Wang et.al.	2604.17237	null
2026-04-19	Guardrails in Logit Space: Safety Token Regularization for LLM Alignment	Thong Bach et.al.	2604.17210	null
2026-04-18	Complementing Self-Consistency with Cross-Model Disagreement for Uncertainty Quantification	Kimia Hamidieh et.al.	2604.17112	null
2026-04-17	Sketching the Readout of Large Language Models for Scalable Data Attribution and Valuation	Yide Ran et.al.	2604.16197	null
2026-04-17	CiPO: Counterfactual Unlearning for Large Reasoning Models through Iterative Preference Optimization	Junyi Li et.al.	2604.15847	null
2026-04-17	Into the Gray Zone: Domain Contexts Can Blur LLM Safety Boundaries	Ki Sen Hung et.al.	2604.15717	null
2026-04-17	Towards Robust Endogenous Reasoning: Unifying Drift Adaptation in Non-Stationary Tuning	Xiaoyu Yang et.al.	2604.15705	null
2026-04-17	C-Mining: Unsupervised Discovery of Seeds for Cultural Data Synthesis via Geometric Misalignment	Pufan Zeng et.al.	2604.15675	null
2026-04-17	GroupDPO: Memory efficient Group-wise Direct Preference Optimization	Jixuan Leng et.al.	2604.15602	null
2026-04-16	StoSignSGD: Unbiased Structural Stochasticity Fixes SignSGD for Training Large Language Models	Dingzhi Yu et.al.	2604.15416	null
2026-04-16	MADE: A Living Benchmark for Multi-Label Text Classification with Uncertainty Quantification of Medical Device Adverse Events	Raunak Agarwal et.al.	2604.15203	null
2026-04-16	RaTA-Tool: Retrieval-based Tool Selection with Multimodal Large Language Models	Gabriele Mattioli et.al.	2604.14951	null
2026-04-16	WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training	Yifu Chen et.al.	2604.14932	null
2026-04-16	Reasoning Dynamics and the Limits of Monitoring Modality Reliance in Vision-Language Models	Danae Sánchez Villegas et.al.	2604.14888	null
2026-04-16	CoTEvol: Self-Evolving Chain-of-Thoughts for Data Synthesis in Mathematical Reasoning	Zhuo Wang et.al.	2604.14768	null
2026-04-16	Switching Efficiency: A Novel Framework for Dissecting AI Data Center Network Efficiency	Niangen Ye et.al.	2604.14690	null
2026-04-16	SPAGBias: Uncovering and Tracing Structured Spatial Gender Bias in Large Language Models	Binxian Su et.al.	2604.14672	null
2026-04-15	FoodSense: A Multisensory Food Dataset and Benchmark for Predicting Taste, Smell, Texture, and Sound from Images	Sabab Ishraq et.al.	2604.14388	null
2026-04-15	The Cost of Language: Centroid Erasure Exposes and Exploits Modal Competition in Multimodal Language Models	Akshay Paruchuri et.al.	2604.14363	null
2026-04-15	DharmaOCR: Specialized Small Language Models for Structured OCR that outperform Open-Source and Commercial Baselines	Gabriel Pimenta de Freitas Cardoso et.al.	2604.14314	null
2026-04-15	Don’t Let the Video Speak: Audio-Contrastive Preference Optimization for Audio-Visual Language Models	Ami Baid et.al.	2604.14129	null
2026-04-15	TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration	Zerun Ma et.al.	2604.14116	null
2026-04-15	MAny: Merge Anything for Multimodal Continual Instruction Tuning	Zijian Gao et.al.	2604.14016	null
2026-04-15	Do We Still Need Humans in the Loop? Comparing Human and LLM Annotation in Active Learning for Hostility Detection	Ahmad Dawar Hakimi et.al.	2604.13899	null
2026-04-15	SparseBalance: Load-Balanced Long Context Training with Dynamic Sparse Attention	Hongtao Xu et.al.	2604.13847	null
2026-04-15	Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges	Xiaohua Wang et.al.	2604.13602	null
2026-04-15	SAKURAONE: An Open Ethernet-Based AI HPC System and Its Observed Workload Dynamics in a Single-Tenant LLM Development Environment	Fumikazu Konishi et.al.	2604.13600	null
2026-04-15	Debate to Align: Reliable Entity Alignment through Two-Stage Multi-Agent Debate	Cunda Wang et.al.	2604.13551	null
2026-04-15	Synthesizing Instruction-Tuning Datasets with Contrastive Decoding	Tatsuya Ichinose et.al.	2604.13538	null
2026-04-14	Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization	Aadyot Bhatnagar et.al.	2604.13175	null
2026-04-14	Visual Preference Optimization with Rubric Rewards	Ya-Qi Yu et.al.	2604.13029	null
2026-04-14	One Token Away from Collapse: The Fragility of Instruction-Tuned Helpfulness	Erfan Baghaei Potraghloo et.al.	2604.13006	null
2026-04-14	Boosting Visual Instruction Tuning with Self-Supervised Guidance	Sophia Sirko-Galouchenko et.al.	2604.12966	null
2026-04-14	From Imitation to Discrimination: Progressive Curriculum Learning for Robust Web Navigation	Chuang Peng et.al.	2604.12666	null
2026-04-14	Safety Training Modulates Harmful Misalignment Under On-Policy RL, But Direction Depends on Environment Design	Leon Eshuijs et.al.	2604.12500	null
2026-04-14	Analyzing the Effect of Noise in LLM Fine-tuning	Lingfang Li et.al.	2604.12469	null
2026-04-14	Three Birds, One Stone: Solving the Communication-Memory-Privacy Trilemma in LLM Fine-tuning Over Wireless Networks with Zeroth-Order Optimization	Zhijie Cai et.al.	2604.12401	null
2026-04-14	AgenticAI-DialogGen: Topic-Guided Conversation Generation for Fine-Tuning and Evaluating Short- and Long-Term Memories of LLMs	Manoj Madushanka Perera et.al.	2604.12179	null
2026-04-14	Nucleus-Image: Sparse MoE for Image Generation	Chandan Akiti et.al.	2604.12163	null
2026-04-13	Narrative over Numbers: The Identifiable Victim Effect and its Amplification Under Alignment and Reasoning in Large Language Models	Syed Rifat Raiyan et.al.	2604.12076	null
2026-04-13	CLSGen: A Dual-Head Fine-Tuning Framework for Joint Probabilistic Classification and Verbalized Explanation	WonJin Yoon et.al.	2604.11801	null
2026-04-13	RPA-Check: A Multi-Stage Automated Framework for Evaluating Dynamic LLM-based Role-Playing Agents	Riccardo Rosati et.al.	2604.11655	null
2026-04-13	MLLM-as-a-Judge Exhibits Model Preference Bias	Shuitsu Koyama et.al.	2604.11589	null
2026-04-13	OOM-RL: Out-of-Money Reinforcement Learning Market-Driven Alignment for LLM-Based Multi-Agent Systems	Kun Liu et.al.	2604.11477	null
2026-04-13	Mobile GUI Agent Privacy Personalization with Trajectory Induced Preference Optimization	Zhixin Lin et.al.	2604.11259	null
2026-04-13	BITS Pilani at SemEval-2026 Task 9: Structured Supervised Fine-Tuning with DPO Refinement for Polarization Detection	Atharva Gupta et.al.	2604.11121	null
2026-04-13	DDO-RM for LLM Preference Optimization: A Minimal Held-Out Benchmark against DPO	Tiantian Zhang et.al.	2604.11119	null
2026-04-12	Advancing Polish Language Modeling through Tokenizer Optimization in the Bielik v3 7B and 11B Series	Krzysztof Ociepa et.al.	2604.10799	null
2026-04-12	Teaching Language Models How to Code Like Learners: Conversational Serialization for Student Simulation	Charles Koutcheme et.al.	2604.10720	null
2026-04-12	ProUIE: A Macro-to-Micro Progressive Learning Method for LLM-based Universal Information Extraction	Wenda Liu et.al.	2604.10633	null
2026-04-12	CogInstrument: Modeling Cognitive Processes for Bidirectional Human-LLM Alignment in Planning Tasks	Anqi Wang et.al.	2604.10587	null
2026-04-12	Calibration Collapse Under Sycophancy Fine-Tuning: How Reward Hacking Breaks Uncertainty Quantification in LLMs	Subramanyam Sahoo et.al.	2604.10585	null
2026-04-10	Think Less, Know More: State-Aware Reasoning Compression with Knowledge Guidance for Efficient Reasoning	Yi Sui et.al.	2604.09150	null
2026-04-10	NyayaMind- A Framework for Transparent Legal Reasoning and Judgment Prediction in the Indian Legal System	Parjanya Aditya Shukla et.al.	2604.09069	null
2026-04-10	TaxPraBen: A Scalable Benchmark for Structured Evaluation of LLMs in Chinese Real-World Tax Practice	Gang Hu et.al.	2604.08948	null
2026-04-09	Cards Against LLMs: Benchmarking Humor Alignment in Large Language Models	Yousra Fettach et.al.	2604.08757	null
2026-04-09	Decomposing the Delta: What Do Models Actually Learn from Preference Pairs?	Chia-Hsuan Lee et.al.	2604.08723	null
2026-04-09	SUPERNOVA: Eliciting General Reasoning in LLMs with Reinforcement Learning on Natural Instructions	Ashima Suvarna et.al.	2604.08477	null
2026-04-09	ProMedical: Hierarchical Fine-Grained Criteria Modeling for Medical LLM Alignment via Explicit Injection	He Geng et.al.	2604.08326	null
2026-04-09	Towards Identification and Intervention of Safety-Critical Parameters in Large Language Models	Weiwei Qi et.al.	2604.08297	null
2026-04-09	Self-Debias: Self-correcting for Debiasing Large Language Models	Xuan Feng et.al.	2604.08243	null
2026-04-09	EditCaption: Human-Aligned Instruction Synthesis for Image Editing via Supervised Fine-Tuning and Direct Preference Optimization	Xiangyuan Wang et.al.	2604.08213	null
2026-04-09	Vision-Language Foundation Models for Comprehensive Automated Pavement Condition Assessment	Blessing Agyei Kyem et.al.	2604.08212	null
2026-04-09	Aligning Agents via Planning: A Benchmark for Trajectory-Level Reward Modeling	Jiaxuan Wang et.al.	2604.08178	null
2026-04-09	DSCA: Dynamic Subspace Concept Alignment for Lifelong VLM Editing	Gyanendra Das et.al.	2604.07965	null
2026-04-09	Rethinking Data Mixing from the Perspective of Large Language Models	Yuanjian Xu et.al.	2604.07963	null
2026-04-09	Large Language Model Post-Training: A Unified View of Off-Policy and On-Policy Learning	Shiwan Zhao et.al.	2604.07941	null
2026-04-08	VersaVogue: Visual Expert Orchestration and Preference Alignment for Unified Fashion Synthesis	Jian Yu et.al.	2604.07210	null
2026-04-08	Gemma 4, Phi-4, and Qwen3: Accuracy-Efficiency Tradeoffs in Dense and MoE Reasoning Language Models	Md Motaleb Hossen Manik et.al.	2604.07035	null
2026-04-08	MARS: Enabling Autoregressive Models Multi-Token Generation	Ziqi Jin et.al.	2604.07023	null
2026-04-08	Beyond Accuracy: Diagnosing Algebraic Reasoning Failures in LLMs Across Nine Complexity Dimensions	Parth Patil et.al.	2604.06799	null
2026-04-08	Multi-Faceted Self-Consistent Preference Alignment for Query Rewriting in Conversational Search	Zhiyu Cao et.al.	2604.06771	null
2026-04-08	The Theorems of Dr. David Blackwell and Their Contributions to Artificial Intelligence	Napoleon Paxton et.al.	2604.06621	null
2026-04-07	Limits of Difficulty Scaling: Hard Samples Yield Diminishing Returns in GRPO-Tuned SLMs	Suraj Yadav et.al.	2604.06298	null
2026-04-07	Stories of Your Life as Others: A Round-Trip Evaluation of LLM-Generated Life Stories Conditioned on Rich Psychometric Profiles	Ben Wigler et.al.	2604.06071	null
2026-04-07	How LLMs Follow Instructions: Skillful Coordination, Not a Universal Mechanism	Elisabetta Rocchetti et.al.	2604.06015	null
2026-04-07	Beyond Compromise: Pareto-Lenient Consensus for Efficient Multi-Preference LLM Alignment	Renxuan Tan et.al.	2604.05965	null
2026-04-07	BOSCH: Black-Box Binary Optimization for Short-Context Attention-Head Selection in LLMs	Abbas Ghaddar et.al.	2604.05942	null
2026-04-07	JD-BP: A Joint-Decision Generative Framework for Auto-Bidding and Pricing	Linghui Meng et.al.	2604.05845	null
2026-04-07	Vision-Guided Iterative Refinement for Frontend Code Generation	Hannah Sansford et.al.	2604.05839	null
2026-04-07	Controlling Distributional Bias in Multi-Round LLM Generation via KL-Optimized Fine-Tuning	Yanbei Jiang et.al.	2604.05756	null
2026-04-06	Instruction-Tuned LLMs for Parsing and Mining Unstructured Logs on Leadership HPC Systems	Ahmad Maroof Karimi et.al.	2604.05168	null
2026-04-06	SenseAI: A Human-in-the-Loop Dataset for RLHF-Aligned Financial Sentiment Reasoning	Berny Kabalisa et.al.	2604.05135	null
2026-04-06	Offline RL for Adaptive Policy Retrieval in Prior Authorization	Ruslan Sharifullin et.al.	2604.05125	null
2026-04-06	One Model for All: Multi-Objective Controllable Language Models	Qiang He et.al.	2604.04497	null
2026-04-06	MolDA: Molecular Understanding and Generation via Large Language Diffusion Model	Seohyeon Shin et.al.	2604.04403	null
2026-04-06	Developing Authentic Simulated Learners for Mathematics Teacher Learning: Insights from Three Approaches with Large Language Models	Jie Cao et.al.	2604.04361	null
2026-04-05	APPA: Adaptive Preference Pluralistic Alignment for Fair Federated RLHF of LLMs	Mahmoud Srewa et.al.	2604.04261	null
2026-04-05	DARE: Diffusion Large Language Models Alignment and Reinforcement Executor	Jingyi Yang et.al.	2604.04215	null
2026-04-05	A Semi-Automated Annotation Workflow for Paediatric Histopathology Reports Using Small Language Models	Avish Vijayaraghavan et.al.	2604.04168	null
2026-04-05	Extracting and Steering Emotion Representations in Small Language Models: A Methodological Comparison	Jihoon Jeong et.al.	2604.04064	null
2026-04-05	COBOL-Coder: Domain-Adapted Large Language Models for COBOL Code Generation and Translation	Anh T. V. Dau et.al.	2604.03986	null
2026-04-05	SafeCtrl: Region-Aware Safety Control for Text-to-Image Diffusion via Detect-Then-Suppress	Lingyun Zhang et.al.	2604.03941	null
2026-04-04	Where to Steer: Input-Dependent Layer Selection for Steering Improves LLM Alignment	Soham Gadgil et.al.	2604.03867	null