Latest AI Papers: Nov 16, 2025

by Alex Johnson

This article provides a curated overview of Artificial Intelligence research papers published on November 16, 2025. The papers are grouped into four areas: Vision Language Action (VLA), Robotics, Vision Language Models (VLM), and World Models. Some papers appear in more than one category. The summary is a snapshot of current work in these fast-moving fields. For the full papers and a more comprehensive reading experience, please visit the GitHub page.

Vision Language Action

Vision Language Action (VLA) models combine visual perception and language understanding to produce actions, typically for robots and other embodied agents. Here's a look at the latest papers:

  • Towards Blind and Low-Vision Accessibility of Lightweight VLMs and Custom LLM-Evals: This paper explores how to make lightweight VLMs accessible for blind and low-vision users, a crucial step towards inclusive AI.
  • OmniVGGT: Omni-Modality Driven Visual Geometry Grounded: This research delves into grounding visual geometry using multiple modalities, enhancing the understanding of complex scenes.
  • SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation: This paper, accepted to AAAI 2026, presents advancements in robotic manipulation by aligning semantics for efficient performance.
  • Facial-R1: Aligning Reasoning and Recognition for Facial Emotion Analysis: This work, accepted by AAAI 2026, focuses on improving facial emotion analysis by aligning reasoning and recognition.
  • Agent Journey Beyond RGB: Hierarchical Semantic-Spatial Representation Enrichment for Vision-and-Language Navigation: This paper, also accepted to AAAI 2026, enriches vision-and-language navigation with hierarchical semantic-spatial representations that go beyond standard RGB input.
  • Phantom Menace: Exploring and Enhancing the Robustness of VLA Models against Physical Sensor Attacks: Accepted by AAAI 2026, this paper investigates the robustness of VLA models against physical sensor attacks.
  • Audio-VLA: Adding Contact Audio Perception to Vision-Language-Action Model for Robotic Manipulation: This research explores the addition of contact audio perception to VLA models for robotic manipulation.
  • Improving Pre-Trained Vision-Language-Action Policies with Model-Based Search: This paper focuses on improving pre-trained VLA policies using model-based search.
  • Towards Embodied Agentic AI: Review and Classification of LLM- and VLM-Driven Robot Autonomy and Interaction: This paper reviews and classifies LLM- and VLM-driven approaches to robot autonomy and interaction.
  • MAP-VLA: Memory-Augmented Prompting for Vision-Language-Action Model in Robotic Manipulation: This paper explores the use of memory-augmented prompting in VLA models for robotic manipulation.
  • WMPO: World Model-based Policy Optimization for Vision-Language-Action Models: This research explores world model-based policy optimization for VLA models.
  • Think, Remember, Navigate: Zero-Shot Object-Goal Navigation with VLM-Powered Reasoning: This paper investigates zero-shot object-goal navigation using VLM-powered reasoning.
  • Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds: This paper presents an open recipe for building generalist agents in 3D open worlds.
  • Survey of Vision-Language-Action Models for Embodied Manipulation: This is a survey of VLA models in the context of embodied manipulation, presented in Chinese.
  • MirrorLimb: Implementing hand pose acquisition and robot teleoperation based on RealMirror: This research focuses on implementing hand pose acquisition and robot teleoperation using RealMirror.

Robot

Robotics research continues to advance, focusing on improving robot capabilities and applications. Key papers include:

  • Robot Crash Course: Learning Soft and Stylized Falling: This research focuses on teaching robots to fall safely and with style.
  • Text to Robotic Assembly of Multi Component Objects using 3D Generative AI and Vision Language Models: Accepted to NeurIPS 2025, this paper uses 3D generative AI and VLMs to go from text to robotic assembly of multi-component objects.
  • SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation: (Repeated from above). This paper is also relevant to the robotics category.
  • Onboard Mission Replanning for Adaptive Cooperative Multi-Robot Systems: This paper focuses on mission replanning for cooperative multi-robot systems.
  • Improving dependability in robotized bolting operations: This research aims to improve the reliability of robotized bolting operations.
  • RoboBenchMart: Benchmarking Robots in Retail Environment: This paper focuses on benchmarking robots in retail environments.
  • Opinion: Towards Unified Expressive Policy Optimization for Robust Robot Learning: Accepted by NeurIPS 2025, this paper focuses on unified policy optimization for robust robot learning.
  • Physics-informed Machine Learning for Static Friction Modeling in Robotic Manipulators Based on Kolmogorov-Arnold Networks: This research uses physics-informed machine learning for static friction modeling.
  • DecARt Leg: Design and Evaluation of a Novel Humanoid Robot Leg with Decoupled Actuation for Agile Locomotion: This paper focuses on the design and evaluation of a novel humanoid robot leg.
  • ManipDreamer3D : Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory: This paper focuses on synthesizing plausible robotic manipulation videos.
  • Audio-VLA: Adding Contact Audio Perception to Vision-Language-Action Model for Robotic Manipulation: (Repeated from above). This paper is also relevant to the robotics category.
  • PuffyBot: An Untethered Shape Morphing Robot for Multi-environment Locomotion: This paper discusses an untethered shape-morphing robot.
  • Stochastic Adaptive Estimation in Polynomial Curvature Shape State Space for Continuum Robots: This research focuses on stochastic adaptive estimation for continuum robots.
  • A Shared-Autonomy Construction Robotic System for Overhead Works: This paper presents a shared-autonomy construction robotic system.
  • Towards Embodied Agentic AI: Review and Classification of LLM- and VLM-Driven Robot Autonomy and Interaction: (Repeated from above). This paper is also relevant to the robotics category.

Vision Language Model

Vision Language Models (VLM) are crucial for AI's ability to understand and process both visual and textual information. Here's what's new:

  • Querying Labeled Time Series Data with Scenario Programs: This paper explores querying labeled time series data using scenario programs.
  • Towards Blind and Low-Vision Accessibility of Lightweight VLMs and Custom LLM-Evals: (Repeated from above). This paper is also relevant to the VLM category.
  • Impact of Layer Norm on Memorization and Generalization in Transformers: This research investigates the impact of layer normalization on transformers.
  • OmniVGGT: Omni-Modality Driven Visual Geometry Grounded: (Repeated from above). This paper is also relevant to the VLM category.
  • Text to Robotic Assembly of Multi Component Objects using 3D Generative AI and Vision Language Models: (Repeated from above). This paper is also relevant to the VLM category.
  • SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation: (Repeated from above). This paper is also relevant to the VLM category.
  • Drifting Away from Truth: GenAI-Driven News Diversity Challenges LVLM-Based Misinformation Detection: This research looks at challenges in misinformation detection using LVLMs.
  • vMFCoOp: Towards Equilibrium on a Unified Hyperspherical Manifold for Prompting Biomedical VLMs: This paper focuses on prompting biomedical VLMs on a unified hyperspherical manifold.
  • MonkeyOCR v1.5 Technical Report: Unlocking Robust Document Parsing for Complex Patterns: This technical report details advancements in document parsing using MonkeyOCR.
  • Rethinking Visual Information Processing in Multimodal LLMs: This paper rethinks how visual information is processed in multimodal LLMs.
  • Adaptive Residual-Update Steering for Low-Overhead Hallucination Mitigation in Large Vision Language Models: This research focuses on mitigating hallucinations in LVLMs.
  • VADB: A Large-Scale Video Aesthetic Database with Professional and Multi-Dimensional Annotations: This paper introduces a large-scale video aesthetic database.
  • PROPA: Toward Process-level Optimization in Visual Reasoning via Reinforcement Learning: This paper explores process-level optimization in visual reasoning using reinforcement learning.
  • Causal-HalBench: Uncovering LVLMs Object Hallucinations Through Causal Intervention: This research uncovers object hallucinations in LVLMs.
  • Preconditioned Inexact Stochastic ADMM for Deep Model: This paper explores preconditioned stochastic optimization methods for deep models.

World Model

World Models let AI agents learn a predictive model of their environment and use it to anticipate the outcomes of actions, which supports better decision-making. The latest research includes:

  • Group Spike and Slab Variational Bayes: This paper explores variational Bayesian methods.
  • Querying Labeled Time Series Data with Scenario Programs: (Repeated from above). This paper is also relevant to the world model category.
  • Towards Blind and Low-Vision Accessibility of Lightweight VLMs and Custom LLM-Evals: (Repeated from above). This paper is also relevant to the world model category.
  • Pretrained Joint Predictions for Scalable Batch Bayesian Optimization of Molecular Designs: This research uses pretrained joint predictions for molecular design optimization.
  • Bi-Level Contextual Bandits for Individualized Resource Allocation under Delayed Feedback: This paper focuses on resource allocation using contextual bandits.
  • Belief Net: A Filter-Based Framework for Learning Hidden Markov Models from Observations: This research introduces a filter-based framework for learning HMMs.
  • Dynamic Avatar-Scene Rendering from Human-centric Context: This paper explores dynamic avatar-scene rendering.
  • SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation: (Repeated from above). This paper is also relevant to the world model category.
  • How Worrying Are Privacy Attacks Against Machine Learning?: This research examines the severity of privacy attacks against machine learning models.
  • On the Detectability of Active Gradient Inversion Attacks in Federated Learning: This paper focuses on detecting gradient inversion attacks in federated learning.
  • Strategic Opponent Modeling with Graph Neural Networks, Deep Reinforcement Learning and Probabilistic Topic Modeling: This paper explores strategic opponent modeling.
  • Multi-view Structural Convolution Network for Domain-Invariant Point Cloud Recognition of Autonomous Vehicles: This research focuses on point cloud recognition for autonomous vehicles.
  • Quantifying Climate Policy Action and Its Links to Development Outcomes: A Cross-National Data-Driven Analysis: This paper analyzes climate policy and development outcomes.
  • LocalBench: Benchmarking LLMs on County-Level Local Knowledge and Reasoning: This paper focuses on benchmarking LLMs on local knowledge.
  • On Stealing Graph Neural Network Models: This research explores the theft of GNN models.

This collection of papers highlights the rapid advancements in AI, with significant progress in VLA, Robotics, VLM, and World Models. As the field continues to evolve, these developments will likely drive further innovation and practical applications.

For more detailed information and the full papers, please visit the GitHub page. Check it frequently, as new articles are added every day.

Explore Further: For a deeper dive into cutting-edge AI research, consider exploring the arXiv repository, where many of these papers are first posted as preprints. It offers open access to a wide array of scientific articles in AI and related fields.