Research Intern: Vision-Language-Action and World Models for Autonomous Systems

Job Number: P25INT-43
Honda Research Institute USA (HRI-US) is seeking a highly motivated intern to join its intelligent autonomy and AI research efforts. This role focuses on advancing approaches for handling rare, long-tail scenarios in autonomous driving by exploring complementary modeling paradigms. The candidate will work with modern multimodal and predictive modeling techniques, including vision-language(-action) models and world modeling approaches, to better understand and represent complex real-world situations. The work will contribute to improving the robustness, interpretability, and reliability of intelligent autonomous systems.
San Jose, CA

Key Responsibilities

  • Develop multimodal and predictive models, including vision-language(-action) and world models, using post-training (e.g., fine-tuning) to improve performance in rare, safety-critical scenarios.
  • Curate and preprocess datasets from public benchmarks with a focus on long-tail and edge-case conditions.
  • Design experiments to evaluate model behavior in complex scenarios and analyze results to identify strengths, limitations, failure modes, and potential improvements in rare-event settings.
  • Collaborate with cross-functional teams to align research direction and technical goals.
  • Contribute to a portfolio of patents, academic publications, and prototypes to demonstrate research value.

Minimum Qualifications

  • M.S. in Computer Science, Electrical Engineering, Robotics, Artificial Intelligence, Machine Learning, or a related field.
  • Strong background in machine learning, deep learning, or multimodal AI, including experience with vision-language(-action) models and/or world models.
  • Experience with model training, fine-tuning, or large-scale data processing.
  • Proficiency in Python and ML frameworks (e.g., PyTorch, TensorFlow).
  • Strong written and verbal communication skills, with the ability to present technical ideas and results clearly to diverse audiences.

Bonus Qualifications

  • Ph.D. in Computer Science, Electrical Engineering, Robotics, Artificial Intelligence, Machine Learning, or a related field.
  • Familiarity with autonomous systems, robotics, or mobility-related datasets.
  • Experience with parameter-efficient training methods (e.g., LoRA, adapters).
  • Exposure to long-tail/edge-case analysis or safety-critical systems.
  • Strong analytical and problem-solving skills for diagnosing model behavior.
  • Publication record in top-tier conferences (e.g., CVPR, ICCV, ECCV, WACV, NeurIPS, ICLR).


Years of Work Experience Required
0

Desired Start Date
8/31/2026

Internship Duration
3 Months

Position Keywords
Multimodal learning, vision-language(-action) models, world models, long-tail scenarios, autonomous driving




