Verifiably Following Complex Robot Instructions with Foundation Models

Supplementary Videos and Demonstrations
LIMP teaser figure.

This figure visualizes our approach, Language Instruction grounding for Motion Planning (LIMP), executing the instruction: "Bring the green plush toy to the whiteboard in front of it, watch out for the robot in front of the toy". LIMP has no prior semantic information about the environment; instead, at runtime, our approach leverages VLMs and spatial reasoning to detect and ground open-vocabulary instruction referents. LIMP then generates a verifiably correct task and motion plan that enables the robot to navigate from its start location (yellow, A) to the green plush toy (green, B), execute a pick skill that searches for and grasps the object, navigate to the whiteboard (blue, C) while avoiding the robot in the space (red circles), and finally execute a place skill that sets the object down.
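For concreteness, the Python sketch below illustrates the high-level flow the figure depicts: translate the instruction, ground its open-vocabulary referents, then plan and execute navigation and manipulation skills. It is only an illustrative skeleton; all function names, data structures, and coordinates are hypothetical placeholders and do not reflect the actual LIMP implementation.

# A minimal, hypothetical sketch of the pipeline shown in the figure:
# instruction translation, open-vocabulary referent grounding, and
# task-and-motion planning. Names and coordinates are placeholders.
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Referent:
    phrase: str                    # open-vocabulary phrase, e.g. "green plush toy"
    position: tuple[float, float]  # grounded (x, y) location in the map frame


def translate_instruction(instruction: str) -> list[str]:
    # Placeholder: an LLM would translate the instruction into a temporal-logic
    # task specification; here we only extract the ordered referent phrases.
    return ["green plush toy", "whiteboard", "robot"]


def ground_referents(phrases: list[str]) -> dict[str, Referent]:
    # Placeholder: a VLM detector plus spatial reasoning would localize each
    # phrase in the robot's map at runtime; here we return fixed coordinates.
    fake_positions = {
        "green plush toy": (2.0, 1.0),
        "whiteboard": (5.0, 3.0),
        "robot": (3.5, 2.0),
    }
    return {p: Referent(p, fake_positions[p]) for p in phrases}


def plan_and_execute(instruction: str) -> None:
    phrases = translate_instruction(instruction)
    grounded = ground_referents(phrases)
    # Placeholder: a task-and-motion planner would verify the skill sequence
    # against the task specification; here we simply print the plan skeleton.
    print("navigate to", grounded["green plush toy"].position)
    print("pick: green plush toy")
    print("navigate to", grounded["whiteboard"].position,
          "while avoiding", grounded["robot"].position)
    print("place: green plush toy")


if __name__ == "__main__":
    plan_and_execute(
        "Bring the green plush toy to the whiteboard in front of it, "
        "watch out for the robot in front of the toy"
    )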

Real-World Demonstration
(All robot videos are shown at 1x speed.)

Below are additional real-world examples of LIMP generating task and motion plans to follow expressive instructions with complex spatiotemporal constraints. Each demonstration has two videos: the top video visualizes instruction translation, referent grounding, task-progression semantic maps, and the computed motion plan for that example; the bottom video shows a robot executing the generated plan in the real world. Please see our paper for more details on our approach.


Demo 1

Demo 2

Demo 3