RALLI: Robotic Action-Language Learning through Interaction
Funded by WWTF.
Empirical data
Action Verb Corpus
The corpus comprises multimodal data from 55 episodes (recordings) performed by 15 participants, with a total of 467 instances of simple actions (take, put, push). Audio, video, and motion data (hand and arm) were recorded while participants performed an action and described what they were doing.
Extension to the Action Verb Corpus
The extension dataset consists of 41 recordings made by 2 users experienced with the system, who performed the same three actions as in the Action Verb Corpus: take (208 instances), put (208 instances), and push (91 instances). The actions were performed without any instructions. The extension is intended to facilitate visual action recognition.