For infants, the first problem in learning a word is to map the word to its referent; a second problem is to remember that mapping when the word and/or referent are again encountered. Recent infant studies suggest that spatial location plays a key role in how infants solve both problems. Here we provide a new theoretical model and new empirical evidence on how the body, and its momentary posture, may be central to these processes. The present study uses a name-object mapping task in which names are encountered either in the absence of their target (experiments 1–3, 6 & 7) or when their target is present but in a location previously associated with a foil (experiments 4, 5, 8 & 9). A humanoid robot model (experiments 1–5) is used to instantiate and test the hypothesis that body-centric spatial location, and thus the body’s momentary posture, is used to centrally bind the multimodal features of heard names and visual objects. The robot model is shown to replicate existing infant data and then to generate novel predictions, which are tested in new infant studies (experiments 6–9). Although spatial location is task-irrelevant in this second set of experiments, infants use body-centric spatial contingency over temporal contingency to map the name to the object. Both infants and the robot remember the name-object mapping even in new spatial locations. However, the robot model shows how this memory can emerge not from separating bodily information from the word-object mapping, as proposed in previous models of the role of space in word-object mapping, but through the body’s momentary disposition in space.