In search for the neural mechanisms of autonomous behavior: behavior-driven differential Hebbian learning.

Ralf Der

The story:

The videos show that a simple neural mechanism, behavior-driven differential Hebbian learning (BDDHL), is capable of eliciting behavioral competencies in embodied systems in a completely autonomous way.

Abstract: When Donald Hebb published his 1949 book "The Organization of Behavior" he opened a new way of thinking in theoretical neuroscience which, in retrospect, is very close to contemporary ideas in self-organization. His metaphor of "wiring together" what "fires together" matches very closely the common paradigm that global organization can derive from simple local rules. Although the idea was ingenious for its time and has inspired research for decades, the results still fall short of expectations. For instance, unsupervised as they are, such neural mechanisms should be able to explain and realize the self-organized acquisition of sensorimotor competencies. This paper proposes a new synaptic law which replaces Hebb's original metaphor by that of "chaining together" what "changes together". Starting from differential Hebbian learning, the new rule grounds the behavior of the external physical world directly in the internal synaptic dynamics. Neurorobotics is an ideal testing ground for this behavior-based extrinsic plasticity. This paper focuses on the close coupling between body, control, and environment in challenging physical settings. The examples demonstrate how the new synaptic mechanism induces a self-determined "search and converge" strategy in behavior space, generating a variety of sensorimotor competencies. The emerging behavior patterns are qualified by involving body and environment in an irreducible conjunction with the internal mechanism. The results may not only be of immediate interest for the further development of embodied intelligence. They also offer a new view on the role of self-learning processes in natural evolution and in the brain.
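The contrast between "fires together" and "changes together" can be illustrated with a generic differential Hebbian update, in which a synaptic weight grows with the product of the rates of change of its input and output rather than with the product of the activities themselves. The sketch below is only an illustration of that general idea under stated assumptions. It is not the BDDHL rule of the paper, whose exact form is not given on this page. The function names, the decay term, and the parameter `eps` are hypothetical choices made for the example.

```python
def rate_of_change(prev, curr, dt):
    """Finite-difference estimate of the time derivative of a signal vector."""
    return [(c - p) / dt for p, c in zip(prev, curr)]


def differential_hebbian_step(weights, dx, dy, eps=0.1):
    """One generic differential Hebbian update with decay (illustrative only).

    weights[i][j] connects input j to output i and is moved toward
    dy[i] * dx[j], the product of the output and input rates of change:
        w_ij <- w_ij + eps * (dy_i * dx_j - w_ij)
    The decay toward zero is an assumption added to keep weights bounded.
    """
    return [
        [w + eps * (dy_i * dx_j - w) for w, dx_j in zip(row, dx)]
        for row, dy_i in zip(weights, dy)
    ]


# Two sensor channels and one output, sampled at two consecutive time steps.
x_prev, x_curr = [1.0, 2.0], [1.5, 1.0]
y_prev, y_curr = [0.0], [0.25]
dt = 0.5

dx = rate_of_change(x_prev, x_curr, dt)  # [1.0, -2.0]
dy = rate_of_change(y_prev, y_curr, dt)  # [0.5]

w = differential_hebbian_step([[0.0, 0.0]], dx, dy, eps=0.5)
print(w)  # [[0.25, -0.5]]
```

With constant signals both derivatives vanish, so in this sketch the weights only decay. Such a rule is therefore driven by changes in the signals, not by their static values.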

More videos on snakes, fighting, tumbling, and jumping hexapods here.

See also our book The playful machine (by Ralf Der and Georg Martius) and our robotics page on self-organizing robots.

There, you also find applications to hardware robots like the Semni or the Stumpy robot.


Video S1: Spherical robot on level ground I. In the beginning, the robot is kicked by a mechanical force (an attracting force center marked by the red dot) so that it starts rolling. This initial motion is rapidly picked up and amplified by the BDDHL rule so that a stable rolling mode is generated. Subsequent kicks can switch the mode, but after a while the robot returns to one of its most stable modes, rotating around one of its axes (the learning of the response matrix A is slow in this scenario).

Video S2: Spherical robot on level ground II. Example of a fast rolling mode reached with faster model learning. Note that the beginning of the video is shown in slow motion.

Video S3: Spherical robot in a circular basin. When started in a basin, the robot goes through a short "orientation phase" with irregular behavior and then reaches a limit cycle, circulating in the basin. The radius of the circle changes as the learning parameters are slowly changed.

Video S4: Interacting spherical robots in a circular basin. With more than one robot, and despite the interactions between them, the robots reach a stable circular motion adapted to the geometry of the basin. At time 02:30 the learning rate is increased, whereupon the robot speeds up and eventually leaves the basin.

Video S5: The crawling snake. Crawling mode of the snake bot. The mode is excited spontaneously by the universal learning algorithm without any external help. In particular, there is no central pattern generator; the only sensor information the robot gets is from its own joint angles.

Video S6: Switching modes by external influences. When forces are applied (the red dot), the snake may switch modes. After the first interaction, a kind of sidewinding mode is established, which then switches into a forward crawling mode that is fairly stable against external perturbations. In the end, the direction of crawling is inverted, reflecting the perfect forward-backward symmetry of the snake bot. Note that the video is sped up by a factor of 3.

Video S7: The constitutive role of the agent-environment coupling. The physical coupling between agent and environment (here the contact with the ground) is constitutive for the formation and persistence of the modes. In the video, gravity is switched off (at time 05:45) so that the robot loses contact with the ground, leading to the rapid decay of the crawling mode. After gravity is switched on again, a new mode (a kind of sidewinding) emerges, showing that the old mode was completely forgotten.

Video S8: Effects of the agent-environment coupling I. Collisions with obstacles elicit strong reactive forces in the joints which cannot be compensated completely by the motors, leading to rapid changes in the values of the joint angle sensors. This provokes, via the learning rule, a change in the behavior of the system as a whole, here an inversion of locomotion velocities when colliding with the wall. In between the collisions the robot seems to remember the shape of its confinement. See also video S9. The video is sped up by a factor of 5.

Video S9: Effects of the agent-environment coupling II. Another reaction pattern when colliding with the wall. See video S8 for more details on this effect. The video is sped up by a factor of 10.

Video S10: Stotting behavior. A stotting-like behavior is observed after time 29:40, with the robot jumping vertically by moving all six legs (nearly) simultaneously.

Video S11: Emerging locomotion pattern.

================ Additional videos =================

Video A1: Hexapod modes. Each of the legs is equipped with 5 passive tarsus elements increasing the complexity of the physical system tremendously. Note how after a while metastable oscillatory modes emerge (at times 10:40 and 11:30).

Video A2: Agent-environment coupling: the fighting humanoids. Both robots have strong magnets as their hands which are temporarily switched on and off (red color when on). The neural controller is driven by the behavior-based differential Hebbian learning mechanism proposed in the paper. The only sensor information each robot gets is from its joint angles which are largely influenced by the interactions with both the opponent and the ground. The gray disks are repellers pushing the robots back into the fighting arena.

Video A3: Agent-environment coupling: The fighting hexapods. The agent-environment coupling is most intense when the environment is dominated by other robots in strong physical contact. Once the robots are in contact, the physical reality of each of them is strongly determined by the counteractions of its opponent. In the experiments, the only sensor information the robot gets is from its own joint angles.

Video A4: Hexapod with "armband" robot.

Video A5: Walking-like pattern.

Video A6: Predator's jump.
