Swarm-bots project

The hole/obstacle avoidance task has been designed for studying collective navigation strategies. It can be considered an instance of the broader family of all-terrain navigation tasks, which generally tackle the problem of navigating in complex environments presenting obstacles, rough terrain, holes, gaps or narrow passages. The work presented in this section summarises the research activity on hole/obstacle avoidance. It is worth mentioning that the research focused on the hole avoidance problem alone, while the ability to also avoid obstacles emerged as an interesting generalisation of the synthesised behaviours.

The s-bots are placed in an arena with open borders and holes, into which the swarm-bot could fall. When considering the generalisation to obstacle avoidance, the arena is surrounded by walls. The s-bots are rigidly connected in a swarm-bot formation. Their goal is to explore the arena efficiently, without falling into the holes or remaining stuck against an obstacle.

The control of the swarm-bot is completely distributed, and s-bots can only rely on local information. The problem is how to coordinate the activity of the s-bots. In particular, the difficulty of the collective navigation is twofold: (i) coordinated motion must be performed in order to obtain a coherent navigation of the swarm-bot as a whole, as a result of the motion of its components; (ii) holes/obstacles are not perceived by all the s-bots at the same time. Thus, the presence of a hazard, once detected, must be communicated to the entire group, in order to trigger a change in the direction of motion.

The communication aspect is of central importance. Communication is in fact required for both the activities of coordinated motion and hole/obstacle avoidance. We tested two different communication strategies: (i) direct interactions among s-bots, referring to the transmission of information through physical contacts (i.e., through the connections between the s-bots) and (ii) direct communication, referring to the transmission of information through sound signals (i.e., by means of the speakers and microphones with which the s-bots are provided).

The complexity of the task justifies the use of evolutionary robotics techniques for the synthesis of the s-bots' controller. Here, we show how it is possible to evolve simple neural controllers that are able to perform coordinated motion and hole avoidance, relying only on simple forms of communication among the s-bots.

Each s-bot is provided with a traction sensor, already described here. Traction sensors are responsible for the detection of the direct interactions among s-bots. In fact, an s-bot can generate a traction force that is felt by the other s-bots connected through their grippers. This force mediates the communication among s-bots, and it can be exploited for coordinating the activities of the group (see previous examples). The presence of holes is detected using four ground sensors---infrared proximity sensors pointing to the ground---that are integral with the rotating turret. Each s-bot is also equipped with a loudspeaker and three directional microphones, used to detect the tone emitted by other s-bots.

Each s-bot controls its two wheels, independently setting their speed in the range [-6.5, 6.5] rad/s. In this work, the s-bots start already assembled in a swarm-bot formation. The swarm-bot is composed of four s-bots rigidly connected to form a chain. It is placed in a square arena with 4-metre sides, which has open borders and two rectangular holes (see Fig. 1). The dimensions have been chosen to create passages that can be navigated by the swarm-bot, whatever its orientation.

Figure 1. Left: the arena used for the evolution of hole avoidance behaviors. Right: a close view of a swarm-bot composed of 4 s-bots linearly connected.

The s-bots are controlled by artificial neural networks, whose parameters are set by a simple evolutionary algorithm. A single genotype is used to create a group of s-bots with an identical control structure---a homogeneous group. Each s-bot is controlled by a fully connected, single layer feed-forward neural network---a perceptron.

As mentioned above, we performed two sets of experiments, which differ in the form of communication used. Thus, the neural networks controlling the s-bots change depending on which sensors and actuators are employed. In all the experiments, traction and ground sensors have been used. If direct communication is used, three more sensors are used, corresponding to the three microphones with which an s-bot is endowed. Concerning the actuators, the outputs of the perceptron are used to control the left and the right wheel. When direct communication is used, the activation of the loudspeaker has been handcrafted, simulating a sort of reflex action: an s-bot activates the loudspeaker whenever one of its ground sensors detects the presence of a hole. Thus, the neural network does not control the emission of a sound signal. However, it receives the information coming from the three directional microphones, and evolution is responsible for shaping the correct reaction to the perceived signals.
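The controller described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the number of traction inputs, the normalisation of sensor readings to [0, 1], the sigmoid activation, the hole-detection threshold and all function names are assumptions introduced here. Only the four ground sensors, the three microphones, the two wheel outputs and the speed range come from the text.

```python
import math
import random

MAX_WHEEL_SPEED = 6.5  # rad/s, range given in the text
N_TRACTION = 4         # assumption: the text does not specify the traction encoding
N_GROUND = 4           # four ground sensors (from the text)
N_MICS = 3             # three directional microphones (from the text)


def n_inputs(use_sound):
    """Number of sensory inputs for the DI (False) or DC (True) setting."""
    return N_TRACTION + N_GROUND + (N_MICS if use_sound else 0)


def genotype_length(use_sound):
    """Two output neurons, each with one weight per input plus a bias."""
    return 2 * (n_inputs(use_sound) + 1)


def perceptron_step(genotype, sensors):
    """Map sensor readings in [0, 1] to (left, right) wheel speeds in rad/s."""
    n = len(sensors)
    speeds = []
    for k in range(2):
        weights = genotype[k * (n + 1):(k + 1) * (n + 1)]
        # Last entry of each slice is the bias; zip stops after the n weights.
        activation = weights[-1] + sum(w * s for w, s in zip(weights, sensors))
        out = 1.0 / (1.0 + math.exp(-activation))  # sigmoid (assumed)
        speeds.append((2.0 * out - 1.0) * MAX_WHEEL_SPEED)
    return tuple(speeds)


def loudspeaker_reflex(ground, threshold=0.5):
    """Handcrafted reflex: emit a tone if any ground sensor reports a hole.
    The encoding (high reading = hole) and the threshold are assumptions."""
    return any(g > threshold for g in ground)


# Homogeneous group: the same genotype is cloned onto every s-bot.
rng = random.Random(0)
genotype = [rng.uniform(-10, 10) for _ in range(genotype_length(use_sound=True))]
sensors = [rng.random() for _ in range(n_inputs(use_sound=True))]
left, right = perceptron_step(genotype, sensors)
```

Note that the loudspeaker is driven outside the network, so in this sketch the genotype only encodes the mapping from sensors (including microphones in the DC setting) to wheel speeds.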

During the evolution, each genotype is evaluated 5 times (i.e., 5 trials). Each trial differs from the others in the initial position of the swarm-bot within the arena and the initial orientation of each s-bot's chassis. Each trial lasts 200 simulation cycles, which correspond to 20 seconds of real time. The behaviour produced by the evolved controller is evaluated according to a fitness function that takes into account only variables directly accessible to an s-bot. Fitness is estimated for each s-bot taking into account three components:

Fitness is computed for each s-bot taking part in the experiment. We then select the minimum among the individual fitnesses, that is, the score of the worst-behaving s-bot, which makes the overall fitness computation robust: a genotype scores well only if every s-bot in the group behaves well.
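The aggregation step can be sketched in a few lines. Taking the minimum over the s-bots is what the text describes; averaging the resulting group scores over the trials is an assumption made here, since the text only states that each genotype is evaluated in 5 trials. The function names and the numbers are placeholders, not experimental data.

```python
from statistics import mean


def group_fitness(individual_fitnesses):
    """Fitness of one trial: the score of the worst-behaving s-bot."""
    return min(individual_fitnesses)


def genotype_fitness(trials):
    """Score of a genotype over all its trials.
    Averaging across trials is an assumption of this sketch."""
    return mean(group_fitness(t) for t in trials)


# Placeholder scores for 2 trials of a 4 s-bot swarm-bot (illustrative only).
trials = [
    [0.5, 0.75, 0.25, 1.0],
    [0.75, 1.0, 0.75, 0.875],
]
score = genotype_fitness(trials)
```

With the minimum as the group score, a single s-bot that falls or stalls lowers the fitness of the whole genotype, regardless of how well the others perform.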

For both settings---using only direct interactions (hereafter indicated as DI) and complementing them with direct communication (hereafter indicated as DC)---the evolutionary experiments were replicated 10 times. The average fitness values, computed over all the replications, are shown in Figure 2. All evolutionary runs were successful. It is worth noting that the average fitness of DC is slightly higher than in the case of DI. This suggests that the use of direct communication among s-bots is beneficial for the hole avoidance task.

Figure 2. Average performance of the best individual and of the population, plotted against the generation number.

A qualitative analysis of the behaviours produced by the two settings reveals no particular differences in the initial coordination phase that leads to a coherent motion of the swarm-bot. Coordinated motion is performed in a way very similar to that described here. The differences between the two settings DC and DI become evident once the hole avoidance behaviour is considered. When an s-bot detects an edge, it rotates the chassis and changes the direction of motion in order to avoid falling. When using only direct interactions, this change in direction produces a traction force on the other s-bots. This triggers a new coordination phase for the selection of a new common direction of motion, which leads the swarm-bot away from the edge. This simple behaviour exploits the direct interactions among s-bots---shaped as traction forces---to communicate the presence of a hazard---the hole to be avoided. However, this strategy may fail, as communication via traction is sometimes too weak to be perceived by the whole swarm-bot. In contrast, the evolved controllers that make use of direct communication react faster to the detection of a hole: the s-bot that detects the hole emits a sound signal that is immediately perceived by the rest of the group. Thus, the whole swarm-bot starts turning away from the hole, without waiting to perceive a strong traction force. Traction is then exploited again in order to perform coordinated motion.

In order to quantitatively assess the difference in performance between DC and DI, we performed a post-evaluation analysis and compared the results obtained with the two settings. The best individuals evolved in the different setups have been re-evaluated 200 times. A box-plot summarising the performance of these individuals is shown in Fig. 3. DC generally performs better than DI. On the basis of these data, we performed a two-way analysis of variance to test whether there is a significant difference in performance between the settings. The result of the analysis allows us to reject the null hypothesis that there is no difference between the two settings (p-value < 0.0001). On the basis of the mean performance of the two settings---0.3316 for DC and 0.2708 for DI---we can conclude that, in the experimental conditions considered, a system that uses direct communication among the s-bots performs better than one that exploits only direct interactions.

Figure 3. Post-evaluation analysis performed by evaluating 200 times the best individuals obtained from each replication of the experiment. Boxes represent the inter-quartile range of the data, while the horizontal bars inside the boxes mark the median values. The whiskers extend to the most extreme data points within 1.5 times the inter-quartile range from the box. The empty circles mark the outliers.
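For readers who want to reproduce the box-plot convention used in Fig. 3, the sketch below computes the box, the whiskers and the outliers for one list of scores. The function name is hypothetical, and the quartile interpolation rule (Python's "inclusive" method) is an assumption: plotting tools differ slightly in how they compute quartiles, and the text does not say which rule was used.

```python
from statistics import quantiles


def box_plot_summary(scores, whisker=1.5):
    """Return the box, whisker ends and outliers for a list of scores."""
    data = sorted(scores)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers reach the most extreme data points still inside the fences.
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }


# Illustrative values only, not data from the experiments.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

In this toy example the value 100 lies more than 1.5 times the inter-quartile range above the box, so it would be drawn as an empty circle, while the upper whisker stops at 9.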

The behaviours described above show interesting generalisation properties. The evolved controllers still work in many different environmental conditions, such as a different number of s-bots forming swarm-bots of various shapes, flexible connections among the s-bots, and different arena shapes and sizes. Additionally, if obstacles are scattered in the arena, the evolved controllers produce a collective obstacle avoidance behaviour, making it possible for the swarm-bot to avoid remaining stuck against walls/obstacles.

We tested these generalisation features in an arena containing holes and surrounded by walls, using the best controller produced by the DI setting (i.e., using direct interactions only). The swarm-bot is able to avoid falling into the holes, and it also reacts to collisions with walls, changing direction of motion and moving away from them. This behaviour is made possible by the traction sensors. When an s-bot hits an obstacle, its turret exerts a force on the chassis in a direction opposite to the obstacle. This force is felt as a traction pulling the s-bot away from the obstacle. In response, the s-bot rotates its chassis in order to cancel the traction. Moreover, the rigid connections between s-bots transmit the force resulting from the collision to the whole group, triggering a fast change in the direction of movement of the swarm-bot. In this case, the traction sensors can be seen as a bumper that detects collisions and triggers a reaction. It is interesting to notice that this bumper is omni-directional and distributed throughout the whole body of the swarm-bot, allowing a coherent reaction to the detection of obstacles.

The only drawback of this generalisation property is that an obstacle can be detected only by colliding with it. In order to avoid this, we tested a second setting, using the ground sensors also as proximity sensors for obstacles. Ground sensors are merely proximity sensors pointing to the ground with an inclination of 30 degrees, so they can also be used for obstacle detection. In this setting, turns near the obstacles are smoother than before, as there are no collisions between the swarm-bot and the walls. This method is of particular interest because, by using the ground sensors for obstacle as well as for hole detection, it is suitable for triggering direct communication among the s-bots through sound signalling, as was the case for the DC experimental setting described above. We observe, in this case, that the reaction to the detection of a wall is fast and reliable, and the resulting behaviour is qualitatively comparable to that obtained for hole avoidance alone.

Hole/obstacle avoidance

Experimental setup

Results
