close
close

Multidimensional social influence drives leadership and composition-dependent success in octopus–fish hunting groups

Procedures were approved by the Max Planck Institute of Animal Behaviour, the Department of Agriculture and Fisheries Ethics Committee from Queensland, Australia, and the Great Barrier Reef Marine Parks Authority under permit G23/47925.1.

Location and recording procedures

Using SCUBA, we recorded interspecific hunting scenes between O. cyanea and multiple partners. Fieldwork spanned 1 month, between 1 October 2018 and 1 November 2018 (29.5577° N, 34.9519° E, Eilat, Israel), in a total of ~120 h of diving (~60 h each diver). Dives were performed two to three times per day, at relatively shallow depths (5–15 m), allowing ~2–3 h underwater per day, complying with local scientific diving regulations. Given that these hunts are not stationary, we adopted a search-and-follow procedure while maintaining a distance of >5 m to minimize disturbing natural interactions. We used two full-frame Sony Alpha 7SII with Zeiss 2/25 mm wide lenses mounted on an aluminium structure, as a stereocamera setup (hereafter ‘stereocamera rig’; Fig. 1b). A third full-frame Sony Alpha 7SII with Sony f/4 24–70 mm lens served as a focal camera for the octopus (hereafter ‘zoom camera’). Together with an additional ~30 min dataset in Egypt (El Quseir, 26.1014° N, 34.2803° E), video from the ‘zoom camera’ was used to quantify temporal characteristics of web-overs in solitary octopuses. We registered web-over frequency as the least subjective component of octopus foraging, and considered web-over duration as the timespan in which the octopus exhibits whitening of the interbrachial web skin over a specific habitat feature. There were no significant differences between data collected in Israel and Egypt during solitary hunting, regarding either web-over frequency (Poisson GLMM, n = 14, z = −0.661, P = 0.509) or duration (Gaussian GLMM, n = 107, z = −1.166, P = 0.244). Videos were all filmed at 25 fps with 4k resolution, and cameras were synchronized in Adobe Premiere via the timestamp of an underwater horn at the start of all recordings.

Field experiments

Complementary field experiments were conducted at Lizard Island Research Station in Australia, where baited (prawn Penaeus monodon) or empty U-PVC tee-fitting structures were placed ~50 cm from octopuses, to gauge how web-over characteristics, that is, occurrence (interaction with the structure or not) and duration of web-overs, were impacted by (1) presence of food, (2) previous strikes on the structure by fish and (3) presence of fish (Supplementary Fig. 16 and Supplementary Video 5). We used 5 min (300 s) of web-over duration as maximum trial time, as this meant that the octopus had taken the structure back to its den (in these cases, web-over duration was not considered).

Scene reconstructions and supplementary data sorting

Taking advantage of stereopsis provided by the overlap of the two cameras’ field views, the stereocamera rig allowed reliable and accurate 3D tracking of overall group collective movement and 3D reconstruction of habitat features. First, using computer-vision methods, videos were run through a structure-from-motion and multiview stereo open-source pipeline named ‘colmap’44. The concept of structure from motion allows the retrieval of 3D information from two-dimensional (2D) images, by matching key points of a stationary background over several video frames. Adding two camera positions per time frame (multiview stereo) then allows key points to be triangulated in 3D space and decreases camera projection errors. In addition, ‘colmap’ also performs intrinsic camera calibration, undistorting images due to different lens types and camera features, and extrinsic calibration by calculating the position of one camera relative to the other per time frame. This extrinsic calibration per time frame yields the camera positions relative to the reconstructed habitat at all time frames, allowing us to recreate the path taken by the cameras while filming (see example in Fig. 1b), along with a high-resolution 3D spatial reconstruction in which all habitat features across hunting scenes are detailed. We manually tracked individuals in the videos using the software Computer Vision Annotation Tool. We annotated three frames per second, which yielded a time resolution of 0.33 s for animal movement. Combining both cameras views, this sampling effort represents a total of ~500,000 individual annotations. We annotated all individuals in a collective hunt, specifically the left eye of the octopus and the end of the rostrum of the fish, ensuring consistency between different annotators.

We then used another software developed to incorporate the previously tracked animals in each camera in the ‘colmap’ habitat models and camera paths, ‘multiviewtracks’ or ‘mvt’29. Similarly to how key points were triangulated in the habitat reconstruction phase, individual positions were triangulated from the stereocamera rig’s known camera relative positions, and their movements reconstructed in three dimensions from the entire camera path (and therefore nullifying camera motion). Next, we specified the known world distance between the two cameras for scale (1.2 m) and obtained individual trajectories in real xyz coordinates. Taking advantage of knowing the real-world distance between cameras, we were also able to calculate reconstruction accuracy and reprojection camera errors. Reconstructions from ‘mvt’ had a remarkable median accuracy of 0.0001 m (that is, 0.1 mm) and a 3-sigma limit of 0.01 m (3 × s.d. error, that is, ~99.7% of the data). Finally, to further account for potential jittering arising from manual tracking, we searched and removed position outliers (x, y or z coordinate values diverging three times the standard deviation from the last three annotated frames), linearly interpolated missing values (up to 12 frames, or 4 s) and applied a Savitzky–Golay filter to smoothen data encompassing a time window of 19 frames (package ‘SciPy’).

In total, 3.5 h of collective hunting was reconstructed (example in Fig. 1b and Supplementary Video 2), composed of 13 different scenes representing different groups of interspecific hunting. From these 13 groups, we collected data and maintained the individual identity of 13 octopuses, 22 blacktips, 12 barbel goatfish, 10 lyretails, 20 blue goatfish and 4 yellow goatfish. The size of the groups varied between 2 and 10 individuals, with an average presence of 5.72 individuals, always with 1 octopus present. The shortest scene recording of a hunting group was 100 s, while the largest reached approximately 1,800 s (limited by camera battery). As movement frequency was nonlinear across time, particularly in long-duration groups (Supplementary Fig. 1), we standardized length variability to better show the dynamics of the groups. We used the smallest scene length of a hunting group (100 s) to divide all hunting groups into 100 s blocks, totalling 107 hunting subgroups. In the end, our reconstructed subgroups provided a sample size of 132 blacktips, 95 barbel goatfish, 42 lyretails, 107 octopuses, 102 blue goatfish and 23 yellow goatfish (see ‘Statistics’ for techniques used to deal with data dependency). We further re-ran analyses on the main estimated parameters using 200 s (Supplementary Fig. 18a,b) and 300 s (Supplementary Fig. 18c,d) blocks and confirmed that the specific time length chosen did not impact our results. Details on the number of blocks per group and overlap of species in each subgroup or group are provided in ref. 30, and dyadic sample sizes are shown in Supplementary Table 51. Our species-specific analyses were restricted to the abovementioned six species categories (or phenotypes) to assure robust statistical assessments. However, other species that were sporadically part of groups were the lionfish Pterois miles, the abudjubbe wrasse Cheilinus abudjubbe, Klunzinger’s wrasse Thalassoma rueppellii, the bandcheek wrasse Oxycheilinus digrammus and the sand lizardfish Synodus dermatogenys. Trajectories of these individuals were still tracked and taken into account when calculating group parameters (for example, when computing group centroids).

Pull-and-anchor analysis

The main analytical methodology used to quantify leadership was a modified version of the pull-and-anchor analysis3. In this paper, GPS tracks were 2D as animals moved almost exclusively in a 2D horizontal plane (that is, baboons moving over xy coordinates), so the code was adapted to 3D to include the z coordinate, that is, the vertical movements of the octopus and fishes in the water column. Compared with other metrics (for example, directional or speed correlations), pull-and-anchor analyses are particularly suited for groups that show erratic movement patterns and frequently change between tight to sparse formations, as well as for analysing movement sequences over short or long timescales11 (Supplementary Fig. 1).

The pull-and-anchor analysis focuses on assessing variation in dyadic distance between two individuals (i and j, extracting successful and unsuccessful initiation events (resulting in pulls and anchors, respectively). In essence, we look for minimum and maximum values of dyadic distance, until a minimum (t1)–maximum (t2)–minimum (t3) sequence is formed. For each of these interactions, there is one initiator and one potential follower: between t1 and t2, the individual increasing the distance relating to the other (let it be i in this example) is the initiator and the other individual (let it be j) is the candidate follower. After reaching the maximum dyadic distance (t2), the individual that then shortens the dyadic distance dictates whether this sequence was a successful initiation event (that is, pull) or an unsuccessful initiation event (that is, anchor). Following the example above, if the potential initiator i is the one shortening the dyadic distance to j, then this sequence is classified as an anchor event, where j is anchoring i. If, on the other hand, j is the one shortening the dyadic distance, that means that j followed i and it is classified as a pull event, that is, i is pulling j.

Following the original methodology, several steps were taken to ensure that small-scale variations were not included as pull–anchor events by filtering candidate sequences using disparity, strength and noise thresholds. Before candidate sequences were identified per se, to prevent jittering interference and small body part movements being considered as potential events, a noise threshold of 0.1 m was set as the minimum dyadic distance change between i and j. Note that this value is one to two orders of magnitude above the error calculated for the reconstruction and represents approximately 0.5–1 body size of the individuals generally present in the hunts. Disparity was calculated via the difference of covered distance by i and j across each time segment in relation to each other. Complementarily, strength was calculated as the relative change in dyadic distance between each time segment. Both these parameters range between 0 and 1, where, in the case of disparity, values near 0 depict an interaction in which both i and j moved similar distances during each time segment (thus making classification of the event ambiguous), and values near 1 indicate that either i or j performed the majority of movement in each time segment. In the case of strength, values near 0 indicate that the relative change in dyadic distance in each time segment was negligible (that is, i and j were always close together), whereas values near 1 indicate that the distance markedly changed among t1, t2 and t3 (that is, individuals were close, then far apart, then close again). We defined both thresholds at a minimum of 0.1. Using this methodology, we extracted 516 pulls and 664 anchors from the dataset, in a total of 1,180 events.

Pull–anchor metrics were normalized for the number of individuals present in each subgroup and, in species-specific analyses, the number of individuals from each species present. Thus, for each individual i, we quantified:

  1. (1)

    Normalized pulling frequency—number of pulls (that is, successful initiations) per minute (pi)

  2. (2)

    Normalized anchoring frequency—number of events anchoring an initiator per minute (ai)

  3. (3)

    Normalized initiation frequency—number of initiations per minute (pi + ai)

  4. (4)

    Normalized follower frequency—number of events following when solicited by an initiator per minute (fi)

  5. (5)

    Normalized anchored frequency—number of anchors (that is, unsuccessful initiations) per minute (ani)

Moreover, we also quantified ratios between leader–follower categories in pull–anchor events by measuring:

  1. (1)

    Initiation/pulling ratio (pulling efficiency)—number of successful pulling events divided by the total number of initiations

  2. (2)

    Anchoring/anchoring opportunities ratio (anchoring efficiency)—number of events anchoring an initiator divided by the total number of times solicited

  3. (3)

    Pulling/following ratio—number of events pulling divided by the number of events following

  4. (4)

    Initiations/following ratio—number of initiations divided by the number of events following others

See also Supplementary Table 52 for a focus on the metrics that translate to direct influence in others, either by stimulating or inhibiting their movement.

As referred in ref. 11, while observational field data do not explicitly capture causality, it is statistically improbable that all individuals in a group would make the same causally independent decision (despite considerable individual differences), at the same time, across hundreds of decision events, as would be required to explain group cohesion for the duration of hunting scenes. While some level of independent decision-making forms some part of the dynamics we observe, we have taken several steps beyond data thresholding to ensure that the statistical patterns found are consistent, allowing us to state with some confidence that these independent decision-making events become inconsequential to our conclusions.

First, to ensure that individuals were not simply moving randomly in space, we retrieved the angles between the relative movement vectors of the initiator (during t1t2) and the potential follower (during t2t3) when pulls and anchors occurred (Supplementary Fig. 19 and see also 1.general script30). First, we performed a Rayleigh’s test and verified that the frequency distribution is non-uniform, that is, there is preference for movement in certain angle ranges (P 19), a clear difference in probability emerged, statistically different from what one would expect from a 360° random movement (that is, movement equally likely to happen at any given angle). In our system, anchors are actually 2.3667 times more likely than pulls, if an individual would move in a random angle relatively to another. Moreover, if we calculate the probability of pulling or anchoring for each given angle (rounded as integers between 0 and 359), we can then obtain the average of all probabilities across integers, to get the mean probability of a pull or an anchor occurring at chance level. In our system, the mean probability of pulls is ~31.31% (median probability is even lower: 20%) and that of anchors is ~68.69% (median probability: 80%) (see end of 1.general.ipnyb). Noteworthily as well, the narrow angle range at which pulls occur (relative to any other angle) shows a bimodal high-frequency distribution around 30° and −30°, with a small number of pulls occurring near 0° (Supplementary Fig. 19). Thus, in the large majority of cases, pulling does not occur owing to movement inertia—that is, individual j apparently follows i, but such happens because j was moving in the same direction; individuals deviate from their course and actively follow others.

Moreover, we also ensured that the patterns that emerged from the data were consistent over different levels of organization or potential clustering. First, to prevent our results from being driven by simple preferential association with conspecifics, we analysed pulling frequencies without same-species interactions. Second, we analysed pulling frequencies only when an individual was the first initiator of any given follower. As such, we removed any potential pulls stemming from simply associating with the first initiator strongly. Lastly, we calculated the dyadic pull–anchor interactions between individual i and the centroid of the rest of the group, calculated as the mean position of all individuals except i. All results were found to be consistent across the different conditions.

Individual kinematics and group-level metrics

For each individual trajectory at a given time point, we calculated speed as the difference in the Euclidean distance given by the xyz positions in sequential frames normalized per second and tortuosity as the ratio between the average direction change of the trajectory over 12 frames or 4 s (6 frames or 2 s before and 6 frames or 2 s after) and moving in a straight direction. At the group level, we estimated the centroid of the group as the mean of all individuals present in the subgroup and calculated the Euclidean distances from a given individual and said centroid. Moreover, we also estimated the absolute angle between the individual trajectory and the centroid trajectory between frames as the angle to the centroid of the individual. To explore whether differences in individual angles could impact pull-and-anchor outcomes (successful or unsuccessful), we also calculated the average absolute angle between the initiator’s trajectory and the potential follower’s trajectory during initiations (between t1 and t2).

Hierarchical social networks

Social network analysis has been widely used in the fields of sociality and group behaviour over the past decade45. This type of analyses comprises a number of different tools that allow characterization of specific individual roles within groups as well as the structure and characteristics of their social interactions with other members of the group46. Following refs. 47,48, we adapted this approach by including directed social networks (that is, the influence of individual i over individual j, instead of ‘dyadic influence’ by itself) and by constructing a species rank order that depicts the hierarchical influence in groups.

To find species hierarchical structures within groups across different pull–anchor event parameters, we created adjacency matrixes in which pairwise comparisons between species l and m were performed on both sides of the initiator–potential follower axis, normalized by the number of co-occurrences in subgroups. From these matrixes, we used 2D directed social networks to find underlying network hierarchies for each parameter. First, edge direction (that is, arrow) was determined by the highest positive number, signifying, for example, that in a given interaction, species l was more likely to pull m than the opposite. Next, edge wideness was calculated as the difference between the scores of species l and m in pulling each other. In other words, if individual i had a much higher pulling score than j, the edge will be wider and directed to i. To clarify absolute hierarchies in each axis, node positions represent a given species’ mean ± s.e. parameter score per subgroup, for example, the average of all species-specific pulling scores for the species m as the initiator. Lastly, node colours were calculated as the mean value between xy node positions and portrayed using an xy mixed colour gradient, in which brighter and darker tones, respectively, indicate lower or higher values. All hierarchical social networks were built with package ‘NetworkX’ in Python.

Statistics

All statistical analysis were performed in R49. We used GLMM available in the package ‘glmmTMB’ to account for intragroup and intraindividual variability, and used an autoregressive component to deal with temporal correlations. Thus, as a first step, we implemented autoregressive model structures based on the subgroup number within each group, thus not only maintaining identity but also weighing temporal autocorrelation between subgroups within the same group (that is, subgroup + 0|group/individual). When this approach failed, that is, the models failed to converge or were outperformed by simpler GLMMs, we used a nested structure of random effects with individual identity nested within groups, that is, maintained across different subgroups (that is, 1|group/individual/subgroup). For web-over temporal characteristics, we also included field site as the first term of the nested random effects, given additional footage used from Egypt when the octopus was hunting alone.

Statistical models measuring different parameters were implemented as follows. Modelling of continuous variables (for example, speed, distance to centroid) was attempted first using a Gaussian distribution. In case of violation of assumptions, gamma distribution families were then used, first with log link and subsequently with inverse link if necessary. On the other hand, modelling of count data (for example, pulling frequency or anchoring frequency, with the number of individuals present in the subgroup used as an offset) was first attempted using Poisson distributions with identity or log links, depending on data distribution. In case of violation of assumptions or overdispersion, negative binomial distributions were then used. Zero inflation was used for both continuous and count variables in cases of excess of zeros and if necessary to enable model convergence. To measure ratios that constitute binary variables (for example, pull or anchor, within the total number of leadership attempts made), we used binomial distributions. In the case of ratios created from variables that are not binary (for example, the ratio between pulling frequency and anchoring frequency), we used the same approach as modelling count data, but included the second variable (in this example, anchoring frequency) as an offset of the first (in this case, pulling frequency). All models were validated by checking homogeneity of variances, residuals normality and normality of random effects, as well as overdispersion and collinearity when required (for example, with count data and when multiple factors were tested), using the ‘check_model’ function within the package’s ‘performance’ and ‘see’. Pairwise comparisons between levels within significant factors were analysed via Tukey HSD with Tukey multiplicity adjustments considering the number of comparisons made, using the package ‘emmeans’.

Finally, to calculate differences in the proportion of number of occurrences and number of punches received (for example, between different species or differently characterized groups), we used first χ2 tests with continuity corrections measuring the deviation of sampled distributions from random distributions (for example, where punches performed equally to all species) and afterwards performed two-sample equality of proportion tests with continuity correction to test pairwise differences (for example, between different species). Pearson correlations were also calculated to highlight meaningful differences. Results were plotted using ‘ggplot2’ and visually enhanced using Adobe Illustrator.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.