Figure 2. UMAP ordination of the WMD dataset with samples coloured according to three large taxonomic groups (Mysticete, Odontocete, and Pinniped). Pinniped sample points were plotted at double size to improve visualisation.
Within the Mysticete group, only three families contained enough samples to be considered for further analysis: Balaenopteridae, Balaenidae, and Eschrichtiidae. In the subsequent UMAP ordination, Balaenidae samples overlapped almost completely with Balaenopteridae vocalisations, close to the plot centre (Fig. 3). Eschrichtiidae samples, the least represented label (i.e., the minority label) for the Mysticete group, clustered in four distinct areas of the UMAP plot.
The Odontocete group was dominated by the Physeteridae family, which represented the majority label for the subset, followed by Delphinidae and Monodontidae (Fig. 4). Phocoenidae vocalisations were the minority label and, similarly to Eschrichtiidae, samples belonging to this family formed small clusters scattered across the UMAP plot area.