Methods

Data acquisition and processing

We retrieved occurrence records for two fish classes (Actinopterygii and Sarcopterygii) in Ugandan water bodies from the GBIF online data repository (GBIF 2020). We used the occ_download_get function in the rgbif package to retrieve the data (Chamberlain et al. 2020). Except for the genera Astatoreochromis and Pseudocrenilabrus, we changed all other haplochromine cichlid genera to Haplochromis to conform with FishBase nomenclature (Froese & Pauly 2019), which is based on Oijen (1996). Thus, names such as Astatotilapia nubila were changed to Haplochromis nubilus; Schubotzia eduardiana to H. eduardianus; and Astatotilapia pallida to H. pallidus. Occurrences that were outside the geographic range described in FishBase, and whose identity could not be verified based on recent survey data, were discarded. We also excluded all occurrences without complete scientific names (genus and specific epithet), e.g., Haplochromis sp. and Oreochromis sp. Occurrences with unknown or incorrect water bodies were excluded; e.g., all records of H. eduardii from Lake Albert in the GBIF datasets were discarded, as the species is endemic to Lake Edward (Froese & Pauly 2019). For occurrence records without a named waterbody of origin but with coordinates, the waterbody was determined from the GPS coordinates. We used habitat descriptions, verbatim locality, and location remarks to identify the waterbody of origin. We also discarded all occurrences from man-made water bodies, such as ponds, tanks, and aquaria. Lakes Salisbury and Kasudho were renamed Bisina and Kasodo, respectively, to conform to the current names and avoid duplication of records for the same lakes. We categorized lakes <200 km² as small and lakes >200 km² as large, resulting in 7 large and 37 small lakes (Appendix S1 and S2). After preliminary processing of the data, a total of 14,452 occurrence records were retained for further analysis.
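Some of the rule-based cleaning steps above can be sketched in Python. This is only an illustration under stated assumptions, not the code used in the study: the record fields (species, waterbody, habitat) are hypothetical and are not GBIF column names, the name map contains only the three examples given in the text, and the treatment of a lake of exactly 200 km² is an assumption because the text does not specify it. The sketch does not cover range verification against FishBase or assigning water bodies from GPS coordinates.

```python
# Illustrative cleaning rules; field names are hypothetical, not GBIF columns.

# Only the three renamings given as examples in the text.
NAME_MAP = {
    "Astatotilapia nubila": "Haplochromis nubilus",
    "Schubotzia eduardiana": "Haplochromis eduardianus",
    "Astatotilapia pallida": "Haplochromis pallidus",
}

# Outdated lake names mapped to current names.
LAKE_RENAMES = {"Salisbury": "Bisina", "Kasudho": "Kasodo"}

# Man-made habitats whose records are discarded.
MANMADE = {"pond", "tank", "aquarium"}


def has_complete_name(name):
    """True if the name has a genus and a specific epithet (not 'sp.')."""
    parts = name.split()
    return len(parts) >= 2 and parts[1].rstrip(".") not in {"sp", "spp"}


def clean_record(record):
    """Return a cleaned copy of the record, or None if it is discarded."""
    name = record["species"].strip()
    if not has_complete_name(name):
        return None
    if (record.get("habitat") or "").lower() in MANMADE:
        return None
    lake = record.get("waterbody")
    if not lake:
        return None  # unknown waterbody
    return {
        "species": NAME_MAP.get(name, name),
        "waterbody": LAKE_RENAMES.get(lake, lake),
    }


def size_class(area_km2):
    """Small below 200 km2, large otherwise (exactly 200 is an assumption)."""
    return "small" if area_km2 < 200 else "large"
```

For instance, a record of Astatotilapia nubila from Lake Salisbury would come out as Haplochromis nubilus from Lake Bisina, whereas a record of Haplochromis sp. would be dropped.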

Conservation priority index formulation

We retrieved the conservation status of each species from the International Union for Conservation of Nature (IUCN) Red List database (www.iucnredlist.org/). The species are classified as data deficient (DD), least concern (LC), near threatened (NT), vulnerable (VU), endangered (EN), critically endangered (CR), or not evaluated (NE) (IUCN 2012). We used waterbody surface area, species richness, IUCN status, rarity, and a scaling constant to develop the conservation priority index (CPIw), based on the formula:
\begin{equation} \mathrm{CPIw} = \frac{\sum_{i=1}^{n} \mathrm{Cwt}_{i} \cdot \mathrm{R}_{i}}{\mathrm{Aw} \cdot 7} \nonumber \end{equation}
where, for each species i in a waterbody, Cwti is the species weight based on its IUCN status (i.e., CR (5), EN (4), VU (3), NT (2), LC (1)); n is the number of species in the waterbody; Aw is the total surface area of the lake, so that the index is expressed per unit surface area; and Ri is the rarity weight of the species, based on the number of water bodies in which it occurred (i.e., one waterbody (weight 5), 2-3 water bodies (weight 4), 4-5 water bodies (weight 3), 6-10 water bodies (weight 2), and >10 water bodies (weight 1)). The value 7 is a scaling constant, indicating the total number of IUCN categories (IUCN 2012). Note that fish species registered as NE or DD in the IUCN database were assigned a Cwti of 5 on the basis that such species can go extinct unnoticed and, therefore, should be considered in the same category as CR species (IUCN 2012). The surface area of each lake was obtained from the literature (Burgis & Symoens 1987; Vanden Bossche & Bernacsek 1990; Ogutu-Ohwayo et al. 1999; Schofield & Chapman 1999; Olowo et al. 2004). We used Google Earth to approximate the surface area of lakes Gawa, Kabaleka, Wamala, Nakabale, Owapet, Kirimira, and Kabaka because these values were not found in the literature. The surface area of Lakes Natuali, Chankaranga, Okurachere, Kasunju, Nkuruba, and Mutabyo could not be determined from either the literature or Google Earth, and thus we could not calculate their CPIw values.
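The index calculation can be illustrated with a minimal Python sketch that follows the weights defined above. The function names and the example lake are hypothetical and do not come from the study data, and the surface area is assumed to be in km².

```python
# Weights for IUCN status as defined in the text; NE and DD are treated as CR.
STATUS_WEIGHT = {"CR": 5, "EN": 4, "VU": 3, "NT": 2, "LC": 1, "NE": 5, "DD": 5}

# Scaling constant from the formula.
SCALING_CONSTANT = 7


def rarity_weight(n_waterbodies):
    """Rarity weight Ri from the number of water bodies a species occurs in."""
    if n_waterbodies < 1:
        raise ValueError("a species must occur in at least one waterbody")
    if n_waterbodies == 1:
        return 5
    if n_waterbodies <= 3:
        return 4
    if n_waterbodies <= 5:
        return 3
    if n_waterbodies <= 10:
        return 2
    return 1


def cpiw(species, area_km2):
    """CPIw for one waterbody.

    species: iterable of (iucn_status, n_waterbodies) pairs, one per species.
    area_km2: surface area Aw (unit assumed to be km2).
    """
    if area_km2 <= 0:
        raise ValueError("surface area must be positive")
    total = sum(STATUS_WEIGHT[status] * rarity_weight(n) for status, n in species)
    return total / (area_km2 * SCALING_CONSTANT)


# Hypothetical lake of 10 km2 with three species:
# (5*5 + 1*1 + 5*3) / (10 * 7) = 41 / 70
example = cpiw([("CR", 1), ("LC", 12), ("DD", 4)], area_km2=10)
```

Because the sum is divided by surface area, a small lake holding a few threatened, narrowly distributed species can score higher than a large lake with many widespread species of least concern.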