Materials and Methods

Sample preparation

Diatom sampling was performed at 114 sites along the South Korean coast (Fig. 1 and Table S1) during January and February 2010. We chose sites that were accessible from the coast for phytoplankton netting and were < 40 km apart, in order to achieve full coverage of Korean waters. One-liter water samples were collected in clean polyethylene bottles for quantitative analysis, and phytoplankton samples were taken using a 20 μm mesh net for qualitative analysis. Collected samples were immediately fixed with 5% Lugol’s solution (Sigma, St. Louis, MO, USA) and transported to the laboratory. Environmental data, including temperature, salinity, and pH, were measured in situ using a YSI-6600 portable meter (YSI, Yellow Springs, OH, USA).

Diatom assemblage analysis

The fixed water samples were allowed to settle for 1 d, and the supernatant was then removed to concentrate the phytoplankton. Total diatom abundance in each 1 L water sample (minimum 600 cells per sample) was determined using a Sedgwick–Rafter counting chamber under a light microscope (LM; Axioskop 40, Zeiss, Germany), and diatom diversity and sample composition were then assessed.
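The scaling from a chamber count back to the original water sample can be sketched as follows. This is an illustration only: the concentrate volume is not reported here, so `concentrate_ml` and the example numbers are hypothetical, and the function name is ours. The 1 mL default reflects the nominal capacity of a Sedgwick–Rafter chamber.

```python
def cells_per_litre(cells_counted, concentrate_ml, counted_ml=1.0, sample_l=1.0):
    """Scale a chamber count to cells per litre of the original water sample.

    cells_counted  -- cells counted in the chamber
    concentrate_ml -- volume of the settled concentrate (hypothetical, not reported)
    counted_ml     -- volume examined in the chamber (1 mL nominal capacity)
    sample_l       -- volume of the original water sample in litres
    """
    if counted_ml <= 0 or sample_l <= 0:
        raise ValueError("counted_ml and sample_l must be positive")
    # Cells in the whole concentrate, divided by the original sample volume.
    return cells_counted * (concentrate_ml / counted_ml) / sample_l


# Hypothetical example: 45 cells counted in 1 mL of a 20 mL concentrate.
example_abundance = cells_per_litre(45, concentrate_ml=20.0)
```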
For positive identification of diatom species, cellular organic material was removed using equal amounts of KMnO4 and HCl in a 70 °C water bath until the sample became clear, and the acid was then removed by rinsing five times. Selected cleaned samples were mounted in Pleurax (cat. no. 139-06682; Wako, Japan) and observed under the LM equipped with a CCD camera (AxioCam MRc5; Zeiss, Germany). For examination using a scanning electron microscope (SEM; JSM7600F, Jeol, Tokyo, Japan), the rest of the cleaned samples were filtered onto a polycarbonate membrane (3.0-μm pore size; TSTP02500, Millipore, Bedford, MA, USA), which was then air-dried. The membranes were attached to an aluminium stub using carbon tape and then sputter-coated with gold. The SEM was operated at an accelerating voltage of 5 kV with a 10 mm working distance.

Statistical analysis

Species that contributed ≥ 1% of the total diatom assemblage in at least one sample were selected for numerical analysis, resulting in 156 diatom taxa being used. Diatom assemblage diversity was calculated using the Shannon–Wiener diversity index (Shannon and Weaver 1949). The absolute abundance of each species was fourth-root transformed to normalize the skewed composition data, and pairwise distances between sampling sites were calculated using the Bray–Curtis dissimilarity index.
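The taxon filter, diversity index, transformation, and distance described above can be illustrated with a short sketch. It is written in plain Python for illustration and is not the R code used in this study. The natural logarithm in the Shannon–Wiener index is an assumption (the base is not stated), and all function names and example counts are hypothetical.

```python
import math


def select_taxa(samples, threshold=0.01):
    """Indices of taxa reaching >= threshold relative abundance in at least one sample."""
    keep = []
    for j in range(len(samples[0])):
        for counts in samples:
            total = sum(counts)
            if total > 0 and counts[j] / total >= threshold:
                keep.append(j)
                break
    return keep


def shannon_index(counts):
    """Shannon-Wiener diversity H' = -sum(p_i * ln p_i); natural log is assumed."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def fourth_root(counts):
    """Fourth-root transform, which down-weights the most abundant taxa."""
    return [c ** 0.25 for c in counts]


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum|x_i - y_i| / sum(x_i + y_i)."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical counts for two sites and four taxa.
site_a = [16, 0, 81, 1]
site_b = [0, 16, 81, 1]
distance_ab = bray_curtis(fourth_root(site_a), fourth_root(site_b))
```

Applying `bray_curtis` to every pair of transformed site vectors gives the pairwise distance matrix that the clustering step takes as input.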
To identify spatial similarity between sampling sites, we applied eight hierarchical clustering methods to the pairwise distance matrix: the single linkage method, the complete linkage method, the unweighted pair-group method using arithmetic averages (UPGMA), the weighted pair-group method using arithmetic averages (WPGMA), the unweighted pair-group method using centroids (UPGMC), the weighted pair-group method using centroids (WPGMC), and two variants of Ward’s minimum variance method (ward.D and ward.D2). The degree of data distortion introduced by each of the eight methods was then assessed using cophenetic correlation coefficients (Sokal and Rohlf 1962). Pairwise distances between sampling sites were calculated using the “vegan” package (Oksanen et al. 2013), and clustering was visualized using the “factoextra” package (Kassambara and Mundt 2017), both in R (the R Project for Statistical Computing, supported by the R Foundation for Statistical Computing).
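The comparison of linkage methods by cophenetic correlation can be sketched as below. This minimal plain-Python illustration covers only three of the eight methods (single, complete, and UPGMA) and is not the R workflow used in this study. The function names and example matrices are hypothetical. The cophenetic distance between two sites is taken as the height at which they are first merged into one cluster, and the coefficient is the Pearson correlation between those heights and the original distances.

```python
import math


def cluster_cophenetic(dist, method="average"):
    """Agglomerative clustering on a square distance matrix.

    Returns the cophenetic matrix (merge height for every pair of sites).
    method: "single", "complete", or "average" (UPGMA).
    """
    if method not in ("single", "complete", "average"):
        raise ValueError("unsupported method: " + method)
    n = len(dist)
    members = {i: [i] for i in range(n)}  # cluster id -> site indices
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    coph = [[0.0] * n for _ in range(n)]
    next_id = n
    while len(members) > 1:
        # Merge the closest pair of clusters.
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        for i in members[a]:
            for j in members[b]:
                coph[i][j] = coph[j][i] = height
        na, nb = len(members[a]), len(members[b])
        updated = {}
        for c in members:
            if c == a or c == b:
                continue
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            if method == "single":
                new = min(dac, dbc)
            elif method == "complete":
                new = max(dac, dbc)
            else:  # size-weighted mean of the two distances (UPGMA)
                new = (na * dac + nb * dbc) / (na + nb)
            updated[(c, next_id)] = new  # next_id is always the larger id
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(updated)
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    return coph


def cophenetic_correlation(dist, coph):
    """Pearson correlation between original and cophenetic distances."""
    n = len(dist)
    x = [dist[i][j] for i in range(n) for j in range(i + 1, n)]
    y = [coph[i][j] for i in range(n) for j in range(i + 1, n)]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical distances between three sites; compare methods by coefficient.
example = [[0.0, 1.0, 3.0], [1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
scores = {m: cophenetic_correlation(example, cluster_cophenetic(example, m))
          for m in ("single", "complete", "average")}
```

A coefficient closer to one indicates that the dendrogram distorts the original pairwise distances less, which is the criterion used to compare the methods.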
The indicator value method (IndVal) was applied to identify indicator species among the groups of sites using the “indicspecies” package in R (De Cáceres 2013). IndVal values range from zero (“not an indicator species”) to one (“maximum indicator ability”).
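As an illustration of how an indicator value is built up, the sketch below computes, for one species, the product of specificity (the share of its mean abundance found in a group) and fidelity (the fraction of that group's sites where it occurs). This commonly used product form is an assumption, since the exact variant is not stated here; some implementations report the square root of this product instead. The function name and example data are hypothetical, and the code is plain Python rather than the R package used in this study.

```python
def indval(abundance, groups):
    """Indicator value of one species for each site group.

    abundance -- abundance of the species at each site
    groups    -- group label of each site (same order as abundance)
    Returns {group: specificity * fidelity}, each value between 0 and 1.
    """
    labels = sorted(set(groups))
    mean_abundance = {}
    fidelity = {}
    for g in labels:
        values = [a for a, lab in zip(abundance, groups) if lab == g]
        mean_abundance[g] = sum(values) / len(values)
        # Fraction of sites in the group where the species is present.
        fidelity[g] = sum(1 for v in values if v > 0) / len(values)
    total = sum(mean_abundance.values())
    result = {}
    for g in labels:
        specificity = mean_abundance[g] / total if total > 0 else 0.0
        result[g] = specificity * fidelity[g]
    return result


# Hypothetical species found only, and at every site, in group "A".
perfect_indicator = indval([5, 3, 4, 0, 0, 0], ["A", "A", "A", "B", "B", "B"])
```

A species confined to one group and present at all of its sites scores one for that group, while a species absent from a group scores zero there.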