David Orme - Authorea

1. Environmental soundscapes are increasingly being used as descriptors of ecosystem health and vocal animal biodiversity. Soundscape data can quickly become very expensive and difficult to manage, so data compression or temporal down-sampling is sometimes employed to reduce data storage and transmission costs. These parameters vary widely between experiments, and the consequences of this variation remain mostly unknown.

2. We analyse field recordings from North-Eastern Borneo across a gradient of historical land-use. We quantify the impact of experimental parameters (mp3 compression, recording length and temporal subsetting) on soundscape descriptors (Analytical Indices and an AudioSet Fingerprint derived from a convolutional neural network). Both descriptor types were tested for their robustness to parameter alteration and their usability in a landscape classification task.

3. We find that compression and frame size both drive considerable variation in calculated index values. However, the effects of this variation and of temporal subsetting on the performance of classification models are minor: performance is much more strongly determined by acoustic index choice, with the AudioSet Fingerprint offering substantial (12-16%) increases in classifier accuracy, precision and recall alike.

4. We advise using the AudioSet Fingerprint in soundscape analysis, as it shows superior and consistent performance even on small pools of data. If data storage is a bottleneck to a study, we recommend Variable Bit Rate encoded compression (quality=0, 23% file size) to reduce file size without affecting most Analytical Index values. The AudioSet Fingerprint can be confidently compressed further to a Constant Bit Rate encoding of 64 kb/s (8% file size) without any detectable effect. These recommendations balance the efficient use of restricted data storage against the comparability of results between different studies.
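
The two compression settings recommended above could be applied along the following lines. This is a minimal sketch, not the authors' pipeline: it assumes ffmpeg with the libmp3lame encoder is used, which the abstract does not state, and the names `mp3_command`, `estimated_size_gb` and the mode labels `vbr0` and `cbr64` are hypothetical. The size ratios are the 23% and 8% figures quoted in the abstract and will vary with recording content.

```python
from pathlib import Path

# File-size ratios relative to the raw recording, as quoted in the abstract.
SIZE_RATIO = {"vbr0": 0.23, "cbr64": 0.08}


def mp3_command(wav_path, out_dir, mode):
    """Build (but do not run) an ffmpeg argument list for one encoding mode.

    Assumption: ffmpeg with libmp3lame; the source does not name an encoder.
    """
    wav = Path(wav_path)
    out = Path(out_dir) / f"{wav.stem}_{mode}.mp3"
    if mode == "vbr0":
        rate_args = ["-q:a", "0"]    # Variable Bit Rate, quality 0
    elif mode == "cbr64":
        rate_args = ["-b:a", "64k"]  # Constant Bit Rate, 64 kb/s
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return ["ffmpeg", "-i", str(wav), "-codec:a", "libmp3lame", *rate_args, str(out)]


def estimated_size_gb(raw_gb, mode):
    """Rough storage estimate using the ratios reported in the abstract."""
    return raw_gb * SIZE_RATIO[mode]


if __name__ == "__main__":
    for mode in ("vbr0", "cbr64"):
        print(" ".join(mp3_command("rec.wav", "out", mode)))
        print(f"100 GB raw -> about {estimated_size_gb(100, mode):.0f} GB")
```

If ffmpeg is installed, each argument list can be passed to `subprocess.run(cmd, check=True)` to perform the encoding. Under these assumptions, `vbr0` would suit studies relying on Analytical Indices, while `cbr64` would be reserved for workflows that use only the AudioSet Fingerprint.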