In this case, another measure of diversity is better:
entropy.
To see why, it helps to look at the simplified case where data labels are discrete (as in classification tasks). The entropy of a discrete distribution is H = -Σ p_i log(p_i), and it is higher when the data is spread across many categories: for N equally likely categories, each p_i = 1/N and the entropy equals log(N), which is an increasing function of N.
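To make this concrete, here is a minimal sketch in Python (the `entropy` helper and the example distributions are illustrative, not taken from the original text). It shows that a uniform distribution over N categories reaches log(N), while a distribution concentrated on a single category scores much lower:

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution p (natural log)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # convention: 0 * log(0) = 0
    return -np.sum(p * np.log(p))

N = 4
uniform = np.full(N, 1.0 / N)                 # data spread evenly over N categories
peaked = np.array([0.97, 0.01, 0.01, 0.01])   # data concentrated in one category

print(entropy(uniform), np.log(N))  # both ~1.386: uniform entropy reaches log(N)
print(entropy(peaked))              # ~0.168: far less diverse
```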
This reasoning can be generalized to our setting, where data lives in a continuous and high-dimensional space, using
differential entropy.
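As an illustration of how differential entropy captures spread in a continuous space (this is not necessarily the estimator used in our setting), it has a closed form for a multivariate Gaussian: h = (1/2) log((2πe)^d det(Σ)). The sketch below, with hypothetical covariance matrices, shows that points spread more widely in every direction yield a higher differential entropy:

```python
import numpy as np

def gaussian_differential_entropy(cov):
    """Differential entropy of a d-dimensional Gaussian:
    h = 0.5 * log((2 * pi * e)^d * det(cov))."""
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)

d = 8
tight = 0.1 * np.eye(d)   # points clustered tightly together
spread = 2.0 * np.eye(d)  # points spread out along every dimension

print(gaussian_differential_entropy(tight))   # lower: less diverse
print(gaussian_differential_entropy(spread))  # higher: more diverse
```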