In this case, another measure of diversity is better suited: entropy.
To see why, it helps to look at the simplified case where data labels are discrete, as in classification. There, entropy is higher when the data is spread across many categories: for N equidistributed categories, the entropy equals log(N), which is an increasing function of N.
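Concretely, this is the standard Shannon entropy evaluated on the uniform distribution over N categories, where every probability is 1/N:

$$
H = -\sum_{i=1}^{N} p_i \log p_i = -\sum_{i=1}^{N} \frac{1}{N} \log \frac{1}{N} = \log N.
$$

Any non-uniform distribution over the same N categories has strictly lower entropy, so log(N) is the maximum, attained exactly when the data is spread as evenly as possible.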
This reasoning generalizes to our setting, where the data lives in a continuous, high-dimensional space, by using differential entropy.
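Differential entropy follows the same recipe, replacing the sum over categories with an integral over the data density f:

$$
h(f) = -\int f(x) \log f(x) \, dx.
$$

As in the discrete case, a density that is spread out over a larger region yields a higher value, which is what makes it usable as a diversity measure here (with the caveat that, unlike Shannon entropy, differential entropy can be negative).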