Dear Rafaelleite:

I haven't written a scientific paper on this method. However, it is based on the spectral clustering algorithm (for initialising the clusters) and on the nearest-neighbour method (for equalising them). For the mathematical description and a reference, I suggest looking up the original papers that describe these two methods.

Regarding the range of points in the clusters, I define it in terms of the equity_fraction as follows:

min_range = min(n_i)*equity_fraction

max_range = max(n_i)*(2-equity_fraction),

where n_i is the number of points per cluster, defined as N/k, with N the total number of points and k the number of clusters.

Thus, every cluster will have a number of points between min_range and max_range.
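
As a rough illustration of the two formulas above, here is a minimal Python sketch. It is not the code of the method itself: the function name cluster_size_range and its argument names are hypothetical, and it assumes n_i is the single value N/k, so that min(n_i) and max(n_i) coincide.

```python
def cluster_size_range(n_points, n_clusters, equity_fraction):
    """Return (min_range, max_range) for the number of points per cluster.

    Hypothetical helper: it assumes n_i = N / k is the same for every
    cluster, so min(n_i) and max(n_i) are both equal to N / k.
    """
    if n_clusters <= 0:
        raise ValueError("n_clusters must be positive")

    # n_i = N / k, the number of points per cluster
    points_per_cluster = n_points / n_clusters

    # min_range = min(n_i) * equity_fraction
    min_range = points_per_cluster * equity_fraction
    # max_range = max(n_i) * (2 - equity_fraction)
    max_range = points_per_cluster * (2 - equity_fraction)
    return min_range, max_range


# Example: N = 1000 points, k = 10 clusters, equity_fraction = 0.8
min_range, max_range = cluster_size_range(1000, 10, 0.8)
print(min_range, max_range)
```

With equity_fraction = 1 the two bounds collapse to N/k, so every cluster would need exactly the same number of points; smaller values widen the allowed range symmetrically around N/k.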

I hope this explanation helps!

Dr. Carmen Adriana Martínez Barbosa


Data Scientist | Sharing new algorithms useful to the DS community. LinkedIn: https://www.linkedin.com/in/camartinezbarbosa/
