xxxxxxxxxxDemo for the PlutoCon 201
Author: Daniel Molina Cabrera
Talk: Teaching Computer Sciences and Metaheuristics Algorithms using Pluto
xxxxxxxxxxGrouping with constraints Problem
In this problem there is a dataset with N instances that represent numerical vectos, and it is wanted to group them in k clusters.
The target is select the clustering considering:
Fulfil as much as possible a number of constraints (several pairs of instances must be in the same clusters, they are listed in list ML, and another pairs of instances that have to be in different clusters, in list CL).
Reduce the intra-cluster distance:
For each cluster i it is calculated its centroid (median of the instances member of that cluster):
The instra-cluster distance is calculate with
.
xxxxxxxxxxInitial problem:
Giving several points like following:
xxxxxxxxxxscene (generic function with 1 method)xxxxxxxxxxn:
xxxxxxxxxxxxxxxxxxxxProblem: number k and constraints
xxxxxxxxxxk:
xxxxxxxxxx|ML|:
xxxxxxxxxxThere are several constraints between groups
xxxxxxxxxxxxxxxxxxxxML: {(4, 2), (9, 4), (8, 3)}
CL: {(6, 2)}
xxxxxxxxxxplot_pac (generic function with 2 methods)xxxxxxxxxxWe represent visually the constraints that have to be in the same cluster with a continuous line, and with a dashed line which have to be in different clusters.
xxxxxxxxxxxxxxxxxxxxInitial solution:
A initial solution is a partition, represented by a vector of length N, in which each position i has a value between 1 and k, the value represent the cluster assigned to the instance i.
xxxxxxxxxx:green
:orange
:black
xxxxxxxxxxxxxxxxxxxx1
2
2
1
3
1
2
1
2
1
xxxxxxxxxxWe remark with different colors each clustering, indicating in the title how many constraints are violated, and the average inter-clustering distance.
xxxxxxxxxxplot_pac(grouping)incumple (generic function with 1 method)xxxxxxxxxxdist_intra (generic function with 1 method)xxxxxxxxxxdistancia_intracluster (generic function with 1 method)xxxxxxxxxxplot_clusters (generic function with 1 method)xxxxxxxxxxGreedy Algorithm
The Greedy algorithm build a solution step by step. The process is as follow:
Randomly generate K clusters.
xxxxxxxxxxSeed:
xxxxxxxxxxxxxxxxxxxx3×2 Matrix{Float64}:
0.0123788 0.971216
0.455168 0.844532
0.90785 0.368487xxxxxxxxxxxxxxxxxxxxFor each element it is selected the cluster closer than violate less constraints.
xxxxxxxxxxxxxxxxxxxxclusters: {0.012378843419466268, 0.45516843007784, 0.9078498956593186, 0.971215649338451, 0.8445317509734542, 0.36848703579346}
xxxxxxxxxxxxxxxxxxxxIf the centroids should be updated, recalculate it and go to step 2.
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxThe final solution obtained is:
xxxxxxxxxxxxxxxxxxxxcreaV (generic function with 1 method)xxxxxxxxxxcount_violations (generic function with 1 method)xxxxxxxxxxplot_greedy_options (generic function with 1 method)xxxxxxxxxxstep_greedy (generic function with 1 method)xxxxxxxxxxMeta-heuristic
To be able to apply the simple meta-heuristic we need a only objective function, thus we are going to join both criteria in a only fitness function.
xxxxxxxxxxThe measure is defined as
xxxxxxxxxxThe first action is to calculate
xxxxxxxxxxxxxxxxxxxxfitness (generic function with 1 method)xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxThe local search procedure is to randomly choose one position, and change its value (maintaning that for each cluster has at least one member).
xxxxxxxxxxxxxxxxxxxxBL (generic function with 1 method)xxxxxxxxxxxxxxxxxxxxIn the following there is the convergence graphic, that show the improvement of fitness through the run of the algorithm.
xxxxxxxxxxxxxxxxxxxx