Edit or run this notebook
5.8 s

Demo for the PlutoCon 201

Author: Daniel Molina Cabrera

Talk: Teaching Computer Sciences and Metaheuristics Algorithms using Pluto

9.3 Î¼s

Grouping with constraints Problem

In this problem there is a dataset with N instances that represent numerical vectos, and it is wanted to group them in k clusters.

The target is select the clustering considering:

  1. Fulfil as much as possible a number of constraints (several pairs of instances must be in the same clusters, they are listed in list ML, and another pairs of instances that have to be in different clusters, in list CL).

  • infeasability=∑i=0|ML|bool2int(hc(ML[i,1]→)≠hc(ML[i,2]→))+bool2int(hc(CL[i,1]→)=hc(CL[i,2]→))

  1. Reduce the intra-cluster distance:

  • For each cluster i it is calculated its centroid (median of the instances member of that cluster): μi→=1c∑xj→∈cixi→

  • The instra-cluster distance is calculate with ci¯=1|ci|∑xj→∈ci||xj→−μi→||2.

25.8 Î¼s

Initial problem:

Giving several points like following:

6.4 Î¼s
scene (generic function with 1 method)
70.3 Î¼s

n:

47.0 ms
5.0 s

Problem: number k and constraints

3.1 Î¼s

k:

66.8 Î¼s

|ML|: |CL|:

92.9 Î¼s

There are several constraints between groups

3.0 Î¼s
29.7 Î¼s

ML: {(4, 2), (9, 4), (8, 3)}

CL: {(6, 2)}

323 ms
plot_pac (generic function with 2 methods)
139 Î¼s

We represent visually the constraints that have to be in the same cluster with a continuous line, and with a dashed line which have to be in different clusters.

2.8 Î¼s
561 ms

Initial solution:

A initial solution is a partition, represented by a vector of length N, in which each position i has a value between 1 and k, the value represent the cluster assigned to the instance i.

7.0 Î¼s
colores
1.1 Î¼s
42.6 Î¼s
8.9 Î¼s

We remark with different colors each clustering, indicating in the title how many constraints are violated, and the average inter-clustering distance.

3.0 Î¼s
774 ms
incumple (generic function with 1 method)
39.5 Î¼s
dist_intra (generic function with 1 method)
48.4 Î¼s
distancia_intracluster (generic function with 1 method)
41.2 Î¼s
plot_clusters (generic function with 1 method)
72.3 Î¼s

Greedy Algorithm

The Greedy algorithm build a solution step by step. The process is as follow:

  1. Randomly generate K clusters.

7.1 Î¼s

Seed:

58.5 Î¼s
26.3 Î¼s
3×2 Matrix{Float64}:
 0.0123788  0.971216
 0.455168   0.844532
 0.90785    0.368487
2.5 Î¼s
141 ms
  1. For each element it is selected the cluster closer than violate less constraints.

7.1 Î¼s
23.0 ms

clusters: {0.012378843419466268, 0.45516843007784, 0.9078498956593186, 0.971215649338451, 0.8445317509734542, 0.36848703579346}

290 ms
3.3 s
  1. If the centroids should be updated, recalculate it and go to step 2.

5.0 Î¼s
34.1 Î¼s
989 ms

The final solution obtained is:

3.2 Î¼s
19.9 ms
creaV (generic function with 1 method)
45.1 Î¼s
count_violations (generic function with 1 method)
39.2 Î¼s
plot_greedy_options (generic function with 1 method)
82.2 Î¼s
step_greedy (generic function with 1 method)
124 Î¼s

Meta-heuristic

To be able to apply the simple meta-heuristic we need a only objective function, thus we are going to join both criteria in a only fitness function.

4.6 Î¼s

The measure is defined as C→+(infeasability⋅λ), as a combination of the intra-cluster distance with a λ penalty for each constraints that is not fullfil.

3.6 Î¼s

The first action is to calculate λ, defined ass the maximum distance between two points and divided by the total of constraints: λ=Max(Distij)R,∀i,j∈{1,...,n}

3.5 Î¼s

λ value: 0.23 = 0.9205 / 4

12.4 ms
fitness (generic function with 1 method)
22.5 Î¼s
31.8 Î¼s
79.2 ms

The local search procedure is to randomly choose one position, and change its value (maintaning that for each cluster has at least one member).

3.1 Î¼s
38.5 Î¼s
BL (generic function with 1 method)
90.4 Î¼s
89.0 ms

In the following there is the convergence graphic, that show the improvement of fitness through the run of the algorithm.

3.1 Î¼s
300 ns
Loading...i