Agglomerative clustering

Full movie BioloMICS: Agglomerative clustering tree (see below).

1. Start an agglomerative clustering in one of two ways:

Under Analytics, in the Classification group, click Polyphasic clustering. Then, in the Clustering tab, add the records using the Add selected records button.

Or, in the main window of BioloMICS, in the BioSheet, select the record(s) to use for the clustering. Then right-click one of the selected records > Transfer record(s) to clustering.

2. Include the fields that contain data to be used (number 9 below).

3. Set parameters (number 11).

4. Click Save as to save the clustering scenario (optional) or Load to load a previously saved one.

5. Click Hierarchical clustering and select one of the nine available algorithms. For more info about each algorithm, see Clustering algorithms.

6. The results of the agglomerative clustering are displayed in two different windows: the Similarity matrix and the Tree view.

Note: large selections of records can lead to long computation times, especially on slow computers or on computers without enough memory.
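To give a feel for why large selections are slow, the following minimal sketch counts the record pairs in a symmetric similarity matrix. It assumes every pair of records is compared once. The function name is hypothetical and is not part of BioloMICS.

```python
def pairwise_comparisons(n_records: int) -> int:
    """Number of distinct record pairs in a symmetric similarity matrix.

    Assumption: each pair of records is compared exactly once.
    """
    if n_records < 0:
        raise ValueError("n_records must be non-negative")
    return n_records * (n_records - 1) // 2


# The number of pairs grows roughly with the square of the selection size.
for n in (100, 1_000, 10_000):
    print(f"{n:>6} records -> {pairwise_comparisons(n):>12,} pairs")
```

Ten times more records means roughly a hundred times more comparisons, and each comparison may itself cover several fields.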

https://www.youtube.com/watch?v=G38_WGUtcx4

The movie can also be found on YouTube under the title BioloMICS: Agglomerative clustering tree.

This movie shows how to make an agglomerative clustering tree in BioloMICS.

1. Select records - transfer to clustering (0:09)

2. Include/exclude fields (0:40)

3. Choose algorithm (0:53)

4. Set weight (1:03)

5. Set tolerance (1:09)

6. Choose options target records (1:20)

7. Merge subfields (1:36)

8. Save/load scenario (1:56)

9. Choose clustering algorithm (2:05)

10. Display options (2:16)

11. Show extra fields (2:25)

12. Colorize data in tree (2:44)

13. Save tree (2:59)

14. View records in grid (3:05)

Add records

Add selected records to the Source records section.

Remove records

Remove selected records from the Source records section.

Load scenario

Load a previously saved clustering scenario.

Save as ...

Save current clustering scenario and enter a name.

Save

Save current clustering scenario.

Delete

Delete current clustering scenario.

Hierarchical clustering

To produce a tree, select one of the nine available algorithms. For more info about each algorithm, see Clustering algorithms.
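As an illustration of what an agglomerative algorithm does with a similarity matrix, here is a minimal sketch of average linkage (UPGMA). This page does not list the nine algorithms, so the choice of average linkage is an assumption. The function, the record names and the similarity values are hypothetical, and this is not the BioloMICS implementation.

```python
from itertools import combinations


def average_linkage(labels, sim):
    """Agglomerative clustering with average linkage on a similarity matrix.

    labels: list of record names.
    sim: dict mapping frozenset({a, b}) to a similarity in [0, 1].
    Returns (tree, merges): the tree as nested tuples, and the list of
    merges as (left subtree, right subtree, average similarity).
    """
    # Each cluster is stored as (subtree, list of member records).
    clusters = [(name, [name]) for name in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            members_i, members_j = clusters[i][1], clusters[j][1]
            # Average similarity over all cross-cluster record pairs.
            avg = sum(
                sim[frozenset((a, b))] for a in members_i for b in members_j
            ) / (len(members_i) * len(members_j))
            if best is None or avg > best[0]:
                best = (avg, i, j)
        avg, i, j = best
        (tree_i, members_i), (tree_j, members_j) = clusters[i], clusters[j]
        merges.append((tree_i, tree_j, avg))
        merged = ((tree_i, tree_j), members_i + members_j)
        # Replace the two merged clusters by the new one.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0], merges


# Hypothetical similarity values for four records.
similarities = {
    frozenset(("A", "B")): 0.9,
    frozenset(("C", "D")): 0.8,
    frozenset(("A", "C")): 0.3,
    frozenset(("A", "D")): 0.2,
    frozenset(("B", "C")): 0.4,
    frozenset(("B", "D")): 0.1,
}
tree, merges = average_linkage(["A", "B", "C", "D"], similarities)
print(tree)
```

The most similar pair is merged first, then the next, until a single tree remains. Other linkage rules differ only in how the similarity between two clusters is computed.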

MDS

Start a Multi-Dimensional Scaling based on the given scenario.

Computed

Check to include the field in the computation.

To include or exclude multiple fields at the same time, select them all and set the value in this column to either include or exclude using right-click.

Algorithm

Depending on the type of field, different algorithms are available. Please check the algorithms sections of the character/field descriptions (Field types) to understand the way the software works. A different algorithm could strongly affect the final result.

Settings

In the Settings window, you can set the values for the selected field(s).

When multiple fields are selected, only the values they have in common are shown; otherwise a - is displayed.

Title: This text is shown at the top of the clustering tree (tab name) and at the bottom of the matrix grid (sheet name).
Field name: Name of the selected field as displayed to the end user.
Field name in database: Name of the selected field as stored in the database.
Field type: The type of the field, including its description.
Weight: A real number that multiplies the similarity of the given field. A value of 2 doubles the importance of the given field in the final identification. A value of 0.25 divides it by 4.
Algorithm: Depending on the type of field, different algorithms are available. Please check the algorithms sections of the character/field descriptions (Field types) to understand the way the software works. A different algorithm could strongly affect the final result.
Computed: Check to include the field in the computation.

To include or exclude multiple fields at the same time, select them all and set the value to either include or exclude using right-click.
Displayed: Check to see the value of this field in the results of the identification analysis.
Merge subfields: Merge subfields (only for A, C and M fields).
Target table: For link fields only. The name of the table that the field points to.
Target records processing: For link fields only. Choose the way the target records are processed (undefined, best, worst, average).

When parent records are compared on the basis of attached or linked records, such as DNA sequences, and more than one record is attached, three options are available:

1. Best (default): the similarity of the most similar pair of target records (of DNA sequences, for example) is kept.

2. Average: the average similarity of all pairs of target records (of DNA sequences, for example) is used.

3. Worst: the similarity of the least similar pair of target records (of DNA sequences, for example) is kept.
Max number of target records: For link fields only. Limits the number of target records that are used. In some cases, it is useful to use only the first target records, in the order in which they appear in the main BioloMICS window.
Target field: For specific link fields only. The field in the target table that the link points to.
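
The Weight, Target records processing and Max number of target records settings above can be illustrated with a minimal sketch. All function names, field names and sequences are hypothetical. Combining the weighted field similarities as a weighted mean is an assumption, since this page only states that the weight multiplies the field similarity.

```python
from itertools import product
from statistics import mean


def link_field_similarity(targets_a, targets_b, pair_similarity,
                          mode="best", max_targets=None):
    """Reduce all pairwise target-record similarities to a single value."""
    if max_targets is not None:
        # Keep only the first target records, in their given order.
        targets_a = targets_a[:max_targets]
        targets_b = targets_b[:max_targets]
    scores = [pair_similarity(a, b) for a, b in product(targets_a, targets_b)]
    if not scores:
        return None  # nothing to compare
    if mode == "best":
        return max(scores)
    if mode == "worst":
        return min(scores)
    if mode == "average":
        return mean(scores)
    raise ValueError(f"unknown mode: {mode}")


def weighted_similarity(field_similarities, weights):
    """Weighted mean of per-field similarities (assumed combination rule)."""
    total_weight = sum(weights[name] for name in field_similarities)
    if total_weight == 0:
        raise ValueError("at least one field needs a non-zero weight")
    weighted_sum = sum(weights[name] * s for name, s in field_similarities.items())
    return weighted_sum / total_weight


def identity(a, b):
    """Toy similarity: fraction of matching positions in equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Two parent records, each with two attached (toy) sequences.
seqs_1 = ["ACGT", "ACGA"]
seqs_2 = ["ACGT", "TTTT"]

best = link_field_similarity(seqs_1, seqs_2, identity, mode="best")
average = link_field_similarity(seqs_1, seqs_2, identity, mode="average")
worst = link_field_similarity(seqs_1, seqs_2, identity, mode="worst")
print(best, average, worst)

# A weight of 2 makes the sequence field count twice as much as the other field.
overall = weighted_similarity({"Sequence": best, "Growth": 0.5},
                              {"Sequence": 2, "Growth": 1})
print(round(overall, 3))
```

With these toy values, Best keeps the identical pair, Worst keeps the pair with no matching positions, and Average lies in between, which shows how strongly this setting can change the resulting tree.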

The contents of Agglomerative clustering

Clustering algorithms

Similarity matrix window

Tree view window