SciPy - fclusterdata() Method



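The SciPy fclusterdata() method clusters raw observation data in a single call: it computes the pairwise distances between observations, performs hierarchical clustering on those distances, and then cuts the resulting hierarchy to form flat clusters.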

Flat clustering is a form of unsupervised machine learning: observations are grouped by the similarity of their features, without any predefined labels. For example, in a biology laboratory, different types of cells can be grouped according to their measured features.
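The three stages described above can be sketched by hand with pdist(), linkage() and fcluster(). This is a minimal illustration, assuming Euclidean distances, single linkage and an arbitrarily chosen distance threshold of 25; the variable names are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, fclusterdata, linkage
from scipy.spatial.distance import pdist

X = np.array([[10, 20], [30, 40], [50, 60], [70, 80], [10, 0]])

# Step 1: condensed matrix of pairwise Euclidean distances.
distances = pdist(X, metric='euclidean')
# Step 2: hierarchical clustering (single linkage) on those distances.
Z = linkage(distances, method='single')
# Step 3: cut the hierarchy into flat clusters at the chosen threshold.
manual_labels = fcluster(Z, t=25, criterion='distance')

# The same result in one call.
direct_labels = fclusterdata(X, t=25, criterion='distance',
                             metric='euclidean', method='single')

print(manual_labels)
print(direct_labels)
```

Both calls should assign identical labels. The step-by-step form is useful when the linkage matrix Z is needed for something else, such as plotting a dendrogram.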

Syntax

Following is the syntax of the SciPy fclusterdata() method −

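fclusterdata(X, t, criterion = 'inconsistent', metric = 'euclidean', depth = 2, method = 'single', R = None)

Parameters

This method accepts the following parameters −

  • X: The input data, an n-by-m array holding n observations with m features each.
  • t: A scalar threshold that specifies where the hierarchical tree is cut to create the flat clusters. Its meaning depends on criterion: a distance for 'distance', a maximum number of clusters for 'maxclust', and an inconsistency value for 'inconsistent'.
  • criterion: A string naming the rule used to form the flat clusters, such as 'inconsistent' (the default), 'distance', or 'maxclust'.
  • metric: A string naming the distance metric used to compare observations, such as 'euclidean' (the default) or 'cosine'.
  • depth: The maximum depth used when computing inconsistency values. It is only relevant for the 'inconsistent' criterion.
  • method: A string naming the linkage method used to build the hierarchy, such as 'single' (the default) or 'ward'.
  • R: An optional precomputed inconsistency matrix. If it is not passed, it is computed when needed.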

Return value

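This method returns a one-dimensional array of length n, where n is the number of observations. The i-th entry is the number of the flat cluster to which the i-th observation belongs. The criterion determines how these flat clusters are formed.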
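As a small sketch of how the returned labels can be used, the snippet below groups row indices by cluster number. The data (two well-separated groups of three points) and the threshold of 2 are made up for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fclusterdata

# Hypothetical data: two tight groups that lie far apart.
X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10]])

labels = fclusterdata(X, t=2, criterion='distance')
print(labels.shape)  # one label per row of X

# Map each cluster number to the row indices it contains.
groups = {int(k): np.flatnonzero(labels == k).tolist()
          for k in np.unique(labels)}
print(groups)
```

Cluster numbers start at 1. Which group receives which number is an implementation detail, so code should rely on which observations share a label rather than on specific label values.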

Example 1

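Following is a basic example that shows the usage of the SciPy fclusterdata() method. With criterion set to distance, observations are merged only if they are linked at a distance of at most t. Every pair of observations here is more than 1.5 apart, so each observation ends up in its own cluster.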

from scipy.cluster.hierarchy import fclusterdata
import numpy as np

# Five observations, each with two features
inp_data = np.array([[10, 20], [30, 40], [50, 60], [70, 80], [10, 0]])
# Cut the hierarchy at a distance threshold of 1.5
res_clusters = fclusterdata(inp_data, t=1.5, criterion='distance')
print(res_clusters)

Output

The above code produces the following result −

[1 3 4 5 2]
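To see why every observation gets its own label, it can help to compare t with the actual pairwise distances. The sketch below reuses the same data and tries two extra thresholds (25 and 30) that are chosen here purely for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fclusterdata
from scipy.spatial.distance import pdist

inp_data = np.array([[10, 20], [30, 40], [50, 60], [70, 80], [10, 0]])

# Smallest pairwise Euclidean distance in the data.
min_dist = pdist(inp_data).min()
print(min_dist)

# Count the flat clusters produced by a few thresholds.
cluster_counts = {}
for t in (1.5, 25, 30):
    labels = fclusterdata(inp_data, t=t, criterion='distance')
    cluster_counts[t] = len(np.unique(labels))
print(cluster_counts)
```

The closest pair, [10, 20] and [10, 0], is 20 apart, so t = 1.5 merges nothing and yields 5 clusters. A threshold of 25 merges only that pair (4 clusters), and 30 exceeds every single-linkage merge distance, leaving 1 cluster.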

Example 2

This program uses the ward linkage method, which minimizes the variance within each cluster, and sets criterion to maxclust, so that t = 3 requests at most 3 flat clusters for the given input.

from scipy.cluster.hierarchy import fclusterdata
import numpy as np

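# Five observations, each with two features
inp_data = np.array([[10, 20], [30, 40], [50, 60], [70, 80], [10, 0]])
# Ward linkage, cut so that at most 3 flat clusters remain
res_clusters = fclusterdata(inp_data, t=3, criterion='maxclust', method='ward')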
print(res_clusters)

Output

The above code produces the following result −

[1 2 2 3 1]
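With maxclust, t is an upper bound on the number of clusters rather than a distance. A quick way to check this, assuming the same data and ward linkage, is to vary t and count the distinct labels; the range of values tried is arbitrary.

```python
import numpy as np
from scipy.cluster.hierarchy import fclusterdata

inp_data = np.array([[10, 20], [30, 40], [50, 60], [70, 80], [10, 0]])

# Requested maximum -> number of clusters actually produced.
found = {}
for max_clusters in (1, 2, 3, 4, 5):
    labels = fclusterdata(inp_data, t=max_clusters,
                          criterion='maxclust', method='ward')
    found[max_clusters] = len(np.unique(labels))
    print(max_clusters, found[max_clusters], labels)
```

The number of clusters found never exceeds the requested maximum. It can be smaller when several merges happen at the same height, because the tree cannot be cut between them.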

Example 3

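This example sets criterion to inconsistent and metric to cosine. The cosine metric compares the direction of the observations rather than their magnitude, which is often useful for high-dimensional data. With the inconsistent criterion, a node and its descendants stay in one flat cluster when their inconsistency values are at most t. Here t = 1.15 is large enough that all observations fall into a single cluster.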

from scipy.cluster.hierarchy import fclusterdata
import numpy as np

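# Five observations, each with two features
inp_data = np.array([[10, 20], [30, 40], [50, 60], [70, 80], [10, 0]])
# Cosine distances, cut at an inconsistency threshold of 1.15
res_clusters = fclusterdata(inp_data, t=1.15, criterion='inconsistent', metric='cosine')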
print(res_clusters)

Output

The above code produces the following result −

[1 1 1 1 1]
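The cosine metric depends only on the direction of each observation, not on its length. The following sketch illustrates this on the same data by rescaling every row with a different positive factor; the factors are hypothetical values picked for the demonstration.

```python
import numpy as np
from scipy.spatial.distance import pdist

inp_data = np.array([[10, 20], [30, 40], [50, 60], [70, 80], [10, 0]],
                    dtype=float)

# Stretch each observation by its own positive factor (made-up values).
scales = np.array([1.0, 0.5, 3.0, 10.0, 2.0]).reshape(-1, 1)
scaled_data = inp_data * scales

cos_original = pdist(inp_data, metric='cosine')
cos_scaled = pdist(scaled_data, metric='cosine')
euc_original = pdist(inp_data, metric='euclidean')
euc_scaled = pdist(scaled_data, metric='euclidean')

# Cosine distances are unchanged by the rescaling; Euclidean ones are not.
print(np.allclose(cos_original, cos_scaled))
print(np.allclose(euc_original, euc_scaled))
```

Because the cosine distances are identical up to floating-point rounding, clustering with metric set to cosine groups observations by orientation, whereas the default euclidean metric is sensitive to scale.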