Clustering Models and Evaluation

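Problems:

  • Identify groups of customers with the same purchasing patterns
  • Identify products that are popular in the same region or among the same customer group
  • Identify news items that discuss similar events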

Methods:

  • k-means (kmeans in R)
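To make the idea behind k-means concrete, here is a minimal sketch of a Lloyd-style iteration: assign each point to its nearest center, then move each center to the mean of its points. The function name simple_kmeans and the use of the built-in iris data are assumptions for illustration only. R's own kmeans uses the Hartigan-Wong algorithm by default, so this is not what kmeans runs internally.

```r
# Illustrative Lloyd-style k-means (hypothetical helper, not stats::kmeans)
simple_kmeans <- function(x, k, max_iter = 100) {
  x <- as.matrix(x)
  # Start from k randomly chosen data points as initial centers
  centers <- x[sample(nrow(x), k), , drop = FALSE]
  cluster <- rep(0L, nrow(x))
  for (i in seq_len(max_iter)) {
    # n x k matrix of squared distances from every point to every center
    d <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, centers[j, ])^2))
    # Assignment step: index of the nearest center for each point
    new_cluster <- max.col(-d, ties.method = "first")
    if (all(new_cluster == cluster)) break  # assignments stable: stop
    cluster <- new_cluster
    # Update step: move each non-empty cluster's center to the mean of its points
    for (j in seq_len(k)) {
      if (any(cluster == j)) {
        centers[j, ] <- colMeans(x[cluster == j, , drop = FALSE])
      }
    }
  }
  list(cluster = cluster, centers = centers)
}

# Usage on stand-in data (assumption: numeric columns of the built-in iris dataset)
set.seed(1)
res <- simple_kmeans(iris[, 1:4], 3)
table(res$cluster)
```

Because the result depends on the random starting centers, different seeds can give different clusterings.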

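Evaluation:

  • WSS: within-cluster sum of squares (lower means more compact clusters)
  • BSS: between-cluster sum of squares (higher means better-separated clusters)
  • WSS/BSS: ratio of the two (lower is better)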
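The measures listed above can be computed by hand, which makes their definitions explicit. The sketch below uses the numeric columns of the built-in iris dataset as stand-in data (an assumption, not the data used in these notes) and compares the hand-computed values with the tot.withinss and betweenss fields returned by kmeans.

```r
# Stand-in data (assumption): numeric columns of the built-in iris dataset
x <- as.matrix(iris[, 1:4])

set.seed(1)
km <- kmeans(x, centers = 3, nstart = 20)

# WSS: squared distance of each point to its own cluster centroid, summed
wss <- sum((x - km$centers[km$cluster, ])^2)

# BSS: squared distance of each centroid to the overall mean,
# weighted by the number of points in the cluster
overall_mean <- colMeans(x)
bss <- sum(km$size * rowSums(sweep(km$centers, 2, overall_mean)^2))

# TSS: squared distance of each point to the overall mean (TSS = WSS + BSS)
tss <- sum(sweep(x, 2, overall_mean)^2)

c(wss = wss, bss = bss, ratio = wss / bss)
```

Since WSS and BSS add up to the fixed total sum of squares, a lower WSS always goes with a higher BSS for the same data, and the ratio summarizes both at once.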

Example:

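# Download the seeds dataset from the UCI repository
download.file("https://archive.ics.uci.edu/ml/machine-learning-databases/00236/seeds_dataset.txt",
              destfile = "seeds_dataset.txt")

# The file has no header row; the column names below are an assumption
# based on the UCI description of the dataset (the last column is the known variety)
seeds <- read.table("seeds_dataset.txt",
                    col.names = c("area", "perimeter", "compactness", "length",
                                  "width", "asymmetry", "groove_length", "class"))

# Drop the class label so that it does not influence the clustering
seeds <- seeds[, names(seeds) != "class"]
str(seeds)

# Group the seeds in three clusters (set the seed so the random start is reproducible)
set.seed(1)
km_seeds <- kmeans(seeds, 3)

# Color the points in the plot based on the clusters
plot(length ~ compactness, data = seeds, col = km_seeds$cluster)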

# Print out the ratio of the WSS to the BSS
km_seeds$tot.withinss/km_seeds$betweenss
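
The same ratio can be compared across several values of k. The sketch below again uses the numeric columns of the built-in iris dataset as stand-in data (an assumption), and the range 2:6 and nstart = 20 are arbitrary illustrative choices.

```r
# Stand-in data (assumption): numeric columns of the built-in iris dataset
x <- iris[, 1:4]

set.seed(1)
ks <- 2:6

# WSS/BSS ratio for each candidate number of clusters
ratios <- sapply(ks, function(k) {
  km <- kmeans(x, centers = k, nstart = 20)
  km$tot.withinss / km$betweenss
})
names(ratios) <- ks
ratios

# Plot the ratio against k to look for an "elbow"
plot(ks, ratios, type = "b",
     xlab = "Number of clusters k", ylab = "WSS / BSS")
```

The ratio tends to keep falling as k grows, so it is usually read by looking for the point where the improvement levels off rather than by picking the smallest value.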
