open3d k-means 聚类

news/发布时间2024/5/18 15:48:36

k-means 聚类

- - 一、算法原理
  - - - 1、介绍
      - 2、算法步骤
  - 二、代码
  - - - 1、机器学习生成`kmeans`聚类
      - 2、点云学习生成聚类
  - 三、结果
  - - - 1、原点云
      - 2、机器学习生成`kmeans`聚类
      - 3、点云学习生成聚类
  - 四、相关链接

一、算法原理

1、介绍

K-means聚类算法是一种无监督学习算法，主要用于数据聚类。该算法的主要目标是找到一个数据点的划分，使得每个数据点与其所在簇的质心（即该簇所有数据点的均值）之间的平方距离之和最小。

在K-means聚类算法中，首先需要预定义簇的数量K，然后随机选择K个对象作为初始的聚类中心。接着，算法会遍历数据集中的每个对象，根据对象与各个聚类中心的距离，将每个对象分配给距离它最近的聚类中心。完成一轮分配后，算法会重新计算每个簇的聚类中心，新的聚类中心是该簇所有对象的均值。这个过程会不断重复，直到满足某个终止条件，如没有（或最小数目）对象被重新分配给不同的簇，没有（或最小数目）簇的中心再发生变化，或者误差平方和局部最小。

2、算法步骤

在这里插入图片描述

二、代码

1、机器学习生成`kmeans`聚类

# -*- coding: utf-8 -*-import open3d as o3d
import numpy as np
from copy import deepcopy
from sklearn import clusterdef KMeans():# KMeans聚类，非监督pcd_path = r"res/bunny.pcd"pcd = o3d.io.read_point_cloud(pcd_path)pcd = o3d.geometry.PointCloud(pcd)o3d.visualization.draw_geometries([pcd])print(pcd)# 对点云数据进行着色操作，使其所有点的颜色相同，颜色为 [0, 0, 0]pcd.paint_uniform_color(color=[0, 0, 0]) n_clusters = 3  # 聚类簇数# 将点云数据转换为 numpy 数组，并使用 sklearn 的 KMeans 进行聚类points = np.asarray(pcd.points)print(points)kmeans = cluster.KMeans(n_clusters=n_clusters, random_state=42, n_init=10, init="k-means++")kmeans.fit(points)  # 获取聚类结果，这里主要是每个点的类别标签labels = kmeans.labels_# 随机生成一些颜色，然后根据类别标签将这些颜色分配给对应的点colors = np.random.randint(0, 255, size=(n_clusters, 3)) / 255colors = colors[labels]# 对原始的点云数据做一个深度拷贝，并将这个拷贝的每个点的位置向下移动50个单位。这是为了在可视化时更清楚地看到聚类效果pcd_cluster = deepcopy(pcd)pcd_cluster.translate([50, 0, 0])# 将新生成的颜色赋值给拷贝的点云数据pcd_cluster.colors = o3d.utility.Vector3dVector(colors)o3d.visualization.draw_geometries([pcd_cluster])if __name__ == "__main__":KMeans()

2、点云学习生成聚类

import numpy as np
import open3d as o3d
import copy
from matplotlib import pyplot as plt# 在点云上添加分类标签
def draw_labels_on_model(pcl, labels):cmap = plt.get_cmap("tab20")pcl_temp = copy.deepcopy(pcl)max_label = labels.max()colors = cmap(labels / (max_label if max_label > 0 else 1))pcl_temp.colors = o3d.utility.Vector3dVector(colors[:, :3])o3d.visualization.draw_geometries([pcl_temp])# 计算欧氏距离
def euclidean_distance(one_sample, X):# 将one_sample转换为一纬向量one_sample = one_sample.reshape(1, -1)# 把X转换成一维向量X = X.reshape(X.shape[0], -1)# 这是用来确保one_sample的尺寸与X相同distances = np.power(np.tile(one_sample, (X.shape[0], 1)) - X, 2).sum(axis=1)return distancesclass Kmeans(object):# 构造函数def __init__(self, k=2, max_iterations=1500, tolerance=0.00001):self.k = kself.max_iterations = max_iterationsself.tolerance = tolerance# 随机选取k个聚类中心点def init_random_centroids(self, X):# save the shape of Xn_samples, n_features = np.shape(X)# make a zero matrix to store valuescentroids = np.zeros((self.k, n_features))# 因为有k个中心点，所以执行k次循环for i in range(self.k):# 随机选取范围内的值centroid = X[np.random.choice(range(n_samples))]centroids[i] = centroidreturn centroids# 查找距离样本点最近的中心def closest_centroid(self, sample, centroids):distances = euclidean_distance(sample, centroids)# np.argmin 返回距离最小值的下标closest_i = np.argmin(distances)return closest_i# 确定聚类def create_clusters(self, centroids, X):# 这是为了构造用于存储集群的嵌套列表clusters = [[] for _ in range(self.k)]for sample_i, sample in enumerate(X):centroid_i = self.closest_centroid(sample, centroids)clusters[centroid_i].append(sample_i)return clusters# 基于均值算法更新质心def update_centroids(self, clusters, X):n_features = np.shape(X)[1]centroids = np.zeros((self.k, n_features))for i, cluster in enumerate(clusters):centroid = np.mean(X[cluster], axis=0)centroids[i] = centroidreturn centroids# 获取标签def get_cluster_labels(self, clusters, X):y_pred = np.zeros(np.shape(X)[0])for cluster_i, cluster in enumerate(clusters):for sample_i in cluster:y_pred[sample_i] = cluster_ireturn y_pred# 预测标签def predict(self, X):# 随机选取中心点centroids = self.init_random_centroids(X)for _ in range(self.max_iterations):# 对所有点进行聚类clusters = self.create_clusters(centroids, X)former_centroids = centroids# 计算新的聚类中心centroids = self.update_centroids(clusters, X)# 判断是否满足收敛diff = centroids - former_centroidsif diff.any() < self.tolerance:breakreturn self.get_cluster_labels(clusters, X)if __name__ == "__main__":#  加载点云pcd = o3d.io.read_point_cloud('res/bunny.pcd')points = np.asarray(pcd.points)o3d.visualization.draw_geometries([pcd])# 执行K-means聚类clf = Kmeans(k=3)labels = clf.predict(points)# 可视化聚类结果draw_labels_on_model(pcd, labels)