Part Segmentation of 3D Point Clouds with OpenVINO™

This tutorial is also available as a Jupyter notebook that can be cloned directly from GitHub. See the installation guide for instructions to run this tutorial locally on Windows, Linux or macOS.


This notebook demonstrates how to process point cloud data and run 3D part segmentation with OpenVINO. We use a pre-trained PointNet model to detect each part of a chair and return its category.

PointNet

PointNet was proposed in 2016 by Charles Ruizhongtai Qi, a researcher at Stanford University: arXiv:1612.00593 <PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation>. The research aims to classify and segment 3D data. It works on a data structure called a point cloud, which is a set of points that represents a 3D shape or object. PointNet provides a unified architecture for applications ranging from object classification and part segmentation to scene semantic parsing. According to the paper, it is efficient and effective, with performance on par with or better than the state of the art at the time.
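
Because a point cloud is a set, the order in which its points are stored should not affect the result. The sketch below is a simplified illustration of that idea, not the PointNet network itself: it applies one shared layer with random, made-up weights to every point and then max-pools over the points. The names `W`, `b` and `global_feature` are hypothetical and exist only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy weights: one shared layer mapping (x, y, z) to 8 features.
W = rng.standard_normal((3, 8))
b = rng.standard_normal(8)


def global_feature(cloud):
    """Apply the same layer to every point, then max-pool over the points."""
    per_point = np.maximum(cloud @ W + b, 0.0)  # shared ReLU layer, shape (N, 8)
    return per_point.max(axis=0)                # order-independent pooling, shape (8,)


demo_cloud = rng.standard_normal((100, 3))
shuffled_cloud = demo_cloud[rng.permutation(len(demo_cloud))]

# Reordering the points leaves the pooled feature unchanged.
same = np.allclose(global_feature(demo_cloud), global_feature(shuffled_cloud))
print(same)
```

Max pooling is used here because it is a symmetric function: its result depends only on which values are present, not on their order.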


import os
import sys
from pathlib import Path
import numpy as np
import matplotlib.pyplot as plt
from openvino.runtime import Core

from notebook_utils import download_file

Prepare the Model

Download the pre-trained PointNet ONNX model

# Set the data and model directories, model source URL and model filename
MODEL_DIR = "model"
os.makedirs(MODEL_DIR, exist_ok=True)
download_file("", directory=Path(MODEL_DIR), show_progress=False)
onnx_model_path = Path(MODEL_DIR) / "chair_100.onnx"

Convert the ONNX model to OpenVINO IR. An OpenVINO IR (Intermediate Representation) model consists of an .xml file, containing information about network topology, and a .bin file, containing the weights and biases binary data.

ir_model_xml = onnx_model_path.with_suffix(".xml")
ir_model_bin = onnx_model_path.with_suffix(".bin")

if not ir_model_xml.exists():
    !mo --input_model $onnx_model_path --output_dir $MODEL_DIR --compress_to_fp16
[ INFO ] The model was converted to IR v11, the latest model format that corresponds to the source DL framework input/output format. While IR v11 is backwards compatible with OpenVINO Inference Engine API v1.0, please use API v2.0 (as of 2022.1) to take advantage of the latest improvements in IR v11.
Find more information about API v2.0 and IR v11 at
[ SUCCESS ] Generated IR version 11 model.
[ SUCCESS ] XML file: /opt/home/k8sworker/cibuilds/ov-notebook/OVNotebookOps-358/.workspace/scm/ov-notebook/notebooks/224-3D-segmentation-point-clouds/model/chair_100.xml
[ SUCCESS ] BIN file: /opt/home/k8sworker/cibuilds/ov-notebook/OVNotebookOps-358/.workspace/scm/ov-notebook/notebooks/224-3D-segmentation-point-clouds/model/chair_100.bin
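
Since an IR model is a pair of files, it can be useful to confirm that both are present before loading. The following is a minimal sketch that assumes the .bin file sits next to the .xml file with the same stem, as in the conversion log above. The helper name `ir_files_present` is hypothetical, and the demo uses empty placeholder files in a temporary directory rather than a real model.

```python
import tempfile
from pathlib import Path


def ir_files_present(xml_path):
    """Return True when both the .xml file and its sibling .bin file exist."""
    xml_path = Path(xml_path)
    return xml_path.is_file() and xml_path.with_suffix(".bin").is_file()


# Demo with empty placeholder files (not a real model).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_xml = Path(tmp_dir) / "demo_model.xml"
    demo_xml.touch()
    only_xml = ir_files_present(demo_xml)        # .bin is still missing
    demo_xml.with_suffix(".bin").touch()
    both_files = ir_files_present(demo_xml)      # now both files exist

print(only_xml, both_files)
```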

Data Processing Module

def load_data(point_file):
    """
    Load the point cloud data and convert it to ndarray

    Parameters:
        point_file: string, path of .pts data
    """
    point_set = np.loadtxt(point_file).astype(np.float32)

    # normalization
    point_set = point_set - np.expand_dims(np.mean(point_set, axis=0), 0)  # center
    dist = np.max(np.sqrt(np.sum(point_set ** 2, axis=1)), 0)
    point_set = point_set / dist  # scale

    return point_set

def visualize(point_set):
    """
    Create a 3D view for data visualization

    Parameters:
        point_set: ndarray, the coordinate data in X Y Z format
    """
    fig = plt.figure(dpi=192, figsize=(4, 4))
    ax = fig.add_subplot(111, projection='3d')
    X = point_set[:, 0]
    Y = point_set[:, 2]
    Z = point_set[:, 1]

    # Scale the view of each axis to adapt to the coordinate data distribution
    max_range = np.array([X.max() - X.min(), Y.max() - Y.min(), Z.max() - Z.min()]).max() * 0.5
    mid_x = (X.max() + X.min()) * 0.5
    mid_y = (Y.max() + Y.min()) * 0.5
    mid_z = (Z.max() + Z.min()) * 0.5
    ax.set_xlim(mid_x - max_range, mid_x + max_range)
    ax.set_ylim(mid_y - max_range, mid_y + max_range)
    ax.set_zlim(mid_z - max_range, mid_z + max_range)

    ax.set_xlabel('X', fontsize=10)
    ax.set_ylabel('Y', fontsize=10)
    ax.set_zlabel('Z', fontsize=10)

    return ax
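
The normalization step in load_data centers the cloud on its centroid and scales it so that the farthest point lies at distance 1 from the origin. The sketch below repeats that arithmetic on synthetic random points (not ShapeNet data) to show the two properties. The function name `normalize` and the variable names are assumptions made for this example.

```python
import numpy as np


def normalize(point_set):
    """Center on the centroid and scale so the farthest point has radius 1."""
    centered = point_set - point_set.mean(axis=0, keepdims=True)
    radius = np.sqrt((centered ** 2).sum(axis=1)).max()
    return centered / radius


rng = np.random.default_rng(0)
# Synthetic cloud, deliberately offset from the origin.
synthetic = rng.uniform(-5.0, 5.0, size=(1000, 3)).astype(np.float32) + 10.0

normalized = normalize(synthetic)
centroid = normalized.mean(axis=0)                       # close to (0, 0, 0)
max_radius = np.linalg.norm(normalized, axis=1).max()    # close to 1

print(centroid, max_radius)
```

After this step every cloud fits inside the unit sphere, whatever its original position and size.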

Visualize the original 3D data

The point cloud data can be downloaded from ShapeNet, a large-scale dataset of 3D shapes. Here we use the 3D data of a chair as an example.

point_data = "../data/pts/chair.pts"
points = load_data(point_data)
X = points[:, 0]
Y = points[:, 2]
Z = points[:, 1]
ax = visualize(points)
ax.scatter3D(X, Y, Z, s=5, cmap="jet", marker="o", label='chair')
ax.set_title('3D Visualization')
plt.legend(loc='upper right', fontsize=8)

Run inference

Run inference and visualize the results of 3D segmentation.

- The input data is a point cloud with a batch size of 1, 3 axis values (x, y, z) and an arbitrary number of points (dynamic shape).
- The output data is a mask with a batch size of 1 and 4 classification confidence values for each input point.
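
The input and output layouts described above can be traced with plain NumPy. This is a small sketch under stated assumptions: the confidences are random numbers standing in for a real model output, and the names `demo_points`, `model_input` and `fake_output` are made up for the example.

```python
import numpy as np

num_points = 5
# Five made-up points in (N, 3) layout, one row per point.
demo_points = np.arange(num_points * 3, dtype=np.float32).reshape(num_points, 3)

# (N, 3) -> (3, N) -> (1, 3, N): batch size 1, 3 axis values, N points.
model_input = np.expand_dims(demo_points.transpose(1, 0), axis=0)

# Random values standing in for a real output of shape (1, N, 4).
rng = np.random.default_rng(0)
fake_output = rng.standard_normal((1, num_points, 4)).astype(np.float32)

# One part index per point: the class with the highest confidence.
demo_pred = np.argmax(fake_output[0], axis=1)

print(model_input.shape, fake_output.shape, demo_pred.shape)
```

The last axis of the input holds the points, so column `k` of `model_input[0]` is the k-th point of the original array.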

# Parts of a chair
classes = ['back', 'seat', 'leg', 'arm']

# Preprocess the input data
point = points.transpose(1, 0)
point = np.expand_dims(point, axis=0)

# Read model
ie = Core()
model = ie.read_model(model=ir_model_xml)

print(f"input shape: {model.input().partial_shape}")
print(f"output shape: {model.output(0).partial_shape}")
input shape: [1,3,?]
output shape: [1,?,4]

# Inference
compiled_model = ie.compile_model(model=model, device_name="CPU")
output_layer = compiled_model.output(0)
result = compiled_model([point])[output_layer]

# Find the label map for all points of chair with highest confidence
pred = np.argmax(result[0], axis=1)
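ax = visualize(points)
for i, name in enumerate([0, 1, 2, 3]):
    XCur = []
    YCur = []
    ZCur = []
    for j, nameCur in enumerate(pred):
        if name == nameCur:
            # collect the coordinates of points predicted as the current part
            XCur.append(X[j])
            YCur.append(Y[j])
            ZCur.append(Z[j])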
    XCur = np.array(XCur)
    YCur = np.array(YCur)
    ZCur = np.array(ZCur)

    # add current point of the part
    ax.scatter(XCur, YCur, ZCur, s=5, cmap="jet", marker="o", label=classes[i])

ax.set_title('3D Segmentation Visualization')
plt.legend(loc='upper right', fontsize=8)
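
Besides the plot, the predicted labels can be summarized numerically. The sketch below counts how many points fall into each part. It uses a short made-up label array, `demo_pred`, in place of the real `pred`, so the numbers it prints are illustrative only.

```python
import numpy as np

classes = ['back', 'seat', 'leg', 'arm']

# Made-up per-point labels; in the notebook this role is played by `pred`.
demo_pred = np.array([0, 0, 1, 2, 2, 2, 1, 0])

# minlength keeps a slot for parts that received no points.
counts = np.bincount(demo_pred, minlength=len(classes))

for part_name, count in zip(classes, counts):
    print(f"{part_name}: {count} points ({count / demo_pred.size:.1%})")
```

A part with zero points, such as 'arm' in this toy array, still appears in the summary because of `minlength`.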