## 3D perception for robotics

Posted by Łukasz on 15 Feb 2018.

Hello everyone! Today I would like to go with you through the steps of filtering a camera image, clustering for segmentation and doing object recognition, all that for the famous PR2 robot. We will use the Point Cloud Library (PCL) to help us with that along with Gazebo, RViz and Python.

In the following sections, I will introduce each mention aspect in detail.

### The PR2 robot

The PR2 is an open and robust robot platform designed from the ground up for software developers and researchers. By eliminating the need to first build a hardware system and then re-implement code, the PR2 allows software experts to immediately create new functionality on the robot.

The PR2 Robot is fully integrated with ROS, providing the power of all the ROS developer tools and out-of-the-box functionality for everything from full system calibration to manipulation.

To allow development without a physical PR2, we provide a simulator, with tutorials to get you started. The ROS interfaces to the simulator and to the real robot are the same, so code written for one will generally run on the other.

Features:

• Robust enough that you can develop and test code on the robot itself,
• Ready for development out-of-the-box,
• Wireless Bluetooth Joystick,
• Wireless Run/Stop,
• Speaker.

### Calibration, filtering and segmentation

The first step was to create a `Point Cloud` from the RGBD camera mounted on the PR2 robot. The data was received from the `/pr2/world/points` topic in ROS, which contained noise (the data wasn’t clear). I had to calibrate and filter the data, to get a satisfying result.

Here are the techniques I used...

#### Statistical Outlier Filtering

The Statistical Outlier Filter performs a statistical analysis in the neighborhood of each point, and remove those points which do not meet a certain criteria. By assuming a Gaussian distribution, all points whose mean distances are outside of an interval defined by the global distances mean+standard deviation are considered to be outliers and removed from the point cloud.

``````def statistical_outlier_filtering(data, k=20, thresh=0.5):
sof = data.make_statistical_outlier_filter()
sof.set_mean_k(k)
sof.set_std_dev_mul_thresh(thresh)
return sof.filter()
``````

#### Voxel Grid Downsampling

A voxel grid filter allows you to downsample the data by taking a spatial average of the points in the cloud confined by each voxel. You can adjust the sampling size by setting the voxel size along each dimension. The set of points which lie within the bounds of a voxel are assigned to that voxel and statistically combined into one output point. The resulting Point Cloud is `smaller`.

``````def voxel_grid_downsampling(data, leaf_size=0.01):
vox = data.make_voxel_grid_filter()
vox.set_leaf_size(leaf_size, leaf_size, leaf_size)
return vox.filter()
``````

#### Pass Through Filtering

The Pass Through Filter works much like a cropping tool, which allows you to crop any given 3D point cloud by specifying an axis with cut-off values along that axis. The region you allow to pass through, is often referred to as region of interest.

``````def passthrough_filter(data, axis, min, max):
passthrough = data.make_passthrough_filter()
passthrough.set_filter_field_name(axis)
passthrough_z.set_filter_limits(min, max)
return passthrough.filter()
``````

In the project, the passthrough filter was used twice. First in the `z` axis, and next in the `x` axis (to take care of the place boxes).

``````pcl_data = passthrough_filter(pcl_data, 'z', 0.6, 1.1)
pcl_data = passthrough_filter(pcl_data, 'x', 0.3, 1.0) # Remove boxes
``````

#### RANSAC algorithm

The RANSAC algorithm assumes that all of the data in a dataset is composed of both inliers and outliers, where inliers can be defined by a particular model with a specific set of parameters, while outliers do not fit that model and hence can be discarded. Like in the example below, we can extract the outliners that are not good fits for the model.

In our case `inliners` are the objects placed on the table, while `outliners` include the table and everything else.

``````def ransac_segmentation(data, max_distance=0.01):
ransac = data.make_segmenter()
ransac.set_model_type(pcl.SACMODEL_PLANE)
ransac.set_method_type(pcl.SAC_RANSAC)
ransac.set_distance_threshold(max_distance)
return ransac.segment()

# Getting the objects and table example
inliners, coefficients = ransac_segmentation(data)
objects = data.extract(inliners, negative=True) # Please notice the negative value
table = data.extract(inliners, negative=False) # And here... :)
``````

### Clustering for segmentation

Once we get the Point Clouds associated with objects, we can begin the clustering and segmentation process. We need this, to be able to recognize objects placed on the table.

The `k-d tree` data structure is used in the Euclidian Clustering algorithm to decrease the computational burden of searching for neighboring points. While other efficient algorithms/data structures for nearest neighbor search exist, PCL’s Euclidian Clustering algorithm only supports k-d trees.

``````def euclidean_clustering(white_cloud, tolerance=0.01, min=30, max=5000):
tree = white_cloud.make_kdtree()
ec = white_cloud.make_EuclideanClusterExtraction()
ec.set_ClusterTolerance(tolerance)
ec.set_MinClusterSize(min)
ec.set_MaxClusterSize(max)
ec.set_SearchMethod(tree)
return ec.Extract()
``````

I found out that using the above values presented good results for clustering even the smallest objects (for example `glue` in the 3rd world scene).

Once we have that going, we can visualize the cluster in a way, that is readable to the human eye:

``````def get_cluster_cloud(cluster_indices, white_cloud):
cluster_color = get_color_list(len(cluster_indices))
color_cluster_point_list = []

for j, indices in enumerate(cluster_indices):
for i, indice in enumerate(indices):
color_cluster_point_list.append([white_cloud[indice],
white_cloud[indice],
white_cloud[indice],
rgb_to_float(cluster_color[j])])
cluster_cloud = pcl.PointCloud_PointXYZRGB()
cluster_cloud.from_list(color_cluster_point_list)
return cluster_cloud
``````

### Object recognition

Before doing any actual object recognition, I had to create a labeled dataset first. I have used the `capture_features.py` script with `100 iterations` for each object to get a good feature list. The result was saved into `training_set.sav`.

``````Features in Training Set: 800
Invalid Features in Training set: 0
Scores: [ 0.96875  0.9875   0.975    0.98125  0.96875]
Accuracy: 0.98 (+/- 0.01)
accuracy score: 0.97625
``````

Next I have run the `train_svm.py` script. SVMs work by applying an iterative method to a training dataset, where each item in the training set is characterized by a feature vector and a label. In the image above, each point is characterized by just two features, A and B. The color of each point corresponds to its label, or which class of object it represents in the dataset. Below you can see the Confusion Matrix with and without normalization.

### Sending the yaml file

To communicate with the robot, we will generate `yaml` files, that contain the following:

• `Test scene number`, depending on what world we are currently in,
• `Cluster name` from the `rospy.get_param('/object_list')`,
• `Arm name` from the `rospy.get_param('/dropbox')`,
• `Pickup pose position` taken from the cloud centroid,
• `Place pose position` of the dropbox.

Here is the code needed for creating the `yaml` file responsible for robot movement:

``````# These variables will be used by all objects
test_scene_num = Int32()
test_scene_num.data = SCENE_NUMBER
output = []

# Get object list and dropbox parameters from config files
object_list_param = rospy.get_param('/object_list')
dropbox_list_param = rospy.get_param('/dropbox')

# Loop through all the objects we detected previously
for object in object_list:

# Create response parameters
object_name = String()
arm = String()
pick_pose = Pose()
place_pose = Pose()

# Get object name
object_name.data = str(object.label)

# Get the centroid of the object cloud using numpy
cloud = ros_to_pcl(object.cloud).to_array()
x, y, z = np.mean(cloud, axis=0)[:3]
pick_pose.position.x = np.asscalar(x)
pick_pose.position.y = np.asscalar(y)
pick_pose.position.z = np.asscalar(z)

# Get to what group the item belongs
target_group = None
for param in object_list_param:
if param['name'] == object_name.data:
target_group = param['group']
break

# Check for box information
for param in dropbox_list_param:
if param['group'] == target_group:
arm.data = param['name']
x, y, z = param['position']
place_pose.position.x = np.float(x)
place_pose.position.y = np.float(y)
place_pose.position.z = np.float(z)
break

# Create yaml for object using helper method
dict = make_yaml_dict(test_scene_num, arm, object_name, pick_pose, place_pose)

# Add yaml to output
output.append(dict)

# Send data to the PR2 robot
rospy.wait_for_service('pick_place_routine')
try:
pick_place_routine = rospy.ServiceProxy('pick_place_routine', PickPlace)
# Pass the data to the response
resp = pick_place_routine(
test_scene_num,
object_name,
arm,
pick_pose,
place_pose)

print ("Response: ", resp.success)
except rospy.ServiceException, e:
print "Service call failed: %s" % e
``````

### Discussion

The presented methods allowed me to successfully recognize the objects in 3 test worlds. I had to change the calibration values for filters from the exercises, because they weren’t giving me satisfying results - mostly because of the noise present in the PR2 `/pr2/world/points` topic.

It amazed me, that I had to do so many feature runs (100 iterations), before I got good results that allowed recognition of items. With previous try, which was 50 iterations, some items weren’t recognized at all.

Also `Test World 1` required a higher value for the Euclidean Clustering minimum cluster size - 100 instead of 30 - compared to `Test World 2 and 3`.

What can be improved in future work:

• The way the passthrough filter is handled, won’t fit other camera settings,
• When the objects would be closer, there would be problems with creation of clusters,
• Adding new (not learned) objects would require running the SVM again,
• The initial point cloud is quite big, reducing the speed of the algorithm.