Parallel Rapidly Exploring Random Trees
Team Members
Zixin Shen, Haoran Zhang
Checkpoint
Click for our checkpoint report: checkpoint
Final
Click for our final report: final
Summary
We propose to implement a parallel version of the RRT algorithm, compare its performance against the serial version, and benchmark it on different numbers of cores.
Background
RRT (Rapidly-Exploring Random Trees) is a sampling-based motion-planning algorithm commonly used in autonomous vehicles, so speeding up its computation directly reduces the vehicle's decision time. RRT-based algorithms are relatively easy to implement, do not require building a detailed map, and remain efficient in high-dimensional state spaces.
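For reference, below is a minimal sketch of one serial RRT growth step in C++. The names (`Node`, `rrtStep`, `collisionFree`) and the trivial collision test are illustrative placeholders, not our actual implementation.

```cpp
#include <cmath>
#include <limits>
#include <random>
#include <vector>

// Illustrative 2D tree node; the real implementation may differ.
struct Node {
    double x, y;
    int parent;  // index of the parent node in the tree, -1 for the root
};

// Placeholder obstacle test; a real planner checks the environment map.
static bool collisionFree(const Node& /*from*/, const Node& /*to*/) {
    return true;
}

// One serial RRT growth step: sample, find nearest, steer, check, insert.
void rrtStep(std::vector<Node>& tree, std::mt19937& rng,
             double xMax, double yMax, double stepSize) {
    std::uniform_real_distribution<double> ux(0.0, xMax), uy(0.0, yMax);
    const double sx = ux(rng), sy = uy(rng);

    // Brute-force nearest-neighbor search; this is what a k-d tree replaces.
    int nearest = 0;
    double best = std::numeric_limits<double>::max();
    for (int i = 0; i < static_cast<int>(tree.size()); ++i) {
        const double d = std::hypot(tree[i].x - sx, tree[i].y - sy);
        if (d < best) { best = d; nearest = i; }
    }

    // Steer from the nearest node toward the sample by at most stepSize.
    const double theta = std::atan2(sy - tree[nearest].y, sx - tree[nearest].x);
    Node cand{tree[nearest].x + stepSize * std::cos(theta),
              tree[nearest].y + stepSize * std::sin(theta), nearest};

    // Insert only if the new edge is collision-free.
    if (collisionFree(tree[nearest], cand)) tree.push_back(cand);
}
```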
Challenge
1. The outer iterative loop is difficult to parallelize because each iteration has a strong data dependency on the previous one: every iteration queries the current tree for the nearest node and then inserts a new node (updating the k-d tree), so later iterations must see earlier insertions (see the sketch after this list).
2. It is hard to parallelize both the nearest-neighbor search and the collision detection at the same time due to the sequential structure of the algorithm.
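To make the first point concrete, here is an illustrative sketch (reusing the `Node` type and `rrtStep` from the Background sketch) of what a naive `#pragma omp parallel for` over the outer loop would look like: because each iteration both reads the whole tree and mutates it, the shared tree must be guarded, the critical section ends up covering the entire body, and the iterations still execute one at a time, now with added synchronization overhead.

```cpp
#include <omp.h>
#include <random>
#include <vector>
// Reuses Node and rrtStep from the serial sketch above.

// Naive parallelization of the outer loop (for illustration only):
// the nearest-neighbor query and the insertion both touch the shared
// tree, so the whole step must sit inside the critical section and the
// iterations are effectively serialized.
void rrtNaiveParallel(std::vector<Node>& tree, int iterations,
                      double xMax, double yMax, double stepSize) {
    #pragma omp parallel
    {
        std::mt19937 rng(12345 + omp_get_thread_num());  // per-thread RNG
        #pragma omp for
        for (int it = 0; it < iterations; ++it) {
            #pragma omp critical
            rrtStep(tree, rng, xMax, yMax, stepSize);  // effectively serial
        }
    }
}
```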
Resources
We will start from a serial implementation of RRT, use OpenMP to accelerate it, and analyze the resulting speedup. We have also found existing CUDA and OpenMP implementations that do not use optimized data structures; we would like to evaluate their performance and compare it with ours.
Goals and Deliverables
Basic Goal
Our basic goal is a working serial version of the RRT algorithm, along with a benchmark script to measure performance; we will then implement a basic parallel version of RRT and compare the two.
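One plausible shape for the basic parallel version is to keep the outer loop serial and parallelize the per-iteration nearest-neighbor scan with OpenMP. The sketch below assumes that approach (names are illustrative, and it reuses the `Node` type from the Background sketch).

```cpp
#include <omp.h>
#include <cmath>
#include <limits>
#include <vector>
// Reuses the Node type from the serial sketch above.

// Parallel nearest-neighbor query over the current tree: each thread
// scans a chunk of the nodes, keeps a thread-local best, and the local
// results are combined in a short critical section. The outer RRT loop
// stays serial, so the tree is never mutated during the scan.
int nearestParallel(const std::vector<Node>& tree, double sx, double sy) {
    int bestIdx = 0;
    double bestDist = std::numeric_limits<double>::max();
    #pragma omp parallel
    {
        int localIdx = 0;
        double localDist = std::numeric_limits<double>::max();
        #pragma omp for nowait
        for (int i = 0; i < static_cast<int>(tree.size()); ++i) {
            const double d = std::hypot(tree[i].x - sx, tree[i].y - sy);
            if (d < localDist) { localDist = d; localIdx = i; }
        }
        #pragma omp critical
        if (localDist < bestDist) { bestDist = localDist; bestIdx = localIdx; }
    }
    return bestIdx;
}
```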
Ideal Goal
Our ideal goal is to improve the parallel RRT with a better parallelization strategy and data structures such as a k-d tree, then compare its performance against the first version and measure the speedup with different numbers of cores.
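A minimal sketch of the kind of k-d tree we have in mind for the nearest-neighbor query (2D, insertion without rebalancing; `KdNode`, `kdInsert`, and `kdNearest` are placeholder names, and our real tree would also carry RRT parent links):

```cpp
#include <array>
#include <cmath>
#include <memory>

// Minimal 2D k-d tree node, used here only for nearest-neighbor queries.
struct KdNode {
    std::array<double, 2> p;              // point coordinates
    std::unique_ptr<KdNode> left, right;  // children across the split plane
    KdNode(double x, double y) : p{x, y} {}
};

static double sqDist(const std::array<double, 2>& a,
                     const std::array<double, 2>& b) {
    const double dx = a[0] - b[0], dy = a[1] - b[1];
    return dx * dx + dy * dy;
}

// Insert a point, alternating the split axis with depth (no rebalancing).
void kdInsert(std::unique_ptr<KdNode>& node, double x, double y, int depth = 0) {
    if (!node) { node = std::make_unique<KdNode>(x, y); return; }
    const int axis = depth % 2;
    const double key = (axis == 0) ? x : y;
    if (key < node->p[axis]) kdInsert(node->left, x, y, depth + 1);
    else                     kdInsert(node->right, x, y, depth + 1);
}

// Nearest-neighbor search: descend toward the query, then unwind and
// visit the far subtree only if it could still contain a closer point.
void kdNearest(const KdNode* node, const std::array<double, 2>& q,
               int depth, const KdNode*& best, double& bestSq) {
    if (!node) return;
    const double d = sqDist(node->p, q);
    if (d < bestSq) { bestSq = d; best = node; }
    const int axis = depth % 2;
    const double diff = q[axis] - node->p[axis];
    const KdNode* near = (diff < 0) ? node->left.get() : node->right.get();
    const KdNode* far  = (diff < 0) ? node->right.get() : node->left.get();
    kdNearest(near, q, depth + 1, best, bestSq);
    if (diff * diff < bestSq)  // splitting plane is closer than current best
        kdNearest(far, q, depth + 1, best, bestSq);
}
```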
Extra Goal
If things go well and we still have time, we will try to migrate our algorithm to the GPU using CUDA and measure the performance difference. We will also try to make our k-d tree lock-free to achieve even better parallelism.
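For the lock-free direction, one possible design (an assumption, not a finished implementation) is an insert-only tree whose child links are atomic pointers, with new leaves attached by compare-and-swap so concurrent inserts never block each other:

```cpp
#include <array>
#include <atomic>

// Sketch of a lock-free, insert-only 2D k-d tree node. Nodes are never
// removed, so a failed CAS simply means another thread attached a node
// first and we keep descending from it. Default (seq_cst) atomics are
// used to keep the sketch simple.
struct LfNode {
    std::array<double, 2> p;
    std::atomic<LfNode*> child[2];  // child[0] = "less", child[1] = "greater-or-equal"
    LfNode(double x, double y) : p{x, y} { child[0] = nullptr; child[1] = nullptr; }
};

void lfInsert(LfNode* root, LfNode* fresh) {
    LfNode* cur = root;
    int depth = 0;
    while (true) {
        const int axis = depth % 2;
        const int dir = (fresh->p[axis] < cur->p[axis]) ? 0 : 1;
        LfNode* next = cur->child[dir].load();
        if (next == nullptr) {
            LfNode* expected = nullptr;
            // Try to attach the new leaf; if another thread won the race,
            // `expected` now holds the node it installed and we descend.
            if (cur->child[dir].compare_exchange_strong(expected, fresh))
                return;
            next = expected;
        }
        cur = next;
        ++depth;
    }
}
```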
Platform Choice
We will use multi-core CPUs (the GHC machines) for our basic implementation and will also benchmark it on the PSC machines. If we have time, we will try to migrate it to the GPU.
Timeline
W1: 11/09-11/15 | Implement serial RRT and reproduce the parallelized RRT described in prior work
W2: 11/16-11/22 | Design our parallelized RRT, analyze its performance, and identify the bottlenecks
W3: 11/23-11/29 | Improve our parallelized RRT with a k-d tree; compare against the CUDA version and our initial parallelized RRT
11/30 | Milestone Report
W4: 11/30-12/08 | Visualize the generated paths and results; finish the final report
11/09 | Poster Session |