Advanced Lane Finding Project
The goals / steps of this project are the following:
- Compute the camera calibration matrix and distortion coefficients given a set of chessboard images.
- Apply a distortion correction to raw images.
- Use color transforms, gradients, etc., to create a thresholded binary image.
- Apply a perspective transform to rectify binary image ("birds-eye view").
- Detect lane pixels and fit to find the lane boundary.
- Determine the curvature of the lane and vehicle position with respect to center.
- Warp the detected lane boundaries back onto the original image.
- Output visual display of the lane boundaries and numerical estimation of lane curvature and vehicle position.
Rubric Points
Here I will consider the rubric points individually and describe how I addressed each point in my implementation.
1. Briefly state how you computed the camera matrix and distortion coefficients. Provide an example of a distortion corrected calibration image.
To compute the camera matrix and distortion coefficients used to undistort images, I used OpenCV's `cv2.calibrateCamera()` function. The function requires an array of 3D coordinates corresponding to the location of points in the real world, as well as an array of 2D coordinates corresponding to the location of those points in a 2D image. For the 3D coordinates, I created an array called `objp` that contains the x, y, z coordinates of each corner of a 9x6 chessboard, starting from (0, 0, 0) and going across and down the board. The z coordinate is always 0. Next, I read in 20 images of a black-and-white chessboard taped to a wall. These images are still distorted. Each image is passed to the OpenCV function that finds the chessboard corners. If the corners are detected, their x, y coordinates are appended to an array called `imgpoints` and a copy of `objp` is appended to `objpoints`. These arrays are then passed to `cv2.calibrateCamera()` to compute the camera matrix and distortion coefficients, which I use as input to `cv2.undistort()` throughout the project to undistort images. An example of applying the undistort function to correct an image is included below.
I applied the distortion correction to one of the images from the vehicle, where the distortion effect was obvious:
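In code form, the calibration step looks roughly like the following sketch. The helper name `calibrate()`, the glob pattern, and the commented usage lines are illustrative, not the exact code in p3_submission.py:

```python
import glob
import cv2
import numpy as np

def calibrate(pattern='camera_cal/calibration*.jpg', nx=9, ny=6):
    """Compute the camera matrix and distortion coefficients from chessboard images."""
    # One objp per image: the (x, y, z) corner locations on a flat nx-by-ny board, z always 0
    objp = np.zeros((ny * nx, 3), np.float32)
    objp[:, :2] = np.mgrid[0:nx, 0:ny].T.reshape(-1, 2)

    objpoints, imgpoints = [], []
    for fname in glob.glob(pattern):
        img = cv2.imread(fname)
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        ret, corners = cv2.findChessboardCorners(gray, (nx, ny), None)
        if ret:
            objpoints.append(objp)
            imgpoints.append(corners)

    # gray.shape[::-1] gives the (width, height) image size expected by calibrateCamera
    ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
        objpoints, imgpoints, gray.shape[::-1], None, None)
    return mtx, dist

# Undistort any later image with the saved matrices:
# mtx, dist = calibrate()
# undistorted = cv2.undistort(img, mtx, dist, None, mtx)
```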
2. Describe how (and identify where in your code) you used color transforms, gradients or other methods to create a thresholded binary image. Provide an example of a binary image result.
I will demonstrate the methods I used to create a threshold binary image on the following image from the vehicle:
To create the thresholded binary image, I used three channels: the Red channel from the RGB image, the Saturation channel from the HLS image, and the Hue channel from the HLS image. For the Red channel, I applied an absolute Sobel gradient operation in the x direction, with a kernel of size 9 and threshold values between 30 and 100. I took a logical AND of this gradient with a directional gradient on the Red channel, also with a kernel of size 9 and threshold values between 0.7 and 1.3. This can be found in lines 330 to 335 of p3_submission.py.
The Sobel operation on the Red channel in the x direction did a good job of finding the white lane lines; however, it would also pick up extraneous marks on the road. I found that if I took a logical AND of the Sobel output with the directional gradient, I could remove some of the extraneous marks. See the following for an example. The top image shows the thresholded binary for a Sobel operation in the x direction on the Red channel alone. The bottom image shows the output of that operation ANDed with the directional gradient. This can be found in line 296 of p3_submission.py.
Performing a channel threshold on the HLS Saturation and Hue channels picks up the yellow lane better, as well as filling in parts of the lanes that the operations on the Red channel did not pick up. This can be found in line 297 of p3_submission.py. See the image below for an example.
Finally, I take a logical OR of the Red gradient output and the HLS threshold output, since the two complement each other: the Red channel picks up certain white lane portions better, and the HLS threshold output picks up yellow lanes better, even when the lanes are in shadow. I do this in line 302 of p3_submission.py. The top image shows the combined binary output, and the bottom image shows the Red channel contribution in red and the HLS channel contribution in green.
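A condensed sketch of this combination is shown below. The Sobel and direction thresholds come from the description above, while the HLS channel limits and the helper name are placeholders rather than the exact values in p3_submission.py:

```python
import cv2
import numpy as np

def threshold_binary(img, ksize=9):
    """Combine a Red-channel Sobel-x gradient (ANDed with a direction gradient)
    with HLS Saturation/Hue channel thresholds (ORed together)."""
    red = img[:, :, 0]                      # assuming an RGB image
    hls = cv2.cvtColor(img, cv2.COLOR_RGB2HLS)
    hue, sat = hls[:, :, 0], hls[:, :, 2]

    # Absolute Sobel gradient in x on the Red channel, scaled and thresholded to (30, 100)
    sobelx = np.absolute(cv2.Sobel(red, cv2.CV_64F, 1, 0, ksize=ksize))
    scaled = np.uint8(255 * sobelx / np.max(sobelx))
    gradx = (scaled >= 30) & (scaled <= 100)

    # Direction of the gradient on the Red channel, thresholded to (0.7, 1.3) radians
    sobely = np.absolute(cv2.Sobel(red, cv2.CV_64F, 0, 1, ksize=ksize))
    direction = np.arctan2(sobely, sobelx)
    dir_binary = (direction >= 0.7) & (direction <= 1.3)

    # Placeholder HLS channel thresholds -- the actual limits are in p3_submission.py
    color_binary = (sat > 100) & (hue < 50)

    combined = np.zeros_like(red)
    combined[(gradx & dir_binary) | color_binary] = 1
    return combined
```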
3. Describe how (and identify where in your code) you performed a perspective transform and provide an example of a transformed image.
I used the OpenCV functions `cv2.getPerspectiveTransform()` and `cv2.warpPerspective()` to transform an image from the vehicle to a top-down perspective. To do so, I had to map the x, y coordinates of points in the original image to the x, y coordinates of those points in the transformed image. Originally, I used my code from Project 1 (Lane Finding) to find the endpoint coordinates of two lines running through both lanes, set those as the source points, and then transformed them into fixed destination points. However, when I performed the transform, the lanes would shift around because the source points were always changing, which made it difficult for the program to keep track of the lane positions. Later, I decided to fix the source points using endpoint coordinates that roughly correspond to the points of the lane closest to the car camera and to a position on the road about midway down the line of sight. This resulted in much more stable perspective transforms.
I created a helper function called `transform_perspective()` to do the transform; it is in lines 306 to 317 of p3_submission.py. When the pipeline is run, it calls a function named `process_image()` in which I define the source and destination points and pass them to `transform_perspective()` in lines 358 to 368 of the same file.
```python
y_max = 450     # The y value of where the lane lines end in the distance
x_offset = 45   # The number of pixels to the left and right of the horizontal middle of the image
imshape = image.shape  # (720, 1280, 3)
src = np.float32([[210, imshape[0]],
                  [imshape[1]/2 - x_offset, y_max],
                  [imshape[1]/2 + x_offset, y_max],
                  [1100, imshape[0]]])
dst = np.float32([[300, 720],
                  [300, 0],
                  [1000, 0],
                  [1000, 720]])
```
This resulted in the following source and destination points:
| Source | Destination |
|---|---|
| 210, 720 | 320, 720 |
| 625, 450 | 320, 0 |
| 725, 450 | 1000, 0 |
| 1100, 720 | 1000, 720 |
I verified that my perspective transform was working as expected by drawing the `src` and `dst` points onto a test image and its warped counterpart and checking that the lane lines appear parallel in the warped image.
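For reference, a minimal sketch of what the `transform_perspective()` helper could look like, built only from the two OpenCV calls named above (the exact signature and return values here are assumptions):

```python
import cv2

def transform_perspective(img, src, dst):
    """Warp an image to a top-down view; also return the inverse matrix
    so the detected lane can be warped back onto the road later."""
    h, w = img.shape[:2]
    M = cv2.getPerspectiveTransform(src, dst)
    Minv = cv2.getPerspectiveTransform(dst, src)
    warped = cv2.warpPerspective(img, M, (w, h), flags=cv2.INTER_LINEAR)
    return warped, M, Minv
```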
4. Describe how (and identify where in your code) you identified lane-line pixels and fit their positions with a polynomial?
I use the lane-centroid search method to detect lane pixels and fit their positions with a second-degree polynomial curve. If the centroid method successfully detects lane pixels and a polynomial is computed, I use the computed coefficients to extrapolate the lane in successive frames instead of running the centroid search again. The entire method is defined in a helper function named `find_lane_centroids` in lines 412 to 481 of p3_submission.py.
The lane-centroid search starts by vertically summing the values in the bottom left and right quadrants of the image. It convolves the result with a 1D array of predefined length whose values are all 1. This 1D array acts as a window that slides across the width of the image, and the `argmax` of the convolution gives the position of the lane center (lines 425 to 428). The algorithm repeats this convolution while moving up the image in predefined increments and records the left and right lane centers in an array (lines 434 to 451). Once all the lane centers are found, the algorithm collects all the pixels that fall within a margin around each center of each lane, from the bottom to the top of the image (lines 454 to 466). The image below shows a top-down view of a thresholded binary image; the green windows represent the lane pixels the centroid search detected.
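The first convolution step, reduced to a sketch (the window size and helper name are illustrative; the full version with the per-level margin search is in `find_lane_centroids`):

```python
import numpy as np

def first_centroids(binary_warped, window_width=50):
    """Find the starting left/right lane centers from the bottom quadrants
    of a warped binary image using a sliding-window convolution."""
    h, w = binary_warped.shape
    window = np.ones(window_width)  # flat 1D window of ones

    # Vertically sum the bottom-left and bottom-right quadrants
    l_sum = np.sum(binary_warped[int(h / 2):, :int(w / 2)], axis=0)
    r_sum = np.sum(binary_warped[int(h / 2):, int(w / 2):], axis=0)

    # The argmax of the convolution marks the densest column of lane pixels;
    # subtract half the window width to recover the window center
    l_center = np.argmax(np.convolve(window, l_sum)) - window_width / 2
    r_center = np.argmax(np.convolve(window, r_sum)) - window_width / 2 + int(w / 2)
    return l_center, r_center
```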
After I have the warped binary image and the centroid windows, I AND them together to get an image showing only the pixels of the lanes. If the left and right lane pixels are in `left_lane` and `right_lane` respectively, then the x and y values of these pixels can be extracted with the following code (lines 576 to 582 of p3_submission.py):
```python
left_lane_inds = left_lane.nonzero()
right_lane_inds = right_lane.nonzero()
# Extract left and right lane pixel positions
leftx = left_lane_inds[1]
lefty = left_lane_inds[0]
rightx = right_lane_inds[1]
righty = right_lane_inds[0]
```
The program passes the x and y values of the pixels to NumPy's `polyfit()` function to compute the coefficients of a second-degree curve that best fits the pixel positions. I store each frame's coefficients in an array and then compute the moving average of the coefficients across the 3 most recent frames (lines 615 to 624).
```python
# Fit a second order polynomial to the extracted left and right lane pixels
# Set this new fit as the current fit
left_line.current_fit = np.polyfit(lefty, leftx, 2)
right_line.current_fit = np.polyfit(righty, rightx, 2)
# Add the new poly coeffs to the list of all poly coeffs
left_line.poly_coeffs.append(left_line.current_fit)
right_line.poly_coeffs.append(right_line.current_fit)
# Calculate the average poly coeffs over the last n frames
left_line.best_fit = np.average(np.array(left_line.poly_coeffs[max(0, left_line.d + 1 - n):]), axis=0)
right_line.best_fit = np.average(np.array(right_line.poly_coeffs[max(0, right_line.d + 1 - n):]), axis=0)
```
I use this "best" fit to calculate the x values of the curve that fits each lane, given a set of predefined y values running from 0 to 719, the height of the image (lines 627 to 632).
```python
# Create a 1D array from 0 to 719
ploty = np.linspace(0, binary_warped.shape[0]-1, binary_warped.shape[0])
# Use the average poly fit coeffs for further calculations
left_fit = left_line.best_fit
right_fit = right_line.best_fit
# Use the fit coeffs to generate the x values of the curve for both lanes
left_line.bestx = left_fit[0]*ploty**2 + left_fit[1]*ploty + left_fit[2]
right_line.bestx = right_fit[0]*ploty**2 + right_fit[1]*ploty + right_fit[2]
```
5. Describe how (and identify where in your code) you calculated the radius of curvature of the lane and the position of the vehicle with respect to center.
I created a function called `calc_curve` that takes the x and y values of the lane pixels and calculates the radius of curvature. The code, taken from lines 381 to 400 of p3_submission.py, is below:
```python
def calc_curve(y_eval, lefty, leftx, righty, rightx):
    '''
    Takes a set of points from the right and left lanes and then calculates the curvature
    of the curve that fits those points.
    '''
    # Define conversions in x and y from pixels space to meters
    ym_per_pix = 30/720   # meters per pixel in y dimension
    xm_per_pix = 3.7/700  # meters per pixel in x dimension
    # Fit new polynomials to x,y in world space
    left_fit_cr = np.polyfit(lefty*ym_per_pix, leftx*xm_per_pix, 2)
    right_fit_cr = np.polyfit(righty*ym_per_pix, rightx*xm_per_pix, 2)
    # Calculate the new radii of curvature
    left_curverad = ((1 + (2*left_fit_cr[0]*y_eval*ym_per_pix +
                           left_fit_cr[1])**2)**1.5) / np.absolute(2*left_fit_cr[0])
    right_curverad = ((1 + (2*right_fit_cr[0]*y_eval*ym_per_pix +
                            right_fit_cr[1])**2)**1.5) / np.absolute(2*right_fit_cr[0])
    # Now our radius of curvature is in meters
    return left_curverad, right_curverad
```
To calculate the vehicle position with respect to the lane center, I take the x values of the left and right lane curves at the bottom of the image and average them to get the lane midpoint. I then subtract the center of the image (x = 640) from this midpoint and multiply the result by a factor that converts pixels to meters. Finally, if the number is negative, I report that the vehicle is to the left of center; otherwise, it is to the right of center. The code for this is in lines 684 to 687 of p3_submission.py.
```python
# Calculate the point midway between the lanes at the bottom of the image
midpt = (left_fitx[y_eval] + right_fitx[y_eval])/2
# Calculate the offset of the midpoint to the center of the image, in meters
offset = (midpt - 640)*xm_per_pix
```
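The direction message then follows from the sign of the offset, roughly like the sketch below (the exact wording and formatting are assumptions, not the code in p3_submission.py):

```python
# Negative offset -> report "left of center", positive -> "right of center"
side = 'left' if offset < 0 else 'right'
label = 'Vehicle is {:.2f} m {} of center'.format(abs(offset), side)
```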
6. Provide an example image of your result plotted back down onto the road such that the lane area is identified clearly.
I implemented the pipeline in the function `run_centroid_search_pipeline()`, starting at line 548 of p3_submission.py. The pipeline undistorts an image frame from the video and creates a warped threshold binary from it. It then uses the centroid search method to find the lane pixels, fits a polynomial to those pixels, and uses the coefficients to calculate curves that follow the lanes. A green polygon is filled in between the curves. The algorithm performs an inverse perspective transform on the polygon and overlays it back onto the undistorted image of the road. The coefficients are stored, and subsequent iterations use a moving average over the 3 most recent frames to calculate the "best" fit. If a polynomial curve is successfully generated, the algorithm uses that fit to extrapolate the lane curves for future frames. The code checks for major shifts in the positions of the lane curves and for frames where no lane pixels are detected; if either occurs, the previous averaged best fit is used for computing the curve. If more than 3 consecutive frames have these issues, the code drops all the stored fits and restarts the centroid search.
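The "fill a polygon and warp it back" step is roughly the following sketch; the variable names `Minv` and `undist` are assumptions based on the description above, while `binary_warped`, `left_fitx`, `right_fitx`, and `ploty` match the snippets shown earlier:

```python
import cv2
import numpy as np

# Build a filled polygon between the two fitted curves on a blank warped canvas
warp_zero = np.zeros_like(binary_warped).astype(np.uint8)
color_warp = np.dstack((warp_zero, warp_zero, warp_zero))

pts_left = np.array([np.transpose(np.vstack([left_fitx, ploty]))])
pts_right = np.array([np.flipud(np.transpose(np.vstack([right_fitx, ploty])))])
pts = np.hstack((pts_left, pts_right))
cv2.fillPoly(color_warp, np.int_([pts]), (0, 255, 0))

# Warp the lane polygon back to the original perspective and overlay it
newwarp = cv2.warpPerspective(color_warp, Minv, (undist.shape[1], undist.shape[0]))
result = cv2.addWeighted(undist, 1, newwarp, 0.3, 0)
```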
Here are 3 pipeline output images from various stages of the video:
1. Provide a link to your final video output. Your pipeline should perform reasonably well on the entire project video (wobbly lines are ok but no catastrophic failures that would cause the car to drive off the road!).
Here's a link to my video result
1. Briefly discuss any problems / issues you faced in your implementation of this project. Where will your pipeline likely fail? What could you do to make it more robust?
I created a video that shows only the binary thresholds of each image frame. It shows that the centroid search sometimes detects a center that is not part of the lane. This happens in particular with the dashed lane lines, where there are empty spaces between the dashes. Also, since the algorithm hardcodes the source and destination points of the perspective transform, the transform does not work well on sharp turns, which results in poorly fitted lanes. The threshold gradients still tend to pick up marks on the road that may be detected as lane pixels, which can also result in a poorly fitted curve. One way I have tried to address this is by smoothing over a few frames and comparing the positions of the lanes between the current and previous frames; if there are any wild fluctuations, the code decides that it was a bad frame. Another improvement would be to use the lane-detection approach from the first project, which fits Hough lines to the lanes, so that I could have dynamic source and destination points that keep up with curved lines.