Computer Vision Fundamentals

Computer Vision Fundamentals - Lesson 5: Object Detection and Recognition

Overview:

This lesson covers object detection and recognition techniques, which are fundamental for identifying and classifying objects within images. Students will learn about various methods and algorithms used for detecting and recognizing objects in computer vision tasks.

Objectives:

By the end of this lesson, students should be able to:

  • Understand the key concepts and algorithms for object detection and recognition.
  • Implement object detection using traditional methods and modern deep learning approaches.
  • Evaluate and compare different object detection techniques.

Topics Covered:

  1. Introduction to Object Detection and Recognition:

    • Definition and importance of object detection and recognition.
    • Applications in real-world scenarios (e.g., surveillance, autonomous vehicles).
  2. Traditional Object Detection Methods:

    • Haar Cascades:

      • Overview and use cases.
      • Code implementation and example.

      Code Example:

      import cv2
      
      # Load the Haar cascade classifier
      cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
      
      # Load an image
      image = cv2.imread('image.jpg')
      gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
      
      # Detect objects
      objects = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))
      
      # Draw rectangles around detected objects
      for (x, y, w, h) in objects:
          cv2.rectangle(image, (x, y), (x+w, y+h), (255, 0, 0), 2)
      
      # Display results
      cv2.imshow('Detected Objects', image)
      cv2.waitKey(0)
      cv2.destroyAllWindows()
      
    • Histogram of Oriented Gradients (HOG):

      • Concept and application.
      • Code implementation and example.

      Code Example:

      from skimage.feature import hog
      from skimage import exposure
      import matplotlib.pyplot as plt
      import cv2
      
      # Load an image
      image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
      
      # Compute HOG features and visualization
      features, hog_image = hog(image, visualize=True, multichannel=False)
      hog_image_rescaled = exposure.rescale_intensity(hog_image, in_range=(0, 10))
      
      # Display the HOG image
      plt.figure(figsize=(10, 5))
      plt.subplot(121)
      plt.imshow(image, cmap='gray')
      plt.title('Input Image')
      plt.subplot(122)
      plt.imshow(hog_image_rescaled, cmap='gray')
      plt.title('HOG Image')
      plt.show()
      
  3. Deep Learning-Based Object Detection:

    • YOLO (You Only Look Once):

      • Concept and architecture.
      • Code implementation using pre-trained YOLO models.

      Code Example:

      import cv2
      import numpy as np
      
      # Load YOLO model
      net = cv2.dnn.readNet('yolov3.weights', 'yolov3.cfg')
      layer_names = net.getLayerNames()
      output_layers = [layer_names[i[0] - 1] for i in net.getUnconnectedOutLayers()]
      
      # Load image
      image = cv2.imread('image.jpg')
      height, width, channels = image.shape
      
      # Prepare image for YOLO
      blob = cv2.dnn.blobFromImage(image, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
      net.setInput(blob)
      outs = net.forward(output_layers)
      
      # Post-process the detections
      class_ids = []
      confidences = []
      boxes = []
      
      for out in outs:
          for detection in out:
              for obj in detection:
                  scores = obj[5:]
                  class_id = np.argmax(scores)
                  confidence = scores[class_id]
                  if confidence > 0.5:
                      center_x = int(obj[0] * width)
                      center_y = int(obj[1] * height)
                      w = int(obj[2] * width)
                      h = int(obj[3] * height)
                      x = int(center_x - w / 2)
                      y = int(center_y - h / 2)
                      boxes.append([x, y, w, h])
                      confidences.append(float(confidence))
                      class_ids.append(class_id)
      
      # Apply non-max suppression to remove overlapping boxes
      indices = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
      
      for i in indices:
          x, y, w, h = boxes[i[0]]
          cv2.rectangle(image, (x, y), (x+w, y+h), (0, 255, 0), 2)
      
      # Display results
      cv2.imshow('YOLO Object Detection', image)
      cv2.waitKey(0)
      cv2.destroyAllWindows()
      
    • SSD (Single Shot MultiBox Detector) and Faster R-CNN:

      • Overview and comparisons with YOLO.
      • Brief code examples using pre-trained models.
  4. Object Recognition:

    • Introduction to Object Recognition:

      • Concepts and approaches.
      • Key algorithms (e.g., template matching, feature-based recognition).
    • Template Matching:

      • Concept and application.
      • Code implementation and example.

      Code Example:

      import cv2
      
      # Load images
      image = cv2.imread('image.jpg')
      template = cv2.imread('template.jpg')
      
      # Convert to grayscale
      gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
      gray_template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
      
      # Template matching
      result = cv2.matchTemplate(gray_image, gray_template, cv2.TM_CCOEFF_NORMED)
      min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
      
      # Draw rectangle around the detected object
      top_left = max_loc
      bottom_right = (top_left[0] + template.shape[1], top_left[1] + template.shape[0])
      cv2.rectangle(image, top_left, bottom_right, (0, 255, 0), 2)
      
      # Display results
      cv2.imshow('Template Matching', image)
      cv2.waitKey(0)
      cv2.destroyAllWindows()
      
  5. Evaluating Object Detection and Recognition Models:

    • Metrics: Precision, recall, F1 score, and mean average precision (mAP).
    • Choosing the right model for specific applications.
  6. Challenges in Object Detection and Recognition:

    • Handling occlusions, varying lighting, and complex backgrounds.

Activities and Quizzes:

  • Activity: Implement object detection using YOLO or SSD on a given dataset. Visualize and analyze the results.
  • Quiz: Multiple-choice questions on object detection algorithms and recognition techniques.

Assignments:

  • Assignment 5: Develop a project where you implement object detection and recognition on a custom dataset (e.g., detecting and classifying objects in a series of images). Submit the code along with a detailed report discussing the techniques used and the performance of your system.

This lesson provides students with a comprehensive understanding of object detection and recognition, equipping them with the skills needed to tackle complex computer vision problems.