Binary Search

When implementing binary search, the choice between using l <= r vs. l < r, and how to update l and r (l = mid + 1 vs. l = mid, etc.), depends on the specific problem and whether you are looking for an exact match, the lower bound, or the upper bound of a target value. Here's a breakdown of each scenario:

Use case: Finding the exact position of a target element.

  • Loop condition: while l <= r

  • Mid calculation: mid = l + (r - l) // 2

  • Updates:

    • If the target is found: return mid

    • If the target is less than the middle element: r = mid - 1

    • If the target is greater than the middle element: l = mid + 1

def binary_search(arr, target):
    l, r = 0, len(arr) - 1
    while l <= r:
        mid = l + (r - l) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            l = mid + 1
        else:
            r = mid - 1
    return -1  # target not found
  • Why l <= r? The loop needs to continue as long as there are elements to check. When l equals r, mid will still represent a valid index to check.

  • Why l = mid + 1 and r = mid - 1? To move past the middle element after determining which side of the array the target could be in.

Use case: Finding the first occurrence of a target value or the smallest value that is greater than or equal to the target.

  • Loop condition: while l < r

  • Mid calculation: mid = l + (r - l) // 2

  • Updates:

    • If the mid value is greater than or equal to the target: r = mid

    • If the mid value is less than the target: l = mid + 1

def lower_bound(arr, target):
    l, r = 0, len(arr)
    while l < r:
        mid = l + (r - l) // 2
        if arr[mid] < target:
            l = mid + 1
        else:
            r = mid
    return l
  • Why l < r? This ensures that when the loop exits, l will be the smallest index such that arr[l] is greater than or equal to the target.

  • Why r = mid and not mid - 1? When searching for the lower bound, if arr[mid] is not less than the target, we cannot exclude mid from being the potential lower bound.

Use case: Finding the last occurrence of a target value or the largest value that is less than or equal to the target.

  • Loop condition: while l < r

  • Mid calculation: mid = l + (r - l + 1) // 2 (or mid = (l + r + 1) // 2)

  • Updates:

    • If the mid value is less than or equal to the target: l = mid

    • If the mid value is greater than the target: r = mid - 1

def upper_bound(arr, target):
    l, r = 0, len(arr) - 1
    while l < r:
        mid = l + (r - l + 1) // 2
        if arr[mid] > target:
            r = mid - 1
        else:
            l = mid
    return l
  • Why l < r? The loop exits when l equals r, and l will point to the largest index where arr[l] is less than or equal to the target.

  • Why l = mid and not mid + 1? When finding the upper bound, if arr[mid] is less than or equal to the target, we cannot exclude mid from potentially being the upper bound.

  • while l <= r: Use when you need to check every element and the target could be at any position, including when l == r. This is common when searching for an exact match.

  • while l < r: Use when searching for a boundary condition (like lower or upper bounds). This setup ensures that l ends at the boundary condition.

  • l = mid + 1 vs. r = mid - 1: Use when you can safely exclude mid from the next search range (e.g., exact search when the target is not at mid).

  • l = mid vs. r = mid: Use when mid could still be the target or part of the desired result (e.g., when looking for a boundary).