## Sunday, February 9, 2020

### Binary Search

In this blog post we will not explain what binary search is or how it works. Instead, we will focus on how to detect a binary search solution in a coding interview question and how to implement it properly.

## When to binary search in a coding interview?

We can use binary search almost every time we deal with a monotonic function. A monotonic function is a function between ordered sets that preserves or reverses the original order.

A function f(x) = y that satisfies f(x') <= f(x'') for any x' < x'' is monotonic. We used "<=" as the order relationship, but we can also use ">=" or a custom defined order relationship.

f(x) = 3x+1 is monotonically increasing.
f(x) = -x+5 is monotonically decreasing.
a sorted array arr[] = {2, 5, 9, 10, 12, 17} is a monotonically increasing discrete function!

We should consider using binary search whenever we are given a sorted input, a timeline, or generally speaking, a monotonic function.

Some coding interview practice questions could look like this:

Given a sorted array of integers A and a random number Q, find out if Q exists in A.
Find the minimum of a given U-shaped function or the peak in a mountain array.

## The textbook binary search is flawed

Let's start with the simple problem of finding a certain number Q in the sorted array A. In the programming interview, most candidates will implement something like this:

def find(arr, q):
left = 0
right = len(arr)
while (left < right):
mid = (left + right) / 2
if arr[mid] == q:
return true
if arr[mid] > q:
right = mid - 1
else
left = mid + 1

if right >= 0 and arr[right] == q:
return true
if arr[left] == q:
return true
return false

The problem with this implementation is that at the end of the while loop we don't know precisely if left == right or left > right. We also don't know if we are still within the boundaries of the array. Let's take a few examples:

[left, right] = [3, 4] => mid = 3
we may end up with [left, right] = [3, 2]

[left, right] = [0, 1] => mid = 0
we may end up with [left, right] = [0, -1]

This is where almost every candidate struggles in the programming interview, and, in most cases fails to handle all the possible edge cases.

## Lower bound / upper bound

If, instead of asking to find an exact value Q, we now ask for an upper bound or lower bound (nearest value in the array that is >= Q or <= Q) then we end up with even more complex edge cases.

Specifically, when we move the pointers to mid-1 or mid+1, we leave arr[mid] out of the potential solutions space. But arr[mid] may still be a viable solution. In our interviewing experience, almost every attempt the candidate made to address this has resulted in an infinite loop.

Also, once we are out of the while loop, we won't be able to tell whether arr[left], arr[right], or neither of them is the value we are looking for.

The source of all these corner cases comes from the way we move the pointers:

left = mid + 1
right = mid - 1

Let's try to fix this so that we never leave mid out and we never end up with the confusing end state where left > right.

• First, keep in mind that whenever we move the pointers, our goal is to generate a smaller interval than the original [left, right]. This is obvious when left and right are far apart, but can be confusing when they get very close.
• Second, the new binary search interval should be roughly half the size of the previous one (to achieve the logarithmic time complexity).
• Third, notice that the following relationship stands no matter what:

left <= mid < right
or even better:
left <= mid < mid + 1 <= right

So, if we split the [left, right] interval into [left, mid] and [mid+1, right], then we check all 3 conditions from above, and, according to the third condition, it's guaranteed that left <= right always, and left == right at the end!

Let's implement it:

def find(arr, q):
left = 0
right = len(arr)
while (left < right):
mid = (left + right) / 2
if arr[mid] <= q:
right = mid
else
left = mid + 1

# at this point we know that left == right
return arr[left] == q

Short, clean, without edge cases! Easy to adapt when looking for an upper or lower bound. Precisely what we need in a software engineering coding interview.

Of course, we can still check and return early if we find arr[mid] == q. However, this is an optimistic optimization and only provides a real benefit in particularly lucky situations.

def find(arr, q):
left = 0
right = len(arr)
while (left < right):
mid = (left + right) / 2
if arr[mid] == q:
return true
if arr[mid] < q:
right = mid
else
left = mid + 1
return arr[left] == q

## Binary search coding interview practice questions

• Find first or last occurrence of a given number in a sorted array
• Count occurrences of a number in a sorted array with duplicates
• Find smallest missing element from a sorted array
• Search an element in a circular sorted array