Relearning Binary Search

Tan Jun Rong avatar
Tan Jun Rong

Recently I've been learning data structure and algorithm. Having only taken one semester of it in college, I haven't had much experience in this. I picked up many different topics, some that I already knew, some completely new to me - hashmap, set, tree, merge sort, quick sort, dynamic programming, sliding window, binary search, graph, disjoint set, union find, topological sort, etc.

One of the topic that I learned in the school is Binary Search. It's pretty intuitive to remember. Say you have a sorted list of numbers:

[1,2,3,4,5,6,7,8]

You are asked to find a target number, say 3.

Brute force, O(n)

The easiest way is to make a simple loop and compare them 1 by 1.

Check the 1st number, is it 3?

[1,2,3,4,5,6,7,8]
 ^ 

Not found, check the 2nd number, is it 3?

[1,2,3,4,5,6,7,8]
   ^ 

Not found, check the 3rd number, is it 3?

[1,2,3,4,5,6,7,8]
     ^ 

Found!

In the worst case, this will take O(n) time, since we have to loop through the number if the target is the largest number in the list: 9.

Binary Search, O(log(n))

A simpler approach would be to eliminate half of the list every time we make a check.

Let's say we wanted to check if 7 is in the list. We start by checking with the middle element of the list.

[1,2,3,4,5,6,7,8]
         ^ 

Since 5 is smaller than 7, we can elimate everything before 5 since this list is sorted.

[x,x,x,x,x,6,7,8]

Then we repeat the same thing until we find the target number, everytime eliminating half of the list everytime.

Due to this the time complexity will become improve from O(log(n)) instead of O(n).

Searching More Than A Number

Up to this point, I'm showing some pretty typical usage of binary search, the basic concept is by halving search pool everytime. But when I stumbled upon this question while crunching through LeetCode: https://leetcode.com/problems/search-in-rotated-sorted-array/ (medium difficulty). This question is about searching a number in a sorted array, nothing too different from the question I just solved above. Except that the array is randomly rotated at a certain point. So the array actually looks like:

[7,8,0,1,2,3,4,5,6]
     ^ pivoted here

So if I can just find the point of pivot, I can perform the same binary search.

My intuition told me that I can just make a simple loop to find the pivot point. That is by checking 7, 8, then 8,0... Pair by pair until I find a drop in the pair, then I will know that it's the pivot point. That will make the bottle neck of the search to become O(n).

After reading the discussion section in LeetCode, I notice that Binary Search can actually be used to find the pivot point too! 😮

To revisit the basic concept of binary search, that is to ask the question:

"Can we eliminate half of the list everytime we make a search?"

The answer is actually yes!

Let's take a look at how we solve this example using binary search:

First, we start with checking the middle point

[7,8,0,1,2,3,4,5,6]
         ^  

Since we are not finding a single number, that's no target for us to check against. So let's think about what do we actually wanted to check for. If we draw the numbers into a graph:

   x
 x 
                 x
               x
             x
           x
         x
       x
     x
[7,8,0,1,2,3,4,5,6]

You can see that from 0 to 6 we it's always increasing, what we want to find it's a decrease. Our target is to find the decrease from 8 to 0.

We can do this check, cut the array by half, let's call it m, which stands for mid.

[7,8,0,1,2,3,4,5,6]
       ^ ^ ^  
         m

We will eliminate the half of the array that is increasing, from m + 1, we know that it's increasing, so let's eliminate that area. New array looks like this:

[7,8,0,1,2,x,x,x,x]

Now, let's do the same:

[7,8,0,1,2,x,x,x,x]
   ^ ^ ^
     m

m + 1 is greater than m, eliminate those that is increasing, new array becomes:

[7,8,0,x,X,x,x,x,x]

Repeat the same:

[7,8,0,x,x,x,x,x,x]
 ^ ^ ^
   m

Now we have found it! m is actually larger than m + 1, it's decreasing! So we have found our index of the pivot to be 2.


More similar questions

There are a few variants of this question:

Arming with a knowledge I obtained from finding the pivot point in a rotated array, I manage to solve these similar variants of the question.


Use of Binary Search Hidden in Plain Sight

Interestingly, I thought I knew all about binary search while solving diffirent variants of the question, but I stumbled upon yet another question solvable using binary search, yet disguised as a Dynamic Programming question.

👇
https://leetcode.com/problems/split-array-largest-sum/ (hard difficulty)

Here's a brief description of the question (I'll use normal English) -

Given an unsorted array like [7,2,5,10,8], it can be divided into m. Let's say m is 2, it can be divided into [7] & [2,5,10,8] or [7,2] & [5,10,8] or so on...

Divide it in a way that the largest sum of each sub-array is the smallest. From the above example case, let's list down all the different ways we can split the array:

  1. [7] & [2,5,10,8] , sum is 7 & 25
    • largest sum of this subarray = 25
  2. [7,2] & [5,10,8], sum is 9 & 23
    • largest sum of this subarray = 23
  3. [7,2,5] & [10,8], sum is 14 & 18
    • largest sum of this subarray = 18
  4. [7,2,5,10] & [8], sum is 24 & 8
    • largest sum of this subarray = 24

[7,2,5] & [10,8] is the answer, since 18 is the largest sum of the subarray that is the smallest.

Initially when I looked at the problem, it looks like a dynamic programming problem. Since it can be solved recursively. It can indeed be solved this way, but not the most efficient one. The best solution is actually binary search.

Let's try to apply Binary Search to this problem. First we need to model the question into 'Binary Searchable'. We need this few things to get started:

  1. lower boundary
  2. upper boundary
  3. able to cut the search pool by half
  4. able to check the target easily

Let's consider [7,2,5,10,8].

  1. lower boundary
    The lowest sum that be produced is when we cut it into the smallest size, when m = length. That will be m = 5. When m = 5, the array becomes [7],[2],[5],[10],[8],. The smallest largest sum would be 10, that will be the lowest bound.
  2. upper boundary
    The upper bound can be produced when m = 1, that will be [7,2,5,10,8] and the sum is 32.
  3. & 4.
    This two points can be done together. Now we have to find a number between the lower to the upper bound. Range: 10..32. We can start at the middle, mid = 21. In our question m = 2, so we can test if we can make 21 with m = 2. We can start adding up from the beginning, and make the cut when it is larger than 21. That will result in [7,2,5|10,8] and m is still not larger than 2, so it's valid. So we cut the range by half, new range: 10..21.

Now we repeat, mid = 15, check if it's valid by adding up from the beginning. [7,2|5,10|8], m has to be at least 3 to make it happen. So I change the lower bound, the new range: 15..21.

If I keep doing this, eventually we will reach the answer 18!

As we can see, we are cutting the search pool half and half each time, and that is binary search!


I'm still curious how other people can spot patterns like this, but I supposed it only comes with experience after doing more questions. 😆

Hope you enjoy the post, see you next time! 👋